The Conventional Wisdom
The AI community treats vector databases like they're all interchangeable tools for semantic search. Pick one with good benchmarks, load your embeddings, run similarity queries, and you're done. The tutorials make it look simple: spin up a container, ingest some PDFs, build a RAG chatbot, ship it.
In the energy sector, we heard this pitch repeatedly from vendors and consultants. "Just use Pinecone or managed Weaviate, it scales infinitely, you'll be operational in days." The promise was straightforward—turn decades of operational manuals, incident reports, and engineering drawings into conversational AI that operators could query in natural language.
What Actually Happens in Practice
We've deployed vector databases across four utility control centers, two offshore platforms, and multiple renewable energy operations centers over the past three years. Here's what the tutorials don't tell you.
First, data sovereignty isn't optional in energy. NERC CIP-011 requires information protection for BES Cyber System Information. When a major Midwestern utility asked us to build semantic search over their substation engineering drawings, the conversation ended the moment we mentioned cloud-hosted solutions. Their compliance officer physically walked into the meeting and said no. Not "we need to evaluate risk"—just no.
This killed Pinecone immediately. We pivoted to self-hosted options, which meant Qdrant, Weaviate, Milvus, or ChromaDB. The choice mattered more than we expected.
Second, your SCADA historians and OSIsoft PI data don't arrive as clean text. We spent six weeks at a West Texas wind farm trying to vectorize 15 years of turbine maintenance logs. The data was a mix of structured CSV exports, handwritten PDFs scanned at 150 dpi, and free-text operator notes with abbreviations only the night shift understood. "VFD tripped, reset per SOP-447" doesn't embed usefully without context about what VFD model, which turbine, and what SOP-447 actually contains.
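The fix was context expansion before embedding. Here's a minimal sketch of the idea; the lookup tables, field names, and equipment models are illustrative stand-ins for the site's asset registry and SOP index, not our production code:

```python
# Illustrative sketch of context expansion before embedding. The asset
# registry, SOP index, and equipment names are hypothetical stand-ins.

ASSET_REGISTRY = {
    "T-12": {"vfd_model": "ABB ACS880", "turbine": "GE 1.5sle, pad 12"},
}
SOP_INDEX = {
    "SOP-447": "VFD fault reset and lockout/tagout verification",
}

def expand_log_entry(raw_note: str, asset_id: str, sop_id: str) -> str:
    """Prepend the context a generic embedding model cannot infer."""
    asset = ASSET_REGISTRY[asset_id]
    return (
        f"Turbine {asset_id} ({asset['turbine']}), VFD {asset['vfd_model']}. "
        f"{sop_id}: {SOP_INDEX[sop_id]}. "
        f"Operator note: {raw_note}"
    )

text = expand_log_entry("VFD tripped, reset per SOP-447", "T-12", "SOP-447")
# The expanded text now carries enough context to embed and retrieve well.
```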
Vector databases assume you have embeddings. Creating quality embeddings from operational technology data is 80% of the actual work. We learned this after watching our first RAG system confidently hallucinate that a circuit breaker failure in 2019 was related to a transformer issue in 2015 because the embedding model thought "B-phase fault" and "phase imbalance" were semantically identical.
Third, air-gapped deployments change everything. A Gulf Coast refinery needed semantic search across 40 years of process safety incident reports for their alkylation unit. The control network is completely isolated—no internet, no model API calls, no convenient managed services. We had to bundle Qdrant, Ollama with a local embedding model, and all dependencies into containers that could be transferred via approved USB drives through their security process.
Qdrant handled this gracefully. Single binary, minimal dependencies, runs on a 4-core VM with 16GB RAM. We used Ollama's nomic-embed-text model locally, which gave us 768-dimensional embeddings without external API calls. The entire stack fit on their approved hardware, passed security review, and has been running for 14 months with zero internet connectivity.
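For the curious, a minimal sketch of that stack using the Python clients for Ollama and Qdrant. The collection name, document text, and payload fields are illustrative; assume both services run locally on the approved VM:

```python
import ollama
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

client = QdrantClient(host="localhost", port=6333)
client.create_collection(
    collection_name="incident_reports",
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)

def embed(text: str) -> list[float]:
    # nomic-embed-text produces 768-dimensional vectors, fully offline
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

# Ingest one illustrative report, then run a semantic query against it
client.upsert(
    collection_name="incident_reports",
    points=[PointStruct(
        id=1,
        vector=embed("Alkylation unit: acid carryover during feed swing."),
        payload={"unit": "alkylation", "year": 1998},
    )],
)
hits = client.search(
    collection_name="incident_reports",
    query_vector=embed("acid carryover during feed changes"),
    limit=5,
)
```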
Weaviate would have worked too, but required more memory and had a more complex deployment surface. Milvus wanted etcd and MinIO, which meant additional security approval for three separate components instead of one. ChromaDB was too lightweight—we needed concurrent users and persistence guarantees that its architecture didn't provide at the time.
Lessons from Real Deployments
Our Qdrant deployments taught us that collections are your friend. We built separate collections for different operational contexts: one for equipment manuals, one for incident reports, one for maintenance procedures, one for engineering standards. When an operator searches "how to reset protection relay," we query the procedures collection first, then manuals, then incidents. This context-aware routing reduced irrelevant results by 60% compared to dumping everything into a single collection.
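A sketch of that routing logic, reusing the client and embed helper from the air-gapped example above; the score threshold and collection order are illustrative, not tuned values from our sites:

```python
# Query collections in priority order and stop once enough strong
# matches come back. Threshold and ordering are illustrative.
ROUTE_ORDER = ["procedures", "manuals", "incidents"]

def routed_search(client, query_vector, min_score=0.75, limit=5):
    results = []
    for collection in ROUTE_ORDER:
        hits = client.search(
            collection_name=collection,
            query_vector=query_vector,
            limit=limit,
            score_threshold=min_score,  # drop weak matches early
        )
        results.extend(hits)
        if len(results) >= limit:
            break  # the higher-priority collection answered; skip the rest
    return results[:limit]
```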
Collection-level metadata filtering is critical. At a municipal utility, we needed to ensure operators only saw information relevant to their certification level and facility clearance. Qdrant's payload filtering let us attach security classifications to each vector and filter at query time. A distribution operator searching for "transformer oil sampling procedure" only sees results tagged for their voltage class and facility. This wasn't possible with ChromaDB's simpler filtering, and Weaviate's approach required more complex schema design.
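In Qdrant this is a server-side filter attached to the query. A sketch with illustrative classification fields; the real taxonomy came from the utility's information-protection program:

```python
from qdrant_client.models import FieldCondition, Filter, MatchValue

# Illustrative security payload: voltage class and facility clearance
operator_filter = Filter(
    must=[
        FieldCondition(key="voltage_class", match=MatchValue(value="distribution")),
        FieldCondition(key="facility", match=MatchValue(value="north-yard")),
    ]
)

hits = client.search(
    collection_name="procedures",
    query_vector=embed("transformer oil sampling procedure"),
    query_filter=operator_filter,  # enforced server-side at query time
    limit=5,
)
```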
Hybrid search matters more than pure vector similarity. We deployed Weaviate at a natural gas pipeline operator specifically because they needed to combine semantic search with exact keyword matching. When searching incident reports, "pressure exceeded 1440 PSIG" needs to match that exact threshold, not semantically similar phrases like "high pressure event" or "pressure spike." Weaviate's built-in hybrid search with BM25 and vector similarity handled this elegantly.
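With the v4 Python client, a hybrid query is one call; the alpha parameter weights vector scoring against BM25, so biasing it low keeps "1440 PSIG" ranked above semantically similar but numerically wrong hits. The collection name and alpha value here are illustrative:

```python
import weaviate

client = weaviate.connect_to_local()
incidents = client.collections.get("IncidentReport")

response = incidents.query.hybrid(
    query="pressure exceeded 1440 PSIG",
    alpha=0.25,  # low alpha biases scoring toward BM25 keyword matching
    limit=5,
)
for obj in response.objects:
    print(obj.properties)
client.close()
```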
Milvus shines when you have genuinely massive scale. We use it for a regional transmission organization that indexes every single SCADA alarm, operator action, and system event across their entire footprint—billions of records. Milvus's distributed architecture and GPU acceleration make queries feasible at that scale. But for a single substation or plant, it's overengineered. The operational complexity isn't worth it until you're past 100 million vectors.
ChromaDB found its place in our development workflow. We use it for rapid prototyping and testing embedding strategies before committing to production infrastructure. When evaluating whether to chunk technical documents at the paragraph, section, or page level, iterating with ChromaDB's simple API is faster than setting up Qdrant collections. But we've never put ChromaDB into production for a utility—the persistence model and concurrency characteristics don't meet operational requirements.
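A sketch of that prototyping loop: two throwaway chunking strategies compared against an in-memory ChromaDB, with the document and chunkers as trivial placeholders:

```python
import chromadb

# Placeholder document; in practice this is a real technical manual
document = (
    "## Relay Reset\nDe-energize, verify flags, reset per drawing.\n\n"
    "## Oil Sampling\nDraw sample from lower valve after settling."
)

def chunk_by_paragraph(text):
    return [p for p in text.split("\n\n") if p.strip()]

def chunk_by_section(text):
    return [s.strip() for s in text.split("## ") if s.strip()]

client = chromadb.Client()  # in-memory, throwaway
strategies = {"paragraph": chunk_by_paragraph, "section": chunk_by_section}
for name, chunker in strategies.items():
    col = client.create_collection(name=f"trial_{name}")
    chunks = chunker(document)
    col.add(documents=chunks, ids=[f"{name}-{i}" for i in range(len(chunks))])
    res = col.query(query_texts=["how to reset protection relay"], n_results=2)
    print(name, res["documents"][0])  # eyeball which chunking retrieves best
```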
Graph databases intersect with vectors in unexpected ways. We deployed Neo4j alongside Qdrant at a power utility building a knowledge graph of their electrical infrastructure. The graph stores relationships—this transformer feeds these breakers, which protect these feeders, which serve these customers. Qdrant stores embeddings of maintenance procedures, incident narratives, and troubleshooting guides. When an operator asks "what happened last time breaker B-47 tripped," we use vector search to find relevant incidents, then traverse the graph to understand cascading impacts and related equipment.
This hybrid approach—graph for structured relationships, vectors for semantic search—proved more powerful than either alone. Neo4j's Cypher queries can incorporate vector similarity as part of graph traversal, but we found it cleaner to keep them separate and orchestrate with n8n workflows.
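A sketch of the two-step pattern, reusing the Qdrant client and embed helper from earlier; the node labels, relationship types, and payload fields are illustrative, not our actual schema:

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Step 1: semantic search over incident narratives in Qdrant
hits = client.search(
    collection_name="incidents",
    query_vector=embed("breaker B-47 trip"),
    limit=3,
)
incident_ids = [h.payload["incident_id"] for h in hits]

# Step 2: traverse the infrastructure graph for cascading impacts
cypher = """
MATCH (i:Incident)-[:INVOLVED]->(b:Breaker)-[:PROTECTS]->(f:Feeder)
WHERE i.id IN $ids
RETURN i.id AS incident, b.name AS breaker, collect(f.name) AS feeders
"""
with driver.session() as session:
    for record in session.run(cypher, ids=incident_ids):
        print(record["incident"], record["breaker"], record["feeders"])
driver.close()
```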
What We'd Do Differently
Start with embedding quality, not database selection. We wasted months optimizing Qdrant configurations before realizing our embeddings were garbage. The nomic-embed-text model works well for general text, but energy sector technical content has domain-specific terminology that generic models miss. We fine-tuned a small embedding model on 50,000 annotated equipment descriptions and maintenance logs. That 3-week effort improved retrieval accuracy more than any database optimization.
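The training loop itself was unremarkable. A sketch with sentence-transformers, where the base model, pair construction, and hyperparameters are illustrative rather than what we shipped:

```python
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer, losses

# Illustrative base model; any small embedding model works the same way
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Each pair: a terse field note and its annotated, expanded description
train_examples = [
    InputExample(texts=[
        "VFD tripped, reset per SOP-447",
        "Variable frequency drive fault on turbine 12; reset per the "
        "fault-reset procedure after lockout/tagout verification.",
    ]),
    # ... the remaining annotated pairs
]
loader = DataLoader(train_examples, shuffle=True, batch_size=32)
# Pulls each note toward its expansion, pushes apart the rest of the batch
loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=100)
model.save("energy-embed-v1")
```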
Plan for embedding model versioning from day one. When we upgraded from a 384-dimensional model to a 768-dimensional one, we had to re-embed and re-index 12 million documents. Qdrant handled the data migration fine, but we should have architected collections to support side-by-side model versions and gradual cutover. Now we maintain parallel collections during model transitions.
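Qdrant's collection aliases are the mechanism we'd reach for from the start today: queries target a stable alias while versioned collections live side by side. A sketch with illustrative names:

```python
from qdrant_client.models import (
    CreateAlias, CreateAliasOperation, DeleteAlias, DeleteAliasOperation,
    Distance, VectorParams,
)

# Stand up the new 768-d collection alongside the old 384-d one
client.create_collection(
    collection_name="procedures-v2-768d",
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)

# ... re-embed and upsert every document into procedures-v2-768d ...

# Atomically repoint the alias that all queries target
client.update_collection_aliases(
    change_aliases_operations=[
        DeleteAliasOperation(delete_alias=DeleteAlias(alias_name="procedures")),
        CreateAliasOperation(create_alias=CreateAlias(
            collection_name="procedures-v2-768d",
            alias_name="procedures",
        )),
    ]
)
```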
Build observability before you need it. Vector databases fail silently in ways relational databases don't. A corrupted index might return plausible but wrong results. We didn't catch a Qdrant index corruption for three weeks because the semantic search still worked—it just missed 15% of relevant documents. We now export query result counts, latency distributions, and collection statistics to our monitoring stack and alert on anomalies.
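A sketch of the query-side instrumentation, with Prometheus as the assumed monitoring stack and illustrative metric names:

```python
import time
from prometheus_client import Counter, Histogram

QUERY_LATENCY = Histogram("vector_query_seconds", "Search latency",
                          ["collection"])
RESULT_COUNT = Histogram("vector_query_results", "Results per search",
                         ["collection"], buckets=[0, 1, 3, 5, 10])
EMPTY_RESULTS = Counter("vector_query_empty_total",
                        "Searches returning nothing", ["collection"])

def observed_search(client, collection, query_vector, limit=5):
    """Wrap every search so a silently degraded index shows up as an anomaly."""
    start = time.perf_counter()
    hits = client.search(collection_name=collection,
                         query_vector=query_vector, limit=limit)
    QUERY_LATENCY.labels(collection).observe(time.perf_counter() - start)
    RESULT_COUNT.labels(collection).observe(len(hits))
    if not hits:
        EMPTY_RESULTS.labels(collection).inc()
    return hits
```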
Test failure modes under air-gapped constraints. Our first air-gapped deployment failed spectacularly when the VM ran out of disk space during a large ingestion job. In a cloud environment, we'd resize storage in minutes. In an air-gapped control center, we needed a change control window, security approval for new hardware, and physical installation. We now size air-gapped deployments for 3x expected data volume and test disk-full scenarios explicitly.
Document your chunking and embedding strategy in excruciating detail. When the engineer who built the original system left, we discovered nobody understood why technical drawings were chunked at 512 tokens but incident reports at 1024 tokens, or why we used cosine similarity for procedures but dot product for equipment manuals. We've since adopted a standard template that documents chunking rationale, embedding model choice, similarity metric, and collection schema for every deployment.
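The template itself is nothing fancy. A sketch of it as a typed record, with illustrative field values; a YAML or markdown file serves equally well:

```python
from dataclasses import dataclass

@dataclass
class EmbeddingSpec:
    collection: str        # e.g. "incident_reports"
    chunk_tokens: int      # the size actually used
    chunk_rationale: str   # WHY this size, not just what it is
    embedding_model: str   # exact model name and version
    dimensions: int
    similarity: str        # "cosine" vs "dot", and the reason
    schema_notes: str      # payload fields and their sources

# Illustrative entry for one deployment
spec = EmbeddingSpec(
    collection="incident_reports",
    chunk_tokens=1024,
    chunk_rationale="Reports read as narratives; splitting mid-event loses causality.",
    embedding_model="nomic-embed-text v1.5",
    dimensions=768,
    similarity="cosine",
    schema_notes="payload: {facility, voltage_class, year, source_doc}",
)
```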
The Verdict
For energy sector AI deployments, Qdrant is our default choice. It hits the sweet spot of operational simplicity, air-gapped compatibility, and NERC CIP compliance. The Rust implementation is rock-solid—we've had Qdrant instances run for 18 months without restart in harsh industrial environments. Collection-level isolation and payload filtering give us the security controls we need without complex schemas.
Weaviate gets the nod when hybrid search is non-negotiable. Pipeline operators, refineries, and any operation where exact keyword matching matters alongside semantic similarity should start with Weaviate. The built-in vectorization is convenient for cloud deployments, but we disable it for air-gapped work and use local embedding models.
Milvus is for the 1% of cases with truly massive scale. If you're a large ISO, RTO, or multinational energy company dealing with billions of vectors, Milvus's distributed architecture and GPU support justify the operational complexity. For everyone else, it's overengineered.
ChromaDB stays in development and testing. It's great for prototyping embedding strategies and validating retrieval approaches before committing to production infrastructure. We've never deployed it to an operational environment.
Neo4j complements rather than replaces vector databases. Knowledge graphs of infrastructure relationships plus semantic search of operational content is more powerful than either alone. We run them side by side and orchestrate with workflow tools.
The hardest lesson: vector databases are infrastructure, not magic. They don't fix bad embeddings, they don't clean your data, and they don't understand your operational context. We've seen more projects fail from poor chunking strategies and generic embedding models than from database choice. Pick Qdrant for most cases, invest in domain-specific embeddings, and build observability from day one. Everything else is optimization.