What Vector Databases Actually Do
Vector databases store numerical representations of unstructured data—text, images, sensor telemetry—as high-dimensional vectors. Unlike traditional databases that search for exact matches, vector databases find semantically similar content. When an engineer searches 'transformer overheating events,' the system retrieves incidents about thermal runaway, cooling system failures, and abnormal temperature readings, even if those exact words never appeared in the original logs.
In our deployments across utilities and pipeline operators, we've seen this capability transform how operations teams access institutional knowledge. Thirty years of maintenance records, engineering change orders, incident reports, and tribal knowledge sitting in SharePoint folders and retired engineers' heads—vector databases make it queryable through natural language.
The core mechanism: embedding models convert text or images into vectors (arrays of floating-point numbers). Similar content produces similar vectors. The database indexes these vectors using algorithms like HNSW (Hierarchical Navigable Small World) or IVF (Inverted File Index) that enable fast approximate nearest neighbor search across millions or billions of vectors. Query time stays under 10ms even with datasets that would choke traditional full-text search.
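"Similar content produces similar vectors" usually means high cosine similarity. A minimal sketch, using toy 4-dimensional vectors rather than real embedding model output (production embeddings run 384-1536+ dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: 1.0 = same direction, near 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embeddings of three phrases.
overheating = [0.9, 0.1, 0.8, 0.2]  # "transformer overheating"
thermal     = [0.8, 0.2, 0.9, 0.1]  # "thermal runaway incident"
paperwork   = [0.1, 0.9, 0.1, 0.8]  # "invoice filing procedure"

print(cosine_similarity(overheating, thermal))    # high: semantically close
print(cosine_similarity(overheating, paperwork))  # low: unrelated topics
```

HNSW and IVF exist so the database never has to compute this score against every stored vector; they prune the search to a small neighborhood of candidates.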
Why Energy Operations Need This Now
We're deploying vector databases because energy organizations have three specific problems that traditional databases can't solve:
First, unstructured operational data. A typical utility generates thousands of work orders monthly, each containing free-text descriptions, photos, and PDF attachments. An operator troubleshooting a protection relay fault needs to find similar past incidents across 20 years of records. Traditional keyword search returns nothing useful because engineers describe the same failure mode fifty different ways. Vector search surfaces the relevant incidents regardless of terminology.
Second, RAG (Retrieval Augmented Generation) for LLMs. We build chatbots that answer questions about NERC standards, equipment manuals, operating procedures. The LLM needs grounding data to avoid hallucination. Vector databases retrieve the most relevant context chunks from your document corpus, feed them to the LLM, and the response cites actual source material. This is the only architecture we trust for compliance-critical domains.
Third, AI memory systems. An AI agent monitoring grid operations needs to remember similar scenarios, decisions, and outcomes. Vector databases store episodic memory—each decision context as a vector—enabling the agent to recall analogous situations and learn from past actions. This is how we build systems that improve over time instead of repeating mistakes.
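The RAG pattern behind the second problem can be sketched in a few lines. The `embed` function below is a deliberately crude word-count stand-in for a real embedding model (so the example runs with no dependencies); the corpus and query are illustrative, not real records:

```python
# Minimal RAG retrieval sketch: embed the query, rank stored chunks
# by similarity, and build a grounded prompt from the top hits.

def embed(text):
    # Stand-in for a real embedding model: counts of a tiny fixed vocabulary.
    vocab = ["transformer", "cooling", "maintenance", "protection", "vegetation"]
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

corpus = [
    "PRC-005 requires documented protection system maintenance intervals",
    "Cooling fan replacement procedure for transformer banks",
    "Quarterly vegetation management report template",
]
index = [(doc, embed(doc)) for doc in corpus]

def retrieve(query, k=2):
    qv = embed(query)
    ranked = sorted(index, key=lambda item: dot(qv, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

context = retrieve("transformer cooling maintenance")
prompt = "Answer using only this context:\n" + "\n".join(context)
print(prompt)  # the LLM receives retrieved source text, not just the question
```

In production the vector database replaces the `index` list and `retrieve` function, and the prompt construction cites the source documents so answers can be audited.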
Qdrant: Our Default Choice
We deploy Qdrant for most energy sector projects. Written in Rust, it runs efficiently on modest hardware—critical when you're deploying on-premises in substations or control centers with limited IT infrastructure. A single Qdrant instance handles 10-50 million vectors on a 16GB RAM server while maintaining sub-5ms query latency.
Qdrant's filtering capabilities matter enormously in operational contexts. You can filter vectors by metadata before similarity search—'find similar incidents but only from this substation, this equipment type, and this time range.' The query becomes 'semantically similar AND meets these constraints,' which is exactly how operations teams think.
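The semantics of that combined query look roughly like the sketch below. Qdrant actually applies payload filters during HNSW traversal rather than as a naive pre-filter, and the field names here are illustrative, but the result is the same: similarity ranking restricted to records that satisfy the constraints.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Each record: an embedding plus operational metadata (the "payload").
incidents = [
    {"vec": [0.9, 0.1, 0.3], "substation": "North-12", "equipment": "transformer", "year": 2019},
    {"vec": [0.8, 0.2, 0.4], "substation": "North-12", "equipment": "transformer", "year": 2008},
    {"vec": [0.9, 0.1, 0.2], "substation": "East-04",  "equipment": "breaker",     "year": 2021},
]

def filtered_search(query_vec, substation, equipment, since_year, k=5):
    # Keep only records meeting the constraints, then rank survivors by similarity.
    candidates = [r for r in incidents
                  if r["substation"] == substation
                  and r["equipment"] == equipment
                  and r["year"] >= since_year]
    return sorted(candidates, key=lambda r: cosine(query_vec, r["vec"]), reverse=True)[:k]

hits = filtered_search([0.9, 0.1, 0.3], "North-12", "transformer", since_year=2015)
print([h["year"] for h in hits])  # only the 2019 North-12 transformer incident survives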
The snapshot and replication features align with NERC CIP requirements. Qdrant supports incremental backups, point-in-time recovery, and multi-node replication. We've configured Qdrant clusters with read replicas in DR sites, maintaining sub-second replication lag. When CIP auditors ask about data protection, we show them Qdrant's append-only WAL and consistent snapshot mechanisms.
Quantization support reduces memory footprint by 4-8x without meaningful accuracy loss. We typically deploy with scalar quantization enabled, storing 32-bit float embeddings as 8-bit integers. A 40 million vector collection drops from 160GB to 40GB RAM, letting us run larger datasets on edge hardware.
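Scalar quantization is conceptually simple: map each float32 value onto a 256-step integer grid. A minimal sketch of one common per-vector min/max scheme (real implementations also keep the original floats on disk for optional rescoring):

```python
def quantize(vec):
    """Map float values to 0-255 codes with a per-vector offset and scale."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # avoid division by zero on constant vectors
    codes = [round((x - lo) / scale) for x in vec]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    return [lo + c * scale for c in codes]

original = [0.12, -0.45, 0.83, 0.07, -0.91, 0.33]
codes, lo, scale = quantize(original)
restored = dequantize(codes, lo, scale)

# Each code is 1 byte vs 4 bytes per float32: the 4x RAM reduction above.
max_err = max(abs(a - b) for a, b in zip(original, restored))
print(codes)
print(max_err)  # bounded by half the quantization step
```

The reconstruction error is at most half a quantization step, which is why recall loss is typically negligible for similarity ranking.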
One limitation: Qdrant lacks built-in vectorization. You must generate embeddings separately using Ollama, a commercial API, or another embedding service. This adds architectural complexity but gives you control over which embedding model to use—important when you want to switch models without rebuilding the entire database.
Weaviate: When You Need Hybrid Search
Weaviate excels at hybrid search—combining vector similarity with traditional BM25 keyword search. In practice, this matters when users search for specific identifiers, part numbers, or regulatory citations mixed with conceptual queries. 'Show me incidents involving transformer 345-TR-02 related to cooling system problems' requires exact matching on the asset ID plus semantic search on the description.
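Hybrid search needs a way to merge the two result lists. Weaviate's fusion method is configurable; the sketch below uses reciprocal-rank fusion, one common approach, with made-up work-order IDs:

```python
def reciprocal_rank_fusion(keyword_ranking, vector_ranking, k=60):
    """Merge two ranked lists of doc ids; docs ranked well in both rise to the top."""
    scores = {}
    for ranking in (keyword_ranking, vector_ranking):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Keyword (BM25) search nails the exact asset ID; vector search nails the concept.
keyword_hits = ["wo-118", "wo-204"]            # matched "345-TR-02" literally
vector_hits  = ["wo-093", "wo-118", "wo-500"]  # semantically about cooling failures
print(reciprocal_rank_fusion(keyword_hits, vector_hits))
```

The work order appearing in both rankings (`wo-118`) ends up first, which matches the intuition behind the transformer query above: the right answer both contains the asset ID and is about cooling problems.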
Weaviate includes built-in vectorization modules that automatically embed data during ingestion. Point it at an OpenAI API key or a local model, and it handles embedding generation transparently. This simplifies architecture—one less service to deploy and maintain. The trade-off: you're coupled to Weaviate's supported embedding providers unless you manage external vectorization yourself.
The GraphQL API feels overengineered for simple use cases but becomes valuable in complex applications. We've built multi-tenant systems where different business units access different data collections with fine-grained authorization rules. Weaviate's GraphQL schema makes these access patterns explicit and enforceable.
Weaviate's resource footprint is heavier than Qdrant's. Expect to provision 2-3x the RAM for equivalent performance. A 10 million vector deployment that runs comfortably on 16GB with Qdrant wants 32-48GB with Weaviate. This matters in edge deployments where hardware is constrained.
Milvus: Trillion-Scale But Operationally Heavy
Milvus targets massive scale—billions to trillions of vectors—using distributed architecture with separate query and data nodes. If you're indexing every sensor reading, every SCADA event, every waveform capture across an entire grid operator's history, Milvus handles it.
We've deployed Milvus once, for a large ISO with 50+ billion time-series vectors representing historical system states. The scale capabilities are real. But Milvus requires Kubernetes, object storage (MinIO or S3), message queues (Pulsar or Kafka), and a metadata store (etcd). You're running a distributed system with all the operational complexity that entails.
For typical utility deployments—10-100 million vectors, on-premises, air-gapped—Milvus is overkill. The operational overhead doesn't justify the scale benefits you won't use. We only recommend Milvus when you've proven you need multi-node horizontal scaling and have the infrastructure team to support it.
ChromaDB: Development and Prototyping
ChromaDB is the vector database equivalent of SQLite—simple, embedded, perfect for development and small deployments. We use it extensively for prototyping RAG applications, testing embedding models, and building proof-of-concept demos.
The Python API is exceptionally clean. You can have a working vector search application in 20 lines of code. For data scientists exploring LLM applications, ChromaDB removes infrastructure friction. It stores data in a local directory, requires no server process, and 'just works.'
ChromaDB's server mode supports multi-user applications, but we don't deploy it for production operational systems. Performance degrades past 5-10 million vectors. Backup and replication are manual. There's no built-in monitoring or observability. These limitations are acceptable for development tooling but problematic for systems that operations teams depend on.
Where ChromaDB shines: analyst workstations running local LLM experiments, departmental knowledge bases with 10K-100K documents, and embedded applications where the vector database is hidden inside a larger system.
What About Neo4j?
Neo4j is a graph database, not a vector database, but it warrants mention because it solves adjacent problems. Vector databases find similar content. Graph databases model relationships between entities—assets, systems, people, procedures.
We often deploy both. Qdrant stores embedded operational documents for semantic search. Neo4j models the knowledge graph: which assets connect to which systems, which procedures apply to which equipment, which engineers have expertise in which domains. Queries combine both: 'Find similar incidents (Qdrant) involving assets connected to this substation (Neo4j) where the assigned engineer (Neo4j) has seen this failure mode before (Qdrant).'
Neo4j now includes vector indexing capabilities for graph-vector hybrid search. We're testing this in several projects. If it matures, it could collapse two databases into one, simplifying architecture. Today, we still deploy them separately—Qdrant for embeddings, Neo4j for relationships.
Air-Gapped and NERC CIP Considerations
Vector databases fit naturally into air-gapped architectures. Unlike cloud vector services, you control all data and dependencies. Qdrant, Weaviate, Milvus, and ChromaDB run entirely on-premises with no external connectivity required.
The critical dependency is embedding models. You need local inference—typically Ollama running text-embedding models like nomic-embed-text or BGE. Generate embeddings inside the security perimeter, store them in the vector database, never send operational data outside.
For NERC CIP compliance, focus on three requirements: access controls, audit logging, and data protection. Qdrant and Weaviate support API key authentication and optional TLS. Audit logging requires application-level implementation—log who queries what, when. Data protection comes from the database's snapshot and replication features plus your standard backup procedures.
We document these controls in CIP compliance matrices, mapping vector database capabilities to specific CIP requirements. Auditors accept this when you demonstrate operational oversight—monitoring, regular testing, documented procedures.
Performance Characteristics That Matter
Query latency: Qdrant typically 2-5ms, Weaviate 5-10ms, Milvus 10-20ms (distributed overhead), ChromaDB 5-15ms on small datasets. These numbers assume warm cache, HNSW indexing, and properly sized hardware.
Ingestion throughput: Qdrant handles 5K-10K vectors/second on a single node. Weaviate 2K-5K vectors/second. Milvus scales horizontally—we've sustained 50K vectors/second across a four-node cluster. ChromaDB is slower, 500-2K vectors/second, but ingestion is rarely a bottleneck in operational deployments.
Memory footprint: Plan 4 bytes per dimension per vector without quantization. 1536-dimension OpenAI embeddings consume 6KB per vector—10 million vectors need 60GB RAM plus indexing overhead. With quantization, divide by 4-8. Disk storage is cheaper but query latency increases 10-50x.
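That arithmetic is worth encoding once and reusing for capacity planning. A small helper (raw vector storage only; HNSW graph links add overhead on top):

```python
def vector_ram_gb(num_vectors, dims, bytes_per_value=4):
    """Raw vector storage in GB; excludes index overhead."""
    return num_vectors * dims * bytes_per_value / 1e9

print(vector_ram_gb(10_000_000, 1536))     # ~61 GB at float32, matching the estimate above
print(vector_ram_gb(10_000_000, 1536, 1))  # ~15 GB with int8 scalar quantization
```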
Accuracy vs. speed trade-offs: HNSW indexing has two parameters—M (graph connectivity) and ef (search breadth). Higher values improve accuracy but increase memory and query time. We typically run M=16, ef=100 for 95-98% recall with acceptable latency. Mission-critical systems bump to M=32, ef=200 for 99%+ recall.
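Those two profiles can be captured as named tuning presets. Exact parameter names vary by database (Qdrant, for instance, splits them into a build-time `m` and a query-time `ef`), so treat these as tuning targets rather than literal config keys:

```python
# Illustrative HNSW profiles based on the settings discussed above.
HNSW_PROFILES = {
    "default":  {"M": 16, "ef": 100},  # ~95-98% recall, low latency
    "critical": {"M": 32, "ef": 200},  # ~99%+ recall, more RAM and CPU per query
}

for name, params in HNSW_PROFILES.items():
    # Graph memory grows roughly linearly with M: each node stores
    # up to M neighbor links per layer, so doubling M roughly doubles
    # the index's link storage.
    print(name, params)
```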
The Verdict
Deploy Qdrant for production energy operations. It's fast, resource-efficient, and operationally simple. The Rust implementation means predictable performance and minimal memory overhead. We run Qdrant in substations, control centers, and regional data centers without issues.
Use ChromaDB for development, prototyping, and analyst tools. The simplicity accelerates experimentation. When the application proves valuable, migrate to Qdrant for production.
Choose Weaviate only if you require hybrid search or prefer built-in vectorization. The extra resource consumption is justified when these features are critical.
Avoid Milvus unless you've proven you need distributed scale. Most energy organizations don't reach the billions-of-vectors threshold where Milvus's complexity pays off.
The real work isn't choosing the database—it's embedding strategy, chunking approach, metadata design, and query patterns. The vector database is infrastructure. Focus on the operational workflows it enables: engineers finding relevant past experience, AI agents grounding decisions in real data, compliance teams surfacing applicable standards. That's where the value lives.