Vector database
- Definition
- A database optimized for storing and querying high-dimensional vectors (embeddings). Vector databases like Pinecone, Weaviate, and Chroma are critical infrastructure for RAG, search, and recommendation systems.
- Why it matters
- Vector databases are the infrastructure layer that makes RAG practical at scale. Any database can store vectors, but specialized vector databases add approximate nearest-neighbor (ANN) indexes such as HNSW and IVF, which trade a small amount of recall for the ability to search very large collections (up to billions of vectors) in milliseconds instead of scanning every vector. For AI-powered search, recommendations, and knowledge retrieval, vector database latency and recall directly shape the end-user experience. The market has grown rapidly and is also consolidating: general-purpose databases (PostgreSQL with pgvector, Redis with vector search) are adding vector capabilities and challenging dedicated vector databases. The strategic question is whether to adopt a specialized vector database or add vector search to a database you already run.
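To make the trade-off concrete, here is a minimal sketch of the exact, brute-force search that ANN indexes such as HNSW and IVF are designed to avoid. It is not how any particular product works: the function names and the toy three-dimensional vectors are assumptions made for illustration. Every query scores every stored vector, so cost grows linearly with collection size.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def exact_top_k(
    query: Sequence[float], vectors: Sequence[Sequence[float]], k: int
) -> List[Tuple[int, float]]:
    """Score every stored vector against the query and return the k best
    (index, similarity) pairs. This is the full scan an ANN index skips."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" (assumed data); real embeddings typically
# have hundreds or thousands of dimensions.
VECTORS = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

if __name__ == "__main__":
    for index, score in exact_top_k([1.0, 0.05, 0.0], VECTORS, k=2):
        print(index, round(score, 3))
```

A full scan like this is fine for a few thousand vectors. ANN indexes exist because it stops being fine long before billions.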
- In practice
- Pinecone, the leading managed vector database, serves billions of queries per day and raised $100M in 2023. Weaviate and Qdrant offer open-source alternatives with managed cloud offerings. Chroma targets lightweight, embedded use cases. PostgreSQL's pgvector extension has become popular with teams that want to avoid adding another database to their stack. A typical RAG deployment indexes 100K-10M documents as vectors, with query latency under 50 ms. Common advanced features include metadata filtering (for example, searching only documents from the last 30 days), hybrid search (combining vector and keyword search), and namespace isolation (for multi-tenant support). The integration pattern is well established: embed documents at ingestion, store the vectors in the vector database, and retrieve the nearest matches at query time.
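The embed, store, retrieve pattern, together with metadata filtering and namespace isolation, can be sketched in a few dozen lines. This is an in-memory illustration under stated assumptions, not the API of Pinecone, Weaviate, Qdrant, Chroma, or pgvector: `TinyVectorStore`, `toy_embed`, the `VOCAB` list, and the `age_days` metadata field are all hypothetical names, and the word-count "embedding" stands in for a real embedding model.

```python
import math
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical toy vocabulary; a real system would call an embedding model.
VOCAB = ["vector", "database", "index", "invoice", "payment", "refund"]


def toy_embed(text: str) -> List[float]:
    """Count vocabulary words and L2-normalize the counts.
    A stand-in for a real embedding model, for illustration only."""
    words = text.lower().split()
    counts = [float(words.count(term)) for term in VOCAB]
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts] if norm else counts


@dataclass
class Record:
    doc_id: str
    vector: List[float]
    metadata: Dict[str, Any]


@dataclass
class TinyVectorStore:
    """In-memory sketch: one list of records per namespace (tenant)."""
    namespaces: Dict[str, List[Record]] = field(default_factory=dict)

    def upsert(self, namespace: str, doc_id: str, text: str,
               metadata: Dict[str, Any]) -> None:
        """Ingestion: embed the document and store it with its metadata,
        replacing any existing record with the same id."""
        records = self.namespaces.setdefault(namespace, [])
        records[:] = [r for r in records if r.doc_id != doc_id]
        records.append(Record(doc_id, toy_embed(text), metadata))

    def query(self, namespace: str, text: str, k: int = 3,
              where: Optional[Callable[[Dict[str, Any]], bool]] = None
              ) -> List[Tuple[str, float]]:
        """Query time: embed the query, skip records that fail the metadata
        filter, and rank the rest by dot product (cosine, since vectors are
        normalized). Only the requested namespace is searched."""
        q = toy_embed(text)
        hits = []
        for r in self.namespaces.get(namespace, []):
            if where is not None and not where(r.metadata):
                continue
            hits.append((r.doc_id, sum(a * b for a, b in zip(q, r.vector))))
        hits.sort(key=lambda h: h[1], reverse=True)
        return hits[:k]


if __name__ == "__main__":
    store = TinyVectorStore()
    store.upsert("tenant-a", "doc-1", "vector database index", {"age_days": 5})
    store.upsert("tenant-a", "doc-2", "vector index", {"age_days": 90})
    store.upsert("tenant-b", "doc-9", "vector database index", {"age_days": 1})
    # Only tenant-a documents from the last 30 days are candidates.
    print(store.query("tenant-a", "vector database",
                      where=lambda m: m["age_days"] <= 30))
```

Production systems typically apply the filter inside the index rather than in a Python loop, and replace the linear scan with an ANN index. The order of operations is the same: embed at ingestion, store the vector with its metadata, embed the query, filter, rank.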
Related terms
Vector
An ordered list of numbers that represents data in a high-dimensional space. In AI, vectors (embeddings) encode semantic meaning, enabling similarity search, clustering, and retrieval-augmented generation.
Embedding
A numerical vector representation of text, images, or other data that captures semantic meaning. Embeddings power search, recommendations, and RAG systems by letting you find conceptually similar content.
RAG (Retrieval-Augmented Generation)
A technique that retrieves relevant documents from an external knowledge base and feeds them to a model alongside the user's query. RAG reduces hallucination and keeps responses grounded in current, factual data.
Retrieval
The process of finding and fetching relevant information from a knowledge base, database, or document store to provide context for an AI model. Retrieval quality is often the single biggest determinant of RAG system performance.