Source: Karpukhin et al., "Dense Passage Retrieval" (2020); Reimers and Gurevych, sentence-transformers; production implementations in Pinecone, Weaviate, Qdrant, Elasticsearch k-NN, Vespa

Classification — Scoring function for dense vector retrieval and ranking — cosine similarity, dot product, learned similarity functions.

Intent

Score query-document pairs by similarity in a learned embedding space, where queries and documents are encoded as dense vectors and similarity captures semantic relationships beyond lexical overlap.

Motivating Problem

The BM25 family captures lexical match but misses semantic similarity. A query for "pain reliever" doesn't lexically match documents about "analgesic", so a BM25-only system misses the relationship. Vector similarity captures semantic relationships by encoding queries and documents in a learned space where conceptually related inputs produce similar vectors. The pattern is the foundation of dense retrieval (Volume 1 Section B) and one of the standard feature inputs for modern LTR models.

How It Works

Cosine similarity. The most common vector similarity function: cos(q, d) = (q · d) / (||q|| × ||d||). Dividing the dot product by the product of the magnitudes yields a value in [-1, 1], where 1 means the vectors point in the same direction, 0 means they are orthogonal, and -1 means they point in opposite directions. For unit-length (normalized) vectors the denominator is 1, so cosine equals the dot product, which is cheaper to compute. Many production embedding models emit normalized vectors, or are deployed with a normalization step at indexing time, so the two measures are interchangeable in those deployments.
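
As a minimal sketch of the formula above, the following pure-Python snippet computes cosine similarity and checks that it agrees with the dot product once both vectors are normalized. The function names and the toy three-dimensional vectors are illustrative assumptions, not taken from any particular library; real embeddings have hundreds or thousands of dimensions and would normally be handled with a vectorized library.

```python
import math


def dot(q, d):
    """Sum of element-wise products."""
    return sum(qi * di for qi, di in zip(q, d))


def norm(v):
    """Euclidean length of v."""
    return math.sqrt(dot(v, v))


def cosine_similarity(q, d):
    """cos(q, d) = (q . d) / (||q|| * ||d||)."""
    denom = norm(q) * norm(d)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(q, d) / denom


def normalize(v):
    """Scale v to unit length."""
    length = norm(v)
    if length == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / length for x in v]


# Toy vectors (illustrative only).
query = [1.0, 2.0, 2.0]
same_direction = [2.0, 4.0, 4.0]   # query scaled by 2
opposite = [-1.0, -2.0, -2.0]      # query negated
orthogonal = [2.0, -1.0, 0.0]      # dot product with query is 0

print(cosine_similarity(query, same_direction))
print(cosine_similarity(query, opposite))
print(cosine_similarity(query, orthogonal))

# For unit-length vectors, the plain dot product gives the same value as cosine.
other = [2.0, 1.0, 2.0]
cos_value = cosine_similarity(query, other)
dot_of_units = dot(normalize(query), normalize(other))
print(abs(cos_value - dot_of_units) < 1e-12)
```

The three printed scores sit at the ends and the midpoint of the [-1, 1] range, and the final check illustrates why normalizing once at indexing time lets a system use the cheaper dot product at query time.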

Dot product. q · d = sum of (q_i × d_i) over all dimensions. The simplest and fastest similarity function. For normalized vectors it equals cosine; for unnormalized vectors it also grows with vector magnitude, so a long vector can outscore a better-aligned short one. When vectors are normalized, the choice between explicit cosine and dot product is an implementation detail rather than a fundamental design decision.
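
The magnitude sensitivity described above can be illustrated with a small sketch. The two-dimensional vectors and the names doc_a and doc_b are made up for illustration: doc_a is short but closely aligned with the query, while doc_b is long but less aligned.

```python
import math


def dot(q, d):
    """q . d = sum of q_i * d_i over all dimensions."""
    return sum(qi * di for qi, di in zip(q, d))


def normalize(v):
    """Scale v to unit length."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def rank_by_dot(query, docs):
    """Return document names sorted by descending dot product with the query."""
    return sorted(docs, key=lambda name: dot(query, docs[name]), reverse=True)


# Hypothetical toy data.
query = [1.0, 0.0]
docs = {
    "doc_a": [0.9, 0.1],  # small magnitude, nearly parallel to the query
    "doc_b": [3.0, 3.0],  # large magnitude, 45 degrees away from the query
}

# On raw vectors, the larger magnitude of doc_b dominates the score.
raw_ranking = rank_by_dot(query, docs)

# After normalization, only direction matters and doc_a ranks first.
unit_docs = {name: normalize(vec) for name, vec in docs.items()}
unit_ranking = rank_by_dot(normalize(query), unit_docs)

print(raw_ranking)
print(unit_ranking)
```

Whether the magnitude effect is a bug or a feature depends on the embedding model: some models are trained so that magnitude carries signal, in which case the raw dot product is the intended scoring function.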

Euclidean distance. The geometric distance between vectors: ||q - d||. Smaller distance means more similar; conventions vary on whether to invert or negate it to produce a similarity score. For unit-length vectors, ||q - d||^2 = 2 - 2 cos(q, d), so Euclidean distance and cosine produce the same ranking. Euclidean distance is less common in production than cosine or dot product because cosine's magnitude invariance usually matches what "similarity" should mean in a semantic space.
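
A short sketch of Euclidean distance follows, together with a numeric check of the identity ||q - d||^2 = 2 - 2 cos(q, d) that holds for unit-length vectors. The distance_to_similarity helper shows just one possible inversion convention, 1 / (1 + distance); it is an assumption chosen for illustration, and systems differ on this point.

```python
import math


def euclidean_distance(q, d):
    """||q - d||: square root of the summed squared differences."""
    return math.sqrt(sum((qi - di) ** 2 for qi, di in zip(q, d)))


def distance_to_similarity(distance):
    """One possible convention (an assumption): map [0, inf) to (0, 1]."""
    return 1.0 / (1.0 + distance)


def normalize(v):
    """Scale v to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


# Toy unit-length vectors (illustrative only).
q = normalize([1.0, 2.0, 2.0])
d = normalize([2.0, 1.0, 2.0])

# For unit vectors the dot product is the cosine.
cosine = sum(qi * di for qi, di in zip(q, d))
distance = euclidean_distance(q, d)

# Both sides of the identity should agree up to floating-point error.
print(distance ** 2, 2.0 - 2.0 * cosine)
print(distance_to_similarity(distance))
```

Because the squared distance is a decreasing function of cosine on unit vectors, an index configured for Euclidean distance returns the same neighbors as one configured for cosine when the stored vectors are normalized.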

Learned similarity functions. Beyond geometric similarity, learned functions can improve ranking quality in some contexts. The most common pattern is to project query and document embeddings through a learned linear layer or small neural network before comparing them, with the projection tuned on labeled relevance data. The pattern is less common in retrieval (where simple cosine or dot product allows ANN indexing) and more common in ranking (where the projection can be applied to the small candidate set without retrieval-scale constraints).
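
The linear-projection variant can be sketched as follows. The weight matrix W here is a hard-coded placeholder standing in for parameters that would be learned from labeled relevance data; the training step is omitted, and all names and values are hypothetical.

```python
def dot(a, b):
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def project(W, v):
    """Apply a linear layer without bias: one output per row of W."""
    return [dot(row, v) for row in W]


def learned_similarity(q, d, W_query, W_doc):
    """Dot product of the projected query and the projected document."""
    return dot(project(W_query, q), project(W_doc, d))


# Placeholder weights (NOT trained): emphasize dimension 0, drop dimension 2.
W = [
    [2.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]

query = [1.0, 0.0, 1.0]
candidates = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 0.0, 2.0],
}

# Score the small candidate set with and without the projection.
plain_scores = {name: dot(query, vec) for name, vec in candidates.items()}
learned_scores = {
    name: learned_similarity(query, vec, W, W) for name, vec in candidates.items()
}

plain_best = max(plain_scores, key=plain_scores.get)
learned_best = max(learned_scores, key=learned_scores.get)
print(plain_best, learned_best)
```

In this toy setup the projection changes which candidate ranks first, which is the effect a trained projection aims for. Sharing one matrix between query and document, as done here, is only one option; separate matrices or a small network are equally plausible designs.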

Vector similarity as LTR features. In LTR pipelines, vector similarity scores are typically among the more important features. Common feature decompositions: similarity between the query and the full document embedding; similarity between the query and per-section document embeddings (title embedding, body embedding); similarity under different embedding models (OpenAI text-embedding-3 similarity and BGE similarity as separate features). The decomposition lets the LTR model weight each embedding signal differently per query class.
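
One way to assemble such a decomposition into a feature dictionary is sketched below. The model names model_a and model_b, the field names, and the tiny vectors are placeholders; in practice the vectors would come from the embedding models in use, and the resulting features would be joined with lexical and other signals before being passed to the LTR model.

```python
import math


def cosine(q, d):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(qi * di for qi, di in zip(q, d))
    norm_q = math.sqrt(sum(qi * qi for qi in q))
    norm_d = math.sqrt(sum(di * di for di in d))
    return dot / (norm_q * norm_d)


def similarity_features(query_vectors, doc_vectors):
    """Build one cosine feature per (embedding model, document field) pair.

    query_vectors: {model_name: query_vector}
    doc_vectors:   {model_name: {field_name: field_vector}}
    """
    features = {}
    for model, q_vec in query_vectors.items():
        for field, d_vec in doc_vectors[model].items():
            features[f"{model}_{field}_cosine"] = cosine(q_vec, d_vec)
    return features


# Hypothetical embeddings; each model may use a different dimensionality.
query_vectors = {
    "model_a": [1.0, 0.0],
    "model_b": [0.0, 1.0, 0.0],
}
doc_vectors = {
    "model_a": {"full": [1.0, 1.0], "title": [2.0, 0.0], "body": [0.0, 1.0]},
    "model_b": {
        "full": [1.0, 1.0, 0.0],
        "title": [0.0, 3.0, 0.0],
        "body": [1.0, 0.0, 1.0],
    },
}

features = similarity_features(query_vectors, doc_vectors)
for name, value in sorted(features.items()):
    print(f"{name}: {value:.3f}")
```

Keeping each similarity as its own named feature, rather than averaging them, is what allows the downstream model to learn different weights for title and body matches.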

Embedding model selection trade-offs. The quality of vector similarity depends primarily on the embedding model. General-purpose models (OpenAI text-embedding-3-large, BGE-large, Voyage 3, Cohere embed v3) work for many use cases. Domain-specific models (LegalBERT for legal, BioBERT for biomedical text, code-specific embeddings for code) often outperform general models on their domains. Models fine-tuned on domain-specific labeled data can outperform off-the-shelf models when training data is available. The MTEB leaderboard (huggingface.co/spaces/mteb/leaderboard) provides comparison points, but production evaluation on the actual workload (Volume 5 Section B) is essential.

When to Use It

Production search with semantic matching needs beyond what synonym engineering provides. Modern hybrid retrieval (Volume 1 Section C). LTR models where semantic signals complement lexical signals. RAG pipelines for agentic systems.

Alternatives — BM25-family scoring (prior entry) for use cases dominated by lexical matching. Late-interaction models (Section D) for cases where simple similarity isn't sufficient but a full cross-encoder is too expensive. Pure vector scoring is rarely optimal alone; combining it with lexical scoring in hybrid retrieval is the dominant production pattern.

Sources

Karpukhin et al., "Dense Passage Retrieval for Open-Domain QA" (2020)
Reimers and Gurevych, sentence-transformers documentation
BEIR benchmark for retrieval (github.com/beir-cellar/beir)
MTEB leaderboard for embedding comparison

Vector similarity scoring