RelevantSearch.AI
Pattern · Volume 01 · Section B: Dense vector retrieval patterns · Updated May 2026

Sparse-learned retrieval (SPLADE, BGE-sparse)

Source: Formal et al., SPLADE (Naver Labs, 2021); BGE-M3 sparse component (BAAI, 2024); production support in Vespa, OpenSearch, custom implementations

Classification — Retrieval pattern combining lexical-style sparse representations with learned term expansion.

Intent

Retrieve documents using sparse vector representations — where each dimension corresponds to a vocabulary term — with learned weights that include implicit term expansion, bridging the interpretability of lexical retrieval and the semantic capability of dense retrieval.

Motivating Problem

Dense vector retrieval is semantically powerful but opaque: it is hard to explain why a specific document matched a specific query, and the embedding space is not directly interpretable. Lexical retrieval is interpretable but limited to exact matches and configured synonyms. Sparse-learned retrieval addresses both limitations: it produces sparse representations (most dimensions zero) that look like extended vocabularies, with learned weights that expand terms automatically based on context. Matches are explainable (you can see which expanded terms triggered the match), semantic capability approaches that of dense retrieval, and the architecture preserves inverted-index efficiency.

How It Works

Model architecture. A transformer-based model (typically BERT-derived) processes the input text and produces a sparse vector where each dimension corresponds to a token in the model's vocabulary. Most dimensions are zero (the sparsity); non-zero dimensions represent terms the model considers semantically present, including terms not literally in the text. The output for "running shoes" might activate the terms "running" and "shoes" explicitly and also "athletic," "sneakers," "footwear" implicitly with learned weights.
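As a rough illustration of how per-token vocabulary scores might collapse into one sparse vector, the sketch below applies a log-saturated ReLU and takes the maximum over input tokens. Published SPLADE variants differ in the pooling step (sum or max over tokens); max is assumed here. The six-term vocabulary and the logit values are invented for the example and do not come from a real model.

```python
import math

# Hypothetical toy vocabulary; a real BERT-style model has roughly 30K entries.
VOCAB = ["running", "shoes", "athletic", "sneakers", "footwear", "banana"]

def splade_pool(logits):
    """Collapse per-token vocabulary logits (tokens x vocab) into one vector.

    Applies log(1 + ReLU(x)) to every logit and takes the max over tokens.
    Negative logits map to exactly zero, which is where the sparsity comes from.
    """
    n_vocab = len(logits[0])
    return [
        max(math.log1p(max(row[j], 0.0)) for row in logits)
        for j in range(n_vocab)
    ]

def to_sparse(weights, vocab):
    """Keep only the non-zero dimensions, keyed by vocabulary term."""
    return {vocab[j]: w for j, w in enumerate(weights) if w > 0.0}

# Made-up logits for the two input tokens of "running shoes".
logits = [
    [6.0, -1.0, 1.5, -2.0, -0.5, -4.0],  # token "running"
    [-1.0, 5.5, 0.5, 2.0, 1.8, -3.0],    # token "shoes"
]

sparse_vec = to_sparse(splade_pool(logits), VOCAB)
for term, weight in sorted(sparse_vec.items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {weight:.2f}")
```

In this toy run the literal terms "running" and "shoes" receive the largest weights, the expansion terms receive smaller ones, and "banana" is dropped because its logits are negative for every token.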

Index construction. Sparse-learned vectors are stored in inverted-index-style structures (since the representations are sparse). The index format resembles a lexical inverted index but with the vocabulary expanded to model token space (~30K terms for BERT-style models) and weighted by the model rather than by BM25 statistics.
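A minimal in-memory sketch of the index structure described above: each vocabulary term maps to a postings list of (document id, learned weight) pairs. The document ids, terms, and weights are hypothetical, and production engines add compression, segment handling, and pruning that are not shown here.

```python
from collections import defaultdict

# Hypothetical document vectors (term -> learned weight), as an encoder might emit.
doc_vectors = {
    "d1": {"running": 1.9, "shoes": 1.8, "sneakers": 1.1, "footwear": 1.0},
    "d2": {"ibuprofen": 2.1, "analgesic": 1.4, "pain": 1.2},
    "d3": {"sandals": 2.0, "footwear": 1.3, "summer": 0.9},
}

def build_inverted_index(vectors):
    """Map each term to a postings list of (doc_id, weight) pairs."""
    index = defaultdict(list)
    for doc_id, vec in vectors.items():
        for term, weight in vec.items():
            index[term].append((doc_id, weight))
    # Order each postings list by descending weight (an optional convention
    # chosen for this sketch; it makes the strongest documents easy to find).
    for postings in index.values():
        postings.sort(key=lambda p: -p[1])
    return dict(index)

index = build_inverted_index(doc_vectors)
print(index["footwear"])  # postings for a term shared by d1 and d3
```

The only structural difference from a BM25 index in this sketch is where the weights come from: they are stored as emitted by the model instead of being derived from term and document frequencies.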

Query-time retrieval. The query is processed through the same model to produce a sparse vector. Retrieval resembles BM25 over the expanded vocabulary: the query's non-zero dimensions are looked up in the inverted index, and each document accumulates a score from every query dimension it matches (the dot product of the query and document vectors). Expansion happens at both query time and indexing time, so a query for "pain reliever" may activate "analgesic" and match documents that activated the same term during indexing.
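The lookup-and-accumulate step can be sketched as follows, assuming the score is the dot product of query and document weights over shared terms. The index contents and the query vector are invented for illustration; a real query vector would come from the same encoder used at indexing time.

```python
from collections import defaultdict

# Hypothetical index: term -> list of (doc_id, document weight).
INDEX = {
    "pain": [("d2", 1.2), ("d4", 0.6)],
    "analgesic": [("d2", 1.4)],
    "reliever": [("d4", 0.9)],
    "footwear": [("d1", 1.0)],
}

def search(query_vec, index, top_k=10):
    """Score documents by summing query_weight * doc_weight over shared terms."""
    scores = defaultdict(float)
    for term, q_weight in query_vec.items():
        for doc_id, d_weight in index.get(term, []):
            scores[doc_id] += q_weight * d_weight
    return sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]

# Made-up encoder output for "pain reliever"; "analgesic" is an expansion term.
query_vec = {"pain": 1.5, "reliever": 1.3, "analgesic": 0.8}

results = search(query_vec, INDEX)
for doc_id, score in results:
    print(doc_id, round(score, 2))
```

Here d2 never contains the word "reliever" but still ranks first, because the expansion term "analgesic" is active on both the query side and the document side.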

Strengths. Interpretable matches: you can see which terms (including expanded ones) contributed to a match. Implicit term expansion: no manual synonym lists are needed. Inverted-index efficiency: the retrieval architecture resembles lexical retrieval rather than vector retrieval. Cold start without your own supervised training data: the underlying model is pre-trained, so it can be used before any in-domain labels exist.
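The interpretability claim can be made concrete with a small sketch that breaks a match score into per-term contributions and flags which terms were expansions. The helper name explain_match, the vectors, and the whitespace-based test for "literal" query terms are all assumptions made for this example; a real system would compare against the model's own tokenization.

```python
def explain_match(query_text, query_vec, doc_vec):
    """Return (term, contribution, is_expansion) rows, largest contribution first.

    A term counts as an expansion if it is absent from the literal query text.
    The whitespace split used here is a simplification for illustration.
    """
    literal_terms = set(query_text.lower().split())
    rows = [
        (term, q_weight * doc_vec[term], term not in literal_terms)
        for term, q_weight in query_vec.items()
        if term in doc_vec
    ]
    return sorted(rows, key=lambda row: -row[1])

# Made-up vectors for the query "pain reliever" and one matching document.
query_vec = {"pain": 1.5, "reliever": 1.3, "analgesic": 0.8}
doc_vec = {"ibuprofen": 2.1, "analgesic": 1.4, "pain": 1.2}

explanation = explain_match("pain reliever", query_vec, doc_vec)
for term, contribution, is_expansion in explanation:
    label = "expanded" if is_expansion else "literal"
    print(f"{term:10s} {contribution:.2f} ({label})")
```

The contributions sum to the document's retrieval score under the dot-product assumption, so the explanation accounts for the whole score and not just part of it.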

Limitations. The pattern is newer (post-2021) and less universally supported across platforms than BM25 or dense retrieval. Model dependency creates the same versioning challenges as dense retrieval. The expansion is fixed by the model; it can't be customized at runtime the way explicit synonym lists can. Some platforms (Elasticsearch, OpenSearch) support sparse-learned retrieval through plugins or specific APIs; others (Vespa) support it natively; some (Coveo, Algolia) have varying support as of 2026.

When to Use It

Production search where interpretability of matches matters (e-commerce explaining why a product matched, customer service search where reviewing the match logic matters, regulated domains). Cases where dense retrieval's quality is needed but the opacity is a concern. Workloads with high update rates where the inverted-index-style architecture handles updates more cleanly than dense vector indexes.

Alternatives — dense retrieval (prior entry) when interpretability isn't needed and dense models are stronger for the use case. Hybrid retrieval (Section C) combining sparse-learned with other paths. Pure lexical retrieval for use cases where learned expansion doesn't justify the model dependency.

Sources
  • Formal et al., "SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking" (2021)
  • BAAI BGE-M3 (multi-functionality embedding including sparse output)
  • Vespa sparse vector documentation
  • OpenSearch neural sparse search documentation
