RelevantSearch.AI
Pattern · Volume 04 · Section F: Diversification and result quality · Updated May 2026

Maximal Marginal Relevance (MMR) and diversification

Source: Carbonell and Goldstein, "The Use of MMR, Diversity-Based Reranking for Reordering Documents and Producing Summaries" (SIGIR 1998); production implementations across search platforms

Classification — Diversification algorithm that iteratively selects results balancing relevance against similarity to already-selected results.

Intent

Produce ranked result lists that balance relevance to the query against diversity of results, addressing the failure mode where pure-relevance ranking surfaces clusters of similar documents and misses alternative intents the query might cover.

Motivating Problem

A pure-relevance ranking can surface the most relevant document twice (or in slight variants), or surface 10 documents all addressing the same intent when the query has multiple plausible meanings. For navigational queries with one clear intent, the clustering is fine; for ambiguous or discovery queries, the clustering hurts: users who wanted intent B see 10 results for intent A. Diversification addresses this by trading some relevance for diversity, producing result lists that cover multiple intents within the top-K positions.

How It Works

The MMR algorithm. Iteratively select the next result by maximizing a combined score: MMR_score = lambda × (relevance to query) - (1 - lambda) × (max similarity to already-selected results). The lambda parameter (0 to 1) controls the trade-off: lambda = 1 is pure relevance (ignore diversity); lambda = 0 is pure diversity (ignore relevance); lambda = 0.5 is balanced. Typical production values: 0.7–0.9 (favor relevance with some diversity boost).
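To make the trade-off concrete, the following minimal sketch evaluates the score for two hypothetical candidates. The function name `mmr_score` and the example relevance and similarity values are illustrative assumptions, not taken from any particular platform; both inputs are assumed to be scaled to [0, 1].

```python
def mmr_score(relevance: float, max_sim_to_selected: float, lam: float) -> float:
    """Combined MMR score for a single candidate.

    relevance            -- relevance of the candidate to the query (assumed in [0, 1])
    max_sim_to_selected  -- highest similarity to any already-selected result
    lam                  -- the lambda trade-off parameter, 0 <= lam <= 1
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be between 0 and 1")
    return lam * relevance - (1.0 - lam) * max_sim_to_selected


# Hypothetical values: a near-duplicate of an already-selected result versus a
# slightly less relevant candidate that is dissimilar to the selection.
near_duplicate = mmr_score(relevance=0.90, max_sim_to_selected=0.95, lam=0.7)
distinct = mmr_score(relevance=0.80, max_sim_to_selected=0.20, lam=0.7)

print(f"near-duplicate: {near_duplicate:.3f}")  # about 0.345
print(f"distinct:       {distinct:.3f}")        # about 0.500
```

At lambda = 0.7 the distinct candidate outranks the near-duplicate even though its relevance is lower; at lambda = 1.0 the penalty term vanishes and the ordering reverts to pure relevance.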

The similarity measure. MMR needs a way to measure similarity between documents — to determine which candidates are "similar to already-selected." Common choices: vector similarity in the document embedding space (cosine of document embeddings); category overlap (documents in the same category are similar); explicit attributes (same brand, same product line). The choice affects what "diversity" means; vector similarity captures semantic diversity, while category overlap captures categorical diversity.
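The three similarity choices can be sketched as small functions that each return a value MMR can use in its penalty term. The function names are hypothetical, and using Jaccard overlap to turn category overlap into a number is an assumption made for illustration; a simple same-category flag would also fit the description.

```python
import math
from typing import Mapping, Sequence, Set


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of two embedding vectors (semantic similarity)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def category_overlap(cats_a: Set[str], cats_b: Set[str]) -> float:
    """Jaccard overlap of category sets (assumption: one way to score overlap)."""
    union = cats_a | cats_b
    if not union:
        return 0.0
    return len(cats_a & cats_b) / len(union)


def same_attribute(a: Mapping[str, str], b: Mapping[str, str], key: str) -> float:
    """1.0 if both documents carry the same value for an explicit attribute."""
    return 1.0 if key in a and key in b and a[key] == b[key] else 0.0


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))               # orthogonal vectors
print(category_overlap({"shoes", "running"}, {"shoes", "trail"}))
print(same_attribute({"brand": "acme"}, {"brand": "acme"}, "brand"))
```

Cosine similarity can be negative for some embedding models; if the penalty term is expected to stay in [0, 1], clamping or rescaling it first is a reasonable precaution.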

The iteration. Start with an empty selection. Score all candidates by MMR_score; on the first pass only the relevance term is active, because there are no selected results to compare against. Pick the highest-scoring candidate. Recompute scores with that candidate now in the selection, then pick the next highest. Repeat until the top-K is filled. Greedy selection is not globally optimal, but it is computationally tractable, behaves interpretably, and diversifies well in practice.
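The greedy loop can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: `mmr_select`, the document ids, and the toy similarity function are all hypothetical, and relevance and similarity are assumed to lie in [0, 1].

```python
from typing import Callable, Dict, List


def mmr_select(
    relevance: Dict[str, float],
    similarity: Callable[[str, str], float],
    k: int,
    lam: float = 0.7,
) -> List[str]:
    """Greedily pick up to k document ids by MMR score."""
    remaining = dict(relevance)
    # Running maximum similarity of each candidate to the selection so far.
    # It starts at 0.0, so the first pick is driven by relevance alone.
    max_sim = {doc: 0.0 for doc in remaining}
    selected: List[str] = []
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda d: lam * remaining[d] - (1.0 - lam) * max_sim[d],
        )
        selected.append(best)
        del remaining[best]
        # Only the newly selected document can raise a candidate's penalty.
        for doc in remaining:
            max_sim[doc] = max(max_sim[doc], similarity(doc, best))
    return selected


# Toy data: ids starting with the same letter stand for the same intent.
relevance = {"a1": 0.95, "a2": 0.90, "b1": 0.80}


def toy_similarity(x: str, y: str) -> float:
    return 0.9 if x[0] == y[0] else 0.1


print(mmr_select(relevance, toy_similarity, k=2, lam=0.7))  # ['a1', 'b1']
print(mmr_select(relevance, toy_similarity, k=2, lam=1.0))  # ['a1', 'a2']
```

Caching the running maximum means each round compares the remaining candidates only against the most recent pick, instead of recomputing similarity against the whole selection.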

Per-query-class tuning. The right lambda varies by query class. Navigational queries (one clear intent) can use lambda = 0.95 or higher (almost no diversification). Informational/discovery queries benefit from lambda = 0.7–0.8 (meaningful diversification). Specific query types where ambiguity is common ("jaguar" could mean car or animal) need stronger diversification (lambda = 0.6 or lower).
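One simple way to apply per-class tuning is a lookup table keyed by query class. The sketch below assumes an upstream query classifier already exists; the class names, the `lambda_for` helper, and the fallback value are hypothetical, while the lambda values are picked from the ranges given above.

```python
# Hypothetical mapping from query class to lambda, using the ranges above.
LAMBDA_BY_QUERY_CLASS = {
    "navigational": 0.95,   # one clear intent: almost no diversification
    "informational": 0.75,  # midpoint of the 0.7-0.8 range
    "ambiguous": 0.60,      # e.g. "jaguar": stronger diversification
}

# Assumption: unknown classes fall back to a value inside the typical
# production range of 0.7-0.9.
DEFAULT_LAMBDA = 0.8


def lambda_for(query_class: str) -> float:
    """Return the MMR lambda for a query class, or the default if unknown."""
    return LAMBDA_BY_QUERY_CLASS.get(query_class, DEFAULT_LAMBDA)


print(lambda_for("navigational"))  # 0.95
print(lambda_for("unclassified"))  # 0.8
```

Keeping the mapping in configuration rather than code makes it easier to adjust the values per class as evaluation data accumulates.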

Beyond MMR. More sophisticated diversification methods exist: determinantal point processes (DPPs) provide diversification with theoretical guarantees, and learned diversification methods train models to predict ideal diverse result sets. DPPs are less commonly deployed because their algorithmic complexity is higher and the marginal benefit over well-tuned MMR is small in most workloads. MMR remains the production default; the alternatives are for cases where MMR alone is not sufficient.

Production integration. MMR is typically applied as a post-processing step after LTR or reranking: the ranker produces an ordered candidate list with scores, and MMR reorders the top-K to introduce diversity. The original ranker's scores become the "relevance" input to MMR; the similarity function operates on document attributes available at query time. Latency overhead is small: reordering the top-K among themselves takes O(K^2) similarity comparisons, and selecting K results from a larger pool of N candidates takes O(N × K) comparisons when each candidate's running maximum similarity is cached.
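A post-processing step along these lines might look like the sketch below. It is written under stated assumptions: the ranker output is a list of (id, score) pairs, raw scores are min-max normalized so they share a scale with the similarity values (a choice made here, not something the pattern prescribes), and the names `rerank_with_mmr`, `PRODUCT_LINE`, and the SKU ids are hypothetical.

```python
from typing import Callable, List, Tuple


def min_max_normalize(scores: List[float]) -> List[float]:
    """Rescale raw ranker scores to [0, 1] (assumption: scores are unbounded)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def rerank_with_mmr(
    ranked: List[Tuple[str, float]],
    similarity: Callable[[str, str], float],
    k: int,
    lam: float = 0.7,
) -> List[str]:
    """Reorder a ranker's scored output by greedy MMR and return k ids."""
    if not ranked:
        return []
    ids = [doc_id for doc_id, _ in ranked]
    rel = dict(zip(ids, min_max_normalize([score for _, score in ranked])))
    max_sim = {doc_id: 0.0 for doc_id in ids}
    selected: List[str] = []
    while len(selected) < min(k, len(ids)):
        candidates = [d for d in ids if d not in selected]
        best = max(candidates, key=lambda d: lam * rel[d] - (1.0 - lam) * max_sim[d])
        selected.append(best)
        for d in candidates:
            if d != best:
                max_sim[d] = max(max_sim[d], similarity(d, best))
    return selected


# Hypothetical ranker output: two colour variants of one product rank first.
PRODUCT_LINE = {"sku-1-red": "p1", "sku-1-blue": "p1", "sku-2": "p2", "sku-3": "p3"}
ranker_output = [("sku-1-red", 12.4), ("sku-1-blue", 12.1), ("sku-2", 9.0), ("sku-3", 3.0)]


def same_line(a: str, b: str) -> float:
    return 1.0 if PRODUCT_LINE[a] == PRODUCT_LINE[b] else 0.0


print(rerank_with_mmr(ranker_output, same_line, k=3))
# ['sku-1-red', 'sku-2', 'sku-1-blue']
```

The second colour variant is pushed down one position instead of being dropped, which is the usual effect of a lambda in the 0.7 range: redundancy is demoted, not filtered.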

When to Use It

Discovery and informational queries where users benefit from seeing multiple intents in the result list. Ambiguous queries with multiple plausible interpretations. Browse-style search interfaces where users scan many results. Cases where pure-relevance ranking produces visibly redundant results (multiple variants of the same product, multiple articles on the same sub-topic).

Alternatives — no diversification for navigational queries where one intent dominates. Categorical filtering or facet-based browsing for cases where the user explicitly chooses among intents through UI rather than implicitly through ranking. The pattern is best suited to queries where the system must guess at user intent and benefits from hedging.

Sources
  • Carbonell and Goldstein, "The Use of MMR, Diversity-Based Reranking for Reordering Documents and Producing Summaries" (SIGIR 1998)
  • Kulesza and Taskar, "Determinantal Point Processes for Machine Learning" (2012)
  • Production methodology writings on diversification
