Hybrid RAG · Knowledge Graphs · Enterprise AI

Tribrid RAG: Three-Signal Retrieval with MMR Fusion

Barnyard combines entity search (BM25 + vector), topic cluster retrieval, and knowledge graph expansion into a single ranked passage pool using Maximum Marginal Relevance fusion.

Dawson Bauer

Overview

Most retrieval-augmented generation systems use a single retrieval signal — either dense vector search or sparse keyword search. Barnyard uses a tribrid approach that combines three independent signals, each optimised for a different type of relevance, then merges them with Maximum Marginal Relevance (MMR) to produce a single ranked, deduplicated passage pool.

At a high level, a query fans out along two paths. The entity path runs BM25 and vector search to find seed entities, expands them by one hop through canonical relations in the graph, and pulls the chunks those entities were extracted from. The chunk path searches topic clusters to find relevant text nodes and their chunks. The two pools are then merged with Maximum Marginal Relevance into a single ranked, deduplicated set of final passages.


Strategy Selection

Before retrieval begins, classify_query_node routes the query to one of three strategies:

| Strategy | Trigger | Paths active |
|---|---|---|
| "entities" | Query asks about specific named things, people, or organisations | Entity path only |
| "chunks" | Query asks about topics, themes, or document-level context | Chunk path only |
| "both" | Query mixes entity-specific and thematic elements | Both paths; MMR merge |

The classifier uses a structured LLM output with a short reasoning chain. Strategy "both" is the default when the classifier is uncertain.
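The fallback behaviour can be sketched as follows. This is a minimal illustration, not Barnyard's actual code: `QueryClassification`, `resolve_strategy`, and `active_paths` are hypothetical names, and "uncertain" is modelled here simply as a missing or unrecognised strategy value, which is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

VALID_STRATEGIES = ("entities", "chunks", "both")


@dataclass
class QueryClassification:
    """Hypothetical shape of the classifier's structured LLM output."""
    reasoning: str            # the short reasoning chain
    strategy: Optional[str]   # may be missing or malformed


def resolve_strategy(result: Optional[QueryClassification]) -> str:
    """Return a valid strategy, falling back to "both" when unsure."""
    if result is None or result.strategy not in VALID_STRATEGIES:
        return "both"
    return result.strategy


def active_paths(strategy: str) -> list[str]:
    """Map a strategy to the retrieval paths it activates."""
    return {
        "entities": ["entity"],
        "chunks": ["chunk"],
        "both": ["entity", "chunk"],
    }[strategy]
```

Defaulting to "both" costs extra retrieval work but avoids dropping a signal when the routing decision is unreliable.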


Path 1: Entity Path

Stage 1a — Hybrid Entity Search (BM25 + Vector)

Two searches run concurrently:

  • BM25 keyword search — Neo4j full-text index on Entity.name (exact/lexical match)
  • Vector search — ColBERT embeddings against the Entity_name Qdrant collection

Results are fused with Reciprocal Rank Fusion (RRF). Each entity's score is the sum of one divided by a constant k plus its rank in the keyword list, and one divided by that same constant plus its rank in the vector list:

score(e) = 1 / (k + rank_bm25(e)) + 1 / (k + rank_vector(e))

k=60 dampens the impact of top-rank outliers. Entities appearing highly in both lists score highest.
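A minimal sketch of this fusion step, under two assumptions the text does not state: ranks start at 1, and an entity missing from one list simply receives no contribution from that list (the usual RRF convention). The function name `rrf_fuse` is hypothetical.

```python
def rrf_fuse(bm25_ranked: list[str], vector_ranked: list[str], k: int = 60):
    """Fuse two ranked entity lists with Reciprocal Rank Fusion.

    Returns (entity, score) pairs sorted by descending fused score.
    """
    scores: dict[str, float] = {}
    for ranked in (bm25_ranked, vector_ranked):
        # Assumption: ranks are 1-based.
        for rank, entity in enumerate(ranked, start=1):
            scores[entity] = scores.get(entity, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


bm25 = ["Tim Cook", "Apple Inc.", "Steve Jobs"]
vector = ["Tim Cook", "Steve Wozniak", "Apple Inc."]
fused = rrf_fuse(bm25, vector)
```

In this example "Tim Cook" is ranked first in both lists and so receives the maximum possible score of 2/61, while entities found by only one search fall to the bottom of the fused list.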

Stage 1b — Score Normalisation

RRF scores (~0.013–0.033) are not on the same scale as the chunk path's cosine similarity scores (0–1). Before MMR, entity scores are therefore min-max normalised to [0, 1] by subtracting the lowest RRF score in the set and dividing by the spread between the highest and lowest:

normalised(e) = (score(e) - min_score) / (max_score - min_score)
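The normalisation can be sketched in a few lines. The handling of the degenerate case where all scores are equal is not described in the text; mapping them all to 1.0 is an assumption of this sketch, as is the function name `min_max_normalise`.

```python
def min_max_normalise(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores so the lowest becomes 0.0 and the highest 1.0."""
    if not scores:
        return {}
    lowest, highest = min(scores.values()), max(scores.values())
    spread = highest - lowest
    if spread == 0:
        # Assumption: with no spread, treat every entity as top-scoring.
        return {entity: 1.0 for entity in scores}
    return {entity: (s - lowest) / spread for entity, s in scores.items()}


normalised = min_max_normalise({"a": 0.033, "b": 0.013, "c": 0.023})
```

Note that min-max scaling always pins the weakest seed entity to 0, however close its raw score was to the others.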

Stage 1c — CanonicalRelation Graph Expansion

Top-K seed entities are expanded by one hop via CanonicalRelation. Querying "Tim Cook" also retrieves "Apple Inc." and "Steve Jobs" if they share CanonicalRelation edges. Expanded entities receive a fixed relevance score of 0.7.
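A one-hop expansion with a fixed score might look like the sketch below. It stands in for the graph with a plain adjacency dictionary (hypothetical; the real system traverses CanonicalRelation edges in the graph), and it assumes that seed entities keep their own normalised score rather than being overwritten by 0.7.

```python
RELATED_ENTITY_SCORE = 0.7  # fixed score for expanded entities, per the text


def expand_one_hop(
    seed_scores: dict[str, float],
    canonical_relations: dict[str, list[str]],
    related_score: float = RELATED_ENTITY_SCORE,
) -> dict[str, float]:
    """Add direct neighbours of each seed entity with a fixed score."""
    expanded = dict(seed_scores)
    for seed in seed_scores:
        for neighbour in canonical_relations.get(seed, ()):
            # Assumption: a seed's own score is never overwritten.
            if neighbour not in expanded:
                expanded[neighbour] = related_score
    return expanded


relations = {
    "Tim Cook": ["Apple Inc.", "Steve Jobs"],
    "Apple Inc.": ["iPhone"],  # two hops from the seed, so not reached
}
expanded = expand_one_hop({"Tim Cook": 1.0}, relations)
```

Only the seeds are iterated, so the expansion stops after a single hop: "iPhone" is not added even though it is connected to an expanded entity.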

Stage 1d — ChunkNode Retrieval via HAS_ENTITY

Each seed entity is then used to look up the chunks it came from: the system follows the HAS_ENTITY edges from every entity to its chunk nodes — scoped to the current space — and returns those chunks along with their parent text nodes.

The HAS_ENTITY edges are written at ingestion time — each entity is linked directly to the specific chunk(s) from which it was extracted by GLiNER.
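The lookup can be illustrated in memory, without a graph database. In this sketch the `HasEntityEdge` tuple and `chunks_for_entities` function are hypothetical, parent text nodes are omitted for brevity, and two behaviours are assumptions rather than facts from the text: a chunk inherits the score of the entity that led to it, and a chunk reached from several entities keeps the highest of those scores.

```python
from typing import NamedTuple


class HasEntityEdge(NamedTuple):
    """Hypothetical in-memory stand-in for a HAS_ENTITY edge."""
    entity: str
    chunk_id: str
    space_id: str


def chunks_for_entities(
    entity_scores: dict[str, float],
    edges: list[HasEntityEdge],
    space_id: str,
) -> dict[str, float]:
    """Collect the chunks linked to the given entities within one space."""
    chunk_scores: dict[str, float] = {}
    for edge in edges:
        if edge.space_id != space_id or edge.entity not in entity_scores:
            continue
        score = entity_scores[edge.entity]
        # Assumption: keep the best score when several entities share a chunk.
        if edge.chunk_id not in chunk_scores or score > chunk_scores[edge.chunk_id]:
            chunk_scores[edge.chunk_id] = score
    return chunk_scores


edges = [
    HasEntityEdge("Tim Cook", "c1", "s1"),
    HasEntityEdge("Apple Inc.", "c1", "s1"),
    HasEntityEdge("Apple Inc.", "c2", "s1"),
    HasEntityEdge("Tim Cook", "c3", "s2"),  # different space, filtered out
]
entity_chunks = chunks_for_entities({"Tim Cook": 1.0, "Apple Inc.": 0.7}, edges, "s1")
```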


Path 2: Chunk Path

Stage 2a — TopicCluster Coarse Filter

The query is encoded with MPNet (768-dim) and searched against the TopicCluster_summary Qdrant collection. Up to chunk_inner_top_k (default: 30) TopicClusters are retrieved and post-filtered by user_id/space_ids.

TopicClusters are LLM-generated semantic summaries — denser and more semantically coherent than raw chunk text, improving recall for thematic queries.

Stage 2b — ChunkNode Expansion via Neo4j

From the matching topic clusters, the system walks the graph to the text nodes tagged with each cluster and on to their chunks, returning those chunks in document order.

Each ChunkNode is scored by its parent TopicCluster's cosine similarity — the same [0,1] range as entity path scores, making them directly comparable in MMR.
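As a rough sketch of this scoring rule, the function below propagates each cluster's similarity down to its chunks. The data shapes and the name `score_chunks_by_cluster` are hypothetical, and taking the highest similarity when a chunk belongs to more than one matching cluster is an assumption; the text does not say how that case is resolved.

```python
def score_chunks_by_cluster(
    cluster_hits: list[tuple[str, float]],
    cluster_chunks: dict[str, list[str]],
) -> dict[str, float]:
    """Give each chunk the cosine similarity of its parent TopicCluster.

    cluster_hits: (cluster_id, cosine_similarity) pairs from vector search.
    cluster_chunks: cluster_id -> chunk ids, in document order.
    """
    scored: dict[str, float] = {}
    for cluster_id, similarity in cluster_hits:
        for chunk_id in cluster_chunks.get(cluster_id, []):
            # Assumption: a chunk under several clusters keeps the best score.
            if chunk_id not in scored or similarity > scored[chunk_id]:
                scored[chunk_id] = similarity
    return scored


hits = [("t1", 0.82), ("t2", 0.55)]
chunks = {"t1": ["c1", "c2"], "t2": ["c2", "c3"]}
chunk_scores = score_chunks_by_cluster(hits, chunks)
```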


MMR Fusion

When strategy is "both", merge_context_node receives both ChunkNode pools and passes them through Maximum Marginal Relevance (MMR). MMR scores each candidate c by its relevance to the query, weighted by a factor lambda, minus its highest similarity to any passage s already selected, weighted by one minus lambda:

MMR(c) = lambda * relevance(c) - (1 - lambda) * max over selected s of similarity(c, s)

The result rewards passages that are both relevant and non-redundant.

By default it keeps the top 8 passages with lambda set to 0.6 (leaning toward relevance). Jaccard similarity is computed on word sets, and any candidate whose similarity to an already-selected passage exceeds 0.72 is treated as a near-duplicate and skipped regardless of its relevance score.
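The selection loop can be sketched as below, using the defaults quoted above. This is an illustration under stated assumptions, not the merge_context_node implementation: it uses word-set Jaccard similarity both for the diversity penalty and for the near-duplicate check, tokenises by lowercased whitespace splitting, and the names `jaccard` and `mmr_select` are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two passages."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def mmr_select(candidates, top_k=8, lam=0.6, dup_threshold=0.72):
    """Greedy MMR over (text, relevance) pairs with near-duplicate skipping."""
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < top_k:
        best, best_score = None, float("-inf")
        for text, relevance in remaining:
            max_sim = max((jaccard(text, s) for s, _ in selected), default=0.0)
            if max_sim > dup_threshold:
                continue  # near-duplicate of a selected passage: skip
            score = lam * relevance - (1 - lam) * max_sim
            if score > best_score:
                best, best_score = (text, relevance), score
        if best is None:
            break  # everything left is a near-duplicate
        selected.append(best)
        remaining.remove(best)
    return selected


pool = [
    ("Tim Cook is the chief executive of Apple", 0.95),
    ("Tim Cook is the chief executive of Apple Inc", 0.94),
    ("Apple reported record services revenue this quarter", 0.80),
]
passages = mmr_select(pool)
```

Here the second passage shares 8 of 9 distinct words with the first (Jaccard of about 0.89), so it is dropped despite its high relevance, and the less relevant but distinct third passage is kept.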


Score Compatibility

A key design constraint: entity path and chunk path scores must be comparable for MMR to work. Without normalisation, raw RRF scores (~0.013–0.033) would always lose to cosine similarity scores (typically 0.3–0.9), so the entity path would be effectively muted.

The min-max normalisation in Stage 1b resolves this. Related entity chunks (expanded via CanonicalRelation) receive a fixed score of 0.7 — tunable via retrieval.related_entity_chunk_score.


Configuration

The retrieval and fusion behaviour is fully configurable — the number of seed entities, the RRF constant, the fixed score given to related-entity chunks, the various top-K limits for clusters and chunks, and the MMR settings (how many passages to keep, the relevance-versus-diversity balance, and the near-duplicate threshold) are all tunable parameters.
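One way to picture these parameters is as a single settings object. In the sketch below only `related_entity_chunk_score` and `chunk_inner_top_k` are names taken from the text; every other field name is hypothetical. Defaults are included only where the article states them, so the seed entity count, whose default is not given, is a required argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalConfig:
    """Illustrative grouping of the tunable retrieval parameters."""
    seed_entity_top_k: int                    # no default stated in the text
    rrf_k: int = 60                           # RRF constant
    related_entity_chunk_score: float = 0.7   # score for expanded entities
    chunk_inner_top_k: int = 30               # TopicClusters retrieved
    mmr_top_k: int = 8                        # passages kept after MMR
    mmr_lambda: float = 0.6                   # relevance vs diversity
    mmr_dup_threshold: float = 0.72           # Jaccard near-duplicate cut-off

    def __post_init__(self):
        # Basic sanity checks; the real system's validation is unknown.
        for name in ("related_entity_chunk_score", "mmr_lambda", "mmr_dup_threshold"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


config = RetrievalConfig(seed_entity_top_k=5)  # 5 is an arbitrary example value
```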