Precision
Cross-Encoder Reranking
A cross-encoder reranker improves top-k relevance by scoring each query-passage pair jointly, rather than comparing embeddings that were computed independently.
Where reranking fits
Use semantic + lexical retrieval for broad candidate generation, then apply reranking to a small candidate set. This improves precision without the cost of cross-encoding the full corpus.
Reranking is most useful for ambiguous or compositional queries.
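The two-stage flow described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `rerank`, `ScoreFn`, and `overlap_score` are hypothetical names, and the token-overlap scorer is only a stand-in so the example runs without a model. In practice the scorer would wrap a cross-encoder that returns one relevance score per (query, passage) pair.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer type: takes (query, passage) pairs, returns one score per pair.
ScoreFn = Callable[[Sequence[Tuple[str, str]]], Sequence[float]]


def rerank(
    query: str,
    candidates: Sequence[str],
    score_fn: ScoreFn,
    max_candidates: int = 32,
) -> List[str]:
    """Rerank only the first `max_candidates` passages.

    `candidates` is assumed to be ordered by the upstream fusion stage.
    Passages beyond the budget keep their original order after the reranked head.
    """
    head = list(candidates[:max_candidates])
    tail = list(candidates[max_candidates:])
    if not head:
        return tail
    scores = score_fn([(query, passage) for passage in head])
    # Highest score first; ties keep the fusion-stage order because sorted() is stable.
    order = sorted(range(len(head)), key=lambda i: scores[i], reverse=True)
    return [head[i] for i in order] + tail


def overlap_score(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Toy stand-in scorer (NOT a cross-encoder): counts shared lowercase tokens."""
    return [
        float(len(set(q.lower().split()) & set(p.lower().split())))
        for q, p in pairs
    ]


if __name__ == "__main__":
    fused = [
        "cats sleep a lot",
        "how to rerank search results",
        "rerank results with cross encoders",
    ]
    for passage in rerank("rerank search results", fused, overlap_score):
        print(passage)
```

Keeping the scorer behind a plain callable makes it straightforward to swap the toy function for a real model, or to bypass reranking entirely in tests.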
Candidate budgeting
- Retrieve 30-100 candidates from the fusion stage.
- Rerank the top 20-40 for latency-sensitive workloads.
- Expose a per-request override for evaluation runs.
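One way to combine a configured budget with a per-request override is sketched below. The function name `resolve_rerank_budget` and its parameters are assumptions made for illustration; only the `max_candidates` setting comes from this page's configuration.

```python
from typing import Optional


def resolve_rerank_budget(
    num_retrieved: int,
    configured_max: int = 32,
    override: Optional[int] = None,
) -> int:
    """Return how many fused candidates to pass to the reranker.

    `configured_max` mirrors the `max_candidates` setting. `override` is a
    hypothetical per-request value, e.g. supplied by an evaluation run.
    """
    limit = configured_max if override is None else override
    if limit < 0:
        raise ValueError("rerank budget must be non-negative")
    # Never ask for more candidates than the fusion stage returned.
    return min(num_retrieved, limit)


if __name__ == "__main__":
    print(resolve_rerank_budget(80))                # default budget
    print(resolve_rerank_budget(80, override=100))  # evaluation run reranks everything retrieved
```

An override of `0` skips reranking for that request without changing the shared configuration.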
reranking:
  enabled: true
  model: cross-encoder/ms-marco-MiniLM-L-6-v2
  max_candidates: 32

When to disable
Disable reranking for strict low-latency paths where lexical exact-match dominates query value, or when running tiny local benchmarks focused only on ingestion correctness.
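The conditions above could be expressed as a small gate in front of the reranking stage. This is a sketch under assumptions: `RequestContext` and its flags are hypothetical, and how a deployment detects an exact-match-dominated path or an ingestion-only benchmark is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class RequestContext:
    """Hypothetical per-request flags; all names are illustrative."""
    strict_low_latency: bool = False    # request is on a strict latency path
    exact_match_dominant: bool = False  # lexical exact match carries most of the query's value
    ingestion_benchmark: bool = False   # small local run checking ingestion correctness only


def should_rerank(config_enabled: bool, ctx: RequestContext) -> bool:
    """Decide whether to run the reranker for this request."""
    if not config_enabled:
        return False
    if ctx.strict_low_latency and ctx.exact_match_dominant:
        return False
    if ctx.ingestion_benchmark:
        return False
    return True


if __name__ == "__main__":
    print(should_rerank(True, RequestContext()))
    print(should_rerank(True, RequestContext(strict_low_latency=True, exact_match_dominant=True)))
```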