Vector RAG returns wrong doc when user asks for a specific section by number

Question

A retrieval pipeline built on OpenAI text-embedding-3-large returns confidently wrong chunks when the user query names a section or chapter (e.g. "summarize section 4.2"). The retriever ranks semantically similar content above the exact section match. Query rewriting, reranking with a cross-encoder, and a small keyword boost each help partially, but none reliably exceeds ~75% exact-match accuracy on section-by-number queries.

The corpus is ~20k chunks across 400 technical PDFs. Chunks are 512-token passages with a 64-token overlap between consecutive windows. Chunk metadata carries the document id and section number. We currently run hybrid retrieval (pgvector cosine similarity with a BM25 fallback), then rerank with bge-reranker-base.
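For reference, the chunking can be sketched roughly as below. This is a minimal illustration, not our ingestion code: `Chunk` and `chunk_tokens` are hypothetical names, tokens are a plain list rather than real tokenizer output, and the sketch assumes chunking is done per section so that each chunk carries exactly one section number.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Chunk:
    doc_id: str        # document id stored in chunk metadata
    section: str       # section number stored in chunk metadata, e.g. "4.2"
    tokens: List[str]  # token window for this chunk


def chunk_tokens(
    tokens: Sequence[str],
    doc_id: str,
    section: str,
    size: int = 512,
    overlap: int = 64,
) -> List[Chunk]:
    """Split one section's tokens into fixed-size windows that overlap."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    stride = size - overlap  # each window starts 448 tokens after the previous one
    chunks: List[Chunk] = []
    start = 0
    while start < len(tokens):
        chunks.append(Chunk(doc_id, section, list(tokens[start:start + size])))
        if start + size >= len(tokens):
            break  # this window already reached the end of the section
        start += stride
    return chunks
```

With these defaults, consecutive chunks share 64 tokens, and the last chunk of a section may be shorter than 512 tokens.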

Identify the root cause (chunking strategy, metadata routing, rerank weight, or query rewriting) and propose a concrete change that pushes exact-section-match accuracy above 95% without regressing semantic queries.
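To make the target concrete, here is a rough sketch of how exact-section-match accuracy could be computed. It assumes a top-1 definition (the highest-ranked chunk must come from the requested document and section); the function name and the `(doc_id, section)` pair layout are illustrative assumptions, not part of the existing pipeline.

```python
from typing import Iterable, Optional, Tuple

# (doc_id, section) identifying where a chunk comes from
SectionKey = Tuple[str, str]


def exact_section_match_accuracy(
    results: Iterable[Tuple[SectionKey, Optional[SectionKey]]],
) -> float:
    """Fraction of queries whose top-1 chunk matches the requested section.

    Each item is (expected, retrieved_top1); retrieved_top1 is None when
    the retriever returned nothing for that query.
    """
    total = 0
    hits = 0
    for expected, retrieved in results:
        total += 1
        if retrieved is not None and retrieved == expected:
            hits += 1
    return hits / total if total else 0.0
```

Under this definition, the current pipeline scores roughly 0.75 on section-by-number queries and the goal is above 0.95.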

The solution must stay under 400 ms p95 latency. We cannot rebuild the embedding index from scratch, and there is no fine-tuning budget for a custom embedder.
