Definition
RAG retrieval quality metrics quantify how effectively the retrieval step surfaces relevant documents. The two core metrics are recall@k (the fraction of all relevant documents that appear in the top-k results) and precision@k (the fraction of the top-k results that are relevant).
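The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and document IDs are hypothetical, retrieved IDs are assumed to be unique and ordered best-first, and precision divides by k even if fewer than k documents come back (one common convention).

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0  # assumption: recall is reported as 0 when nothing is relevant
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)


# Hypothetical ranked result list and ground-truth relevance labels.
retrieved = ["d1", "d7", "d3", "d9", "d4"]
relevant = {"d1", "d3", "d5", "d8"}

print(precision_at_k(retrieved, relevant, 5))  # 2 hits / 5 retrieved = 0.4
print(recall_at_k(retrieved, relevant, 5))     # 2 hits / 4 relevant  = 0.5
```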
In Depth
For a RAG system retrieving k=5 documents per query:
- Recall@5: of all relevant documents in the corpus, what fraction appeared in the top 5? Higher is better for coverage.
- Precision@5: of the 5 retrieved documents, what fraction were actually relevant? Higher is better for reducing noise injected into the LLM context.

Vector retrieval (embedding-based) excels at semantic similarity: finding documents that mean the same thing even with different words. Search API retrieval excels at keyword precision and recency: finding documents that contain specific terms published recently.

For queries about named entities (product names, company names, person names), search API retrieval typically achieves higher precision@5 because keyword matching is exact. For queries about concepts or topics described in varied vocabulary, vector retrieval typically achieves higher recall@5.

Hybrid retrieval (search API for initial candidate set, vector re-ranking for relevance ordering) outperforms either alone on standard RAG benchmarks. The practical tradeoff at scale: vector re-ranking adds 50-150ms and requires an embedding model. For most production RAG systems handling factual, entity-heavy queries, search API retrieval alone achieves sufficient precision@5 (>70%) without the embedding infrastructure overhead. For abstract, conceptual queries, hybrid retrieval is worth the added complexity.
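The hybrid pattern described above (keyword search for candidates, then vector re-ranking) might look roughly like the sketch below. Everything here is an assumption for illustration: `keyword_candidates` is a stand-in for a real search API call, and the two-dimensional vectors are hand-written placeholders for the output of an embedding model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def keyword_candidates(query, docs, limit):
    """Stand-in for a search API: rank docs by number of shared query terms."""
    terms = set(query.lower().split())
    scored = []
    for doc_id, text in docs.items():
        overlap = len(terms & set(text.lower().split()))
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by doc ID for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:limit]]


def hybrid_retrieve(query, query_vec, docs, doc_vecs, k=5, candidate_limit=20):
    """Stage 1: keyword candidates. Stage 2: re-rank them by cosine similarity."""
    candidates = keyword_candidates(query, docs, candidate_limit)
    reranked = sorted(
        candidates,
        key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return reranked[:k]


# Hypothetical corpus and placeholder embeddings.
docs = {
    "a": "acme router warranty and reset policy",
    "b": "how to reset the acme router",
    "c": "garden hose repair",
}
doc_vecs = {"a": [0.2, 0.8], "b": [0.9, 0.1], "c": [0.1, 0.1]}

# "a" and "b" tie on keyword overlap; the vector stage moves "b" to the top.
print(hybrid_retrieve("acme router reset", [1.0, 0.0], docs, doc_vecs, k=2))
# ['b', 'a']
```

The re-ranking stage only ever sees the keyword candidates, so a document that shares no query terms (such as "c") cannot be recovered by the vector step. That is the tradeoff of using the search API as the first stage.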
Example Usage
A product FAQ RAG system tested search API retrieval (precision@5: 0.78, recall@5: 0.61) against vector retrieval (precision@5: 0.69, recall@5: 0.74) on 200 queries that named specific products. Search API retrieval won on precision while vector retrieval won on recall; the higher precision reduced the hallucination rate from 12% to 5%.
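Figures like these are usually averages over a labeled query set. The sketch below shows one way such a comparison could be computed; the function name, the query data, and the relevance labels are all made up for illustration and are unrelated to the 200-query test described above.

```python
def mean_metrics(runs, k=5):
    """Average precision@k and recall@k over (retrieved_ids, relevant_ids) pairs."""
    if not runs:
        raise ValueError("runs must contain at least one query")
    precisions, recalls = [], []
    for retrieved, relevant in runs:
        hits = len(set(retrieved[:k]) & relevant)
        precisions.append(hits / k)
        recalls.append(hits / len(relevant) if relevant else 0.0)
    return sum(precisions) / len(runs), sum(recalls) / len(runs)


# Two hypothetical queries, each with a ranked result list and labeled relevant set.
search_runs = [
    (["p1", "p2", "x1", "x2", "p3"], {"p1", "p2", "p3", "p4"}),  # 3 hits
    (["q1", "x3", "q2", "x4", "x5"], {"q1", "q2"}),              # 2 hits
]

mean_p, mean_r = mean_metrics(search_runs, k=5)
print(f"precision@5: {mean_p:.2f}, recall@5: {mean_r:.2f}")
# precision@5: 0.50, recall@5: 0.88
```

Running the same function over the result lists from each retriever, using the same queries and labels, gives directly comparable numbers.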
Platforms
RAG Retrieval Quality Metric is relevant across the following platforms, all accessible through Scavio's unified API:
Related Terms
Search-Augmented RAG
Search-augmented RAG is a retrieval-augmented generation pattern where live search API results replace a vector database...
SERP Grounding Accuracy
SERP grounding accuracy is the improvement in factual correctness achieved when an LLM's response is generated using liv...
Structured SERP Data
Structured SERP data is search engine results delivered as typed JSON fields — title, URL, snippet, position, price, rat...