How to Build Hybrid RAG with Local + API Search

Learn how to build a RAG system that searches a local document index first and falls back to a search API for questions outside the corpus.

RAG systems that only search a local index cannot answer questions outside their corpus. This tutorial builds a hybrid retrieval system: local document search for known-domain questions (fast, free, private) with automatic fallback to Scavio's search API for open-domain questions (current web data). The confidence threshold determines when to fall back, and source labels let the LLM attribute answers correctly.

Prerequisites

  • Python 3.8+ installed
  • A local search index (Meilisearch, Elasticsearch, or SQLite FTS)
  • requests library installed, plus the Python client for your index (Step 1 uses the meilisearch package)
  • A Scavio API key from scavio.dev

Walkthrough

Step 1: Set up the local search function

Define a function that searches your local index and returns results with confidence scores.

Python
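# Example with Meilisearch (replace with your own index):
import meilisearch

local_client = meilisearch.Client('http://localhost:7700')

def local_search(query: str, top_k: int = 3) -> list:
    # Meilisearch only includes _rankingScore in hits when showRankingScore is set;
    # without it every score would default to 0 and every query would fall back.
    results = local_client.index('docs').search(
        query, {'limit': top_k, 'showRankingScore': True})
    return [{
        'text': hit['content'],  # assumes documents store their body in a 'content' field
        'title': hit.get('title', ''),
        'score': hit.get('_rankingScore', 0),
        'source': 'local'
    } for hit in results['hits']]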
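
The prerequisites also list SQLite FTS as a local index. The following is a minimal sketch of an equivalent search function for that case, assuming your Python build ships SQLite with the FTS5 extension. The names build_index and sqlite_local_search are hypothetical, and the squashing of the bm25() rank into a 0 to 1 score is an assumption made here so the result can be compared against a threshold; it is not calibrated against Meilisearch's _rankingScore.

```python
import re
import sqlite3

def build_index(docs):
    """Create an in-memory FTS5 table from (title, content) pairs."""
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE VIRTUAL TABLE docs USING fts5(title, content)')
    conn.executemany('INSERT INTO docs (title, content) VALUES (?, ?)', docs)
    return conn

def sqlite_local_search(conn, query, top_k=3):
    # Quote each word so punctuation in the question cannot break FTS5 query syntax.
    terms = re.findall(r'\w+', query)
    if not terms:
        return []
    match = ' OR '.join(f'"{t}"' for t in terms)
    rows = conn.execute(
        'SELECT title, content, bm25(docs) FROM docs WHERE docs MATCH ? '
        'ORDER BY bm25(docs) LIMIT ?', (match, top_k)).fetchall()
    results = []
    for title, content, rank in rows:
        relevance = -rank  # bm25() is lower-is-better, so flip the sign
        results.append({
            'text': content,
            'title': title,
            'score': relevance / (1.0 + relevance),  # assumed squashing into [0, 1)
            'source': 'local',
        })
    return results

conn = build_index([
    ('Refunds', 'Our refund policy allows refund requests within 30 days'),
    ('Shipping', 'Orders ship within two business days'),
    ('Accounts', 'Reset your password from the account settings page'),
])
hits = sqlite_local_search(conn, 'refund policy?')
print([(h['title'], round(h['score'], 3)) for h in hits])
```

Because the result dicts have the same keys as local_search, this function could stand in for it in the later steps, though the threshold would need to be re-tuned for the different score scale.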

Step 2: Set up the API search function

Define a function that searches via Scavio when local results are insufficient.

Python
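import os
import requests

# Read the key from the environment so it never lands in source control.
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def api_search(query: str, platform: str = 'google') -> list:
    try:
        resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
            json={'platform': platform, 'query': query}, timeout=10)
        resp.raise_for_status()
        data = resp.json()
    except (requests.RequestException, ValueError):
        # Network failure, HTTP error, or non-JSON body: report no web results
        # so the caller can fall through instead of crashing.
        return []
    return [{
        'text': r.get('snippet', ''),
        'title': r.get('title', ''),
        'url': r.get('link', ''),
        'source': 'web'
    } for r in data.get('organic', [])[:3]]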
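
Keeping the response parsing separate from the HTTP call makes it possible to test without network access. The sketch below assumes the response shape used in the code above (an organic list whose items carry title, snippet, and link). The helper name parse_organic and the sample payload are hypothetical and are not taken from the Scavio documentation.

```python
def parse_organic(payload, top_k=3):
    """Convert a search response body into the tutorial's result dicts."""
    if not isinstance(payload, dict):
        return []
    results = []
    for item in payload.get('organic') or []:
        snippet = (item.get('snippet') or '').strip()
        if not snippet:
            continue  # a result with no text adds nothing to the LLM context
        results.append({
            'text': snippet,
            'title': item.get('title', ''),
            'url': item.get('link', ''),
            'source': 'web',
        })
        if len(results) == top_k:
            break
    return results

# Made-up payload for illustration only.
sample = {'organic': [
    {'title': 'Page A', 'snippet': 'First snippet.', 'link': 'https://example.com/a'},
    {'title': 'Page B', 'snippet': '   '},
    {'title': 'Page C', 'snippet': 'Second snippet.', 'link': 'https://example.com/c'},
]}
parsed = parse_organic(sample)
print([r['title'] for r in parsed])
```

Skipping empty snippets is a design choice of this sketch; the tutorial's own function keeps them as empty strings.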

Step 3: Build the hybrid retriever

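Return the local results when the top hit's score meets the confidence threshold; otherwise query the API. If the API also returns nothing, the source is reported as 'none' and any low-confidence local hits are passed through unchanged.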

Python
CONFIDENCE_THRESHOLD = 0.7

def hybrid_retrieve(query: str) -> dict:
    local_results = local_search(query)
    if local_results and local_results[0]['score'] >= CONFIDENCE_THRESHOLD:
        return {'source': 'local', 'results': local_results}
    web_results = api_search(query)
    if web_results:
        return {'source': 'web', 'results': web_results}
    return {'source': 'none', 'results': local_results or []}
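
The 0.7 threshold is a starting point, and the right value depends on your corpus and scoring function. One way to tune it, sketched here under the assumption that you have hand-labeled a handful of queries as answerable or not answerable from the local index, is to try each observed top score as a candidate and keep the one that routes the most queries correctly. The function name pick_threshold and the sample scores are hypothetical.

```python
def pick_threshold(samples):
    """samples: list of (top_score, in_corpus) pairs from hand-labeled queries.

    Returns (threshold, accuracy). Ties resolve to the lowest candidate,
    which favors answering from the local index.
    """
    if not samples:
        raise ValueError('need at least one labeled sample')
    best_threshold, best_correct = None, -1
    for t in sorted({score for score, _ in samples}):
        # A query is routed locally when its top score is >= t; that is the
        # right call exactly when the query is answerable from the corpus.
        correct = sum((score >= t) == in_corpus for score, in_corpus in samples)
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold, best_correct / len(samples)

# Made-up labels for illustration only.
labeled = [(0.92, True), (0.81, True), (0.74, True), (0.66, False),
           (0.58, True), (0.41, False), (0.30, False)]
threshold, accuracy = pick_threshold(labeled)
print(threshold, round(accuracy, 3))
```

With more labeled data you may prefer to weight the two kinds of mistake differently, since a wrong local answer and an unnecessary API call do not cost the same.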

Step 4: Format context for the LLM

Build a prompt context string with source labels for attribution.

Python
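def format_context(retrieval: dict) -> str:
    # 'none' means neither lookup was confident; any results it carries are
    # low-confidence local hits, so they must not be labeled as web results.
    labels = {'local': 'Internal docs', 'web': 'Web search'}
    source_label = labels.get(retrieval['source'], 'Internal docs (low confidence)')
    lines = [f'Source: {source_label}']
    for r in retrieval['results']:
        if r.get('url'):
            lines.append(f"- {r['title']}: {r['text']} (ref: {r['url']})")
        else:
            lines.append(f"- {r['title']}: {r['text']}")
    return '\n'.join(lines)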

# Use in your RAG prompt:
# context = format_context(hybrid_retrieve(user_question))
# prompt = f'{context}\n\nQuestion: {user_question}\nAnswer:'
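
The commented lines above can be folded into a small helper. The sketch below numbers each snippet so the model has something concrete to cite. The helper name build_prompt, the fallback label, and the instruction wording are assumptions for illustration; adjust them to your model and prompt style.

```python
def build_prompt(question, retrieval):
    """Assemble a RAG prompt with numbered, source-labeled context."""
    labels = {'local': 'Internal docs', 'web': 'Web search'}
    label = labels.get(retrieval['source'], 'Unverified')
    lines = []
    for i, r in enumerate(retrieval['results'], start=1):
        ref = f" ({r['url']})" if r.get('url') else ''
        lines.append(f"[{i}] {r.get('title', '')}: {r.get('text', '')}{ref}")
    if not lines:
        lines.append('(no results found)')
    context = '\n'.join(lines)
    return (
        f'Source: {label}\n{context}\n\n'
        'Answer using only the numbered context above and cite entries like [1]. '
        'If the context does not contain the answer, say so.\n\n'
        f'Question: {question}\nAnswer:'
    )

# Stubbed retrieval result in the shape returned by hybrid_retrieve.
example = {'source': 'web', 'results': [
    {'title': 'Stub page', 'text': 'Stub snippet.', 'url': 'https://example.com', 'source': 'web'},
]}
prompt = build_prompt('What does the stub say?', example)
print(prompt)
```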

Python Example

Python
import os
import requests

H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def hybrid_rag(query, local_index, threshold=0.7):
    # showRankingScore is required for Meilisearch to include _rankingScore in hits.
    local = local_index.search(query, {'limit': 3, 'showRankingScore': True})
    if local['hits'] and local['hits'][0].get('_rankingScore', 0) >= threshold:
        return {'source': 'local', 'context': [h['content'] for h in local['hits']]}
    web = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': 'google', 'query': query}, timeout=10).json()
    # Use .get so a result without a snippet does not raise KeyError.
    return {'source': 'web', 'context': [r.get('snippet', '') for r in web.get('organic', [])[:3]]}

JavaScript Example

JavaScript
async function hybridRag(query, localIndex, threshold = 0.7) {
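  // showRankingScore is required for Meilisearch to include _rankingScore in hits.
  const local = await localIndex.search(query, {limit: 3, showRankingScore: true});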
  if (local.hits?.length && local.hits[0]._rankingScore >= threshold) {
    return {source: 'local', context: local.hits.map(h => h.content)};
  }
  const web = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST', headers: {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'},
    body: JSON.stringify({platform: 'google', query})
  }).then(r => r.json());
  return {source: 'web', context: (web.organic || []).slice(0, 3).map(r => r.snippet)};
}

Expected Output

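A hybrid RAG retriever that searches local docs first and falls back to web search for out-of-corpus questions. The hybrid_retrieve function returns a dict whose source is 'local', 'web', or 'none' and whose results list holds the matching hits.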
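
To check this behavior without a running index or an API key, the routing logic can be exercised with stubbed search functions. The sketch below mirrors hybrid_retrieve from Step 3 but takes the two search functions as parameters, a change made here only so the example is self-contained. The names route, fake_local, fake_web, and no_web, along with the stub data, are hypothetical.

```python
def route(query, local_fn, web_fn, threshold=0.7):
    """Same decision logic as hybrid_retrieve, with injectable search functions."""
    local_results = local_fn(query)
    if local_results and local_results[0]['score'] >= threshold:
        return {'source': 'local', 'results': local_results}
    web_results = web_fn(query)
    if web_results:
        return {'source': 'web', 'results': web_results}
    return {'source': 'none', 'results': local_results or []}

def fake_local(query):
    # Pretend the corpus only knows about refunds.
    score = 0.9 if 'refund' in query.lower() else 0.2
    return [{'text': 'Refunds are accepted within 30 days.',
             'title': 'Refund policy', 'score': score, 'source': 'local'}]

def fake_web(query):
    return [{'text': 'Stub web snippet.', 'title': 'Stub page',
             'url': 'https://example.com', 'source': 'web'}]

def no_web(query):
    return []  # simulates an API outage or an empty result set

in_corpus = route('What is the refund policy?', fake_local, fake_web)
out_of_corpus = route('Who won the match last night?', fake_local, fake_web)
offline = route('Who won the match last night?', fake_local, no_web)
print(in_corpus['source'], out_of_corpus['source'], offline['source'])
# prints: local web none
```

In the offline case the results list still contains the low-confidence local hit, so downstream code should treat a 'none' source as unverified rather than as empty.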

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

The free tier includes 50 credits on signup, which is enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.
