How to Build Curated Search for AI Agents

Build a search wrapper that filters, ranks, and compresses results before feeding them to an AI agent. Cut token cost by about 70% and reduce hallucination.

Building curated search for AI agents means wrapping a search API with filtering, re-ranking, and compression so the agent receives only high-signal results instead of raw SERP noise. Raw search results waste tokens on ads, irrelevant domains, and verbose snippets that dilute agent reasoning. This tutorial builds a Python middleware that sits between your agent and the Scavio search API. It applies domain blocklists and allowlists, relevance scoring, and snippet truncation, with the goal of cutting token usage by about 70% while improving answer quality.

Prerequisites

  • Python 3.8+
  • requests library installed
  • Scavio API key from scavio.dev
  • Basic understanding of token counting (tiktoken optional)

Walkthrough

Step 1: Define domain filters and quality rules

Create allowlists and blocklists for domains. You can extend these with rules that filter other low-quality results, such as forum threads with no answers or paywalled content; the code in this step covers only the domain lists.

Python
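from urllib.parse import urlparse

# Domains whose results are dropped outright
BLOCKED_DOMAINS = [
    'pinterest.com', 'quora.com', 'facebook.com',
    'instagram.com', 'tiktok.com', 'twitter.com',
]

# Domains whose results get a ranking boost
TRUSTED_DOMAINS = [
    'docs.python.org', 'developer.mozilla.org', 'github.com',
    'stackoverflow.com', 'arxiv.org', 'huggingface.co',
]

def _matches(host, domain):
    # Match the domain itself or any subdomain, so 'nottwitter.com' does not match 'twitter.com'
    return host == domain or host.endswith('.' + domain)

def domain_score(url):
    # urlparse yields no hostname for empty or scheme-less URLs instead of raising
    host = (urlparse(url or '').hostname or '').lower()
    if any(_matches(host, b) for b in BLOCKED_DOMAINS):
        return -10
    if any(_matches(host, t) for t in TRUSTED_DOMAINS):
        return 5
    return 0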
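
Step 1 also mentions rules for low-quality results, such as unanswered forum threads or paywalled pages. One way to express such rules is a small predicate applied to each result before scoring. The sketch below is only an illustration and not part of the Scavio API: the function name `passes_quality_rules`, the marker phrases, and the minimum snippet length are all assumptions chosen for the example. It assumes each result is a dict with `title` and `snippet` keys, as in the rest of this tutorial.

```python
import re

# Hypothetical heuristics: these phrases and the threshold are illustrative
# assumptions, not values defined by the tutorial or by the Scavio API.
PAYWALL_MARKERS = ('subscribe to read', 'subscribe to continue', 'sign in to read')
UNANSWERED_PATTERN = re.compile(r'\b(0 answers|no answers yet)\b')
MIN_SNIPPET_CHARS = 40

def passes_quality_rules(result):
    """Return True if a result dict looks worth passing to the agent."""
    snippet = (result.get('snippet') or '').strip()
    text = f"{result.get('title') or ''} {snippet}".lower()
    if len(snippet) < MIN_SNIPPET_CHARS:
        return False  # too little text to be useful
    if any(marker in text for marker in PAYWALL_MARKERS):
        return False  # likely paywalled
    if UNANSWERED_PATTERN.search(text):
        return False  # forum thread without an answer
    return True

# Made-up sample results for demonstration
sample = [
    {'title': 'FastAPI deployment',
     'snippet': 'Run Uvicorn behind a process manager and a reverse proxy in production.'},
    {'title': 'How do I deploy FastAPI?',
     'snippet': '0 answers. Asked 2 days ago. I am trying to deploy my app to a server.'},
    {'title': 'Deploying APIs',
     'snippet': 'Subscribe to read the full article on deploying APIs at scale.'},
    {'title': 'Short', 'snippet': 'Too short.'},
]
kept = [r['title'] for r in sample if passes_quality_rules(r)]
print(kept)
```

Phrase matching like this is brittle, so treat the markers as a starting point and tune them against the results you actually see.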

Step 2: Build the search wrapper with compression

Create a function that calls Scavio, drops results from blocked domains, scores the rest by domain trust and original rank, and truncates each snippet to a character limit.

Python
import os
import requests

# Request headers; the API key is read from the SCAVIO_API_KEY environment variable
H = {'x-api-key': os.environ['SCAVIO_API_KEY'], 'Content-Type': 'application/json'}

def curated_search(query, max_results=5, max_snippet_chars=200):
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers=H, json={'query': query, 'country_code': 'us'}, timeout=30)
    resp.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
    results = resp.json().get('organic_results', [])
    scored = []
    for r in results:
        ds = domain_score(r.get('link', ''))
        if ds < 0:  # blocked domain
            continue
        # Truncate the snippet to cap the text sent to the agent
        snippet = (r.get('snippet') or '')[:max_snippet_chars]
        scored.append({
            'title': r.get('title', ''),
            'url': r.get('link', ''),
            'snippet': snippet,
            # Domain trust minus a rank penalty: results nearer the top score higher
            'score': ds - 0.5 * r.get('position', 10)
        })
    scored.sort(key=lambda x: x['score'], reverse=True)
    return scored[:max_results]
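
The list returned by `curated_search` still has to be turned into text for the agent's prompt. A minimal sketch of that last step follows, assuming results shaped like the dicts built above (`title`, `url`, `snippet`). The function name `format_for_agent` and the `max_chars` budget are hypothetical, and a character budget is only a rough proxy for tokens.

```python
def format_for_agent(results, max_chars=1200):
    """Render curated results as a compact numbered context block.

    Stops adding results once the character budget would be exceeded.
    """
    entries = []
    used = 0
    for i, r in enumerate(results, start=1):
        entry = f"[{i}] {r['title']}\n{r['url']}\n{r['snippet']}"
        cost = len(entry) + (2 if entries else 0)  # 2 chars for the blank-line separator
        if used + cost > max_chars:
            break
        entries.append(entry)
        used += cost
    return "\n\n".join(entries)

# Made-up sample data in the same shape curated_search returns
sample = [
    {'title': 'Deploying FastAPI', 'url': 'https://example.com/a',
     'snippet': 'Use a process manager.', 'score': 4.5},
    {'title': 'FastAPI in Docker', 'url': 'https://example.com/b',
     'snippet': 'Build a slim image.', 'score': 3.0},
]
print(format_for_agent(sample))
```

The score is deliberately left out of the rendered text: it has already done its job in ordering the results, and including it would spend tokens on a number the agent does not need.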

Step 3: Add relevance re-ranking

Score results by keyword overlap with the original query to push the most relevant results to the top.

Python
def relevance_score(query, result):
    query_terms = set(query.lower().split())
    text = f"{result['title']} {result['snippet']}".lower()
    matches = sum(1 for t in query_terms if t in text)
    return matches / max(len(query_terms), 1)

def curated_search_v2(query, max_results=5, max_snippet_chars=200):
    raw = curated_search(query, max_results=15, max_snippet_chars=max_snippet_chars)
    for r in raw:
        r['score'] += relevance_score(query, r) * 3
    raw.sort(key=lambda x: x['score'], reverse=True)
    return raw[:max_results]
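
Note that the `t in text` check in `relevance_score` is a substring test, so a short query term such as `to` also matches inside `tomato`. A possible refinement, sketched below, compares whole tokens and ignores a few common words. The function name `token_relevance` and the stopword list are assumptions for illustration, not part of the tutorial's code.

```python
import re

# Small illustrative stopword list (an assumption; extend it for real use)
STOPWORDS = {'a', 'an', 'the', 'to', 'how', 'in', 'of', 'for', 'and', 'on'}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r'[a-z0-9]+', text.lower())

def token_relevance(query, result):
    """Fraction of non-stopword query tokens that appear as whole tokens in the result."""
    query_terms = {t for t in tokenize(query) if t not in STOPWORDS}
    if not query_terms:
        return 0.0
    doc_terms = set(tokenize(f"{result['title']} {result['snippet']}"))
    return len(query_terms & doc_terms) / len(query_terms)

# Made-up result: the substring check would count 'to' as a match via 'Tomato'
result = {'title': 'Tomato soup recipes', 'snippet': 'Rapid weeknight cooking ideas.'}
print(token_relevance('how to deploy FastAPI to production', result))  # 0.0
print(token_relevance('tomato soup', result))                          # 1.0
```

To try it, swap `token_relevance` in for `relevance_score` inside `curated_search_v2`; the return value has the same 0 to 1 range, so the `* 3` weighting still applies.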

Step 4: Measure token savings

Compare token counts between raw and curated results to verify the compression ratio.

Python
def estimate_tokens(text):
    return len(text.split()) * 1.3  # rough estimate

def compare_token_usage(query):
    data = requests.post('https://api.scavio.dev/api/v1/search',
        headers=H, json={'query': query, 'country_code': 'us'}).json()
    raw_text = str(data.get('organic_results', []))
    raw_tokens = estimate_tokens(raw_text)

    curated = curated_search_v2(query)
    curated_text = str(curated)
    curated_tokens = estimate_tokens(curated_text)

    savings = (1 - curated_tokens / raw_tokens) * 100
    print(f'Raw: ~{int(raw_tokens)} tokens')
    print(f'Curated: ~{int(curated_tokens)} tokens')
    print(f'Savings: {savings:.0f}%')

compare_token_usage('how to deploy FastAPI to production')
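
The comparison above makes live API calls for each query. To check the arithmetic without spending credits, the same measurement can be run on saved data. The sketch below is a minimal offline variant: the `token_savings` helper and the sample records are made up for illustration, the extra fields in the raw record are invented stand-ins for response fields the agent never reads, and the word-count heuristic is the same rough estimate as above, so the percentage is indicative only.

```python
import json

def estimate_tokens(text):
    # Same rough heuristic as above: about 1.3 tokens per whitespace-separated word
    return len(text.split()) * 1.3

def token_savings(raw_results, curated_results):
    """Return (raw_tokens, curated_tokens, savings_percent) for two result lists."""
    raw_tokens = estimate_tokens(json.dumps(raw_results))
    curated_tokens = estimate_tokens(json.dumps(curated_results))
    savings = (1 - curated_tokens / raw_tokens) * 100
    return raw_tokens, curated_tokens, savings

# Made-up sample data; 'displayed_link' and 'extra' are invented field names
raw = [{
    'position': 1, 'title': 'Deploying FastAPI', 'link': 'https://example.com/a',
    'snippet': 'Run the app with a production server behind a reverse proxy and ' * 6,
    'displayed_link': 'example.com > a', 'extra': 'fields the agent never reads',
}]
# Keep only the fields the agent needs and truncate the snippet to 200 characters
curated = [{'title': r['title'], 'url': r['link'], 'snippet': r['snippet'][:200]} for r in raw]

raw_t, cur_t, pct = token_savings(raw, curated)
print(f'Raw: ~{int(raw_t)} tokens, curated: ~{int(cur_t)} tokens, savings: {pct:.0f}%')
```

For figures you intend to rely on, replace `estimate_tokens` with a real tokenizer count, since whitespace word counts can differ noticeably from model token counts on JSON-like text.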

Python Example

Python
import os
from urllib.parse import urlparse

import requests

H = {'x-api-key': os.environ['SCAVIO_API_KEY'], 'Content-Type': 'application/json'}
BLOCKED = ['pinterest.com', 'quora.com', 'facebook.com']
TRUSTED = ['docs.python.org', 'developer.mozilla.org', 'github.com', 'stackoverflow.com']

def in_domains(host, domains):
    # True if host is one of the domains or a subdomain of one
    return any(host == d or host.endswith('.' + d) for d in domains)

def curated_search(query, max_results=5, max_chars=200):
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers=H, json={'query': query, 'country_code': 'us'}, timeout=30)
    resp.raise_for_status()  # surface HTTP errors early
    results = []
    query_terms = set(query.lower().split())
    for r in resp.json().get('organic_results', []):
        # urlparse yields no hostname for empty or scheme-less links instead of raising
        host = (urlparse(r.get('link') or '').hostname or '').lower()
        if in_domains(host, BLOCKED):
            continue
        snippet = (r.get('snippet') or '')[:max_chars]
        text = f"{r.get('title', '')} {snippet}".lower()
        # Share of query terms found in the title or snippet
        relevance = sum(1 for t in query_terms if t in text) / max(len(query_terms), 1)
        trust = 5 if in_domains(host, TRUSTED) else 0
        results.append({
            'title': r.get('title', ''),
            'url': r.get('link', ''),
            'snippet': snippet,
            'score': trust + relevance * 3
        })
    results.sort(key=lambda x: x['score'], reverse=True)
    return results[:max_results]

for r in curated_search('FastAPI deployment best practices 2026'):
    print(f"[{r['score']:.1f}] {r['title']}")
    print(f"  {r['url']}")

JavaScript Example

JavaScript
const H = {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'};
const BLOCKED = ['pinterest.com', 'quora.com', 'facebook.com'];
const TRUSTED = ['docs.python.org', 'developer.mozilla.org', 'github.com', 'stackoverflow.com'];

// True if host is one of the domains or a subdomain of one
const inDomains = (host, domains) => domains.some(d => host === d || host.endsWith('.' + d));

async function curatedSearch(query, maxResults = 5, maxChars = 200) {
  const resp = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST', headers: H,
    body: JSON.stringify({query, country_code: 'us'})
  });
  if (!resp.ok) throw new Error(`Search request failed: ${resp.status}`);
  const data = await resp.json();
  // Split on any whitespace and drop empty terms, matching Python's str.split()
  const queryTerms = new Set(query.toLowerCase().split(/\s+/).filter(Boolean));
  const results = (data.organic_results || []).map(r => {
    let host = '';
    try { host = new URL(r.link).hostname.toLowerCase(); } catch { /* missing or malformed URL */ }
    if (inDomains(host, BLOCKED)) return null;
    const snippet = (r.snippet || '').slice(0, maxChars);
    const text = `${r.title || ''} ${snippet}`.toLowerCase();
    // Share of query terms found in the title or snippet
    const relevance = [...queryTerms].filter(t => text.includes(t)).length / Math.max(queryTerms.size, 1);
    const trust = inDomains(host, TRUSTED) ? 5 : 0;
    return {title: r.title || '', url: r.link || '', snippet, score: trust + relevance * 3};
  }).filter(Boolean).sort((a, b) => b.score - a.score).slice(0, maxResults);
  return results;
}

curatedSearch('FastAPI deployment best practices 2026').then(results => {
  results.forEach(r => {
    console.log(`[${r.score.toFixed(1)}] ${r.title}`);
    console.log(`  ${r.url}`);
  });
});

Expected Output

Text
[8.0] FastAPI Deployment Guide - Official Docs
  https://docs.python.org/fastapi-deploy
[5.6] Deploy FastAPI on AWS Lambda in 2026
  https://github.com/example/fastapi-lambda
[3.0] Production FastAPI Checklist
  https://example.com/fastapi-prod

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

The free tier includes 50 credits on signup, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.

© 2026 Scavio. All rights reserved.
