How to Reduce Agent Search Token Count

Compress web search results before passing them to LLM agents. Cut token usage by 60-80% while preserving the information agents need to answer.

LLM agents that call web search tools often consume far more tokens than necessary, because a raw search response carries metadata, SERP features, and long snippets alongside the few fields the agent actually uses. Passing the full response into the agent's context window wastes tokens and money. This tutorial shows how to compress search results by extracting only the fields the agent needs, truncating snippets, dropping repeated results, and formatting the output as compact text. You will build a search compression layer that aims to cut token count by 60-80% while keeping information density high.

Prerequisites

  • Python 3.8+ installed
  • requests library installed
  • A Scavio API key from scavio.dev
  • An LLM agent that uses search tools

Walkthrough

Step 1: Fetch raw search results

Query the Scavio API and measure the size of the full response. Character count is used here as a simple proxy for token count.

Python
import json
import os

import requests

API_KEY = os.environ["SCAVIO_API_KEY"]
resp = requests.post("https://api.scavio.dev/api/v1/search",
    headers={"x-api-key": API_KEY},
    json={"platform": "google", "query": "best CRM for startups 2026"},
    timeout=30)
resp.raise_for_status()  # fail early on HTTP errors
raw = resp.json()
raw_size = len(json.dumps(raw))  # characters, a rough proxy for tokens
print(f"Raw response: {raw_size} chars")
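
The step above reports characters, not tokens. As a rough sketch, a character count can be turned into a token estimate with a fixed ratio. The helper name estimate_tokens and the ratio of about 4 characters per token are assumptions for illustration; real tokenizers vary by model and by content, so use your model's own tokenizer when you need exact numbers.

```python
import json

# Assumption: ~4 characters per token is a rough rule of thumb for English
# text. Real tokenizers differ by model and by content.
CHARS_PER_TOKEN = 4

def estimate_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Return a rough token estimate from character length, rounded up."""
    return -(-len(text) // chars_per_token)  # ceiling division

# Hypothetical stand-in for the `raw` response fetched in this step.
sample_raw = {
    "organic_results": [
        {"title": "Example CRM", "snippet": "A short snippet.", "link": "https://example.com"}
    ]
}
sample_json = json.dumps(sample_raw)
print(f"~{estimate_tokens(sample_json)} tokens for {len(sample_json)} chars")
```

Because the same ratio is applied before and after compression, the percentage reduction comes out the same whether you compare characters or estimated tokens.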

Step 2: Extract essential fields only

Strip the response down to only the fields an agent needs: title, snippet, and URL. Titles are capped at 80 characters and snippets at 200.

Python
def compress_results(data, max_results=5):
    # Keep only title, snippet, and URL for the first max_results organic results.
    results = []
    for r in data.get("organic_results", [])[:max_results]:
        results.append({
            # "or ''" also covers fields that are present but null
            "title": (r.get("title") or "")[:80],
            "snippet": (r.get("snippet") or "")[:200],
            "url": r.get("link") or "",
        })
    return results

compressed = compress_results(raw)
comp_size = len(json.dumps(compressed))
print(f"Compressed: {comp_size} chars ({100 - round(comp_size/raw_size*100)}% reduction)")
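
Slicing a snippet at a fixed character count can cut a word in half. A possible refinement, sketched here with a hypothetical helper named truncate_at_word, is to back up to the last space and append a short suffix so the agent can tell the text was shortened. This is one way to do it, not part of the Scavio API.

```python
def truncate_at_word(text, limit=200, suffix="..."):
    """Shorten text to at most `limit` characters, cutting at a space when possible."""
    if len(text) <= limit:
        return text
    room = limit - len(suffix)
    if room <= 0:
        # Limit too small for the suffix: fall back to a plain slice.
        return text[:limit]
    cut = text[:room]
    # Back up to the last space so no partial word is left at the end
    # (this may drop one complete word when the cut lands on a word boundary).
    if " " in cut:
        cut = cut[:cut.rfind(" ")]
    return cut.rstrip() + suffix

print(truncate_at_word("alpha beta gamma delta", limit=15))
```

To try it, swap the `[:200]` slice in compress_results for a call to this helper.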

Step 3: Format as compact text for agent context

Convert structured results to a minimal text format that uses fewer tokens than JSON.

Python
def format_for_agent(results):
    lines = []
    for i, r in enumerate(results, 1):
        lines.append(f"[{i}] {r['title']}")
        lines.append(f"    {r['snippet']}")
        lines.append(f"    {r['url']}")
    return "\n".join(lines)

agent_text = format_for_agent(compressed)
print(f"Agent text: {len(agent_text)} chars")
print(agent_text[:500])
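
To see where the savings over JSON come from, the sketch below serializes the same results both ways and compares lengths. The sample data is made up for illustration, and to_compact_text is a local copy of the layout used by format_for_agent so that the snippet runs on its own.

```python
import json

# Hypothetical sample results in the shape produced by compress_results.
sample = [
    {"title": "Example CRM A", "snippet": "Pipeline tracking for small teams.", "url": "https://example.com/a"},
    {"title": "Example CRM B", "snippet": "Contact management with email sync.", "url": "https://example.org/b"},
]

def to_compact_text(results):
    """Same layout as format_for_agent: numbered title, then snippet and URL."""
    lines = []
    for i, r in enumerate(results, 1):
        lines.append(f"[{i}] {r['title']}")
        lines.append(f"    {r['snippet']}")
        lines.append(f"    {r['url']}")
    return "\n".join(lines)

as_json = json.dumps(sample)
as_text = to_compact_text(sample)
saved = len(as_json) - len(as_text)
print(f"JSON: {len(as_json)} chars, text: {len(as_text)} chars, saved: {saved}")
```

The text form drops the repeated key names, quotes, and braces that JSON spends on every result, so the saving grows with the number of results.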

Step 4: Deduplicate overlapping results

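Drop repeated results so redundant content does not take up agent context. This version keeps only the first result from each domain, which is a coarse stand-in for true near-duplicate detection.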

Python
from urllib.parse import urlparse

def deduplicate(results):
    # Keep only the first result seen for each domain.
    seen_domains = set()
    unique = []
    for r in results:
        domain = urlparse(r["url"]).netloc
        if domain not in seen_domains:
            seen_domains.add(domain)
            unique.append(r)
    return unique

deduped = deduplicate(compressed)
print(f"After dedup: {len(deduped)} results (was {len(compressed)})")
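
Domain-based deduplication will not catch the same text syndicated across different sites. As a sketch of content-based filtering, the hypothetical helper below compares snippets with difflib.SequenceMatcher from the Python standard library. The 0.85 threshold is an assumption to tune on your own data, and the pairwise comparison is only practical for the handful of results kept per query.

```python
from difflib import SequenceMatcher

def drop_near_duplicates(results, threshold=0.85):
    """Keep a result only if its snippet is not too similar to one already kept."""
    kept = []
    for r in results:
        snippet = (r.get("snippet") or "").lower()
        if not snippet:
            # Nothing to compare, so keep the result rather than guess.
            kept.append(r)
            continue
        is_dup = any(
            SequenceMatcher(None, snippet, (k.get("snippet") or "").lower()).ratio() >= threshold
            for k in kept
        )
        if not is_dup:
            kept.append(r)
    return kept

# Made-up results: the first two snippets differ by one character.
sample = [
    {"snippet": "Top CRM tools for startups compared by price.", "url": "https://example.com/a"},
    {"snippet": "Top CRM tools for startups compared by price!", "url": "https://example.org/b"},
    {"snippet": "How to bake sourdough bread at home.", "url": "https://example.net/c"},
]
print(len(drop_near_duplicates(sample)))
```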

Step 5: Build the compression wrapper

Combine all compression steps into a single function that replaces the raw search call in your agent.

Python
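def agent_search(query, max_results=5):
    # Fetch, compress, deduplicate, and format in one call.
    resp = requests.post("https://api.scavio.dev/api/v1/search",
        headers={"x-api-key": API_KEY},
        json={"platform": "google", "query": query},
        timeout=30)
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    compressed = compress_results(resp.json(), max_results)
    # Deduplication runs after truncation, so fewer than max_results may be returned.
    deduped = deduplicate(compressed)
    return format_for_agent(deduped)

result = agent_search("best CRM for startups 2026")
print(f"Final token-efficient output: {len(result)} chars")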
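
The size of the wrapper's output still depends on how long each title, snippet, and URL is. If your agent needs a predictable upper bound, one option is a hard character budget that keeps whole result blocks and stops before the limit is exceeded. The helper fit_to_budget and the 1500-character default are hypothetical, shown only as a sketch.

```python
def fit_to_budget(blocks, max_chars=1500):
    """Join whole result blocks until the next one would exceed max_chars."""
    out = []
    used = 0
    for block in blocks:
        cost = len(block) + (1 if out else 0)  # +1 for the joining newline
        if used + cost > max_chars:
            break
        out.append(block)
        used += cost
    return "\n".join(out)

# Three made-up 10-character blocks against a 21-character budget:
# two blocks plus one newline fit exactly, the third does not.
blocks = ["a" * 10, "b" * 10, "c" * 10]
print(len(fit_to_budget(blocks, max_chars=21)))
```

To use this with the wrapper, build one text block per result (title, snippet, and URL lines joined together) and pass the list through fit_to_budget instead of joining every block unconditionally.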

Python Example

Python
import os, requests
API_KEY = os.environ["SCAVIO_API_KEY"]
resp = requests.post("https://api.scavio.dev/api/v1/search",
    headers={"x-api-key": API_KEY},
    json={"platform": "google", "query": "best CRM for startups 2026"},
    timeout=30)
resp.raise_for_status()  # fail early on HTTP errors
results = resp.json().get("organic_results", [])[:5]
for r in results:
    # .get avoids a KeyError when a result has no title or snippet
    print(f"{r.get('title', '')[:80]}\n  {r.get('snippet', '')[:150]}")

JavaScript Example

JavaScript
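// Uses top-level await, so run this as an ES module.
const resp = await fetch("https://api.scavio.dev/api/v1/search", {
  method: "POST",
  headers: {"x-api-key": process.env.SCAVIO_API_KEY, "Content-Type": "application/json"},
  body: JSON.stringify({platform: "google", query: "best CRM for startups 2026"})
});
if (!resp.ok) throw new Error(`Search request failed: ${resp.status}`);
const data = await resp.json();
// Fall back to empty strings so a missing title or snippet does not throw.
(data.organic_results || []).slice(0, 5).forEach(item =>
  console.log((item.title || "").slice(0, 80), "\n ", (item.snippet || "").slice(0, 150))
);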

Expected Output

A compressed text representation of search results that uses 60-80% fewer tokens than the raw JSON response while preserving all information an agent needs.

Related Tutorials

  • How to Stop Burning Claude Code Tokens on HTML Parsing
  • How to Audit Agent Token Usage per Tool

Frequently Asked Questions

How long does this tutorial take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What do I need to get started?

Python 3.8+, the requests library, a Scavio API key from scavio.dev, and an LLM agent that uses search tools. A Scavio API key gives you 50 free credits on signup.

Can I complete this tutorial on the free tier?

Yes. The free tier includes 50 credits on signup, which is more than enough to complete this tutorial and prototype a working solution.

Does this work with my agent framework?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.
