How to Set Up RAG with Search Instead of Vectors

Replace a vector database with a live search API for RAG. Simpler setup, fresher data, no embedding costs. Python implementation with Anthropic Claude.

Search-based RAG replaces a vector store with a live search API call. Instead of embedding documents and doing similarity search, you query the web (or a platform) for relevant content and inject the results directly into the LLM context.

Prerequisites

  • Python 3.9+
  • Scavio API key
  • Anthropic API key
  • anthropic and requests packages (pip install anthropic requests)

Walkthrough

Step 1: Understand the pattern difference

Traditional RAG: embed documents -> store vectors -> similarity search -> inject. Search RAG: query API -> get snippets -> inject. No embedding step, no vector DB.

Python
# Traditional vector RAG (what you're replacing)
# 1. Embed your corpus (expensive, slow)
# 2. Store in Pinecone/Chroma/Weaviate
# 3. Embed the query
# 4. Cosine similarity search
# 5. Retrieve top-k chunks
# 6. Inject into prompt

# Search RAG (this tutorial)
# 1. Convert question to search query
# 2. Call search API (1 API call, 1 credit)
# 3. Extract top snippets
# 4. Inject into prompt
# Done. Fresh data. No embedding costs.
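
The first step of the search pattern, converting the question to a search query, is not implemented in the later steps, where the retriever sends the question verbatim. A minimal sketch of one possible approach, assuming a simple filler-word filter is good enough for your questions. The `to_search_query` helper and the `FILLER_WORDS` list are hypothetical names for illustration and are not part of the Scavio API:

```python
import re

# Hypothetical filler-word list (an assumption); tune it for your own questions.
FILLER_WORDS = {
    "what", "are", "is", "the", "a", "an", "of", "in",
    "how", "do", "does", "i", "to", "which", "for",
}

def to_search_query(question: str, max_terms: int = 10) -> str:
    """Reduce a natural-language question to a compact keyword query."""
    # Keep word-like tokens (letters, digits, +, #, ., -) and drop punctuation such as "?".
    tokens = [t.rstrip(".-") for t in re.findall(r"[A-Za-z0-9][A-Za-z0-9+#.\-]*", question)]
    keywords = [t for t in tokens if t and t.lower() not in FILLER_WORDS]
    # Fall back to the original question if filtering removed everything.
    return " ".join(keywords[:max_terms]) or question.strip()

print(to_search_query("What are the latest AI models released in 2026?"))
# prints: latest AI models released 2026
```

Whether a shortened query retrieves better results than the full question depends on the search backend, so it is worth comparing both before adopting this step.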

Step 2: Build the retrieval function

The retriever takes a question, calls the search API, and returns a list of results, each with a title, snippet, and URL. Formatting happens in the next step.

Python
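from typing import Optional

import requests

# Replace with your own key (or load it from an environment variable).
SCAVIO_KEY = "your-scavio-api-key"

def retrieve(question: str, num_results: int = 5, platform: Optional[str] = None) -> list[dict]:
    # Build the request body; "platform" is only sent when the caller sets it.
    payload = {"query": question, "num_results": num_results}
    if platform:
        payload["platform"] = platform
    r = requests.post(
        "https://api.scavio.dev/api/v1/search",
        json=payload,
        headers={"x-api-key": SCAVIO_KEY},
        timeout=15
    )
    r.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    results = r.json().get("organic_results", [])
    # Keep only the fields the prompt needs.
    return [{"title": res["title"], "snippet": res.get("snippet", ""), "url": res["link"]} for res in results]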
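
The retriever above indexes `res["title"]` and `res["link"]` directly, so a single result missing either field raises a `KeyError`. A minimal sketch of a more tolerant parsing step that can be tested without a network call. The field names (`organic_results`, `title`, `snippet`, `link`) are taken from the tutorial code; the `parse_results` helper and the sample payload are made up for illustration:

```python
from typing import Dict, List

def parse_results(data: dict, limit: int = 5) -> List[Dict[str, str]]:
    """Turn a search response into title/snippet/url dicts, skipping unusable rows."""
    docs: List[Dict[str, str]] = []
    for res in data.get("organic_results", []):
        url = res.get("link")
        if not url:
            continue  # a result without a URL cannot be cited, so drop it
        docs.append({
            "title": res.get("title", url),  # fall back to the URL as a title
            "snippet": res.get("snippet", ""),
            "url": url,
        })
        if len(docs) >= limit:
            break
    return docs

# Made-up payload shaped like the response the tutorial code expects.
sample = {"organic_results": [
    {"title": "Example A", "snippet": "First snippet.", "link": "https://example.com/a"},
    {"title": "No link here"},
    {"link": "https://example.com/b"},
]}
print(parse_results(sample))
```

Keeping the parsing separate from the HTTP call also makes it easy to unit test against saved responses.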

Step 3: Format retrieved docs as context

Convert the search results into a context block that the LLM can reference.

Python
def format_context(docs: list[dict]) -> str:
    lines = []
    for i, doc in enumerate(docs, 1):
        lines.append(f"[{i}] {doc['title']}\nURL: {doc['url']}\n{doc['snippet']}")
    return "\n---\n".join(lines)
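
Snippets are usually short, but a larger `num_results` or long snippets can inflate the prompt. A minimal sketch of a variant that stops adding results once a character budget is reached. The `format_context_with_budget` name and the `max_chars` default are assumptions for illustration, and characters are only a rough proxy for tokens:

```python
from typing import Dict, List

def format_context_with_budget(docs: List[Dict[str, str]], max_chars: int = 4000) -> str:
    """Format docs like format_context, but stop before exceeding max_chars."""
    separator = "\n---\n"
    blocks: List[str] = []
    used = 0
    for i, doc in enumerate(docs, 1):
        block = f"[{i}] {doc['title']}\nURL: {doc['url']}\n{doc['snippet']}"
        # The separator is only paid for from the second block onward.
        cost = len(block) + (len(separator) if blocks else 0)
        if used + cost > max_chars:
            break  # results are ranked, so drop the tail rather than the head
        blocks.append(block)
        used += cost
    return separator.join(blocks)

docs = [
    {"title": "A", "url": "https://example.com/a", "snippet": "alpha"},
    {"title": "B", "url": "https://example.com/b", "snippet": "beta"},
]
print(format_context_with_budget(docs, max_chars=50))
```

With `max_chars=50` only the first result fits, because the second one plus the separator would push the total past the budget.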

Step 4: Generate answer with Anthropic Claude

Inject the retrieved context into the prompt and get a grounded answer.

Python
from typing import Optional

import anthropic

ANTHROPIC_KEY = "your-anthropic-key"

def rag_answer(question: str, platform: Optional[str] = None) -> dict:
    docs = retrieve(question, num_results=5, platform=platform)
    context = format_context(docs)
    prompt = f"""Use the following search results to answer the question. Cite sources by number.

{context}

Question: {question}
Answer:"""

    client = anthropic.Anthropic(api_key=ANTHROPIC_KEY)
    msg = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}]
    )
    return {"answer": msg.content[0].text, "sources": [d["url"] for d in docs]}

result = rag_answer("What are the latest AI models released in 2026?")
print(result["answer"])
print("\nSources:", result["sources"])
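
`rag_answer` returns every retrieved URL in `sources`, whether or not the model cited it. A minimal sketch of a post-processing step that keeps only the URLs whose `[n]` markers appear in the answer, assuming the model follows the "cite sources by number" instruction. The `cited_sources` helper and the sample answer are hypothetical:

```python
import re
from typing import List

def cited_sources(answer: str, sources: List[str]) -> List[str]:
    """Return the URLs whose [n] markers appear in the answer, in first-cited order."""
    cited: List[str] = []
    for match in re.finditer(r"\[(\d+)\]", answer):
        index = int(match.group(1)) - 1  # citations are 1-based
        # Ignore out-of-range numbers (the model may cite a source that does not exist).
        if 0 <= index < len(sources) and sources[index] not in cited:
            cited.append(sources[index])
    return cited

answer = "Pinecone is widely used [1], and pgvector suits Postgres teams [3]. See also [1] and [9]."
sources = ["https://example.com/one", "https://example.com/two", "https://example.com/three"]
print(cited_sources(answer, sources))
# prints: ['https://example.com/one', 'https://example.com/three']
```

An empty result is a useful signal in its own right: it suggests the answer was not grounded in the retrieved results.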

Python Example

Python
from typing import Optional

import requests
import anthropic

SCAVIO_KEY = "your-scavio-api-key"
ANTHROPIC_KEY = "your-anthropic-key"

def retrieve(question: str, n: int = 5, platform: Optional[str] = None) -> list[dict]:
    # One search call; "platform" is only sent when the caller sets it.
    payload = {"query": question, "num_results": n}
    if platform:
        payload["platform"] = platform
    r = requests.post(
        "https://api.scavio.dev/api/v1/search",
        json=payload,
        headers={"x-api-key": SCAVIO_KEY},
        timeout=15
    )
    r.raise_for_status()
    return [{"title": d["title"], "snippet": d.get("snippet", ""), "url": d["link"]}
            for d in r.json().get("organic_results", [])]

def format_context(docs: list[dict]) -> str:
    # Number each result so the model can cite it as [n].
    return "\n---\n".join(f"[{i}] {d['title']}\n{d['snippet']}\n{d['url']}" for i, d in enumerate(docs, 1))

def rag_answer(question: str, platform: Optional[str] = None) -> dict:
    docs = retrieve(question, n=5, platform=platform)
    context = format_context(docs)
    prompt = f"Use these search results to answer. Cite source numbers.\n\n{context}\n\nQuestion: {question}\nAnswer:"
    client = anthropic.Anthropic(api_key=ANTHROPIC_KEY)
    msg = client.messages.create(model="claude-sonnet-4-6", max_tokens=1024,
                                  messages=[{"role": "user", "content": prompt}])
    return {"answer": msg.content[0].text, "sources": [d["url"] for d in docs]}

if __name__ == "__main__":
    questions = [
        "What are the most popular vector databases in 2026?",
        "Latest AI coding assistants compared"
    ]
    for q in questions:
        result = rag_answer(q)
        print(f"Q: {q}")
        print(f"A: {result['answer'][:300]}...\n")
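
Every call to `retrieve` costs a search credit, so repeated questions during prototyping can add up. A minimal sketch of an in-memory cache with a time-to-live, written around an injected fetch function so it runs without network access. The `make_cached_retriever` helper, the 300-second default, and the stub `fake_fetch` are assumptions for illustration; a longer TTL saves credits at the cost of the freshness that motivates search-based RAG:

```python
import time
from typing import Callable, Dict, List, Tuple

def make_cached_retriever(fetch: Callable[[str], List[dict]], ttl_seconds: float = 300.0):
    """Wrap a retriever so identical questions within the TTL reuse earlier results."""
    cache: Dict[str, Tuple[float, List[dict]]] = {}

    def cached(question: str) -> List[dict]:
        key = question.strip().lower()  # treat case/whitespace variants as the same query
        now = time.monotonic()
        hit = cache.get(key)
        if hit is not None and now - hit[0] < ttl_seconds:
            return hit[1]
        docs = fetch(question)
        cache[key] = (now, docs)
        return docs

    return cached

# Stub standing in for retrieve(); it records calls instead of hitting the API.
calls: List[str] = []
def fake_fetch(question: str) -> List[dict]:
    calls.append(question)
    return [{"title": "Stub", "snippet": "", "url": "https://example.com"}]

cached_retrieve = make_cached_retriever(fake_fetch)
cached_retrieve("Latest AI coding assistants compared")
cached_retrieve("latest ai coding assistants compared ")
print(len(calls))
# prints: 1
```

In the full example, passing `retrieve` in place of `fake_fetch` would apply the same cache to live searches.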

JavaScript Example

JavaScript
const SCAVIO_KEY = 'your-scavio-api-key';
const ANTHROPIC_KEY = 'your-anthropic-key';

async function retrieve(question, n = 5, platform = null) {
  const payload = { query: question, num_results: n };
  if (platform) payload.platform = platform;
  const res = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'x-api-key': SCAVIO_KEY },
    body: JSON.stringify(payload)
  });
  // Mirror raise_for_status() in the Python version: fail loudly on HTTP errors.
  if (!res.ok) throw new Error(`Search request failed: ${res.status}`);
  const data = await res.json();
  return (data.organic_results ?? []).map(d => ({ title: d.title, snippet: d.snippet ?? '', url: d.link }));
}

function formatContext(docs) {
  return docs.map((d, i) => `[${i+1}] ${d.title}\n${d.snippet}\n${d.url}`).join('\n---\n');
}

async function ragAnswer(question, platform = null) {
  const docs = await retrieve(question, 5, platform);
  const context = formatContext(docs);
  const prompt = `Use these search results to answer. Cite source numbers.\n\n${context}\n\nQuestion: ${question}\nAnswer:`;

  const res = await fetch('https://api.anthropic.com/v1/messages', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'x-api-key': ANTHROPIC_KEY, 'anthropic-version': '2023-06-01' },
    body: JSON.stringify({ model: 'claude-sonnet-4-6', max_tokens: 1024, messages: [{ role: 'user', content: prompt }] })
  });
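  // Surface HTTP errors instead of failing later on a missing content field.
  if (!res.ok) throw new Error(`Anthropic request failed: ${res.status}`);
  const msg = await res.json();
  return { answer: msg.content[0].text, sources: docs.map(d => d.url) };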
}

// Top-level await requires an ES module (a .mjs file or "type": "module" in package.json).
const result = await ragAnswer('What are the most popular vector databases in 2026?');
console.log(result.answer);

Expected Output

Text
Based on the search results, the most popular vector databases in 2026 include:

1. Pinecone - serverless, widely used in production [1]
2. Weaviate - open source with hybrid search [2]
3. Qdrant - performance-focused, Rust-based [3]
4. Chroma - popular for local development [4]
5. pgvector - PostgreSQL extension for teams already using Postgres [5]

Sources: https://pinecone.io, https://weaviate.io, ...

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

You need Python 3.9+, a Scavio API key, and the anthropic SDK.

The free tier includes 50 credits on signup, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.

© 2026 Scavio. All rights reserved.
