How to Ground an LLM with GitHub Repo Data

Ground LLM answers in actual repo content by combining GitHub search via SERP site operators with Scavio's fetch endpoint.

An LLM that answers from a repository's actual source code is more reliable than one that explains the code from memory. This tutorial uses Scavio's SERP search with the site:github.com operator, plus its fetch endpoint, to bring repo content into the agent loop without a heavy GitHub API integration.

Prerequisites

  • Python 3.10+
  • A Scavio API key
  • An LLM API key

Walkthrough

Step 1: Search inside a repo via SERP

Prefixing the query with site:github.com/ORG/REPO restricts results to pages inside that repository, so the top hits are usually the files you want.

Python
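import os
import requests

API_KEY = os.environ['SCAVIO_API_KEY']

def repo_search(repo, query):
    # Scope the query to one repository, e.g. repo='prisma/prisma'.
    r = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'query': f'site:github.com/{repo} {query}', 'num_results': 10},
        timeout=30)
    r.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return r.json().get('organic_results', [])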
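
A repo-scoped search can also return issues, pull requests, and directory listings, which the next step cannot turn into raw file URLs. The sketch below filters results down to file views. It assumes each result is a dict with a 'link' key, as the later steps in this tutorial do; file_hits is a hypothetical helper name, not part of Scavio's API.

```python
from urllib.parse import urlparse

def file_hits(results, repo):
    """Keep only results that point at a file view (/blob/) inside `repo`."""
    repo_path = repo.strip('/').lower()
    prefix = f'/{repo_path}/blob/'
    kept = []
    for hit in results:
        parsed = urlparse(hit.get('link', ''))
        # Require the github.com host and an ORG/REPO/blob/ path prefix.
        if parsed.netloc.lower() == 'github.com' and parsed.path.lower().startswith(prefix):
            kept.append(hit)
    return kept
```

Calling file_hits(repo_search(repo, query), repo) before fetching avoids spending fetch credits on pages that are not source files.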

Step 2: Fetch the selected file

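Search results link to GitHub's HTML file view (/blob/ URLs). Convert each link to its raw.githubusercontent.com equivalent and pass it to Scavio's fetch endpoint to get the plain file contents.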

Python
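def fetch_raw(url):
    # Rewrite https://github.com/ORG/REPO/blob/REF/path to its raw.githubusercontent.com form.
    # count=1 leaves later occurrences intact, e.g. a directory that is itself named "blob".
    raw = url.replace('github.com', 'raw.githubusercontent.com', 1).replace('/blob/', '/', 1)
    r = requests.post('https://api.scavio.dev/api/v1/extract',
        headers={'x-api-key': API_KEY},
        json={'url': raw},
        timeout=30)
    r.raise_for_status()
    return r.json().get('content', '')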
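
The string replacement above assumes every link is a /blob/ file URL. A stricter conversion, sketched here with the standard library's urllib.parse, returns None for anything else so the caller can skip it. to_raw_url is a hypothetical helper name introduced for illustration.

```python
from urllib.parse import urlparse

def to_raw_url(url):
    """Convert a github.com /blob/ URL to raw.githubusercontent.com, or return None."""
    parsed = urlparse(url)
    parts = parsed.path.strip('/').split('/')
    # Expected path shape: owner / repo / "blob" / ref / file path...
    if parsed.netloc.lower() != 'github.com' or len(parts) < 5 or parts[2] != 'blob':
        return None
    owner, repo, _, ref, *path = parts
    return f'https://raw.githubusercontent.com/{owner}/{repo}/{ref}/' + '/'.join(path)
```

Only the third path segment is treated as the blob marker, so a directory named blob deeper in the path is preserved.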

Step 3: Ground the answer

Pass the fetched content into the LLM prompt together with its source URL, so the model can cite where the answer came from.

Python
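import anthropic
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def grounded_answer(repo, question):
    hits = repo_search(repo, question)
    # Use the top hit as the single source; fall back to empty context when nothing matches.
    source = hits[0]['link'] if hits else ''
    content = fetch_raw(source) if source else ''
    # Include the source URL so the model has something concrete to cite.
    prompt = (f'{question}\n\nCite the source URL in your answer.\n\n'
              f'SOURCE: {source}\nCONTEXT:\n{content[:4000]}')
    msg = client.messages.create(
        model='claude-sonnet-4-6',
        max_tokens=1024,
        messages=[{'role': 'user', 'content': prompt}])
    return msg.content[0].text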
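
Prompt assembly is easier to test when it is separated from the API call. The sketch below builds a prompt from (url, text) pairs under one shared character budget. build_prompt, the SOURCE: label, and the instruction wording are illustrative assumptions, not a format that Scavio or the LLM requires.

```python
def build_prompt(question, sources, max_chars=4000):
    """Build a prompt from (url, text) pairs, labelling each excerpt with its URL."""
    blocks = []
    remaining = max_chars  # character budget shared across all excerpts
    for url, text in sources:
        if remaining <= 0:
            break
        excerpt = text[:remaining]
        remaining -= len(excerpt)
        blocks.append(f'SOURCE: {url}\n{excerpt}')
    context = '\n\n'.join(blocks)
    return (
        f'{question}\n\n'
        'Answer using only the context below and cite the SOURCE URL of each file you rely on.\n\n'
        f'CONTEXT:\n{context}'
    )
```

The returned string can be passed as the user message content in client.messages.create.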

Step 4: Add multi-file composition

Pull the top 3 results, keep them in the order the search returned them (its relevance ranking), and join them into a single context string.

Python
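def multi_file_context(repo, question):
    # Top 3 hits, in the order the search API returned them.
    hits = repo_search(repo, question)[:3]
    # Label each excerpt with its URL so the model can cite it.
    return '\n\n'.join(f"SOURCE: {h['link']}\n{fetch_raw(h['link'])[:2000]}" for h in hits)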
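
The search API's ordering is the only ranking applied above. If you want an explicit reranking step over the fetched files, one minimal option is to score each file by how many distinct question terms it contains. This is a deliberately crude sketch: rank_by_overlap is a hypothetical helper, and term overlap is a stand-in for whatever relevance signal fits your use case.

```python
import re

def rank_by_overlap(question, docs):
    """Sort (url, text) pairs by the number of distinct question terms found in the text."""
    terms = set(re.findall(r'[a-z0-9_]+', question.lower()))

    def score(doc):
        words = set(re.findall(r'[a-z0-9_]+', doc[1].lower()))
        return len(terms & words)

    # sorted() is stable, so ties keep the search API's original order.
    return sorted(docs, key=score, reverse=True)
```

Common words such as "how" and "the" count as terms here; filtering stop words first would make the score more discriminating.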

Step 5: Validate citations

Ensure the LLM response mentions at least one source URL.

Python
def has_citations(answer, urls):
    return any(u in answer for u in urls)
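
The substring check above passes as soon as one known URL appears, and it does not notice URLs the model made up. A stricter sketch extracts every URL from the answer and separates those that were actually supplied as sources from those that were not. check_citations and the URL pattern are illustrative assumptions; the pattern is intentionally simple and not a full URL grammar.

```python
import re

# Match http(s) URLs up to whitespace or a closing bracket.
URL_RE = re.compile(r'https?://[^\s)\]>]+')

def check_citations(answer, allowed_urls):
    """Split URLs found in `answer` into known sources and unknown ones."""
    # Strip trailing sentence punctuation that the pattern picks up.
    cited = {u.rstrip('.,;:') for u in URL_RE.findall(answer)}
    allowed = set(allowed_urls)
    return {
        'cited': sorted(cited & allowed),
        'unknown': sorted(cited - allowed),
    }
```

An answer with an empty 'cited' list, or a non-empty 'unknown' list, is a reasonable candidate for a retry or a manual review.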

Python Example

Python
import os, requests
API_KEY = os.environ['SCAVIO_API_KEY']

def repo_grounded(repo, question):
    r = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'query': f'site:github.com/{repo} {question}'})
    return r.json().get('organic_results', [])[:3]

print(repo_grounded('prisma/prisma', 'migrate.ts'))

JavaScript Example

JavaScript
const API_KEY = process.env.SCAVIO_API_KEY;
export async function repoGrounded(repo, question) {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: `site:github.com/${repo} ${question}` })
  });
  return ((await r.json()).organic_results || []).slice(0, 3);
}

Expected Output

LLM answers cite exact files and code paths in the target repo. Grounded answers should hallucinate less than ungrounded ones, though the size of the effect depends on the repo and the questions asked.

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

You need Python 3.10+, a Scavio API key, and an LLM API key. Signing up for Scavio gives you 50 free credits.

Yes. The free tier includes 50 credits on signup, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.
