Build an LLM Wiki RAG Agent (2026)

An r/AI_Agents post asked which tools a Karpathy-style LLM Wiki needs: search, scraping, MCPs, and ingestion. This tutorial walks through the minimum stack, with costs verified online.

Prerequisites

Python 3.10+
Scavio API key
Qdrant Cloud free tier or self-hosted Qdrant
An LLM API (Claude/OpenAI/DeepSeek)

Walkthrough

Step 1: Discover sources via Scavio search

For a given topic, fetch the top web results (SERP), the top Reddit threads, and the top YouTube videos.

Python

import os
import requests

BASE = 'https://api.scavio.dev/api/v1'
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def post(path, body):
    # Fail fast on HTTP errors instead of parsing an error body as results.
    r = requests.post(f'{BASE}{path}', headers=H, json=body, timeout=30)
    r.raise_for_status()
    return r.json()

def discover(topic):
    # One query fanned out to web, Reddit, and YouTube search.
    return {
        'web': post('/search', {'query': topic}),
        'reddit': post('/reddit/search', {'query': topic}),
        'youtube': post('/youtube/search', {'query': topic}),
    }
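The response schema of the search endpoints is not shown in this tutorial, so the following is a schema-agnostic sketch rather than the API's documented shape. It walks whatever JSON `discover` returns and collects every http(s) string it finds, deduplicated and in order. The helper name `collect_urls` and the sample payload are illustrative assumptions.

```python
def collect_urls(obj, limit=10):
    """Walk nested JSON and return up to `limit` unique http(s) URLs, in document order."""
    found = []
    stack = [obj]
    while stack and len(found) < limit:
        node = stack.pop()
        if isinstance(node, dict):
            # Reverse so the first value is popped first.
            stack.extend(reversed(list(node.values())))
        elif isinstance(node, list):
            stack.extend(reversed(node))
        elif isinstance(node, str) and node.startswith(('http://', 'https://')):
            if node not in found:
                found.append(node)
    return found

# Hypothetical payload; real responses may use different keys.
sample = {'results': [{'title': 'A', 'url': 'https://a.example/x'},
                      {'title': 'B', 'link': 'https://b.example/y'},
                      {'title': 'A again', 'url': 'https://a.example/x'}]}
urls = collect_urls(sample)
```

Once the real field names are known, reading them directly is preferable, since a blind walk will also pick up thumbnail and tracking URLs.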

Step 2: Extract clean markdown for top sources

Per source, /extract returns markdown ready for embedding.

Python

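def extract(url):
    # Raise on HTTP errors so a failed fetch is never embedded as content.
    r = requests.post('https://api.scavio.dev/api/v1/extract',
        headers=H, json={'url': url, 'format': 'markdown'}, timeout=60)
    r.raise_for_status()
    return r.json()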

Step 3: Embed and store in Qdrant

Chunk markdown, embed, upsert with source URL as payload.

Python

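import uuid
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct
client = QdrantClient(url='https://your-qdrant.cloud')
# embed_fn = your embedding function (OpenAI/Cohere/Jina)

def store(chunks, source_url):
    # IDs derive from URL + position, so sources never overwrite each other.
    # The 'wiki' collection must already exist with a matching vector size.
    points = [PointStruct(
        id=str(uuid.uuid5(uuid.NAMESPACE_URL, f'{source_url}#{i}')),
        vector=embed_fn(chunk), payload={'text': chunk, 'url': source_url})
        for i, chunk in enumerate(chunks)]
    client.upsert(collection_name='wiki', points=points)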
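The chunking step is mentioned above but not shown. One possible approach, offered as a minimal sketch and not as the tutorial's own method, is to split the markdown on blank lines and pack paragraphs into chunks under a character budget. The function name `chunk_markdown` and the `max_chars` default are assumptions; tune the budget to your embedding model.

```python
def chunk_markdown(md, max_chars=1200):
    """Split markdown on blank lines, packing paragraphs into chunks of at most max_chars."""
    chunks, current = [], ''
    for para in (p.strip() for p in md.split('\n\n')):
        if not para:
            continue
        # Hard-split a single paragraph that is longer than the budget.
        while len(para) > max_chars:
            if current:
                chunks.append(current)
                current = ''
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        # Start a new chunk when adding this paragraph would exceed the budget.
        if current and len(current) + 2 + len(para) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f'{current}\n\n{para}' if current else para
    if current:
        chunks.append(current)
    return chunks

chunks = chunk_markdown('# Title\n\nFirst paragraph.\n\nSecond paragraph.', max_chars=30)
```

Splitting on paragraph boundaries keeps headings attached to the text that follows them whenever the budget allows, which tends to give each embedded chunk more context than a fixed-width split.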

Step 4: Query with citation prompt

LLM emits [N] markers tied to chunk source URLs.

Python

def answer(question, k=5):
    # llm = your LLM client; complete() stands in for its text-generation call.
    # query_points replaces the deprecated client.search in recent qdrant-client releases.
    hits = client.query_points(collection_name='wiki', query=embed_fn(question), limit=k).points
    sources = [{'i': i+1, 'text': h.payload['text'], 'url': h.payload['url']} for i, h in enumerate(hits)]
    prompt = f'Question: {question}\nSources:\n' + '\n'.join(f'[{s["i"]}] {s["url"]}: {s["text"][:300]}' for s in sources)
    prompt += '\nAnswer using only these sources, citing each claim with its [N] marker.'
    return llm.complete(prompt), sources
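A model can emit a marker that points at no retrieved source, for example [7] when only five were supplied. The sketch below shows one way to catch that before rendering. The helper name `check_citations` is an assumption, and it expects the answer as a plain string.

```python
import re

def check_citations(answer_text, n_sources):
    """Return (cited, invalid): marker numbers inside 1..n_sources and those outside it."""
    numbers = {int(n) for n in re.findall(r'\[(\d+)\]', answer_text)}
    cited = sorted(n for n in numbers if 1 <= n <= n_sources)
    invalid = sorted(n for n in numbers if not 1 <= n <= n_sources)
    return cited, invalid

cited, invalid = check_citations('Qdrant stores vectors [1][3]. Cost is low [7].', n_sources=5)
```

A non-empty `invalid` list is a reasonable trigger for a retry or for stripping the offending markers.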

Step 5: Render with clickable citations

[1] becomes a link to the source URL.

Python

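import re

def render(answer, sources):
    urls = {str(s['i']): s['url'] for s in sources}
    # Single pass: link known [N] markers, leave unknown numbers untouched.
    return re.sub(r'\[(\d+)\]',
        lambda m: f'[[{m[1]}]]({urls[m[1]]})' if m[1] in urls else m[0], answer)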
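Alongside inline links, a short reference list under the answer can help readers see which sources were used. This is a sketch under the assumption that `sources` has the same shape as in Step 4 (dicts with `i` and `url`); the name `sources_footer` is hypothetical.

```python
import re

def sources_footer(answer_text, sources):
    """List only the sources the answer actually cites, one markdown line each."""
    cited = {int(n) for n in re.findall(r'\[(\d+)\]', answer_text)}
    lines = [f'{s["i"]}. <{s["url"]}>' for s in sources if s['i'] in cited]
    return '\n'.join(lines)

example_sources = [{'i': 1, 'url': 'https://a.example'},
                   {'i': 2, 'url': 'https://b.example'}]
footer = sources_footer('Chunks are stored with their URL [2].', example_sources)
```

Because the pattern also matches the `[N]` inside an already rendered `[[N]](url)` link, the footer can be built before or after rendering.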

Python Example

Python

# Cost per question: ~5 search credits + ~3 extract credits + 1 LLM call = ~$0.04-0.10
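The estimate above is credits multiplied by a per-credit price, plus one LLM call. The sketch below makes that arithmetic explicit. The per-credit price and LLM cost used here are placeholder assumptions chosen for illustration, not published prices; substitute the numbers from your own plan.

```python
def cost_per_question(search_credits, extract_credits, price_per_credit, llm_cost):
    """Credits spent times the per-credit price, plus the cost of one LLM call (USD)."""
    return (search_credits + extract_credits) * price_per_credit + llm_cost

# Placeholder prices for illustration only; replace with your plan's real numbers.
estimate = cost_per_question(search_credits=5, extract_credits=3,
                             price_per_credit=0.005, llm_cost=0.02)
```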

JavaScript Example

JavaScript

// Same flow in TS using qdrant-js + Scavio fetch calls.

Expected Output

An LLM Wiki agent that pulls from Google, Reddit, and YouTube under a single Scavio key, embeds the content into Qdrant, and answers with clickable citations. Stack cost: Scavio $30 + Qdrant Cloud ~$25 + LLM tokens.

Related Tutorials

How to Build a RAG Pipeline with Citations Using Scavio

Frequently Asked Questions

How long does this tutorial take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (the free tier works) and a working Python or JavaScript environment.

What do I need before starting?

Python 3.10+, a Scavio API key, Qdrant Cloud free tier or self-hosted Qdrant, and an LLM API (Claude/OpenAI/DeepSeek). A Scavio API key gives you 50 free credits on signup.

Can I run this tutorial with the free tier?

Yes. The free tier includes 50 credits on signup, which is more than enough to complete this tutorial and prototype a working solution.

What frameworks does this work with?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.



Related Resources

Best Tools for LLM Wiki-Style RAG Stacks in 2026

Karpathy LLM Wiki-Style RAG Agent

LLM Wiki Research Stack

LLM Wiki Multi-Source Ingestion

LLM Wiki Ingestion Workflow

Best RAG Data Source Tools Without Firecrawl (2026)
