How to Build a Search Trust Score Pipeline

Build a pipeline that assigns trust scores to search results based on source authority, freshness, and cross-reference consistency.

Not all search results are equally trustworthy. A .gov domain citing primary data is more reliable than a content-farm blog post. This tutorial builds a trust scoring pipeline that evaluates each search result on source authority, content freshness, and cross-reference consistency. The scores help AI agents prioritize reliable sources and flag questionable ones. Cost: $0.005 per search, plus optional verification queries.

Prerequisites

  • Python 3.9+ installed
  • requests library installed
  • A Scavio API key from scavio.dev, exported as the SCAVIO_API_KEY environment variable

Walkthrough

Step 1: Define source authority tiers

Classify domains into authority tiers based on their TLD and known reputation. This provides a baseline trust signal.

Python
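from urllib.parse import urlparse

AUTHORITY_TIERS = {
    'tier1': {
        'domains': {'gov', 'edu', 'mil'},
        'known_sites': {'reuters.com', 'apnews.com', 'nature.com', 'science.org',
                        'arxiv.org', 'nih.gov', 'cdc.gov', 'who.int'},
        'score': 90
    },
    'tier2': {
        'domains': set(),
        'known_sites': {'nytimes.com', 'bbc.com', 'washingtonpost.com',
                        'github.com', 'stackoverflow.com', 'docs.python.org',
                        'developer.mozilla.org', 'microsoft.com'},
        'score': 75
    },
    'tier3': {
        'domains': {'org', 'io'},
        'known_sites': {'medium.com', 'dev.to', 'hackernoon.com', 'reddit.com'},
        'score': 50
    },
}

def get_authority_score(url: str) -> int:
    """Return the score of the first tier that matches the URL's host or TLD."""
    # urlparse copes with ports, paths, and query strings; '' if no host is found
    domain = (urlparse(url).hostname or '').lower()
    if domain.startswith('www.'):
        domain = domain[4:]  # treat www.reuters.com the same as reuters.com
    tld = domain.rsplit('.', 1)[-1]
    # Dicts keep insertion order, so tier1 is checked before tier2 and tier3.
    # This is why arxiv.org scores 90 even though the 'org' TLD sits in tier3.
    for tier in AUTHORITY_TIERS.values():
        if domain in tier['known_sites'] or tld in tier['domains']:
            return tier['score']
    return 30  # unknown domain baseline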

test_urls = ['https://nih.gov/study', 'https://github.com/repo',
             'https://randomsite.xyz/blog']
for url in test_urls:
    print(f'  {url}: authority={get_authority_score(url)}')
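The lookup in Step 1 compares the hostname against known_sites exactly, so a subdomain such as gist.github.com or uk.reuters.com is scored by its TLD alone. One possible extension is suffix matching, sketched below. The helper names host_of and matches_known_site are hypothetical and not part of the tutorial's code.

```python
from urllib.parse import urlparse

def host_of(url: str) -> str:
    """Return the lowercase hostname, or '' if none can be parsed."""
    # urlparse only recognizes a host after '//', so prefix bare inputs
    # such as 'github.com/repo' before parsing.
    if '://' not in url and not url.startswith('//'):
        url = '//' + url
    return (urlparse(url).hostname or '').lower()

def matches_known_site(host: str, known_sites: set) -> bool:
    """True if host equals a known site or is a subdomain of one."""
    # The leading dot prevents 'notgithub.com' from matching 'github.com'
    return any(host == site or host.endswith('.' + site) for site in known_sites)

known = {'reuters.com', 'github.com'}
for url in ['https://uk.reuters.com/world/', 'https://gist.github.com/abc',
            'https://notgithub.com/x', 'github.com/repo']:
    host = host_of(url)
    print(f'{host}: {matches_known_site(host, known)}')
```

Suffix matching also extends trust to user-controlled subdomains on hosting platforms, so it is a trade-off rather than a strict improvement.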

Step 2: Add freshness scoring

Score results based on how recently the content was published or updated. Extract dates from snippets and URLs.

Python
import re
from datetime import datetime

def get_freshness_score(snippet: str, url: str) -> int:
    """Score freshness from 0-100 based on the latest year detected."""
    text = snippet + ' ' + url
    # Find standalone four-digit years (2000-2099) in the snippet and URL.
    # Other numbers in that range, such as prices, can be mistaken for years.
    years = [int(y) for y in re.findall(r'\b20\d{2}\b', text)]
    if not years:
        return 20  # no date information found
    # Clamp at 0 so a forward-dated title (next year's guide) counts as current
    age = max(datetime.now().year - max(years), 0)
    if age == 0:
        return 100  # current year
    elif age == 1:
        return 70
    elif age == 2:
        return 40
    return 10  # three or more years old

test_snippets = [
    ('Updated May 2026 - Best CRM tools', 'https://site.com/crm-2026'),
    ('A comprehensive guide from 2024', 'https://site.com/old-guide'),
    ('Learn Python programming basics', 'https://site.com/python'),
]
for snippet, url in test_snippets:
    print(f'  freshness={get_freshness_score(snippet, url):3d}: {snippet[:50]}')
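The freshness score in Step 2 is driven by the year alone, so content from January and December of the same year scores identically. Where snippets carry a "Month YYYY" date, a finer-grained score is possible. The sketch below is one way to do it; the helper names and the decay of 4 points per month are illustrative assumptions, not values from the tutorial. The reference date is passed in explicitly so the result is reproducible.

```python
import re
from datetime import date
from typing import Optional

MONTHS = {m: i for i, m in enumerate(
    ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
     'jul', 'aug', 'sep', 'oct', 'nov', 'dec'], start=1)}

# Full or abbreviated month name, optional period, then a four-digit year.
# Spelling out each name avoids matching words like "market 2025".
MONTH_YEAR = re.compile(
    r'\b(jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|june?|july?|'
    r'aug(?:ust)?|sep(?:t(?:ember)?)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)'
    r'\.?\s+(20\d{2})\b', re.IGNORECASE)

def age_in_months(text: str, today: date) -> Optional[int]:
    """Months between the newest 'Month YYYY' mention and today, or None."""
    ages = []
    for month, year in MONTH_YEAR.findall(text):
        months = (today.year - int(year)) * 12 + (today.month - MONTHS[month[:3].lower()])
        ages.append(max(months, 0))  # treat future dates as current
    return min(ages) if ages else None

def monthly_freshness(text: str, today: date) -> int:
    """Assumed decay: 4 points per month of age, with a floor of 10."""
    age = age_in_months(text, today)
    if age is None:
        return 20  # same no-date baseline as get_freshness_score
    return max(100 - 4 * age, 10)

today = date(2026, 6, 1)
for text in ['Updated May 2026 - Best CRM tools', 'Published November 2024',
             'Learn Python programming basics']:
    print(f'{monthly_freshness(text, today):3d}: {text}')
```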

Step 3: Build the composite trust scoring pipeline

Combine authority, freshness, and cross-reference consistency into a single trust score for each search result.

Python
import os
import re
import requests

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']

def trust_score_results(query: str) -> list:
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us', 'num_results': 10},
        timeout=30)
    resp.raise_for_status()  # surface HTTP errors instead of scoring an empty list
    results = resp.json().get('organic_results', [])
    # Tokenize each snippet once: lowercased words of five or more characters
    keyword_sets = [set(re.findall(r'\b\w{5,}\b', r.get('snippet', '').lower()))
                    for r in results]
    scored = []
    for i, r in enumerate(results):
        link = r.get('link', '')
        authority = get_authority_score(link)
        freshness = get_freshness_score(r.get('snippet', ''), link)
        # Cross-reference: count other results sharing more than 3 keywords
        cross_ref = sum(1 for j, other in enumerate(keyword_sets)
                        if i != j and len(keyword_sets[i] & other) > 3)
        consistency = min(cross_ref * 20, 100)
        # Weighted composite: 40% authority, 30% freshness, 30% consistency
        trust = round(authority * 0.4 + freshness * 0.3 + consistency * 0.3)
        scored.append({
            'title': r.get('title', '')[:50], 'url': link,
            'trust_score': trust, 'authority': authority,
            'freshness': freshness, 'consistency': consistency
        })
    scored.sort(key=lambda x: x['trust_score'], reverse=True)
    return scored

results = trust_score_results('best CRM software 2026')
print(f'{"Score":>5} {"Auth":>5} {"Fresh":>5} {"Cross":>5}  Title')
print('-' * 70)
for r in results[:5]:
    print(f'{r["trust_score"]:>5} {r["authority"]:>5} {r["freshness"]:>5} '
          f'{r["consistency"]:>5}  {r["title"]}')
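The cross-reference step is easier to reason about, and to test without spending API credits, when it is pulled out into its own function. The sketch below mirrors the Step 3 logic (words of five or more characters, more than three shared words, 20 points per agreeing result); the function names and sample snippets are made up for illustration.

```python
import re

def keywords(text: str) -> set:
    """Lowercased words of five or more characters, as in Step 3."""
    return set(re.findall(r'\b\w{5,}\b', text.lower()))

def consistency_scores(snippets: list, min_overlap: int = 4) -> list:
    """Per snippet: 20 points for each other snippet sharing at least
    min_overlap keywords, capped at 100."""
    keyword_sets = [keywords(s) for s in snippets]
    scores = []
    for i, mine in enumerate(keyword_sets):
        agreeing = sum(1 for j, other in enumerate(keyword_sets)
                       if i != j and len(mine & other) >= min_overlap)
        scores.append(min(agreeing * 20, 100))
    return scores

sample = [
    'Salesforce leads enterprise pricing with advanced automation features',
    'Enterprise pricing: Salesforce offers advanced automation and reporting',
    'Ten gardening tips for growing tomatoes at home',
]
print(consistency_scores(sample))  # [20, 20, 0]
```

This measures shared vocabulary, not factual agreement: results on the same topic overlap naturally, and two pages repeating the same mistake still reinforce each other. Treat the number as a weak signal.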

Python Example

Python
import requests, os, re

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']
KNOWN = {'gov': 90, 'edu': 90, 'github.com': 75, 'stackoverflow.com': 75}

def trust_score(query):
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us', 'num_results': 10},
        timeout=30)
    resp.raise_for_status()  # stop on HTTP errors such as an invalid key
    for r in resp.json().get('organic_results', []):
        domain = r['link'].split('/')[2] if '/' in r['link'] else ''
        tld = domain.split('.')[-1]
        auth = KNOWN.get(domain, KNOWN.get(tld, 30))
        fresh = 100 if '2026' in r.get('snippet', '') else 40
        score = int(auth * 0.5 + fresh * 0.5)
        print(f'[{score:3d}] {r["title"][:50]}')

trust_score('python best practices 2026')
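The compact example fetches, scores, and prints in one function, which makes the scoring hard to check without a live request. A possible refactor, sketched here, moves the scoring into a pure function that can run on canned data. The name score_result and the sample results are hypothetical; the dictionary keys (title, link, snippet) are the ones the example above reads from organic_results.

```python
from urllib.parse import urlparse

KNOWN = {'gov': 90, 'edu': 90, 'github.com': 75, 'stackoverflow.com': 75}

def score_result(result: dict, current_year: str = '2026') -> int:
    """Score one result with the 50/50 authority/freshness weighting."""
    domain = urlparse(result.get('link', '')).hostname or ''
    tld = domain.split('.')[-1]
    # Exact hostname first, then TLD, then the unknown-domain baseline
    auth = KNOWN.get(domain, KNOWN.get(tld, 30))
    fresh = 100 if current_year in result.get('snippet', '') else 40
    return int(auth * 0.5 + fresh * 0.5)

# Canned results standing in for a live API response
canned = [
    {'title': 'Data guide', 'link': 'https://data.gov/guide', 'snippet': 'Updated 2026'},
    {'title': 'Repo', 'link': 'https://github.com/org/repo', 'snippet': 'Archived project'},
    {'title': 'Blog', 'link': 'https://randomsite.xyz/blog', 'snippet': ''},
]
for r in canned:
    print(f'[{score_result(r):3d}] {r["title"]}')
```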

JavaScript Example

JavaScript
const SCAVIO_KEY = process.env.SCAVIO_API_KEY;
const KNOWN = { gov: 90, edu: 90, 'github.com': 75, 'stackoverflow.com': 75 };

async function trustScore(query) {
  const resp = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, country_code: 'us', num_results: 10 })
  });
  for (const r of (await resp.json()).organic_results || []) {
    const domain = new URL(r.link).hostname;
    const tld = domain.split('.').pop();
    const auth = KNOWN[domain] || KNOWN[tld] || 30;
    const fresh = (r.snippet || '').includes('2026') ? 100 : 40;
    console.log(`[${Math.round(auth*0.5+fresh*0.5)}] ${r.title.slice(0, 50)}`);
  }
}

trustScore('python best practices 2026');

Expected Output

Illustrative output in the format printed by Step 3. Titles and scores will vary with live search results.

Text
Score  Auth Fresh Cross  Title
----------------------------------------------------------------------
   78    90   100    40  NIH Guidelines on Data Analysis 2026
   78    75   100    60  GitHub - python-best-practices: Updated May
   69    75    70    60  Stack Overflow: Python 3.14 New Features
   54    30   100    40  Best Python Practices 2026 - TechBlog
   36    30    40    40  Python Tips and Tricks - randomsite.com
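The introduction mentions prioritizing reliable sources and flagging questionable ones, but the tutorial leaves the cut-offs open. The sketch below applies the Step 3 weights (0.4 authority, 0.3 freshness, 0.3 consistency) and buckets the result. The function names and both thresholds (70 and 45) are assumptions for illustration and should be tuned against your own data.

```python
def composite_trust(authority: int, freshness: int, consistency: int) -> int:
    """Weighted composite using the Step 3 weights (0.4 / 0.3 / 0.3)."""
    return round(authority * 0.4 + freshness * 0.3 + consistency * 0.3)

def trust_label(score: int, reliable: int = 70, questionable: int = 45) -> str:
    """Bucket a score; both thresholds are illustrative assumptions."""
    if score >= reliable:
        return 'reliable'
    if score < questionable:
        return 'questionable'
    return 'review'

# (authority, freshness, consistency) triples chosen for illustration
for components in [(90, 70, 60), (75, 100, 20), (30, 20, 0)]:
    score = composite_trust(*components)
    print(components, score, trust_label(score))
```

For example, (90, 70, 60) gives 36 + 21 + 18 = 75, which lands in the reliable bucket under these thresholds.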

Frequently Asked Questions

How long does this tutorial take?
Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What do I need before starting?
Python 3.9+, the requests library, and a Scavio API key from scavio.dev. Signing up gives you 50 free credits.

Can I complete it on the free tier?
Yes. The free tier includes 50 credits on signup, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio work with my framework?
Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.
