
How to Evaluate MCP Servers for Data Quality

Build an automated evaluation harness to compare MCP search servers on freshness, coverage, and accuracy. Step-by-step with Python code.


Evaluate MCP servers for data quality by running a standardized set of test queries and scoring the results on coverage, authority, and freshness. Most MCP server comparisons focus on latency and uptime but ignore the quality of the data returned, which directly affects LLM output. This tutorial builds a scoring harness that runs a curated set of queries with known-good expectations against a search server, then produces a quality report. Authority (whether an expected domain appears in the top results) serves as a proxy for factual accuracy. The server under test is Scavio, queried through its REST search endpoint.

Prerequisites

  • Python 3.8+ installed
  • requests library installed
  • A Scavio API key from scavio.dev
  • A set of test queries with known expected results
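Before running the walkthrough, it can help to fail early with a readable message when the API key is missing. The following is a minimal sketch, assuming the key lives in the SCAVIO_API_KEY environment variable as in the steps below; load_api_key is a hypothetical helper name, not part of any Scavio SDK.

```python
import os


def load_api_key(env=None, name="SCAVIO_API_KEY"):
    """Return the API key from the environment or raise a readable error.

    load_api_key is a hypothetical helper used for illustration only.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the harness")
    return key
```

Passing a plain dict as env makes the check easy to exercise without touching the real environment.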

Walkthrough

Step 1: Define the evaluation dataset

Create a list of test queries paired with expected attributes like minimum result count and required domains.

Python
import os

import requests

API_KEY = os.environ['SCAVIO_API_KEY']

EVAL_SET = [
    {'query': 'python 3.13 release date', 'expected_domain': 'python.org', 'min_results': 3},
    {'query': 'react 19 new features', 'expected_domain': 'react.dev', 'min_results': 3},
    {'query': 'nvidia h200 price', 'expected_domain': 'nvidia.com', 'min_results': 2},
    {'query': 'fastapi latest version', 'expected_domain': 'fastapi.tiangolo.com', 'min_results': 3},
]
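A malformed test case (a missing key, an empty query, a non-positive minimum) would only surface later as a confusing score or a KeyError. One way to catch that up front is sketched here; validate_eval_set and REQUIRED_KEYS are hypothetical names, and the rules are assumptions based on the three fields used in EVAL_SET.

```python
# Fields every test case in the evaluation set is expected to carry.
REQUIRED_KEYS = {"query", "expected_domain", "min_results"}


def validate_eval_set(eval_set):
    """Return a list of human-readable problems; an empty list means the set looks usable."""
    problems = []
    for i, case in enumerate(eval_set):
        missing = REQUIRED_KEYS - set(case)
        if missing:
            problems.append(f"case {i}: missing keys {sorted(missing)}")
            continue
        if not str(case["query"]).strip():
            problems.append(f"case {i}: empty query")
        if not isinstance(case["min_results"], int) or case["min_results"] < 1:
            problems.append(f"case {i}: min_results must be a positive integer")
    return problems
```

Calling validate_eval_set(EVAL_SET) before sending any request and stopping when the list is non-empty keeps bad test data out of the report.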

Step 2: Run queries and collect results

Send each evaluation query through the Scavio API and record the raw results for scoring.

Python
def run_eval_query(test_case: dict) -> dict:
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'platform': 'google', 'query': test_case['query']}, timeout=15)
    resp.raise_for_status()
    results = resp.json().get('organic_results', [])
    return {
        'query': test_case['query'],
        'results': results,
        'expected_domain': test_case['expected_domain'],
        'min_results': test_case['min_results'],
    }
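Scoring rules tend to change more often than the raw data, so it may be convenient to store the collected results on disk and re-score them without querying again. This is a sketch under that assumption; save_results, load_results, and the file path are hypothetical, and the results are assumed to be JSON-serializable since they come from resp.json().

```python
import json
from pathlib import Path


def save_results(path, eval_results):
    """Write the list of raw eval results (as returned by run_eval_query) to a JSON file."""
    Path(path).write_text(json.dumps(eval_results, indent=2), encoding="utf-8")


def load_results(path):
    """Read results back; return an empty list if no cache file exists yet."""
    p = Path(path)
    if not p.exists():
        return []
    return json.loads(p.read_text(encoding="utf-8"))
```

A typical use would be to call save_results('raw_results.json', [run_eval_query(t) for t in EVAL_SET]) once, then iterate on the scoring function against load_results('raw_results.json').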

Step 3: Score each response

Score on three dimensions: coverage (the result count meets the minimum), authority (the expected domain appears in the top five results), and freshness (top-five snippets mention the current or previous year). The composite score is the unweighted mean of the three.

Python
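from datetime import date

def score_response(eval_result: dict) -> dict:
    results = eval_result['results']
    top = results[:5]
    # Coverage: 1.0 when enough results came back, otherwise the fraction returned.
    coverage = 1.0 if len(results) >= eval_result['min_results'] else len(results) / eval_result['min_results']
    # Authority: the expected domain appears in a top-five link (substring match on the URL).
    domain_found = any(eval_result['expected_domain'] in (r.get('link') or '') for r in top)
    authority = 1.0 if domain_found else 0.0
    # Freshness: top-five snippets mentioning the current or previous year; three or more scores 1.0.
    this_year = date.today().year
    years = (str(this_year), str(this_year - 1))
    year_mentions = sum(1 for r in top if any(y in (r.get('snippet') or '') for y in years))
    freshness = min(year_mentions / 3, 1.0)
    return {
        'query': eval_result['query'],
        'coverage': round(coverage, 2),
        'authority': authority,
        'freshness': round(freshness, 2),
        'composite': round((coverage + authority + freshness) / 3, 2),
    }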
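The authority check above tests whether the expected domain is a substring of the URL, which also accepts look-alike hosts (notpython.org) and URLs that merely mention the domain in a path or query string. A stricter variant compares hostnames instead. This is a sketch, not part of the original harness; host_matches is a hypothetical helper and it assumes each result's link is an absolute URL with a scheme.

```python
from urllib.parse import urlparse


def host_matches(link, expected_domain):
    """True if the link's hostname is expected_domain or one of its subdomains."""
    host = (urlparse(link).hostname or "").lower()
    expected = expected_domain.lower()
    return host == expected or host.endswith("." + expected)
```

To use it, the authority line could become any(host_matches(r.get('link', ''), eval_result['expected_domain']) for r in results[:5]). Links without a scheme have no parsed hostname and are treated as non-matching.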

Step 4: Generate the quality report

Run the full evaluation and print a summary report with per-query scores and an aggregate quality score.

Python
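def run_evaluation():
    scores = []
    for test in EVAL_SET:
        result = run_eval_query(test)
        score = score_response(result)
        scores.append(score)
        # One line per query: C = coverage, A = authority, F = freshness, then the composite.
        print(f'{score["query"][:40]:<42} C={score["coverage"]} A={score["authority"]} F={score["freshness"]} => {score["composite"]}')
    if not scores:
        # An empty EVAL_SET would otherwise cause a division by zero below.
        print('\nNo queries were scored.')
        return scores
    avg = round(sum(s['composite'] for s in scores) / len(scores), 2)
    print(f'\nAggregate quality score: {avg}/1.00')
    return scores

if __name__ == '__main__':
    run_evaluation()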
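The walkthrough scores a single server, but the same score dictionaries can be averaged per dimension and used to rank several servers side by side. The sketch below assumes you have already collected one list of score_response outputs per server; summarize, rank_servers, and the server labels are hypothetical names.

```python
from statistics import mean

# Keys produced by score_response for every scored query.
DIMENSIONS = ("coverage", "authority", "freshness", "composite")


def summarize(scores):
    """Average each dimension across all scored queries; empty input yields zeros."""
    if not scores:
        return {dim: 0.0 for dim in DIMENSIONS}
    return {dim: round(mean(s[dim] for s in scores), 2) for dim in DIMENSIONS}


def rank_servers(scores_by_server):
    """Return (label, summary) pairs sorted by mean composite score, best first."""
    summaries = {name: summarize(scores) for name, scores in scores_by_server.items()}
    return sorted(summaries.items(), key=lambda item: item[1]["composite"], reverse=True)
```

Looking at the per-dimension averages rather than only the composite shows where a server is weak, for example strong coverage but poor freshness.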

Python Example

Python
import os

import requests

H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def eval_query(query, expected_domain):
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': 'google', 'query': query}, timeout=15)
    resp.raise_for_status()  # fail loudly on HTTP errors, as in Step 2
    results = resp.json().get('organic_results', [])
    # Authority: does the expected domain appear in any of the top five links?
    found = any(expected_domain in r.get('link', '') for r in results[:5])
    return {'query': query, 'count': len(results), 'authority': found}

print(eval_query('python 3.13 release date', 'python.org'))

JavaScript Example

JavaScript
const H = {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'};
async function evalQuery(query, expectedDomain) {
  const resp = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST', headers: H, body: JSON.stringify({platform: 'google', query})
  });
  // Surface HTTP errors instead of scoring an error body as zero results.
  if (!resp.ok) throw new Error(`Search request failed: ${resp.status}`);
  const results = (await resp.json()).organic_results || [];
  // Authority: does the expected domain appear in any of the top five links?
  const found = results.slice(0, 5).some(item => item.link?.includes(expectedDomain));
  return {query, count: results.length, authority: found};
}
evalQuery('python 3.13 release date', 'python.org').then(console.log).catch(console.error);

Expected Output

Running the Step 4 script prints a plain-text quality report: one line per test query with its coverage (C), authority (A), and freshness (F) scores and the composite, followed by an aggregate quality score out of 1.00.

Frequently Asked Questions

How long does this tutorial take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What do I need before starting?

Python 3.8+, the requests library, a Scavio API key from scavio.dev, and a set of test queries with known expected results.

Is the free tier enough?

Yes. The free tier includes 50 credits on signup, which is more than enough to complete this tutorial and prototype a working solution.

Does this work with my framework?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.

© 2026 Scavio. All rights reserved.
