ScavioScavio
ProductPricingDocs
Sign InGet Started
  1. Home
  2. Tutorials
  3. How to Build a Web Extraction MCP Server
Tutorial

How to Build a Web Extraction MCP Server

Build an MCP server that exposes web search and page extraction as tools for AI agents. Uses Scavio search and extract endpoints.

Get Free API KeyAPI Docs

Build a web extraction MCP server by wrapping Scavio's search and extract endpoints as MCP tools that any compatible AI client can call. This gives agents the ability to both find relevant pages and extract structured content from them in a single workflow. The search tool discovers URLs via Google, Reddit, or YouTube queries, and the extract tool pulls clean text and metadata from any URL. This tutorial builds both tools as a single MCP server using the MCP SDK.

Prerequisites

  • Node.js 18+ installed
  • A Scavio API key from scavio.dev
  • Basic understanding of the MCP protocol
  • npm or bun package manager

Walkthrough

Step 1: Scaffold the MCP server

Set up a minimal MCP server project using the MCP SDK that registers two tools: search and extract.

// package.json:
// { "dependencies": { "@modelcontextprotocol/sdk": "latest" } }

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

const API_KEY = process.env.SCAVIO_API_KEY;
const server = new McpServer({ name: 'web-extraction', version: '1.0.0' });

Step 2: Add the search tool

Register a search tool that queries Scavio and returns structured results the agent can use to decide which pages to extract.

server.tool('search', 'Search Google, Reddit, YouTube, Amazon, or Walmart',
  { query: z.string(), platform: z.string().default('google') },
  async ({ query, platform }) => {
    const resp = await fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST',
      headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
      body: JSON.stringify({ platform, query }),
    });
    const data = await resp.json();
    return { content: [{ type: 'text', text: JSON.stringify(data.organic_results?.slice(0, 5) || [], null, 2) }] };
  }
);

Step 3: Add the extract tool

Register an extract tool that takes a URL and returns the page's clean text content via Scavio's extract endpoint.

server.tool('extract', 'Extract clean text content from a URL',
  { url: z.string().url() },
  async ({ url }) => {
    const resp = await fetch('https://api.scavio.dev/api/v1/extract', {
      method: 'POST',
      headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
      body: JSON.stringify({ url }),
    });
    const data = await resp.json();
    return { content: [{ type: 'text', text: data.text || data.content || JSON.stringify(data) }] };
  }
);

Step 4: Start the server and configure Claude

Connect the server via stdio transport and add it to your Claude Desktop or Claude Code MCP config.

// Start the server:
const transport = new StdioServerTransport();
await server.connect(transport);

// In .mcp.json:
// {
//   "mcpServers": {
//     "web-extraction": {
//       "command": "node",
//       "args": ["server.js"],
//       "env": { "SCAVIO_API_KEY": "your_key" }
//     }
//   }
// }
//
// Now Claude can: search for pages, then extract content from the best results.

Python Example

Python
import requests, os
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def search(query, platform='google'):
    return requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': platform, 'query': query}).json().get('organic_results', [])

def extract(url):
    return requests.post('https://api.scavio.dev/api/v1/extract', headers=H,
        json={'url': url}).json()

results = search('best crm 2026')
if results:
    content = extract(results[0]['link'])
    print(content.get('text', '')[:500])

JavaScript Example

JavaScript
const H = {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'};
async function search(query) {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST', headers: H, body: JSON.stringify({platform: 'google', query})
  });
  return (await r.json()).organic_results || [];
}
async function extract(url) {
  const r = await fetch('https://api.scavio.dev/api/v1/extract', {
    method: 'POST', headers: H, body: JSON.stringify({url})
  });
  return r.json();
}
const results = await search('best crm 2026');
if (results[0]) console.log((await extract(results[0].link)).text?.slice(0, 500));

Expected Output

JSON
An MCP server exposing search and extract tools that AI agents can call to discover URLs and extract their content in a single workflow.

Related Tutorials

  • How to Build an MCP Documentation Search Server
  • How to Extract Structured Data from Any Website

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

Node.js 18+ installed. A Scavio API key from scavio.dev. Basic understanding of the MCP protocol. npm or bun package manager. A Scavio API key gives you 50 free credits on signup.

Yes. The free tier includes 50 credits on signup, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.

Related Resources

Use Case

MCP Custom Search Server

Read more
Best Of

Best MCP Search Servers: Community Edition, May 2026

Read more
Best Of

Best MCP Search Tools for Claude Desktop in 2026

Read more
Use Case

IDE MCP Search

Read more
Comparison

Scavio MCP vs Built-in MCP web_search

Read more
Solution

Give AI Agents Multi-Source Search via MCP

Read more

Start Building

Build an MCP server that exposes web search and page extraction as tools for AI agents. Uses Scavio search and extract endpoints.

Get Free API KeyRead the Docs
ScavioScavio

Real-time search API for AI agents. Search every platform, not just Google.

Product

  • Features
  • Pricing
  • Dashboard
  • Affiliates

Developers

  • Documentation
  • API Reference
  • Quickstart
  • MCP Integration
  • Python SDK

Alternatives

  • Tavily Alternative
  • SerpAPI Alternative
  • Firecrawl Alternative
  • Exa Alternative

Tools

  • JSON Formatter
  • cURL to Code
  • Token Counter
  • All Tools

© 2026 Scavio. All rights reserved.

Featured on TAAFT
Terms of ServicePrivacy Policy