How to Build a YouTube Data Agent Pipeline

Build an AI agent that researches YouTube channels, analyzes video trends, and generates reports using Scavio YouTube search at $0.005/query.

YouTube search data reveals what content performs in any niche, which creators dominate, and what gaps exist. This pipeline builds an agent that researches a topic on YouTube, ranks channels by how often they appear across related searches, and outputs a competitive landscape report. Each YouTube search costs $0.005 through Scavio, so the report in Step 4 (5 landscape queries plus 6 subtopic queries) costs $0.055, and a full niche analysis stays under $0.10.
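
The cost figure can be checked with simple arithmetic. The sketch below assumes the flat $0.005 per query price quoted above and the query counts used later in this walkthrough. `estimate_cost` is a hypothetical helper for illustration, not part of any Scavio SDK.

```python
PRICE_PER_QUERY = 0.005  # USD, the per-query price quoted in this tutorial

def estimate_cost(n_angles, n_subtopics, price=PRICE_PER_QUERY):
    """Return (query_count, cost_in_usd) for one niche analysis."""
    queries = n_angles + n_subtopics
    return queries, round(queries * price, 3)

# 5 landscape angles (Step 2) plus 6 subtopics (Step 3)
queries, cost = estimate_cost(n_angles=5, n_subtopics=6)
print(f'{queries} queries -> ${cost:.3f}')
```

Doubling either count is a quick way to see how far a budget stretches before running any paid queries.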

Prerequisites

  • Python 3.8+
  • requests library
  • A Scavio API key from scavio.dev
  • Target niches or topics to research

Walkthrough

Step 1: Search YouTube for topic coverage

Query YouTube for videos on a topic and extract channel data.

Python
import os, json
from collections import defaultdict

import requests

API_KEY = os.environ['SCAVIO_API_KEY']
SH = {'x-api-key': API_KEY, 'Content-Type': 'application/json'}

def channel_name(r):
    # Guard against 'channel' being a string or None instead of a dict
    ch = r.get('channel')
    if isinstance(ch, dict):
        return ch.get('name', '')
    return ch or r.get('channel_name', '')

def youtube_search(query):
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=SH, timeout=30,
        json={'query': query, 'platform': 'youtube', 'country_code': 'us'})
    resp.raise_for_status()  # fail loudly on auth, quota, or server errors
    data = resp.json()
    # Fall back to 'video_results' when 'organic_results' is missing or empty
    results = (data.get('organic_results') or data.get('video_results') or [])[:10]
    return [{'title': r.get('title') or '', 'channel': channel_name(r),
             'views': r.get('views', 0), 'date': r.get('published_date') or '',
             'link': r.get('link') or ''} for r in results]

videos = youtube_search('python web scraping tutorial 2026')
print(f'Found {len(videos)} videos')
for v in videos[:5]:
    print(f'  {v["channel"]:25} | {v["title"][:50]}')
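
Step 2 only sums view counts that arrive as integers. The response schema is not shown in this tutorial, so if `views` came back as a display string such as "1.2M views" or "45,300 views" (an assumption), those videos would contribute nothing to the totals. A minimal sketch of a normalizer follows; `parse_views` is a hypothetical helper and the string formats it handles are guesses.

```python
import re

_SUFFIX = {'K': 1_000, 'M': 1_000_000, 'B': 1_000_000_000}

def parse_views(raw):
    """Best-effort conversion of a view count to int; returns 0 if unparseable."""
    if isinstance(raw, bool):
        return 0
    if isinstance(raw, (int, float)):
        return int(raw)
    if not isinstance(raw, str):
        return 0
    # A number with optional thousands separators and decimals, then an
    # optional K/M/B suffix that stands alone (so "12 million" is read as 12).
    m = re.search(r'(\d[\d,]*(?:\.\d+)?)\s*([KMB]\b)?', raw.strip(), re.IGNORECASE)
    if not m:
        return 0
    number = float(m.group(1).replace(',', ''))
    multiplier = _SUFFIX.get((m.group(2) or '').upper(), 1)
    return int(round(number * multiplier))

for raw in ['1.2M views', '45,300 views', 2340000, None]:
    print(repr(raw), '->', parse_views(raw))
```

Calling `parse_views(r.get('views'))` when building each video dict would let the later steps treat `views` as an integer everywhere.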

Step 2: Map the channel landscape

Search the topic from several angles (five query variations by default) and tally how often each channel appears in the results.

Python
def map_landscape(topic, angles=None):
    if not angles:
        angles = [f'{topic} tutorial', f'{topic} guide 2026', f'{topic} for beginners',
                  f'best {topic}', f'{topic} tips']
    channels = defaultdict(lambda: {'videos': 0, 'total_views': 0, 'titles': []})
    for angle in angles:
        videos = youtube_search(angle)
        for v in videos:
            ch = v['channel'] or 'Unknown'
            views = v.get('views')
            channels[ch]['videos'] += 1
            # Only sum numeric view counts; missing or string values count as 0
            channels[ch]['total_views'] += views if isinstance(views, int) else 0
            channels[ch]['titles'].append(v['title'][:60])
    cost = len(angles) * 0.005
    print(f'\nLandscape for "{topic}" ({len(angles)} queries, ${cost:.3f}):')
    sorted_ch = sorted(channels.items(), key=lambda x: x[1]['videos'], reverse=True)
    for name, data in sorted_ch[:10]:
        print(f'  {name:25} | {data["videos"]} videos | views: {data["total_views"]:,}')
    return dict(channels), cost

channels, cost = map_landscape('web scraping')
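
The tally above counts every appearance, so a video that ranks for two angles is counted twice. One way to express "consistency" more directly is to rank channels by how many distinct queries they appear in, with unique video links as a tiebreaker. The sketch below assumes video dicts shaped like the ones `youtube_search` returns. `rank_by_consistency` and the sample data are hypothetical.

```python
from collections import defaultdict

def rank_by_consistency(results_by_angle):
    """Rank channels by how many distinct queries they appear in.

    results_by_angle maps a query string to a list of video dicts with
    'channel' and 'link' keys.
    """
    angles_seen = defaultdict(set)  # channel -> queries it appeared in
    links_seen = defaultdict(set)   # channel -> unique video links
    for angle, videos in results_by_angle.items():
        for v in videos:
            ch = v.get('channel') or 'Unknown'
            angles_seen[ch].add(angle)
            if v.get('link'):
                links_seen[ch].add(v['link'])
    ranked = [(ch, len(angles), len(links_seen[ch]))
              for ch, angles in angles_seen.items()]
    # Breadth first, then unique videos, then name for a stable order
    return sorted(ranked, key=lambda t: (-t[1], -t[2], t[0]))

# Made-up sample data, for illustration only
sample = {
    'web scraping tutorial': [
        {'channel': 'Channel A', 'link': 'https://example.com/v1'},
        {'channel': 'Channel B', 'link': 'https://example.com/v2'},
    ],
    'web scraping for beginners': [
        {'channel': 'Channel A', 'link': 'https://example.com/v1'},  # same video again
        {'channel': 'Channel A', 'link': 'https://example.com/v3'},
    ],
}
for name, n_queries, n_videos in rank_by_consistency(sample):
    print(f'{name}: {n_queries} queries, {n_videos} unique videos')
```

To use it with live data, collect `{angle: youtube_search(angle)}` for each angle and pass that dict in.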

Step 3: Identify content gaps

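Find subtopics with low competition or outdated coverage. A subtopic counts as a gap when its search returns fewer than 5 videos (low competition) or fewer than 2 videos dated 2025 or 2026 (outdated content).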

Python
def find_gaps(topic, subtopics, recent_years=('2025', '2026')):
    gaps = []
    for sub in subtopics:
        query = f'{topic} {sub}'
        videos = youtube_search(query)
        # A video is "recent" if its date string mentions one of recent_years
        recent = [v for v in videos if any(y in (v.get('date') or '') for y in recent_years)]
        total = len(videos)
        print(f'  {sub:25} | {total} results | {len(recent)} recent')
        # Flag thin coverage (under 5 results) or stale coverage (under 2 recent)
        if total < 5 or len(recent) < 2:
            gaps.append({'subtopic': sub, 'total': total, 'recent': len(recent),
                'opportunity': 'low competition' if total < 5 else 'outdated content'})
    print(f'\nGaps found: {len(gaps)}')
    for g in gaps:
        print(f'  {g["subtopic"]}: {g["opportunity"]} ({g["total"]} total, {g["recent"]} recent)')
    return gaps

subtopics = ['playwright', 'scrapy', 'beautifulsoup', 'selenium', 'httpx', 'curl_cffi']
gaps = find_gaps('web scraping', subtopics)
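
The recency check above looks for a four-digit year in the `date` string. If the API returned relative dates such as "3 months ago" instead (an assumption, since the response format is not shown here), every video would be counted as not recent. A rough sketch that handles both styles follows; `is_recent`, its 550-day default, and the date formats are all hypothetical.

```python
import re
from datetime import date

# Approximate day counts; sub-day units are treated as "today"
_DAYS_PER_UNIT = {'second': 0, 'minute': 0, 'hour': 0,
                  'day': 1, 'week': 7, 'month': 30, 'year': 365}

def is_recent(raw, max_age_days=550, today=None):
    """Return True if a date string looks newer than max_age_days.

    Handles relative strings ('3 months ago') and any string containing
    a four-digit year ('Mar 4, 2025', '2025-03-04').
    """
    if not isinstance(raw, str) or not raw:
        return False
    today = today or date.today()
    rel = re.search(r'(\d+)\s+(second|minute|hour|day|week|month|year)s?\s+ago',
                    raw, re.IGNORECASE)
    if rel:
        age_days = int(rel.group(1)) * _DAYS_PER_UNIT[rel.group(2).lower()]
        return age_days <= max_age_days
    year = re.search(r'\b(?:19|20)\d{2}\b', raw)
    if year:
        # Coarse year-only comparison: whole years converted to days
        return (today.year - int(year.group(0))) * 365 <= max_age_days
    return False

fixed_today = date(2026, 1, 15)
for raw in ['3 months ago', '2 years ago', 'Mar 4, 2025', '2023-06-01']:
    print(f'{raw!r}: {is_recent(raw, today=fixed_today)}')
```

With the default threshold, a year-only date counts as recent when it falls in the current or previous calendar year, which matches the 2025/2026 check in `find_gaps` when run in 2026.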

Step 4: Generate the landscape report

Combine channel mapping and gap analysis into a report.

Python
def generate_report(topic, subtopics=None):
    if not subtopics:
        subtopics = ['beginner', 'advanced', 'tools', 'automation', 'api']
    print(f'=== YouTube Landscape Report: {topic} ===')
    channels, map_cost = map_landscape(topic)
    print(f'\n--- Content Gaps ---')
    gaps = find_gaps(topic, subtopics)
    gap_cost = len(subtopics) * 0.005
    total_cost = map_cost + gap_cost
    top_channels = sorted(channels.items(), key=lambda x: x[1]['videos'], reverse=True)[:5]
    report = {
        'topic': topic,
        'total_channels': len(channels),
        'top_5': [{'name': n, 'videos': d['videos']} for n, d in top_channels],
        'gaps': gaps,
        'cost': total_cost
    }
    with open(f'yt_report_{topic.replace(" ", "_")}.json', 'w') as f:
        json.dump(report, f, indent=2)
    print(f'\nTotal cost: ${total_cost:.3f}. Saved report.')
    return report

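# Reuse the subtopics list from Step 3 so the gap analysis covers the scraping libraries
generate_report('web scraping', subtopics)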
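
The saved JSON is convenient for other programs, but a short text summary is easier to paste into a message or a ticket. The sketch below formats a report dict with the same keys that `generate_report` builds (`topic`, `total_channels`, `top_5`, `gaps`, `cost`). `render_summary` is a hypothetical helper and the sample values are made up.

```python
def render_summary(report):
    """Format a landscape report dict as a plain-text summary."""
    lines = [
        f"YouTube landscape: {report['topic']}",
        f"Channels found: {report['total_channels']}",
        'Top channels:',
    ]
    for ch in report.get('top_5', []):
        lines.append(f"  - {ch['name']} ({ch['videos']} videos)")
    gaps = report.get('gaps', [])
    lines.append(f'Content gaps: {len(gaps)}')
    for g in gaps:
        lines.append(f"  - {g['subtopic']}: {g['opportunity']}")
    lines.append(f"Query cost: ${report['cost']:.3f}")
    return '\n'.join(lines)

# Made-up report, for illustration only
example = {
    'topic': 'web scraping',
    'total_channels': 2,
    'top_5': [{'name': 'Channel A', 'videos': 3}, {'name': 'Channel B', 'videos': 1}],
    'gaps': [{'subtopic': 'httpx', 'total': 3, 'recent': 1, 'opportunity': 'low competition'}],
    'cost': 0.055,
}
print(render_summary(example))
```

The same function works on a report loaded back from disk with `json.load`, since the JSON file stores the dict unchanged.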

Python Example

Python
import os, requests
from collections import Counter
SH = {'x-api-key': os.environ['SCAVIO_API_KEY'], 'Content-Type': 'application/json'}

def yt_landscape(topic):
    channels = Counter()
    for suffix in ['tutorial', 'guide 2026', 'for beginners']:
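        resp = requests.post('https://api.scavio.dev/api/v1/search', headers=SH, timeout=30,
            json={'query': f'{topic} {suffix}', 'platform': 'youtube', 'country_code': 'us'})
        resp.raise_for_status()  # surface auth or quota errors instead of counting nothing
        for r in resp.json().get('organic_results', [])[:10]:
            ch = r.get('channel')
            # 'channel' is expected to be a dict; fall back gracefully if it is not
            channels[(ch.get('name') if isinstance(ch, dict) else ch) or 'Unknown'] += 1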
    print(f'{topic} YouTube landscape ({len(channels)} channels):')
    for ch, count in channels.most_common(5):
        print(f'  {ch}: {count} videos')

yt_landscape('web scraping')
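
Network calls can fail transiently, and in the example above a connection error or a non-JSON response would raise and stop the run. A minimal retry sketch with exponential backoff follows. `with_retries` is a hypothetical helper; catching every `Exception` is a simplification that a real pipeline would likely narrow to network and HTTP errors.

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn() until it succeeds or attempts run out.

    Waits base_delay, then 2x, then 4x, ... between attempts and
    re-raises the last exception if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))

# Usage sketch (not executed here, since it would spend API credits):
# with_retries(lambda: yt_landscape('web scraping'))
```

The `sleep` parameter exists so the delay can be stubbed out in tests. Note that each retry of a request that reached the API may be billed as a separate query.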

JavaScript Example

JavaScript
const SH = { 'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json' };
async function ytLandscape(topic) {
  const channels = {};
  for (const suffix of ['tutorial', 'guide 2026', 'for beginners']) {
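    const res = await fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST', headers: SH,
      body: JSON.stringify({ query: `${topic} ${suffix}`, platform: 'youtube', country_code: 'us' })
    });
    // Surface auth or quota errors instead of silently counting nothing
    if (!res.ok) throw new Error(`Scavio request failed: ${res.status}`);
    const data = await res.json();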
    for (const r of (data.organic_results || []).slice(0, 10)) {
      const ch = r.channel?.name || 'Unknown';
      channels[ch] = (channels[ch] || 0) + 1;
    }
  }
  const sorted = Object.entries(channels).sort((a,b) => b[1]-a[1]);
  console.log(`${topic}: ${sorted.length} channels`);
  sorted.slice(0, 5).forEach(([ch, n]) => console.log(`  ${ch}: ${n} videos`));
}
ytLandscape('web scraping').catch(console.error);

Expected Output

Text
=== YouTube Landscape Report: web scraping ===

Landscape for "web scraping" (5 queries, $0.025):
  Tech With Tim              | 4 videos | views: 2,340,000
  Corey Schafer              | 3 videos | views: 1,890,000
  John Watson Rooney         | 3 videos | views: 456,000
  NetworkChuck               | 2 videos | views: 3,200,000
  Fireship                   | 2 videos | views: 5,100,000

--- Content Gaps ---
  playwright                 | 8 results | 3 recent
  scrapy                     | 9 results | 2 recent
  beautifulsoup              | 10 results | 1 recent
  httpx                      | 3 results | 1 recent
  curl_cffi                  | 2 results | 0 recent

Gaps found: 3
  beautifulsoup: outdated content (10 total, 1 recent)
  httpx: low competition (3 total, 1 recent)
  curl_cffi: low competition (2 total, 0 recent)

Total cost: $0.055. Saved report.

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

You need Python 3.8+, the requests library, a Scavio API key from scavio.dev, and a list of target niches or topics to research. A Scavio API key gives you 50 free credits on signup.

Yes. The free tier includes 50 credits on signup, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.
© 2026 Scavio. All rights reserved.