Deep Research Daily Pipeline

Daily deep research agent pipeline using search, extraction, and structured analysis. Replace multi-tool stacks with one API.

Overview

This workflow runs a daily deep research pipeline that searches across Google, Reddit, and YouTube for target topics, extracts key content from top results, and compiles structured research briefs. It replaces a multi-tool stack (Serper + Jina + E2B) with a single API for search and extraction.

Trigger

Cron schedule (daily at 5:00 AM UTC)

Schedule

Runs daily at 5:00 AM UTC
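
In standard five-field cron syntax this trigger is `0 5 * * *`, provided the scheduler's clock is set to UTC. Where no cron daemon is available, a long-running process could compute the delay itself. The following is a minimal sketch of that idea; the function name `seconds_until_next_run` is illustrative and not part of Scavio.

```python
from datetime import datetime, time, timedelta, timezone

# 5:00 AM UTC, matching the schedule stated above.
RUN_AT = time(hour=5, minute=0, tzinfo=timezone.utc)

def seconds_until_next_run(now: datetime) -> float:
    """Return the seconds from `now` (timezone-aware) to the next 05:00 UTC."""
    now_utc = now.astimezone(timezone.utc)
    next_run = datetime.combine(now_utc.date(), RUN_AT)
    if next_run <= now_utc:
        # Today's slot has already passed, so schedule for tomorrow.
        next_run += timedelta(days=1)
    return (next_run - now_utc).total_seconds()
```

A scheduler loop would sleep for the returned number of seconds before each pipeline run.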

Workflow Steps

1. Load research topics

Read the daily research topic list from configuration. Topics can be static keywords or generated dynamically from the previous day's signals.

2. Multi-platform search

Search each topic on Google, Reddit, and YouTube to gather diverse perspectives and source types.

3. Extract top result content

Use the Scavio extract endpoint to pull the full content of the top 3 Google results for each topic.

4. Compile research brief

Combine the search results and extracted content into one structured research brief per topic.

5. Archive and notify

Save the research briefs to the archive and send a summary notification via webhook or email.
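
The notification half of step 5 could look roughly like the sketch below. It assumes the report shape produced by the Python implementation on this page (`date`, `briefs`, and the per-topic count fields). The payload keys `text` and `details`, the function names, and the webhook URL are hypothetical, so adapt them to whatever your webhook receiver expects.

```python
import json
import urllib.request

def build_summary(report: dict) -> dict:
    """Condense a full report into a short notification payload."""
    lines = [
        f"{b['topic']}: {b['google_count']}G {b['reddit_count']}R "
        f"{b['youtube_count']}Y {b['extracted_pages']}E"
        for b in report["briefs"]
    ]
    return {
        "text": f"Research brief {report['date']}: {len(lines)} topic(s)",
        "details": lines,
    }

def notify(webhook_url: str, report: dict, timeout: float = 10.0) -> int:
    """POST the summary as JSON to a webhook and return the HTTP status code."""
    body = json.dumps(build_summary(report)).encode("utf-8")
    req = urllib.request.Request(
        webhook_url,  # hypothetical receiver, e.g. a chat or automation webhook
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

Keeping `build_summary` separate from `notify` lets the same payload feed an email sender instead of a webhook.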

Python Implementation

Python
import json
from datetime import datetime, timezone
from pathlib import Path

import requests

API_KEY = "your_scavio_api_key"
BASE = "https://api.scavio.dev/api/v1"

TOPICS = ["AI agent search tools 2026", "SERP API pricing changes", "MCP server adoption"]

def search_platform(query: str, platform: str) -> list[dict]:
    """Search one platform and return its list of organic results."""
    res = requests.post(
        f"{BASE}/search",
        headers={"x-api-key": API_KEY},
        json={"platform": platform, "query": query},
        timeout=15,
    )
    res.raise_for_status()
    return res.json().get("organic", [])

def extract_content(url: str) -> dict:
    """Fetch the full content of a page through the extract endpoint."""
    res = requests.post(
        f"{BASE}/extract",
        headers={"x-api-key": API_KEY},
        json={"url": url},
        timeout=30,
    )
    res.raise_for_status()
    return res.json()

def research_topic(topic: str) -> dict:
    """Search all three platforms and extract the top Google results for one topic."""
    google_results = search_platform(topic, "google")
    reddit_results = search_platform(topic, "reddit")
    youtube_results = search_platform(topic, "youtube")

    # Extract the top 3 Google results; a failed extraction is logged and skipped
    extracted = []
    for result in google_results[:3]:
        url = result.get("link", "")
        if not url:
            continue
        try:
            content = extract_content(url)
        except (requests.RequestException, ValueError) as exc:
            print(f"  extract failed for {url}: {exc}")
            continue
        extracted.append({
            "url": url,
            "title": result.get("title", ""),
            "content_preview": (content.get("text") or "")[:500],
        })

    return {
        "topic": topic,
        "google_count": len(google_results),
        "reddit_count": len(reddit_results),
        "youtube_count": len(youtube_results),
        "extracted_pages": len(extracted),
        "top_reddit": [{"title": r.get("title", ""), "score": r.get("score", 0)} for r in reddit_results[:5]],
        "top_youtube": [{"title": r.get("title", ""), "views": r.get("views", 0)} for r in youtube_results[:5]],
        "extracted": extracted,
    }

def run():
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp
    date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    briefs = [research_topic(t) for t in TOPICS]
    total_credits = sum(3 + b["extracted_pages"] for b in briefs)  # 3 searches + extractions per topic
    report = {"date": date, "topics": len(TOPICS), "credits_used": total_credits, "briefs": briefs}
    Path(f"research_{date}.json").write_text(json.dumps(report, indent=2))
    print(f"Research complete: {len(TOPICS)} topics, {total_credits} credits")
    for brief in briefs:
        print(f"  {brief['topic']}: {brief['google_count']}G {brief['reddit_count']}R {brief['youtube_count']}Y {brief['extracted_pages']}E")

if __name__ == "__main__":
    run()
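
The script above hardcodes `TOPICS`, while step 1 describes reading topics from configuration and optionally adding ones derived from the previous day's signals. One possible sketch of that loading step follows. The file layout (a JSON config with `topics` and `max_topics` keys, plus a JSON list of signal-derived topics) and both function names are assumptions made for illustration, not a Scavio format.

```python
import json
from pathlib import Path
from typing import Optional

def merge_topics(static: list, dynamic: list, limit: int = 10) -> list:
    """Combine topic lists, dropping blanks and case-insensitive duplicates."""
    seen = set()
    merged = []
    for topic in [*static, *dynamic]:
        cleaned = " ".join(topic.split())  # collapse stray whitespace
        key = cleaned.casefold()
        if cleaned and key not in seen:
            seen.add(key)
            merged.append(cleaned)
    return merged[:limit]

def load_topics(config_path: str, signals_path: Optional[str] = None) -> list:
    """Read static topics from a JSON config and, optionally, dynamic ones from a signals file."""
    config = json.loads(Path(config_path).read_text())
    dynamic = []
    if signals_path and Path(signals_path).exists():
        # Assumed to be a JSON list of topic strings written by an earlier run.
        dynamic = json.loads(Path(signals_path).read_text())
    return merge_topics(config.get("topics", []), dynamic, config.get("max_topics", 10))
```

Capping the merged list with `limit` keeps the number of API calls per run predictable even when the dynamic source is noisy.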

JavaScript Implementation

JavaScript
const API_KEY = "your_scavio_api_key";
const BASE = "https://api.scavio.dev/api/v1";
const TOPICS = ["AI agent search tools 2026", "SERP API pricing changes", "MCP server adoption"];

async function search(query, platform) {
  const res = await fetch(`${BASE}/search`, {
    method: "POST",
    headers: { "x-api-key": API_KEY, "content-type": "application/json" },
    body: JSON.stringify({ platform, query }),
  });
  if (!res.ok) throw new Error(`scavio ${res.status}`);
  return (await res.json()).organic ?? [];
}

async function extract(url) {
  const res = await fetch(`${BASE}/extract`, {
    method: "POST",
    headers: { "x-api-key": API_KEY, "content-type": "application/json" },
    body: JSON.stringify({ url }),
  });
  if (!res.ok) return null;
  return res.json();
}

async function run() {
  const fs = await import("fs/promises");
  const briefs = [];
  for (const topic of TOPICS) {
    const [google, reddit, youtube] = await Promise.all([
      search(topic, "google"), search(topic, "reddit"), search(topic, "youtube"),
    ]);
    const extracted = [];
    for (const r of google.slice(0, 3)) {
      if (!r.link) continue;
      const c = await extract(r.link);
      if (c) {
        extracted.push({ url: r.link, title: r.title ?? "", preview: (c.text ?? "").slice(0, 500) });
      }
    }
    briefs.push({ topic, google: google.length, reddit: reddit.length, youtube: youtube.length, extracted: extracted.length });
  }
  const date = new Date().toISOString().slice(0, 10);
  await fs.writeFile(`research_${date}.json`, JSON.stringify(briefs, null, 2));
  for (const b of briefs) console.log(`  ${b.topic}: ${b.google}G ${b.reddit}R ${b.youtube}Y ${b.extracted}E`);
}

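// Report failures instead of leaving an unhandled promise rejection
run().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});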

Platforms Used

Google

Web search with knowledge graph, PAA, and AI overviews

YouTube

Video search with transcripts and metadata

Reddit

Community, posts & threaded comments from any subreddit
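
The three platform searches for a topic are independent of each other, so they can run concurrently, as the JavaScript version does with `Promise.all`. A minimal Python sketch using a thread pool is shown below. The helper name `search_all` is illustrative, and `search_fn` stands for any function with the same `(query, platform)` signature as `search_platform` in the Python implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

PLATFORMS = ("google", "reddit", "youtube")

def search_all(
    topic: str,
    search_fn: Callable[[str, str], list],
    platforms: tuple = PLATFORMS,
) -> dict:
    """Run search_fn(topic, platform) for every platform in parallel threads."""
    with ThreadPoolExecutor(max_workers=len(platforms)) as pool:
        futures = {p: pool.submit(search_fn, topic, p) for p in platforms}
        # .result() re-raises any exception raised inside the worker thread.
        return {p: f.result() for p, f in futures.items()}
```

For example, `search_all(topic, search_platform)` returns a dictionary keyed by platform name. Threads suit this case because the work is network-bound rather than CPU-bound.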

Frequently Asked Questions

This workflow is triggered by a cron schedule and runs daily at 5:00 AM UTC.

This workflow uses three Scavio platforms: Google, YouTube, and Reddit. Each platform is called through the same unified API endpoint.

Scavio's free tier includes 50 credits on signup, with no credit card required. That is enough to test and validate this workflow before scaling it.
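
To gauge how far the free credits stretch, a rough budget can be worked out as below. This assumes one credit per search or extraction call, which is how the Python implementation tallies `credits_used`; actual billing may differ, so treat the numbers as an estimate. The function names are illustrative.

```python
def credits_per_run(topics: int, searches_per_topic: int = 3, extractions_per_topic: int = 3) -> int:
    """Upper bound on credits for one run, assuming one credit per API call."""
    return topics * (searches_per_topic + extractions_per_topic)

def full_runs(free_credits: int, topics: int) -> int:
    """Number of complete runs that fit within a credit budget."""
    return free_credits // credits_per_run(topics)

print(credits_per_run(3))  # 18
print(full_runs(50, 3))    # 2
```

Under that assumption, the three sample topics cost at most 18 credits per run, so 50 credits cover two full runs with some left over.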

© 2026 Scavio. All rights reserved.
