ScavioScavio
ProductPricingDocs
Sign InGet Started
  1. Home
  2. Tutorials
  3. How to Build a B2B Enrichment Layer Without Apollo Lock-In
Tutorial

How to Build a B2B Enrichment Layer Without Apollo Lock-In

Build a flexible B2B data enrichment layer using search APIs instead of locked-in tools like Apollo. Get company data, tech stacks, and hiring signals.

Get Free API KeyAPI Docs

Apollo and similar B2B data providers lock you into expensive subscriptions with data that goes stale. A search-based enrichment layer pulls fresh data every time you need it: company descriptions from Google Knowledge Graph, tech stack signals from job postings, hiring velocity, and social presence. This tutorial builds a modular enrichment layer using the Scavio API at $0.005 per search. No annual contracts, no stale databases.

Prerequisites

  • Python 3.9+ installed
  • requests library installed
  • A Scavio API key from scavio.dev
  • A list of company domains or names to enrich

Walkthrough

Step 1: Build the company profile enricher

Search Google for each company to extract Knowledge Graph data, website description, and basic company information.

Python
import os, requests, time, re

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']
URL = 'https://api.scavio.dev/api/v1/search'
H = {'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'}

def enrich_company(company: str) -> dict:
    """Get basic company profile from search."""
    resp = requests.post(URL, headers=H,
        json={'query': f'{company} company', 'country_code': 'us', 'num_results': 5})
    data = resp.json()
    kg = data.get('knowledge_graph', {})
    organic = data.get('organic_results', [])
    return {
        'company': company,
        'description': kg.get('description', organic[0].get('snippet', '') if organic else ''),
        'website': kg.get('website', ''),
        'type': kg.get('type', ''),
        'founded': kg.get('founded', ''),
        'headquarters': kg.get('headquarters', ''),
        'employees': kg.get('employees', ''),
    }

profile = enrich_company('Vercel')
for key, val in profile.items():
    if val:
        print(f'  {key}: {val}')

Step 2: Add tech stack detection

Search for the company's job postings to detect their tech stack. Job requirements reveal what technologies the company actually uses.

Python
def detect_tech_stack(company: str) -> dict:
    """Detect tech stack from job postings."""
    resp = requests.post(URL, headers=H,
        json={'query': f'{company} hiring engineer 2026',
              'country_code': 'us', 'num_results': 5})
    results = resp.json().get('organic_results', [])
    all_text = ' '.join(f"{r.get('title','')} {r.get('snippet','')}" for r in results).lower()
    tech_categories = {
        'languages': ['python', 'javascript', 'typescript', 'go', 'rust', 'java', 'ruby'],
        'frameworks': ['react', 'next.js', 'django', 'fastapi', 'rails', 'spring', 'vue'],
        'databases': ['postgresql', 'mongodb', 'redis', 'mysql', 'dynamodb', 'elasticsearch'],
        'cloud': ['aws', 'gcp', 'azure', 'vercel', 'cloudflare', 'railway'],
        'tools': ['docker', 'kubernetes', 'terraform', 'github actions', 'datadog'],
    }
    detected = {}
    for category, techs in tech_categories.items():
        found = [t for t in techs if t in all_text]
        if found:
            detected[category] = found
    return {
        'tech_stack': detected,
        'hiring_signals': len(results),
        'job_titles': [r['title'][:60] for r in results[:3]],
    }

tech = detect_tech_stack('Vercel')
print(f'Tech Stack:')
for cat, techs in tech['tech_stack'].items():
    print(f'  {cat}: {", ".join(techs)}')
print(f'\nHiring signals: {tech["hiring_signals"]} job postings')

Step 3: Add social presence analysis

Check the company's presence on Reddit and YouTube. Active social presence indicates a company that engages with developers and could be a good prospect.

Python
def analyze_social(company: str) -> dict:
    """Check social presence on Reddit and YouTube."""
    social = {}
    # Reddit presence
    resp = requests.post(URL, headers=H,
        json={'query': f'site:reddit.com {company}',
              'country_code': 'us', 'num_results': 5})
    reddit_results = resp.json().get('organic_results', [])
    social['reddit'] = {
        'mentions': len(reddit_results),
        'subreddits': list(set(
            re.search(r'r/(\w+)', r.get('link', '')).group(1)
            for r in reddit_results
            if re.search(r'r/(\w+)', r.get('link', ''))
        )),
        'recent_topics': [r['title'][:60] for r in reddit_results[:3]],
    }
    time.sleep(0.3)
    # YouTube presence
    resp = requests.post(URL, headers=H,
        json={'query': f'site:youtube.com {company}',
              'country_code': 'us', 'num_results': 5})
    yt_results = resp.json().get('organic_results', [])
    social['youtube'] = {
        'videos': len(yt_results),
        'recent_videos': [r['title'][:60] for r in yt_results[:3]],
    }
    return social

social = analyze_social('Vercel')
print(f'Reddit: {social["reddit"]["mentions"]} mentions in {", ".join(social["reddit"]["subreddits"][:3])}')
print(f'YouTube: {social["youtube"]["videos"]} videos')

Step 4: Build the full enrichment pipeline and export

Combine all enrichment sources into a complete company profile. Export as JSON for CRM integration or CSV for manual review.

Python
import json, csv

def full_enrichment(company: str) -> dict:
    """Run all enrichment steps for a company."""
    profile = enrich_company(company)
    time.sleep(0.3)
    tech = detect_tech_stack(company)
    time.sleep(0.3)
    social = analyze_social(company)
    profile.update({
        'tech_stack': tech['tech_stack'],
        'hiring_signals': tech['hiring_signals'],
        'reddit_mentions': social['reddit']['mentions'],
        'youtube_videos': social['youtube']['videos'],
        'credits_used': 4,  # 1 company + 1 jobs + 1 reddit + 1 youtube
        'cost': 0.020,
    })
    return profile

def batch_enrich(companies: list, output: str = 'enriched_companies.json'):
    results = []
    for company in companies:
        print(f'Enriching: {company}...')
        profile = full_enrichment(company)
        results.append(profile)
        time.sleep(0.5)
    with open(output, 'w') as f:
        json.dump(results, f, indent=2)
    total_cost = sum(r['cost'] for r in results)
    print(f'\nEnriched {len(results)} companies')
    print(f'Total cost: ${total_cost:.3f}')
    print(f'Apollo equivalent: ~$50-100/month for similar data')
    return results

results = batch_enrich(['Vercel', 'Supabase', 'Railway'])
for r in results:
    stack = ', '.join(t for techs in r.get('tech_stack', {}).values() for t in techs)
    print(f"  {r['company']}: {r.get('description', '')[:50]}")
    print(f"    Stack: {stack[:60]}")

Python Example

Python
import os, requests, time

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']
H = {'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'}

def enrich(company):
    # Company info
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'query': f'{company} company', 'country_code': 'us', 'num_results': 3})
    kg = resp.json().get('knowledge_graph', {})
    time.sleep(0.3)
    # Tech stack from jobs
    resp2 = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'query': f'{company} hiring engineer', 'country_code': 'us', 'num_results': 3})
    text = ' '.join(r.get('snippet','') for r in resp2.json().get('organic_results', [])).lower()
    techs = [t for t in ['python','react','typescript','aws','docker'] if t in text]
    print(f"{company}: {kg.get('description','N/A')[:60]}")
    print(f"  Tech: {', '.join(techs) or 'N/A'}")

for c in ['Vercel', 'Supabase']:
    enrich(c)
    time.sleep(0.3)

JavaScript Example

JavaScript
const SCAVIO_KEY = process.env.SCAVIO_API_KEY;

async function enrich(company) {
  const resp = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: `${company} company`, country_code: 'us', num_results: 3 })
  });
  const kg = (await resp.json()).knowledge_graph || {};
  const resp2 = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: `${company} hiring engineer`, country_code: 'us', num_results: 3 })
  });
  const text = ((await resp2.json()).organic_results || []).map(r => r.snippet || '').join(' ').toLowerCase();
  const techs = ['python','react','typescript','aws','docker'].filter(t => text.includes(t));
  console.log(`${company}: ${(kg.description || 'N/A').slice(0, 60)}`);
  console.log(`  Tech: ${techs.join(', ') || 'N/A'}`);
}

(async () => { for (const c of ['Vercel', 'Supabase']) await enrich(c); })();

Expected Output

JSON
Enriching: Vercel...
Enriching: Supabase...
Enriching: Railway...

Enriched 3 companies
Total cost: $0.060
Apollo equivalent: ~$50-100/month for similar data

  Vercel: Cloud platform for frontend frameworks and serverle
    Stack: typescript, react, next.js, aws
  Supabase: Open source Firebase alternative with PostgreSQL
    Stack: typescript, postgresql, docker
  Railway: Cloud platform for deploying applications
    Stack: typescript, docker, kubernetes

Related Tutorials

  • How to Build an Intent-Based Lead Pipeline with Search Data
  • How to Find Automation Agency Clients with Search Data
  • How to Enrich Leads with Google Search Data for Cold Email
  • How to Extract Google Knowledge Graph Data via API

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

Python 3.9+ installed. requests library installed. A Scavio API key from scavio.dev. A list of company domains or names to enrich. A Scavio API key gives you 50 free credits on signup.

Yes. The free tier includes 50 credits on signup, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.

Related Resources

Comparison

Apollo vs API-First Enrichment (Scavio, Exa, search APIs)

Read more
Best Of

Best B2B Enrichment APIs in May 2026

Read more
Use Case

Apollo Alternative: Search-Based Lead Enrichment

Read more
Solution

Enrich Sales Leads with Search Data Instead of Apollo

Read more
Best Of

Best B2B Enrichment Search Tools in 2026

Read more
Solution

API-First B2B Enrichment Without Apollo Lock-In

Read more

Start Building

Build a flexible B2B data enrichment layer using search APIs instead of locked-in tools like Apollo. Get company data, tech stacks, and hiring signals.

Get Free API KeyRead the Docs
ScavioScavio

Real-time search API for AI agents. Search every platform, not just Google.

Product

  • Features
  • Pricing
  • Dashboard
  • Affiliates

Developers

  • Documentation
  • API Reference
  • Quickstart
  • MCP Integration
  • Python SDK

Alternatives

  • Tavily Alternative
  • SerpAPI Alternative
  • Firecrawl Alternative
  • Exa Alternative

Tools

  • JSON Formatter
  • cURL to Code
  • Token Counter
  • All Tools

© 2026 Scavio. All rights reserved.

Featured on TAAFT
Terms of ServicePrivacy Policy