Waterfall Contact Enrichment

Find business emails, phones, and social profiles from a name + company domain. Cascades through MX validation, website scraping, pattern detection, and SMTP verification. Free Clay alternative.

Pricing

Pay Per Event model. You only pay for what you use.

| Event | Description | Price |
|---|---|---|
| contact-enriched | Charged per contact enriched. Runs a 10-step waterfall: MX check, pattern generation, web scraping, cross-referencing, SMTP verification, and social profile generation. | $0.20 |

Example: 100 events = $20.00 · 1,000 events = $200.00
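
The arithmetic behind these examples can be sketched in a few lines of Python. This is only an illustration of the per-event price listed above; the function name `estimate_cost` is hypothetical, and the figure excludes any separate charges for sub-actor runs.

```python
from decimal import Decimal

# Price of one contact-enriched event, as listed in the pricing table.
PRICE_PER_CONTACT = Decimal("0.20")


def estimate_cost(contacts: int) -> Decimal:
    """Return the estimated charge in USD for enriching `contacts` people."""
    if contacts < 0:
        raise ValueError("contacts must be non-negative")
    return PRICE_PER_CONTACT * contacts


print(estimate_cost(100))   # 20.00
print(estimate_cost(1000))  # 200.00
```

Decimal is used instead of float so that the totals print as exact currency amounts.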

Documentation

A faster, cheaper alternative to Clay, Apollo, and Hunter.io. Find business email addresses, phone numbers, and social media profiles for any person when you provide their name and company domain. This actor cascades through multiple independent data sources until it finds the most accurate contact information available.

Given a list of people (each with a first name, last name, and company domain), the actor runs a multi-step enrichment pipeline for each person: MX record validation, email pattern generation (15 candidates), website contact scraping, email pattern detection, optional SMTP mailbox verification, and confidence scoring. The best email is selected from all candidates with full transparency into the ranking.

Why Use This on Apify

Traditional contact enrichment tools like Clay, Apollo, or Lusha charge per contact lookup -- often $0.10 to $0.50 per record. This actor uses only free, publicly available data sources (DNS lookups, company websites, and pattern analysis) to achieve similar results at a fraction of the cost.

Running on Apify gives you cloud-based scalability, scheduled runs for ongoing lead enrichment, and easy integration with CRMs and spreadsheets through Apify's built-in integrations. You can process hundreds of contacts in a single run with configurable concurrency, and the results land in a structured dataset ready for export.

Key Features

  • Multi-source waterfall -- Cascades through MX validation, pattern generation, website scraping, pattern detection, and optional SMTP verification for maximum accuracy
  • Confidence scoring -- Every email candidate gets a confidence score on a 0-100 scale (the highest assigned value is 98) based on how many sources confirmed it
  • Batch processing -- Enrich entire lists of people in one run with configurable concurrency (1-5 parallel lookups)
  • SMTP verification -- Optional deep verification mode that checks whether the mail server accepts the address (without sending email)
  • Catch-all detection -- Identifies domains that accept any email address, which reduces false positives in SMTP verification
  • Phone number extraction -- Scrapes company websites for publicly listed phone numbers
  • Social profile discovery -- Finds LinkedIn, GitHub, and Twitter/X profiles from website links and generates likely profile URLs
  • International name support -- Handles accented characters (umlauts, diacritics, cedillas) through automatic ASCII transliteration
  • Domain-level caching -- When enriching multiple people at the same company, website scraping and pattern detection results are cached to save time and compute
  • Full transparency -- Returns up to 10 ranked email candidates per person so you can see the reasoning behind each result

How to Use

  1. Open the actor in Apify Console.
  2. Add your list of people to the People to enrich input field as a JSON array. Each person needs at minimum a domain and either firstName + lastName or fullName.
  3. Choose whether to enable website scraping and pattern detection (both are on by default and improve accuracy).
  4. Select your verification level: Standard (fast, MX-only) or Deep (slower, adds SMTP verification for higher confidence).
  5. Click Start and wait for results to appear in the dataset.

Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| people | array | Yes | -- | List of people to enrich. Each object must have domain and either firstName + lastName or fullName. Optionally include company. |
| enrichFromWebsite | boolean | No | true | Scrape the company website for emails, phones, and social links using the Website Contact Scraper sub-actor. |
| detectPattern | boolean | No | true | Detect the company's email naming pattern using the Email Pattern Finder sub-actor. |
| verificationLevel | string | No | standard | standard checks MX records only (fast). deep adds SMTP verification per candidate (slower but more accurate). |
| smtpTimeout | integer | No | 10 | Timeout in seconds for SMTP connections in deep verification mode (3-30 range). |
| maxConcurrency | integer | No | 3 | Number of people to process in parallel (1-5). Lower values are gentler on target mail servers. |
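
Before starting a run, it can help to check an input object against the constraints listed above. The following is a minimal client-side sketch, not the actor's own validation; `validate_input` is a hypothetical helper, and treating an empty people list as invalid is an assumption.

```python
ALLOWED_LEVELS = {"standard", "deep"}


def validate_input(run_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    people = run_input.get("people")
    if not isinstance(people, list) or not people:
        # Assumption: an empty list is rejected because `people` is required.
        problems.append("people must be a non-empty array")
        people = []
    for i, person in enumerate(people):
        if not person.get("domain"):
            problems.append(f"people[{i}]: domain is required")
        has_split_name = bool(person.get("firstName") and person.get("lastName"))
        if not (has_split_name or person.get("fullName")):
            problems.append(f"people[{i}]: provide firstName + lastName or fullName")
    # Defaults below mirror the documented defaults for omitted parameters.
    if run_input.get("verificationLevel", "standard") not in ALLOWED_LEVELS:
        problems.append("verificationLevel must be 'standard' or 'deep'")
    if not 3 <= run_input.get("smtpTimeout", 10) <= 30:
        problems.append("smtpTimeout must be between 3 and 30 seconds")
    if not 1 <= run_input.get("maxConcurrency", 3) <= 5:
        problems.append("maxConcurrency must be between 1 and 5")
    return problems


print(validate_input({"people": [{"fullName": "Jane Smith", "domain": "acme.com"}]}))  # []
```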

Input Examples

Sales prospecting list:

{
    "people": [
        {
            "firstName": "Jane",
            "lastName": "Smith",
            "domain": "acme.com",
            "company": "Acme Corp"
        },
        {
            "firstName": "John",
            "lastName": "Doe",
            "domain": "example.com"
        }
    ],
    "enrichFromWebsite": true,
    "detectPattern": true,
    "verificationLevel": "deep",
    "maxConcurrency": 3
}

Quick lookup with full names:

{
    "people": [
        {"fullName": "María García", "domain": "empresa.es"},
        {"fullName": "François Müller", "domain": "firme.de"}
    ],
    "verificationLevel": "standard"
}

Budget-friendly mode (no sub-actor calls):

{
    "people": [
        {"firstName": "Alex", "lastName": "Chen", "domain": "startup.io"}
    ],
    "enrichFromWebsite": false,
    "detectPattern": false,
    "verificationLevel": "deep"
}

Input Tips

  • Group people from the same company together in your input list -- website and pattern results are cached per domain, so processing all contacts at one company together is more efficient.
  • Use firstName + lastName instead of fullName for best accuracy. When using fullName, the actor takes the first and last words as first and last name.
  • International names with accents (ä, é, ñ, ç) are automatically transliterated to ASCII for pattern generation.
  • Disable enrichFromWebsite and detectPattern to save compute credits when you only need pattern-based guesses with SMTP verification.
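
The first two tips can be illustrated with a short sketch. It assumes the fullName rule is exactly "first word and last word", as described above, and that sorting by domain is an acceptable way to group contacts; both helper names are hypothetical and the actor's own parser may differ in details such as casing.

```python
def split_full_name(full_name: str) -> tuple[str, str]:
    """Take the first and last whitespace-separated words as first and last name."""
    words = full_name.split()
    if not words:
        raise ValueError("fullName is empty")
    # Middle words (particles, middle names) are dropped, as the tip warns.
    return words[0], words[-1]


def group_by_domain(people: list[dict]) -> list[dict]:
    """Order people so that contacts at the same domain are adjacent."""
    # sorted() is stable, so the original order is kept within each domain.
    return sorted(people, key=lambda person: person["domain"].lower())


print(split_full_name("Ludwig van Beethoven"))  # ('Ludwig', 'Beethoven')
```

Supplying firstName and lastName explicitly avoids the dropped-particle problem shown in the last line.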

Output Example

Each person produces one output record in the dataset:

{
    "firstName": "Jane",
    "lastName": "Smith",
    "domain": "acme.com",
    "company": "Acme Corp",
    "email": "jane.smith@acme.com",
    "emailConfidence": 95,
    "emailSource": "website",
    "phone": "+1-555-123-4567",
    "phoneSource": "website",
    "socialProfiles": {
        "linkedin": "https://www.linkedin.com/company/acme-corp",
        "github": "https://github.com/janesmith",
        "twitter": "https://twitter.com/acmecorp"
    },
    "status": "found",
    "sources": {
        "patternGeneration": {
            "candidates": ["jane.smith@acme.com", "janesmith@acme.com", "jane@acme.com", "jsmith@acme.com"],
            "topPattern": "first.last"
        },
        "websiteScraping": {
            "emailsFound": ["[email protected]", "[email protected]"],
            "phonesFound": ["+1-555-123-4567"],
            "socialsFound": ["https://www.linkedin.com/company/acme-corp"],
            "namesFound": ["Jane Smith"]
        },
        "patternDetection": {
            "detectedPattern": "first.last",
            "patternConfidence": 85,
            "generatedEmail": "jane.smith@acme.com"
        }
    },
    "allCandidates": [
        {
            "email": "jane.smith@acme.com",
            "pattern": "first.last",
            "confidence": 95,
            "sources": ["pattern_generation", "website", "pattern_detection"]
        },
        {
            "email": "janesmith@acme.com",
            "pattern": "firstlast",
            "confidence": 67,
            "sources": ["pattern_generation"]
        }
    ],
    "domainValid": true,
    "mxHost": "mx1.acme.com",
    "verifiedAt": "2025-01-15T10:30:00.000Z"
}

Output Fields

| Field | Type | Description |
|---|---|---|
| firstName | string | Person's first name (from input or parsed from fullName) |
| lastName | string | Person's last name |
| domain | string | Company domain |
| company | string/null | Company name (pass-through from input) |
| email | string/null | Best email found (highest confidence candidate) |
| emailConfidence | number | Confidence score 0-98 for the best email |
| emailSource | string/null | How the best email was found: website, pattern_detection, smtp, or pattern_generation |
| phone | string/null | Phone number scraped from company website |
| phoneSource | string/null | Always website when a phone is found |
| socialProfiles | object | LinkedIn, GitHub, Twitter URLs (from website or generated guesses) |
| status | string | found (confidence >= 70), likely (40-69), or not_found (< 40) |
| sources | object | Full breakdown of what each enrichment step found |
| allCandidates | array | Top 10 email candidates ranked by confidence |
| domainValid | boolean | Whether the domain has MX records |
| mxHost | string/null | Primary mail exchange server |
| verifiedAt | string | ISO 8601 timestamp |

Use via API

You can run Waterfall Contact Enrichment programmatically using the Apify API. This is ideal for integrating enrichment into automated lead pipelines.

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

run = client.actor("ryanclinton/waterfall-contact-enrichment").call(run_input={
    "people": [
        {"firstName": "Jane", "lastName": "Smith", "domain": "acme.com"},
        {"firstName": "John", "lastName": "Doe", "domain": "example.com"},
    ],
    "enrichFromWebsite": True,
    "detectPattern": True,
    "verificationLevel": "deep",
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
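    # The email field is present but null when nothing was found, so a .get() default would not apply.
    email = item.get("email") or "not found"
    confidence = item.get("emailConfidence") or 0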
    print(f"{item['firstName']} {item['lastName']}: {email} ({confidence}%)")

JavaScript

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

const run = await client.actor('ryanclinton/waterfall-contact-enrichment').call({
    people: [
        { firstName: 'Jane', lastName: 'Smith', domain: 'acme.com' },
        { firstName: 'John', lastName: 'Doe', domain: 'example.com' },
    ],
    enrichFromWebsite: true,
    detectPattern: true,
    verificationLevel: 'deep',
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(item => {
    console.log(`${item.firstName} ${item.lastName}: ${item.email || 'not found'} (${item.emailConfidence}%)`);
});

cURL

curl "https://api.apify.com/v2/acts/ryanclinton~waterfall-contact-enrichment/runs?token=YOUR_APIFY_TOKEN" \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "people": [
      {"firstName": "Jane", "lastName": "Smith", "domain": "acme.com"}
    ],
    "verificationLevel": "deep"
  }'

How It Works

The actor runs a 10-step waterfall enrichment pipeline for each person, combining multiple data sources to maximize accuracy:

┌────────────────────────────────────────────────────────┐
│  INPUT: firstName + lastName + domain                  │
└──────────────────────┬─────────────────────────────────┘
                       │
            ┌──────────▼──────────┐
            │  1. MX Validation   │  DNS MX lookup (cached per domain)
            │  No MX → not_found  │
            └──────────┬──────────┘
            ┌──────────▼──────────┐
            │  2. Pattern Gen     │  15 candidates ranked by popularity
            │  first.last@...     │  first.last, firstlast, first, flast...
            └──────────┬──────────┘
                       │
          ┌────────────┼────────────┐  (run in parallel)
          │                         │
┌─────────▼─────────┐   ┌──────────▼──────────┐
│ 3a. Website Scrape │   │ 3b. Pattern Detect  │
│ Contact Scraper    │   │ Pattern Finder      │
│ (cached per domain)│   │ (cached per domain) │
└─────────┬─────────┘   └──────────┬──────────┘
          │                         │
          └────────────┬────────────┘
            ┌──────────▼──────────┐
            │  4. Cross-Reference │  Match website emails to person name
            │  Direct match?      │  → websiteDirectMatch (90-98%)
            └──────────┬──────────┘
            ┌──────────▼──────────┐
            │  5. Merge Candidates│  Combine all sources, dedup
            └──────────┬──────────┘
            ┌──────────▼──────────┐
            │  6. SMTP Verify     │  Deep mode only, top 5 candidates
            │  (1s between checks)│  Stop on first valid + catch-all test
            └──────────┬──────────┘
            ┌──────────▼──────────┐
            │  7. Score & Rank    │  Multi-signal confidence scoring
            └──────────┬──────────┘
            ┌──────────▼──────────┐
            │  8. Best Email +    │  Pick top candidate
            │     Phone + Social  │  Extract phone, merge social profiles
            └─────────────────────┘

Email Pattern Candidates

The actor generates up to 15 email candidates per person using the most common B2B email naming conventions, ranked by industry prevalence:

| Pattern | Example | Popularity |
|---|---|---|
| first.last | [email protected] | Most common |
| firstlast | [email protected] | Very common |
| first | [email protected] | Common |
| flast | [email protected] | Common |
| f.last | [email protected] | Moderate |
| first_last | [email protected] | Moderate |
| first-last | [email protected] | Moderate |
| firstl | [email protected] | Less common |
| first.l | [email protected] | Less common |
| f_last | [email protected] | Uncommon |
| last.first | [email protected] | Uncommon |
| lastfirst | [email protected] | Uncommon |
| last | [email protected] | Rare |
| last.f | [email protected] | Rare |
| lastf | [email protected] | Rare |
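
To make the pattern names concrete, here is a small sketch that expands them for one person. It assumes each pattern name maps to the obvious combination of first name, last name, and initials, and that names are already ASCII; `generate_candidates` is a hypothetical helper, not the actor's code.

```python
def generate_candidates(first: str, last: str, domain: str) -> list[tuple[str, str]]:
    """Return (pattern, email) pairs in the popularity order of the table above."""
    first, last = first.lower(), last.lower()
    f, l = first[0], last[0]  # initials; assumes both names are non-empty
    local_parts = [
        ("first.last", f"{first}.{last}"),
        ("firstlast", f"{first}{last}"),
        ("first", first),
        ("flast", f"{f}{last}"),
        ("f.last", f"{f}.{last}"),
        ("first_last", f"{first}_{last}"),
        ("first-last", f"{first}-{last}"),
        ("firstl", f"{first}{l}"),
        ("first.l", f"{first}.{l}"),
        ("f_last", f"{f}_{last}"),
        ("last.first", f"{last}.{first}"),
        ("lastfirst", f"{last}{first}"),
        ("last", last),
        ("last.f", f"{last}.{f}"),
        ("lastf", f"{last}{f}"),
    ]
    seen = {}
    for pattern, local in local_parts:
        # Keep the first (most popular) pattern if two produce the same address.
        seen.setdefault(f"{local}@{domain}", pattern)
    return [(pattern, email) for email, pattern in seen.items()]


for pattern, email in generate_candidates("Jane", "Smith", "acme.com")[:4]:
    print(pattern, email)
```

For Jane Smith at acme.com this yields 15 distinct addresses, starting with jane.smith@acme.com and ending with smithj@acme.com. Deduplication explains why the count is "up to" 15: very short or identical names can collapse several patterns into one address.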

Confidence Scoring Algorithm

Each candidate is scored based on cascading signals from all enrichment steps:

| Condition | Confidence | Status |
|---|---|---|
| SMTP rejected (550) | 5 | not_found |
| Website match + SMTP valid + not catch-all | 98 | found |
| Website match + SMTP valid (unknown catch-all) | 95 | found |
| Website match (no SMTP) | 90 | found |
| SMTP valid + not catch-all | 95 | found |
| SMTP valid + catch-all | 50 | likely |
| SMTP valid (unknown catch-all) | 80 | found |
| Website match + catch-all | 60 | likely |
| Pattern detected (high confidence) | 65-80 | found/likely |
| Pattern guess (first.last, firstlast) | 67-70 | found/likely |
| Pattern guess (flast, f.last, first_last) | 64-67 | likely |
| Pattern guess (uncommon patterns) | 60 | likely |

Status thresholds: found at confidence >= 70, likely at 40-69, not_found below 40.
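
The fixed-value rows of the scoring table and the status thresholds can be approximated as follows. This is a sketch under stated assumptions rather than the actor's implementation: the rule precedence is inferred, a website match on a catch-all domain is scored 60 whether or not SMTP accepted the address, and rows that list a range are left to the caller through the hypothetical `pattern_score` argument.

```python
from typing import Optional


def status_for(confidence: int) -> str:
    """Map a confidence score to the documented status thresholds."""
    if confidence >= 70:
        return "found"
    if confidence >= 40:
        return "likely"
    return "not_found"


def score(website_match: bool, smtp: Optional[str], catch_all: Optional[bool],
          pattern_score: int = 60) -> int:
    """smtp is 'valid', 'rejected', or None (not checked); catch_all is None when unknown."""
    if smtp == "rejected":
        return 5
    if smtp == "valid":
        if catch_all is True:
            # Assumption: the website-match row (60) wins over the plain SMTP row (50).
            return 60 if website_match else 50
        if website_match:
            return 98 if catch_all is False else 95
        return 95 if catch_all is False else 80
    if website_match:
        return 60 if catch_all is True else 90
    # Pattern-only candidates: the table gives ranges, so the caller supplies the value.
    return pattern_score


best = score(website_match=True, smtp="valid", catch_all=False)
print(best, status_for(best))  # 98 found
```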

Sub-Actor Integration

The actor calls two other actors from the same suite as sub-actors:

| Sub-Actor | When Called | What It Returns | Cache |
|---|---|---|---|
| Website Contact Scraper | When enrichFromWebsite is true | Emails, phones, social links, team names | Per domain (120s timeout) |
| Email Pattern Finder | When detectPattern is true | Company email pattern + confidence | Per domain (120s timeout) |

Both run in parallel for each new domain. When enriching multiple people at the same company, the second person reuses cached results instantly.
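
Per-domain caching of this kind can be sketched with a small wrapper around any lookup function. The `DomainCache` class and the `fake_scrape` stand-in are hypothetical; the sketch is single-threaded and does not model the 120s timeout or parallel lookups.

```python
from typing import Callable


class DomainCache:
    """Call `fetch` once per domain and reuse the result for later people."""

    def __init__(self, fetch: Callable[[str], dict]):
        self._fetch = fetch
        self._results: dict[str, dict] = {}

    def get(self, domain: str) -> dict:
        key = domain.lower()  # domains are case-insensitive
        if key not in self._results:
            self._results[key] = self._fetch(key)
        return self._results[key]


calls = []


def fake_scrape(domain: str) -> dict:
    # Stand-in for a sub-actor call; records how often it is invoked.
    calls.append(domain)
    return {"domain": domain, "emailsFound": []}


cache = DomainCache(fake_scrape)
cache.get("acme.com")
cache.get("ACME.com")      # served from the cache
cache.get("example.com")
print(calls)  # ['acme.com', 'example.com']
```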

Name Transliteration

International names are automatically transliterated to ASCII for pattern generation:

| Input | Transliterated | Generated Email |
|---|---|---|
| María García | maria garcia | [email protected] |
| François Müller | francois muller | [email protected] |
| Jiří Řehák | jiri rehak | [email protected] |
| Hans Straße | hans strasse | [email protected] |
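
One common way to get this behavior in Python is Unicode decomposition followed by dropping the combining marks. The sketch below reproduces the four rows above; it is an assumption about the technique, not the actor's code, and the explicit ß mapping is needed because that letter has no decomposition.

```python
import unicodedata

# Letters that do not decompose into base letter + accent need explicit mappings.
SPECIAL_CASES = {"ß": "ss"}


def transliterate(name: str) -> str:
    """Lowercase a name and reduce accented letters to plain ASCII equivalents."""
    lowered = name.lower()
    for source, replacement in SPECIAL_CASES.items():
        lowered = lowered.replace(source, replacement)
    # NFKD splits "é" into "e" plus a combining accent, which is then dropped.
    decomposed = unicodedata.normalize("NFKD", lowered)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


for name in ["María García", "François Müller", "Jiří Řehák", "Hans Straße"]:
    print(name, "->", transliterate(name))
```

Note that this maps "ü" to "u" (as in the table), not to the German convention "ue".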

Cost

The actor itself uses minimal compute (256 MB memory). However, when website scraping and pattern detection are enabled (the defaults), it calls two sub-actors per unique domain, which adds run time and sub-actor compute. Approximate run times by configuration:

| Setting | 1 person | 10 people (1 domain) | 50 people (20 domains) |
|---|---|---|---|
| All features + deep | ~2 min | ~3 min (cached) | ~15-20 min |
| All features + standard | ~1 min | ~2 min (cached) | ~10-15 min |
| No sub-actors + deep | ~30 sec | ~2 min | ~5-10 min |
| No sub-actors + standard | ~5 sec | ~10 sec | ~1 min |

To reduce costs, disable website scraping and/or pattern detection. The actor will still generate pattern-based email candidates with 60-70% confidence.

Tips

  1. Start with deep verification for critical outreach campaigns where bounce rates matter. Use standard mode for initial prospecting or large-volume enrichment.

  2. Group people by company domain in your input list. The actor caches website and pattern results per domain, so processing all contacts at one company together is much more efficient.

  3. Check the allCandidates array in the output if the top email has low confidence. Sometimes the second or third candidate is correct, especially for companies with unusual email patterns.

  4. Use the company field in your input -- it does not affect the enrichment logic but is passed through to the output, making downstream data processing easier.

  5. Lower concurrency to 1 when targeting a single mail server in deep mode to avoid being rate-limited or blocked.

  6. Chain with Bulk Email Verifier to double-check results. The waterfall actor's SMTP check only tests the top 5 candidates. Running the best email through the full verifier adds another layer of confidence.

  7. Disable sub-actors for cost savings. Setting enrichFromWebsite: false and detectPattern: false skips the sub-actor calls entirely. You still get 15 pattern-based candidates and SMTP verification (in deep mode).
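
Tip 3 can be turned into a small post-processing step over the dataset items. The sketch below relies only on the documented output fields (email, emailConfidence, allCandidates); the helper name `review_queue` and the default of three alternates are illustrative choices.

```python
def review_queue(items: list[dict], threshold: int = 70, alternates: int = 3) -> list[dict]:
    """List people whose best email is below `threshold`, with runner-up candidates."""
    queue = []
    for item in items:
        if (item.get("emailConfidence") or 0) >= threshold:
            continue  # confident enough, no manual review needed
        ranked = sorted(item.get("allCandidates") or [],
                        key=lambda candidate: candidate["confidence"], reverse=True)
        others = [c["email"] for c in ranked if c["email"] != item.get("email")]
        queue.append({
            "name": f"{item['firstName']} {item['lastName']}",
            "best": item.get("email"),
            "alternates": others[:alternates],
        })
    return queue
```

Passing the items from `client.dataset(...).iterate_items()` through this function gives a short list to check by hand or to send to a separate verifier.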

Limitations

  • No guaranteed accuracy. Email enrichment is inherently probabilistic. Even high-confidence results (90%+) can be wrong if a person uses an unusual email format. The allCandidates array provides alternatives.
  • Catch-all domains reduce confidence. Domains that accept all addresses (catch-all) make SMTP verification meaningless. These emails cap at 50-60% confidence.
  • Social profile URLs may be guesses. When no social links are found on the company website, the actor generates likely profile URLs (e.g., linkedin.com/in/firstname-lastname). These are unverified guesses and may point to the wrong person.
  • Only top 5 candidates are SMTP-verified. To avoid hammering mail servers, the actor only tests the top 5 candidates and stops at the first valid one. Less common patterns may not be verified.
  • Sub-actor costs add up. Website scraping and pattern detection each call a separate actor. For large lists across many domains, sub-actor costs can exceed the main actor's compute cost.
  • Personal email providers are unsupported. The actor is designed for B2B company domains. Gmail, Yahoo, and other free providers will have valid MX records but pattern generation will produce meaningless results.
  • Name parsing is English-centric. The fullName parser takes the first and last words as first and last name, which may not work for names with particles (e.g., "Ludwig van Beethoven" → first: "ludwig", last: "beethoven", missing "van").

Responsible Use

  • Respect privacy laws. Contact enrichment does not grant permission to contact someone. Comply with GDPR, CAN-SPAM, CASL, and other applicable regulations before sending outreach.
  • Use for legitimate B2B outreach only. This tool is designed for professional business communication, not mass unsolicited emails or spam.
  • Rate limit SMTP checks. The actor includes built-in delays (1 second between SMTP checks, 500ms before catch-all tests), but avoid running many parallel instances against the same mail server.
  • Verify before sending. Even high-confidence emails should be verified through a dedicated email verification tool before sending campaigns to protect your sender reputation.

FAQ

Does this actor send any emails? No. Even in deep verification mode, the actor only opens an SMTP connection and checks whether the mail server would accept the address. It disconnects before the DATA stage, so no email is ever sent or received.

How accurate are the results? Accuracy depends on the verification level and available data. Website-confirmed emails score 90%+. Pattern-detected emails score 65-80%. Pattern-only guesses score 60-70%. SMTP-verified addresses on non-catch-all domains score 95%+. In practice, the actor finds a high-confidence email (70%+) for roughly 40-60% of B2B contacts.

What is a catch-all domain? A catch-all (or accept-all) domain is configured to accept email sent to any address at that domain, even nonexistent ones. This means SMTP verification cannot distinguish between real and fake addresses. The actor detects catch-all domains and adjusts confidence scores downward accordingly.
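
The two answers above describe a standard RCPT probe. A minimal sketch with Python's smtplib is shown below, assuming the MX host is already known; the sender address, the function names, and the use of a random local part for the catch-all test are assumptions, the actor's built-in delays are omitted, and many networks block outbound port 25.

```python
import smtplib
import uuid

ACCEPTED = (250, 251)  # SMTP reply codes meaning the recipient was accepted


def probe(mx_host, addresses, sender="probe@example.com", timeout=10, connect=smtplib.SMTP):
    """Return {address: reply_code} for each RCPT TO; DATA is never issued."""
    codes = {}
    # Leaving the with-block sends QUIT, so no message is ever transmitted.
    with connect(mx_host, 25, timeout=timeout) as smtp:
        smtp.helo()
        smtp.mail(sender)
        for address in addresses:
            code, _message = smtp.rcpt(address)
            codes[address] = code
    return codes


def check_mailbox(mx_host, address, **options):
    """Probe the real address plus a random one to detect catch-all domains."""
    domain = address.rsplit("@", 1)[1]
    random_address = f"{uuid.uuid4().hex}@{domain}"  # should not exist on a normal domain
    codes = probe(mx_host, [address, random_address], **options)
    return {
        "accepted": codes[address] in ACCEPTED,
        "catch_all": codes[random_address] in ACCEPTED,
    }
```

Calling `check_mailbox("mx1.acme.com", "jane.smith@acme.com")` would return a dict with accepted and catch_all flags. The `connect` parameter exists so the logic can be exercised with a stub instead of a live mail server.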

Can I use this for personal email addresses (Gmail, Yahoo, etc.)? The actor is designed for B2B company domains. Personal email providers like Gmail will have valid MX records but pattern generation and website scraping will not produce useful results. The actor works best with company domains where employees share a consistent email naming pattern.

What happens if a domain has no website or no MX records? If a domain has no MX records, the actor immediately returns a not_found status with 0% confidence. If the website is unreachable, the actor skips that enrichment step and relies on pattern generation and SMTP verification alone.

How are names with accents handled? Names with accented characters are automatically transliterated to ASCII equivalents before pattern generation. For example, "María" becomes "maria", "Müller" becomes "muller", and "ß" becomes "ss". This ensures email patterns work correctly since most email systems use ASCII-only addresses.

Integrations

Connect Waterfall Contact Enrichment with other tools and platforms:

  • Export to Google Sheets -- Use Apify's Google Sheets integration to automatically send enriched contacts to a spreadsheet for your sales team.
  • Push to CRM -- Connect to HubSpot via the HubSpot Lead Pusher actor, or to Salesforce and Pipedrive via Apify webhooks.
  • Chain with other actors -- Feed output from a lead scraper (like Google Maps or LinkedIn) directly into this actor for contact enrichment.
  • API access -- Call this actor programmatically via the Apify API from any language or platform.
  • Zapier and Make -- Trigger enrichment runs from Zapier or Make workflows, then route the results to your email marketing tool, CRM, or notification system.
  • Scheduled runs -- Set up Apify schedules to periodically re-enrich your contact database as people change companies.

Related Actors

These actors from ryanclinton on the Apify Store work well with Waterfall Contact Enrichment:

| Actor | What It Does | How It Connects |
|---|---|---|
| Website Contact Scraper | Extract emails, phones, and team members from websites | Called as a sub-actor for website enrichment |
| Email Pattern Finder | Discover company email patterns | Called as a sub-actor for pattern detection |
| Bulk Email Verifier | Verify email deliverability | Double-check enriched emails before outreach |
| B2B Lead Qualifier | Score and grade leads | Qualify enriched contacts for pipeline prioritization |
| HubSpot Lead Pusher | Push leads to HubSpot CRM | Push enriched contacts with emails directly to HubSpot |
| Google Maps Lead Enricher | Enrich Google Maps business listings | Get company domains from Maps, then enrich key contacts |
