Waterfall Contact Enrichment
Find business emails, phones, and social profiles from a name + company domain. Cascades through MX validation, website scraping, pattern detection, and SMTP verification. Free Clay alternative.
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| contact-enriched | Charged per contact enriched. Runs a 10-step waterfall: MX check, pattern generation, web scraping, cross-referencing, SMTP verification, and social profile generation. | $0.20 |
Example: 100 events = $20.00 · 1,000 events = $200.00
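The arithmetic behind these examples is a single multiplication. A minimal sketch, assuming the $0.20 `contact-enriched` price from the table above and one event per enriched person; the name `estimate_cost` is illustrative, and any sub-actor charges are not included.

```python
from decimal import Decimal

# Price of one contact-enriched event, taken from the pricing table.
PRICE_PER_CONTACT = Decimal("0.20")

def estimate_cost(contacts: int) -> Decimal:
    """Return the pay-per-event charge in USD for a number of enriched contacts."""
    if contacts < 0:
        raise ValueError("contacts must be non-negative")
    return PRICE_PER_CONTACT * contacts

print(estimate_cost(100))   # 20.00
print(estimate_cost(1000))  # 200.00
```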
Documentation
A faster, cheaper alternative to Clay, Apollo, and Hunter.io. Find business email addresses, phone numbers, and social media profiles for any person when you provide their name and company domain. This actor cascades through multiple independent data sources until it finds the most accurate contact information available.
Given a list of people (each with a first name, last name, and company domain), the actor runs a multi-step enrichment pipeline for each person: MX record validation, email pattern generation (15 candidates), website contact scraping, email pattern detection, optional SMTP mailbox verification, and confidence scoring. The best email is selected from all candidates with full transparency into the ranking.
Why Use This on Apify
Traditional contact enrichment tools like Clay, Apollo, or Lusha charge per contact lookup -- often $0.10 to $0.50 per record. This actor uses only free, publicly available data sources (DNS lookups, company websites, and pattern analysis) to achieve similar results at a fraction of the cost.
Running on Apify gives you cloud-based scalability, scheduled runs for ongoing lead enrichment, and easy integration with CRMs and spreadsheets through Apify's built-in integrations. You can process hundreds of contacts in a single run with configurable concurrency, and the results land in a structured dataset ready for export.
Key Features
- Multi-source waterfall -- Cascades through MX validation, pattern generation, website scraping, pattern detection, and optional SMTP verification for maximum accuracy
- Confidence scoring -- Every email candidate gets a 0-100 confidence score based on how many sources confirmed it
- Batch processing -- Enrich entire lists of people in one run with configurable concurrency (1-5 parallel lookups)
- SMTP verification -- Optional deep verification mode that checks whether the mail server accepts the address (without sending email)
- Catch-all detection -- Identifies domains that accept any email address, which reduces false positives in SMTP verification
- Phone number extraction -- Scrapes company websites for publicly listed phone numbers
- Social profile discovery -- Finds LinkedIn, GitHub, and Twitter/X profiles from website links and generates likely profile URLs
- International name support -- Handles accented characters (umlauts, diacritics, cedillas) through automatic ASCII transliteration
- Domain-level caching -- When enriching multiple people at the same company, website scraping and pattern detection results are cached to save time and compute
- Full transparency -- Returns up to 10 ranked email candidates per person so you can see the reasoning behind each result
How to Use
- Open the actor in Apify Console.
- Add your list of people to the People to enrich input field as a JSON array. Each person needs at minimum a `domain` and either `firstName` + `lastName` or `fullName`.
- Choose whether to enable website scraping and pattern detection (both are on by default and improve accuracy).
- Select your verification level: Standard (fast, MX-only) or Deep (slower, adds SMTP verification for higher confidence).
- Click Start and wait for results to appear in the dataset.
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `people` | array | Yes | -- | List of people to enrich. Each object must have `domain` and either `firstName` + `lastName` or `fullName`. Optionally include `company`. |
| `enrichFromWebsite` | boolean | No | `true` | Scrape the company website for emails, phones, and social links using the Website Contact Scraper sub-actor. |
| `detectPattern` | boolean | No | `true` | Detect the company's email naming pattern using the Email Pattern Finder sub-actor. |
| `verificationLevel` | string | No | `standard` | `standard` checks MX records only (fast). `deep` adds SMTP verification per candidate (slower but more accurate). |
| `smtpTimeout` | integer | No | `10` | Timeout in seconds for SMTP connections in deep verification mode (3-30 range). |
| `maxConcurrency` | integer | No | `3` | Number of people to process in parallel (1-5). Lower values are gentler on target mail servers. |
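Checking an input object against these constraints before starting a run can catch mistakes early. The sketch below is a hypothetical client-side helper, not part of the actor: `validate_input` is an illustrative name, and treating an empty `people` array as an error is an assumption.

```python
def validate_input(run_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    people = run_input.get("people")
    if not isinstance(people, list) or not people:
        problems.append("people must be a non-empty array")
        people = []
    for i, person in enumerate(people):
        if not person.get("domain"):
            problems.append(f"people[{i}]: domain is required")
        has_split_name = person.get("firstName") and person.get("lastName")
        if not (has_split_name or person.get("fullName")):
            problems.append(f"people[{i}]: need firstName + lastName or fullName")
    # Defaults below mirror the Default column of the parameter table.
    if run_input.get("verificationLevel", "standard") not in ("standard", "deep"):
        problems.append("verificationLevel must be 'standard' or 'deep'")
    if not 3 <= run_input.get("smtpTimeout", 10) <= 30:
        problems.append("smtpTimeout must be between 3 and 30 seconds")
    if not 1 <= run_input.get("maxConcurrency", 3) <= 5:
        problems.append("maxConcurrency must be between 1 and 5")
    return problems
```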
Input Examples
Sales prospecting list:
{
"people": [
{
"firstName": "Jane",
"lastName": "Smith",
"domain": "acme.com",
"company": "Acme Corp"
},
{
"firstName": "John",
"lastName": "Doe",
"domain": "example.com"
}
],
"enrichFromWebsite": true,
"detectPattern": true,
"verificationLevel": "deep",
"maxConcurrency": 3
}
Quick lookup with full names:
{
"people": [
{"fullName": "María García", "domain": "empresa.es"},
{"fullName": "François Müller", "domain": "firme.de"}
],
"verificationLevel": "standard"
}
Budget-friendly mode (no sub-actor calls):
{
"people": [
{"firstName": "Alex", "lastName": "Chen", "domain": "startup.io"}
],
"enrichFromWebsite": false,
"detectPattern": false,
"verificationLevel": "deep"
}
Input Tips
- Group people from the same company together in your input list -- website and pattern results are cached per domain, so processing all contacts at one company together is more efficient.
- Use `firstName` + `lastName` instead of `fullName` for best accuracy. When using `fullName`, the actor takes the first and last words as first and last name.
- International names with accents (ä, é, ñ, ç) are automatically transliterated to ASCII for pattern generation.
- Disable `enrichFromWebsite` and `detectPattern` to save compute credits when you only need pattern-based guesses with SMTP verification.
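Two of these tips can be applied before a run. The sketch below splits a full name using the first-and-last-word rule described above, so you can check the result yourself, and orders a list so that people sharing a domain sit together. Both helper names are hypothetical, and rejecting single-word names is an assumption.

```python
def split_full_name(full_name: str) -> tuple[str, str]:
    """Take the first and last whitespace-separated words as the name parts."""
    words = full_name.split()
    if len(words) < 2:
        raise ValueError("fullName needs at least two words")
    return words[0], words[-1]

def group_by_domain(people: list[dict]) -> list[dict]:
    """Order people so that entries sharing a domain are adjacent."""
    # sorted() is stable, so the original order is kept within each domain.
    return sorted(people, key=lambda person: person["domain"].lower())

print(split_full_name("Ludwig van Beethoven"))  # ('Ludwig', 'Beethoven')
```

The middle word is dropped, which is why supplying `firstName` and `lastName` separately is safer for names with particles.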
Output Example
Each person produces one output record in the dataset:
{
"firstName": "Jane",
"lastName": "Smith",
"domain": "acme.com",
"company": "Acme Corp",
"email": "jane.smith@acme.com",
"emailConfidence": 95,
"emailSource": "website",
"phone": "+1-555-123-4567",
"phoneSource": "website",
"socialProfiles": {
"linkedin": "https://www.linkedin.com/company/acme-corp",
"github": "https://github.com/janesmith",
"twitter": "https://twitter.com/acmecorp"
},
"status": "found",
"sources": {
"patternGeneration": {
"candidates": ["jane.smith@acme.com", "janesmith@acme.com", "jane@acme.com", "jsmith@acme.com"],
"topPattern": "first.last"
},
"websiteScraping": {
"emailsFound": ["[email protected]", "[email protected]"],
"phonesFound": ["+1-555-123-4567"],
"socialsFound": ["https://www.linkedin.com/company/acme-corp"],
"namesFound": ["Jane Smith"]
},
"patternDetection": {
"detectedPattern": "first.last",
"patternConfidence": 85,
"generatedEmail": "jane.smith@acme.com"
}
},
"allCandidates": [
{
"email": "jane.smith@acme.com",
"pattern": "first.last",
"confidence": 95,
"sources": ["pattern_generation", "website", "pattern_detection"]
},
{
"email": "janesmith@acme.com",
"pattern": "firstlast",
"confidence": 67,
"sources": ["pattern_generation"]
}
],
"domainValid": true,
"mxHost": "mx1.acme.com",
"verifiedAt": "2025-01-15T10:30:00.000Z"
}
Output Fields
| Field | Type | Description |
|---|---|---|
| `firstName` | string | Person's first name (from input or parsed from `fullName`) |
| `lastName` | string | Person's last name |
| `domain` | string | Company domain |
| `company` | string/null | Company name (pass-through from input) |
| `email` | string/null | Best email found (highest confidence candidate) |
| `emailConfidence` | number | Confidence score for the best email on the 0-100 scale (the highest score the actor assigns is 98) |
| `emailSource` | string/null | How the best email was found: `website`, `pattern_detection`, `smtp`, or `pattern_generation` |
| `phone` | string/null | Phone number scraped from company website |
| `phoneSource` | string/null | Always `website` when a phone is found |
| `socialProfiles` | object | LinkedIn, GitHub, Twitter URLs (from website or generated guesses) |
| `status` | string | `found` (confidence >= 70), `likely` (40-69), or `not_found` (< 40) |
| `sources` | object | Full breakdown of what each enrichment step found |
| `allCandidates` | array | Top 10 email candidates ranked by confidence |
| `domainValid` | boolean | Whether the domain has MX records |
| `mxHost` | string/null | Primary mail exchange server |
| `verifiedAt` | string | ISO 8601 timestamp |
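When the best email turns out to be wrong, the runner-up candidates are the next thing to try. A minimal sketch of reading them from one output record, assuming the `email` and `allCandidates` fields documented above; the helper name and the default threshold of 60 are illustrative choices.

```python
def alternative_emails(item: dict, min_confidence: int = 60) -> list[str]:
    """List candidate emails other than the best one, in ranked order."""
    best = item.get("email")
    return [
        candidate["email"]
        for candidate in item.get("allCandidates", [])
        if candidate["email"] != best and candidate["confidence"] >= min_confidence
    ]

record = {
    "email": "jane.smith@acme.com",
    "allCandidates": [
        {"email": "jane.smith@acme.com", "confidence": 95},
        {"email": "janesmith@acme.com", "confidence": 67},
    ],
}
print(alternative_emails(record))  # ['janesmith@acme.com']
```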
Use via API
You can run Waterfall Contact Enrichment programmatically using the Apify API. This is ideal for integrating enrichment into automated lead pipelines.
Python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("ryanclinton/waterfall-contact-enrichment").call(run_input={
    "people": [
        {"firstName": "Jane", "lastName": "Smith", "domain": "acme.com"},
        {"firstName": "John", "lastName": "Doe", "domain": "example.com"},
    ],
    "enrichFromWebsite": True,
    "detectPattern": True,
    "verificationLevel": "deep",
})

# Print the best email and its confidence for each enriched person.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    # "email" is null when nothing was found, so fall back explicitly.
    email = item.get("email") or "not found"
    confidence = item.get("emailConfidence", 0)
    print(f"{item['firstName']} {item['lastName']}: {email} ({confidence}%)")
JavaScript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

const run = await client.actor('ryanclinton/waterfall-contact-enrichment').call({
    people: [
        { firstName: 'Jane', lastName: 'Smith', domain: 'acme.com' },
        { firstName: 'John', lastName: 'Doe', domain: 'example.com' },
    ],
    enrichFromWebsite: true,
    detectPattern: true,
    verificationLevel: 'deep',
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(item => {
    console.log(`${item.firstName} ${item.lastName}: ${item.email || 'not found'} (${item.emailConfidence}%)`);
});
cURL
curl "https://api.apify.com/v2/acts/ryanclinton~waterfall-contact-enrichment/runs?token=YOUR_APIFY_TOKEN" \
-X POST \
-H "Content-Type: application/json" \
-d '{
"people": [
{"firstName": "Jane", "lastName": "Smith", "domain": "acme.com"}
],
"verificationLevel": "deep"
}'
How It Works
The actor runs a multi-step waterfall enrichment pipeline for each person, combining multiple data sources to maximize accuracy. The diagram shows the main stages in order:
┌────────────────────────────────────────────────────────┐
│  INPUT: firstName + lastName + domain                  │
└───────────────────────┬────────────────────────────────┘
                        │
             ┌──────────▼──────────┐
             │  1. MX Validation   │  DNS MX lookup (cached per domain)
             │  No MX → not_found  │
             └──────────┬──────────┘
             ┌──────────▼──────────┐
             │  2. Pattern Gen     │  15 candidates ranked by popularity
             │  first.last@...     │  first.last, firstlast, first, flast...
             └──────────┬──────────┘
                        │
           ┌────────────┴────────────┐  (run in parallel)
           │                         │
┌──────────▼──────────┐   ┌──────────▼──────────┐
│  3a. Website Scrape │   │  3b. Pattern Detect │
│  Contact Scraper    │   │  Pattern Finder     │
│  (cached per domain)│   │  (cached per domain)│
└──────────┬──────────┘   └──────────┬──────────┘
           │                         │
           └────────────┬────────────┘
             ┌──────────▼──────────┐
             │  4. Cross-Reference │  Match website emails to person name
             │  Direct match?      │  → websiteDirectMatch (90-98%)
             └──────────┬──────────┘
             ┌──────────▼──────────┐
             │  5. Merge Candidates│  Combine all sources, dedup
             └──────────┬──────────┘
             ┌──────────▼──────────┐
             │  6. SMTP Verify     │  Deep mode only, top 5 candidates
             │  (1s between checks)│  Stop on first valid + catch-all test
             └──────────┬──────────┘
             ┌──────────▼──────────┐
             │  7. Score & Rank    │  Multi-signal confidence scoring
             └──────────┬──────────┘
             ┌──────────▼──────────┐
             │  8. Best Email +    │  Pick top candidate
             │  Phone + Social     │  Extract phone, merge social profiles
             └─────────────────────┘
Email Pattern Candidates
The actor generates up to 15 email candidates per person using the most common B2B email naming conventions, ranked by industry prevalence:
| Pattern | Example | Popularity |
|---|---|---|
| first.last | jane.smith@acme.com | Most common |
| firstlast | janesmith@acme.com | Very common |
| first | jane@acme.com | Common |
| flast | jsmith@acme.com | Common |
| f.last | j.smith@acme.com | Moderate |
| first_last | jane_smith@acme.com | Moderate |
| first-last | jane-smith@acme.com | Moderate |
| firstl | janes@acme.com | Less common |
| first.l | jane.s@acme.com | Less common |
| f_last | j_smith@acme.com | Uncommon |
| last.first | smith.jane@acme.com | Uncommon |
| lastfirst | smithjane@acme.com | Uncommon |
| last | smith@acme.com | Rare |
| last.f | smith.j@acme.com | Rare |
| lastf | smithj@acme.com | Rare |
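The fifteen patterns in the table can be expressed as simple string templates. The following is a sketch of that idea rather than the actor's own code: the template syntax and the name `generate_candidates` are assumptions, and the list order follows the table.

```python
# (pattern name, template) pairs in the same order as the table above.
PATTERNS = [
    ("first.last", "{first}.{last}"),
    ("firstlast", "{first}{last}"),
    ("first", "{first}"),
    ("flast", "{f}{last}"),
    ("f.last", "{f}.{last}"),
    ("first_last", "{first}_{last}"),
    ("first-last", "{first}-{last}"),
    ("firstl", "{first}{l}"),
    ("first.l", "{first}.{l}"),
    ("f_last", "{f}_{last}"),
    ("last.first", "{last}.{first}"),
    ("lastfirst", "{last}{first}"),
    ("last", "{last}"),
    ("last.f", "{last}.{f}"),
    ("lastf", "{last}{f}"),
]

def generate_candidates(first: str, last: str, domain: str) -> list[tuple[str, str]]:
    """Return (pattern, email) pairs for one person, most common pattern first."""
    first, last = first.lower(), last.lower()
    parts = {"first": first, "last": last, "f": first[0], "l": last[0]}
    return [(name, template.format(**parts) + "@" + domain) for name, template in PATTERNS]

for pattern, email in generate_candidates("Jane", "Smith", "acme.com")[:3]:
    print(pattern, email)
```

Names are assumed to be non-empty and already transliterated to ASCII before they reach this function.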
Confidence Scoring Algorithm
Each candidate is scored based on cascading signals from all enrichment steps:
| Condition | Confidence | Status |
|---|---|---|
| SMTP rejected (550) | 5 | not_found |
| Website match + SMTP valid + not catch-all | 98 | found |
| Website match + SMTP valid (unknown catch-all) | 95 | found |
| Website match (no SMTP) | 90 | found |
| SMTP valid + not catch-all | 95 | found |
| SMTP valid + catch-all | 50 | likely |
| SMTP valid (unknown catch-all) | 80 | found |
| Website match + catch-all | 60 | likely |
| Pattern detected (high confidence) | 65-80 | found/likely |
| Pattern guess (first.last, firstlast) | 67-70 | likely/found |
| Pattern guess (flast, f.last, first_last) | 64-67 | likely |
| Pattern guess (uncommon patterns) | 60 | likely |
Status thresholds: found at confidence >= 70, likely at 40-69, not_found below 40.
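The thresholds and the website/SMTP rows of the table translate into a small decision function. This is a sketch under stated assumptions: the function names are hypothetical, the order in which overlapping rows are checked is a guess, and the pattern-only rows are left out.

```python
from typing import Optional

def status_for(confidence: int) -> str:
    """Map a confidence score to the status thresholds stated above."""
    if confidence >= 70:
        return "found"
    if confidence >= 40:
        return "likely"
    return "not_found"

def smtp_website_score(website_match: bool, smtp: Optional[str],
                       catch_all: Optional[bool]) -> Optional[int]:
    """Score the website/SMTP rows of the table.

    smtp is "valid", "rejected", or None (not checked); catch_all is True,
    False, or None (unknown). Row precedence here is an assumption.
    """
    if smtp == "rejected":
        return 5
    if catch_all:  # a catch-all domain caps the score
        return 60 if website_match else 50
    if website_match:
        if smtp == "valid":
            return 98 if catch_all is False else 95
        return 90
    if smtp == "valid":
        return 95 if catch_all is False else 80
    return None  # pattern-only candidates use the pattern rows instead

print(status_for(smtp_website_score(False, "valid", True)))  # likely
```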
Sub-Actor Integration
The actor calls two other actors from the same suite as sub-actors:
| Sub-Actor | When Called | What It Returns | Cache |
|---|---|---|---|
| Website Contact Scraper | When enrichFromWebsite is true | Emails, phones, social links, team names | Per domain (120s timeout) |
| Email Pattern Finder | When detectPattern is true | Company email pattern + confidence | Per domain (120s timeout) |
Both run in parallel for each new domain. When enriching multiple people at the same company, the second person reuses cached results instantly.
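Per-domain caching of this kind can be pictured as a dictionary keyed by domain that wraps the expensive call. The class below is an illustrative sketch, not the actor's implementation; `DomainCache` and its `fetch` callback are hypothetical names, and lowercasing the key is an assumption.

```python
from typing import Callable

class DomainCache:
    """Run an expensive per-domain lookup at most once per domain."""

    def __init__(self, fetch: Callable[[str], dict]):
        self._fetch = fetch
        self._results: dict[str, dict] = {}
        self.calls = 0  # number of times the underlying lookup actually ran

    def get(self, domain: str) -> dict:
        key = domain.lower()
        if key not in self._results:
            self.calls += 1
            self._results[key] = self._fetch(key)
        return self._results[key]

cache = DomainCache(lambda domain: {"domain": domain, "pattern": "first.last"})
cache.get("acme.com")
cache.get("ACME.com")  # served from the cache
print(cache.calls)     # 1
```

With parallel lookups, a real implementation would also need to share in-flight requests so that two people at the same new domain do not trigger two sub-actor calls.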
Name Transliteration
International names are automatically transliterated to ASCII for pattern generation:
| Input | Transliterated | Generated Email |
|---|---|---|
| María García | maria garcia | [email protected] |
| François Müller | francois muller | [email protected] |
| Jiří Řehák | jiri rehak | [email protected] |
| Hans Straße | hans strasse | [email protected] |
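One common way to get these results in Python is Unicode decomposition plus a small table for letters that do not decompose. This is a sketch that reproduces the Transliterated column above; whether the actor works this way is not stated, and `to_ascii` is a hypothetical name.

```python
import unicodedata

# Letters with no Unicode decomposition need an explicit replacement.
SPECIAL = {"ß": "ss"}

def to_ascii(name: str) -> str:
    """Lowercase a name and strip diacritics, leaving ASCII only."""
    replaced = "".join(SPECIAL.get(ch, ch) for ch in name.lower())
    # NFKD splits "é" into "e" plus a combining accent.
    decomposed = unicodedata.normalize("NFKD", replaced)
    # Keeping only ASCII drops the combining marks and anything else left over.
    return "".join(ch for ch in decomposed if ch.isascii())

for name in ["María García", "François Müller", "Jiří Řehák", "Hans Straße"]:
    print(to_ascii(name))
```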
Cost
The actor itself uses minimal compute (256 MB memory). However, when website scraping and pattern detection are enabled (the defaults), it calls two sub-actors per unique domain, which adds run time and sub-actor charges. Approximate run times by configuration:
| Setting | 1 Person | 10 People (1 domain) | 50 People (20 domains) |
|---|---|---|---|
| All features + deep | ~2 min | ~3 min (cached) | ~15-20 min |
| All features + standard | ~1 min | ~2 min (cached) | ~10-15 min |
| No sub-actors + deep | ~30 sec | ~2 min | ~5-10 min |
| No sub-actors + standard | ~5 sec | ~10 sec | ~1 min |
To reduce costs, disable website scraping and/or pattern detection. The actor will still generate pattern-based email candidates with 60-70% confidence.
Tips
- Start with deep verification for critical outreach campaigns where bounce rates matter. Use standard mode for initial prospecting or large-volume enrichment.
- Group people by company domain in your input list. The actor caches website and pattern results per domain, so processing all contacts at one company together is much more efficient.
- Check the `allCandidates` array in the output if the top email has low confidence. Sometimes the second or third candidate is correct, especially for companies with unusual email patterns.
- Use the `company` field in your input -- it does not affect the enrichment logic but is passed through to the output, making downstream data processing easier.
- Lower concurrency to 1 when targeting a single mail server in deep mode to avoid being rate-limited or blocked.
- Chain with Bulk Email Verifier to double-check results. The waterfall actor's SMTP check only tests the top 5 candidates. Running the best email through the full verifier adds another layer of confidence.
- Disable sub-actors for cost savings. Setting `enrichFromWebsite: false` and `detectPattern: false` skips the sub-actor calls entirely. You still get 15 pattern-based candidates and SMTP verification (in deep mode).
Limitations
- No guaranteed accuracy. Email enrichment is inherently probabilistic. Even high-confidence results (90%+) can be wrong if a person uses an unusual email format. The `allCandidates` array provides alternatives.
- Catch-all domains reduce confidence. Domains that accept all addresses (catch-all) make SMTP verification meaningless. These emails cap at 50-60% confidence.
- Social profile URLs may be guesses. When no social links are found on the company website, the actor generates likely profile URLs (e.g., linkedin.com/in/firstname-lastname). These are unverified guesses and may point to the wrong person.
- Only top 5 candidates are SMTP-verified. To avoid hammering mail servers, the actor only tests the top 5 candidates and stops at the first valid one. Less common patterns may not be verified.
- Sub-actor costs add up. Website scraping and pattern detection each call a separate actor. For large lists across many domains, sub-actor costs can exceed the main actor's compute cost.
- Personal email providers are unsupported. The actor is designed for B2B company domains. Gmail, Yahoo, and other free providers will have valid MX records but pattern generation will produce meaningless results.
- Name parsing is English-centric. The `fullName` parser takes the first and last words as first and last name, which may not work for names with particles (e.g., "Ludwig van Beethoven" → first: "ludwig", last: "beethoven", missing "van").
Responsible Use
- Respect privacy laws. Contact enrichment does not grant permission to contact someone. Comply with GDPR, CAN-SPAM, CASL, and other applicable regulations before sending outreach.
- Use for legitimate B2B outreach only. This tool is designed for professional business communication, not mass unsolicited emails or spam.
- Rate limit SMTP checks. The actor includes built-in delays (1 second between SMTP checks, 500ms before catch-all tests), but avoid running many parallel instances against the same mail server.
- Verify before sending. Even high-confidence emails should be verified through a dedicated email verification tool before sending campaigns to protect your sender reputation.
FAQ
Does this actor send any emails? No. Even in deep verification mode, the actor only opens an SMTP connection and checks whether the mail server would accept the address. It disconnects before the DATA stage, so no email is ever sent or received.
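For readers curious what such a check looks like, here is a rough sketch using Python's standard `smtplib`. It stops after the RCPT TO reply and never issues DATA. The HELO name, the envelope sender, the treatment of every 5xx reply as a rejection, and the `smtp_factory` parameter (there so the function can be exercised without a network) are all assumptions, not details of the actor.

```python
import smtplib

def mailbox_accepted(address, mx_host, timeout=10, smtp_factory=smtplib.SMTP):
    """Return True/False from the RCPT TO reply, or None if the check is inconclusive."""
    try:
        with smtp_factory(mx_host, 25, timeout=timeout) as server:
            server.helo("verifier.example")        # hypothetical HELO name
            server.mail("probe@verifier.example")  # hypothetical envelope sender
            code, _ = server.rcpt(address)
            # Leaving the with-block sends QUIT; DATA is never issued.
    except (OSError, smtplib.SMTPException):
        return None  # connection refused, timeout, or protocol error
    if 200 <= code < 300:
        return True
    if 500 <= code < 600:
        return False
    return None  # 4xx replies: temporary failure or greylisting

class FakeSMTP:
    """Stand-in server for trying the function offline."""
    def __init__(self, host, port, timeout=None): pass
    def __enter__(self): return self
    def __exit__(self, *exc): return False
    def helo(self, name): return (250, b"ok")
    def mail(self, sender): return (250, b"ok")
    def rcpt(self, address):
        return (550, b"no such user") if address.startswith("nobody@") else (250, b"ok")

print(mailbox_accepted("nobody@acme.com", "mx1.acme.com", smtp_factory=FakeSMTP))  # False
```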
How accurate are the results? Accuracy depends on the verification level and available data. Website-confirmed emails score 90%+. Pattern-detected emails score 65-80%. Pattern-only guesses score 60-70%. SMTP-verified addresses on non-catch-all domains score 95%+. In practice, the actor finds a high-confidence email (70%+) for roughly 40-60% of B2B contacts.
What is a catch-all domain? A catch-all (or accept-all) domain is configured to accept email sent to any address at that domain, even nonexistent ones. This means SMTP verification cannot distinguish between real and fake addresses. The actor detects catch-all domains and adjusts confidence scores downward accordingly.
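A common way to detect a catch-all domain is to probe a mailbox that almost certainly does not exist and see whether the server accepts it. The source does not say how the actor does this, so the sketch below is only one plausible approach; both function names and the `check` callback (any function returning True, False, or None for an address) are hypothetical.

```python
import secrets

def random_probe_address(domain: str) -> str:
    """Build an address whose mailbox is very unlikely to exist."""
    return f"{secrets.token_hex(8)}@{domain}"

def is_catch_all(domain: str, check) -> object:
    """Return True if a random mailbox is accepted, False if rejected, None if unknown."""
    return check(random_probe_address(domain))

# A server that accepts everything looks like a catch-all.
print(is_catch_all("acme.com", lambda address: True))  # True
```

The three possible results line up with the "not catch-all", "catch-all", and "unknown catch-all" cases in the scoring table.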
Can I use this for personal email addresses (Gmail, Yahoo, etc.)? The actor is designed for B2B company domains. Personal email providers like Gmail will have valid MX records but pattern generation and website scraping will not produce useful results. The actor works best with company domains where employees share a consistent email naming pattern.
What happens if a domain has no website or no MX records? If a domain has no MX records, the actor immediately returns a `not_found` status with 0% confidence. If the website is unreachable, the actor skips that enrichment step and relies on pattern generation and SMTP verification alone.
How are names with accents handled? Names with accented characters are automatically transliterated to ASCII equivalents before pattern generation. For example, "María" becomes "maria", "Müller" becomes "muller", and "ß" becomes "ss". This ensures email patterns work correctly since most email systems use ASCII-only addresses.
Integrations
Connect Waterfall Contact Enrichment with other tools and platforms:
- Export to Google Sheets -- Use Apify's Google Sheets integration to automatically send enriched contacts to a spreadsheet for your sales team.
- Push to CRM -- Connect to HubSpot via the HubSpot Lead Pusher actor, or to Salesforce and Pipedrive via Apify webhooks.
- Chain with other actors -- Feed output from a lead scraper (like Google Maps or LinkedIn) directly into this actor for contact enrichment.
- API access -- Call this actor programmatically via the Apify API from any language or platform.
- Zapier and Make -- Trigger enrichment runs from Zapier or Make workflows, then route the results to your email marketing tool, CRM, or notification system.
- Scheduled runs -- Set up Apify schedules to periodically re-enrich your contact database as people change companies.
Related Actors
These actors from ryanclinton on the Apify Store work well with Waterfall Contact Enrichment:
| Actor | What It Does | How It Connects |
|---|---|---|
| Website Contact Scraper | Extract emails, phones, and team members from websites | Called as a sub-actor for website enrichment |
| Email Pattern Finder | Discover company email patterns | Called as a sub-actor for pattern detection |
| Bulk Email Verifier | Verify email deliverability | Double-check enriched emails before outreach |
| B2B Lead Qualifier | Score and grade leads | Qualify enriched contacts for pipeline prioritization |
| HubSpot Lead Pusher | Push leads to HubSpot CRM | Push enriched contacts with emails directly to HubSpot |
| Google Maps Lead Enricher | Enrich Google Maps business listings | Get company domains from Maps, then enrich key contacts |
Related actors
Website Contact Scraper
Extract emails, phone numbers, team members, and social media links from any business website. Feed it URLs from Google Maps or your CRM and get structured contact data back. Fast HTTP requests, no browser — scrapes 1,000 sites for ~$0.50.
Email Pattern Finder
Discover the email format used by any company. Enter a domain like stripe.com and detect patterns like [email protected]. Then generate email addresses for any name. Combine with Website Contact Scraper to turn company websites into complete email lists.
B2B Lead Qualifier - Score & Rank Company Leads
Score and rank B2B leads 0-100 by crawling company websites. Analyzes 30+ signals across contact reachability, business legitimacy, online presence, website quality, and team transparency. No AI keys needed.
Google Maps Lead Enricher
Search Google Maps for businesses, then automatically enrich each result with emails, phone numbers, named contacts, social links, email patterns, and lead quality scores (0-100) through a 4-step pipeline.