CSV Lead Processor
**CSV lead processing** for sales teams and agencies who already have a list — Apollo exports, LinkedIn Sales Navigator CSVs, trade show downloads, or any other lead file. Upload your CSV, map your column headers to a standard format, and receive a clean, enriched, verified dataset ready for your CRM or outreach tool. No coding required.
The actor handles the full data pipeline in one run: it parses your file, normalizes field values, deduplicates by company domain, fills in missing email addresses by scraping company websites and detecting email format patterns, verifies deliverability against live mail servers, and exports an enriched CSV back to you. What takes a half-day of manual work in Excel and Hunter.io is done in minutes.
What data can you extract?
| Data point | Source | Example |
|---|---|---|
| 📋 Company name | CSV column | Meridian Software Group |
| 🌐 Website | CSV column or derived from domain | https://meridiansoftware.io |
| 🔗 Domain | Extracted from website URL | meridiansoftware.io |
| 👤 Contact first name | CSV column | Danielle |
| 👤 Contact last name | CSV column | Okafor |
| 📧 Primary email | CSV, scraper, or pattern generator | danielle.okafor@meridiansoftware.io |
| 📞 Phone number | CSV column | +1 (415) 882-3301 |
| 💼 Job title | CSV column | VP of Engineering |
| 🔗 LinkedIn URL | CSV column (normalized) | https://linkedin.com/company/meridian-software-group |
| 🏢 Industry | CSV column | Enterprise Software |
| 👥 Employee count | CSV column | 51-200 |
| 📍 City / State / Country | CSV column | Austin / TX / United States |
| 📧 Enriched emails | Website contact scraper | info@meridiansoftware.io; hello@meridiansoftware.io |
| 🔍 Email pattern | Pattern finder sub-actor | {first}.{last}@meridiansoftware.io |
| ✅ Email verified | Bulk email verifier (MX + SMTP) | true |
| 📊 Email status | Bulk email verifier | valid |
| 📊 Email confidence | Bulk email verifier (0-100) | 94 |
Why use CSV Lead Processor?
Your lead list is only as good as the contact data inside it. A typical Apollo or LinkedIn export has incomplete email coverage — anywhere from 30-60% of rows may be missing a deliverable email address. Cleaning that by hand means opening each company website, guessing email formats, validating in a separate tool, and re-importing to your CRM. A list of 500 leads can easily consume a full workday.
This actor automates the entire pipeline. Feed it any CSV, tell it which column is "Company Name" and which is "Email", and it handles the rest: normalizing inconsistent formatting, removing duplicate companies, calling a website scraper for missing emails, falling back to email pattern detection when the scraper finds nothing, and running every email through live mail server verification. You get back a single enriched file with confidence scores you can act on immediately.
Beyond the data itself, the Apify platform adds:
- Scheduling — run weekly on a refreshed export to keep enrichment current as your pipeline grows
- API access — trigger runs from Python, JavaScript, or any HTTP client and pull results programmatically
- Proxy infrastructure — the website contact scraper sub-actor uses Apify's residential proxy pool, reducing block rates on company sites
- Monitoring — configure Slack or email alerts when a run fails or processes fewer rows than expected
- Integrations — connect output to Zapier, Make, Google Sheets, HubSpot, or any webhook destination
Features
- Flexible CSV ingestion — accepts files via public URL or base64-encoded payload; supports comma, semicolon, and tab (TSV) delimiters with automatic UTF-8 BOM stripping so Excel exports parse cleanly
- Case-insensitive column mapping — map any header name to a canonical field; headers like "Company Name", "company name", and "COMPANY NAME" all match without configuration changes
- Domain extraction and normalization — automatically strips the scheme (`https://`), `www.` prefix, and trailing paths to produce a clean domain string; derives `website` from `domain` and vice versa when only one is present
- Email format validation — discards any email value that fails a basic `name@domain.tld` shape before processing; malformed values from source files never propagate downstream
- LinkedIn URL normalization — bare handles (e.g. `meridian-software-group`) are expanded to full `https://linkedin.com/company/` URLs automatically
- Domain-level deduplication — keeps only the first row per company domain by default, preventing the same company from being enriched and verified multiple times
- Two-stage email enrichment — first calls Website Contact Scraper (up to 3 pages per domain) on all domains lacking an email; any domain still without an email then goes to Email Pattern Finder for format detection and name-based generation
- Batch sub-actor calls — all domains without emails are sent in a single batch to each sub-actor, not one call per row; this makes enrichment dramatically faster and cheaper for large files
- Pattern-based email generation — when a format like `{first}.{last}@domain.com` is detected with confidence, it is applied to the contact's name to produce a candidate email
- Bulk email verification — calls Bulk Email Verifier with DNS MX checks and SMTP probing; results include `emailStatus` (`valid`, `risky`, `invalid`, `unknown`, `disposable`) and a 0-100 confidence score
- Downloadable output CSV — writes a UTF-8 with BOM CSV to the actor's Key-Value Store and returns the direct download URL in the summary record; ready to import into any CRM
- Configurable output columns — choose which canonical fields appear in the output CSV and in what order; enrichment columns are always appended
- Spending limit enforcement — the actor checks the pay-per-event limit after each row push and stops cleanly if your budget cap is reached, so you are never charged more than you authorize
- Row cap for testing — set `maxRows` to process a subset of a large file before committing to a full run
- Streaming CSV parser — uses an async generator over a `csv-parse` stream so even multi-megabyte files are processed without memory pressure; a 512 MB memory allocation handles files up to tens of thousands of rows
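To illustrate the domain extraction and normalization feature, here is a minimal sketch. The helper names are hypothetical, not the actor's actual source:

```javascript
// Hypothetical sketch of domain normalization: strip the scheme,
// the "www." prefix, and any path/query to leave a bare domain.
function normalizeDomain(value) {
  if (!value) return null;
  let s = String(value).trim().toLowerCase();
  s = s.replace(/^[a-z][a-z0-9+.-]*:\/\//, ""); // strip scheme, e.g. https://
  s = s.replace(/^www\./, "");                  // strip www. prefix
  s = s.split(/[\/?#]/)[0];                     // drop path, query, fragment
  return s.includes(".") ? s : null;            // must at least look like a domain
}

// Deriving website from domain when only one of the two is present:
function deriveWebsite(domain) {
  return domain ? `https://${domain}` : null;
}

console.log(normalizeDomain("https://www.meridiansoftware.io/about?x=1"));
// meridiansoftware.io
```

The same normalized string is what the deduplication step keys on, which is why inconsistent `http://` vs `https://www.` values in a source CSV still collapse to one company.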
Use cases for CSV lead processing
Sales prospecting list cleanup
Sales development reps at B2B companies frequently export from Apollo, ZoomInfo, or LinkedIn Sales Navigator and find that 40% of rows have no email, and another 20% have emails that bounce. Running those exports through this actor before loading into Outreach, Salesloft, or HubSpot means the sequences your team sets up actually reach real inboxes. A 500-row export cleaned and enriched here typically surfaces 150-200 net-new deliverable contacts.
Marketing agency client deliverables
Agencies managing lead generation campaigns for multiple clients receive raw lists from various sources — event registrations, trade show scans, inbound form submissions. This actor standardizes any format into a consistent schema, enriches missing contact data, and returns a verified CSV the client can load directly into their CRM. One actor run replaces a multi-tool workflow involving manual Excel cleanup, Hunter.io searches, and NeverBounce verification.
Recruiting and talent sourcing outreach
Recruiters working from company target lists exported from LinkedIn often have company names and websites but no individual contact emails. The enrichment step scrapes company websites for HR and talent acquisition contacts, and the pattern finder generates candidate emails from the recruiter's target contact names. The verified results feed directly into recruiting outreach sequences.
Data enrichment for event-sourced leads
Trade show badge scans and conference registration exports typically include name, company, and phone, but rarely email. This actor's enrichment pipeline fills those gaps systematically: the website scraper visits the company site, and the pattern finder generates email candidates using the contact name. Combined with verification, you know which generated emails are safe to send before the event follow-up window closes.
CRM hygiene and re-engagement campaigns
Existing CRM records go stale. Companies change domains, contacts move roles, and email addresses churn. Exporting a segment of cold or bounced contacts as a CSV and running it through this actor with verification enabled quickly identifies which records still have valid delivery paths and which need to be marked as undeliverable, without touching your CRM's automation triggers.
Lead list arbitrage and resale
Data brokers and list vendors who buy raw contact data for resale use this actor to standardize inconsistent formats from multiple sources into a single canonical schema, deduplicate by domain, and append verification scores before packaging. The output CSV is formatted for direct import by the end customer.
How to process and enrich a CSV lead list
1. Upload your CSV — paste a public URL (Google Drive shareable link, S3 signed URL, Dropbox direct link) into the `csvUrl` field, or base64-encode your file and paste the encoded string into `csvBase64`. No file size limit is enforced by the actor.
2. Set your column mapping — in the `columnMapping` field, add one entry per column you want to keep. The key is your CSV header exactly as it appears (case-insensitive matching is applied automatically), and the value is the canonical field name. For an Apollo export: `"Company": "companyName"`, `"Website": "website"`, `"First Name": "firstName"`, `"Last Name": "lastName"`, `"Email": "email"`, `"Title": "title"`. Any unmapped columns are ignored.
3. Choose enrichment and verification — check `enrichEmails` to automatically find emails for rows that have a website but no email address. Check `verifyEmails` to run all emails through DNS MX and SMTP probing. Either option can be used independently.
4. Click Start and wait — a 100-row CSV with no enrichment takes under 30 seconds. Enabling enrichment for 100 domains adds roughly 2-5 minutes. Enabling verification adds 1-3 minutes. Download your enriched CSV from the Key-Value Store URL in the summary record, or pull the structured dataset directly from the Dataset tab.
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `csvUrl` | string | One of `csvUrl` / `csvBase64` required | — | Public URL of the CSV file to download. Supports HTTP and HTTPS. Retried up to 3 times with exponential backoff on transient failures. |
| `csvBase64` | string | One of `csvUrl` / `csvBase64` required | — | Base64-encoded CSV content. Use when you cannot provide a public URL (e.g. uploading directly from a script). |
| `columnMapping` | object | Required | See default | JSON object mapping CSV header strings to canonical field names. Keys are your CSV column headers (case-insensitive). Values must be one of the 17 canonical field names listed below. |
| `enrichEmails` | boolean | No | `false` | When `true`, rows with a domain but no email are enriched via Website Contact Scraper, then Email Pattern Finder. All domains are sent in a single batch call. |
| `verifyEmails` | boolean | No | `false` | When `true`, all emails (from CSV and enrichment) are verified via Bulk Email Verifier using DNS MX + SMTP probing. All unique emails are sent in a single batch. |
| `deduplicateByDomain` | boolean | No | `true` | Keep only the first row per company domain. Requires `website` or `domain` to be mapped. |
| `outputCsv` | boolean | No | `true` | Write an enriched CSV (UTF-8 with BOM) to the Key-Value Store and return the download URL in the summary record. |
| `outputColumns` | array | No | `[]` (all) | Canonical field names to include in the output CSV, in the order listed. Enrichment columns are always appended. Leave empty to include all 17 canonical fields. |
| `csvDelimiter` | string | No | `,` | Field delimiter for both input and output. Options: `,` (comma), `;` (semicolon for European locales), `\t` (tab / TSV). |
| `maxRows` | integer | No | `0` (unlimited) | Stop after processing this many rows. Set to a small number (e.g. 10) to test a large file before a full run. |
Canonical field names (valid values for `columnMapping`): `companyName`, `website`, `domain`, `firstName`, `lastName`, `fullName`, `email`, `phone`, `title`, `linkedinUrl`, `industry`, `employeeCount`, `city`, `state`, `country`, `description`, `tags`.
Input examples
Apollo export with enrichment and verification:
```json
{
  "csvUrl": "https://storage.googleapis.com/my-bucket/apollo-export-2024-q1.csv",
  "columnMapping": {
    "Company": "companyName",
    "Website": "website",
    "First Name": "firstName",
    "Last Name": "lastName",
    "Email": "email",
    "Title": "title",
    "Industry": "industry",
    "Employees": "employeeCount",
    "City": "city",
    "Country": "country"
  },
  "enrichEmails": true,
  "verifyEmails": true,
  "deduplicateByDomain": true,
  "outputCsv": true
}
```
LinkedIn Sales Navigator export (company list, no emails):
```json
{
  "csvUrl": "https://storage.googleapis.com/my-bucket/linkedin-companies.csv",
  "columnMapping": {
    "Company Name": "companyName",
    "Website": "website",
    "Industry": "industry",
    "Company Size": "employeeCount",
    "Headquarters": "city",
    "Country": "country"
  },
  "enrichEmails": true,
  "verifyEmails": false,
  "deduplicateByDomain": true
}
```
Quick test on first 10 rows only:
```json
{
  "csvUrl": "https://storage.googleapis.com/my-bucket/large-list-5000-rows.csv",
  "columnMapping": {
    "Name": "companyName",
    "Domain": "domain",
    "Contact Email": "email"
  },
  "maxRows": 10,
  "enrichEmails": false,
  "verifyEmails": true,
  "outputCsv": true
}
```
Input tips
- Test with `maxRows: 10` first — on a large file, run 10 rows with your mapping to confirm the column names are matching before spending credits on the full file.
- Map `domain` or `website`, not just `email` — enrichment only runs on rows where a domain is known. If your CSV has neither, the actor cannot look up missing emails.
- Use `deduplicateByDomain: false` for contact-level lists — if your CSV intentionally has multiple contacts per company (e.g. 5 decision-makers at one account), disable deduplication so all rows are kept.
- Batch in one run — processing 500 rows in one run is faster than 5 runs of 100, because enrichment sub-actors benefit from batch calls across all domains at once.
- European CSV files — if your file uses semicolons as delimiters (common in French, German, and Spanish Excel exports), set `csvDelimiter` to `;`.
Output example
```json
{
  "companyName": "Meridian Software Group",
  "website": "https://meridiansoftware.io",
  "domain": "meridiansoftware.io",
  "firstName": "Danielle",
  "lastName": "Okafor",
  "fullName": null,
  "email": "danielle.okafor@meridiansoftware.io",
  "phone": "+1 (415) 882-3301",
  "title": "VP of Engineering",
  "linkedinUrl": "https://linkedin.com/company/meridian-software-group",
  "industry": "Enterprise Software",
  "employeeCount": "51-200",
  "city": "Austin",
  "state": "TX",
  "country": "United States",
  "description": "B2B workflow automation platform for mid-market teams.",
  "tags": "saas, enterprise, workflow",
  "enrichedEmails": ["info@meridiansoftware.io", "hello@meridiansoftware.io"],
  "emailPattern": "{first}.{last}@meridiansoftware.io",
  "emailPatternConfidence": 87,
  "generatedEmails": ["danielle.okafor@meridiansoftware.io", "d.okafor@meridiansoftware.io"],
  "emailVerified": true,
  "emailStatus": "valid",
  "emailConfidence": 94,
  "sourceRowIndex": 3,
  "enrichmentApplied": true,
  "verificationApplied": true,
  "processedAt": "2026-03-22T09:14:37.821Z"
}
```
In addition to individual lead records, the actor pushes a summary record (identifiable by `"type": "summary"`) at the end of every run:
```json
{
  "type": "summary",
  "totalRowsRead": 487,
  "rowsAfterDedup": 431,
  "rowsWithEmail": 358,
  "rowsWithoutEmail": 73,
  "enrichmentAttempted": 73,
  "emailsFoundByEnrichment": 51,
  "emailsVerified": 358,
  "emailsValid": 312,
  "emailsInvalid": 46,
  "leadsPushed": 431,
  "csvKey": "output-leads.csv",
  "csvDownloadUrl": "https://api.apify.com/v2/key-value-stores/STORE_ID/records/output-leads.csv",
  "enrichmentEnabled": true,
  "verificationEnabled": true,
  "deduplicatedByDomain": true,
  "spendingLimitReached": false,
  "processedAt": "2026-03-22T09:18:02.114Z"
}
```
Output fields
| Field | Type | Description |
|---|---|---|
| `companyName` | string \| null | Company name from CSV |
| `website` | string \| null | Company website URL (derived from `domain` if not mapped) |
| `domain` | string \| null | Normalized domain, e.g. `acmecorp.com` (derived from `website` if not mapped) |
| `firstName` | string \| null | Contact first name |
| `lastName` | string \| null | Contact last name |
| `fullName` | string \| null | Contact full name (used for pattern-based email generation when first/last are not available) |
| `email` | string \| null | Primary email. Source priority: CSV → contact scraper → pattern generator |
| `phone` | string \| null | Contact phone number |
| `title` | string \| null | Contact job title |
| `linkedinUrl` | string \| null | LinkedIn URL, normalized to `https://linkedin.com/company/...` |
| `industry` | string \| null | Company industry |
| `employeeCount` | string \| null | Company size string, e.g. `51-200` |
| `city` | string \| null | Company city |
| `state` | string \| null | Company state or region |
| `country` | string \| null | Company country |
| `description` | string \| null | Company or contact description |
| `tags` | string \| null | Comma-separated tags from the source CSV |
| `enrichedEmails` | string[] | Additional emails found by the website contact scraper |
| `emailPattern` | string \| null | Detected email format pattern, e.g. `{first}.{last}@domain.com` |
| `emailPatternConfidence` | number \| null | Confidence score (0-100) for the detected pattern |
| `generatedEmails` | string[] | Emails generated by applying the detected pattern to the contact name |
| `emailVerified` | boolean \| null | `true` if the primary email passed DNS MX + SMTP verification |
| `emailStatus` | string \| null | Verification result: `valid`, `risky`, `invalid`, `unknown`, or `disposable` |
| `emailConfidence` | number \| null | Verification confidence score (0-100) |
| `sourceRowIndex` | number | 1-based row number in the source CSV for traceability |
| `enrichmentApplied` | boolean | Whether this lead went through the enrichment sub-actors |
| `verificationApplied` | boolean | Whether this lead's email was submitted to Bulk Email Verifier |
| `processedAt` | string | ISO 8601 timestamp when this row was processed |
How much does it cost to process a CSV lead list?
CSV Lead Processor uses pay-per-event pricing — you pay $0.05 per lead successfully parsed and pushed to the dataset. Platform compute costs are included.
| Scenario | Leads | Cost per lead | Total cost |
|---|---|---|---|
| Quick test | 10 | $0.05 | $0.50 |
| Small campaign | 100 | $0.05 | $5.00 |
| Medium list | 500 | $0.05 | $25.00 |
| Large export | 2,000 | $0.05 | $100.00 |
| Agency monthly | 5,000 | $0.05 | $250.00 |
Note that enabling `enrichEmails` or `verifyEmails` triggers sub-actor runs (Website Contact Scraper at $0.15/site, Email Pattern Finder at $0.10/domain, Bulk Email Verifier at $0.005/email), which are charged separately against your Apify account. A fully enriched and verified run on 500 leads, where 150 need email enrichment, 100 of those fall through to the pattern finder, and all 500 are verified, costs approximately $25 (lead processing) + $22.50 (contact scraper) + $10 (pattern finder) + $2.50 (verification) = around $60 total. Compare that to Hunter.io at $99/month for 1,000 searches, or Clay at $149-499/month.
You can set a maximum spending limit per run in the actor's settings to control total cost. The actor stops cleanly when your budget cap is reached, so you are never charged more than you authorize.
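To sanity-check a budget before starting a run, the per-event prices above can be combined into a small estimator. This is an illustrative sketch only; the function and field names are hypothetical, and the rates are the ones quoted in this section:

```javascript
// Hypothetical run-cost estimator using the per-event prices listed above.
const RATES = {
  leadProcessed: 0.05,   // per lead pushed to the dataset
  contactScraper: 0.15,  // per site scraped during enrichment
  patternFinder: 0.10,   // per domain sent to the pattern-finder fallback
  verifier: 0.005,       // per email verified
};

function estimateRunCost({ leads, domainsScraped = 0, domainsPatterned = 0, emailsVerified = 0 }) {
  return (
    leads * RATES.leadProcessed +
    domainsScraped * RATES.contactScraper +
    domainsPatterned * RATES.patternFinder +
    emailsVerified * RATES.verifier
  );
}

// The 500-lead scenario from the text: 150 domains scraped,
// 100 pattern-finder fallbacks, all 500 emails verified.
console.log(estimateRunCost({ leads: 500, domainsScraped: 150, domainsPatterned: 100, emailsVerified: 500 }));
// ≈ 60 (dollars), matching the worked example above
```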
CSV lead processing using the API
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/csv-lead-processor").call(run_input={
    "csvUrl": "https://storage.googleapis.com/my-bucket/leads.csv",
    "columnMapping": {
        "Company": "companyName",
        "Website": "website",
        "First Name": "firstName",
        "Last Name": "lastName",
        "Email": "email",
        "Title": "title",
        "Industry": "industry",
        "Country": "country"
    },
    "enrichEmails": True,
    "verifyEmails": True,
    "deduplicateByDomain": True,
    "outputCsv": True
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("type") == "summary":
        print(f"Processed: {item['leadsPushed']} leads | Valid emails: {item['emailsValid']}")
        print(f"CSV download: {item.get('csvDownloadUrl')}")
    else:
        status = item.get("emailStatus", "unverified")
        print(f"{item.get('companyName')} | {item.get('email')} | {status}")
```
JavaScript
```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/csv-lead-processor").call({
    csvUrl: "https://storage.googleapis.com/my-bucket/leads.csv",
    columnMapping: {
        "Company": "companyName",
        "Website": "website",
        "First Name": "firstName",
        "Last Name": "lastName",
        "Email": "email",
        "Title": "title",
        "Industry": "industry",
        "Country": "country"
    },
    enrichEmails: true,
    verifyEmails: true,
    deduplicateByDomain: true,
    outputCsv: true
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    if (item.type === "summary") {
        console.log(`Leads processed: ${item.leadsPushed} | Emails valid: ${item.emailsValid}`);
        console.log(`CSV: ${item.csvDownloadUrl}`);
    } else {
        console.log(`${item.companyName} | ${item.email} | ${item.emailStatus}`);
    }
}
```
cURL
```shell
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~csv-lead-processor/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "csvUrl": "https://storage.googleapis.com/my-bucket/leads.csv",
    "columnMapping": {
      "Company": "companyName",
      "Website": "website",
      "First Name": "firstName",
      "Last Name": "lastName",
      "Email": "email",
      "Title": "title",
      "Industry": "industry",
      "Country": "country"
    },
    "enrichEmails": true,
    "verifyEmails": true,
    "deduplicateByDomain": true,
    "outputCsv": true
  }'

# Fetch results once the run completes (replace DATASET_ID from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"

# Download the enriched CSV directly (replace STORE_ID from the summary record)
curl "https://api.apify.com/v2/key-value-stores/STORE_ID/records/output-leads.csv?token=YOUR_API_TOKEN" \
  -o enriched-leads.csv
```
How CSV Lead Processor works
Phase 1: CSV loading and streaming parse
The actor accepts a CSV via URL (fetched with up to 3 retries using exponential backoff) or as a base64-encoded string. The raw bytes are scanned for a UTF-8 BOM (`EF BB BF`), which is stripped before parsing — this is critical because Excel's "Save as CSV" produces BOM-prefixed files where the first header would otherwise appear as `\uFEFFCompany Name` rather than `Company Name`, breaking column mapping.
The CSV is then streamed through `csv-parse` with `columns: true` (header row auto-detection), `relax_column_count: true` (ragged row tolerance), and your chosen delimiter. Rows are yielded one at a time via an async generator, so even large files are never loaded as a single in-memory object. For each row, `mapRow()` performs a case-insensitive header lookup, applies field normalization (email lowercasing and format validation, LinkedIn URL expansion, domain extraction from website URLs), and assembles the canonical lead record.
Domain deduplication uses a `Set<string>` keyed on the normalized lowercase domain. The first row per domain is kept; subsequent rows are skipped with a debug log entry.
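The Phase 1 pipeline can be sketched in miniature. This is a simplified, self-contained illustration, not the actor's source: a naive line split stands in for the `csv-parse` stream, and all helper names are hypothetical:

```javascript
// Simplified sketch of Phase 1: BOM strip, case-insensitive header
// mapping, and domain-level deduplication.
function stripBom(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

function mapRow(headers, cells, columnMapping) {
  // Lowercase the mapping keys once so "Company Name" and "COMPANY NAME" both match.
  const lookup = Object.fromEntries(
    Object.entries(columnMapping).map(([k, v]) => [k.trim().toLowerCase(), v])
  );
  const lead = {};
  headers.forEach((h, i) => {
    const field = lookup[h.trim().toLowerCase()];
    if (field) lead[field] = cells[i] || null;
  });
  return lead;
}

const csv = "\uFEFFCompany Name,Website\nMeridian Software Group,meridiansoftware.io\nMeridian Software Group,meridiansoftware.io";
const [headerLine, ...dataRows] = stripBom(csv).split("\n");
const headers = headerLine.split(",");
const mapping = { "company name": "companyName", "Website": "website" };

const seen = new Set(); // domain-level dedup: first row per domain wins
const leads = [];
for (const row of dataRows) {
  const lead = mapRow(headers, row.split(","), mapping);
  const domain = (lead.website || "").toLowerCase();
  if (domain && seen.has(domain)) continue;
  seen.add(domain);
  leads.push(lead);
}
console.log(leads.length); // 1 — the duplicate-domain row was skipped
```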
Phase 2: Two-stage email enrichment (optional)
All leads that have a domain but no email are collected into a single batch. The batch is sent to Website Contact Scraper as a list of URLs with `maxPagesPerDomain: 3`. The scraper visits each company's homepage, contact page, and about page, returning any email addresses found. These are applied to the corresponding lead records.
Leads that still have no email after the contact scraper step go to a second batch call to Email Pattern Finder. The pattern finder analyzes domain DNS records and public web presence to detect the email naming convention (e.g. `{first}.{last}@domain.com`, confidence 87). When a contact name is available (from `fullName` or combined `firstName` + `lastName`), it generates candidate emails by applying the pattern. The top candidate is set as the primary `email` field, and all generated candidates are stored in `generatedEmails`.
Both enrichment sub-actors are called once per run regardless of how many leads need enrichment — batching is key to keeping run time and cost predictable.
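The pattern-application step can be illustrated with a short sketch. The `{first}`/`{last}` token names follow the pattern format shown in this document; the helper itself and the initial-only tokens are assumptions:

```javascript
// Hedged sketch of pattern-based email generation: fill a detected
// format string with the contact's name. {f}/{l} (single initials)
// are assumed variants, not confirmed by the actor's docs.
function applyPattern(pattern, firstName, lastName) {
  const first = firstName.trim().toLowerCase();
  const last = lastName.trim().toLowerCase();
  return pattern
    .replaceAll("{first}", first)
    .replaceAll("{last}", last)
    .replaceAll("{f}", first[0])
    .replaceAll("{l}", last[0]);
}

console.log(applyPattern("{first}.{last}@meridiansoftware.io", "Danielle", "Okafor"));
// danielle.okafor@meridiansoftware.io
```

Generated candidates like this one are only candidates; the verification phase below is what decides whether they are safe to send to.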
Phase 3: Email verification (optional)
All unique email addresses across all leads (both from the original CSV and from enrichment) are deduplicated into a single list and sent to Bulk Email Verifier with `verificationLevel: standard`. The verifier performs a DNS MX record lookup (confirming the mail server exists) and SMTP probing (confirming the mailbox exists without sending an actual message). Results are indexed by email address and applied back to each lead: `emailVerified` (boolean), `emailStatus` (`valid` / `risky` / `invalid` / `unknown` / `disposable`), and `emailConfidence` (0-100).
Again, a single batch call is made — not one call per lead.
Phase 4: Output and CSV generation
Enriched leads are pushed to the Apify dataset one by one. In pay-per-event mode, the actor charges one lead-processed event ($0.05) after each successful push and checks whether the spending limit has been reached. If the limit is hit, the actor stops gracefully and records `spendingLimitReached: true` in the summary.
If `outputCsv` is enabled, the actor serializes all pushed leads using `csv-stringify` with the canonical column set (or your `outputColumns` selection) plus the enrichment columns, which are always appended to the right. The output buffer is prepended with a UTF-8 BOM for Excel compatibility and written to the actor's Key-Value Store as `output-leads.csv`. The public download URL is recorded in the summary dataset record.
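The output step (column selection, quoting, BOM prefix) can be sketched as follows. The real actor uses `csv-stringify`; this hand-rolled serializer is illustrative only:

```javascript
// Illustrative sketch of CSV output with a UTF-8 BOM prefix so Excel
// opens the file with correct encoding. Not the actor's actual code.
function toCsv(leads, columns, delimiter = ",") {
  const escape = (v) => {
    const s = v == null ? "" : String(v);
    // Quote any value containing a delimiter, quote, or newline.
    return /[",\n;\t]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const header = columns.map(escape).join(delimiter);
  const body = leads.map((l) => columns.map((c) => escape(l[c])).join(delimiter));
  return "\uFEFF" + [header, ...body].join("\n"); // BOM + header + rows
}

const outCsv = toCsv(
  [{ companyName: "Meridian Software Group", domain: "meridiansoftware.io" }],
  ["companyName", "domain"]
);
console.log(outCsv.charCodeAt(0) === 0xfeff); // true — BOM present for Excel
```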
Tips for best results
- Run a 10-row test before the full list. Set `maxRows: 10` and check that your column mapping is picking up the right fields. Look at the output to confirm `domain` is being extracted correctly — enrichment depends on it.
- Map `domain` or `website` even if you already have emails. If you later want to re-run with `enrichEmails: true` to fill gaps, the actor needs the domain. Mapping it costs nothing in the initial run.
- Use `deduplicateByDomain: false` for multi-contact lists. If you have 5 decision-makers listed per company and want to enrich and verify each contact independently, disable deduplication. The trade-off is that enrichment sub-actors will be called once per domain, not once per contact — so you may get the same scraped emails applied to multiple contacts at the same company.
- Filter output by `emailStatus: "valid"` before importing to your CRM. The verification step identifies risky and invalid addresses. Importing only `valid` status leads keeps your sender reputation clean and avoids bounces that affect deliverability.
- Set a `maxRows` budget cap on unknown-size files. If you receive a CSV from a third party and are unsure of its row count, set `maxRows: 500` to cap your first run cost at $25 while you validate quality.
- Combine with HubSpot Lead Pusher — run this actor first to normalize and enrich, then feed the output dataset into HubSpot Lead Pusher to create or update CRM contacts automatically.
- Schedule weekly on refreshed exports. If your sales team exports from Apollo or LinkedIn weekly, schedule this actor on the same cadence. New rows in each export are enriched and verified automatically, keeping your working list current without manual intervention.
- For Google Sheets input, use a CSV export link. In Google Sheets, go to File > Share > Publish to web, publish the sheet as CSV, and paste that URL into `csvUrl`. The actor fetches the current data on every run.
Combine with other Apify actors
| Actor | How to combine |
|---|---|
| Website Contact Scraper | Called automatically during enrichment; run standalone first to preview what emails are available on your target company sites |
| Email Pattern Finder | Called automatically as enrichment fallback; run standalone to detect patterns for a domain list before processing a full CSV |
| Bulk Email Verifier | Called automatically during verification; run standalone on any email list to get deliverability scores before sending |
| HubSpot Lead Pusher | Push this actor's output dataset directly into HubSpot as new contacts or company records |
| B2B Lead Gen Suite | Use when you don't have a list at all — this suite generates leads from scratch from website URLs; CSV Lead Processor handles the "bring your own list" path |
| Google Maps Email Extractor | Export local business results from Google Maps as a CSV, then run it through this actor to standardize and verify the contacts |
| Waterfall Contact Enrichment | For higher-value accounts where you want a 10-step enrichment cascade beyond what the two built-in sub-actors provide |
Limitations
- No JavaScript-rendered website support during enrichment. The Website Contact Scraper sub-actor uses HTTP-based parsing. Company sites that load contact details exclusively via JavaScript (e.g. single-page apps with lazy-loaded content) may return no emails. For those sites, use Website Contact Scraper Pro as a separate step before running this actor.
- Enrichment requires a domain. Rows with a company name but no website or domain cannot be enriched. Map the `website` or `domain` column to enable enrichment for those rows.
- Pattern finder accuracy varies by domain. The pattern confidence score indicates reliability. Patterns below 70 confidence should be treated as candidates, not confirmed addresses — always pair pattern-based emails with verification.
- Email verification cannot guarantee delivery. SMTP probing confirms a mailbox exists but does not guarantee the message will not be filtered by spam rules, greylisting, or catch-all configurations. `emailStatus: "risky"` means the address exists but has characteristics associated with higher bounce rates.
- Catch-all domains report as `unknown` or `risky`. Some mail servers accept any recipient address to prevent address enumeration. Addresses at these domains cannot be confirmed as real without sending an actual email.
- Sub-actor call limits at scale. Each sub-actor call (contact scraper, pattern finder, verifier) is limited to 1,000 results per batch. Lists exceeding 1,000 domains needing enrichment in a single run may see incomplete enrichment for the tail. Split very large files into batches of 800 domains or fewer when using enrichment.
- No real-time progress for enrichment steps. While enrichment sub-actors run, the actor status shows "Enriching emails for N domains..." but does not stream per-domain progress. This is a platform-level constraint on sub-actor calls.
- Source CSV must be UTF-8 or UTF-8 BOM encoded. Files in Latin-1, Windows-1252, or other encodings may produce garbled text in non-ASCII characters (accented names, etc.). Convert to UTF-8 before uploading.
- The summary record is pushed at the end of the run. When iterating the dataset programmatically, filter for `item.type === "summary"` to find the statistics record rather than assuming it is the first or last item positionally.
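For the batch-size limitation above, splitting a large list into enrichment-safe chunks might look like the following sketch. The helper is hypothetical and assumes each row already carries a normalized `domain` field:

```javascript
// Hypothetical helper: collect unique domains and split them into
// chunks under the per-batch cap, to be run as separate actor runs.
function chunkDomains(rows, chunkSize = 800) {
  const domains = [...new Set(rows.map((r) => r.domain).filter(Boolean))];
  const chunks = [];
  for (let i = 0; i < domains.length; i += chunkSize) {
    chunks.push(domains.slice(i, i + chunkSize));
  }
  return chunks;
}

// 2,000 unique domains split into batches of at most 800:
const sampleRows = Array.from({ length: 2000 }, (_, i) => ({ domain: `company${i}.com` }));
console.log(chunkDomains(sampleRows).map((c) => c.length)); // [ 800, 800, 400 ]
```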
Integrations
- Zapier — trigger this actor when a new CSV is added to a Google Drive folder, then push enriched leads to HubSpot, Salesforce, or a Slack notification
- Make — build a scenario that polls the actor's dataset on completion and routes verified leads into your outreach sequence tool
- Google Sheets — publish your lead sheet as a CSV URL and pass it to this actor on a schedule; push enriched results back to a separate output sheet
- Apify API — call the actor programmatically from your own CRM workflow, pass CSVs as base64, and retrieve structured JSON output without needing the UI
- Webhooks — configure a webhook to fire on run completion and POST the summary record (including the CSV download URL) to your internal systems
- LangChain / LlamaIndex — use enriched lead records as structured context for AI-powered sales research agents or account scoring workflows
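On the receiving end of the webhook integration, a consumer might pull the CSV download URL out of the POSTed summary. This is a hedged sketch — the `summary` and `csvDownloadUrl` field names are assumptions, so inspect your actual webhook payload before relying on them:

```javascript
// Hypothetical sketch: extract the output CSV URL from a run-completion
// webhook payload. Field names (`summary`, `csvDownloadUrl`) are assumptions —
// verify them against a real payload from your own run.
function extractCsvUrl(payload) {
  const summary = payload?.summary;
  return summary && typeof summary.csvDownloadUrl === "string"
    ? summary.csvDownloadUrl
    : null;
}
```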
Troubleshooting
Empty results despite a valid CSV file. The most common cause is a column mapping mismatch. Open the actor log and look for lines starting with `Row 1:` — if the canonical fields are all null, your CSV headers do not match your mapping keys. Check for trailing spaces in your CSV headers, BOM characters (the actor strips the file-level BOM but not per-cell BOM), or tab-delimited files passed with `csvDelimiter` still set to comma.
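The header pitfalls listed above (trailing spaces, per-cell BOM) can be checked with a small normalizer before you build your mapping — a sketch, not the actor's own matching logic:

```javascript
// Normalize a CSV header the way the troubleshooting tip suggests checking:
// strip a leading BOM, trim surrounding whitespace, lowercase for matching.
function normalizeHeader(header) {
  return header.replace(/^\uFEFF/, "").trim().toLowerCase();
}

console.log(normalizeHeader("\uFEFFCompany Name ")); // "company name"
```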
Enrichment ran but found no emails. The website contact scraper visits up to 3 pages per domain. If a company's contact details are behind a login, on a JavaScript-rendered SPA, or only in a PDF or image, the scraper will not find them. Check `enrichmentApplied: true` vs `emailsFoundByEnrichment` in the summary record to see the success rate. For SPA-heavy company sites, run Website Contact Scraper Pro separately first.
Run is taking much longer than expected. Enrichment and verification involve external sub-actor runs and live network calls. A 500-domain enrichment run can take 5-15 minutes depending on website response times and mail server latency. For very large lists, consider splitting into batches of 300-400 rows and running in parallel. Set `maxRows` to limit the first run.
`spendingLimitReached: true` in the summary. You have set a per-run spending cap that was hit before all rows were processed. Increase the cap in the actor's "Max total charge per run" setting and re-run on the remaining rows. You can identify which rows were not processed using `sourceRowIndex` — the last pushed `sourceRowIndex` value shows where the run stopped.
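Resuming after a capped run can be sketched as below. This assumes `sourceRowIndex` is a 0-based index into your original row array — confirm that against your own output before using it:

```javascript
// Sketch: given the original rows and the last sourceRowIndex that was
// pushed before the spending cap hit, return the rows still to process.
// Assumes sourceRowIndex is 0-based and matches array position.
function remainingRows(rows, lastProcessedIndex) {
  return rows.filter((_, i) => i > lastProcessedIndex);
}

console.log(remainingRows(["a", "b", "c", "d"], 1)); // ["c", "d"]
```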
Emails showing as `unknown` across many leads. These companies likely use catch-all mail servers. The verifier cannot confirm individual mailbox existence at these domains without sending an actual email. Treat `unknown` addresses as lower-confidence leads and consider lower send volumes to those domains.
Responsible use
- This actor processes only data you provide — it does not scrape personal data from external sources independently.
- When enrichment is enabled, the actor visits company websites to extract publicly listed contact information. Respect each website's terms of service and `robots.txt` directives.
- Comply with GDPR, CAN-SPAM, CASL, and other applicable data protection and anti-spam laws when using processed contact data for outreach.
- Do not upload CSVs containing sensitive personal data beyond what is necessary for your legitimate business purpose.
- Do not use output data for spam, harassment, or unauthorized marketing.
- For guidance on web scraping legality, see Apify's guide.
FAQ
How do I process a CSV lead list that has inconsistent column names?
The column mapping uses case-insensitive matching, so "Company Name", "company name", and "COMPANY NAME" all resolve to the same header. If your file has genuinely different column names (e.g. "Org" instead of "Company"), just set the key in `columnMapping` to match your actual header. You do not need to rename columns in your CSV before uploading.
Can I process a CSV file from Google Drive or Dropbox?
Yes. Google Drive shareable links and Dropbox direct download links work when the file is publicly accessible. In Google Drive, use File > Share > Publish to web > CSV to get a direct download URL. In Dropbox, change ?dl=0 to ?dl=1 at the end of the share link to force a download. Google Sheets can also be published as a CSV via File > Share > Publish to web.
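The Dropbox link change described above is a one-character swap that is easy to automate — a minimal sketch:

```javascript
// Rewrite a Dropbox share link (?dl=0) into a direct download link (?dl=1),
// as described in the FAQ answer above.
function toDirectDownload(url) {
  return url.replace(/([?&])dl=0\b/, "$1dl=1");
}

console.log(toDirectDownload("https://www.dropbox.com/s/abc/leads.csv?dl=0"));
// https://www.dropbox.com/s/abc/leads.csv?dl=1
```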
How many leads can I process in one CSV lead processing run?
The actor imposes no hard limit — `maxRows` defaults to 0 (unlimited). In practice, the memory allocation (512 MB) comfortably handles files of 50,000+ rows for parse-only runs. Enrichment adds sub-actor call overhead; for lists over 1,000 rows requiring enrichment, consider splitting into batches of 800 to stay within sub-actor result limits.
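The batching suggested above is a plain array split — a minimal sketch:

```javascript
// Split a large lead list into batches of at most `size` rows (e.g. 800)
// to stay within the sub-actor result limits noted above.
function chunk(rows, size) {
  const batches = [];
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

console.log(chunk(Array.from({ length: 2000 }), 800).map((b) => b.length));
// [800, 800, 400]
```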
Does CSV lead processing work with tab-separated (TSV) files?
Yes. Set `csvDelimiter` to `\t` in the input. TSV files from data warehouses, Airtable exports, and some CRM exports are fully supported.
How accurate is the email pattern detection?
The Email Pattern Finder returns a confidence score (0-100) with each pattern. Patterns with confidence above 85 are generally reliable for name-based email generation. Below 70, treat the generated address as a candidate to be verified rather than a confirmed address. Always pair pattern-based enrichment with `verifyEmails: true` for outreach use.
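The confidence gating described above can be sketched as follows. The pattern tokens used here (`{first}`, `{last}`, `{f}`) are illustrative placeholders, not the Email Pattern Finder's documented syntax:

```javascript
// Sketch: generate a candidate email from a detected pattern only when the
// confidence score clears a threshold. Pattern token syntax is hypothetical.
function candidateEmail(pattern, confidence, first, last, domain, threshold = 70) {
  if (confidence < threshold) return null; // too weak — verify-only territory
  const local = pattern
    .replace("{first}", first.toLowerCase())
    .replace("{last}", last.toLowerCase())
    .replace("{f}", first[0].toLowerCase());
  return `${local}@${domain}`;
}

console.log(candidateEmail("{f}{last}", 92, "Danielle", "Okafor", "meridiansoftware.io"));
// dokafor@meridiansoftware.io
```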
Is it legal to scrape company websites for contact information during enrichment?
The enrichment step visits publicly accessible company websites — the same pages you could open in a browser — to extract email addresses that companies have chosen to make public. This is generally considered legal in most jurisdictions. However, you are responsible for complying with applicable laws in your market and the terms of service of the websites visited. See Apify's legal guide for more detail.
How is CSV Lead Processor different from Hunter.io?
Hunter.io charges $49-149/month for domain search and email finding with monthly credit limits. This actor charges $0.05 per lead with no subscription — a 500-lead list costs $25, plus sub-actor costs for enrichment. It also accepts any CSV format, handles deduplication, normalizes field values, and writes back a clean output CSV in one run. Hunter.io requires a separate export step after finding emails.
Can I use this actor to verify emails I already have, without importing a full CSV?
For a pure email verification run, use Bulk Email Verifier directly — it accepts a list of email addresses with no CSV involved and costs $0.005 per email. This actor's verification feature is designed for use as part of a full lead processing pipeline.
What happens if my CSV has rows with no domain or website?
Rows without a domain are parsed and included in the output with all other mapped fields intact. Enrichment simply skips those rows (it cannot look up emails without a domain). If you map the companyName field, you can use those records for manual research or pass them to Company Deep Research to find the website first.
Can I schedule this actor to run automatically on a recurring basis?
Yes. In the Apify console, open the actor and go to the Schedules tab. Set a daily, weekly, or custom cron interval. Combine this with a Google Sheets CSV publish URL as the input source, and your enrichment pipeline runs automatically whenever you update the source sheet.
How do I push the output to my CRM after processing?
Use HubSpot Lead Pusher to import this actor's output dataset into HubSpot contacts or companies. For Salesforce, Pipedrive, or other CRMs, use the Zapier or Make integrations to map dataset fields to your CRM's API on run completion. Alternatively, download the output CSV from the Key-Value Store URL in the summary record and import it manually.
What does `emailStatus: "risky"` mean?
A risky status means the email address exists and the mailbox responded, but it has characteristics associated with higher bounce rates — typically a role address (info@, sales@, contact@), a recently created domain, or a mail server with unusual configuration. Risky addresses can be sent to but should be used in smaller volumes until you see how they perform.
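One of the risky signals above — role addresses — is simple to pre-screen on your side. The prefix list here is illustrative, not the verifier's actual rule set:

```javascript
// Sketch: flag role addresses (info@, sales@, contact@, ...), one of the
// "risky" signals described above. Prefix list is illustrative only.
const ROLE_PREFIXES = new Set(["info", "sales", "contact", "support", "admin", "hello"]);

function isRoleAddress(email) {
  const local = email.split("@")[0].toLowerCase();
  return ROLE_PREFIXES.has(local);
}

console.log(isRoleAddress("info@meridiansoftware.io")); // true
```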
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom column mapping help, bulk processing requirements, or CRM integration questions, reach out through the Apify platform.
How it works
Configure
Set your parameters in the Apify Console or pass them via API.
Run
Click Start, trigger via API, webhook, or set up a schedule.
Get results
Download as JSON, CSV, or Excel. Integrate with 1,000+ apps.
Use cases
Sales Teams
Build targeted lead lists with verified contact data.
Marketing
Research competitors and identify outreach opportunities.
Data Teams
Automate data collection pipelines with scheduled runs.
Developers
Integrate via REST API or use as an MCP tool in AI workflows.
Related actors
GitHub Repository Search
Search GitHub repositories by keyword, language, topic, stars, forks. Sort by stars, forks, or recently updated. Returns metadata, topics, license, owner info, URLs. Free API, optional token for higher limits.
Weather Forecast Search
Get weather forecasts for any location worldwide using the free Open-Meteo API. Returns current conditions, daily and hourly forecasts with temperature, precipitation, wind, UV index, and more. No API key needed.
EUIPO EU Trademark Search
Search EU trademarks via official EUIPO database. Find registered and pending trademarks by name, Nice class, applicant, or status. Returns full trademark details and filing history.
Nominatim Address Geocoder
Geocode addresses to GPS coordinates and reverse geocode coordinates to addresses using OpenStreetMap Nominatim. Batch geocoding with rate limiting. Free, no API key needed.
Ready to try Csv Lead Processor?
Start for free on Apify. No credit card required.
Open on Apify Store