Website Contact Scraper
Extract emails, phone numbers, team members, and social media links from any business website. Feed it URLs from Google Maps or your CRM and get structured contact data back. Fast HTTP requests, no browser — a batch of 1,000 sites finishes in 60–90 minutes at $0.15 per site scanned.
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| website-scanned | Charged per website domain scraped with full contact data | $0.15 |
Example: 100 events = $15.00 · 1,000 events = $150.00
Documentation
Website Contact Scraper extracts emails, phone numbers, team member names, job titles, and social media links from any business website. Give it a list of URLs and it returns one clean, structured contact record per domain — ready for CRM import, outreach sequences, or lead databases.
The actor crawls each site's homepage, then automatically discovers and visits contact, about, team, leadership, and company pages within the same domain. All data is deduplicated across every page visited, so you never see duplicate emails or phantom contacts. No code required — paste URLs, click Start, download results.
What data can you extract?
| Data Point | Source | Example |
|---|---|---|
| 📧 Email addresses | mailto links, body text, anchor hrefs | [email protected] |
| 📞 Phone numbers | tel: links, footer/address/contact areas | +1 (415) 555-0192 |
| 👤 Team member names | Schema.org Person, team cards, heading pairs | Marcus Rodriguez |
| 💼 Job titles | Adjacent to names, itemprop="jobTitle", .job-title | VP of Business Development |
| 🔗 LinkedIn profiles | Company pages and personal profiles | linkedin.com/company/pinnacle-ventures |
| 🐦 Twitter / X profiles | twitter.com and x.com links | twitter.com/pinnaclevc |
| 📘 Facebook pages | Facebook page links | facebook.com/pinnacleventures |
| 📸 Instagram profiles | Instagram profile links | instagram.com/pinnaclevc |
| ▶️ YouTube channels | Channel, user, and @ links | youtube.com/@pinnacleventures |
| 🌐 Domain | Parsed from input URL | pinnacleventures.com |
| 🕐 Scraped timestamp | Run completion time | 2026-03-19T14:32:18.456Z |
| 📄 Pages scraped | Per-domain page count | 4 |
Why use Website Contact Scraper?
Building prospect lists from company websites by hand means opening each site, hunting for a contact page, scanning for emails that might be buried in footers, checking an about page for team names, copying everything into a spreadsheet — then repeating that for 200 more companies. A thorough researcher might process 15 sites per hour. At that rate, 500 websites takes two full working days, and the data is already stale before you finish.
This actor automates the entire process. Paste a list of URLs, press Start, and return to a structured dataset with emails, phone numbers, team members, and social profiles for every domain. A batch of 500 websites typically completes in 40–60 minutes for roughly $75 — a fraction of the two full working days of manual research it replaces.
Built on Apify, the actor gives you production capabilities beyond a one-off script:
- Scheduling — run daily or weekly to keep contact databases fresh without manual effort
- API access — trigger runs from Python, JavaScript, or any HTTP client and pipe results directly into your stack
- Proxy rotation — scrape large batches without IP blocks using Apify's built-in residential and datacenter proxy network
- Monitoring — receive Slack or email alerts when runs fail or return unexpected result counts
- Integrations — connect directly to Zapier, Make, Google Sheets, HubSpot, or webhooks with no extra code
Features
- Three-source email extraction from mailto: link hrefs, full body text (with script, style, and noscript nodes stripped to avoid tracking pixel leakage), and all anchor href attributes — catches emails placed anywhere on the page
- Junk email filtering that automatically removes noreply, no-reply, donotreply, test, admin, postmaster, mailer-daemon, webmaster, and root addresses, plus emails ending in image/CSS/JS file extensions and addresses from known placeholder domains (sentry.io, wixpress.io, example.com, placeholder.io)
- Phone extraction from tel: links as the primary, most-reliable source, supplemented by formatted-number regex in contact-specific page areas (header, footer, nav, address, and elements with contact, phone, info, topbar CSS classes)
- Phone validation that rejects all-same-digit sequences and sequential numbers (1234567) while requiring 7–15 digits and proper formatting (international prefix, parentheses, or dash/dot separators)
- Three-strategy contact name detection: (1) Schema.org Person structured data with itemprop="name" and itemprop="jobTitle" attributes, (2) 11 team-card CSS selectors (.team-member, .team-card, .staff-member, .person-card, .member-card, .leadership-card, .employee, .bio-card, .team-item, .people-card, .about-member), and (3) heading-paragraph pairs where the h3/h4 matches a strict proper-name regex and the next sibling contains one of 35+ job title keywords
- 40+ junk-name word filter that prevents page headings like "Free Plan" or "Our Services" from appearing in the contacts list
- Automatic contact-page discovery that follows same-domain links matching 19 contact-related path keywords: contact, about, team, leadership, management, executives, people, staff, company, and variations
- Configurable crawl depth from 1 to 20 pages per domain — default of 5 covers homepage + contact + about + team for most sites
- Atomic page-slot reservation that prevents concurrent subpage handlers from exceeding the per-domain page limit even at maximum concurrency
- Deduplication across all pages — emails by exact lowercase string, phones by digit-only key (so +1 (415) 555-0192 and 14155550192 are the same number), contacts by case-insensitive name, and social links first-match-per-platform (see the sketch after this list)
- Batch processing of unlimited URLs in a single run with up to 10 simultaneous connections and 120 requests per minute
- Built-in retry logic with 2 automatic retries per page and SSL error tolerance for sites with invalid certificates
- Pay-per-event pricing with a per-run spending cap — the actor stops delivering results when your budget is reached so there are no surprise charges
- JavaScript/SPA support available via Website Contact Scraper Pro, which renders React, Angular, and Vue sites with a real browser
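The dedup keys are simple enough to reproduce in post-processing if you ever need to merge results from multiple runs. A minimal sketch of the keying logic described above (illustrative only; the helper names are mine, not the actor's source):

// Illustrative sketch of the dedup keys described above; not the actor's actual source.
const emailKey = (email) => email.trim().toLowerCase();              // exact lowercase string
const phoneKey = (phone) => phone.replace(/\D/g, "");                // digits only
const contactKey = (contact) => contact.name.trim().toLowerCase();   // case-insensitive name

function dedupeBy(items, keyFn) {
  const seen = new Set();
  return items.filter((item) => {
    const key = keyFn(item);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// "+1 (415) 555-0192" and "14155550192" both key to "14155550192"
console.log(dedupeBy(["+1 (415) 555-0192", "14155550192"], phoneKey)); // one entry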
Use cases for scraping website contacts
Sales prospecting and outreach
Sales development reps building targeted prospect lists paste company websites from a CRM or LinkedIn search into the actor, then use the output emails, direct phone numbers, and LinkedIn profiles to populate outreach sequences. Finding a decision-maker's direct email manually takes 5–10 minutes per company; this actor processes that same company in seconds.
Marketing agency lead generation
Agencies building prospect databases for clients scrape industry directories, trade association member lists, or competitor customer pages to extract contact information at scale. The structured CSV output maps directly to email marketing tools and CRM import templates.
Recruiting and talent sourcing
Recruiters extract team pages from target companies to identify hiring managers, department heads, and engineers along with their direct contact details and LinkedIn profiles. The contacts array with names and titles makes it easy to identify the right person to reach before making first contact.
Business research and market mapping
Analysts conducting competitive intelligence or market mapping run batches of hundreds of competitor or prospect websites to produce a structured dataset of who works where, what their titles are, and how to reach them. The timestamp field tracks when data was collected, making it easy to identify stale records.
Freelancer and consultant outreach
Independent consultants and agencies identify the right decision-maker to pitch at prospective client companies by scraping the about and leadership pages for names, titles, and email addresses — rather than guessing at generic info@ addresses that rarely convert.
CRM data enrichment
Operations and RevOps teams augment existing company records in HubSpot, Salesforce, or Pipedrive with fresh contact details, social profile links, and team member data scraped directly from live company websites. Combine with Bulk Email Verifier to validate addresses before import.
How to scrape website contact information
- Provide website URLs — Enter one or more business website homepages in the input form. Use the root domain (e.g., https://pinnacleventures.com), not a deep URL. The actor discovers internal pages automatically.
- Configure options — Keep maxPagesPerDomain at the default of 5 for most sites. Increase to 10–15 only if you know a site has a large staff directory spread across multiple pages.
- Run the actor — Click "Start". The actor crawls each site concurrently, typically finishing 50 websites in 3–5 minutes and 500 websites in 40–60 minutes.
- Download results — Open the Dataset tab and download your data as JSON, CSV, or Excel. Each row is one domain with its complete contact profile: emails, phones, team members, and social links.
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| urls | string[] | Yes | — | Business website homepages to scrape. One output record per unique domain. |
| maxPagesPerDomain | integer | No | 5 | Pages to crawl per website (1–20). Default covers homepage + contact + about + team for most sites. |
| includeNames | boolean | No | true | Extract team member names and job titles from team/about pages. Disable for emails-only runs. |
| includeSocials | boolean | No | true | Extract social media profile links (LinkedIn, Twitter/X, Facebook, Instagram, YouTube). |
| proxyConfiguration | object | No | Apify Proxy | Proxy settings. Recommended when scraping more than 20 sites. |
Input examples
Single website with defaults:
{
  "urls": ["https://pinnacleventures.com"]
}
Batch of 50 sites with deep crawl:
{
  "urls": [
    "https://pinnacleventures.com",
    "https://meridiantech.io",
    "https://atlaslogistics.com"
  ],
  "maxPagesPerDomain": 10,
  "includeNames": true,
  "includeSocials": true,
  "proxyConfiguration": { "useApifyProxy": true }
}
Emails and phones only, fast pass:
{
  "urls": [
    "https://pinnacleventures.com",
    "https://meridiantech.io"
  ],
  "maxPagesPerDomain": 3,
  "includeNames": false,
  "includeSocials": false
}
Input tips
- Start with maxPagesPerDomain: 5 — this covers the homepage plus contact, about, and team pages for the vast majority of business websites. Only increase it for sites with large employee directories spanning 6+ pages.
- Enable proxies for batches over 20 sites — Apify Proxy rotates IPs automatically to prevent rate limiting. The default proxy configuration works for most cases.
- Provide root homepages, not deep URLs — enter https://acmecorp.com, not https://acmecorp.com/blog/post-123. The actor discovers contact-related subpages on its own.
- Disable includeNames for faster runs — name extraction adds DOM traversal per page. If you only need emails and phone numbers, turn it off to reduce processing time.
- Batch everything in one run — processing 200 sites in a single run is faster than 200 separate single-site runs. The actor handles concurrency internally.
Output example
Each item in the dataset represents one website domain:
{
  "url": "https://pinnacleventures.com",
  "domain": "pinnacleventures.com",
  "emails": [
    "[email protected]",
    "[email protected]",
    "[email protected]"
  ],
  "phones": [
    "+1 (415) 555-0192",
    "+1 800-555-0134"
  ],
  "contacts": [
    {
      "name": "Marcus Rodriguez",
      "title": "Managing Partner",
      "email": "[email protected]"
    },
    {
      "name": "Sarah Chen",
      "title": "VP of Portfolio Operations"
    },
    {
      "name": "James Okafor",
      "title": "Director of Business Development"
    }
  ],
  "socialLinks": {
    "linkedin": "https://www.linkedin.com/company/pinnacle-ventures",
    "twitter": "https://twitter.com/pinnaclevc",
    "facebook": "https://www.facebook.com/pinnacleventures",
    "instagram": "https://www.instagram.com/pinnaclevc",
    "youtube": "https://www.youtube.com/@pinnacleventures"
  },
  "pagesScraped": 4,
  "scrapedAt": "2026-03-19T14:32:18.456Z"
}
Output fields
| Field | Type | Description |
|---|---|---|
| url | string | Normalized input URL (HTTPS, no trailing slash) |
| domain | string | Domain with www. stripped (e.g., pinnacleventures.com) |
| emails | string[] | Deduplicated email addresses from all crawled pages, junk addresses filtered out |
| phones | string[] | Deduplicated phone numbers; deduplication keyed on digits only so format variants collapse to one entry |
| contacts | object[] | Named team members extracted from team/about pages |
| contacts[].name | string | Person's full name (proper capitalization validated) |
| contacts[].title | string | Job title (optional; present when found adjacent to the name) |
| contacts[].email | string | Email address linked to this person (optional; from mailto: in their team card) |
| socialLinks | object | Social media profile URLs keyed by platform |
| socialLinks.linkedin | string | LinkedIn company or personal profile URL |
| socialLinks.twitter | string | Twitter/X profile URL |
| socialLinks.facebook | string | Facebook page URL |
| socialLinks.instagram | string | Instagram profile URL |
| socialLinks.youtube | string | YouTube channel URL |
| pagesScraped | number | Total pages processed for this domain (homepage + discovered subpages) |
| scrapedAt | string | ISO 8601 timestamp when the result was assembled |
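If your CRM import expects one row per person rather than one record per domain, the contacts array can be flattened with a few lines of post-processing. A sketch assuming the output fields documented above (the fallback to the first domain-level email is my own choice, not documented actor behavior):

// Flatten one-record-per-domain output into one-row-per-person for CRM import.
// Field names follow the output schema above; the email fallback is an assumption.
function toCrmRows(items) {
  return items.flatMap((item) =>
    (item.contacts || []).map((contact) => ({
      domain: item.domain,
      name: contact.name,
      title: contact.title || "",
      email: contact.email || item.emails?.[0] || "",
      linkedin: item.socialLinks?.linkedin || "",
      scrapedAt: item.scrapedAt,
    }))
  );
}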
How much does it cost to scrape website contacts?
Website Contact Scraper uses pay-per-event pricing — you pay $0.15 per website scanned. Platform compute costs are included in the price.
| Scenario | Websites | Cost per website | Total cost |
|---|---|---|---|
| Quick test | 1 | $0.15 | $0.15 |
| Small batch | 10 | $0.15 | $1.50 |
| Medium batch | 50 | $0.15 | $7.50 |
| Large batch | 200 | $0.15 | $30.00 |
| Enterprise | 1,000 | $0.15 | $150.00 |
You can set a maximum spending limit per run to control costs. The actor stops delivering results when your budget is reached, so you never pay more than you expect.
Compare this to Hunter.io at $49–$149/month or Clay at $149–$720/month — most Website Contact Scraper users spend $5–$30/month with no subscription commitment. Apify's free tier also includes $5 of monthly credits, which covers 33 website scans at no cost.
Extract website contacts using the API
Python
from apify_client import ApifyClient
client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("ryanclinton/website-contact-scraper").call(run_input={
    "urls": [
        "https://pinnacleventures.com",
        "https://meridiantech.io",
        "https://atlaslogistics.com",
    ],
    "maxPagesPerDomain": 5,
    "includeNames": True,
    "includeSocials": True,
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['domain']}: {len(item['emails'])} emails, {len(item['phones'])} phones")
    for contact in item.get("contacts", []):
        print(f"  {contact['name']} — {contact.get('title', 'no title')}")
        if contact.get("email"):
            print(f"    email: {contact['email']}")
JavaScript
import { ApifyClient } from "apify-client";
const client = new ApifyClient({ token: "YOUR_API_TOKEN" });
const run = await client.actor("ryanclinton/website-contact-scraper").call({
  urls: [
    "https://pinnacleventures.com",
    "https://meridiantech.io",
    "https://atlaslogistics.com",
  ],
  maxPagesPerDomain: 5,
  includeNames: true,
  includeSocials: true,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
  console.log(`${item.domain}: ${item.emails.length} emails, ${item.contacts.length} contacts`);
  for (const contact of item.contacts) {
    console.log(`  ${contact.name} (${contact.title ?? "no title"})`);
  }
}
cURL
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~website-contact-scraper/runs?token=YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"urls": ["https://pinnacleventures.com", "https://meridiantech.io"],
"maxPagesPerDomain": 5,
"includeNames": true,
"includeSocials": true
}'
# Fetch results once the run completes (replace DATASET_ID from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
How Website Contact Scraper works
Phase 1: URL normalization and domain deduplication
Before any crawling begins, each input URL is normalized — HTTPS is enforced, trailing slashes are stripped, and the domain is extracted with www. removed. Duplicate domains are collapsed to a single entry so you never pay twice for the same site. An empty result object is created for each unique domain, and the homepage is queued with label: 'HOMEPAGE' along with the user's configuration (maxPagesPerDomain, includeNames, includeSocials).
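A rough sketch of that normalization step (illustrative, not the actor's source; handling of bare domains without a scheme is an assumption):

// Illustrative normalization: enforce HTTPS, strip the trailing slash, strip www.,
// and collapse duplicate domains. Not the actor's actual source.
function normalizeUrl(rawUrl) {
  const withScheme = /^https?:\/\//i.test(rawUrl) ? rawUrl : `https://${rawUrl}`; // assumption
  const url = new URL(withScheme);
  url.protocol = "https:";
  const normalized = url.href.replace(/\/$/, "");
  const domain = url.hostname.replace(/^www\./i, "");
  return { url: normalized, domain };
}

function dedupeByDomain(urls) {
  const byDomain = new Map();
  for (const raw of urls) {
    const { url, domain } = normalizeUrl(raw);
    if (!byDomain.has(domain)) byDomain.set(domain, url); // first URL per domain wins
  }
  return byDomain;
}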
Phase 2: Homepage crawl and contact-page discovery
CheerioCrawler fetches each homepage using got-scraping with up to 10 concurrent connections, a 120 requests/minute rate limit, 30-second timeout, and 2 automatic retries. SSL errors are silently ignored to handle sites with invalid certificates.
On the homepage, all four extraction functions run in parallel: emails (from mailto: links, body text regex, and anchor hrefs with script/style nodes stripped), phones (from tel: links and contact-area text), social links (5 platform patterns), and contacts (3 strategies). Results are merged into the domain's result object.
The homepage handler then scans every <a href> link for same-domain URLs matching any of 19 contact-page path segments. Discovered links are deduped in memory, then a batch of slots is reserved atomically on the domainPageCounts map — preventing concurrent handlers from exceeding the per-domain limit even at maximum concurrency. Reserved pages are enqueued with label: 'SUBPAGE'.
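Because Node.js crawler handlers share a single event loop, a synchronous check-and-increment on a shared map cannot be interleaved by another handler, which is what makes the reservation atomic. A hedged sketch of the idea (the domainPageCounts name comes from the description above; the helper itself is illustrative):

// Illustrative slot reservation: each handler synchronously reserves the pages it
// intends to enqueue, so concurrent handlers can never push a domain past its limit.
const domainPageCounts = new Map(); // domain -> pages already reserved

function reservePageSlots(domain, requested, maxPagesPerDomain) {
  const used = domainPageCounts.get(domain) || 0;
  const granted = Math.max(0, Math.min(requested, maxPagesPerDomain - used));
  domainPageCounts.set(domain, used + granted);
  return granted; // enqueue only this many discovered subpages
}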
Phase 3: Subpage extraction
Contact, about, team, and leadership pages run through the same extraction pipeline as the homepage. No additional link-following occurs on subpages — crawl depth is controlled exclusively by the homepage handler. The page count was already incremented at reservation time, so no further synchronization is needed.
Phase 4: Result aggregation and output
After all pages are crawled, the actor iterates over each domain's result and pushes it to the Apify dataset. In pay-per-event mode, a website-scanned charge event fires before each push. If the user's spending limit is reached mid-batch, the actor stops gracefully and logs how many domains were not delivered. The final log line reports total emails, phones, and named contacts found across all domains.
Extraction internals
Email — EMAIL_REGEX (/\b[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,12}\b/g) runs on both stripped body text and href attributes. Thirteen junk patterns are tested against each match. All emails are lowercased before dedup.
Phone — Three regex patterns cover international (+1 (555) 123-4567), parentheses ((555) 123-4567), and separator formats (555-123-4567, 555.123.4567). Dedup key is the digit-only string, so +1 (415) 555-0192 and 14155550192 collapse to one entry.
Contacts — The strict name regex (/^[A-Z][a-z]+(?:\s[A-Z][a-z]+){1,3}$/) rejects single-word strings, all-caps text, and names over 40 characters. A 40-word junk-name blocklist filters headings like "Free Trial", "Our Services", and "Read More". Job-title detection uses 35+ keywords (CEO through Finance) checked case-insensitively against adjacent text.
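These patterns can be exercised directly. A small sketch (the junk-pattern list is abbreviated here, and the sample addresses are invented):

// Sketch of the extraction patterns quoted above. Junk list abbreviated; sample data invented.
const EMAIL_REGEX = /\b[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,12}\b/g;
const NAME_REGEX = /^[A-Z][a-z]+(?:\s[A-Z][a-z]+){1,3}$/;
const JUNK_EMAIL = /^(noreply|no-reply|donotreply|test|admin|postmaster|mailer-daemon|webmaster|root)@/i;

function extractEmails(text) {
  const matches = text.match(EMAIL_REGEX) || [];
  return [...new Set(matches.map((e) => e.toLowerCase()))].filter((e) => !JUNK_EMAIL.test(e));
}

console.log(extractEmails("Write to partners@acmecorp.com or noreply@acmecorp.com"));
// -> [ "partners@acmecorp.com" ]
console.log(NAME_REGEX.test("Marcus Rodriguez")); // true (proper two-word name)
console.log(NAME_REGEX.test("FREE TRIAL"));       // false (all caps rejected)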
Tips for best results
- Default to 5 pages per domain. The default maxPagesPerDomain of 5 covers the homepage, contact page, about page, and team page for most business websites. Increasing beyond 10 gives diminishing returns and raises cost.
- Enable proxies for batches over 20 sites. Apify Proxy rotates IP addresses automatically. Set proxyConfiguration: { "useApifyProxy": true } in your input. This is the single biggest factor in preventing blocks on large batches.
- Filter emails by domain in post-processing. The output may include third-party emails from embedded contact forms, partner mention pages, or job boards. After downloading, filter emails to keep only those ending in @yourtargetdomain.com (see the sketch after this list).
- Pair with Email Pattern Finder for gap coverage. If the scraper returns team member names but no personal emails, feed the names and domain into Email Pattern Finder to predict addresses based on the company's first.last@, first@, or flast@ naming convention.
- Verify emails before sending. Run extracted addresses through Bulk Email Verifier to check MX records and SMTP validity before importing into your outreach tool. This keeps bounce rates below 5%.
- Disable includeNames for pure contact runs. Name extraction performs DOM traversal with 11 CSS selectors and Schema.org queries per page. If you only need emails and phones, disabling it reduces per-page processing time.
- Use CSV export for CRM bulk import. Download results as CSV and map columns directly to HubSpot, Salesforce, or Pipedrive contact import templates. The flat structure (emails, phones, domain) imports without transformation.
- Set a spending cap for large batches. Use the run's max cost setting or Apify's budget feature to cap spend at a comfortable amount. The actor stops gracefully at the limit and logs how many domains were processed.
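A minimal post-processing sketch for the email-filtering tip above, assuming the documented output schema:

// Keep only emails on the scraped domain itself, dropping third-party addresses.
function filterEmailsToDomain(item) {
  const suffix = "@" + item.domain;
  return { ...item, emails: item.emails.filter((e) => e.toLowerCase().endsWith(suffix)) };
}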
Combine with other Apify actors
| Actor | How to combine |
|---|---|
| Email Pattern Finder | When contacts are found but emails are missing, predict addresses from the company's email naming convention ($0.10/domain) |
| Bulk Email Verifier | Verify extracted emails via MX and SMTP before CRM import to keep bounce rates low ($0.005/email) |
| B2B Lead Qualifier | Score scraped contacts 0–100 using company data, tech stack, and 30+ signals ($0.15/lead) |
| Website Contact Scraper Pro | Use instead for JavaScript-heavy sites (React, Angular, Vue SPAs) that require a real browser to render contact data |
| HubSpot Lead Pusher | Push scraped contact records directly into HubSpot as new contacts or update existing ones |
| Website Tech Stack Detector | Identify 100+ technologies used by each company for technographic lead scoring ($0.10/site) |
| B2B Lead Gen Suite | Full pipeline: input URLs → scraped contacts → enrichment → scored leads, all in one actor ($0.25/lead) |
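To illustrate chaining two rows from the table, the sketch below pipes scraped emails into a verifier actor via the Apify client. The verifier's actor ID and input field name are assumptions; check that actor's documentation for its real input schema:

// Chain Website Contact Scraper into an email verifier. The verifier's actor ID
// and its "emails" input field are assumptions; confirm them against its input schema.
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

const scrapeRun = await client.actor("ryanclinton/website-contact-scraper").call({
  urls: ["https://pinnacleventures.com", "https://meridiantech.io"],
});
const { items } = await client.dataset(scrapeRun.defaultDatasetId).listItems();
const emails = [...new Set(items.flatMap((item) => item.emails))];

const verifyRun = await client.actor("ryanclinton/bulk-email-verifier").call({ emails }); // assumed ID + schema
console.log(`Sent ${emails.length} addresses to verification run ${verifyRun.id}`);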
Limitations
- No JavaScript rendering — the actor uses CheerioCrawler which parses static server-rendered HTML. Single-page applications that load contact data via client-side JavaScript (React, Angular, Vue) will not have their dynamic content extracted. For JS-heavy sites, use Website Contact Scraper Pro.
- Same-domain links only — the actor only follows links within the same domain as the input URL. Cross-domain team directories or externally hosted about pages are not discovered.
- Name extraction depends on HTML patterns — team member detection relies on Schema.org markup, recognized CSS class names, and heading-paragraph structure. Custom or unconventional layouts may not trigger any of the three extraction strategies.
- Phone extraction limited to contact areas — to minimize false positives from random digit sequences, phone regex runs only against header, footer, nav, address, and elements with contact/phone/info class names, not the full page body. Phones placed in non-standard page areas may be missed.
- No authentication support — only publicly accessible pages are processed. Login-gated employee directories, intranets, and members-only portals are not supported.
- First social link per platform — if a page contains multiple LinkedIn profiles (e.g., company page + individual employee profiles), only the first matched URL per platform is recorded.
- One record per domain — multiple input URLs on the same domain (e.g., acmecorp.com and www.acmecorp.com) are merged into a single output record. This is by design to prevent duplicate billing.
- Static data — scraped contact data reflects what was publicly visible at the time of the run. Contacts who have left the company or changed roles will not be reflected until you run the actor again.
Integrations
- Zapier — Trigger a Zap when a run completes and push scraped emails and contact names directly to your CRM, email list, or notification system
- Make — Build automated workflows that feed scraped contacts into HubSpot, Mailchimp, Salesforce, or any of Make's 1,500+ app connectors
- Google Sheets — Export results directly to a Google Sheet for collaborative review, filtering, or manual enrichment before CRM import
- Apify API — Trigger runs programmatically and retrieve results in JSON, CSV, XML, or Excel format — use the Python or JavaScript SDK for clean integration
- Webhooks — Receive an HTTP POST when a run completes and automatically trigger downstream processing in your own backend (see the sketch after this list)
- LangChain / LlamaIndex — Feed contact datasets into AI agent workflows for automated research, outreach drafting, or lead qualification pipelines
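For the webhook route, a minimal receiver can fetch the finished run's dataset and kick off downstream processing. A sketch in plain Node.js; the payload field resource.defaultDatasetId is an assumption based on Apify's default webhook template, so verify it against your webhook configuration:

// Minimal webhook receiver sketch (Node 18+ for global fetch).
// The payload shape is an assumption; verify against your webhook setup.
import http from "node:http";

http.createServer((req, res) => {
  let body = "";
  req.on("data", (chunk) => (body += chunk));
  req.on("end", async () => {
    res.statusCode = 200;
    res.end("ok");
    const payload = JSON.parse(body || "{}");
    const datasetId = payload.resource?.defaultDatasetId; // assumed payload field
    if (!datasetId) return;
    // Same dataset items endpoint as in the cURL example above.
    const resp = await fetch(
      `https://api.apify.com/v2/datasets/${datasetId}/items?token=${process.env.APIFY_TOKEN}&format=json`
    );
    const items = await resp.json();
    console.log(`Run finished: ${items.length} domains scraped`);
  });
}).listen(3000);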
Troubleshooting
Getting empty emails despite a site visibly showing contact addresses
The site likely loads contact information via JavaScript after the initial page load. CheerioCrawler parses only the static HTML returned by the server. Switch to Website Contact Scraper Pro, which uses a full browser to render dynamically loaded content.
Run takes longer than expected for large batches
Each website crawls up to maxPagesPerDomain pages with a 30-second timeout per page. A batch of 500 sites at 5 pages each could make up to 2,500 HTTP requests. Lower maxPagesPerDomain to 3 for a faster, lower-cost pass. Enabling Apify Proxy can also improve speed on sites that throttle repeated requests from the same IP.
Phone numbers are missing from output
Phone extraction intentionally targets only contact-specific page areas (footer, address elements, elements with contact/phone class names) and requires recognized formatting (international prefix, parentheses, or dash/dot separators). Numbers placed in the main body copy or formatted as bare digits without separators will not be captured. This strict approach keeps false positives near zero.
Some contacts have names but no emails
Name extraction and email extraction are independent processes. Not every team member lists a personal email — many sites only have a generic contact@ address. Use Email Pattern Finder to predict personal email addresses from names and the company domain.
Seeing emails from third-party domains in the output
Some pages embed forms, partner widgets, or job board integrations containing emails from external domains. Post-process the emails array to filter for addresses matching the target domain (e.g., keep only *@pinnacleventures.com).
Responsible use
- This actor only accesses publicly visible web pages that are available to any browser without authentication.
- Respect website terms of service and robots.txt directives.
- Comply with GDPR, CAN-SPAM, CASL, and other applicable data protection laws when using scraped contact data for commercial outreach.
- Do not use extracted personal contact information for spam, harassment, or unauthorized purposes.
- For guidance on web scraping legality, see Apify's guide.
FAQ
How many websites can I scrape for contact information in one run?
There is no hard URL limit. The actor processes sites concurrently (up to 10 at once) and enforces per-domain page limits internally. A batch of 1,000 websites at the default 5 pages per domain typically completes in 60–90 minutes.
Does Website Contact Scraper extract emails hidden behind JavaScript?
No. The actor uses CheerioCrawler, which parses static HTML. If contact emails are loaded via client-side JavaScript (a common pattern on React and Next.js sites), they will not appear in the output. For JavaScript-rendered sites, use Website Contact Scraper Pro.
What types of email addresses are filtered out?
The actor automatically removes noreply@, no-reply@, donotreply@, test@, admin@, webmaster@, postmaster@, mailer-daemon@, and root@ addresses. It also filters emails ending in image, CSS, or JavaScript file extensions (.png, .jpg, .css, .js) and addresses from known placeholder domains including example.com, sentry.io, wixpress.io, and placeholder.io.
Is it legal to scrape contact information from websites?
Scraping publicly available contact information from websites is generally permitted in most jurisdictions — a position supported by the 2022 hiQ Labs v. LinkedIn ruling in the US. However, what you do with the data matters. GDPR in the EU and similar laws restrict how personal data can be processed and used for outreach. Always review the target site's Terms of Service and consult legal counsel for your specific use case. See Apify's web scraping legality guide for a fuller analysis.
How accurate is the team member name extraction?
Accuracy depends on the site's HTML structure. Sites using Schema.org Person markup or standard team-card CSS patterns (.team-member, .team-card, etc.) yield near-perfect results. Sites with custom or unconventional layouts may produce fewer contacts or none. The actor uses a strict proper-name regex and a 40-word junk-name blocklist to minimize false positives — headings like "Free Trial" or "Our Team" are filtered out.
Can I schedule Website Contact Scraper to run on a recurring basis?
Yes. Use Apify Schedules to run the actor daily, weekly, or at any custom cron interval. This is useful for monitoring company contact pages for changes or keeping a prospect database refreshed without manual effort.
How is Website Contact Scraper different from Hunter.io or Clay?
Hunter.io and Clay use proprietary databases of pre-scraped contacts — you pay to query their index, and data freshness is not guaranteed. Website Contact Scraper crawls the live website each time you run it, so results reflect the current state of the page. It also extracts structured team members with titles and direct emails, not just generic company addresses. Pricing is usage-based at $0.15/site versus Hunter.io's $49–$149/month or Clay's $149–$720/month subscription tiers.
What social media platforms does the actor extract?
LinkedIn (company pages and personal profiles), Twitter/X (twitter.com and x.com), Facebook, Instagram, and YouTube (channel, user, and @ URL formats). The actor extracts the first matching link per platform per domain.
Why are some phone numbers missing or formatted differently?
Phone extraction prioritizes tel: link hrefs as the most reliable source. For text-based phone numbers, only contact-specific page areas are searched (footer, address, elements with contact/phone class names), and the number must match one of three format patterns (international prefix, parentheses, or dash/dot separators). Numbers in plain body copy or formatted as bare 10-digit strings without separators are intentionally skipped to avoid false positives.
Can I use my own proxies instead of Apify Proxy?
Yes. Pass any proxy configuration in the proxyConfiguration input field. The actor supports Apify Proxy (datacenter and residential), custom proxy URLs, and proxy group configurations. For most batches under 50 sites, Apify's datacenter proxies are sufficient.
What happens if a website is down or returns an error?
The actor automatically retries each failed request up to 2 times. If all retries fail, the domain is still included in the output with empty arrays for emails, phones, and contacts. The run continues processing all other domains without interruption.
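Since failed domains still appear in the dataset with empty arrays, they are easy to split out and feed into a retry run. A minimal sketch assuming the documented output fields:

// Separate domains that produced no data (e.g. the site was down) for a retry run.
import { ApifyClient } from "apify-client";
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const { items } = await client.dataset("DATASET_ID").listItems(); // dataset from your finished run

const isEmpty = (item) => !item.emails.length && !item.phones.length && !item.contacts.length;
const retryUrls = items.filter(isEmpty).map((item) => item.url);
console.log(`Retry candidates: ${retryUrls.length}`);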
How do I push scraped contacts into my CRM automatically?
Use HubSpot Lead Pusher to push contacts directly into HubSpot after a scrape run. For other CRMs, use the Zapier or Make integration to route new dataset items to Salesforce, Pipedrive, or any other supported platform.
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom scraping solutions or enterprise integrations, reach out through the Apify platform.
How it works
Configure
Set your parameters in the Apify Console or pass them via API.
Run
Click Start, trigger via API, webhook, or set up a schedule.
Get results
Download as JSON, CSV, or Excel. Integrate with 1,000+ apps.
Use cases
Sales Teams
Build targeted lead lists with verified contact data.
Marketing
Research competitors and identify outreach opportunities.
Data Teams
Automate data collection pipelines with scheduled runs.
Developers
Integrate via REST API or use as an MCP tool in AI workflows.
Related actors
Email Pattern Finder
Discover the email format used by any company. Enter a domain like stripe.com and detect patterns like [email protected]. Then generate email addresses for any name. Combine with Website Contact Scraper to turn company websites into complete email lists.
Waterfall Contact Enrichment
Find business emails, phones, and social profiles from a name + company domain. Cascades through MX validation, website scraping, pattern detection, and SMTP verification. Free Clay alternative.
B2B Lead Qualifier - Score & Rank Company Leads
Score and rank B2B leads 0-100 by crawling company websites. Analyzes 30+ signals across contact reachability, business legitimacy, online presence, website quality, and team transparency. No AI keys needed.
Google Maps Lead Enricher
Search Google Maps for businesses, then automatically enrich each result with emails, phone numbers, named contacts, social links, email patterns, and lead quality scores (0-100) through a 4-step pipeline.
Ready to try Website Contact Scraper?
Start for free on Apify. No credit card required.
Open on Apify Store