LEAD GENERATION · SEO TOOLS

Website Contact Scraper

Extract emails, phone numbers, team members, and social media links from any business website. Feed it URLs from Google Maps or your CRM and get structured contact data back. Fast HTTP requests, no browser — a 1,000-site batch costs $150 and typically finishes in 60–90 minutes.

Try on Apify Store
$0.15 per event
Users (30d): 118
Runs (30d): 11,284
Maintenance Pulse: 98/100 (actively maintained)
Last build: today
Last version: 2d ago
Builds (30d): 18
Issue response: 4h avg


Pricing

Pay Per Event model. You only pay for what you use.

Event | Description | Price
website-scanned | Charged per website domain scraped with full contact data | $0.15

Example: 100 events = $15.00 · 1,000 events = $150.00

Documentation

Website Contact Scraper extracts emails, phone numbers, team member names, job titles, and social media links from any business website. Give it a list of URLs and it returns one clean, structured contact record per domain — ready for CRM import, outreach sequences, or lead databases.

The actor crawls each site's homepage, then automatically discovers and visits contact, about, team, leadership, and company pages within the same domain. All data is deduplicated across every page visited, so you never see duplicate emails or phantom contacts. No code required — paste URLs, click Start, download results.

What data can you extract?

Data Point | Source | Example
📧 Email addresses | mailto links, body text, anchor hrefs | [email protected]
📞 Phone numbers | tel: links, footer/address/contact areas | +1 (415) 555-0192
👤 Team member names | Schema.org Person, team cards, heading pairs | Marcus Rodriguez
💼 Job titles | Adjacent to names, itemprop="jobTitle", .job-title | VP of Business Development
🔗 LinkedIn profiles | Company pages and personal profiles | linkedin.com/company/pinnacle-ventures
🐦 Twitter / X profiles | twitter.com and x.com links | twitter.com/pinnaclevc
📘 Facebook pages | Facebook page links | facebook.com/pinnacleventures
📸 Instagram profiles | Instagram profile links | instagram.com/pinnaclevc
▶️ YouTube channels | Channel, user, and @ links | youtube.com/@pinnacleventures
🌐 Domain | Parsed from input URL | pinnacleventures.com
🕐 Scraped timestamp | Run completion time | 2026-03-19T14:32:18.456Z
📄 Pages scraped | Per-domain page count | 4

Why use Website Contact Scraper?

Building prospect lists from company websites by hand means opening each site, hunting for a contact page, scanning for emails that might be buried in footers, checking an about page for team names, copying everything into a spreadsheet — then repeating that for 200 more companies. A thorough researcher might process 15 sites per hour. At that rate, 500 websites takes two full working days, and the data is already stale before you finish.

This actor automates the entire process. Paste a list of URLs, press Start, and return to a structured dataset with emails, phone numbers, team members, and social profiles for every domain. A batch of 500 websites typically completes in 40–60 minutes for roughly $75 — a fraction of the cost of the two days of manual research it replaces.

Built on Apify, the actor gives you production capabilities beyond a one-off script:

  • Scheduling — run daily or weekly to keep contact databases fresh without manual effort
  • API access — trigger runs from Python, JavaScript, or any HTTP client and pipe results directly into your stack
  • Proxy rotation — scrape large batches without IP blocks using Apify's built-in residential and datacenter proxy network
  • Monitoring — receive Slack or email alerts when runs fail or return unexpected result counts
  • Integrations — connect directly to Zapier, Make, Google Sheets, HubSpot, or webhooks with no extra code

Features

  • Three-source email extraction from mailto: link hrefs, full body text (with script, style, and noscript nodes stripped to avoid tracking pixel leakage), and all anchor href attributes — catches emails placed anywhere on the page
  • Junk email filtering that automatically removes noreply, no-reply, donotreply, test, admin, postmaster, mailer-daemon, webmaster, and root addresses, plus emails ending in image/CSS/JS file extensions and addresses from known placeholder domains (example.com, sentry.io, wixpress.io, placeholder.io)
  • Phone extraction from tel: links as the primary, most-reliable source, supplemented by formatted-number regex in contact-specific page areas (header, footer, nav, address, and elements with contact, phone, info, topbar CSS classes)
  • Phone validation that rejects all-same-digit sequences and sequential numbers (1234567) while requiring 7–15 digits and proper formatting (international prefix, parentheses, or dash/dot separators)
  • Three-strategy contact name detection: (1) Schema.org Person structured data with itemprop="name" and itemprop="jobTitle" attributes, (2) 11 team-card CSS selectors (.team-member, .team-card, .staff-member, .person-card, .member-card, .leadership-card, .employee, .bio-card, .team-item, .people-card, .about-member), and (3) heading-paragraph pairs where the h3/h4 matches a strict proper-name regex and the next sibling contains one of 35+ job title keywords
  • 40+ junk-name word filter that prevents page headings like "Free Plan" or "Our Services" from appearing in the contacts list
  • Automatic contact-page discovery that follows same-domain links matching 19 contact-related path keywords: contact, about, team, leadership, management, executives, people, staff, company, and variations
  • Configurable crawl depth from 1 to 20 pages per domain — default of 5 covers homepage + contact + about + team for most sites
  • Atomic page-slot reservation that prevents concurrent subpage handlers from exceeding the per-domain page limit even at maximum concurrency
  • Deduplication across all pages — emails by exact lowercase string, phones by digit-only key (so +1 (415) 555-0192 and 14155550192 are the same number), contacts by case-insensitive name, and social links first-match-per-platform
  • Batch processing of unlimited URLs in a single run with up to 10 simultaneous connections and 120 requests per minute
  • Built-in retry logic with 2 automatic retries per page and SSL error tolerance for sites with invalid certificates
  • Pay-per-event pricing with a per-run spending cap — the actor stops delivering results when your budget is reached so there are no surprise charges
  • JavaScript/SPA support available via Website Contact Scraper Pro, which renders React, Angular, and Vue sites with a real browser
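The deduplication rules described above can be sketched in a few lines. This is an illustrative Python reimplementation of the stated keying rules (the actor itself runs on Node.js), not its actual source:

```python
import re

def dedup_results(emails, phones, contacts):
    """Dedup sketch: emails by lowercase string, phones by digit-only key,
    contacts by case-insensitive name. Illustrative, not the actor's code."""
    seen_emails = {}
    for e in emails:
        seen_emails.setdefault(e.lower(), e.lower())

    seen_phones = {}
    for p in phones:
        key = re.sub(r"\D", "", p)      # digit-only key collapses format variants
        seen_phones.setdefault(key, p)  # keep the first formatting encountered

    seen_contacts = {}
    for c in contacts:
        seen_contacts.setdefault(c["name"].lower(), c)

    return (list(seen_emails.values()),
            list(seen_phones.values()),
            list(seen_contacts.values()))
```

With this keying, "+1 (415) 555-0192" and "14155550192" reduce to the same entry, and "Jane Doe" / "JANE DOE" count as one contact.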

Use cases for scraping website contacts

Sales prospecting and outreach

Sales development reps building targeted prospect lists paste company websites from a CRM or LinkedIn search into the actor, then use the output emails, direct phone numbers, and LinkedIn profiles to populate outreach sequences. Finding a decision-maker's direct email manually takes 5–10 minutes per company; this actor processes that same company in seconds.

Marketing agency lead generation

Agencies building prospect databases for clients scrape industry directories, trade association member lists, or competitor customer pages to extract contact information at scale. The structured CSV output maps directly to email marketing tools and CRM import templates.

Recruiting and talent sourcing

Recruiters extract team pages from target companies to identify hiring managers, department heads, and engineers along with their direct contact details and LinkedIn profiles. The contacts array with names and titles makes it easy to identify the right person to reach before making first contact.

Business research and market mapping

Analysts conducting competitive intelligence or market mapping run batches of hundreds of competitor or prospect websites to produce a structured dataset of who works where, what their titles are, and how to reach them. The timestamp field tracks when data was collected, making it easy to identify stale records.

Freelancer and consultant outreach

Independent consultants and agencies identify the right decision-maker to pitch at prospective client companies by scraping the about and leadership pages for names, titles, and email addresses — rather than guessing at generic info@ addresses that rarely convert.

CRM data enrichment

Operations and RevOps teams augment existing company records in HubSpot, Salesforce, or Pipedrive with fresh contact details, social profile links, and team member data scraped directly from live company websites. Combine with Bulk Email Verifier to validate addresses before import.

How to scrape website contact information

  1. Provide website URLs — Enter one or more business website homepages in the input form. Use the root domain (e.g., https://pinnacleventures.com), not a deep URL. The actor discovers internal pages automatically.
  2. Configure options — Keep maxPagesPerDomain at the default of 5 for most sites. Increase to 10–15 only if you know a site has a large staff directory spread across multiple pages.
  3. Run the actor — Click "Start". The actor crawls each site concurrently, typically finishing 50 websites in 3–5 minutes and 500 websites in 40–60 minutes.
  4. Download results — Open the Dataset tab and download your data as JSON, CSV, or Excel. Each row is one domain with its complete contact profile: emails, phones, team members, and social links.

Input parameters

Parameter | Type | Required | Default | Description
urls | string[] | Yes | — | Business website homepages to scrape. One output record per unique domain.
maxPagesPerDomain | integer | No | 5 | Pages to crawl per website (1–20). Default covers homepage + contact + about + team for most sites.
includeNames | boolean | No | true | Extract team member names and job titles from team/about pages. Disable for emails-only runs.
includeSocials | boolean | No | true | Extract social media profile links (LinkedIn, Twitter/X, Facebook, Instagram, YouTube).
proxyConfiguration | object | No | Apify Proxy | Proxy settings. Recommended when scraping more than 20 sites.

Input examples

Single website with defaults:

{
    "urls": ["https://pinnacleventures.com"]
}

Batch of 50 sites with deep crawl:

{
    "urls": [
        "https://pinnacleventures.com",
        "https://meridiantech.io",
        "https://atlaslogistics.com"
    ],
    "maxPagesPerDomain": 10,
    "includeNames": true,
    "includeSocials": true,
    "proxyConfiguration": { "useApifyProxy": true }
}

Emails and phones only, fast pass:

{
    "urls": [
        "https://pinnacleventures.com",
        "https://meridiantech.io"
    ],
    "maxPagesPerDomain": 3,
    "includeNames": false,
    "includeSocials": false
}

Input tips

  • Start with maxPagesPerDomain: 5 — this covers the homepage plus contact, about, and team pages for the vast majority of business websites. Only increase it for sites with large employee directories spanning 6+ pages.
  • Enable proxies for batches over 20 sites — Apify Proxy rotates IPs automatically to prevent rate limiting. The default proxy configuration works for most cases.
  • Provide root homepages, not deep URLs — enter https://acmecorp.com, not https://acmecorp.com/blog/post-123. The actor discovers contact-related subpages on its own.
  • Disable includeNames for faster runs — name extraction adds DOM traversal per page. If you only need emails and phone numbers, turn it off to reduce processing time.
  • Batch everything in one run — processing 200 sites in a single run is faster than 200 separate single-site runs. The actor handles concurrency internally.

Output example

Each item in the dataset represents one website domain:

{
    "url": "https://pinnacleventures.com",
    "domain": "pinnacleventures.com",
    "emails": [
        "[email protected]",
        "[email protected]",
        "[email protected]"
    ],
    "phones": [
        "+1 (415) 555-0192",
        "+1 800-555-0134"
    ],
    "contacts": [
        {
            "name": "Marcus Rodriguez",
            "title": "Managing Partner",
            "email": "[email protected]"
        },
        {
            "name": "Sarah Chen",
            "title": "VP of Portfolio Operations"
        },
        {
            "name": "James Okafor",
            "title": "Director of Business Development"
        }
    ],
    "socialLinks": {
        "linkedin": "https://www.linkedin.com/company/pinnacle-ventures",
        "twitter": "https://twitter.com/pinnaclevc",
        "facebook": "https://www.facebook.com/pinnacleventures",
        "instagram": "https://www.instagram.com/pinnaclevc",
        "youtube": "https://www.youtube.com/@pinnacleventures"
    },
    "pagesScraped": 4,
    "scrapedAt": "2026-03-19T14:32:18.456Z"
}

Output fields

Field | Type | Description
url | string | Normalized input URL (HTTPS, no trailing slash)
domain | string | Domain with www. stripped (e.g., pinnacleventures.com)
emails | string[] | Deduplicated email addresses from all crawled pages, junk addresses filtered out
phones | string[] | Deduplicated phone numbers; deduplication keyed on digits only so format variants collapse to one entry
contacts | object[] | Named team members extracted from team/about pages
contacts[].name | string | Person's full name (proper capitalization validated)
contacts[].title | string | Job title (optional; present when found adjacent to the name)
contacts[].email | string | Email address linked to this person (optional; from mailto: in their team card)
socialLinks | object | Social media profile URLs keyed by platform
socialLinks.linkedin | string | LinkedIn company or personal profile URL
socialLinks.twitter | string | Twitter/X profile URL
socialLinks.facebook | string | Facebook page URL
socialLinks.instagram | string | Instagram profile URL
socialLinks.youtube | string | YouTube channel URL
pagesScraped | number | Total pages processed for this domain (homepage + discovered subpages)
scrapedAt | string | ISO 8601 timestamp when the result was assembled

How much does it cost to scrape website contacts?

Website Contact Scraper uses pay-per-event pricing — you pay $0.15 per website scanned. Platform compute costs are included in the price.

Scenario | Websites | Cost per website | Total cost
Quick test | 1 | $0.15 | $0.15
Small batch | 10 | $0.15 | $1.50
Medium batch | 50 | $0.15 | $7.50
Large batch | 200 | $0.15 | $30.00
Enterprise | 1,000 | $0.15 | $150.00

You can set a maximum spending limit per run to control costs. The actor stops delivering results when your budget is reached, so you never pay more than you expect.

Compare this to Hunter.io at $49–$149/month or Clay at $149–$720/month — most Website Contact Scraper users spend $5–$30/month with no subscription commitment. Apify's free tier also includes $5 of monthly credits, which covers 33 website scans at no cost.
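For quick budgeting, the pay-per-event arithmetic is simple enough to script. A hypothetical helper using the $0.15 event price and $5 free-tier credit quoted above:

```python
PRICE_PER_WEBSITE = 0.15  # website-scanned event price in USD

def estimate_cost(num_websites: int) -> float:
    """Estimated run cost in USD under pay-per-event pricing."""
    return round(num_websites * PRICE_PER_WEBSITE, 2)

def free_tier_scans(monthly_credit: float = 5.00) -> int:
    """How many website scans Apify's free monthly credit covers."""
    return int(monthly_credit // PRICE_PER_WEBSITE)
```

For example, estimate_cost(200) gives $30.00 and free_tier_scans() gives 33, matching the table above.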

Extract website contacts using the API

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/website-contact-scraper").call(run_input={
    "urls": [
        "https://pinnacleventures.com",
        "https://meridiantech.io",
        "https://atlaslogistics.com",
    ],
    "maxPagesPerDomain": 5,
    "includeNames": True,
    "includeSocials": True,
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['domain']}: {len(item['emails'])} emails, {len(item['phones'])} phones")
    for contact in item.get("contacts", []):
        print(f"  {contact['name']} — {contact.get('title', 'no title')}")
        if contact.get("email"):
            print(f"    email: {contact['email']}")

JavaScript

import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/website-contact-scraper").call({
    urls: [
        "https://pinnacleventures.com",
        "https://meridiantech.io",
        "https://atlaslogistics.com",
    ],
    maxPagesPerDomain: 5,
    includeNames: true,
    includeSocials: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    console.log(`${item.domain}: ${item.emails.length} emails, ${item.contacts.length} contacts`);
    for (const contact of item.contacts) {
        console.log(`  ${contact.name} (${contact.title ?? "no title"})`);
    }
}

cURL

# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~website-contact-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://pinnacleventures.com", "https://meridiantech.io"],
    "maxPagesPerDomain": 5,
    "includeNames": true,
    "includeSocials": true
  }'

# Fetch results once the run completes (replace DATASET_ID from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"

How Website Contact Scraper works

Phase 1: URL normalization and domain deduplication

Before any crawling begins, each input URL is normalized — HTTPS is enforced, trailing slashes are stripped, and the domain is extracted with www. removed. Duplicate domains are collapsed to a single entry so you never pay twice for the same site. An empty result object is created for each unique domain, and the homepage is queued with label: 'HOMEPAGE' along with the user's configuration (maxPagesPerDomain, includeNames, includeSocials).
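A minimal Python sketch of this normalization and deduplication step, with illustrative function names (the actor is a Node.js crawler, so this models the described behavior rather than quoting its code):

```python
from urllib.parse import urlparse

def normalize(url: str):
    """Enforce HTTPS, strip the trailing slash, and derive the domain
    with www. removed, as described in Phase 1."""
    url = url.strip()
    if not url.startswith(("http://", "https://")):
        url = "https://" + url
    if url.startswith("http://"):
        url = "https://" + url[len("http://"):]
    url = url.rstrip("/")
    domain = urlparse(url).netloc.lower()
    if domain.startswith("www."):
        domain = domain[4:]
    return url, domain

def dedupe_domains(urls):
    """Collapse duplicate domains so no site is queued (or billed) twice."""
    seen = {}
    for u in urls:
        normalized, domain = normalize(u)
        seen.setdefault(domain, normalized)
    return seen  # domain -> normalized homepage URL
```

Under these rules, https://acmecorp.com and http://www.acmecorp.com/ resolve to the same domain and produce a single queue entry.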

Phase 2: Homepage crawl and contact-page discovery

CheerioCrawler fetches each homepage using got-scraping with up to 10 concurrent connections, a 120 requests/minute rate limit, 30-second timeout, and 2 automatic retries. SSL errors are silently ignored to handle sites with invalid certificates.

On the homepage, all four extraction functions run in parallel: emails (from mailto: links, body text regex, and anchor hrefs with script/style nodes stripped), phones (from tel: links and contact-area text), social links (5 platform patterns), and contacts (3 strategies). Results are merged into the domain's result object.

The homepage handler then scans every <a href> link for same-domain URLs matching any of 19 contact-page path segments. Discovered links are deduped in memory, then a batch of slots is reserved atomically on the domainPageCounts map — preventing concurrent handlers from exceeding the per-domain limit even at maximum concurrency. Reserved pages are enqueued with label: 'SUBPAGE'.
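The slot-reservation idea can be modeled in Python. In the actor's single-threaded event loop, reading and updating the counter in one synchronous step is what makes the batch reservation atomic: no other handler can run between the read and the write. The names below are illustrative:

```python
def reserve_slots(page_counts: dict, domain: str, requested: int, limit: int) -> int:
    """Atomically grant up to `requested` subpage slots for `domain`,
    never exceeding `limit` pages total. Returns the number granted.
    Assumes single-threaded access, as in an event-loop crawler."""
    used = page_counts.get(domain, 1)  # homepage already counts as one page
    granted = max(0, min(requested, limit - used))
    page_counts[domain] = used + granted
    return granted
```

With a limit of 5 and 10 discovered contact-page links, the first reservation grants 4 subpages and any later reservation for the same domain grants none.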

Phase 3: Subpage extraction

Contact, about, team, and leadership pages run through the same extraction pipeline as the homepage. No additional link-following occurs on subpages — crawl depth is controlled exclusively by the homepage handler. The page count was already incremented at reservation time, so no further synchronization is needed.

Phase 4: Result aggregation and output

After all pages are crawled, the actor iterates over each domain's result and pushes it to the Apify dataset. In pay-per-event mode, a website-scanned charge event fires before each push. If the user's spending limit is reached mid-batch, the actor stops gracefully and logs how many domains were not delivered. The final log line reports total emails, phones, and named contacts found across all domains.

Extraction internals

Email — EMAIL_REGEX (/\b[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,12}\b/g) runs on both stripped body text and href attributes. Thirteen junk patterns are tested against each match. All emails are lowercased before dedup.

Phone — Three regex patterns cover international (+1 (555) 123-4567), parentheses ((555) 123-4567), and separator formats (555-123-4567, 555.123.4567). Dedup key is the digit-only string, so +1 (415) 555-0192 and 14155550192 collapse to one entry.

Contacts — The strict name regex (/^[A-Z][a-z]+(?:\s[A-Z][a-z]+){1,3}$/) rejects single-word strings, all-caps text, and names over 40 characters. A 40-word junk-name blocklist filters headings like "Free Trial", "Our Services", and "Read More". Job-title detection uses 35+ keywords (CEO through Finance) checked case-insensitively against adjacent text.
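The two regexes quoted above can be lifted directly into Python to reproduce the matching behavior; the helper names and the 40-character check placement are illustrative:

```python
import re

# Patterns quoted from the "Extraction internals" section.
EMAIL_REGEX = re.compile(r"\b[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,12}\b")
NAME_REGEX = re.compile(r"^[A-Z][a-z]+(?:\s[A-Z][a-z]+){1,3}$")

def find_emails(text: str):
    """All email-shaped matches in a block of text or href attributes."""
    return EMAIL_REGEX.findall(text)

def looks_like_person(name: str) -> bool:
    """Strict proper-name check: 2-4 capitalized words, max 40 characters.
    Rejects single words and all-caps headings."""
    return len(name) <= 40 and bool(NAME_REGEX.match(name))
```

"Marcus Rodriguez" passes the name check; "OUR SERVICES" and single-word headings like "Contact" do not.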

Tips for best results

  1. Default to 5 pages per domain. The default maxPagesPerDomain of 5 covers the homepage, contact page, about page, and team page for most business websites. Increasing beyond 10 gives diminishing returns and raises cost.

  2. Enable proxies for batches over 20 sites. Apify Proxy rotates IP addresses automatically. Set proxyConfiguration: { "useApifyProxy": true } in your input. This is the single biggest factor in preventing blocks on large batches.

  3. Filter emails by domain post-processing. The output may include third-party emails from embedded contact forms, partner mention pages, or job boards. After downloading, filter emails to keep only those ending in @yourtargetdomain.com.

  4. Pair with Email Pattern Finder for gap coverage. If the scraper returns team member names but no personal emails, feed the names and domain into Email Pattern Finder to predict addresses based on the company's first.last@, first@, or flast@ naming convention.

  5. Verify emails before sending. Run extracted addresses through Bulk Email Verifier to check MX records and SMTP validity before importing into your outreach tool. This keeps bounce rates below 5%.

  6. Disable includeNames for pure contact runs. Name extraction performs DOM traversal with 11 CSS selectors and Schema.org queries per page. If you only need emails and phones, disabling it reduces per-page processing time.

  7. Use CSV export for CRM bulk import. Download results as CSV and map columns directly to HubSpot, Salesforce, or Pipedrive contact import templates. The flat structure (emails, phones, domain) imports without transformation.

  8. Set a spending cap for large batches. Use the run's max cost setting or Apify's budget feature to cap spend at a comfortable amount. The actor stops gracefully at the limit and logs how many domains were processed.
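Tip 3 (filtering out third-party emails) is straightforward to apply to a downloaded dataset. A hypothetical post-processing helper, assuming the output schema shown earlier:

```python
import json

def filter_to_domain(item: dict) -> dict:
    """Keep only emails belonging to the scraped domain, dropping
    third-party addresses from embedded widgets or job boards."""
    domain = item["domain"]
    item = dict(item)  # avoid mutating the original record
    item["emails"] = [e for e in item["emails"]
                      if e.lower().endswith("@" + domain)]
    return item

# Usage with a downloaded dataset file (path is illustrative):
# with open("dataset.json") as f:
#     items = [filter_to_domain(it) for it in json.load(f)]
```

Note this simple suffix check also drops subdomain addresses (e.g. name@mail.example.com); loosen it if those matter for your list.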

Combine with other Apify actors

Actor | How to combine
Email Pattern Finder | When contacts are found but emails are missing, predict addresses from the company's email naming convention ($0.10/domain)
Bulk Email Verifier | Verify extracted emails via MX and SMTP before CRM import to keep bounce rates low ($0.005/email)
B2B Lead Qualifier | Score scraped contacts 0–100 using company data, tech stack, and 30+ signals ($0.15/lead)
Website Contact Scraper Pro | Use instead for JavaScript-heavy sites (React, Angular, Vue SPAs) that require a real browser to render contact data
HubSpot Lead Pusher | Push scraped contact records directly into HubSpot as new contacts or update existing ones
Website Tech Stack Detector | Identify 100+ technologies used by each company for technographic lead scoring ($0.10/site)
B2B Lead Gen Suite | Full pipeline: input URLs → scraped contacts → enrichment → scored leads, all in one actor ($0.25/lead)

Limitations

  • No JavaScript rendering — the actor uses CheerioCrawler which parses static server-rendered HTML. Single-page applications that load contact data via client-side JavaScript (React, Angular, Vue) will not have their dynamic content extracted. For JS-heavy sites, use Website Contact Scraper Pro.
  • Same-domain links only — the actor only follows links within the same domain as the input URL. Cross-domain team directories or externally hosted about pages are not discovered.
  • Name extraction depends on HTML patterns — team member detection relies on Schema.org markup, recognized CSS class names, and heading-paragraph structure. Custom or unconventional layouts may not trigger any of the three extraction strategies.
  • Phone extraction limited to contact areas — to minimize false positives from random digit sequences, phone regex runs only against header, footer, nav, address, and elements with contact/phone/info class names, not the full page body. Phones placed in non-standard page areas may be missed.
  • No authentication support — only publicly accessible pages are processed. Login-gated employee directories, intranets, and members-only portals are not supported.
  • First social link per platform — if a page contains multiple LinkedIn profiles (e.g., company page + individual employee profiles), only the first matched URL per platform is recorded.
  • One record per domain — multiple input URLs on the same domain (e.g., acmecorp.com and www.acmecorp.com) are merged into a single output record. This is by design to prevent duplicate billing.
  • Static data — scraped contact data reflects what was publicly visible at the time of the run. Contacts who have left the company or changed roles will not be reflected until you run the actor again.

Integrations

  • Zapier — Trigger a Zap when a run completes and push scraped emails and contact names directly to your CRM, email list, or notification system
  • Make — Build automated workflows that feed scraped contacts into HubSpot, Mailchimp, Salesforce, or any of Make's 1,500+ app connectors
  • Google Sheets — Export results directly to a Google Sheet for collaborative review, filtering, or manual enrichment before CRM import
  • Apify API — Trigger runs programmatically and retrieve results in JSON, CSV, XML, or Excel format — use the Python or JavaScript SDK for clean integration
  • Webhooks — Receive an HTTP POST when a run completes and automatically trigger downstream processing in your own backend
  • LangChain / LlamaIndex — Feed contact datasets into AI agent workflows for automated research, outreach drafting, or lead qualification pipelines

Troubleshooting

Getting empty emails despite a site visibly showing contact addresses
The site likely loads contact information via JavaScript after the initial page load. CheerioCrawler parses only the static HTML returned by the server. Switch to Website Contact Scraper Pro, which uses a full browser to render dynamically loaded content.

Run takes longer than expected for large batches
Each website crawls up to maxPagesPerDomain pages with a 30-second timeout per page. A batch of 500 sites at 5 pages each could make up to 2,500 HTTP requests. Lower maxPagesPerDomain to 3 for a faster, lower-cost pass. Enabling Apify Proxy can also improve speed on sites that throttle repeated requests from the same IP.

Phone numbers are missing from output
Phone extraction intentionally targets only contact-specific page areas (footer, address elements, elements with contact/phone class names) and requires recognized formatting (international prefix, parentheses, or dash/dot separators). Numbers placed in the main body copy or formatted as bare digits without separators will not be captured. This strict approach keeps false positives near zero.

Some contacts have names but no emails
Name extraction and email extraction are independent processes. Not every team member lists a personal email — many sites only have a generic contact@ address. Use Email Pattern Finder to predict personal email addresses from names and the company domain.

Seeing emails from third-party domains in the output
Some pages embed forms, partner widgets, or job board integrations containing emails from external domains. Post-process the emails array to filter for addresses matching the target domain (e.g., keep only *@pinnacleventures.com).

Responsible use

  • This actor only accesses publicly visible web pages that are available to any browser without authentication.
  • Respect website terms of service and robots.txt directives.
  • Comply with GDPR, CAN-SPAM, CASL, and other applicable data protection laws when using scraped contact data for commercial outreach.
  • Do not use extracted personal contact information for spam, harassment, or unauthorized purposes.
  • For guidance on web scraping legality, see Apify's guide.

FAQ

How many websites can I scrape for contact information in one run?
There is no hard URL limit. The actor processes sites concurrently (up to 10 at once) and enforces per-domain page limits internally. A batch of 1,000 websites at the default 5 pages per domain typically completes in 60–90 minutes.

Does Website Contact Scraper extract emails hidden behind JavaScript?
No. The actor uses CheerioCrawler, which parses static HTML. If contact emails are loaded via client-side JavaScript (a common pattern on React and Next.js sites), they will not appear in the output. For JavaScript-rendered sites, use Website Contact Scraper Pro.

What types of email addresses are filtered out?
The actor automatically removes noreply@, no-reply@, donotreply@, test@, admin@, webmaster@, postmaster@, mailer-daemon@, and root@ addresses. It also filters emails ending in image, CSS, or JavaScript file extensions (.png, .jpg, .css, .js) and addresses from known placeholder domains including example.com, sentry.io, wixpress.io, and placeholder.io.

Is it legal to scrape contact information from websites?
Scraping publicly available contact information from websites is generally permitted in most jurisdictions — a position supported by the 2022 hiQ Labs v. LinkedIn ruling in the US. However, what you do with the data matters. GDPR in the EU and similar laws restrict how personal data can be processed and used for outreach. Always review the target site's Terms of Service and consult legal counsel for your specific use case. See Apify's web scraping legality guide for a fuller analysis.

How accurate is the team member name extraction?
Accuracy depends on the site's HTML structure. Sites using Schema.org Person markup or standard team-card CSS patterns (.team-member, .team-card, etc.) yield near-perfect results. Sites with custom or unconventional layouts may produce fewer contacts or none. The actor uses a strict proper-name regex and a 40-word junk-name blocklist to minimize false positives — headings like "Free Trial" or "Our Team" are filtered out.

Can I schedule Website Contact Scraper to run on a recurring basis?
Yes. Use Apify Schedules to run the actor daily, weekly, or at any custom cron interval. This is useful for monitoring company contact pages for changes or keeping a prospect database refreshed without manual effort.

How is Website Contact Scraper different from Hunter.io or Clay?
Hunter.io and Clay use proprietary databases of pre-scraped contacts — you pay to query their index, and data freshness is not guaranteed. Website Contact Scraper crawls the live website each time you run it, so results reflect the current state of the page. It also extracts structured team members with titles and direct emails, not just generic company addresses. Pricing is usage-based at $0.15/site versus Hunter.io's $49–$149/month or Clay's $149–$720/month subscription tiers.

What social media platforms does the actor extract?
LinkedIn (company pages and personal profiles), Twitter/X (twitter.com and x.com), Facebook, Instagram, and YouTube (channel, user, and @ URL formats). The actor extracts the first matching link per platform per domain.

Why are some phone numbers missing or formatted differently?
Phone extraction prioritizes tel: link hrefs as the most reliable source. For text-based phone numbers, only contact-specific page areas are searched (footer, address, elements with contact/phone class names), and the number must match one of three format patterns (international prefix, parentheses, or dash/dot separators). Numbers in plain body copy or formatted as bare 10-digit strings without separators are intentionally skipped to avoid false positives.

Can I use my own proxies instead of Apify Proxy?
Yes. Pass any proxy configuration in the proxyConfiguration input field. The actor supports Apify Proxy (datacenter and residential), custom proxy URLs, and proxy group configurations. For most batches under 50 sites, Apify's datacenter proxies are sufficient.

What happens if a website is down or returns an error?
The actor automatically retries each failed request up to 2 times. If all retries fail, the domain is still included in the output with empty arrays for emails, phones, and contacts. The run continues processing all other domains without interruption.

How do I push scraped contacts into my CRM automatically?
Use HubSpot Lead Pusher to push contacts directly into HubSpot after a scrape run. For other CRMs, use the Zapier or Make integration to route new dataset items to Salesforce, Pipedrive, or any other supported platform.

Help us improve

If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:

  1. Go to Account Settings > Privacy
  2. Enable Share runs with public Actor creators

This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.

Support

Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom scraping solutions or enterprise integrations, reach out through the Apify platform.

How it works

  1. Configure: Set your parameters in the Apify Console or pass them via API.
  2. Run: Click Start, trigger a run via API or webhook, or set up a schedule.
  3. Get results: Download as JSON, CSV, or Excel. Integrate with 1,000+ apps.

Use cases

  • Sales Teams: Build targeted lead lists with verified contact data.
  • Marketing: Research competitors and identify outreach opportunities.
  • Data Teams: Automate data collection pipelines with scheduled runs.
  • Developers: Integrate via REST API or use as an MCP tool in AI workflows.

Ready to try Website Contact Scraper?

Start for free on Apify. No credit card required.

Open on Apify Store