Event Lead Extractor
Extract sponsor and speaker leads from conferences. Crawls event pages, finds company domains, enriches with emails, phone numbers, and lead scores. Works with Eventbrite, conference sites, and more.
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| lead-extracted | Charged per lead extracted from event pages. Includes event page crawling, external link discovery, and 3-actor enrichment pipeline. | $0.25 |
Example: 100 leads extracted = $25.00 · 1,000 leads extracted = $250.00
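The per-lead arithmetic above can be sketched in a few lines of Python. This is only an illustration of the pricing table: `estimate_cost` is a hypothetical helper, and the sketch assumes the only charge is the $0.25 `lead-extracted` event.

```python
# Price of one "lead-extracted" event, taken from the pricing table.
PRICE_PER_LEAD_USD = 0.25


def estimate_cost(leads: int, price_per_lead: float = PRICE_PER_LEAD_USD) -> float:
    """Return the estimated charge in USD for a number of extracted leads."""
    if leads < 0:
        raise ValueError("leads must be non-negative")
    return round(leads * price_per_lead, 2)


print(estimate_cost(100))   # 25.0
print(estimate_cost(1000))  # 250.0
```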
Documentation
Turn any conference, trade show, or event page into a qualified lead list — complete with emails, phone numbers, social profiles, and lead scores. Paste in event URLs from Eventbrite, Lu.ma, Sched, Bizzabo, or any custom conference website and the actor does the rest.
Why use Event Lead Extractor?
Conference sponsor lists, exhibitor pages, and speaker lineups are goldmines of B2B leads — but extracting them manually is tedious. This actor automates the entire pipeline: it crawls event pages, discovers sponsor/speaker/exhibitor subpages, extracts every company domain, classifies each one by role, then enriches every domain through three sub-actors (Contact Scraper, Email Pattern Finder, Lead Qualifier). You get a single export with event context, contact details, email patterns, and quality scores — ready for CRM import.
Features
- Multi-source event crawling: handles Eventbrite, Lu.ma, Sched, Bizzabo, conference microsites, and any HTML event page with sponsor or speaker listings
- Automatic subpage discovery: follows links to /sponsors, /speakers, /exhibitors, /partners, /agenda, and similar subpages without manual configuration
- Intelligent link classification: tags every extracted company as sponsor, exhibitor, partner, speaker, organizer, or generic linked based on surrounding HTML context and section headings
- JSON-LD and meta tag parsing: reads structured data (Schema.org Event, OpenGraph, __NEXT_DATA__) to extract event title, date, and location automatically
- Full contact enrichment: calls Website Contact Scraper, Email Pattern Finder, and B2B Lead Qualifier sub-actors to append emails, phones, social links, email patterns, and lead quality scores
- Smart domain filtering: blocks 100+ noise domains (social networks, CDNs, analytics, payment processors, event platforms) so your results contain only real business leads
- Configurable depth: control max leads per event, toggle subpage discovery on or off, and skip enrichment entirely for a fast domain-only extraction
- Batch processing: provide multiple event URLs in a single run to extract leads from an entire conference season at once
Use Cases
Pre-event outreach
Event marketers preparing outreach campaigns who need a structured list of every sponsor and exhibitor with direct contact information. Run the actor on the event page two weeks before the conference and import leads into your CRM.
Trade show prospecting
Sales development reps (SDRs) mining trade show sponsor lists to build targeted account lists enriched with decision-maker emails and company quality scores. Filter by foundAs: "sponsor" to focus on companies with marketing budget.
Competitive sponsorship tracking
Business development managers tracking competitor sponsorship activity across industry events to identify partnership opportunities and understand market positioning.
Event audit
Conference organizers auditing their own events to understand which sponsors and speakers are most visible and how their contact information appears online.
Investor and partner discovery
Startup founders scanning accelerator demo days, pitch events, and investor conferences to identify potential partners, mentors, or customers exhibiting at those events.
Industry ecosystem mapping
Market researchers aggregating company participation data across multiple events to map industry ecosystems and identify trending vendors.
How to Use
- Gather event URLs. Copy the main event page URL from Eventbrite, Lu.ma, or any conference website. For best results, also include direct links to pages like /sponsors or /speakers if they exist.
- Configure the input. Paste the URLs into the Event URLs field. Leave Discover Subpages enabled so the actor automatically finds sponsor and speaker listing pages. Set Max Leads Per Event to control output size.
- Choose enrichment level. Keep Skip Enrichment unchecked for full pipeline output (emails, phones, lead scores). Enable it if you only need a quick domain list without the extra cost of sub-actor calls.
- Run the actor. Click Start and wait for the pipeline to complete. The actor first crawls all event pages, then sequentially runs the Contact Scraper, Email Pattern Finder, and Lead Qualifier on every extracted domain.
- Export your leads. Download results as JSON, CSV, or Excel from the dataset tab.
Input Parameters
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| eventUrls | string[] | Yes | - | URLs of event or conference pages. Include the main page and, optionally, direct /sponsors, /speakers, or /exhibitors links |
| discoverSubpages | boolean | No | true | Automatically find and crawl subpages like /sponsors, /speakers, /exhibitors, and /partners linked from the event URL |
| maxLeadsPerEvent | integer | No | 50 | Maximum number of unique company domains to extract per event URL. Range: 1–500 |
| skipEnrichment | boolean | No | false | When enabled, only extracts domains and skips enrichment sub-actors. Much faster and cheaper |
| proxyConfiguration | object | No | - | Proxy settings for crawling event pages. Recommended for Eventbrite and other sites that block datacenter IPs |
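As a rough illustration of the constraints in the table above, the following Python sketch builds an input object and rejects values outside the documented ranges before a run is started. `build_input` is a hypothetical helper written for this example, not part of the actor or the Apify client.

```python
from urllib.parse import urlparse


def build_input(event_urls, discover_subpages=True, max_leads_per_event=50,
                skip_enrichment=False, proxy_configuration=None):
    """Build an actor input dict, checking the documented constraints first."""
    if not event_urls:
        raise ValueError("eventUrls is required and must not be empty")
    for url in event_urls:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")
    if not 1 <= max_leads_per_event <= 500:
        raise ValueError("maxLeadsPerEvent must be between 1 and 500")

    run_input = {
        "eventUrls": list(event_urls),
        "discoverSubpages": discover_subpages,
        "maxLeadsPerEvent": max_leads_per_event,
        "skipEnrichment": skip_enrichment,
    }
    # proxyConfiguration is optional, so only include it when provided.
    if proxy_configuration is not None:
        run_input["proxyConfiguration"] = proxy_configuration
    return run_input
```

The returned dict has the same shape as the input examples that follow, so it can be passed as `run_input` when calling the actor.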
Input Examples
Single conference, full enrichment:
{
"eventUrls": ["https://www.tech-week.com/"],
"discoverSubpages": true,
"maxLeadsPerEvent": 100,
"skipEnrichment": false
}
Multiple events, quick domain scan:
{
"eventUrls": [
"https://www.tech-week.com/",
"https://websummit.com/",
"https://www.ces.tech/"
],
"discoverSubpages": true,
"maxLeadsPerEvent": 50,
"skipEnrichment": true
}
Direct sponsor page with proxy:
{
"eventUrls": [
"https://www.eventbrite.com/e/my-event-12345",
"https://www.eventbrite.com/e/my-event-12345/sponsors"
],
"discoverSubpages": false,
"maxLeadsPerEvent": 200,
"skipEnrichment": false,
"proxyConfiguration": { "useApifyProxy": true }
}
Input Tips
- Include direct subpage URLs for guaranteed coverage. Auto-discovery handles most cases, but explicit URLs ensure nothing is missed.
- Use Skip Enrichment for reconnaissance. Run with enrichment off first to preview lead counts, then re-run with enrichment on for high-value events only.
- Set Max Leads thoughtfully. Large trade shows (CES, Web Summit) may have 500+ exhibitors. Small meetups rarely exceed 20.
- Enable proxy for Eventbrite. Eventbrite serves limited HTML to bots — residential or auto proxy ensures the full sponsor list renders.
Output Example
Each item in the output dataset represents one company found at one event, enriched with contact data and a lead quality score:
{
"eventUrl": "https://www.tech-week.com/",
"eventTitle": "London Tech Week 2025",
"eventDate": "2025-06-09",
"eventLocation": "ExCeL London, London",
"domain": "datadog.com",
"companyUrl": "https://www.datadog.com",
"foundAs": "sponsor",
"linkText": "Datadog",
"emails": ["[email protected]", "[email protected]"],
"phones": ["+1-866-329-4466"],
"contacts": [
{
"name": "Sarah Chen",
"title": "VP of Marketing",
"email": "[email protected]"
}
],
"socialLinks": {
"linkedin": "https://www.linkedin.com/company/datadog",
"twitter": "https://twitter.com/datadoghq"
},
"emailPattern": "{first}.{last}@datadog.com",
"emailPatternConfidence": 0.92,
"generatedEmails": [
{
"name": "Sarah Chen",
"email": "[email protected]"
}
],
"score": 87,
"grade": "A",
"pipelineSteps": [
"event-crawl",
"contact-scraper",
"email-pattern-finder",
"lead-qualifier"
],
"extractedAt": "2025-06-15T14:32:18.000Z"
}
When enrichment is skipped (skipEnrichment: true), the contact, pattern, and scoring fields will be empty arrays, empty objects, or null.
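Because enrichment fields can be empty or null, downstream code should not assume they are populated. A minimal Python sketch of null-safe handling, assuming the field names shown in the output example (`summarize_lead` is a hypothetical helper):

```python
def summarize_lead(lead: dict) -> str:
    """Return a one-line summary that tolerates missing enrichment fields."""
    emails = lead.get("emails") or []   # may be an empty list
    score = lead.get("score")           # may be None
    grade = lead.get("grade")           # may be None

    scoring = f"score {score} ({grade})" if score is not None else "not scored"
    contact = emails[0] if emails else "no email"
    return f"{lead['domain']} [{lead['foundAs']}]: {scoring}, {contact}"


# A domain-only lead, as produced when skipEnrichment is true.
print(summarize_lead({
    "domain": "datadog.com",
    "foundAs": "sponsor",
    "emails": [],
    "score": None,
    "grade": None,
}))
```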
Output Fields
Event Context Fields
| Field | Type | Description |
|---|---|---|
| eventUrl | string | The input URL this lead was found on |
| eventTitle | string/null | Event name extracted from page title, OpenGraph, JSON-LD, or __NEXT_DATA__ |
| eventDate | string/null | Event start date (ISO 8601 or free text, depending on source) |
| eventLocation | string/null | Venue name and/or city extracted from structured data |
Company Fields
| Field | Type | Description |
|---|---|---|
| domain | string | Normalized company domain (e.g., datadog.com) |
| companyUrl | string | Full URL as found on the event page |
| foundAs | string | Classification: sponsor, exhibitor, partner, speaker, organizer, or linked |
| linkText | string/null | Anchor text of the link on the event page |
Contact Scraper Fields (from sub-actor)
| Field | Type | Description |
|---|---|---|
| emails | string[] | Email addresses found on the company website |
| phones | string[] | Phone numbers found on the company website |
| contacts | array | Named contacts with name, title, and email |
| socialLinks | object | Social media profile URLs (LinkedIn, Twitter, etc.) |
Email Pattern Fields (from sub-actor)
| Field | Type | Description |
|---|---|---|
| emailPattern | string/null | Detected email format (e.g., {first}.{last}@domain.com) |
| emailPatternConfidence | number/null | Confidence score (0–1) for the detected pattern |
| generatedEmails | array | Emails generated using the pattern for discovered contacts |
Lead Qualifier Fields (from sub-actor)
| Field | Type | Description |
|---|---|---|
| score | number/null | Lead quality score (0–100) |
| grade | string/null | Letter grade (A through F) based on score |
Meta Fields
| Field | Type | Description |
|---|---|---|
| pipelineSteps | string[] | Which pipeline steps completed (e.g., ["event-crawl", "contact-scraper", "email-pattern-finder", "lead-qualifier"]) |
| extractedAt | string | ISO 8601 timestamp of extraction |
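Several of the fields above are nested (arrays and objects), which flat-file CRM imports usually cannot take directly. The sketch below shows one possible way to flatten a lead into a CSV row using only the Python standard library. The column selection, the `"; "` separator, and the helper names are assumptions made for illustration.

```python
import csv
import io

# Columns chosen for this example; adjust to whatever your CRM expects.
COLUMNS = ["eventTitle", "domain", "foundAs", "emails", "phones",
           "linkedin", "score", "grade"]


def flatten_lead(lead: dict) -> dict:
    """Collapse nested fields into plain strings, tolerating nulls."""
    social = lead.get("socialLinks") or {}
    return {
        "eventTitle": lead.get("eventTitle") or "",
        "domain": lead["domain"],
        "foundAs": lead["foundAs"],
        "emails": "; ".join(lead.get("emails") or []),
        "phones": "; ".join(lead.get("phones") or []),
        "linkedin": social.get("linkedin", ""),
        "score": "" if lead.get("score") is None else lead["score"],
        "grade": lead.get("grade") or "",
    }


def leads_to_csv(leads) -> str:
    """Return CSV text with a header row and one row per lead."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for lead in leads:
        writer.writerow(flatten_lead(lead))
    return buffer.getvalue()
```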
Programmatic Access (API)
Python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish
run = client.actor("ryanclinton/event-lead-extractor").call(run_input={
    "eventUrls": ["https://www.tech-week.com/"],
    "discoverSubpages": True,
    "maxLeadsPerEvent": 100,
    "skipEnrichment": False,
})

# Print one summary line per lead, plus any emails that were found
for lead in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{lead['domain']} ({lead['foundAs']}) - "
          f"Score: {lead['score']}, Grade: {lead['grade']}")
    if lead["emails"]:
        print(f"  Emails: {', '.join(lead['emails'])}")
JavaScript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

// Start the actor and wait for the run to finish
const run = await client.actor("ryanclinton/event-lead-extractor").call({
    eventUrls: ["https://www.tech-week.com/"],
    discoverSubpages: true,
    maxLeadsPerEvent: 100,
    skipEnrichment: false,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();

// Log sponsors only
items
    .filter((lead) => lead.foundAs === "sponsor")
    .forEach((lead) => {
        console.log(`${lead.domain}: Score ${lead.score} (${lead.grade})`);
        console.log(`  Emails: ${lead.emails.join(", ")}`);
    });
cURL
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~event-lead-extractor/runs?token=YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"eventUrls": ["https://www.tech-week.com/"],
"discoverSubpages": true,
"maxLeadsPerEvent": 100,
"skipEnrichment": false
}'
# Fetch results (replace DATASET_ID from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
How It Works
The actor runs a four-step pipeline:
Step 1: Event Page Crawling
A CheerioCrawler fetches each input URL and parses the HTML. For each page:
- Event metadata extraction: reads <title>, OpenGraph meta tags, JSON-LD @type: "Event", and Eventbrite's __NEXT_DATA__ to find the event name, date, and venue.
- External link extraction: finds all <a href> tags pointing to external domains. Each link's surrounding context (section headings, parent element classes, nearby text) is captured for classification.
- Domain filtering: checks every extracted domain against a blocklist of 100+ noise domains (social media, CDNs, analytics, event platforms, URL shorteners, big tech, email providers). Only real business domains pass through.
- Link classification: analyzes the surrounding HTML text for keywords like "sponsor," "platinum," "exhibitor," "booth," "speaker," "keynote" to tag each company by role.
- Subpage discovery: if enabled, identifies internal links with paths containing keywords like /sponsors, /speakers, /exhibitors, /partners and enqueues them for crawling.
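The link-handling part of Step 1 can be sketched as follows. The actor itself uses CheerioCrawler; this is a standard-library Python approximation for illustration only, and the keyword list, class name, and function name are assumptions rather than the actor's actual code.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed path keywords, based on the subpages named in this document.
SUBPAGE_KEYWORDS = ("sponsor", "speaker", "exhibitor", "partner", "agenda")


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def split_links(page_url: str, html: str):
    """Return (internal subpages to enqueue, external company hosts)."""
    collector = LinkCollector()
    collector.feed(html)
    page_host = urlparse(page_url).netloc.lower()

    subpages, external = set(), set()
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, tel:, javascript: and similar
        host = parts.netloc.lower()
        if host == page_host:
            if any(word in parts.path.lower() for word in SUBPAGE_KEYWORDS):
                subpages.add(absolute)
        else:
            external.add(host[4:] if host.startswith("www.") else host)
    return sorted(subpages), sorted(external)
```

In the real pipeline the external hosts would still pass through domain filtering and classification before becoming leads.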
Step 2: Contact Scraper (sub-actor)
Calls ryanclinton/website-contact-scraper with all unique domains. Crawls up to 3 pages per domain to extract emails, phone numbers, named contacts (name + title + email), and social media profile URLs.
Step 3: Email Pattern Finder (sub-actor)
Calls ryanclinton/email-pattern-finder with all unique domains. Analyzes discovered emails to detect the company's email format (e.g., {first}.{last}@domain.com) and generates predicted emails for any named contacts found in Step 2.
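To make the idea of applying a detected pattern concrete, here is a small Python sketch that fills a pattern such as `{first}.{last}@domain.com` for a named contact. Only the `{first}` and `{last}` placeholders appear in this document; the `{f}` and `{l}` initials and the `apply_pattern` helper are assumptions for illustration, and the sub-actor's real logic may differ.

```python
def apply_pattern(pattern: str, full_name: str) -> str:
    """Fill an email pattern using the first and last word of a name."""
    parts = full_name.lower().split()
    if len(parts) < 2:
        raise ValueError("need at least a first and last name")
    first, last = parts[0], parts[-1]
    # {f} and {l} (initials) are hypothetical extra placeholders.
    return pattern.format(first=first, last=last, f=first[0], l=last[0])


print(apply_pattern("{first}.{last}@datadog.com", "Sarah Chen"))
# sarah.chen@datadog.com
```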
Step 4: Lead Qualifier (sub-actor)
Calls ryanclinton/b2b-lead-qualifier with all unique domains. Scores each company 0–100 based on contact reachability, business legitimacy, online presence, website quality, and team transparency. Assigns letter grades A through F.
Pipeline Resilience
Each enrichment step runs independently. If one sub-actor fails (for example, the Contact Scraper), the pipeline continues with the remaining steps, which may then have less data to work with. Failed steps produce empty fields rather than crashing the entire run.
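The resilience behavior described above could look roughly like this in Python. It is a sketch under assumptions: each step is modeled as a plain callable that takes a list of domains and returns a `{domain: fields}` dict, and `run_enrichment` is a hypothetical name.

```python
def run_enrichment(domains, steps):
    """Run each (name, callable) step; a failing step is skipped, not fatal."""
    results = {domain: {} for domain in domains}
    completed = ["event-crawl"]  # the crawl has already succeeded at this point

    for name, step in steps:
        try:
            output = step(domains)
        except Exception as exc:
            # Record the failure and move on; this step's fields stay empty.
            print(f"{name} failed: {exc}")
            continue
        for domain, fields in output.items():
            results.setdefault(domain, {}).update(fields)
        completed.append(name)

    return results, completed
```

The `completed` list plays the role of the `pipelineSteps` output field: a step name appears in it only if that step finished.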
Domain Deduplication
When processing multiple events, the actor deduplicates domains globally. A company appearing at three different events is enriched only once, but appears in the output for each event it was found at.
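A minimal Python sketch of this enrich-once, report-per-event behavior, assuming the crawl yields `(event_url, domain)` pairs and that `enrich` is any callable returning a `{domain: fields}` dict (both are assumptions for illustration):

```python
def enrich_once(found, enrich):
    """Enrich each unique domain once, then emit one row per event/domain pair."""
    unique_domains = sorted({domain for _, domain in found})
    enriched = enrich(unique_domains)  # single call covering all events

    return [
        {"eventUrl": event_url, "domain": domain, **enriched.get(domain, {})}
        for event_url, domain in found
    ]
```

With three events that all list the same company, `enrich` sees that domain once, while the output still contains three rows.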
Classification Reference
The actor classifies each extracted company based on keywords found in surrounding HTML:
| Classification | Trigger Keywords |
|---|---|
| sponsor | sponsor, sponsored, sponsorship, presented by, brought to you by, platinum, gold, silver, bronze, diamond, title sponsor |
| exhibitor | exhibitor, exhibit, booth, exhibition, expo |
| partner | partner, partnership, community partner, media partner, technology partner |
| speaker | speaker, keynote, panelist, presenter, moderator, fireside |
| organizer | organizer, organiser, hosted by, organized by, organised by |
| linked | No matching keywords found (generic external link) |
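The keyword table above maps naturally onto a small lookup. The Python sketch below is an approximation: it uses plain substring matching and checks roles in table order, both of which are assumptions, since this document does not say how the actor resolves context that matches several roles.

```python
# Trigger keywords copied from the classification table, in table order.
KEYWORDS = {
    "sponsor": ("sponsor", "presented by", "brought to you by", "platinum",
                "gold", "silver", "bronze", "diamond"),
    "exhibitor": ("exhibit", "booth", "expo"),
    "partner": ("partner",),
    "speaker": ("speaker", "keynote", "panelist", "presenter", "moderator",
                "fireside"),
    "organizer": ("organizer", "organiser", "hosted by", "organized by",
                  "organised by"),
}


def classify(context: str) -> str:
    """Return the first role whose keyword appears in the surrounding text."""
    text = context.lower()
    for role, keywords in KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return role
    return "linked"  # no keyword matched: generic external link


print(classify("Platinum Sponsors"))  # sponsor
print(classify("Keynote speakers"))   # speaker
print(classify("Read our blog"))      # linked
```

Longer keywords from the table (for example "sponsorship" or "exhibition") are covered by their shorter stems here. Substring matching is deliberately naive and would also match words such as "golden", so a production classifier would likely need word-boundary checks.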
Blocked Domain Categories
The domain filter blocks 100+ domains across these categories to ensure only real business leads appear in results:
| Category | Examples |
|---|---|
| Social media | facebook.com, twitter.com, linkedin.com, instagram.com, tiktok.com |
| Event platforms | eventbrite.com, lu.ma, meetup.com, hopin.com, bizzabo.com, cvent.com |
| CDNs / Infrastructure | cloudflare.com, googleapis.com, cloudfront.net, fastly.net |
| Analytics / Ads | google-analytics.com, doubleclick.net, hotjar.com, mixpanel.com |
| Big tech (too generic) | google.com, apple.com, microsoft.com, amazon.com |
| Email providers | gmail.com, outlook.com, yahoo.com |
| URL shorteners | bit.ly, t.co, tinyurl.com |
| CMS / Website builders | wordpress.com, squarespace.com, wix.com, medium.com |
| Payment processors | paypal.com, stripe.com |
Subdomains of blocked domains (e.g., cdn.example.com, static.example.com) are also filtered.
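Subdomain-aware filtering of this kind can be sketched in Python as a suffix check against the blocklist. The set below contains only a handful of the examples from the table, and the helper names are hypothetical.

```python
from urllib.parse import urlparse

# A small sample of the blocklist; the real list has 100+ entries.
BLOCKED = {
    "facebook.com", "twitter.com", "linkedin.com", "eventbrite.com",
    "cloudflare.com", "google-analytics.com", "google.com", "gmail.com",
    "bit.ly", "wordpress.com", "paypal.com",
}


def normalize_domain(url_or_host: str) -> str:
    """Lowercase the host and strip a leading 'www.'."""
    target = url_or_host if "://" in url_or_host else "//" + url_or_host
    host = (urlparse(target).hostname or "").rstrip(".")
    return host[4:] if host.startswith("www.") else host


def is_blocked(domain: str, blocked=BLOCKED) -> bool:
    """True for a blocked domain or any subdomain of one."""
    return any(domain == entry or domain.endswith("." + entry)
               for entry in blocked)


print(is_blocked(normalize_domain("https://cdn.cloudflare.com/x.js")))  # True
print(is_blocked(normalize_domain("https://www.datadog.com/about")))    # False
```

Matching on `"." + entry` rather than a bare suffix avoids false positives such as `notfacebook.com`.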
How Much Does It Cost?
Cost depends on how many events you process, how many company domains are found, and whether enrichment is enabled.
| Scenario | Events | Leads Found | Enrichment | Estimated Cost |
|---|---|---|---|---|
| Quick domain scan | 1 | ~30 | Skipped | ~$0.01 |
| Single event, full pipeline | 1 | ~50 | Full | ~$0.50 |
| 5 events, full pipeline | 5 | ~200 | Full | ~$2.00 |
| Conference season (20 events) | 20 | ~800 | Full | ~$8.00 |
Free plan users can process approximately 1–2 small events with enrichment per month. Skipping enrichment dramatically reduces cost since no sub-actors are called.
The enrichment phase accounts for most of the cost because it invokes three separate sub-actors once per unique domain. To reduce cost, enable Skip Enrichment for initial exploration and run the full pipeline only on high-value events.
Tips
- Include direct subpage URLs for best coverage. If the conference site has a dedicated /sponsors or /exhibitors page, add it alongside the main event URL.
- Use Skip Enrichment for quick reconnaissance. Run with enrichment off first to preview how many leads an event yields. Then re-run with enrichment on only if the domain list looks promising.
- Set Max Leads Per Event thoughtfully. For large trade shows with hundreds of exhibitors, increase the limit to 200–500. For small meetups, the default of 50 is more than enough.
- Enable proxy for Eventbrite and gated sites. Some event platforms serve limited HTML to bots. Using a residential or auto proxy ensures the full sponsor list is rendered.
- Batch related events together. Process an entire conference series or industry vertical in a single run. The actor deduplicates domains across events, so companies appearing at multiple events are enriched only once.
- Filter by the foundAs field after export. If you only want sponsors, filter the CSV or JSON output by foundAs === "sponsor" to exclude speakers, organizers, and generic links.
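For the post-export filtering tip, a short Python sketch over the JSON output might look like this. `select_leads` is a hypothetical helper, and treating a null score as 0 is an assumption made so that domain-only leads sort last.

```python
def select_leads(leads, role="sponsor", min_score=0):
    """Keep leads with the given foundAs role, best score first."""
    selected = [
        lead for lead in leads
        if lead.get("foundAs") == role and (lead.get("score") or 0) >= min_score
    ]
    return sorted(selected, key=lambda lead: lead.get("score") or 0, reverse=True)
```

Passing the items loaded from an exported JSON file returns only the sponsors, ordered so the highest-scoring companies come first.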
Combine with Other Actors
| Actor | How to combine |
|---|---|
| Website Contact Scraper | Used internally by this actor for enrichment. Run standalone for deeper crawls (more pages per domain) |
| Email Pattern Finder | Used internally. Run standalone to test patterns for specific domains |
| B2B Lead Qualifier | Used internally. Run standalone for detailed score breakdowns with signal-level detail |
| B2B Lead Gen Suite | Full-pipeline B2B lead gen from any domain list — use when you already have domains and don't need event context |
| HubSpot Lead Pusher | Push event leads directly to HubSpot CRM with automatic field mapping |
| Company Deep Research | Deep-dive research on high-scoring event leads before outreach |
| Brand Protection Monitor | Monitor competitor event sponsorship activity over time |
Limitations
- HTML-only crawling: uses CheerioCrawler (no browser). JavaScript-rendered single-page apps (SPAs) that load sponsor lists dynamically are not supported.
- No login/CAPTCHA support: only crawls publicly accessible pages. Cannot handle password-protected, login-gated, or CAPTCHA-protected event pages.
- No attendee data: only extracts company-level data from public sponsor, exhibitor, and speaker listings. Does not scrape attendee lists, registration forms, or personal profiles.
- Classification is heuristic: link classification depends on surrounding HTML context (headings, class names, text). Unconventional page structures may result in linked instead of a specific role.
- Sub-actor availability: enrichment depends on three sub-actors running successfully. If any sub-actor is temporarily unavailable or has a bug, those fields will be empty.
- Timeout for large events: events with 500+ exhibitors and full enrichment may exceed the default 2-hour actor timeout. Increase the timeout in settings or split across runs.
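Splitting a large URL list across runs, as suggested for the timeout limitation, can be done with a simple chunking helper. This Python sketch only prepares the batches; `batches` is a hypothetical name, and the batch size you choose is a judgment call rather than a documented value.

```python
def batches(urls, size):
    """Split a list of event URLs into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [urls[start:start + size] for start in range(0, len(urls), size)]


event_urls = [f"https://example.com/event-{n}" for n in range(1, 6)]
for batch in batches(event_urls, 2):
    print(batch)  # each batch would become the eventUrls of a separate run
```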
Responsible Use
- This actor only accesses publicly visible event pages and company websites.
- Extracted contact information (emails, phone numbers, names) should be used in compliance with applicable data protection laws (GDPR, CAN-SPAM, CCPA).
- Do not use this actor for unsolicited bulk email campaigns without proper opt-in consent.
- Respect rate limits and terms of service for event platforms.
- See Apify's guide on web scraping legality for general guidance.
FAQ
What event platforms are supported? Any event website that renders sponsor, exhibitor, or speaker links in HTML. Tested on Eventbrite, Lu.ma, Sched, Bizzabo, Swapcard, and hundreds of custom conference microsites.
How does the actor classify companies as sponsor vs. speaker vs. exhibitor?
It analyzes the HTML context around each external link — section headings, parent element class names, and nearby text. Keywords like "sponsor," "platinum," "exhibitor," "booth," "speaker," and "keynote" trigger the appropriate classification. Links without clear context are tagged as linked.
What happens if enrichment sub-actors fail?
Each enrichment step runs independently. If one fails, the pipeline continues with the remaining steps. Failed steps produce empty fields in the output rather than crashing the entire run. The pipelineSteps array shows which steps completed.
Can I process password-protected or login-gated event pages? No. The actor crawls publicly accessible HTML pages only.
How many events can I process in a single run? No hard limit on input URLs. In practice, runs with 20–50 event URLs complete within the default 2-hour timeout. For larger batches, increase the actor timeout or split across multiple runs.
Does the actor find individual attendee data? No. The actor only extracts company-level data from public sponsor, exhibitor, and speaker listings.
What does the lead score mean? The score (0–100) and letter grade (A through F) come from the B2B Lead Qualifier sub-actor. It evaluates contact reachability, business legitimacy, online presence, website quality, and team transparency. Higher scores indicate companies that are easier to reach and more likely to be legitimate business targets.
Can I use the output with my CRM? Yes. Export as CSV or JSON and import directly into HubSpot, Salesforce, Pipedrive, or any CRM that accepts flat file imports. Or use the HubSpot Lead Pusher actor for automated CRM sync.
Integrations
- Zapier — trigger a Zap when the run finishes and push leads to HubSpot, Salesforce, Slack, or 5,000+ other apps
- Make — use the Apify module to watch for completed runs and route leads through multi-step workflows
- Google Sheets — export directly to a spreadsheet for team collaboration and filtering
- Apify API — call the actor programmatically and fetch results as JSON for custom integrations
- Webhooks — receive a POST notification with the dataset ID when the run completes
Related actors
Website Contact Scraper
Extract emails, phone numbers, team members, and social media links from any business website. Feed it URLs from Google Maps or your CRM and get structured contact data back. Fast HTTP requests, no browser — scrapes 1,000 sites for ~$0.50.
Email Pattern Finder
Discover the email format used by any company. Enter a domain like stripe.com and detect patterns like [email protected]. Then generate email addresses for any name. Combine with Website Contact Scraper to turn company websites into complete email lists.
Waterfall Contact Enrichment
Find business emails, phones, and social profiles from a name + company domain. Cascades through MX validation, website scraping, pattern detection, and SMTP verification. Free Clay alternative.
B2B Lead Qualifier - Score & Rank Company Leads
Score and rank B2B leads 0-100 by crawling company websites. Analyzes 30+ signals across contact reachability, business legitimacy, online presence, website quality, and team transparency. No AI keys needed.