Lead Scoring Engine — ICP Score Leads 0-100 is an Apify actor on ApifyForge. Score leads 0-100 against your Ideal Customer Profile across 6 weighted dimensions: industry, company size, services, contact presence, intent signals, and data completeness. Returns A-F grades + per-dimension notes. It costs $0.03 per lead-scored. Best for sales teams and marketers who need verified contact data, lead lists, or prospect enrichment at scale. Not ideal for real-time monitoring or historical data analysis. Maintenance pulse: 90/100. Last verified March 27, 2026. Built by Ryan Clinton (ryanclinton on Apify).
Lead Scoring Engine — ICP Score Leads 0-100
Lead Scoring Engine — ICP Score Leads 0-100 is an Apify actor available on ApifyForge at $0.03 per lead-scored. Score leads 0-100 against your Ideal Customer Profile across 6 weighted dimensions: industry, company size, services, contact presence, intent signals, and data completeness. Returns A-F grades + per-dimension notes. No API calls. $0.03/lead.
What to know
- Results depend on publicly available data; private or gated contacts may not be found.
- Email verification accuracy varies by domain and provider policies.
- Requires an Apify account — free tier available with limited monthly usage.
Maintenance Pulse
90/100
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| lead-scored | One lead record scored against the Ideal Customer Profile and pushed to the dataset. | $0.03 |
Example: 100 events = $3.00 · 1,000 events = $30.00
Documentation
A lead scoring tool for outbound sales that prioritises raw prospect lists without requiring a CRM.
This actor converts raw signals into deterministic, automation-ready scoring + prioritisation decisions.
Apify GTM Pipeline: Scrape → Enrich → Verify → Score → Research → Push to CRM
Role of this actor: Prioritisation + qualification layer.
Lead scoring — also known as lead qualification, prospect prioritisation, sales lead ranking, pipeline filtering, or B2B go-to-market targeting — transforms a raw list of prospects into a ranked, decision-graded shortlist so your sales team contacts the right companies first. This actor scores every lead 0-100 against your Ideal Customer Profile, classifies the decision (qualify / nurture / disqualify), attaches a recommendedAction with owner + ETA, and emits a dryRun task + agentContract ready for downstream automation — all for $0.03 per lead with no API subscriptions. Go from raw list to prioritised outreach queue in under 2 minutes.
Acts as a HubSpot and Apollo alternative for lead qualification and pipeline prioritisation without requiring a CRM. This is a no-CRM lead-scoring layer for raw prospect lists. Unlike CRM-native scoring, which is limited to data already inside the system, this engine works on external datasets and validates scoring against real outcomes. Traditional tools score leads inside a CRM — this scores them before they ever enter one. HubSpot scoring is workflow-driven; this is dataset-driven. Best suited for outbound SDR teams working from scraped or exported lead lists, agencies pre-filtering before paying for enrichment, and Apify pipeline builders chaining a scoring layer between scraping and outreach actors.
For example, instead of manually sorting a spreadsheet of 1,000 prospects, this engine produces a ranked outreach queue in minutes, with every lead tagged with a decision verdict, recommended action, owner, ETA, and opening angle. The engine runs six weighted dimensions against each lead record: industry match, company size, services alignment, contact presence, intent signals, and data completeness. Weights are fully configurable and normalised automatically. The computation is deterministic (the same input always produces the same output) and requires no external API calls, so run time depends only on batch size: typically under 10 seconds for 100 leads and 2–3 minutes for 10,000.
Key properties
- Deterministic — same input always produces the same output, which makes scores reproducible, auditable, and defensible in a sales review where reps can override.
- No external APIs — pure compute, which removes rate limits, hidden costs, and third-party failure points from the scoring path.
- Decision-first — outputs actions, not just scores; every lead carries a decision, a recommendedAction, and a task ready for execution so downstream consumers branch on the verdict, not the raw number.
- Cost-aware — optimises for ROI, not just fit; opt-in enableEconomics adds expectedValue + an act / delay / ignore gate so leads below break-even ROI are skipped before charging.
- Automation-ready — every lead includes a task (dryRun=true), an sla routing block, and an automationSafe flag, so downstream systems (Zapier, Slack, agent loops) can act without writing additional gating logic.
- Suite-aware — every lead carries actorGraph, pipelineState, dataGaps, and nextBestActorSlug, which lets Dify / n8n / agent loops chain cleanly to sibling Apify actors with no glue code.
- Validatable — pass past won / lost outcomes via outcomeDatasetId and the actor joins them on canonical domain to prove whether higher-graded leads actually win more, instead of asking you to trust the model.
Trust comes from showing the priors, not hiding them. Calibration grade, score bands, and benchmark conversion rates are all surfaced in the run summary so you can defend the model in a pipeline review without reverse-engineering it.
Problems this solves
- "Which leads should my SDR team contact first?"
- "How do I prioritise a large list of prospects?"
- "How do I reduce wasted outreach on bad-fit companies?"
- "How do I score leads without HubSpot or Salesforce?"
- "How do I prove whether my lead-scoring model is actually predicting wins?"
- "How do I allocate a fixed SDR / enrichment budget across hundreds of leads?"
- "How do I stop SDRs ignoring scores they don't trust?"
- "Which leads are sales-ready right now vs need nurture?"
Who this is for
- SDR teams prioritising outbound lists from raw scrapes / LinkedIn exports / trade-show data
- Agencies qualifying scraped leads before paying for enrichment (cuts enrichment cost 50-70%)
- B2B founders running lean GTM without a CRM seat per rep
- ABM teams rolling per-lead signal into account-level readiness for buying-committee outreach
- Pipeline ops pre-filtering inbound or trade-show lists before SDR handoff
- Apify pipeline builders chaining a scoring layer between scraping and outreach actors
How to think about the output
A good lead scoring system needs four things: fit, timing, cost, and data quality — everything else is implementation detail. This engine surfaces all four as orthogonal axes the consumer can branch on. Each answers a different question:
- decision — is this lead a fit? (qualify / nurture / disqualify)
- actionDecision — should we act on it now? (act / delay / ignore — driven by ROI, not fit)
- dataHygiene.automationSafe — can this be auto-processed? (true only when no critical issues + email verified + not stale)
- expectedValue.expectedRoi — is it worth the cost? (revenue ÷ cost-to-act)
Together: fit + timing + cost + data quality = action. Every other field in the output supports one of these four decisions or explains why the actor reached it.
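A downstream consumer might branch on these axes roughly as follows. This is a minimal sketch, not the actor's own logic: the field names come from the output described above, while the route function and its routing labels (archive, hold, auto-outreach, and so on) are hypothetical. The cost axis is not read directly here because actionDecision is described as already driven by ROI.

```python
def route(lead: dict) -> str:
    """Return a hypothetical routing label for one scored lead record."""
    decision = lead.get("decision")
    if decision == "disqualify":
        return "archive"
    if decision == "nurture":
        return "nurture-sequence"
    if decision != "qualify":
        return "manual-review"  # unknown or missing verdict: let a human look

    # actionDecision is only present when economics are enabled (assumption:
    # treat a missing value as "act").
    action = lead.get("actionDecision", "act")
    if action == "ignore":
        return "archive"
    if action == "delay":
        return "hold"

    # Only auto-process records the actor has flagged as automation-safe.
    safe = lead.get("dataHygiene", {}).get("automationSafe", False)
    return "auto-outreach" if safe else "manual-review"
```

A qualified lead with actionDecision "act" but automationSafe false lands in manual review, which keeps the data-quality gate separate from the fit and ROI gates.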
Before vs After
Before:
- Spreadsheet of 1,000 prospects, no prioritisation
- SDRs dialling top-to-bottom — 40-60% of effort wasted on bad-fit companies
- No way to validate which scoring rules actually predict closed deals
- Enrichment budget burned on leads that would never qualify
After:
- Ranked queue with A-F grades + decision verdict per lead
- SDRs work top-of-list, every entry already has an opening angle
- outcomeDatasetId join shows you whether higher-graded leads actually win more
- Allocation block excludes low-ROI leads BEFORE charging — savings field reports avoided spend
v1.1 (decision layer, May 2026): every scored lead now carries decision, confidence, recommendedAction, task (dryRun=true), agentContract, openingAngle, scoringTrace, dataGaps, fixPlan, and a buying-committee classification when titled contacts are present. Cohort-level cohort/coverage/trust/notifications blocks are emitted in the run summary. Mode (auto/fast/balanced/thorough) and persona (outbound-sdr/account-exec/growth-marketer) presets shape weighting without manual tuning.
v1.2 (suite intelligence, May 2026): goal preset (pipeline-growth / quick-wins / cost-efficiency / high-ltv) layers WHAT outcome on top of mode (HOW) and persona (WHO). pipelineState per record (enriched / emailVerified / intentChecked / crmSynced / deduped) detected from input. actorGraph per record (previous → current → next[]) for suite navigation. executionReadiness block with blockers + steps-to-ready. improvementSuggestions[] with projected score deltas. Optional watchlistName enables cross-run temporalSignals (trend / momentum / re-engage). Optional enableIcpInsights surfaces ICP-drift from top performers. Optional enableDedup flags same-run duplicates by canonical domain.
v1.3 (ROI + allocation + simulation, May 2026): opt-in enableEconomics adds expectedValue per lead — conversionProbability × estimatedDealSize ÷ costToAct = expectedRoi — plus actionDecision: act|delay|ignore driven by ROI. Industry × company-size deal-size proxies are conservative public benchmarks; override with industryDealSizeOverrides for accuracy. Optional constraints input (maxOutreachPerRun / maxEnrichmentPerRun / budgetUsd) triggers run-level allocation: leads sorted by ROI, top-N selected within budget, each gets allocationDecision. Optional simulate input re-scores every lead with override weights and emits a simulation block with score delta + decision change — test ICP hypotheses without a second run. Plus per-lead actionPlan (multi-step), timingWindow (early/optimal/late), relativePosition (top-1%/5%/10% tier), disqualificationAnalysis (recoverable + pathToQualify), upstreamQuality (per-source confidence + known weakness).
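The ROI formula and the budget-constrained allocation described in v1.3 can be sketched as below. This is an illustration under stated assumptions, not the actor's implementation: the Lead class, the allocate function, the example deal sizes, and the rule of skipping a lead that does not fit the remaining budget are all hypothetical. Only the formula (conversionProbability × estimatedDealSize ÷ costToAct) and the sort-by-ROI idea come from the text.

```python
from dataclasses import dataclass


@dataclass
class Lead:
    name: str
    conversion_probability: float   # e.g. 0.09 for a B-grade benchmark prior
    estimated_deal_size_usd: float  # hypothetical deal-size proxy
    cost_to_act_usd: float

    @property
    def expected_roi(self) -> float:
        # conversionProbability x estimatedDealSize / costToAct
        return (self.conversion_probability
                * self.estimated_deal_size_usd
                / self.cost_to_act_usd)


def allocate(leads, budget_usd, max_outreach):
    """Pick leads in descending ROI order until the cap or budget is hit."""
    selected, spent = [], 0.0
    for lead in sorted(leads, key=lambda l: l.expected_roi, reverse=True):
        if len(selected) >= max_outreach:
            break
        if spent + lead.cost_to_act_usd > budget_usd:
            continue  # assumption: skip leads that exceed the remaining budget
        selected.append(lead.name)
        spent += lead.cost_to_act_usd
    return selected


# 0.09 * 24000 / 174 is roughly 12.4, matching the shape of the
# expectedValue example in the output table.
example = Lead("brightedge.com", 0.09, 24_000, 174)
```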
v1.4 trust, calibration & buyer control (May 2026)
Scorecard templates
Four pre-built configuration bundles for common go-to-market motions: local-agency-outbound (SMB agencies), b2b-saas-abm (enterprise SaaS, AE-led), ecommerce-services (DTC brands), recruiter-sourcing (intent-heavy for actively-hiring companies). One dropdown collapses 8-10 fields (ICP targets + thresholds + mode + persona + goal + negative rules + economics) into one click. User-supplied values always win against template defaults. Beats Clay for non-technical users who don't want to assemble a GTM model from scratch.
Outcome replay
Optional outcomeDatasetId joins your scored leads against past won/lost data on canonical domain. Outputs winRateByGrade, falsePositiveRate, falseNegativeRate, and a scoreIsPredictive boolean. This is the unfair-advantage feature — competitors talk about predictive scoring, this lets you validate it on your own data without leaving the platform. Pure deterministic JOIN, no ML, no LLM.
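The idea of a deterministic join on canonical domain can be sketched in a few lines. Everything below is an assumption made for illustration: the canonical_domain normalisation (lowercase, strip scheme, port, and a leading www.), the outcome record shape with domain and won fields, and the win_rate_by_grade helper. The actor's own canonicalisation rules and dataset schema may differ.

```python
from collections import defaultdict
from urllib.parse import urlparse


def canonical_domain(value: str) -> str:
    """Reduce a URL or bare domain to a lowercase host without 'www.'."""
    value = value.strip().lower()
    if "//" not in value:
        value = "//" + value  # lets urlparse treat a bare domain as the host
    host = urlparse(value).netloc.split(":")[0]
    return host[4:] if host.startswith("www.") else host


def win_rate_by_grade(scored_leads, outcomes):
    """Join scored leads to won/lost outcomes and compute win rate per grade."""
    won_by_domain = {canonical_domain(o["domain"]): o["won"] for o in outcomes}
    wins, totals = defaultdict(int), defaultdict(int)
    for lead in scored_leads:
        key = canonical_domain(lead["domain"])
        if key in won_by_domain:  # only matched leads count toward the rate
            totals[lead["icpGrade"]] += 1
            wins[lead["icpGrade"]] += int(won_by_domain[key])
    return {grade: wins[grade] / totals[grade] for grade in totals}
```

If the A-grade rate comes out clearly higher than the C or D rates on your own history, that is the kind of evidence the scoreIsPredictive flag is meant to summarise.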
Calibration grade
Run-level calibration block grades the score model A-F based on cohort size + outcome alignment. Score bands map to expected B2B conversion priors (A: 18%, B: 9%, C: 4%, D: 1.5%, F: 0.5%) drawn from public benchmarks. When outcomeDatasetId is supplied, actual rates are attached to each band. confidenceWarning is plain English — "No outcome history supplied; using benchmark priors." Trust comes from showing the priors, not hiding them.
Sales-trust diagnostics
Per-lead salesTrust block: trustScore (0-100), reasons[], plus pre-built repObjection + answer for common decision shapes. "Why is this an A lead with no good contact info?" gets a deterministic answer rooted in the score breakdown. Sales adoption depends on this — reps don't trust black-box scores, they trust scores they can defend in a pipeline review.
Data hygiene severity
Per-lead dataHygiene: score + severity (critical / high / medium / low / clean) + criticalIssues[] + automationSafe boolean. Critical issues (malformed emails, missing identity, domain-with-whitespace) BLOCK auto-action. Normalisation issues (mixed-case domains, placeholder phones) softer. Cohort rollup in summary: cohortDataHygiene.automationSafeShare.
Negative scoring rules
User-configurable negativeRules array: [{ field, contains, penalty, reason }]. Match on substring, exact, or regex. Total penalty per lead capped at 50 to prevent over-correction. Common rules ship as scorecard-template defaults (personal-email domains in b2b-saas-abm). Matches HubSpot's "negative point values" pattern — power users get precision without writing custom code.
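A rough sketch of how such rules could be applied, assuming the { field, contains, penalty, reason } shape given above. Only the substring mode is shown, the case-insensitive comparison is an assumption, and the email and companyName field names and the apply_negative_rules helper are hypothetical. The cap of 50 is the one stated in the text.

```python
MAX_TOTAL_PENALTY = 50  # per-lead cap stated in the documentation


def apply_negative_rules(lead: dict, rules: list) -> tuple:
    """Return (capped total penalty, reasons for every rule that matched)."""
    total, reasons = 0, []
    for rule in rules:
        value = str(lead.get(rule["field"], "")).lower()
        if rule["contains"].lower() in value:  # substring match only
            total += rule["penalty"]
            reasons.append(rule["reason"])
    return min(total, MAX_TOTAL_PENALTY), reasons


rules = [
    {"field": "email", "contains": "@gmail.com", "penalty": 30,
     "reason": "Personal email domain"},
    {"field": "companyName", "contains": "student", "penalty": 40,
     "reason": "Likely not a commercial buyer"},
]
```

With both example rules matching, the raw penalties sum to 70 but the capped result is 50, which is the over-correction guard described above.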
Freshness decay
Optional freshnessConfig: dateField (auto-detected from common date fields if blank) + decayAfterDays + maxPenalty. Linear ramp from 0 at decay-after-days to maxPenalty/2 at 2× decay, capped at maxPenalty beyond. Per-lead freshness block: status (fresh / aging / stale / unknown) + ageDays + scorePenalty + recommendedAction: refresh-first. Solves the stale-CRM-data problem.
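One possible reading of the decay ramp, written out as a function. The description leaves the behaviour past 2× decayAfterDays open to interpretation, so this sketch assumes the ramp continues at the same slope until it reaches maxPenalty (which happens at 3× decayAfterDays). The freshness_penalty name and its argument names are hypothetical.

```python
def freshness_penalty(age_days: float, decay_after_days: float,
                      max_penalty: float) -> float:
    """Linear score penalty for stale records, capped at max_penalty."""
    if age_days <= decay_after_days:
        return 0.0
    # How many decay periods past the threshold the record is.
    overdue_ratio = (age_days - decay_after_days) / decay_after_days
    # Half of max_penalty per overdue period, never more than max_penalty.
    return min(max_penalty, overdue_ratio * max_penalty / 2)
```

With decay_after_days=90 and max_penalty=20, a 180-day-old record is penalised 10 points and anything 270 days or older is penalised the full 20.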
SLA routing
Per-lead sla block: routeTo (sdr / ae / marketing / ops / archive) + respondWithinHours + breachRisk. A-grade qualified + enterprise dealSize → AE, 1h. A-grade qualified, smaller deal → SDR, 1h. B-grade qualified → SDR, 24h. Nurture → marketing, 168h. High ROI tightens the SLA. Plug-and-play with Zapier / Make / Slack auto-assignment rules.
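The four routing rules listed above translate into a small lookup. This sketch covers only those stated cases and returns None for anything else; the sla_route function and the enterprise_deal flag are hypothetical, and the sketch does not model how high ROI tightens the SLA or how breachRisk is derived.

```python
from typing import Optional, Tuple


def sla_route(grade: str, decision: str,
              enterprise_deal: bool) -> Optional[Tuple[str, int]]:
    """Return (routeTo, respondWithinHours) for the cases the text lists."""
    if decision == "qualify" and grade == "A":
        # Enterprise-sized deals go to an AE, smaller ones to an SDR.
        return ("ae" if enterprise_deal else "sdr", 1)
    if decision == "qualify" and grade == "B":
        return ("sdr", 24)
    if decision == "nurture":
        return ("marketing", 168)
    return None  # not covered by the documented rules; caller decides
```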
Account-level rollup (ABM)
Optional enableAccountRollup groups leads by canonical domain. Emits accountReadiness[] in summary: per-account contact counts, decision-makers, champions, blockers, coverage (single-thread / multi-threaded / no-coverage), readiness (sales-ready / developing / cold). For B2B / ABM workflows where account-level signal matters more than per-lead.
Savings report
Run-level savings block (auto-on with constraints): leadsSkipped + estimatedSpendAvoidedUsd + estimatedSdrTouchesAvoided + reason. Proves the actor's value as a resource allocator, not just a scorer. "This run prevented $82.60 of wasted enrichment + 413 wasted SDR touches."
What data can you extract?
| Data Point | Source | Example |
|---|---|---|
| ⚡ Decision | Top-level routing — qualify / nurture / disqualify | "qualify" |
| 🎯 Recommended Action | actionId + label + owner + eta + costEstimate | { actionId: "outreach-now", owner: "sdr", eta: "this-week" } |
| 📋 Task (dryRun) | Universal task object for Jira / Linear / internal queue | { id, kind: "outreach", target, payload, owner, deadline, dryRun: true } |
| 🤖 Agent Contract | Compact decision surface for agent loops | { decision, confidence, nextAction, costToAct } |
| 📊 ICP Score | Computed across 6 dimensions | 82.5 |
| 🏅 ICP Grade | Derived from score thresholds | "A" |
| 🔬 Confidence | Weighted components, score, and band | { score: 0.78, level: "medium", components: [...] } |
| 💡 Opening Angle | First-touch sentence referencing something specific | "Saw BrightEdge — 51-200 Marketing Agency doing SEO + Content Marketing..." |
| 👥 Buying Committee | Contacts classified by title regex | { decisionMaker: [...], champion: [...], blocker: [...], user: [...] } |
| 📐 Scoring Trace | Per-rule weight + raw + contribution (reproducibility) | [{ rule: "industryMatch", weight: 25, rawScore: 100, contribution: 25 }, ...] |
| 🚧 Data Gaps | Missing fields with suggestedFix actor slug | [{ field: "emails", suggestedFix: "Run lead-enrichment-pipeline" }] |
| 🛠 Fix Plan | Ordered remediation steps when gaps exist | { steps: [{ order: 1, action, owner, command }] } |
| 💰 Expected Value (v1.3, opt-in) | conversionProb × dealSize ÷ costToAct = ROI | { expectedRoi: 12.4, expectedRevenueUsd: 2160, costToActUsd: 174 } |
| ⚖️ Action Decision (v1.3) | act / delay / ignore — driven by ROI, not just fit | "act" |
| 🎯 Allocation Decision (v1.3, when constraints set) | Top-N leads selected within budget + outreach cap | { selected: true, rankInAllocation: 7 } |
| 🔁 Simulation Result (v1.3) | Re-scored with override weights — test ICP hypotheses | { newScore: 78.2, delta: +6.4, decisionChange: "nurture→qualify" } |
| ⏱ Timing Window (v1.3) | early / optimal / late — should we act NOW? | { status: "optimal", reason: "Active hiring detected" } |
| 📊 Relative Position (v1.3) | top-1% / top-5% / top-10% tier in cohort | { tier: "top-10%", competitiveRank: 3, shouldPrioritise: true } |
| 🎓 Calibration (v1.4) | Run-level scoring model grade A-F + benchmark conversion priors | { calibrationGrade: "B", scoreBands: [{ band: "80-100", expectedConversionRate: 0.18 }] } |
| ✅ Outcome Validation (v1.4) | Joined against your past won/lost data — proves scoring is predictive | { matchedOutcomes: 312, winRateByGrade: { A: 0.18, B: 0.09 }, scoreIsPredictive: true } |
| 🤝 Sales Trust (v1.4) | trustScore + plain-English reasons + pre-built rep-objection answers | { trustScore: 84, reasons: [...], answer: "Strong contact data, but..." } |
| 🩺 Data Hygiene (v1.4) | Operational data-quality block: severity + criticalIssues + automationSafe | { score: 72, severity: "medium", automationSafe: false } |
| ⏰ SLA (v1.4) | Routing + response window: routeTo + respondWithinHours + breachRisk | { routeTo: "sdr", respondWithinHours: 1, breachRisk: "high" } |
| 🏢 Account Readiness (v1.4, ABM) | Account-level rollup grouping leads by canonical domain | { companyKey: "brightedge.com", coverage: "multi-threaded", readiness: "sales-ready" } |
| 💵 Savings Report (v1.4) | Avoided cost when allocation excludes leads — proves resource-allocator value | { leadsSkipped: 413, estimatedSpendAvoidedUsd: 82.60, reason: "Low ROI..." } |
| 📝 ICP Notes | Human-readable per-dimension explanation | "Exact industry match: 'Marketing Agency'" |
| 📊 Cohort Stats | Mean, stdev, percentiles, grade distribution (run summary) | { n: 127, mean: 68.4, p75: 78.5, p90: 88.0, ... } |
| 📦 CSV Export | Apollo / Outreach.io / Salesloft compatible CSV in KV | OUTPUT.csv |
Why use Lead Scoring Engine?
Without a scoring system, sales teams work in gut-feel order. A rep opens the spreadsheet at row 1 and dials down. Half the list is the wrong industry, wrong size, or missing contact details — which means wasted calls, ignored emails, and a pipeline that looks fuller than it is.
This actor automates the entire ICP qualification process. Pass in leads from any upstream actor — Google Maps Email Extractor, Website Contact Scraper, B2B Lead Gen Suite, or your own enrichment output — and get back a scored, sorted, grade-filtered dataset ready for your CRM.
- Scheduling — run daily or weekly to re-score updated datasets as new leads enter the top of the funnel
- API access — trigger scoring runs from Python, JavaScript, or any HTTP client inside your existing pipeline
- Budget control — set a per-run spending limit; the actor stops when your cap is reached so there are no surprise bills
- Monitoring — connect Apify's Slack or email alerts to catch runs that fail or return unexpected grade distributions
- Integrations — push scored leads directly to HubSpot via HubSpot Lead Pusher, or export to Google Sheets, Zapier, or Make
Features
Decision layer (v1.1)
- Top-level decision — every lead is classified qualify / nurture / disqualify so downstream branching nodes route without traversing the score
- Recommended action with owner + ETA — every record has recommendedAction: { actionId, label, owner, eta, costEstimate } (e.g., outreach-now / sdr / this-week / medium) so SDR queues build themselves
- DryRun task per lead — task: { id, kind, target, payload, owner, deadline, dryRun: true } is wire-compatible with universal task schemas; flip dryRun upstream when ready to execute
- Agent contract surface — agentContract: { decision, confidence, nextAction, costToAct } lets MCP/agent consumers act without traversing the rest of the record
- Confidence band + components — confidence: { score: 0-1, level: "high|medium|low|very-low", components: [...] }; cold-start cap (sample <25) prevents over-confidence on small cohorts
- Scoring trace for reproducibility — scoringTrace: [{ rule, weight, rawScore, contribution }] per dimension, so any score is auditable end-to-end
- Decision risk asymmetry — decisionRisk: { downsideIfWrong, upsideIfRight, asymmetryRatio } per lead so high-asymmetry decisions get prioritised
- Send + shouldAct gates — send: "yes|no|hold" for outreach; shouldAct: boolean is the hard gate for auto-execution loops
- Why-this-matters / why-now / opening angle — plain-English rationale strings (≤200 chars each) that paste straight into CRM notes or LLM prompts
- Mode + persona presets — mode: "auto|fast|balanced|thorough" and persona: "outbound-sdr|account-exec|growth-marketer|generic" shape weighting; per-dimension overrides still win
- Output profile filter — outputProfile: "minimal|standard|full|llm" strips/keeps fields without forcing the user to write a JSONata projection
Cohort & remediation
- Cohort statistics in summary — n, mean, stdev, median, p25/p75/p90 plus grade distribution; per-record percentileInCohort and priorityRank
- Coverage block — top-level coverage: { requested, scored, qualified, nurtured, disqualified, errored } for at-a-glance run health
- Notifications block — automatic notifications when ≥50% of leads are missing email, when zero leads qualified, or when the cohort hits cold-start
- Trust block — trust: { provenance, sourceCoverage, conflictCount, sampleSize } for downstream provenance
- Data gaps + fix plan — every record carries dataGaps: [{ field, reason, suggestedFix }] plus an ordered fixPlan pointing at the right enrichment actor
- Buying committee classification — when contacts have titles, classified into decisionMaker / champion / blocker / user by title regex
- Contradictions surfaced, not averaged — when industry says match but services don't, or intent is high but contact data is missing, those conflicts are emitted explicitly
- Stable eventId hash — eventId = sha256(domain + companyName) so the same lead in two runs produces the same ID — cohort diffing works out of the box
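The stable-ID idea can be illustrated with the standard library. Note the assumptions: the text only states eventId = sha256(domain + companyName), so the trimming and lowercasing below, the plain concatenation without a separator, and the hex encoding are guesses. IDs produced by this sketch are therefore not guaranteed to match the ones the actor emits.

```python
import hashlib


def event_id(domain: str, company_name: str) -> str:
    """Deterministic ID: the same (domain, company) pair always hashes alike."""
    # Assumed normalisation so cosmetic differences do not change the ID.
    key = domain.strip().lower() + company_name.strip().lower()
    return hashlib.sha256(key.encode("utf-8")).hexdigest()
```

Because the hash depends only on the lead's identity fields, two runs over overlapping lists can be diffed by ID without any extra bookkeeping.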
Scoring engine
- Six independent scoring dimensions — industry match, company size, services alignment, contact presence, intent signals, and data completeness, each returning a 0–100 raw score before weight application
- Fuzzy industry matching with 10 built-in alias groups — "digital agency" matches "marketing agency", "saas" matches "software", "google ads" matches "ppc", and more; exact matches score 100, partial substring matches score 60
- 8-band employee sizing with human-readable aliases — accepts numeric counts ("25 employees", 150), range strings ("11-50"), or plain-language labels ("small", "mid-market", "enterprise", "fortune 500"); adjacent-band leads score 50 rather than 0
- 15-service synonym library — "SEO" automatically matches "search engine optimisation", "organic search", "link building", "on-page seo", and 5 more variants; scores 100 for 3+ matches, 80 for 2, 50 for 1
- Additive contact presence scoring — 40 pts for a valid email, 20 pts for a named contact or LinkedIn URL, 20 pts for a phone number, 20 pts for 2+ named contacts or a titled decision-maker
- Four-signal intent scoring — high review count (100+) or high rating (4.5+), active hiring via job postings or hiringCount, chat widget or contact form present, and intent/tech keywords in the record; additive up to 100
- 5-group data completeness scoring — identity (domain/website), company name, contact info (email or phone), location (address/city/country), and profile (description, founded year, revenue); 20 pts per group
- A–F letter grades — A (80–100), B (65–79), C (50–64), D (35–49), F (0–34); thresholds for decision (qualifyThreshold / disqualifyThreshold) are independently configurable
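To make the arithmetic concrete, the sketch below combines six raw 0–100 dimension scores into a weighted composite, then maps it to a letter grade and a decision. It is an illustration only: the dimension keys and helper names are hypothetical, and the assignment of the balanced 25/20/20/15/10/10 weights to dimensions in listed order is an assumption (only the industry weight of 25 is confirmed by the scoring-trace example). Grade bands and default thresholds are taken from the text.

```python
# Assumed mapping of the "balanced" preset onto the six dimensions.
BALANCED_WEIGHTS = {"industry": 25, "size": 20, "services": 20,
                    "contact": 15, "intent": 10, "data": 10}


def composite_score(raw: dict, weights: dict) -> float:
    """Weighted mean of raw scores; weights are normalised to sum to 1."""
    total = sum(weights.values())
    return sum(raw[dim] * w / total for dim, w in weights.items())


def grade(score: float) -> str:
    """A (80+), B (65+), C (50+), D (35+), otherwise F."""
    for floor, letter in ((80, "A"), (65, "B"), (50, "C"), (35, "D")):
        if score >= floor:
            return letter
    return "F"


def decide(score: float, qualify: float = 65, disqualify: float = 35) -> str:
    """At or above qualify -> qualify; below disqualify -> disqualify."""
    if score >= qualify:
        return "qualify"
    if score < disqualify:
        return "disqualify"
    return "nurture"


raw = {"industry": 100, "size": 50, "services": 80,
       "contact": 80, "intent": 50, "data": 100}
score = composite_score(raw, BALANCED_WEIGHTS)  # 25 + 10 + 16 + 12 + 5 + 10 = 78
```

Because the weights are divided by their sum, passing 50/40/40/30/20/20 instead gives the same score, which is what automatic normalisation implies.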
Operations
- Inline leads or dataset ID — pass leads directly as a JSON array, or point to any Apify dataset by ID to chain with an upstream scraping actor
- Paginated dataset loading — loads large upstream datasets in 1,000-item batches to avoid out-of-memory errors on datasets of 10,000+ leads
- CSV export to Key-Value Store — OUTPUT.csv written with Apollo / Outreach.io / Salesloft compatible columns (Company Name, Website, Industry, Email, First Name, Title, Phone, Decision, Recommended Action, Opening Angle, ...)
- KV mirrors — SUMMARY (full summary record), OUTPUT (top 25 decisions for fast dashboard polling), RECEIPTS (per-charge audit trail with timestamp + eventId)
- Decision-first dataset views — four pre-defined views: Decisions (decision + grade + score + action first), Qualified Only (ready-for-outreach), Errors, Run Summary
- Charge-after-push — PPE charge fires only after pushData succeeds, so a network failure never charges for output that didn't arrive
- Pre-charge filter — minScoreToInclude runs before charging; filtered leads are never pushed, never charged
- Spending limit awareness — when PPE spending cap is reached the actor stops cleanly and sets spendingLimitReached: true in the summary record
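The ordering guarantees in the last three bullets (filter first, push second, charge last, stop at the cap) can be sketched as a loop. This is not the actor's code and does not use the Apify SDK: push and charge are injected stand-in callables, and the assumption that charge returns True once the spending cap is reached is made purely for illustration.

```python
def push_and_charge(leads, push, charge, min_score_to_include=0.0):
    """Push each lead, then charge for it; stop cleanly at the spending cap.

    push(lead) may raise on failure. charge() is assumed to return True
    when the spending limit has been reached.
    """
    summary = {"pushed": 0, "filtered": 0, "spendingLimitReached": False}
    for lead in leads:
        if lead["icpScore"] < min_score_to_include:
            summary["filtered"] += 1  # never pushed, never charged
            continue
        push(lead)  # if this raises, the charge below is never reached
        summary["pushed"] += 1
        if charge():
            summary["spendingLimitReached"] = True
            break
    return summary
```

Putting the charge after the push means a failed write costs nothing, at the price of possibly pushing one record whose charge then reports the cap.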
When NOT to use this actor
| You need | Use this instead |
|---|---|
| Raw lead data (domains, emails, phones from Google Maps or directories) | Google Maps Email Extractor, Website Contact Scraper, Agency Directory Scraper |
| Email enrichment for leads missing emails | Lead Enrichment Pipeline, Email Pattern Finder |
| Email verification to remove bounces | Bulk Email Verifier |
| 30-signal deep qualification beyond ICP fit | B2B Lead Qualifier |
| Buying-intent signals from external sources (job posts, funding, tech changes) | Intent Signal Tracker |
| AI-generated outreach copy for each lead | AI Outreach Personalizer |
| CRM auto-push of scored leads | HubSpot Lead Pusher, Salesforce Lead Pusher |
This actor is the scoring + decision layer — it takes lead records you already have and returns decisions about each. It does not scrape data, find emails, verify deliverability, or push to CRMs. Pair it with the actors above for the full pipeline.
Capability comparison
| Feature | Lead Scoring Engine | HubSpot Lead Scoring | Apollo Engagement Score | Manual qualification |
|---|---|---|---|---|
| ICP fit score 0-100 | ✅ Configurable across 6 dimensions | ✅ Static rules engine | ✅ Black-box ML model | ✅ Reps' judgement |
| Decision routing (qualify/nurture/disqualify) | ✅ Built-in, threshold-tunable | ⚠️ Workflow-defined | ❌ Score only | ⚠️ Implicit |
| Recommended next action with owner + ETA | ✅ Per-lead | ❌ | ❌ | ⚠️ Notes |
| Opening angle / first-touch sentence | ✅ Per-lead | ❌ | ❌ | ⚠️ Manual |
| Buying committee classification | ✅ Title-regex based | ⚠️ Manual tagging | ❌ | ⚠️ Manual |
| Cohort statistics (percentiles, stdev) | ✅ Per-run summary | ❌ | ⚠️ Aggregate dashboards | ❌ |
| Scoring trace for reproducibility | ✅ Per-rule contribution | ⚠️ Audit log | ❌ Black-box | ❌ |
| Cold-start protection (sample <25) | ✅ Confidence capped 0.5 | ❌ | ❌ | N/A |
| Works on any lead source | ✅ JSON in, JSON out | ❌ HubSpot only | ❌ Apollo only | ✅ |
| Cost per lead | $0.03 | per-seat subscription (verify current plans) | per-user subscription (verify current plans) | $0.40-$2.50 in SDR labour |
| Apify-native (chain with scrapers, MCP, n8n) | ✅ | ❌ | ❌ | ❌ |
Pipeline overview
┌─────────────────────────────────────────────────────────────────────┐
│ INPUT │
│ • leads[] inline OR datasetId (upstream actor) │
│ • mode (auto / fast / balanced / thorough) │
│ • persona (outbound-sdr / account-exec / growth-marketer) │
│ • ICP targets + dimension weight overrides │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ PHASE 1 — SCORE (deterministic, no I/O, no charging) │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ 6 dimensions in parallel per lead │ │
│ │ • industry • size • services • contact • intent • data│ │
│ └─────────────────────────────────────────────────────────────┘ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ decision + confidence + recommendedAction + task │ │
│ │ + agentContract + scoringTrace + dataGaps + fixPlan │ │
│ │ + openingAngle + buyingCommittee + warnings + contradicts │ │
│ └─────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ COHORT PASS │
│ • mean / stdev / p25/p75/p90 / grade distribution │
│ • percentileInCohort + priorityRank per lead │
│ • coverage + notifications + trust blocks │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ PHASE 2 — PUSH + CHARGE (lockstep, per-lead) │
│ for each lead: pushData(applyOutputProfile(lead)) → charge │
│ if eventChargeLimitReached → stop cleanly, set summary flag │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ KV MIRRORS │
│ SUMMARY → full summary record │
│ OUTPUT → top 25 decisions (fast dashboard polling) │
│ OUTPUT.csv → Apollo/Outreach.io/Salesloft compatible CSV │
│ RECEIPTS → per-charge audit trail │
└─────────────────────────────────────────────────────────────────────┘
Use cases for lead scoring
Sales prospecting and SDR prioritisation
Sales development reps waste 40–60% of dial time on companies that were never a real fit. Score every inbound lead from a trade show list, LinkedIn export, or Google Maps scrape against your ICP before the list ever reaches an SDR. Set minScoreToInclude: 65 to hand reps only B+ leads, and sort by score so grade-A prospects appear at the top of their queue.
Marketing agency lead generation
Agencies building prospect lists for outreach campaigns typically work with raw scrapes from directories, Google Maps, or contact databases. Running those lists through this actor before enrichment identifies which leads are worth paying to enrich further — saving enrichment budget on companies that will never convert. Combine with Waterfall Contact Enrichment for a cost-efficient pipeline.
CRM data quality and re-engagement
Existing CRM records go stale. Score your full contact database against a tightened ICP to surface dormant leads who now fit your profile, and identify records that no longer qualify. Export score and grade into a CRM custom field to drive automated re-engagement sequences based on grade changes over time.
Pipeline qualification and deal prioritisation
For teams running inbound pipelines, scoring provides an objective qualification signal alongside BANT. Integrate with HubSpot Lead Pusher to write icpScore and icpGrade directly into HubSpot contact properties, then use HubSpot workflows to route A-grade leads to senior reps and F-grade leads to nurture sequences automatically.
Recruiting and talent sourcing
Talent teams sourcing from job boards or LinkedIn can adapt the ICP model: set targetIndustries to the verticals you hire from, targetCompanySizes to the company sizes your candidates typically work at, and weight intentSignals heavily so actively hiring companies score highest. The intent signals dimension detects job posting activity in the lead record directly.
Market segmentation and research
Analysts who collect company data for market research use the scoring engine to segment a broad universe of companies into tiers. The per-dimension factor scores reveal where a segment is strong or weak across the six dimensions — useful for characterising an addressable market before building a go-to-market strategy.
How to score leads against your ICP
- Provide your leads — paste a JSON array of lead objects into the "Leads (inline)" field, or enter the dataset ID from an upstream actor run (e.g. from B2B Lead Gen Suite) into the "Dataset ID" field.
- Define your ICP — fill in Target Industries (e.g. "Marketing Agency", "SaaS"), Target Company Sizes (e.g. "11-50", "51-200"), and Target Services (e.g. "SEO", "PPC"). Leave a field blank to treat that dimension as neutral.
- Run the actor — click "Start" and wait. Scoring 100 leads typically takes under 10 seconds. Scoring 10,000 leads takes 2–3 minutes.
- Download results: open the Dataset tab, filter to records where `icpGrade` is "A" or "B", and export as CSV, JSON, or Excel. The output is sorted by score descending by default.
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `leads` | array | One of these | (none) | Array of lead objects to score inline. Use this OR datasetId. |
| `datasetId` | string | One of these | (none) | Apify dataset ID to load leads from. Use when chaining with an upstream actor. |
| `mode` | string | No | "auto" | Scoring preset: auto (picks based on cohort + data), fast (industry+size weighted), balanced (default 25/20/20/15/10/10), thorough (intent + completeness up-weighted). |
| `persona` | string | No | "generic" | Persona preset: outbound-sdr (contact + intent), account-exec (industry + size), growth-marketer (intent + completeness), generic (no persona bias). |
| `outputProfile` | string | No | "standard" | Record shape: minimal (decision + action only), standard (default; full decision layer minus scoringTrace), full (every field), llm (agent-optimised: summary + why + opening angle + scoringTrace). |
| `qualifyThreshold` | integer | No | 65 | Score at or above this → decision: "qualify". Default = grade B threshold. |
| `disqualifyThreshold` | integer | No | 35 | Score below this → decision: "disqualify". Default = grade D threshold. Between thresholds → nurture. |
| `csvExport` | boolean | No | true | Write OUTPUT.csv with Apollo/Outreach.io/Salesloft compatible columns to the run's Key-Value Store. |
| `targetIndustries` | array | No | ["Marketing Agency", "Digital Agency"] | Industry names matching your ICP. Fuzzy matching and aliases applied automatically. |
| `targetCompanySizes` | array | No | ["11-50", "51-200"] | Employee bands matching your ICP. Accepts range strings, plain numbers, or labels like "small". |
| `targetServices` | array | No | ["SEO", "Content Marketing"] | Services your ideal clients offer. Matched against lead's services field with synonym expansion. |
| `targetTechStack` | array | No | [] | Technologies your ideal clients use. Matched against lead's techStack and techKeywords fields. |
| `weightIndustry` | integer | No | preset | Override the preset's industry weight (0-100). Leave blank to use mode + persona resolution. |
| `weightCompanySize` | integer | No | preset | Override the preset's company size weight. |
| `weightServices` | integer | No | preset | Override the preset's services weight. |
| `weightContactPresence` | integer | No | preset | Override the preset's contact presence weight. |
| `weightIntentSignals` | integer | No | preset | Override the preset's intent signals weight. |
| `weightDataCompleteness` | integer | No | preset | Override the preset's data completeness weight. |
| `minScoreToInclude` | integer | No | 0 | Exclude leads below this score from output. Filter runs BEFORE charging: filtered leads are never pushed and never charged. |
| `outputSortedByScore` | boolean | No | true | Sort output descending by icpScore so best leads appear first. |
| `maxLeads` | integer | No | 10000 | Safety cap on leads processed. Prevents runaway costs on very large datasets. |
| `watchlistName` | string | No | (none) | Set to enable cross-run trend tracking. Two runs with the same watchlistName attach temporalSignals (trend / momentum / scoreDelta / runsSeen / reengage flag) to each lead by canonical entityId. |
| `monitorStateKey` | string | No | (none) | Suite-aligned alias for watchlistName. Either input works; if both are set, watchlistName wins. Use this for one consistent field name across lead-scoring-engine, waterfall-contact-enrichment, phone-number-finder, bulk-email-verifier, company-deep-research, and lead-enrichment-pipeline. |
| `lastAction` | object | No | (none) | Closes the feedback loop. Pass { type, takenAt: ISO date, note? } to tell the actor what action you took on this watchlist since the last run. On the next scheduled run the actor compares ICP scores against the snapshot at action time and emits decisionMemory with an inferred outcome. Honest: only signal-change is observable. Requires watchlistName / monitorStateKey. |
Input examples
Score leads from an upstream actor run (most common pipeline use):
{
"datasetId": "aBcDeFgHiJkLmNoP",
"targetIndustries": ["Marketing Agency", "Digital Agency"],
"targetCompanySizes": ["11-50", "51-200"],
"targetServices": ["SEO", "PPC", "Content Marketing"],
"targetTechStack": ["HubSpot", "Google Analytics"],
"minScoreToInclude": 50,
"outputSortedByScore": true
}
Score inline leads with custom ICP weights (B2B SaaS targeting):
{
"leads": [
{
"domain": "pinnacletech.io",
"companyName": "Pinnacle Technologies",
"industry": "SaaS",
"services": ["CRM", "Marketing Automation"],
"companySize": "51-200",
"emails": ["[email protected]"],
"contacts": [{ "name": "James Okafor", "title": "CEO", "email": "[email protected]" }],
"phones": ["+44 20 7946 0123"],
"techStack": ["HubSpot", "Salesforce"],
"rating": 4.8,
"reviewCount": 214,
"hasChatWidget": true,
"hasContactForm": true,
"city": "London",
"country": "UK",
"foundedYear": 2016,
"description": "B2B SaaS platform for marketing operations teams."
}
],
"targetIndustries": ["SaaS", "Software"],
"targetCompanySizes": ["51-200", "201-500"],
"targetServices": ["CRM", "Marketing Automation"],
"targetTechStack": ["HubSpot", "Salesforce"],
"weightIndustry": 30,
"weightCompanySize": 25,
"weightServices": 20,
"weightContactPresence": 15,
"weightIntentSignals": 5,
"weightDataCompleteness": 5,
"outputSortedByScore": true
}
Quick filter — keep only grade A leads, no ICP on services:
{
"datasetId": "xYz123datasetId",
"targetIndustries": ["Ecommerce", "Online Retail"],
"targetCompanySizes": ["11-50", "51-200", "201-500"],
"minScoreToInclude": 80,
"outputSortedByScore": true
}
Input tips
- Start with the default weights: industry 25, company size 20, services 20, contact 15, intent 10, completeness 10 covers most B2B agency use cases without adjustment.
- Leave unused dimensions at zero: if you don't care about tech stack, set `targetTechStack: []`; the services dimension returns neutral (50) when no target is configured, which does not hurt scores.
- Use `minScoreToInclude` to reduce output volume: setting it to 50 cuts typical output by 30–50% and makes the dataset easier to action in a CRM import.
- Batch all leads in one run: scoring 500 leads in one run is significantly faster than 500 single-lead runs, because load time and actor startup overhead are paid once per run.
- Pass `datasetId` from upstream actors: the actor reads any Apify dataset directly; there is no need to download and re-upload data between pipeline steps.
Output example
Scored lead (recordType: "lead"):
{
"schemaVersion": "1.1.0",
"recordType": "lead",
"eventId": "a3f2c8d4e1b07f29",
"domain": "brightedge.com",
"companyName": "BrightEdge",
"industry": "Marketing Agency",
"services": ["SEO", "Content Marketing", "Analytics"],
"companySize": "51-200",
"emails": ["[email protected]"],
"contacts": [{ "name": "Sarah Chen", "title": "Head of SEO", "email": "[email protected]" }],
"icpScore": 87.5,
"icpGrade": "A",
"icpFactors": {
"industryMatch": 100,
"companySizeMatch": 100,
"servicesMatch": 100,
"contactPresence": 100,
"intentSignals": 100,
"dataCompleteness": 100
},
"icpNotes": ["Industry (25pts weight): Exact industry match: 'Marketing Agency' in lead data", "..."],
"decision": "qualify",
"confidence": {
"score": 0.91,
"level": "high",
"components": [
{ "name": "industrySignal", "weight": 0.25, "value": 1 },
{ "name": "companySizeSignal", "weight": 0.15, "value": 1 },
{ "name": "contactPresence", "weight": 0.25, "value": 1 },
{ "name": "dataCompleteness", "weight": 0.20, "value": 1 },
{ "name": "intentSignals", "weight": 0.15, "value": 1 }
]
},
"confidenceLevel": "high",
"recommendedAction": {
"actionId": "outreach-now",
"label": "Send personalised cold email this week.",
"owner": "sdr",
"eta": "this-week",
"costEstimate": "medium"
},
"task": {
"id": "9f2e1c4d8a07b3f1",
"kind": "outreach",
"target": "brightedge.com",
"payload": { "decision": "qualify", "actionId": "outreach-now", "label": "...", "companyName": "BrightEdge", "domain": "brightedge.com" },
"owner": "sdr",
"deadline": "2026-05-11T09:22:31.000Z",
"dryRun": true
},
"agentContract": {
"decision": "qualify",
"confidence": 0.91,
"nextAction": "Send personalised cold email this week.",
"costToAct": "medium"
},
"send": "yes",
"shouldAct": true,
"summary": "BrightEdge — 87.5 (grade A); qualify. Next: Send personalised cold email this week.",
"whyThisMatters": "BrightEdge qualifies because industry + company size + services aligned — predicts above-baseline conversion.",
"whyNow": "Active buying signals (high reviews/hiring/engagement) — reach out before competitors do.",
"openingAngle": "Saw BrightEdge — 51-200 Marketing Agency doing SEO + Content Marketing. Curious how you're handling [your-product-fit] right now.",
"scoringTrace": [
{ "rule": "industryMatch", "weight": 25, "rawScore": 100, "contribution": 25 },
{ "rule": "companySizeMatch", "weight": 20, "rawScore": 100, "contribution": 20 },
{ "rule": "servicesMatch", "weight": 20, "rawScore": 100, "contribution": 20 }
],
"decisionRisk": {
"downsideIfWrong": "Outreach effort wasted on a non-fit lead.",
"upsideIfRight": "Closed deal in pipeline; ICP-aligned customer with high LTV.",
"asymmetryRatio": 8
},
"warnings": [],
"contradictions": [],
"dataGaps": [],
"buyingCommittee": {
"decisionMaker": [],
"champion": [{ "name": "Sarah Chen", "title": "Head of SEO", "email": "[email protected]" }],
"blocker": [],
"user": []
},
"qualificationRisk": 9,
"coldStart": false,
"agenticReadiness": 100,
"priorityRank": 1,
"percentileInCohort": 100,
"scoredAt": "2026-05-04T09:22:31.000Z"
}
Run summary record (recordType: "summary", last in the dataset):
{
"schemaVersion": "1.1.0",
"recordType": "summary",
"runId": "abc123runid",
"totalInput": 150,
"totalScored": 127,
"totalPushed": 127,
"filteredOut": 23,
"minScoreFilter": 50,
"averageScore": 68.4,
"topScore": 92.5,
"gradeDistribution": { "A": 18, "B": 41, "C": 35, "D": 21, "F": 12 },
"cohort": {
"n": 127, "mean": 68.4, "stdev": 14.2, "median": 67.5,
"p25": 58.0, "p75": 78.5, "p90": 85.3,
"gradeDistribution": { "A": 18, "B": 41, "C": 35, "D": 21, "F": 12 }
},
"coverage": { "requested": 150, "scored": 127, "qualified": 59, "nurtured": 47, "disqualified": 21, "errored": 0 },
"trust": { "provenance": "apify-dataset:abcDEF...", "sourceCoverage": 0.78, "conflictCount": 4, "sampleSize": 150 },
"notifications": [],
"modeUsed": "balanced",
"personaUsed": "outbound-sdr",
"outputProfile": "standard",
"weightsUsed": { "industry": 22.5, "companySize": 17.5, "services": 17.5, "contactPresence": 20, "intentSignals": 12.5, "dataCompleteness": 10 },
"icpConfig": {
"targetIndustries": ["Marketing Agency", "Digital Agency"],
"targetCompanySizes": ["11-50", "51-200"],
"targetServices": ["SEO", "PPC", "Content Marketing"],
"targetTechStack": ["HubSpot", "Google Analytics"]
},
"spendingLimitReached": false,
"chargedEvents": 127,
"chargedUsd": 3.81,
"summary": "Scored 127 of 150 leads | 59 qualify, 47 nurture, 21 disqualify | mode=balanced, persona=outbound-sdr",
"scoredAt": "2026-05-04T09:22:45.000Z"
}
Output fields
Per-lead record (recordType: "lead")
| Field | Type | Description |
|---|---|---|
| `schemaVersion` | string | Output contract version. Additive only across minor versions (1.1.0). |
| `recordType` | string | Discriminator: "lead" for scored leads. |
| `entityId` | string | Stable cross-suite canonical id (sha256 of domain + companyName). Suite-aligned name; same join key as waterfall-contact-enrichment, phone-number-finder, bulk-email-verifier, company-deep-research, and lead-enrichment-pipeline. |
| `eventId` | string | Legacy alias of entityId (same value). Kept for back-compat with existing downstream pipelines. |
| `signalIndependence` | object | { score, distinctSourceCount, totalComponentCount, interpretation, warning? }. Catches the "looks like 6 corroborating signals but really 1 echoed 6 times" trap. Aligned with waterfall-contact-enrichment, company-deep-research, phone-number-finder, and bulk-email-verifier. |
| `counterfactual` | object | { droppedComponent, withoutThisSignal: { score, level, grade }, interpretation }. Drops the highest-weight ICP factor and recomputes, which tells you whether the lead's grade is load-bearing on a single factor. |
| `decisionMemory` | object\|null | Closes the feedback loop when lastAction is provided as input. { outcome: 'engaged' \| 'no-response' \| 'no-change' \| 'resolved' \| 'too-soon-to-tell', daysSinceAction, confidence, inferenceMethod, epistemicStatus }. Honest: only ICP-score movement is observable. |
| `decision` | string | Top-level routing: "qualify" \| "nurture" \| "disqualify". |
| `confidence` | object | { score: 0-1, level: "high\|medium\|low\|very-low", components: [...] }. |
| `confidenceLevel` | string | Banded confidence string for branching: high (≥0.8), medium (≥0.6), low (≥0.4), very-low (<0.4). |
| `recommendedAction` | object | { actionId, label, owner, eta, costEstimate }. ActionIds: outreach-now, personalised-outreach, nurture-campaign, enrich-first, skip. |
| `task` | object | DryRun task: { id, kind, target, payload, owner, deadline, dryRun: true }. Kinds: outreach, nurture, enrich, archive. |
| `agentContract` | object | Compact agent surface: { decision, confidence, nextAction, costToAct }. |
| `send` | string | Outreach send decision: "yes" \| "no" \| "hold". |
| `shouldAct` | boolean | Hard gate for auto-execution. True only when decision=qualify and contact data is present. |
| `summary` | string | ≤280-char plain-English summary. LLM-friendly. |
| `whyThisMatters` | string | ≤200-char rationale for the decision. |
| `whyNow` | string | ≤200-char timing rationale (only present when decision=qualify). |
| `openingAngle` | string | ≤200-char first-touch sentence referencing something specific to the lead. |
| `icpScore` | number | Overall ICP score, 0–100 (one decimal place). Higher is better. |
| `icpGrade` | string | Letter grade: A (80–100), B (65–79), C (50–64), D (35–49), F (0–34). |
| `icpFactors` | object | industryMatch, companySizeMatch, servicesMatch, contactPresence, intentSignals, dataCompleteness; each is a raw 0-100 value before weight application. |
| `icpNotes` | string[] | Per-dimension text explanations. |
| `scoringTrace` | object[] | Per-rule reproducibility: [{ rule, weight, rawScore, contribution }, ...]. |
| `decisionRisk` | object | { downsideIfWrong, upsideIfRight, asymmetryRatio }. |
| `warnings` | object[] | [{ severity: "critical\|warning\|info", code, message }, ...]. |
| `contradictions` | object[] | Pairs of signals pointing opposite ways. Empty when no conflict detected. |
| `dataGaps` | object[] | [{ field, reason, suggestedFix }, ...]: missing fields with the right enrichment actor to fix them. |
| `fixPlan` | object | Ordered remediation steps. Present only when dataGaps is non-empty. |
| `nextBestActorSlug` | string | Apify actor slug to chain after this one (e.g., ryanclinton/lead-enrichment-pipeline). |
| `buyingCommittee` | object | { decisionMaker[], champion[], blocker[], user[] }; present when contacts have titles. |
| `qualificationRisk` | number | 0-100; inverse of confidence. Higher = more risk the decision is wrong. |
| `coldStart` | boolean | True when cohort has <25 leads. Confidence is capped at 0.5 in this case. |
| `agenticReadiness` | number | 0-100; how well-equipped this record is for an agent loop to act on. |
| `priorityRank` | number | 1-indexed rank in the (sorted) cohort. 1 = top lead. |
| `percentileInCohort` | number | 0-100; this lead's score percentile within the run's cohort. |
| `scoredAt` | string | ISO 8601 timestamp. |
| (all original lead fields) | mixed | Every field from the input lead record is preserved unchanged in the output. |
Run summary record (recordType: "summary")
| Field | Type | Description |
|---|---|---|
| `schemaVersion` / `recordType` | string | "1.1.0" / "summary". |
| `runId` | string | Apify run ID (or local-<timestamp> outside the platform). |
| `totalInput` / `totalScored` / `totalPushed` / `filteredOut` | number | Cohort funnel counts. |
| `cohort` | object | { n, mean, stdev, median, p25, p75, p90, gradeDistribution }. |
| `coverage` | object | { requested, scored, qualified, nurtured, disqualified, errored }. |
| `trust` | object | { provenance, sourceCoverage, conflictCount, sampleSize }. |
| `notifications` | object[] | Auto-generated alerts (low coverage, cold start, zero qualified, ICP overfit). |
| `modeUsed` / `personaUsed` / `outputProfile` | string | Resolved preset values (after auto-resolution if mode: "auto"). |
| `weightsUsed` | object | The normalised weights actually applied. |
| `chargedEvents` / `chargedUsd` | number | PPE charge totals for this run. |
| `spendingLimitReached` | boolean | true if the PPE spending cap was hit mid-run. |
| `summary` | string | One-line summary. |
Key-Value Store mirrors
| Key | Format | Description |
|---|---|---|
| `SUMMARY` | JSON | Same content as the summary dataset record; pin this for cross-run polling. |
| `OUTPUT` | JSON | Top 25 decisions in compact shape; fast for dashboard polling without listing the dataset. |
| `OUTPUT.csv` | CSV | Apollo / Outreach.io / Salesloft compatible columns. Written when csvExport: true (default). |
| `RECEIPTS` | JSON | Per-charge audit trail: [{ timestamp, action, cost, eventId }, ...]. |
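A minimal Python sketch of polling the `SUMMARY` mirror instead of listing the dataset. It assumes the official `apify-client` package, whose `key_value_store(...).get_record(...)` call returns a dict with a `value` entry (or `None` when the key is missing). The helper name `read_summary` is illustrative and not part of the actor.

```python
def read_summary(client, run):
    """Return the SUMMARY record for a finished run, or None if it is absent.

    Assumptions of this sketch: `client` behaves like apify_client.ApifyClient
    and `run` is the dict returned by `.call()`.
    """
    store = client.key_value_store(run["defaultKeyValueStoreId"])
    record = store.get_record("SUMMARY")
    # get_record yields {"key": ..., "value": ..., ...}; only the value is needed.
    return record["value"] if record else None
```

With a real client this would be called as `read_summary(ApifyClient("YOUR_API_TOKEN"), run)`, after which fields such as `coverage` and `gradeDistribution` can be read from the returned dict.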
How much does it cost to score leads?
Lead Scoring Engine uses pay-per-event pricing — you pay $0.03 per lead scored. Platform compute costs are included. Scoring happens in-process with no external API calls, so there are no variable costs from third-party services.
| Scenario | Leads | Cost per lead | Total cost |
|---|---|---|---|
| Quick test | 10 | $0.03 | $0.30 |
| Small batch | 100 | $0.03 | $3.00 |
| Medium batch | 500 | $0.03 | $15.00 |
| Large batch | 2,000 | $0.03 | $60.00 |
| Enterprise | 10,000 | $0.03 | $300.00 |
You can set a maximum spending limit per run in the Apify console. The actor stops when your budget is reached and marks spendingLimitReached: true in the summary record. Per-charge audit trail is written to the run's KV under the RECEIPTS key.
Charges fire only after a lead is pushed to the dataset. If minScoreToInclude filters a lead out, it is never pushed and never charged. The PPE charge happens after pushData succeeds, so a network or platform failure cannot leave you paying for output that didn't arrive.
Compare this to manual qualification: an SDR paid $25–50 per hour who spends 3 minutes per lead needs 5 hours to work through 100 leads, which is $125–250 in labour. With this actor the same 100 leads cost $3.00 and return in under 30 seconds.
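The arithmetic behind the pricing table and the labour comparison can be checked with a short Python sketch. The function names are illustrative; the $0.03 price, the 3 minutes per lead, and the $25–50 hourly range are the figures quoted above.

```python
PRICE_PER_LEAD_USD = 0.03  # pay-per-event price for one "lead-scored" event

def scoring_cost(leads):
    """Actor cost in USD for a given number of pushed (charged) leads."""
    return round(leads * PRICE_PER_LEAD_USD, 2)

def manual_cost(leads, hourly_rate, minutes_per_lead=3):
    """Labour cost in USD for an SDR qualifying the same leads by hand."""
    hours = leads * minutes_per_lead / 60
    return hours * hourly_rate

print(scoring_cost(100), manual_cost(100, 25), manual_cost(100, 50))
```

Only pushed leads are charged, so when `minScoreToInclude` is set the `leads` argument should be the count that survives the filter, not the input size.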
Score leads using the API
Python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("ryanclinton/lead-scoring-engine").call(run_input={
    "datasetId": "aBcDeFgHiJkLmNoP",
    "targetIndustries": ["Marketing Agency", "Digital Agency"],
    "targetCompanySizes": ["11-50", "51-200"],
    "targetServices": ["SEO", "PPC", "Content Marketing"],
    "minScoreToInclude": 65,
    "outputSortedByScore": True,
})

# The dataset holds one record per scored lead plus a final summary record.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("recordType") == "summary":
        print(f"Run complete — A:{item['gradeDistribution']['A']} B:{item['gradeDistribution']['B']} avg:{item['averageScore']}")
        print(f"Coverage: {item['coverage']} | Mode: {item['modeUsed']}, Persona: {item['personaUsed']}")
    elif item.get("recordType") == "lead":
        print(f"{item.get('companyName')} | Score: {item['icpScore']} | Decision: {item['decision']} | Action: {item['recommendedAction']['label']}")
JavaScript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/lead-scoring-engine").call({
    datasetId: "aBcDeFgHiJkLmNoP",
    targetIndustries: ["Marketing Agency", "Digital Agency"],
    targetCompanySizes: ["11-50", "51-200"],
    targetServices: ["SEO", "PPC", "Content Marketing"],
    minScoreToInclude: 65,
    outputSortedByScore: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    if (item.recordType === "summary") {
        console.log(`Run complete — avg ${item.averageScore}, qualified ${item.coverage.qualified}, mode=${item.modeUsed}, persona=${item.personaUsed}`);
    } else if (item.recordType === "lead") {
        console.log(`${item.companyName} — ${item.icpScore} (${item.icpGrade}) → ${item.decision}: ${item.recommendedAction.label}`);
    }
}
cURL
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~lead-scoring-engine/runs?token=YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"datasetId": "aBcDeFgHiJkLmNoP",
"targetIndustries": ["Marketing Agency", "Digital Agency"],
"targetCompanySizes": ["11-50", "51-200"],
"targetServices": ["SEO", "PPC", "Content Marketing"],
"minScoreToInclude": 65,
"outputSortedByScore": true
}'
# Fetch results (replace DATASET_ID from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
How Lead Scoring Engine works
Weight normalisation
When the actor starts, normaliseWeights() in scorer.ts accepts the six raw weight inputs and forces them to sum exactly to 100. It clamps each value to zero minimum, sums all six, and scales each proportionally using weight = rawWeight / total * 100, rounded to one decimal place. If all weights are zero (an edge case), the defaults (25/20/20/15/10/10) are restored. This means you can set weights like 3/2/2/1/1/1 and the engine will scale them correctly to 30/20/20/10/10/10 — no manual arithmetic required.
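The normalisation rule can be sketched in a few lines of Python. This is an illustration of the described behaviour, not the actor's `scorer.ts` source; the function name and dictionary keys are assumptions. Note that plain one-decimal rounding, as used here, can leave the total a tenth off 100 for some inputs, so the real implementation presumably handles that remainder in a way this sketch does not.

```python
DEFAULT_WEIGHTS = {
    "industry": 25, "companySize": 20, "services": 20,
    "contactPresence": 15, "intentSignals": 10, "dataCompleteness": 10,
}

def normalise_weights(raw):
    """Clamp negatives to zero, then rescale the six weights to a 0-100 share."""
    clamped = {key: max(0.0, float(raw.get(key, 0))) for key in DEFAULT_WEIGHTS}
    total = sum(clamped.values())
    if total == 0:
        # Edge case from the text: all-zero input restores the defaults.
        return dict(DEFAULT_WEIGHTS)
    return {key: round(value / total * 100, 1) for key, value in clamped.items()}

print(normalise_weights({
    "industry": 3, "companySize": 2, "services": 2,
    "contactPresence": 1, "intentSignals": 1, "dataCompleteness": 1,
}))
```

Passing the 3/2/2/1/1/1 example from the paragraph above yields the 30/20/20/10/10/10 split it describes.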
Per-dimension scoring
Each of the six dimension modules (dimensions/industry.ts, dimensions/company-size.ts, etc.) receives the raw lead record and the relevant ICP targets, and returns a DimensionResult object with a score (0–100), a maxScore (always 100), and a notes array.
- Industry: reads `industry`, `vertical`, `category`, `niche`, and `services` fields; resolves targets through a 10-group alias map; returns 100 for exact match, 60 for partial substring match, 50 for no-target (neutral), 0 for no match
- Company size: parses numeric counts, range strings ("11-50"), and text labels ("mid-market") into one of 8 employee bands; returns 100 for exact band match, 50 for adjacent band (±1 position on the ordered band list), 0 for all others
- Services: expands each target through a 15-service synonym library; checks `services` and `techStack` fields; scores 100 for 3+ matches, 80 for 2, 50 for 1, 0 for none
- Contact presence: additive scoring, with 40 pts for validated email (regex `/^[^@\s]+@[^@\s]+\.[^@\s]+$/`), 20 pts for named contact or LinkedIn URL, 20 pts for phone, 20 pts for 2+ named contacts or a titled contact
- Intent signals: additive scoring, with 30 pts for rating ≥4.5 or reviewCount ≥100, 25 pts for active hiring (jobPostings or hiringCount), 25 pts for chat widget or contact form, 20 pts for intent/tech keywords; returns neutral 50 if no signal data present
- Data completeness: 5 groups × 20 pts each, covering identity (domain/website), company name, contact info (email or phone), location (address/city/country), and profile (description, foundedYear, revenue, or minProjectSize)
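As an illustration of the additive dimensions, here is a hedged Python sketch of the contact presence rule. The point values and the email regex come from the list above; everything else is an assumption: the function name, the `linkedinUrl` field name, and the choice to count emails on nested `contacts` entries as well as the top-level `emails` list.

```python
import re

# Same pattern the contact presence rule quotes.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def contact_presence(lead):
    """Additive 0-100 score: 40 email + 20 named/LinkedIn + 20 phone + 20 depth."""
    contacts = lead.get("contacts") or []
    # Assumption: contact-level emails count alongside the top-level list.
    emails = list(lead.get("emails") or []) + [c.get("email") or "" for c in contacts]
    named = [c for c in contacts if c.get("name")]
    titled = [c for c in contacts if c.get("title")]

    score, notes = 0, []
    if any(EMAIL_RE.match(email) for email in emails):
        score += 40
        notes.append("valid email present")
    if named or lead.get("linkedinUrl"):  # hypothetical field name
        score += 20
        notes.append("named contact or LinkedIn URL")
    if lead.get("phones"):
        score += 20
        notes.append("phone present")
    if len(named) >= 2 or titled:
        score += 20
        notes.append("2+ named contacts or a titled contact")
    return {"score": score, "maxScore": 100, "notes": notes}
```

The return value mirrors the `DimensionResult` shape described above (`score`, `maxScore`, `notes`), which keeps every dimension interchangeable at the assembly step.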
Score assembly and grading
The final icpScore is computed as the sum of each dimension's proportional contribution: (dimensionScore / 100) * dimensionWeight. The six weighted values are summed and rounded to one decimal place. The grade thresholds are fixed: A ≥80, B ≥65, C ≥50, D ≥35, F <35. Both the score and grade are written directly onto the lead record alongside the per-dimension icpFactors object and the icpNotes array.
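A worked Python sketch of the assembly and grading step, using the formula and thresholds stated above. The function names are illustrative, and for brevity the weights are keyed by the same names as `icpFactors`, which is an assumption about how the two are paired.

```python
GRADE_FLOORS = (("A", 80), ("B", 65), ("C", 50), ("D", 35))

def assemble_score(factors, weights):
    """Sum (dimensionScore / 100) * dimensionWeight and round to one decimal."""
    total = sum(factors[name] / 100 * weights[name] for name in weights)
    return round(total, 1)

def grade(score):
    """Map a 0-100 score to a letter using the fixed thresholds."""
    for letter, floor in GRADE_FLOORS:
        if score >= floor:
            return letter
    return "F"

weights = {"industryMatch": 25, "companySizeMatch": 20, "servicesMatch": 20,
           "contactPresence": 15, "intentSignals": 10, "dataCompleteness": 10}
factors = {"industryMatch": 100, "companySizeMatch": 50, "servicesMatch": 80,
           "contactPresence": 60, "intentSignals": 50, "dataCompleteness": 80}
score = assemble_score(factors, weights)
print(score, grade(score))
```

In this made-up lead the contributions are 25 + 10 + 16 + 9 + 5 + 8, so an adjacent-band size match and a partial contact record are enough to drop an otherwise strong lead from A to B.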
Dataset loading and PPE charging
When datasetId is provided, the actor paginates through the dataset in 1,000-item batches using the Apify client's listItems with limit and offset parameters, accumulating records up to maxLeads. The run proceeds in two phases. Phase 1 scores every lead in memory (deterministic, no I/O, no charging). Phase 2 sorts the cohort, then iterates lead by lead, pushing each record to the dataset and charging the PPE event "lead-scored" only after the push succeeds. If eventChargeLimitReached flips true mid-batch, the actor stops cleanly and writes spendingLimitReached: true in the summary. Charge-after-push means a network or platform failure cannot leave the customer paying for output that didn't arrive.
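The loading loop and the charge-after-push ordering can be sketched in Python with the platform calls injected as plain callables. `list_items`, `push`, and `charge` are hypothetical stand-ins for the Apify client and SDK calls, and `charge` is assumed to return `True` once the spending limit has been reached.

```python
def load_leads(list_items, max_leads, batch_size=1000):
    """Page through a dataset with offset/limit until max_leads or exhaustion."""
    leads, offset = [], 0
    while len(leads) < max_leads:
        page = list_items(offset=offset, limit=batch_size)
        if not page:
            break
        leads.extend(page[: max_leads - len(leads)])
        offset += len(page)
    return leads

def push_and_charge(scored, push, charge):
    """Phase 2: push each record first, charge second, stop when the cap is hit."""
    pushed = 0
    for record in sorted(scored, key=lambda r: r["icpScore"], reverse=True):
        push(record)                      # if this raises, nothing is charged
        pushed += 1
        if charge("lead-scored"):         # assumed to return the limit-reached flag
            return pushed, True
    return pushed, False
```

Because `push` runs before `charge`, an exception during the push leaves the loop before any charge for that record is recorded, which is the guarantee the paragraph above describes.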
Decision layer (v1.1)
After dimension scoring, every record passes through a decision-derivation layer:
- Decision: score is mapped to `qualify` / `nurture` / `disqualify` against `qualifyThreshold` (default 65) and `disqualifyThreshold` (default 35).
- Confidence: five weighted components (industry signal, company-size signal, contact presence, data completeness, intent) produce a 0-1 score, banded into `high` / `medium` / `low` / `very-low`. Cohorts of <25 leads have confidence capped at 0.5 (cold-start protection).
- Recommended action: derived from decision + persona + dataGaps. If the lead is qualifiable but missing an email, action is `enrich-first` (with the right enrichment actor named in the fix plan). If qualified and persona is `account-exec`, action is `personalised-outreach`.
- DryRun task: a universal task object is built from the action: `{ id: hash(eventId+actionId), kind, target: domain, payload, owner, deadline, dryRun: true }`.
- Agent contract: compact surface for downstream agents: `{ decision, confidence, nextAction, costToAct }`.
- Cohort pass: after all leads are scored, cohort statistics (mean, stdev, p25/p75/p90) and `percentileInCohort` per lead are computed; `priorityRank` is assigned after sort.
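The first two steps of the decision layer can be sketched as follows. The thresholds and bands are the documented ones; the component weights are copied from the example output record, and combining them as a simple weighted sum is an assumption of this sketch (the example record reports 0.91 where a plain sum would give 1.0, so the real calculation evidently includes adjustments not documented here).

```python
CONFIDENCE_WEIGHTS = {  # taken from the `confidence.components` example
    "industrySignal": 0.25, "companySizeSignal": 0.15, "contactPresence": 0.25,
    "dataCompleteness": 0.20, "intentSignals": 0.15,
}

def derive_decision(score, qualify_threshold=65, disqualify_threshold=35):
    """At or above qualify -> qualify; below disqualify -> disqualify; else nurture."""
    if score >= qualify_threshold:
        return "qualify"
    if score < disqualify_threshold:
        return "disqualify"
    return "nurture"

def confidence_score(components, cohort_size):
    """Weighted sum of 0-1 component values, capped at 0.5 for cold-start cohorts."""
    raw = sum(w * components.get(name, 0) for name, w in CONFIDENCE_WEIGHTS.items())
    if cohort_size < 25:
        raw = min(raw, 0.5)
    return round(raw, 2)

def confidence_level(score):
    """Band a 0-1 confidence score into the documented levels."""
    if score >= 0.8:
        return "high"
    if score >= 0.6:
        return "medium"
    if score >= 0.4:
        return "low"
    return "very-low"
```

One consequence of the cold-start cap is visible directly in the bands: a run with fewer than 25 leads can never report `high` or `medium` confidence, whatever the individual signals look like.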
Mode and persona presets
mode shapes WHAT scoring optimises for: fast (industry+size, ignore completeness), balanced (default 25/20/20/15/10/10), thorough (intent + completeness up-weighted). auto picks based on cohort size + estimated data richness.
persona shapes WHO scoring serves: outbound-sdr (contact + intent), account-exec (industry + size), growth-marketer (intent + completeness). When set, persona weights are averaged with mode weights, and per-dimension weight* overrides win.
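A hedged Python sketch of the resolution order described above: average mode and persona weights, then let explicit `weight*` overrides win. The persona numbers below are hypothetical; they are not published in this document and were chosen only so that blending them with the `balanced` preset reproduces the `weightsUsed` object shown in the run summary example.

```python
BALANCED = {"industry": 25, "companySize": 20, "services": 20,
            "contactPresence": 15, "intentSignals": 10, "dataCompleteness": 10}

# Hypothetical outbound-sdr preset (contact + intent up-weighted).
OUTBOUND_SDR = {"industry": 20, "companySize": 15, "services": 15,
                "contactPresence": 25, "intentSignals": 15, "dataCompleteness": 10}

def resolve_weights(mode_weights, persona_weights=None, overrides=None):
    """Average mode and persona weights, then apply per-dimension overrides."""
    if persona_weights is None:
        blended = dict(mode_weights)
    else:
        blended = {key: (mode_weights[key] + persona_weights[key]) / 2
                   for key in mode_weights}
    for key, value in (overrides or {}).items():
        if value is not None:  # a blank override keeps the blended value
            blended[key] = value
    return blended

print(resolve_weights(BALANCED, OUTBOUND_SDR))
```

After overrides are applied the six values may no longer sum to 100, which is where the normalisation step described under Weight normalisation comes in.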
Stable enum tokens (additive across minor versions)
| Field | Tokens |
|---|---|
| `decision` | qualify, nurture, disqualify |
| `confidenceLevel` | high, medium, low, very-low |
| `recommendedAction.actionId` | outreach-now, personalised-outreach, nurture-campaign, enrich-first, skip |
| `task.kind` | outreach, nurture, enrich, archive |
| `send` | yes, no, hold |
| `mode` (input) | auto, fast, balanced, thorough |
| `persona` (input) | generic, outbound-sdr, account-exec, growth-marketer |
| `outputProfile` (input) | minimal, standard, full, llm |
| `recordType` | lead, summary, error |
| `severity` (in warnings, notifications) | critical, warning, info |
Tips for best results
- Define at least `targetIndustries` and `targetCompanySizes` before anything else. These two dimensions account for 45 points at default weights and have the largest impact on final scores. An ICP with only these two configured will already produce meaningful lead tiers.
- Use `minScoreToInclude: 50` for most pipeline use cases. Leads scoring below 50 (grade D or F) rarely convert without further qualification. Filtering them at scoring time reduces CRM clutter and downstream enrichment costs.
- Increase `weightContactPresence` for cold outreach campaigns. If your SDRs need an email or phone to reach out, raise this weight to 25 or 30. Leads with no contact data will score significantly lower and sort below actionable leads.
- Increase `weightIntentSignals` for growth-focused targeting. If you sell to companies that are actively scaling, raise the intent weight to 20 or 25. Leads with active hiring, high review volume, and engagement tools will rank higher than equivalently-sized but dormant companies.
- Chain with Waterfall Contact Enrichment after scoring. Run enrichment only on A and B grade leads (pass `minScoreToInclude: 65`). This cuts enrichment cost by 50–70% compared to enriching the entire raw list.
- Use the `icpNotes` field to diagnose score distribution issues. If most leads are scoring 30–40 and you expect higher, read the notes on a few low-scoring records. A common cause is `targetCompanySizes` bands that don't match the size format in your lead data. For example, passing "small" when the lead has `employeeCount: 45` works, but passing "SMB" with no numeric count does not.
- Re-score the same dataset with different weights to test ICP hypotheses. Because the engine is deterministic, you can run the same `datasetId` twice with different weight configs and compare grade distributions in the summary records to see which ICP definition produces the most A-grade leads from your existing data.
- Set `outputSortedByScore: false` if you need to preserve the original lead order, for example when the upstream dataset is sorted by Google Maps ranking or scraping order and you want to maintain that sequence for reporting.
Use in Dify
Drop this actor into Dify workflows via the Apify plugin's Run Actor node. Each lead returns scored, classified, and recommended as structured JSON — qualify / nurture / disqualify plus a shouldAct boolean and a typed recommendedAction.actionId your downstream node branches on. Competitor pipelines pointed at the same lead list return raw scraped fields; this returns decisions you can route, gate, and ticket-fill from directly.
- Actor ID: `ryanclinton/lead-scoring-engine`
- Sample input (score an upstream scraper's dataset for an outbound SDR team):
{
"datasetId": "aBcDeFgHiJkLmNoP",
"mode": "auto",
"persona": "outbound-sdr",
"outputProfile": "standard",
"targetIndustries": ["Marketing Agency", "Digital Agency"],
"targetCompanySizes": ["11-50", "51-200"],
"targetServices": ["SEO", "PPC", "Content Marketing"],
"qualifyThreshold": 65,
"minScoreToInclude": 50,
"csvExport": true
}
Dify if/else routing
A single Dify if/else node branches on decision and routes to the right downstream actor:
| Branch condition | Action |
|---|---|
| `decision == "qualify"` AND `shouldAct == true` | Run AI Outreach Personalizer on the lead. The actor's recommendedAction.executionHint.targetActorSlug already names this. |
| `decision == "qualify"` AND `recommendedAction.actionId == "enrich-first"` | Run Lead Enrichment Pipeline on the lead first; re-score after enrichment. |
| `decision == "nurture"` | Push to the marketing nurture list (HubSpot, Customer.io, etc.). |
| `decision == "disqualify"` | Archive; skip outreach. |
| Summary record `recordType == "summary"` AND `decisionReadiness == "insufficient-data"` | Stop the workflow: cohort too small for confidence. Re-run with ≥25 leads. |
The full recommendedAction object is usable verbatim in a Dify Code node — recommendedAction.label becomes a Slack alert message, recommendedAction.owner becomes the assignee tag, recommendedAction.eta becomes the deadline, and task.id becomes a stable idempotency key for ticket creation.
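For readers wiring the same branching outside Dify, or inside a Dify Code node, the routing table can be expressed as one small Python function. The branch conditions follow the table; the returned route labels are hypothetical names for this sketch, not values the actor emits.

```python
def route(record):
    """Return a downstream route label for one dataset record."""
    if record.get("recordType") == "summary":
        if record.get("decisionReadiness") == "insufficient-data":
            return "stop-workflow"        # cohort too small; re-run with >= 25 leads
        return "ignore-summary"

    decision = record.get("decision")
    action_id = (record.get("recommendedAction") or {}).get("actionId")

    if decision == "qualify" and record.get("shouldAct") is True:
        return "ai-outreach-personalizer"
    if decision == "qualify" and action_id == "enrich-first":
        return "lead-enrichment-pipeline"  # re-score after enrichment
    if decision == "nurture":
        return "nurture-list"
    if decision == "disqualify":
        return "archive"
    return "manual-review"                 # anything the table does not cover
```

The final fallback is a design choice of the sketch: a qualified lead with `shouldAct` false and no `enrich-first` action matches none of the table rows, so it is sent to a human rather than silently dropped.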
Opt-in modes Dify workflows can leverage
- `mode: "auto"`: Dify workflows usually pass heterogeneous batches; auto-resolve picks the right preset per call.
- `outputProfile: "llm"`: emits agent-optimised records with `summary`, `whyThisMatters`, `whyNow`, `openingAngle`, and `scoringTrace` only. Drops cleanly into a Dify LLM node prompt without preprocessing.
- `outputProfile: "minimal"`: emits decision + action only. Keeps Dify variable size small for high-throughput branching.
- `csvExport: true`: `OUTPUT.csv` lands in the run's Key-Value Store with Apollo / Outreach.io / Salesloft compatible columns. Read via the KV node and push directly to Apollo's CSV import endpoint.
The recommendedAction action playbook is usable verbatim — no LLM rewriting required. The task object is wire-compatible with the universal task schema (id, kind, target, payload, owner, deadline, dryRun: true) consumed by Jira / Linear / GitHub Issues integrations downstream.
Combine with other Apify actors
| Actor | How to combine |
|---|---|
| B2B Lead Gen Suite | Full pipeline: pass the output dataset ID from B2B Lead Gen Suite directly as datasetId to score every scraped lead against your ICP in one chained run |
| Google Maps Email Extractor | Extract local business leads with emails from Google Maps, then score the output dataset to identify which local businesses match your agency's ICP |
| Website Contact Scraper | Scrape contact details from a list of company websites, then score the enriched records to prioritise outreach by ICP fit |
| Waterfall Contact Enrichment | Run scoring first; pass only A and B grade leads (score ≥65) into enrichment to reduce enrichment cost by 50–70% |
| HubSpot Lead Pusher | Push scored leads with icpScore and icpGrade fields into HubSpot as contact properties, then use HubSpot workflows to route by grade |
| Bulk Email Verifier | After scoring, verify emails only on A and B grade leads before handing to SDRs — avoids bounce rates from low-quality contacts |
| B2B Lead Qualifier | Use alongside this actor for a 30-signal deep-qualification pass on your top-scoring leads; Lead Scoring Engine provides the first filter, B2B Lead Qualifier provides the deep profile |
| Lead Enrichment Pipeline | All-in-one Clay alternative: email discovery, verification, company research, and scoring in one run ($0.12/lead) |
| AI Outreach Personalizer | Generate personalized cold emails using your own OpenAI/Anthropic key — zero AI markup ($0.01/lead) |
| Intent Signal Tracker | Track buying signals: hiring, tech changes, funding, content updates. Prioritize outreach by intent score ($0.05/company) |
| Lead Data Quality Auditor | Audit lead data quality before outreach — email verification, phone validation, domain freshness ($0.005/record) |
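The score-first pattern described in the Waterfall Contact Enrichment row can be sketched as a simple partition step between the two runs. This is an illustrative helper, not part of either actor: it assumes each scored record carries a numeric icpScore field and uses the score ≥65 cut-off quoted in the table for A and B grades.

```python
def split_for_enrichment(scored_leads, min_score=65):
    """Partition scored leads into those worth enriching and the rest."""
    keep, skip = [], []
    for lead in scored_leads:
        # Leads without an icpScore are treated as 0 and skipped.
        if lead.get("icpScore", 0) >= min_score:
            keep.append(lead)
        else:
            skip.append(lead)
    return keep, skip
```

Only the `keep` list would be passed on to enrichment or email verification; the `skip` list can be archived or sent to nurture.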
Limitations
- No live data enrichment: the actor scores only the fields already present in the lead record. If a lead is missing industry or companySize, those dimensions return 0 or neutral rather than fetching the data from an external source. Use Waterfall Contact Enrichment upstream to add missing fields before scoring.
- Industry matching is text-based: the fuzzy match works against the 10 built-in alias groups. Industry terms outside those groups must match verbatim (exact or partial substring). Highly niche verticals (e.g. "maritime logistics", "precision agriculture") may not match aliases and will require exact string configuration.
- Intent signals require upstream data: the intent dimension scores neutrally (50) when no signal fields are present in the lead record. Rating, review count, job postings, and chat widget data must be scraped upstream (e.g. by Google Maps Email Extractor or Website Contact Scraper) and included in the lead object.
- Company size bands are fixed: the 8 bands (1–10, 11–50, 51–200, 201–500, 501–1000, 1001–5000, 5001–10000, 10001+) cannot be customised. Employee thresholds that do not line up with these band boundaries cannot be expressed.
- No deduplication: if the input leads array or dataset contains duplicate domain entries, each will be scored and charged separately. Deduplication should happen upstream.
- Maximum 10,000 leads per run by default: set maxLeads up to 100,000 if needed. Very large runs remain possible on 256 MB memory due to the streaming pagination approach.
- No partial tech stack matching within a single string: tech stack matching checks for the target term as a substring of the combined lead text. If a lead stores tech as "shopify-plus-theme" and you target "Shopify", it will match. But deeply abbreviated or encoded tech strings may not resolve correctly.
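Because duplicates are scored and charged separately, an upstream deduplication pass is worth sketching. The helper below is a minimal example under stated assumptions: the lead field holding the domain is assumed to be called `domain` (a hypothetical name; adjust it to your data), and normalisation is limited to lower-casing and stripping a leading "www.".

```python
def dedupe_by_domain(leads, key="domain"):
    """Keep the first lead seen for each normalised domain."""
    seen = set()
    unique = []
    for lead in leads:
        domain = str(lead.get(key) or "").strip().lower()
        if domain.startswith("www."):
            domain = domain[4:]
        if not domain:
            # Nothing to compare on, so keep the lead rather than guess.
            unique.append(lead)
            continue
        if domain in seen:
            continue
        seen.add(domain)
        unique.append(lead)
    return unique
```

Running this before scoring keeps the lead-scored charge count equal to the number of distinct companies in the input.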
Integrations
- Zapier: trigger a lead scoring run when new leads are added to a Google Sheet or CRM, then write scored results back automatically
- Make: build multi-step scenarios that scrape leads, score them, filter by grade, and push A/B leads to HubSpot or ActiveCampaign
- Google Sheets: export scored leads directly to a Sheet for manual review, sorting by the icpScore column to surface top prospects
- Apify API: chain scoring runs programmatically using the dataset ID from any upstream run; retrieve results in JSON, CSV, or XLSX format
- Webhooks: trigger downstream actions (Slack notification, CRM push, email alert) when a run completes and the summary shows more than N grade-A leads
- LangChain / LlamaIndex: feed scored lead data into an AI agent that generates personalised outreach copy ranked by icpScore, targeting only A and B grade leads
Troubleshooting
Most leads scoring 0 on the industry dimension despite having industry data. Check that your targetIndustries values match the format in the lead record. The engine checks industry, vertical, category, niche, and services fields. If the lead stores industry as "Digital Marketing" and you target "Marketing Agency", the fuzzy alias resolves correctly. But if the lead has no industry field at all, the dimension returns 0. Inspect icpNotes[0] on a low-scoring record — it will state exactly which fields were read and what the target was.
Company size dimension returning 0 despite employee data present. The actor reads employeeCount, teamSize, employees, headcount, and companySize fields. Ensure at least one of these is present and contains a number or a parseable string like "45 employees", "11-50", or "mid-market". A field named staff or team_size (snake_case) will not be read — map it to a supported field name upstream.
spendingLimitReached: true in the summary record. The PPE spending cap was hit before all leads were processed. Leads processed before the limit was reached are in the dataset. Either increase the per-run spending limit in the Apify console, or reduce the number of leads per run by lowering maxLeads.
Run completes but output dataset is empty. This happens when minScoreToInclude is set too high for the data quality of the input leads. Check the summary record (it is always written regardless of the filter) for the grade distribution. If all leads are grade F or D, lower minScoreToInclude to 0, inspect the icpNotes on a sample of records, and adjust your ICP configuration accordingly.
Charge count is lower than total input. Expected behaviour as of v1.1 — minScoreToInclude filters BEFORE charging, and the charge fires only after a lead is pushed to the dataset. Total charges therefore equal totalPushed in the summary record, not totalInput. If you want every input lead charged regardless of grade, set minScoreToInclude: 0 (the default).
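The charging rule above reduces to simple arithmetic, sketched here for budgeting purposes. This is an estimate only: it assumes a lead is included when its score is greater than or equal to minScoreToInclude (the exact comparison is not stated here) and uses the $0.03 lead-scored price from the pricing table.

```python
PRICE_PER_LEAD_USD = 0.03  # lead-scored event price


def estimate_charge(scores, min_score_to_include=0):
    """Return (leads pushed, estimated cost in USD) for a list of ICP scores."""
    # Only leads that pass the filter are pushed, and only pushed leads are charged.
    pushed = sum(1 for score in scores if score >= min_score_to_include)
    return pushed, round(pushed * PRICE_PER_LEAD_USD, 2)
```

With the default filter of 0, every scored lead is pushed, so the estimate equals the input count multiplied by the event price.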
Responsible use
- This actor processes only the lead data you supply — it does not scrape any websites or call any external APIs.
- When using scored lead data for outreach, comply with GDPR, CAN-SPAM, CASL, and other applicable data protection laws in your jurisdiction.
- Do not use scored lead data for spam, harassment, or unsolicited contact outside the terms of service of your outreach platform.
- Ensure you have a lawful basis for processing personal data (including email addresses) contained in the lead records you supply.
- For guidance on web scraping legality, see Apify's guide.
FAQ
How does lead scoring against an ICP work? The actor evaluates each lead across six dimensions — industry match, company size, services alignment, contact presence, intent signals, and data completeness — and combines weighted dimension scores into a single 0–100 number. Dimension weights default to 25/20/20/15/10/10 and can be customised. The final score determines a letter grade (A through F) using fixed thresholds.
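The weighted combination can be illustrated with a short sketch. It is not the actor's implementation: the dimension key names are hypothetical, the default weights 25/20/20/15/10/10 are assumed to follow the order in which the dimensions are listed, and rounding to the nearest integer is an assumption. Grade thresholds are not modelled.

```python
# Assumed mapping of the default weights onto the six dimensions.
DEFAULT_WEIGHTS = {
    "industry": 25,
    "companySize": 20,
    "services": 20,
    "contactPresence": 15,
    "intentSignals": 10,
    "dataCompleteness": 10,
}


def weighted_score(factors, weights=DEFAULT_WEIGHTS):
    """Combine per-dimension 0-100 scores into a single 0-100 score."""
    total_weight = sum(weights.values())
    raw = sum(factors[name] * weight for name, weight in weights.items())
    # Dividing by the total keeps the result on a 0-100 scale even if
    # custom weights do not sum to 100.
    return round(raw / total_weight)
```

For example, dimension scores of 100, 100, 60, 80, 50 and 90 (in the order above) combine to 83 under the default weights.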
How many leads can I score in one run? Up to 100,000 (set via maxLeads). The default cap is 10,000. The actor paginates large datasets in 1,000-item batches to stay within the 256 MB memory allocation.
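The batching idea can be sketched as a generator that never holds more than one page in memory. This is an illustration of the general technique only: `fetch_page` is a hypothetical callable taking an offset and a limit and returning a list of items, standing in for whatever dataset reader is used.

```python
from typing import Callable, Iterator


def iter_batches(
    fetch_page: Callable[[int, int], list],
    batch_size: int = 1000,
    max_leads: int = 10_000,
) -> Iterator[list]:
    """Yield pages of at most batch_size items until max_leads or the data ends."""
    offset = 0
    while offset < max_leads:
        limit = min(batch_size, max_leads - offset)
        page = fetch_page(offset, limit)
        if not page:
            break
        yield page
        offset += len(page)
        if len(page) < limit:
            # A short page means the source has no more items.
            break
```

Each yielded page can be scored and pushed before the next one is fetched, which is what keeps peak memory roughly constant regardless of dataset size.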
Does lead scoring require exact field names? The actor reads a defined set of field names (listed in the Troubleshooting section). If your upstream scraper uses different names (e.g. staff_count instead of employeeCount), map the fields before passing leads to the scoring engine. Renaming can be done in a Make/Zapier step or with a lightweight transformation actor.
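A field-renaming step of this kind might look like the sketch below. The mapping is an example only: the source names (staff_count, team_size, staff) are the unsupported names mentioned in this document, and the targets are supported names from the Troubleshooting section; extend the map to suit your own scraper output.

```python
# Example mapping from unsupported field names to supported ones.
FIELD_MAP = {
    "staff_count": "employeeCount",
    "team_size": "teamSize",
    "staff": "employees",
}


def normalise_fields(lead, field_map=FIELD_MAP):
    """Return a copy of the lead with unsupported field names renamed."""
    out = {key: value for key, value in lead.items() if key not in field_map}
    for source, target in field_map.items():
        # Never overwrite a supported field that is already populated.
        if source in lead and out.get(target) in (None, ""):
            out[target] = lead[source]
    return out
```

The original lead is left untouched, so the step is safe to apply to every record in a batch before scoring.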
What happens if I don't configure any ICP targets? Dimensions with no configured targets return a neutral score of 50. If all dimensions return neutral, the final score is 50 (grade C). This is useful for testing the pipeline before you have a defined ICP — all leads score similarly and the output reflects contact presence and data completeness only.
How is Lead Scoring Engine different from B2B Lead Qualifier? Lead Scoring Engine scores any lead record you supply against a configurable ICP using six dimensions. It is a computation layer in a pipeline, not a data source. B2B Lead Qualifier fetches and analyses 30+ signals from external sources about a company. The two work best together: score first to identify which leads are worth deeper qualification, then run B2B Lead Qualifier on grade-A leads only.
Can I use custom ICP dimensions beyond the six built-in ones? Not currently. The six dimensions (industry, company size, services, contact presence, intent signals, data completeness) cover the most common B2B qualification criteria. If you need additional dimensions — for example, a revenue threshold or geographic filter — pre-filter your leads before passing them to the actor.
How accurate is the industry matching? Exact matches (target term equals a field value verbatim or via alias) are reliable. Partial matches (target term appears as a substring of combined field text) score 60 and are correct most of the time but can produce false positives: for example, targeting "SEO" would partially match a company description mentioning "Seoul". For high-precision industry filtering, ensure your leads have a dedicated industry field from upstream enrichment.
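The difference between exact and partial matching, and why short target terms are prone to false positives, can be shown with a small sketch. It is a simplified stand-in rather than the actor's matcher: alias resolution is omitted, and case-insensitive comparison is an assumption.

```python
def match_kind(target, field_values):
    """Classify a target term against lead field values: exact, partial, or none."""
    needle = target.strip().lower()
    values = [value.strip().lower() for value in field_values if value]
    if needle in values:
        return "exact"  # the term equals a whole field value
    if needle in " ".join(values):
        return "partial"  # the term appears somewhere inside the combined text
    return "none"
```

A short term such as "SEO" is classified as a partial match for any text that happens to contain those three letters inside a longer word, which is why a dedicated industry field gives cleaner results than free-text descriptions.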
Is it legal to score leads using this actor? Yes — the actor processes data you supply and makes no external data requests. The legality of your outreach depends on how you obtained the lead data and how you use it, not on the scoring computation itself. Ensure your lead acquisition and outreach comply with GDPR, CAN-SPAM, and relevant local laws.
Can I schedule lead scoring to run automatically? Yes. Use Apify's built-in scheduling to trigger a scoring run daily or weekly. Point datasetId at the output of a scheduled upstream actor (e.g. Google Maps Email Extractor) and the scoring run will process new leads automatically as they are collected.
How long does a typical run take? Scoring 100 leads takes under 10 seconds. Scoring 1,000 leads takes approximately 30–60 seconds, primarily due to actor startup time. Scoring 10,000 leads takes 2–4 minutes. If datasetId points to a large dataset, loading time adds 10–20 seconds per 10,000 records retrieved.
What is the difference between icpScore and icpFactors? icpScore is the final weighted score (0–100) used for grading and sorting. icpFactors contains the raw 0–100 score for each dimension before weight application — useful for diagnosing which specific dimension is dragging a lead's score down.
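Diagnosing a low score from icpFactors can be as simple as sorting the dimensions. The sketch below assumes icpFactors is a mapping of dimension name to a 0-100 score; the key names used in the example are hypothetical.

```python
def weakest_dimensions(icp_factors, n=2):
    """Return the n lowest-scoring (dimension, score) pairs, weakest first."""
    return sorted(icp_factors.items(), key=lambda item: item[1])[:n]


example = {"industry": 100, "companySize": 0, "services": 60, "contactPresence": 80}
print(weakest_dimensions(example))  # [('companySize', 0), ('services', 60)]
```

In this example the company size dimension is the one to fix first, typically by adding a supported employee-count field upstream.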
Can I run this actor from a Cursor or Claude workflow via MCP? Yes. The Apify platform exposes actor runs through the Apify MCP server. You can call this actor from any LLM agent that supports MCP tool calls, passing leads inline and receiving scored results in the same response cycle.
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom ICP configurations, pipeline integrations, or enterprise scoring volumes, reach out through the Apify platform.
Related actors
AI Cold Email Writer — $0.01/Email, Zero LLM Markup
Generates personalized cold emails from enriched lead data using your own OpenAI or Anthropic key. Subject line, body, CTA, and optional follow-up sequence — $0.01/email, zero LLM markup.
AI Outreach Personalizer — Emails with Your LLM Key
Generate personalized cold emails using your own OpenAI or Anthropic API key. Subject lines, opening lines, full bodies — tailored to each lead's role, company, and signals. $0.01/lead compute + your LLM costs. Zero AI markup.
B2B Lead Generation Suite - Find Emails, Score & Qualify Leads
All-in-one B2B lead pipeline. Enter company URLs, get enriched leads with emails, phone numbers, contacts, email patterns, quality scores (0-100), grades, and business signals from a 3-step automated pipeline.
B2B Lead Qualifier - Score & Rank Company Leads
B2B lead scoring tool and API that scores companies 0-100 from 30+ website signals. 5 scoring categories, 4 profiles (sales, marketing, recruiting, default). Plain-English explanations, hiring detection, industry classification, score change tracking. $0.15/lead, no subscription.
Ready to try Lead Scoring Engine — ICP Score Leads 0-100?
Run it on your own Apify account. Apify offers a free tier with $5 of monthly credits.