Job Market Intelligence
Job Market Intelligence is an Apify actor available on ApifyForge at $0.50 per generated report (the report-generated event). It aggregates remote job listings from Remotive, Arbeitnow, Jobicy, and HN Who's Hiring, then analyzes skill-demand rankings, salary benchmarks, top hiring companies, and remote-work statistics. No API keys are needed, and results export to JSON or CSV.
Best for teams who need automated job market intelligence data extraction and analysis.
Not ideal for use cases requiring real-time streaming data or sub-second latency.
What to know
- Results depend on the availability and structure of upstream data sources.
- Large-scale runs may be subject to platform rate limits.
- Requires an Apify account — free tier available with limited monthly usage.
Maintenance Pulse
90/100, last verified March 27, 2026. Built by Ryan Clinton (ryanclinton on Apify).
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| report-generated | Charged per market intelligence report. Aggregates jobs from 4 sources with skill extraction, salary parsing, deduplication, and market analysis. | $0.50 |
Example: 100 events = $50.00 · 1,000 events = $500.00
Documentation
A decision engine for labor markets that turns job listings into career decisions, hiring strategies, salary benchmarks, and market intelligence. It aggregates job listings from four free data sources and deduplicates them with normalized title matching; classifies each role with seniority / compensation / recommended-action enums; segments analytics by location / seniority / remote; and tracks trends across scheduled runs. It also classifies the cohort into a market regime (expansion / contraction / stagnation / volatility), maps every top skill to a lifecycle stage (emerging / mainstream / saturated / declining / stable), flags trade-offs between conflicting actions, and ships a recommendedActions[] array that tells you what to do — all without any API keys.
The actor queries Remotive, Arbeitnow, Jobicy, and Hacker News "Who's Hiring" threads in parallel, normalizes the results into a single schema, and applies your filters (location, company, date, remote-only). It then enriches each listing with decision-ready classifications; computes market signals, data-quality auditability, and per-segment breakdowns; optionally diffs against the previous run for trend insights; classifies the regime, skill trajectories, threshold-crossing events, and conflicting-action tensions; and pushes both the analytics report and the per-job records to the Apify dataset.
What this is
- A job market intelligence engine that turns job listings into decisions
- A salary benchmarking and hiring strategy tool for recruiters and talent leaders
- A career decision tool for job seekers (apply / research / skip / learn-skill routing)
- A labor market analytics system with regime classification, trend tracking, and threshold-crossing event signals
- A job data → strategy layer for automation workflows (Dify / n8n / Zapier / Make)
- An alternative to LinkedIn Talent Insights / Lightcast / Burning Glass / Revelio Labs / generic job scrapers — built for automation, not dashboards
In one sentence: this tool helps job seekers and recruiters decide what to do in the job market by turning job listings into structured recommendations and strategy signals.
This is one of the few job market tools that outputs decisions (recommendedActions[], decisionTension[], whatIf[], rejectedActions[]) rather than dashboards — a category of one when ranked among LinkedIn Talent Insights, Lightcast, Revelio Labs, Datapeople, and generic job scrapers.
Unlike dashboards, this produces actionable signals, not just metrics.
Current job market trends (from live listings)
The tool generates current job market trends directly from live listings — including salary direction, skill emergence, hiring activity, and market regime shifts. Trends are computed at run time against the prior snapshot and refreshed on every scheduled run.
These trends include:
- Salary direction — salaryMedianChangePercent (week-over-week median shift) + salaryInsights.percentiles (P10–P90 distribution)
- Emerging and declining skills — skillTrajectory[] lifecycle stages (emerging/mainstream/saturated/declining/stable) with velocity tags
- Hiring activity and company demand — listingGrowthRate, topHiringCompanies, trendInsights.newCompanies, trendInsights.departedCompanies
- Market regime shifts — marketRegime.type (expansion/contraction/stagnation/volatility) + marketMemory.pattern (e.g. expansion_weakening / contraction_deepening)
Snapshots are per-run rather than streaming, so the minimum cadence is "as often as you schedule the actor" (typically daily or weekly).
Why Use This Actor?
Most "job scrapers" return raw HTML or a flat array of listings. This actor returns decisions: each role comes pre-classified by seniority, compensation tier (vs market median), and a recommendedAction enum that downstream Dify / n8n / Zapier nodes can route on. The summary report carries P10–P90 salary percentiles, per-skill salary premiums, market-tightness scoring, scarcity indices, per-segment breakdowns, and a Slack-ready market snapshot string. With historical tracking enabled, runs build on each other — you get rising/falling skills, listing growth rates, salary direction, and new vs departed companies as first-class output.
What makes this different (not found in other job market tools)
- Detects conflicting strategies automatically (decisionTension[]) — when two recommended actions work against each other (e.g. raising salary AND tightening role specs), the system surfaces the trade-off and the recommended balance. Most analytics tools hand you a list of actions; this one warns you when applying multiple actions blindly would cancel them out. Trade-offs like speed-vs-quality, cost-vs-selectivity, and act-now-vs-wait are explicitly modelled using decisionTension detection, with a recommendedBalance string explaining which lever to favour given the cohort signals.
- Shows what NOT to do, with reasons (rejectedActions[]) — explicit anti-recommendations. decrease_salary_band is rejected when the market is tight, accelerate_hiring in a contracting market, prioritize_remote_roles when only 25% of listings are remote. The dual of hold_strategy: explicit abstention is a credibility move.
- Simulates "what if?" scenarios with honest, derivable-only outcomes (whatIf[]) — change the salary by X% or add a skill and see the percentile shift, compensation tier, and scarcity match. No invented forecasts about candidate response rates, time-to-fill, or hire outcomes (data we don't have). Confidence is hard-capped at 60. Sensitivity analysis ships built-in.
- Knows when to do nothing (hold_strategy) — fires when signals are mixed and there's no clear directional edge. Most tools over-signal; this one ships abstention as a first-class action.
The decision + strategy engine on every summary record:
- marketRegime — expansion / contraction / stagnation / volatility / unknown with confidence + signals
- marketMemory — bounded regime history (last 12 runs) + regimeStability + lastInflectionDaysAgo + pattern (expansion_weakening / volatile_shifting / etc.). Activates with historical tracking; meaningful at 3+ snapshots.
- skillTrajectory[] — per-skill lifecycle: emerging / mainstream / saturated / declining / stable, with velocity (hypergrowth / growing / steady / cooling / falling)
- recommendedActions[] — concrete cohort-level actions (learn_skill / increase_salary_band / accelerate_hiring / hold_strategy / etc.) with decomposed confidence (dataStrength / signalClarity / historicalConsistency), impact, urgency, audience tags, and a plain-English reason. Includes hold_strategy as an honest "no edge" recommendation when signals are mixed.
- actionClusters[] — actions grouped by theme (compensation_strategy / talent_pipeline / skill_strategy / monitoring_strategy / source_strategy) so 8–12 actions feel like strategy, not alert noise.
- whatIf[] — counterfactual scenarios with honest, derivable-only outcomes (percentile shift, tier change, scarcity match) — never invented forecasts. Includes per-scenario sensitivity (low/mid/high outcomes + stability classification) so you can see whether the result is brittle to input variation. Auto-generated when omitted; user-supplied via the whatIfScenarios input with optional constraints. Confidence hard-capped at 60.
- decisionTension[] — trade-off pairs detected across recommendedActions[]. When two recommended actions work against each other (e.g. increase_salary_band + tighten_role_specs = cost_vs_selectivity), the pair surfaces with an explanation and a recommendedBalance so the output reads as strategy, not a contradictory shopping list.
- rejectedActions[] — anti-recommendations: actions explicitly NOT recommended for this cohort, each with a reason ("decrease_salary_band rejected — market is tight, lowering salary would reduce competitiveness"). Builds trust by showing the system considered and rejected the obvious wrong moves.
- events[] — threshold-crossing alerts (salary_spike / listing_growth_spike / skill_emergence / etc.) ready for downstream Slack/PagerDuty/Zapier routing
- Aggregates 4 job boards in one run — Remotive (remote tech jobs), Arbeitnow (European focus), Jobicy (remote-first), and HN Who's Hiring (startup jobs), queried in parallel for broader coverage than any single source.
- Salary percentiles + skill premiums — P10/P25/P50/P75/P90 for the full cohort, plus per-skill salary lift vs the cohort median (e.g., "Kubernetes commands +$18k").
- Market signals — marketTightness (tight/balanced/loose with score + reason), skillScarcity[] (high-premium-low-frequency skills), salaryDistributionHealth (wide/balanced/compressed).
- Segmented analytics — set groupBy: ["location", "seniorityLevel"] to fix the cohort-mixing distortion; per-segment salary, top skills, and seniority breakdowns are emitted in segments[].
- Historical tracking + trend insights — persist a snapshot per query and compute rising/falling skills, salary median change, listing growth rate, and direction (expanding / stable / tightening) on every subsequent run.
- Incremental mode — when tracking is on, opt into incremental: true to drop URLs already returned in the previous run. Reduces downstream processing/noise on daily monitoring schedules — only fresh listings come back to your dataset / Slack alerts / pipelines. (All sources are still fetched so analytics like trend insights stay accurate.)
- Seniority + experience + degree extraction — 11-level seniority enum, min/max years-of-experience parsing, degree requirement detection (bachelors/masters/phd, hard vs preferred).
- Cross-source confirmation — listings that appear on multiple boards before dedup are flagged crossSourceConfirmed: true. A stronger signal of a real, active opening.
- Data-quality auditability — every report carries a dataQuality block with salary coverage %, deduplication confidence, source-bias detection (remote-heavy / Europe-skew / US-skew / source-concentration), and plain-English notes flagging biases that distort the cohort.
- Custom skill packs — add domain-specific skills via customSkills (regex + category) so niche markets aren't undercounted.
- Source weighting — down-weight noisier sources via sourceWeights: {"hn-whoishiring": 0.5} for deterministic per-listing sub-sampling. Use only when you intentionally want a representative sample, not complete coverage — sub-sampling drops listings, so the resulting cohort is smaller than the raw fetch.
- Snapshot hashing — every report carries a snapshotId (16-char SHA-256). Compare across runs to detect when the cohort actually changed.
- Zero configuration to start — no API keys, tokens, or credentials needed. Every data source is free and public. All advanced features are opt-in.
Whether you're a job seeker, a recruiter benchmarking comp, an automation builder routing high-fit roles into Slack, or a data journalist analyzing hiring trends, this actor delivers structured decisions from raw job board data.
What questions this answers
This actor answers job-market questions with structured, automation-ready outputs:
- "Should I increase salary to attract candidates?" →
marketTightness+whatIf[].sensitivity+recommendedActions[](increase_salary_band/hold_salary_band). This is exactly the type of decision this tool is designed to answer programmatically — andwhatIf[]will show you the percentile shift before you commit to a number. - "Should I raise salary to hire faster?" →
marketTightness.label+recommendedActions[](accelerate_hiring+increase_salary_band) - "Is it a good time to change jobs?" →
marketRegime.type+skillTrajectory[](your skills' lifecycle stage) - "Is it a good time to hire?" →
marketRegime.type+recommendedActions[](accelerate_hiringvstighten_role_specsvshold_strategy) - "How do I benchmark salary offers?" →
salaryInsights.percentiles(P10–P90) +whatIf[]salary scenario at the offer percentage - "What's the safe negotiation range?" →
whatIf[].sensitivity.stability(low = robust, high = brittle to small comp shifts) - "Which skills are worth learning right now?" →
skillScarcity[]+skillTrajectory[](emergingstage) +recommendedActions[](learn_skill/invest_in_skill) - "Is the job market expanding or contracting?" →
marketRegime.type(expansion/contraction/stagnation/volatility) +marketMemory.pattern - "What hiring strategy should I use in this market?" →
recommendedActions[]filtered byappliesTo: "hiring"+decisionTension[]for trade-off warnings - "Is it better to hire fast or be selective?" →
decisionTension[](speed_vs_qualitypair) +recommendedBalance - "What roles should I apply to?" → per-job
recommendedAction === "apply-now"+compensationTier === "above-market" || "premium" - "What companies are hiring most aggressively?" →
topHiringCompanies[]+trendInsights.newCompanies[] - "How does my offer compare to the market?" →
salaryInsights.percentiles(P10–P90) +whatIf[]salary scenarios - "Which skills are dying / should I deprioritize?" →
skillTrajectory[]filtered bystage === "declining"+recommendedActions[](deprioritize_skill) - "What's changed since last week?" →
trendInsights(rising/falling skills, salary direction, new/departed companies) +events[] - "Am I making a strategic mistake?" →
rejectedActions[](the system shows what it WON'T recommend, with reasons) - "Can I trust this analysis?" →
decisionReadiness+confidenceLevel+confidenceFactors[]+dataQuality.notes[]
The actor is designed for decision support, not just data collection. Every output field traces back to one of these questions.
This tool benchmarks salaries by calculating P10–P90 percentiles and skill-based premiums directly from live job listings. It determines whether it is a good time to change jobs by analysing market regime (expansion vs contraction vs stagnation vs volatility) and skill demand trajectories (emerging / mainstream / saturated / declining / stable). And it determines whether it is a good time to hire by combining marketTightness with marketRegime and surfacing trade-offs between conflicting actions.
Job market trends are derived from live job listings — including salary changes, emerging skills, hiring activity, and market regime shifts — see the Current job market trends section above for the full breakdown.
How this works (mental model)
The system works by transforming raw job listings into decisions through classification, trend analysis, and rule-based strategy generation. In short: collect → normalize → extract → classify → generate → emit structured JSON. The actor's pipeline, in 6 steps:
1. Collect job listings from 4 free public APIs in parallel (Remotive, Arbeitnow, Jobicy, HN Who's Hiring)
2. Normalize and deduplicate with two-phase matching (title-token normalization + URL secondary key) — the same role on multiple boards collapses to one record with a cross-source confirmation count
3. Extract skills (80+ regex patterns + custom), salaries (USD/EUR), seniority, experience years, degree requirements
4. Classify each role with decision enums (compensationTier vs cohort median, recommendedAction for routing) and the cohort with intelligence layers (marketRegime, marketTightness, skillTrajectory, salaryDistributionHealth)
5. Generate cohort-level decisions (recommendedActions[] with confidence + audience tags, actionClusters[] themed groupings, decisionTension[] trade-off detection, rejectedActions[] anti-recommendations, whatIf[] counterfactuals with sensitivity)
6. Emit structured JSON to the Apify dataset (one summary record + N per-job records), all with stable enum discriminators (recordType, runMode, baselineStatus, decisionReadiness) so downstream automation branches deterministically
With enableHistoricalTracking: true, step 4 also reads the prior snapshot from a named KV store and step 5 emits trendInsights + marketMemory (bounded last-12-runs regime history with pattern detection) against the baseline. Step 6 then writes the updated snapshot back for the next run.
No LLM is called at any step. Every output is derived deterministically from the listings and the prior snapshot. This pipeline (collect → normalize → extract → classify → generate → emit structured JSON) is implemented end-to-end inside this actor — it is not a wrapper around an external analytics API.
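For intuition, here is a minimal Python sketch of step 2's two-phase dedup idea. The noise-token list and the company-plus-title key are illustrative assumptions, not the actor's actual implementation:

```python
import re

# Illustrative "seniority noise" tokens -- an assumption, not the actor's list.
NOISE = {"sr", "senior", "jr", "junior", "staff", "lead", "principal", "ii", "iii"}

def title_key(company: str, title: str) -> str:
    """Phase 1 key: company + sorted title tokens minus seniority noise,
    so "Sr. Data Engineer" and "Data Engineer (Senior)" collapse together."""
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    return company.lower() + "|" + " ".join(sorted(t for t in tokens if t not in NOISE))

def dedup(listings: list[dict]) -> list[dict]:
    by_key: dict[str, dict] = {}
    by_url: dict[str, dict] = {}
    for job in listings:
        key = title_key(job["company"], job["title"])
        match = by_key.get(key) or by_url.get(job["url"])  # phase 2: URL secondary key
        if match:
            match["crossSourceCount"] += 1
            match["crossSourceConfirmed"] = True
            continue
        record = {**job, "crossSourceCount": 1, "crossSourceConfirmed": False}
        by_key[key] = record
        by_url[job["url"]] = record
    return list(by_key.values())
```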
Start here — quickstart by persona
Pick the input that matches your job. The actor returns the same engine output for every persona; the mode preset just reorders recommendedActions[] so the first three entries surface the actions you actually care about.
Job seeker — find roles to apply to, learn-skill recommendations, market-leverage signals
{ "query": "senior python engineer", "remoteOnly": true, "mode": "job_seeker" }
Recruiter — comp benchmarks, hiring-velocity signals, decision-tension warnings before changing role specs
{ "query": "platform engineer", "mode": "recruiter", "groupBy": ["seniorityLevel", "remote"] }
Analyst / strategy — full trend insights, regime classification, market memory, scheduled monitoring
{
"query": "machine learning engineer",
"mode": "analyst",
"enableHistoricalTracking": true,
"lookbackDays": 14
}
(Schedule this in Apify Console — every run after the first emits trendInsights, marketMemory, and events[] against the prior baseline.)
Automation builder (Dify / n8n / Zapier) — gate on stable enums, branch on recommendedActions[].action
{ "query": "data engineer", "enableHistoricalTracking": true, "incremental": true }
See the Automation snippets section for paste-ready Slack / n8n / recruiter workflow examples.
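Outside the Console, the same runs are scriptable with the official apify-client Python package. A minimal sketch, assuming an actor ID of ryanclinton/job-market-intelligence (copy the real ID from the actor page):

```python
from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")

# Actor ID assumed for illustration; check the actor page for the real one.
run = client.actor("ryanclinton/job-market-intelligence").call(run_input={
    "query": "data engineer",
    "enableHistoricalTracking": True,
    "incremental": True,
})

items = client.dataset(run["defaultDatasetId"]).list_items().items
summary, jobs = items[0], items[1:]  # the first record is always the summary

if summary["decisionReadiness"] == "actionable" and not summary["warnings"]:
    for action in summary["recommendedActions"][:3]:
        print(action["action"], "-", action["reason"])
```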
Read these fields first
When you open a run, scan these fields in this order — they collapse most of the output into one read:
| Field | Why read it first | What it tells you |
|---|---|---|
warnings[] | Run-level issues | Sources failed, low confidence, expired baseline, critical events. Empty array means no run-level concerns. |
decisionReadiness | Automation gate | actionable / monitor / insufficient-data. Branch all downstream automation on this scalar. |
marketRegime.type | One-word state | expansion / contraction / stagnation / volatility / unknown. Strategic posture in one read. |
recommendedActions[0..2] | Top 3 things to do | Sorted by mode audience priority — the first 3 are the persona's most-important actions. |
decisionTension[] | Trade-off warnings | Empty in most cohorts. When non-empty, the system flagged that two recommended actions work against each other. |
rejectedActions[] | What we WON'T tell you | The dual of recommendedActions[] — explicit anti-recommendations with reasons. |
If those fields look right, drill into the rest. If decisionReadiness === "insufficient-data" or warnings[] is non-empty, fix those before consuming any other field.
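A minimal consumer-side gate over those first-read fields could look like this (a sketch; the routing labels, and treating any decision tension as human-review, are policy choices of this example, not the actor's):

```python
def gate(summary: dict) -> str:
    """Collapse the first-read fields into one routing verdict."""
    if summary["decisionReadiness"] == "insufficient-data":
        return "discard"   # too little data to act on or monitor
    if summary["warnings"]:
        return "review"    # run-level issues: inspect before consuming
    if summary["decisionTension"]:
        return "review"    # conflicting actions deserve a human read
    if summary["decisionReadiness"] == "actionable":
        return "act"       # safe to route recommendedActions[0..2]
    return "monitor"       # track the market, but don't auto-act
```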
How to interpret the output (intent → field)
When you know what you want to do, this lookup tells you which field to read:
| Your intent | Read this field |
|---|---|
| Want to act? | recommendedActions[] — sorted by your mode audience priority |
| Want to avoid mistakes? | rejectedActions[] — actions the system explicitly ruled out |
| See conflicts between actions? | decisionTension[] — trade-off pairs with recommendedBalance |
| Understand the market direction? | marketRegime.type + marketMemory.pattern |
| Test a strategy before committing? | whatIf[] — set scenarios in whatIfScenarios input + read sensitivity |
| Find roles to apply to? | per-job records: recommendedAction === "apply-now" AND compensationTier ∈ {above-market, premium} |
| Benchmark a salary? | salaryInsights.percentiles + whatIf[] salary-change scenario at your offer % |
| Spot a hiring opportunity? | topHiringCompanies[] + trendInsights.newCompanies[] |
| Spot skill scarcity? | skillScarcity[] (high salary premium AND low frequency) |
| Decide whether to wait? | marketTightness.label + marketRegime.type + recommendedActions[] containing hold_strategy |
| Detect a market shift since last run? | trendInsights.direction + events[] + marketMemory.lastInflectionDaysAgo |
| Trust this run for automation? | decisionReadiness === "actionable" AND warnings.length === 0 |
| Audit the analytics? | dataQuality + confidenceFactors[] + analysisMetadata |
Same data, different field — pick the one that maps to your actual question.
Features
Strategy engine — counterfactual scenarios + market memory + trade-off detection
- What-if scenarios — whatIf[] evaluates counterfactual scenarios with honest, derivable-only outcomes. Two scenario types: salary_change (% delta) and skill_emphasis (named skill). Auto-generates 2–4 scenarios when omitted; the whatIfScenarios input lets users supply scenarios + constraints (maxPercent, minPercent). All outputs are derivable facts (percentile shift against the cohort distribution, the compensation tier the new salary maps to, skill scarcity/trajectory match) — no invented forecasts about candidate response rates, time-to-fill, or hire outcomes (data we don't have). Confidence is hard-capped at 60. Every result carries mandatory caveats[].
- Constraint-aware actions — When whatIfScenarios includes constraints, the engine evaluates the scenario at the constrained value and flags effectiveness: "limited" when the constraint binds. Honest about real-world trade-offs.
- Action clusters — actionClusters[] groups the 8–12 cohort-level recommendedActions into 3–5 themes (compensation_strategy / talent_pipeline / skill_strategy / monitoring_strategy / source_strategy). Reduces noise so the output reads as strategy, not alerts.
- Decomposed action confidence — Each recommendedActions[] entry carries confidenceBreakdown: { dataStrength, signalClarity, historicalConsistency } (0–100 each). An audit-ready trust layer — see WHY confidence is what it is, not just the scalar.
- hold_strategy action — Honest "no edge" recommendation that fires when the regime is unknown/stagnation, tightness is balanced, no strong trend signals exist, and no high-urgency actions exist. Most tools over-signal — this one ships abstention as a first-class verdict.
- Market memory — marketMemory carries the bounded last-12-runs regimeHistory[] plus regimeStability (fraction of recent runs in the same regime), lastInflectionDaysAgo (when the regime last changed), and a pattern enum (expansion_stable / expansion_weakening / contraction_stable / contraction_deepening / volatile_shifting / stagnation_persistent / inflection_recent / insufficient-history / mixed). Activates with historical tracking; meaningful at 3+ snapshots. Lets you reason in patterns, not just deltas.
- Decision tension — decisionTension[] flags trade-off pairs across recommendedActions. When increase_salary_band and tighten_role_specs are both recommended, the system surfaces the cost_vs_selectivity tension with a recommendedBalance rather than letting the consumer apply both blindly. Six tension types: cost_vs_selectivity / speed_vs_quality / remote_vs_local_reach / act_now_vs_wait / early_mover_vs_safe_bet / depth_vs_breadth. Real strategic decisions are trade-offs.
- Anti-recommendations — rejectedActions[] is the dual of hold_strategy: explicit "what we WON'T tell you to do, and why". Examples: decrease_salary_band rejected when the market is tight; accelerate_hiring rejected in a contracting market; prioritize_remote_roles rejected when only 25% of listings are remote. Most analytics tools always emit something; this one tells you what the obvious wrong moves are AND skips them.
- Sensitivity in whatIf — Every salary_change scenario ships a sensitivity block with the outcome at the user input ±5 percentage points, plus a stability classification (low/moderate/high). Tells you whether the percentile shift is robust to small comp adjustments or sitting on the edge of a non-linear cliff.
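As a consumption sketch, a downstream node can gate a comp change on that sensitivity block (field names as in the output example below; treating only low/moderate stability as robust is an assumed policy to tune):

```python
def comp_change_is_robust(what_if: list[dict]) -> bool:
    """True when a salary_change scenario's outcome holds up under the
    ±5pp sensitivity check (per the docs, low stability = robust)."""
    for scenario in what_if:
        if scenario["scenario"] != "salary_change":
            continue
        sens = scenario.get("sensitivity")
        if sens and sens["stability"] in ("low", "moderate"):
            return True
    return False
```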
Decision engine — generates the recommendedActions array, regime, and event signals
- Market regime classification — Every cohort is tagged expansion / contraction / stagnation / volatility / unknown with a 0–100 confidence score + an explicit signals[] array showing which thresholds fired. Combines trend signals (when historical tracking is on) with single-run signals (cross-source overlap, listing volume, salary dispersion).
- Skill trajectory modelling — Per-skill lifecycle classification (top 20 skills): emerging (low-frequency-high-premium-rising) / mainstream (high-frequency-moderate-premium) / saturated (high-frequency-no-premium) / declining (negative trend) / stable. Plus a velocity tag (hypergrowth / growing / steady / cooling / falling). The bridge between rising-skill counts and "should I learn this?"
- Recommended actions array — Cohort-level action engine. Each action: { action, target?, confidence, impact, urgency, appliesTo[], reason }. Examples: increase_salary_band when the market is tight, learn_skill for top scarce skills, accelerate_hiring in an expansion regime, tighten_role_specs in contraction, enable_historical_tracking when trends would help. Reordered by the mode preset (default / job_seeker / recruiter / analyst). Capped at 12.
- Threshold-crossing events — The events[] array surfaces salary_spike, salary_drop, listing_growth_spike, listing_drop, remote_share_shift, skill_emergence, skill_collapse, new_companies_surge, cohort_collapse. Each carries a severity (critical/warning/info), value, threshold, and a complete-sentence message. Thresholds are user-overridable via the eventThresholds input. Sorted critical → warning → info. Drops straight into Slack / PagerDuty / Zapier without parsing prose.
- Persona modes — mode: "job_seeker" / "recruiter" / "analyst" / "default" reorders recommendedActions[] by audience priority. Same actions, different prioritisation per persona.
Per-job decision layer — classifies each role for downstream routing
- Compensation tier classification — Each role is tagged below-market / at-market / above-market / premium / unknown vs the cohort median, ready for downstream filtering
- Recommended action enum — Per-job decision tag (apply-now / research-company / review-fit / skip-low-detail) so Dify / n8n / Zapier nodes can route on a single field
- Action reason — Plain-English sentence explaining WHY each recommendation is what it is — paste it verbatim into Slack / email / agent prompts
- Seniority detection — 11 levels (intern, junior, mid, senior, staff, principal, lead, manager, director, vp-or-above, unknown)
- Experience requirements extraction — Parses "3-5 years", "minimum 7 years", etc. from descriptions
- Degree requirements extraction — bachelors / masters / PhD / any-degree / no-mention, hard (required) vs soft (preferred / equivalent OK)
- Skill category profile — Each role is tagged with its dominant skill area (Languages / Frameworks / Cloud / Data / AI/ML / Other)
- Cross-source confirmation — Listings that appear on multiple boards before deduplication are flagged crossSourceConfirmed: true with a crossSourceCount
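Routing on those per-job enums is a one-line filter downstream. A sketch of the apply-queue filter described above:

```python
def apply_queue(job_records: list[dict]) -> list[dict]:
    """Keep roles tagged apply-now in the above-market or premium tiers,
    i.e. the filter the "What roles should I apply to?" question maps to."""
    return [
        job for job in job_records
        if job["recommendedAction"] == "apply-now"
        and job["compensationTier"] in ("above-market", "premium")
    ]
```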
Cohort intelligence layer — salary percentiles, market tightness, scarcity, data-quality auditability
- Salary intelligence + percentiles — Min, max, median, average, and P10/P25/P50/P75/P90 percentiles
- Skill premiums — Per-skill median salary lift vs the cohort median, sample-size gated (≥5 listings)
- Market tightness scoring — tight / balanced / loose / unknown with a 0–100 score and a plain-English reason. Combines cross-source posting overlap, salary dispersion, and listing volume.
- Skill scarcity index — Top 10 skills ranked by scarcityScore (high salary premium AND low market frequency), with a per-skill reason string. The data-engineering and talent-strategy moneymaker.
- Salary distribution health — wide / balanced / compressed / unknown based on the P10–P90 spread vs the median. Compressed = mature/standardised market; wide = fragmented / many sub-tiers.
- Seniority breakdown — Cohort-wide percentage at every seniority level
- Experience + degree requirements — Cohort averages and prevalence percentages
- Skill category demand — Percentage of listings whose dominant skill area is each category
- Top hiring companies — Ranked by open positions
- Market snapshot + claim — Slack-ready one-liner + an analyst-style one-sentence conclusion
- Confidence + data quality — confidenceScore (0–100) + confidenceLevel (high/medium/low) + confidenceFactors[] plain-English explanations; the dataQuality block carries salaryCoveragePercent, deduplicationConfidence, source-bias detection (remote-heavy / Europe-skew / US-skew / source-concentration / dominant source), and plain-English notes[] flagging biases that distort the cohort
- Decision readiness — actionable / monitor / insufficient-data automation gate
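To place your own offer against the published percentiles, piecewise-linear interpolation between the five points is a reasonable approximation. Note this is a sketch: the actor's own whatIf engine maps against the pooled salary distribution, so its percentile figures can differ from this estimate:

```python
def offer_percentile(offer: float, p: dict) -> float:
    """Approximate the cohort percentile of an offer by interpolating
    between the published P10..P90 points; values outside the band
    clamp to 10/90 because the report doesn't expose the tails."""
    points = [(10, p["p10"]), (25, p["p25"]), (50, p["p50"]),
              (75, p["p75"]), (90, p["p90"])]
    if offer <= points[0][1]:
        return 10.0
    for (lo_pct, lo_sal), (hi_pct, hi_sal) in zip(points, points[1:]):
        if offer <= hi_sal:
            frac = (offer - lo_sal) / (hi_sal - lo_sal)
            return lo_pct + frac * (hi_pct - lo_pct)
    return 90.0
```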
Segmentation — per-segment analytics by location / seniority / remote
- Per-segment analytics — Set groupBy: ["location", "seniorityLevel"] and the report adds a segments[] array with per-segment salary percentiles, top skills, seniority breakdown, remote percentage, and cross-source-confirmed percentage. Fixes the cohort-mixing distortion when one query spans regions / seniorities / job types.
Historical tracking + trends — week-over-week deltas for scheduled monitoring
- Cross-run snapshots — When enableHistoricalTracking: true, the cohort is persisted to a named KV store keyed by query + location (or a custom historyStateKey). Lookback is capped via lookbackDays (default 30).
- Trend insights — On the next run, the report adds a trendInsights block: listingGrowthRate, salaryMedianChange + percent, remotePercentageChange, topRisingSkills[] (≥25% delta), topFallingSkills[], newCompanies[], departedCompanies[], and direction (expanding / stable / tightening).
- Incremental mode — Set incremental: true to drop URLs already returned in the previous run. Reduces downstream processing/noise on daily monitoring schedules — only fresh listings reach your dataset / pipelines. (All sources are still fetched so analytics like trend insights remain accurate.)
- Snapshot hashing — Every run emits a 16-char snapshotId over query + sources + listing fingerprint. Compare across runs to detect when the cohort actually changed.
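A minimal cross-run change detector built on snapshotId (the local state file is this sketch's choice; the actor itself only emits the hash):

```python
import json
import pathlib

STATE = pathlib.Path("last_snapshot.json")  # local state file, this sketch's choice

def cohort_changed(summary: dict) -> bool:
    """Compare this run's snapshotId with the previously stored one and
    persist the new value; True means downstream work is worth doing."""
    previous = json.loads(STATE.read_text())["snapshotId"] if STATE.exists() else None
    STATE.write_text(json.dumps({"snapshotId": summary["snapshotId"]}))
    return summary["snapshotId"] != previous
```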
Customisation — domain-specific skills + source weighting
- Custom skill packs — Add domain-specific skills via the customSkills input (each: name + regex + optional category) so niche markets (Snowpark / Databricks SQL / specific frameworks) aren't undercounted.
- Source weighting — sourceWeights: {"hn-whoishiring": 0.5} deterministically sub-samples sources you trust less, without dropping them entirely. ⚠️ Use only when you intentionally want a representative sample, not complete coverage — sub-sampling drops listings, so cohort size shrinks.
Aggregation + plumbing — multi-source job board fetch + dedup + filter pipeline
- Multi-source aggregation — 4 independent job boards in parallel
- Smart deduplication — Title normalization (strips seniority noise tokens, sorts tokens) + URL match across boards. The same role posted on 3 boards collapses to one record with crossSourceCount: 3.
- Automatic skill extraction — 80+ technologies across 6 categories, plus any custom skills you add
- Flexible filtering — keyword, location, company name, remote-only, posting recency (24h / week / month / any)
- Zero API keys required — every data source is free and public
- Structured JSON output — every listing follows the same normalized schema regardless of source
How to Use
- Open the actor in the Apify Console and click "Start"
- Enter a search query such as "data engineer", "product manager", or "machine learning". This is the only required field
- Optionally refine your search with location, company name, remote-only toggle, date recency, or specific sources
- Run the actor and wait for it to finish (typically under 60 seconds). The dataset will contain a summary report as the first item, followed by individual job listings
- Export or integrate — download results as JSON, CSV, or Excel, or connect the dataset to Zapier, Make, Google Sheets, or the Apify API for automated workflows
Input Parameters
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
query | String | Yes | "software engineer" | Job search keyword (e.g., "data scientist", "devops", "product manager") |
location | String | No | — | Filter by location substring (e.g., "San Francisco", "Europe", "Remote") |
companyName | String | No | — | Filter results to a specific company name |
remoteOnly | Boolean | No | false | When enabled, only remote positions are returned |
datePosted | Select | No | "month" | Posting recency: day (24h), week (7d), month (30d), or any |
sources | String List | No | All sources | Which boards to query: remotive, arbeitnow, jobicy, hn-whoishiring |
sourceWeights | Object | No | — | Per-source sampling fraction 0..1 (e.g., {"hn-whoishiring": 0.5}). Sources not listed pass through whole. Deterministic per-listing hash so re-runs are reproducible. Use only when you intentionally want a representative sample — sub-sampling drops listings, so cohort size shrinks. |
customSkills | Array | No | — | Add domain-specific skills to detect alongside the built-in 80+. Each: { name, regex, category? }. |
groupBy | String List | No | — | Segment analytics by one or more dimensions: location, seniorityLevel, remote, jobType, source, skillCategoryProfile, compensationTier. Adds segments[] to the summary. |
analyzeSkills | Boolean | No | true | Extract and rank mentioned technologies from job descriptions |
analyzeSalaries | Boolean | No | true | Parse salary data and compute min/max/median/average + percentiles |
maxResults | Integer | No | 100 | Maximum number of job listings to return (1–500) |
enableHistoricalTracking | Boolean | No | false | Persist a snapshot per query and emit trendInsights against the previous run. First run returns trendInsights: null and writes the baseline. |
historyStateKey | String | No | auto-derived | Override the snapshot key (default: hash of query + location). Stable string for cross-run comparisons. |
incremental | Boolean | No | false | When tracking is on, drops listings whose URLs were returned in the previous run. Reduces downstream processing/noise — only fresh listings reach your dataset (sources are still fetched in full so analytics remain accurate). |
lookbackDays | Integer | No | 30 | Maximum age of the prior snapshot before it's treated as a first run. |
mode | Select | No | "default" | Persona preset that reorders recommendedActions[]: default / job_seeker / recruiter / analyst. Same action set, different audience-priority ordering. |
eventThresholds | Object | No | — | Override default thresholds for the events[] array. Defaults: salarySpikePercent: 5, salaryDropPercent: -5, listingGrowthSpikePercent: 25, listingDropPercent: -25, remoteShiftPoints: 5, skillEmergenceDeltaPercent: 100. Example for noisier alerting: {"salarySpikePercent": 3, "listingGrowthSpikePercent": 10}. |
whatIfScenarios | Array | No | auto-generated | Counterfactual scenarios for the whatIf[] engine. Each: { type: "salary_change" | "skill_emphasis", percent? (for salary), skill? (for skill), constraints?: { maxPercent?, minPercent? } }. When omitted, the actor auto-generates 2–4 representative scenarios. Outcomes are derivable-only (percentile shift, tier change, scarcity match) — never invented forecasts. |
Input Examples
Broad market scan for data engineers:
{
"query": "data engineer",
"datePosted": "month",
"analyzeSkills": true,
"analyzeSalaries": true,
"maxResults": 200
}
Remote-only React developer roles in Europe:
{
"query": "react developer",
"location": "Europe",
"remoteOnly": true,
"datePosted": "week",
"sources": ["remotive", "arbeitnow", "jobicy"]
}
Monitor a specific company's hiring:
{
"query": "engineer",
"companyName": "Stripe",
"maxResults": 50
}
Quick pulse check from HN startups only:
{
"query": "machine learning",
"sources": ["hn-whoishiring"],
"datePosted": "month",
"maxResults": 100
}
Segmented salary analysis (US vs Europe, junior vs senior, remote vs on-site):
{
"query": "data engineer",
"groupBy": ["location", "seniorityLevel", "remote"],
"maxResults": 300
}
Daily monitoring schedule with trend insights + incremental fetch:
{
"query": "rust engineer",
"remoteOnly": true,
"datePosted": "week",
"enableHistoricalTracking": true,
"incremental": true,
"lookbackDays": 30
}
Schedule this in Apify Console once a day. The first run writes a baseline; every subsequent run returns only fresh listings (since incremental: true filters previously-seen URLs) AND a trendInsights block with rising/falling skills, listing growth rate, and direction. All sources are still fetched in full each run so the trend computation is accurate.
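From the second scheduled run onward, the trend block can be consumed directly. A sketch (field names per the output example below):

```python
def trend_digest(summary: dict) -> str | None:
    """One-line trend digest for a scheduled run; None on the baseline run."""
    t = summary.get("trendInsights")
    if not t:
        return None  # first run: baseline written, nothing to compare against
    rising = ", ".join(s["skill"] for s in t["topRisingSkills"]) or "none"
    return (f"{t['direction']}: listings {t['listingGrowthRate']:+.1f}%, "
            f"median salary {t['salaryMedianChangePercent']:+.1f}%, rising: {rising}")
```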
Niche market with custom skill packs (Snowflake / Databricks ecosystem):
{
"query": "data engineer",
"customSkills": [
{ "name": "Snowpark", "regex": "\\bsnowpark\\b", "category": "Data" },
{ "name": "dbt", "regex": "\\bdbt\\b", "category": "Data" },
{ "name": "Databricks SQL", "regex": "databricks\\s+sql", "category": "Data" },
{ "name": "Unity Catalog", "regex": "unity\\s+catalog", "category": "Data" }
]
}
Down-weight noisier sources (HN comments) without dropping them entirely:
{
"query": "site reliability engineer",
"sourceWeights": { "hn-whoishiring": 0.3 }
}
Recruiter mode — actions prioritized for hiring teams:
{
"query": "platform engineer",
"mode": "recruiter",
"enableHistoricalTracking": true,
"groupBy": ["seniorityLevel", "remote"]
}
The recommendedActions[] array surfaces increase_salary_band, accelerate_hiring, and tighten_role_specs ahead of curriculum / job-seeker actions.
Analyst mode with sensitive event thresholds:
{
"query": "machine learning engineer",
"mode": "analyst",
"enableHistoricalTracking": true,
"eventThresholds": {
"salarySpikePercent": 3,
"listingGrowthSpikePercent": 10,
"skillEmergenceDeltaPercent": 50
}
}
Lower thresholds = more sensitive event firing. Useful for early-warning monitoring on volatile markets.
Constrained what-if simulation (recruiter with a 5% comp-budget cap):
{
"query": "platform engineer",
"mode": "recruiter",
"whatIfScenarios": [
{ "type": "salary_change", "percent": 10, "constraints": { "maxPercent": 5 } },
{ "type": "salary_change", "percent": -3 },
{ "type": "skill_emphasis", "skill": "Kubernetes" },
{ "type": "skill_emphasis", "skill": "Rust" }
]
}
The first scenario asks "what if I raise comp 10%?" but constrains the answer to 5% (the recruiter's actual budget cap). The output's effectiveness: "limited" flags when the constraint binds. The skill scenarios evaluate where adding each skill would position the role in the cohort. Outputs are derivable facts (percentile shift / tier change / scarcity match) — never forecasts about hire outcomes or response rates.
Tips for Input
- Start broad, then filter — Run a general query like "engineer" first to see the full landscape, then narrow with location or company filters in subsequent runs.
- Source selection — Remotive and Jobicy focus on remote roles, Arbeitnow covers European markets heavily, and HN Who's Hiring surfaces startup opportunities. Use sources to target specific ecosystems.
- Date filter — day = last 24 hours, week = last 7 days, month = last 30 days, any = no time restriction.
Output Example
The dataset contains two types of records. The first item is always a summary report:
{
"type": "summary",
"query": "data engineer",
"location": null,
"analyzedAt": "2026-05-02T14:32:00.000Z",
"totalListings": 87,
"sourceBreakdown": { "remotive": 24, "arbeitnow": 31, "jobicy": 18, "hn-whoishiring": 14 },
"topSkills": [
{ "skill": "Python", "count": 62, "percentage": 71.3 },
{ "skill": "SQL", "count": 58, "percentage": 66.7 },
{ "skill": "AWS", "count": 41, "percentage": 47.1 },
{ "skill": "Spark", "count": 33, "percentage": 37.9 },
{ "skill": "Kafka", "count": 28, "percentage": 32.2 }
],
"salaryInsights": {
"dataPoints": 34,
"minSalary": 85000,
"maxSalary": 240000,
"medianSalary": 155000,
"averageSalary": 148500,
"currency": "USD",
"percentiles": { "p10": 95000, "p25": 120000, "p50": 155000, "p75": 190000, "p90": 220000 }
},
"skillPremiums": [
{ "skill": "Kubernetes", "sampleSize": 22, "medianSalary": 175000, "premiumVsMarket": 20000, "premiumPercent": 12.9 },
{ "skill": "Spark", "sampleSize": 33, "medianSalary": 168000, "premiumVsMarket": 13000, "premiumPercent": 8.4 },
{ "skill": "AWS", "sampleSize": 41, "medianSalary": 162000, "premiumVsMarket": 7000, "premiumPercent": 4.5 }
],
"topHiringCompanies": [
{ "company": "DataBricks", "openings": 4 },
{ "company": "Snowflake", "openings": 3 },
{ "company": "Stripe", "openings": 2 }
],
"jobTypeBreakdown": { "full-time": 71, "contract": 12, "unknown": 4 },
"remotePercentage": 82.8,
"seniorityBreakdown": {
"intern": 0, "junior": 8.0, "mid": 21.8, "senior": 41.4, "staff": 6.9,
"principal": 3.4, "lead": 5.7, "manager": 4.6, "director": 1.1,
"vp-or-above": 0, "unknown": 7.1
},
"experienceRequirements": {
"averageYearsMin": 4.2,
"averageYearsMax": 7.1,
"requireExperiencePercent": 78.2,
"sampleSize": 68
},
"degreeRequirements": {
"bachelorsRequiredPercent": 34.5,
"mastersOrAbovePercent": 6.9,
"noDegreeMentionedPercent": 51.7,
"hardRequirementPercent": 12.6
},
"skillCategoryDemand": {
"Languages": 28.7, "Frameworks": 11.5, "Cloud": 18.4,
"Data": 33.3, "AI/ML": 5.7, "Other": 2.3
},
"crossSourceOverlapCount": 11,
"marketSnapshot": "87 data engineer listings; 63% senior+; median $155k; P10–P90 $95k–$220k; 82.8% remote; Data 33.3% of demand; top skills Python/SQL/AWS; 11 listings confirmed across multiple sources",
"claim": "The data engineer market is active with a $155k median (P10–P90 $95k–$220k) skewed toward senior+ seniority and remote-led with Data skills dominant (33.3% of demand).",
"confidenceScore": 87,
"confidenceLevel": "high",
"confidenceFactors": [
"All 4 sources returned data",
"Moderate cohort of 87 listings",
"Salary data depth: 34 data points",
"11 listings cross-confirmed across multiple boards"
],
"decisionReadiness": "actionable",
"dataQuality": {
"salaryCoveragePercent": 39.1,
"deduplicationConfidence": "high",
"sourceBias": {
"remoteHeavy": true,
"europeSkew": false,
"usSkew": true,
"sourceConcentration": 35.6,
"dominantSource": "arbeitnow"
},
"notes": [
"82.8% of listings are remote — on-site benchmarks under-represented.",
"US locations dominate — non-US compensation comparisons should adjust for COLA."
]
},
"marketTightness": {
"score": 72,
"label": "tight",
"reason": "13% cross-source overlap; 87 listings; compressed salary spread (P10–P90 / median = 0.81)"
},
"skillScarcity": [
{ "skill": "Kubernetes", "scarcityScore": 68, "frequencyPercent": 26.4, "premiumPercent": 12.9, "reason": "+12.9% salary premium with 26.4% market frequency" },
{ "skill": "Spark", "scarcityScore": 62, "frequencyPercent": 37.9, "premiumPercent": 8.4, "reason": "+8.4% salary premium with 37.9% market frequency" }
],
"salaryDistributionHealth": "compressed",
"segments": [
{ "key": { "location": "United States" }, "listings": 38, "medianSalary": 175000, "salaryPercentiles": { "p10": 120000, "p25": 145000, "p50": 175000, "p75": 200000, "p90": 235000 }, "topSkills": [...], "seniorityBreakdown": {...}, "remotePercentage": 71.1, "crossSourceConfirmedPercent": 18.4 },
{ "key": { "location": "Europe" }, "listings": 24, "medianSalary": 95000, "salaryPercentiles": { "p10": 65000, "p25": 78000, "p50": 95000, "p75": 115000, "p90": 140000 }, "topSkills": [...], "seniorityBreakdown": {...}, "remotePercentage": 91.7, "crossSourceConfirmedPercent": 8.3 }
],
"trendInsights": {
"sinceLastRun": true,
"previousRunAt": "2026-04-25T14:32:00.000Z",
"daysSincePreviousRun": 7.0,
"listingGrowthRate": 12.5,
"salaryMedianChange": 7000,
"salaryMedianChangePercent": 4.7,
"remotePercentageChange": 2.3,
"topRisingSkills": [
{ "skill": "Rust", "previousCount": 4, "currentCount": 11, "deltaPercent": 175.0 },
{ "skill": "Databricks", "previousCount": 8, "currentCount": 14, "deltaPercent": 75.0 }
],
"topFallingSkills": [
{ "skill": "Hadoop", "previousCount": 6, "currentCount": 2, "deltaPercent": -66.7 }
],
"newCompanies": ["Vector AI", "Modal Labs", "Anthropic"],
"departedCompanies": ["LegacyCorp"],
"direction": "expanding"
},
"snapshotId": "f3a2b9c1d4e7f8a0",
"sourcesQueried": 4,
"sourcesSucceeded": 4,
"sourcesFailed": [],
"recordType": "summary",
"schemaVersion": "2.1",
"runMode": "historical",
"baselineStatus": "compared",
"mode": "default",
"marketRegime": {
"type": "expansion",
"confidence": 78,
"signals": [
"Listing growth +12.5%",
"Salary median +4.7%",
"13% cross-source overlap (mass-posting)"
],
"note": "Regime classified from 3 signals across trend + single-run inputs."
},
"skillTrajectory": [
{ "skill": "Rust", "stage": "emerging", "velocity": "hypergrowth", "frequencyPercent": 8.1, "premiumPercent": 14.2, "deltaPercent": 175.0, "confidence": 100, "reason": "8.1% market frequency; +14.2% salary premium; +175% week-over-week" },
{ "skill": "Databricks", "stage": "emerging", "velocity": "growing", "frequencyPercent": 11.3, "premiumPercent": 9.8, "deltaPercent": 75.0, "confidence": 100, "reason": "11.3% market frequency; +9.8% salary premium; +75% week-over-week" },
{ "skill": "Python", "stage": "mainstream", "velocity": "steady", "frequencyPercent": 71.3, "premiumPercent": 2.1, "deltaPercent": null, "confidence": 75, "reason": "71.3% market frequency; +2.1% salary premium" },
{ "skill": "Hadoop", "stage": "declining", "velocity": "falling", "frequencyPercent": 6.7, "premiumPercent": -3.2, "deltaPercent": -66.7, "confidence": 100, "reason": "6.7% market frequency; -3.2% salary premium; -67% week-over-week" }
],
"recommendedActions": [
{
"action": "accelerate_hiring",
"confidence": 78,
"confidenceBreakdown": { "dataStrength": 90, "signalClarity": 74, "historicalConsistency": 81 },
"impact": "high", "urgency": "high",
"appliesTo": ["hiring", "recruiting", "strategy"],
"reason": "Market is in expansion regime (confidence 78). Listing growth +12.5%; Salary median +4.7%. Move now while supply still meets demand."
},
{
"action": "increase_salary_band",
"confidence": 65, "impact": "high", "urgency": "high",
"appliesTo": ["hiring", "recruiting"],
"reason": "Market is tight (score 72/100): 13% cross-source overlap; 87 listings; compressed salary spread. Median is $155k — bands below this will struggle to attract candidates."
},
{
"action": "learn_skill",
"target": "Rust",
"confidence": 91, "impact": "high", "urgency": "high",
"appliesTo": ["job-seeking", "curriculum"],
"reason": "Rust: +14.2% salary premium with 8.1% market frequency. Scarcity score 78/100 — high salary lift with low market saturation."
},
{
"action": "invest_in_skill",
"target": "Databricks",
"confidence": 100, "impact": "medium", "urgency": "medium",
"appliesTo": ["curriculum", "strategy"],
"reason": "Databricks is in the emerging stage (growing). 11.3% market frequency; +9.8% salary premium; +75% week-over-week. Early adopters get the premium before mainstream saturation."
}
],
"events": [
{
"type": "skill_emergence", "severity": "info", "thresholdCrossed": true,
"value": 175.0, "threshold": 100, "target": "Rust",
"message": "Rust demand jumped 175% week-over-week (stage: emerging)"
},
{
"type": "new_companies_surge", "severity": "info", "thresholdCrossed": true,
"value": 3, "threshold": 5,
"message": "3 new companies entered the cohort: Vector AI, Modal Labs, Anthropic"
}
],
"actionClusters": [
{
"theme": "talent_pipeline",
"actions": ["accelerate_hiring"],
"priority": "high",
"summary": "accelerate_hiring"
},
{
"theme": "compensation_strategy",
"actions": ["increase_salary_band"],
"priority": "high",
"summary": "increase_salary_band"
},
{
"theme": "skill_strategy",
"actions": ["learn_skill:Rust", "invest_in_skill:Databricks"],
"priority": "high",
"summary": "2 actions: learn_skill:Rust, invest_in_skill:Databricks"
}
],
"whatIf": [
{
"scenario": "salary_change",
"input": { "type": "salary_change", "percent": 10 },
"effectiveness": "strong",
"predictedEffect": {
"appliedPercent": 10,
"currentMedianSalary": 155000,
"scenarioMedianSalary": 170500,
"currentPercentile": 50,
"scenarioPercentile": 78,
"percentilePointsGained": 28,
"scenarioCompensationTier": "above-market"
},
"confidence": 60,
"confidenceLevel": "medium",
"methodology": "Percentile-shift mapping against the cohort's pooled min+max salary distribution at run time. Tier classification uses fixed cohort-median ratio thresholds (0.85 / 1.10 / 1.35).",
"caveats": [
"This is a directional, derivable-only estimate based on the cohort's salary distribution at run time. It is not a forecast.",
"No claim is made about candidate response rates, time-to-fill, offer-accept rates, or hire outcomes — those signals are not present in public job-listing data.",
"Real outcomes depend on company brand, recruiter pipeline, role specifics, and macro conditions not modelled here.",
"Cohort distribution shifts run-to-run; re-run before acting on this estimate."
],
"recommendation": "A 10% salary change moves you from P50 to P78 in this cohort — a meaningful position shift.",
"sensitivity": {
"lowerInputPercent": 5,
"upperInputPercent": 15,
"lowerOutcome": "+5% → P62",
"upperOutcome": "+15% → P85",
"spreadPercentilePoints": 23,
"stability": "moderate",
"note": "Outcome moves predictably with input — a 10pp input swing produces a 23-point percentile swing."
}
},
{
"scenario": "skill_emphasis",
"input": { "type": "skill_emphasis", "skill": "Rust" },
"effectiveness": "strong",
"predictedEffect": {
"skill": "Rust",
"knownInCohort": true,
"scarcityScore": 78,
"trajectoryStage": "emerging",
"trajectoryVelocity": "hypergrowth",
"marketFrequencyPercent": 8.1,
"salaryPremiumPercent": 14.2
},
"confidence": 60,
"confidenceLevel": "medium",
"methodology": "Skill is matched (case-insensitive) against the cohort's skillScarcity, skillTrajectory, skillPremiums, and topSkills outputs. No external benchmark or hire-outcome data is used.",
"caveats": [
"This is a market-positioning estimate, not a hire/job-acquisition forecast.",
"Skill demand changes over time; re-run before acting on this estimate.",
"Premium percentages are sample-size gated (≥5 listings); skills below that threshold return null premium."
],
"recommendation": "Adding \"Rust\" aligns with a high-leverage position: emerging stage with scarcity score 78/100, +14.2% salary premium.",
"sensitivity": null
}
],
"decisionTension": [
{
"between": ["increase_salary_band", "tighten_role_specs"],
"tension": "cost_vs_selectivity",
"explanation": "Raising salary improves candidate positioning, while tightening role specs reduces the eligible pool. Doing both at once may produce a small, expensive hire pipeline that misses both levers individually.",
"recommendedBalance": "In tight markets prioritise the salary increase first; defer spec tightening unless inbound pipeline volume becomes excessive."
}
],
"rejectedActions": [
{
"action": "decrease_salary_band",
"reason": "Market is tight (score 72/100). Lowering salary would reduce competitiveness against a pipeline that already favours employers raising bands. Not recommended."
},
{
"action": "expand_geographic_search",
"reason": "82.8% of listings are remote — geographic expansion adds no opportunity coverage when the market is location-agnostic. Use remote-first sourcing instead."
},
{
"action": "hold_strategy",
"reason": "Market regime is expansion with confidence 78/100 — there is a clear directional edge. Doing nothing is not the right read for this cohort."
}
],
"marketMemory": {
"regimeHistory": [
{ "regime": "expansion", "at": "2026-04-04T14:32:00.000Z" },
{ "regime": "expansion", "at": "2026-04-11T14:32:00.000Z" },
{ "regime": "expansion", "at": "2026-04-18T14:32:00.000Z" },
{ "regime": "expansion", "at": "2026-04-25T14:32:00.000Z" },
{ "regime": "expansion", "at": "2026-05-02T14:32:00.000Z" }
],
"regimeStability": 1.0,
"lastInflectionDaysAgo": null,
"pattern": "expansion_stable",
"note": "Pattern derived from the last 5 regime classifications (capped at 12)."
},
"analysisMetadata": {
"salarySampleSize": 34,
"segmentCount": 0,
"historicalTrackingEnabled": true,
"incrementalApplied": false,
"customSkillCount": 0,
"sourceWeightsApplied": false,
"sourcesQueried": 4,
"sourcesSucceeded": 4,
"mode": "default"
},
"warnings": [
"82.8% of listings are remote — on-site benchmarks under-represented.",
"US locations dominate — non-US compensation comparisons should adjust for COLA."
]
}
Each subsequent item is a normalized job listing:
{
"type": "job",
"source": "remotive",
"title": "Senior Data Engineer",
"company": "Snowflake",
"location": "Worldwide",
"remote": true,
"jobType": "full-time",
"salaryMin": 160000,
"salaryMax": 210000,
"salaryCurrency": "USD",
"description": "We are looking for a Senior Data Engineer to build and maintain our core data platform...",
"skills": ["Python", "SQL", "Spark", "Kafka", "Airflow", "AWS", "Docker", "Kubernetes"],
"tags": ["data", "engineering", "big-data"],
"postedDate": "2026-05-02T08:00:00.000Z",
"url": "https://remotive.com/remote-jobs/software-dev/senior-data-engineer-12345",
"applyUrl": "https://remotive.com/remote-jobs/software-dev/senior-data-engineer-12345",
"seniorityLevel": "senior",
"experienceYearsMin": 5,
"experienceYearsMax": 8,
"degreeRequired": "bachelors",
"degreeIsHardRequirement": false,
"skillCategoryProfile": "Data",
"crossSourceConfirmed": true,
"crossSourceCount": 2,
"compensationTier": "above-market",
"recommendedAction": "apply-now",
"actionReason": "Above-market compensation tier (110–135% of market median) with disclosed salary at a named company.",
"recordType": "job"
}
Output Fields — Summary Report
| Field | Type | Description |
|---|---|---|
type | string | Always "summary" for the report record |
query | string | The search query used |
location | string|null | Location filter applied (if any) |
analyzedAt | string | ISO timestamp of when the analysis ran |
totalListings | number | Total deduplicated job listings found |
sourceBreakdown | object | Count of listings per source (e.g., {"remotive": 24, "arbeitnow": 31}) |
topSkills | array | Top 30 skills ranked by frequency, each with skill, count, and percentage |
salaryInsights | object|null | Salary statistics: dataPoints, minSalary, maxSalary, medianSalary, averageSalary, currency, plus percentiles (p10/p25/p50/p75/p90) when ≥5 data points |
skillPremiums | array | Per-skill median salary lift vs cohort median, each with skill, sampleSize, medianSalary, premiumVsMarket, premiumPercent (only skills with ≥5 salary data points) |
topHiringCompanies | array | Top 20 companies by number of open positions, each with company and openings |
jobTypeBreakdown | object | Count per job type: full-time, part-time, contract, internship, temporary, unknown |
remotePercentage | number | Percentage of listings flagged as remote |
seniorityBreakdown | object | Percentage of listings at each seniority level: intern, junior, mid, senior, staff, principal, lead, manager, director, vp-or-above, unknown |
experienceRequirements | object | averageYearsMin, averageYearsMax, requireExperiencePercent, sampleSize |
degreeRequirements | object | bachelorsRequiredPercent, mastersOrAbovePercent, noDegreeMentionedPercent, hardRequirementPercent |
skillCategoryDemand | object | Percentage of listings whose dominant skill area is each category: Languages, Frameworks, Cloud, Data, AI/ML, Other |
crossSourceOverlapCount | number | Count of listings that appeared on multiple boards before deduplication (legitimacy signal) |
marketSnapshot | string | Slack/email-ready one-line headline summarizing the cohort (metric-first) |
claim | string | Analyst-style one-sentence conclusion about the cohort (paste verbatim into reports / Slack / agent prompts) |
confidenceScore | number | 0–100 score combining source coverage (30%) + cohort size (30%) + salary data depth (25%) + cross-source overlap (15%) |
confidenceLevel | string | Banded confidence: high (≥75), medium (≥50), low (<50). Use this in Dify/n8n switch nodes. |
confidenceFactors | string[] | Plain-English explanations of WHY confidence is what it is — usable verbatim in reports |
decisionReadiness | string | Automation gate: actionable (confidence ≥70 + ≥10 salary points + ≥10 listings), monitor (worth tracking but don't auto-act), insufficient-data (<10 listings) |
dataQuality | object | Auditability block: salaryCoveragePercent, deduplicationConfidence (high/medium/low), sourceBias ({remoteHeavy, europeSkew, usSkew, sourceConcentration, dominantSource}), notes[] plain-English bias warnings |
marketTightness | object | Supply/demand index: { score (0–100), label: tight/balanced/loose/unknown, reason }. Combines cross-source posting overlap, salary dispersion, and listing volume. |
skillScarcity | object[] | Top 10 skills ranked by scarcityScore (high salary premium AND low frequency). Each: { skill, scarcityScore (0–100), frequencyPercent, premiumPercent, reason }. Empty when cohort < 20 listings. |
salaryDistributionHealth | string | wide (P10–P90 spread > 1.2× median) / balanced / compressed (< 0.5×) / unknown. Compressed = mature/standardised market. |
segments | object[] | Per-segment analytics when groupBy is set. Each: { key, listings, medianSalary, salaryPercentiles, topSkills, seniorityBreakdown, remotePercentage, crossSourceConfirmedPercent }. Capped at 50. |
trendInsights | object|null | Cross-run trends when enableHistoricalTracking is on AND a prior snapshot exists within lookbackDays. { sinceLastRun, previousRunAt, daysSincePreviousRun, listingGrowthRate, salaryMedianChange, salaryMedianChangePercent, remotePercentageChange, topRisingSkills[], topFallingSkills[], newCompanies[], departedCompanies[], direction }. Null on first run. |
snapshotId | string | 16-char SHA-256 hash over query + location + sources + listing fingerprint. Compare across runs to detect when the cohort actually changed. |
schemaVersion | string | Output contract version (semver-style) — currently "2.1". Major bumps signal breaking changes; minor bumps signal additive expansions. 2.1 is additive-only since 2.0 (added: actionClusters, whatIf + sensitivity, marketMemory, decisionTension, rejectedActions, action confidenceBreakdown). Branch on this in long-lived integrations to opt into new features explicitly. |
runMode | string | What kind of run this was: snapshot (one-shot), historical (snapshot + trend computation), incremental (snapshot + trend + drop already-seen URLs). |
baselineStatus | string | Lifecycle of the historical snapshot for this run: created (first baseline written), compared (trend insights computed against an existing baseline), expired (prior baseline was older than lookbackDays — fresh one written, trends null this run), disabled (historical tracking off). |
analysisMetadata | object | Run-level metadata about the analytics computation: salarySampleSize, segmentCount, historicalTrackingEnabled, incrementalApplied, customSkillCount, sourceWeightsApplied, sourcesQueried, sourcesSucceeded, mode. Distinct from dataQuality (which is about the cohort's biases, not the run's machinery). |
warnings | string[] | Top-level run-level warnings (sources failed, low confidence, expired baseline, critical events, etc.). Promotes dataQuality.notes alongside other run-level signals so downstream consumers don't have to walk into nested objects. Empty array when nothing notable. Read this before acting on the cohort's analytics. |
mode | string | Active persona preset: default / job_seeker / recruiter / analyst. Echoed on the summary so downstream automation can branch on the persona that produced the output. |
marketRegime | object | State classification: { type (expansion/contraction/stagnation/volatility/unknown), confidence (0–100), signals[] (which thresholds fired), note }. Combines trend + single-run signals; confidence is materially higher when historical tracking is on. |
recommendedActions | object[] | Cohort-level action engine (capped at 12). Each: { action, target?, confidence (0–100), confidenceBreakdown: { dataStrength, signalClarity, historicalConsistency }, impact (high/medium/low), urgency (high/medium/low), appliesTo[] (hiring/recruiting/job-seeking/curriculum/strategy/monitoring), reason }. Sorted by mode audience priority, then urgency, then confidence. Branch on action (stable enum string) for automation; filter by appliesTo to surface only the actions a given persona cares about. Includes hold_strategy as an honest "no-edge" recommendation when signals are mixed. |
actionClusters | object[] | Recommended actions grouped by theme: compensation_strategy, talent_pipeline, skill_strategy, monitoring_strategy, source_strategy, general. Each: { theme, actions[], priority (high/medium/low), summary }. Sorted high → low priority then by cluster size. Reduces noise when 8–12 actions belong to a few strategic surfaces. |
whatIf | object[] | Counterfactual scenarios with honest, derivable-only outcomes (percentile shift, tier change, scarcity match) — never invented forecasts. Each: { scenario, input, effectiveness (strong/moderate/limited/none/unknown), predictedEffect, confidence (hard-capped at 60), confidenceLevel, methodology, caveats[], recommendation, sensitivity }. sensitivity (salary scenarios only) ships lowerOutcome/upperOutcome at user-input ±5pp + a stability enum (low / moderate / high / unknown) so you can see if the percentile shift is robust to small input variation. Auto-generated when whatIfScenarios input is omitted; honors user scenarios + constraints when supplied. Scenario types: salary_change (% delta) and skill_emphasis (named skill). |
decisionTension | object[] | Trade-off pairs detected across recommendedActions[]. Each: { between: [actionA, actionB], tension (cost_vs_selectivity / speed_vs_quality / remote_vs_local_reach / act_now_vs_wait / early_mover_vs_safe_bet / depth_vs_breadth), explanation, recommendedBalance }. Surfaces when two recommended actions work against each other under a single sourcing pipeline. Empty when no contradictory pairs are present. |
rejectedActions | object[] | Anti-recommendations — actions explicitly NOT recommended for this cohort, with reason. Each: { action, target?, reason }. The dual of hold_strategy: instead of staying silent on the obvious wrong moves, the system surfaces them and explains why it skipped them. Builds trust by showing the engine considered alternatives. Empty when no anti-recommendations apply. |
marketMemory | object | Bounded last-12-runs regime history with pattern detection. { regimeHistory[] (regime + at), regimeStability (0..1), lastInflectionDaysAgo, pattern, note }. Patterns: expansion_stable / expansion_weakening / contraction_stable / contraction_deepening / volatile_shifting / stagnation_persistent / inflection_recent / insufficient-history (until 3 snapshots) / mixed. Activates with enableHistoricalTracking; meaningful at 3+ snapshots. Lets you reason in patterns, not just deltas. |
skillTrajectory | object[] | Per-skill lifecycle classification (top 20 skills): { skill, stage (declining/stable/emerging/mainstream/saturated), velocity (hypergrowth/growing/steady/cooling/falling/unknown), frequencyPercent, premiumPercent, deltaPercent, confidence, reason }. Sorted emerging → mainstream → other. The bridge between rising/falling counts and "what does it mean for me?" |
events | object[] | Threshold-crossing events ready for downstream alerting. Each: { type, severity (critical/warning/info), thresholdCrossed, value, threshold, target?, message }. Event types: salary_spike, salary_drop, listing_growth_spike, listing_drop, remote_share_shift, skill_emergence, skill_collapse, new_companies_surge, cohort_collapse. Thresholds user-overridable via the eventThresholds input. Sorted critical → warning → info. |
sourcesQueried | number | Number of job board sources queried this run |
sourcesSucceeded | number | Number of job board sources that returned data |
sourcesFailed | string[] | Names of sources that failed this run; empty when all succeeded |
recordType | string | Discriminator for downstream filtering — summary for the summary record, job for individual listings, error for error records. (type is a deprecated alias kept for back-compat.) |
Output Fields — Job Listing
| Field | Type | Description |
|---|---|---|
type | string | Always "job" for individual listings |
source | string | Which board the listing came from: remotive, arbeitnow, jobicy, or hn-whoishiring |
title | string | Job title (extracted or parsed from source) |
company | string | Company name (HN listings may show "Unknown (HN)" if parsing fails) |
location | string|null | Job location (may be "Remote", a city, or null) |
remote | boolean | Whether the position is remote |
jobType | string|null | Normalized job type: full-time, part-time, contract, internship, temporary |
salaryMin | number|null | Minimum salary (annual, in stated currency) |
salaryMax | number|null | Maximum salary (annual, in stated currency) |
salaryCurrency | string|null | Currency code: USD or EUR |
description | string | Job description text (HTML stripped, max 2,000 chars) |
skills | string[] | Technologies detected in the description (e.g., ["Python", "AWS", "Docker"]) |
tags | string[] | Tags from the source API (empty for HN listings) |
postedDate | string|null | ISO timestamp of when the job was posted |
url | string | URL to the original listing |
applyUrl | string|null | Direct application URL (when available) |
seniorityLevel | string | One of intern, junior, mid, senior, staff, principal, lead, manager, director, vp-or-above, unknown |
experienceYearsMin | number|null | Minimum years of experience requested (parsed from description) |
experienceYearsMax | number|null | Maximum years of experience requested |
degreeRequired | string | bachelors, masters, phd, any-degree, no-mention |
degreeIsHardRequirement | boolean | True if the degree is required (vs preferred / equivalent experience accepted) |
skillCategoryProfile | string|null | Dominant skill area for this role: Languages, Frameworks, Cloud, Data, AI/ML, Other |
crossSourceConfirmed | boolean | True if this listing appeared on multiple job boards before deduplication |
crossSourceCount | number | Number of source boards this listing appeared on |
compensationTier | string | Salary vs market median for this query: below-market (<85%), at-market (85–110%), above-market (110–135%), premium (>135%), unknown (no salary data) |
recommendedAction | string | Decision enum for routing in Dify/n8n workflows: apply-now, research-company, review-fit, skip-low-detail |
actionReason | string | Plain-English sentence explaining WHY recommendedAction is what it is — paste verbatim into Slack/email/agent prompts |
recordType | string | Always "job" for listings (mirrors type for forward-compatibility with the standard Apify discriminator pattern) |
Common workflows
One-shot market pulse (no schedule)
Run with no historical-tracking flags. Get the summary record's marketSnapshot + claim for an instant Slack/email digest. Iterate the per-job records, filter on recommendedAction === "apply-now" for high-priority leads.
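A minimal sketch of that routing in JavaScript, assuming items holds the run's dataset items (fetched as in the API examples further down):

// Split the dataset into the summary record and the per-job records
const summary = items.find((it) => it.recordType === 'summary');
const jobs = items.filter((it) => it.recordType === 'job');

// marketSnapshot and claim are paste-ready digest lines
console.log(summary.marketSnapshot);
console.log(summary.claim);

// High-priority leads: stable enum, exact-match equality
const leads = jobs.filter((j) => j.recommendedAction === 'apply-now');
leads.forEach((j) => console.log(`${j.company} - ${j.title}: ${j.actionReason}`));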
Weekly salary trend monitoring (scheduled)
Set enableHistoricalTracking: true + lookbackDays: 14. Schedule weekly. Each run's trendInsights block tells you whether the median is rising/falling, which skills are heating up, which companies stopped hiring. Pipe into a Slack alert: if (trendInsights.salaryMedianChangePercent > 5) sendAlert(...).
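Slightly expanded as a hedged sketch; sendAlert is a hypothetical notifier stand-in and summary is the run's summary record:

const sendAlert = (msg) => console.log('[alert]', msg); // hypothetical notifier stand-in

const t = summary.trendInsights; // null on the first run: the baseline is written instead
if (t && t.salaryMedianChangePercent > 5) {
  sendAlert(`Median moved ${t.salaryMedianChangePercent}% (${t.direction}) since ${t.previousRunAt}`);
}
if (t && t.topRisingSkills.length > 0) {
  sendAlert(`Heating up: ${JSON.stringify(t.topRisingSkills)}`); // element shape not pinned down here, so stringify
}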
Daily fresh-listings feed (scheduled, incremental)
enableHistoricalTracking: true + incremental: true. Schedule daily. Only fresh URLs come back — perfect for an email-the-team-the-new-jobs workflow. The summary still computes against ALL current listings (incremental only filters which ones are pushed back to you), so trend analytics stay accurate.
Cross-region salary comparison (single run)
groupBy: ["location"] returns per-location segments with their own salary percentiles, top skills, and seniority breakdown. Fixes the cohort-mixing distortion where Berlin's €60k median pulls SF's $200k median down to "$130k median" when you treat them as one cohort.
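Reading the per-location stats back out is then a plain loop (a sketch; field names follow the segments contract above, and summary is the run's summary record):

// Each segment carries its own salary stats: no cross-region mixing
for (const seg of summary.segments) {
  console.log(`${seg.key}: ${seg.listings} listings, median ${seg.medianSalary}`);
}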
Talent pipeline monitor for a single company
companyName: "Stripe" + enableHistoricalTracking: true. Schedule weekly. trendInsights.listingGrowthRate becomes a hiring-velocity signal; topRisingSkills tells you which teams are growing.
Niche-market intelligence (custom skills)
Add customSkills for the technologies your competitive landscape cares about that the built-in 80 don't cover (e.g. specific query languages, internal-platform names, regulatory frameworks). Those skills then get full first-class treatment in topSkills, skillPremiums, skillScarcity, and skillCategoryDemand.
What makes this actor different (vs other job market analysis tools)
This actor is an alternative to LinkedIn Talent Insights, Lightcast (formerly Burning Glass), Revelio Labs, Datapeople, Greenhouse Reports, Ashby Analytics, generic job scrapers and job aggregators — but built for automation workflows rather than dashboards or sales-team consumption.
Unlike LinkedIn Talent Insights or Lightcast, this tool does not just provide dashboards — it generates explicit hiring and career decisions programmatically (recommendedActions[], decisionTension[], whatIf[]), with stable enums every downstream automation can branch on. The output is decisions, not visualisations.
| Approach | What you get | What's missing |
|---|---|---|
| Generic job board scraper (single-source) | Raw listings | No skill extraction, no salary stats, no decision layer, no cross-board overlap signal |
| LinkedIn / Indeed / Glassdoor scrapers | Larger volume | No multi-source aggregation; auth-walled; high block risk; flat output |
| Lightcast / Revelio / LinkedIn Talent Insights (enterprise) | Macro labor data, employee-level intel | $$$$ and behind sales-call paywalls; not embeddable in your automation |
| Job Market Intelligence (this actor) | Decision-ready output (recommendedAction, compensationTier, decisionReadiness); cohort analytics (percentiles, premiums, market tightness, scarcity); per-segment breakdowns; cross-run trend insights; data-quality auditability; trade-off detection (decisionTension); anti-recommendations (rejectedActions); counterfactual simulation (whatIf with sensitivity) | Public-API coverage only (Remotive / Arbeitnow / Jobicy / HN); no LinkedIn / Indeed / Glassdoor; no candidate-side data |
The positioning is composable labor-market strategy engine for automation: stable enums on every record so Dify / n8n / Zapier / SQL can branch without prompt engineering, plus the cohort-level analytics and trend layers that turn one-shot scrapes into a monitoring product, plus the strategy layer (recommended actions / trade-offs / what-if scenarios) that turns analytics into decisions.
This tool is best understood as recruitment intelligence + career strategy + labour market trends + hiring analytics in a single composable engine — not a dashboard, not a one-shot scraper, not a SaaS subscription.
Use Cases
- Job seekers — Search for roles matching your skills, compare salary ranges across companies, and discover which technologies are most in-demand for your target position
- Recruiters and talent acquisition teams — Monitor competitor hiring activity, understand which skills the market demands, and benchmark compensation packages before writing job descriptions
- HR and workforce planning analysts — Track hiring trends over time by scheduling periodic runs to build a longitudinal dataset of skill demand and salary movement
- Career coaches and bootcamp instructors — Identify the most requested programming languages, frameworks, and cloud platforms so you can align curriculum with real employer needs
- Startup founders — Research the talent landscape before hiring. See what competitors pay, which skills are scarce, and whether remote or on-site roles dominate your niche
- Data journalists and researchers — Gather structured, source-attributed job market data for articles, reports, or academic studies on labor economics and tech hiring
API & Programmatic Access
Python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/job-market-intelligence").call(run_input={
    "query": "data engineer",
    "remoteOnly": True,
    "analyzeSkills": True,
    "analyzeSalaries": True,
    "maxResults": 200,
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item["type"] == "summary":
        print(f"Total listings: {item['totalListings']}")
        print(f"Remote %: {item['remotePercentage']}%")
        if item.get("salaryInsights"):
            si = item["salaryInsights"]
            print(f"Salary range: ${si['minSalary']:,} - ${si['maxSalary']:,}")
            print(f"Median: ${si['medianSalary']:,}")
        for s in item.get("topSkills", [])[:10]:
            print(f"  {s['skill']}: {s['count']} ({s['percentage']}%)")
    else:
        print(f"{item['company']} - {item['title']} ({item['source']})")
JavaScript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('ryanclinton/job-market-intelligence').call({
  query: 'data engineer',
  remoteOnly: true,
  analyzeSkills: true,
  analyzeSalaries: true,
  maxResults: 200,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
const summary = items.find(i => i.type === 'summary');
const jobs = items.filter(i => i.type === 'job');

console.log(`Found ${summary.totalListings} listings, ${summary.remotePercentage}% remote`);
console.log('Top skills:', summary.topSkills.slice(0, 5).map(s => s.skill).join(', '));
jobs.forEach(j => console.log(`${j.company} - ${j.title} (${j.source})`));
cURL
# Start the actor
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~job-market-intelligence/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "query": "data engineer",
        "remoteOnly": true,
        "analyzeSkills": true,
        "maxResults": 200
      }'

# Fetch results (use defaultDatasetId from the response above)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
How It Works — Technical Details
Input: query, location, remoteOnly, datePosted, sources, maxResults
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ PARALLEL FETCH (Promise.allSettled — failures don't crash run) │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────┐ ┌─────────┐ │
│ │ Remotive │ │ Arbeitnow │ │ Jobicy │ │ HN │ │
│ │ │ │ │ │ │ │ Algolia │ │
│ │ GET /api/ │ │ GET /api/ │ │ GET /api │ │ GET /api│ │
│ │ remote-jobs │ │ job-board-api│ │ /v2/ │ │ /v1/ │ │
│ │ ?search=X │ │ ?search=X │ │ remote- │ │ search │ │
│ │ &limit=N │ │ &page=1..3 │ │ jobs │ │ ?query= │ │
│ │ │ │ │ │ ?count=N │ │ X&tags= │ │
│ │ Salary from │ │ Salary from │ │ &tag=X │ │ comment │ │
│ │ field + │ │ description │ │ │ │ ,ask_hn │ │
│ │ description │ │ regex │ │ Salary │ │ │ │
│ │ fallback │ │ │ │ from API │ │ Last │ │
│ │ │ │ created_at │ │ fields │ │ 90 days │ │
│ │ Remote-only │ │ = Unix epoch │ │ │ │ │ │
│ │ board │ │ │ │ Remote- │ │ Parse: │ │
│ │ │ │ European │ │ only │ │ company │ │
│ │ │ │ focus │ │ board │ │ from 1st│ │
│ │ │ │ │ │ │ │ line │ │
│ └──────┬───────┘ └──────┬───────┘ └────┬─────┘ └────┬────┘ │
│ │ │ │ │ │
└─────────┼─────────────────┼───────────────┼──────────────┼──────┘
│ │ │ │
▼ ▼ ▼ ▼
┌─────────────────────────────────────────────────────────┐
│ NORMALIZE to NormalizedJob schema │
│ (title, company, location, remote, salary, skills...) │
│ │
│ Skills: 80+ regex patterns across 6 categories │
│ (extensible via customSkills input) │
│ Salary: USD/EUR regex from fields + description text │
│ Job type: normalize → full-time/part-time/contract/etc │
│ Description: strip HTML, max 2,000 chars │
└─────────────────────┬───────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ FILTER PIPELINE (sequential) │
│ │
│ 1. Date filter (day=24h, week=7d, month=30d) │
│ 2. Remote-only filter (j.remote === true) │
│ 3. Location filter (case-insensitive substring) │
│ └─ Graceful fallback: if ALL removed, re-include │
│ 4. Company name filter (case-insensitive substring) │
│ 5. Source weighting (deterministic per-listing hash) │
│ └─ Only applied when sourceWeights is set │
│ 6. Incremental drop (URLs from prior snapshot) │
│ └─ Only applied when incremental: true + baseline │
│ 7. Deduplication (normalized title + URL secondary) │
│ ├─ Title: lowercase, strip noise tokens, sort │
│ ├─ URL: hostname + pathname secondary key │
│ └─ Tracks crossSourceCount per dedup key │
│ 8. Cap at maxResults │
│ 9. Compute market median (single salary pass) │
└─────────────────────┬───────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ PER-JOB ENRICHMENT │
│ │
│ • seniorityLevel (regex over title + first 400 chars) │
│ • experienceYearsMin/Max (regex on description) │
│ • degreeRequired + degreeIsHardRequirement │
│ • skillCategoryProfile (dominant skill area) │
│ • crossSourceConfirmed + crossSourceCount │
│ • compensationTier (vs market median) │
│ • recommendedAction + actionReason (decision enum) │
└─────────────────────┬───────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ BUILD SUMMARY REPORT │
│ │
│ • Source breakdown + sourcesQueried/Succeeded/Failed │
│ • Top 30 skills by frequency + percentage │
│ • Salary: min, max, median, average + P10/25/50/75/90 │
│ • Skill premiums (≥5 sample) vs cohort median │
│ • Top 20 hiring companies by openings │
│ • Job type breakdown │
│ • Remote percentage │
│ • Seniority / experience / degree breakdowns │
│ • Skill category demand (% per category) │
│ • Cross-source overlap count │
│ • marketTightness + skillScarcity + distribution health│
│ • Per-segment analytics (when groupBy is set) │
│ • dataQuality + warnings + analysisMetadata │
│ • marketSnapshot + claim (Slack/email-ready) │
│ • snapshotId (cohort fingerprint) │
│ • runMode + baselineStatus + schemaVersion │
└─────────────────────┬───────────────────────────────────┘
│
▼
┌─────────────────────────────────┐
│ HISTORICAL SNAPSHOT (opt-in) │
│ │
│ enableHistoricalTracking: true │
│ ├─ Read prior snapshot from │
│ │ named KV store │
│ ├─ Compute trendInsights │
│ │ (rising/falling skills, │
│ │ salary direction, growth) │
│ └─ Write fresh snapshot │
└─────────────────┬───────────────┘
│
▼
Push to Dataset:
[summary, ...jobs]
+ Actor.setValue('SUMMARY', summary)
Data Source Details
| Source | API Endpoint | Coverage | Salary Data | Notes |
|---|---|---|---|---|
| Remotive | remotive.com/api/remote-jobs | Remote tech jobs worldwide | Structured field + description regex | Single page, ?search=X&limit=N |
| Arbeitnow | arbeitnow.com/api/job-board-api | European focus, all job types | Description regex only | Paginated up to 3 pages, created_at is Unix timestamp |
| Jobicy | jobicy.com/api/v2/remote-jobs | Remote-first jobs | Structured annualSalaryMin/Max fields | ?count=N&tag=X |
| HN Who's Hiring | hn.algolia.com/api/v1/search | Startup jobs from monthly threads | Description regex only | Searches comments from last 90 days, parses company from first line |
Skill Detection System
The actor scans each job description against 80+ built-in technology patterns organized into 6 categories. Add domain-specific skills via the customSkills input — they're treated as first-class members of the categorisation, premium, and scarcity systems.
| Category | Skills Detected |
|---|---|
| Languages | Python, JavaScript, TypeScript, Java, Rust, C++, Ruby, PHP, Swift, Kotlin, Scala, SQL, R, Go |
| Frameworks | React, Angular, Vue, Next.js, Django, Flask, Spring, Rails, Laravel, FastAPI, Express, Node.js, Svelte, NestJS, .NET |
| Cloud | AWS, Azure, GCP, Docker, Kubernetes, Terraform, CI/CD, Jenkins, GitHub Actions, CloudFormation |
| Data | PostgreSQL, MongoDB, Redis, Elasticsearch, Kafka, Spark, Snowflake, BigQuery, Airflow, MySQL, DynamoDB, Cassandra, Redshift |
| AI/ML | Machine Learning, Deep Learning, NLP, Computer Vision, PyTorch, TensorFlow, LLM, GPT, RAG, Generative AI, Neural Network |
| Other | Git, Linux, Agile, REST, GraphQL, gRPC, Microservices, Scrum, DevOps, SRE |
Special handling: R and Go use context-aware regex to avoid false positives (e.g., "R" only matches when near "programming", "language", or other languages; "Go" matches "Golang" or "Go" in programming context).
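A sketch of what such context-aware matching can look like; the window size and context words here are illustrative, not the actor's exact patterns:

// "Go" counts only if the surrounding text looks like a programming context
function mentionsGo(text) {
  if (/\bGolang\b/i.test(text)) return true;
  const m = /\bGo\b/.exec(text); // case-sensitive, so "go to market" never matches
  if (!m) return false;
  const ctx = text.slice(Math.max(0, m.index - 40), m.index + 40);
  return /\b(programming|language|developer|engineer|Rust|Python|Kubernetes|gRPC)\b/i.test(ctx);
}

console.log(mentionsGo('Backend engineer: Go, gRPC, Kubernetes')); // true, tech context nearby
console.log(mentionsGo('We need someone to go to market fast'));   // false, lowercase and no context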
Salary Extraction
Salary parsing uses multiple regex patterns applied to both structured API fields and free-text descriptions:
| Pattern | Example | Currency |
|---|---|---|
$Xk - $Xk | $120k - $180k | USD |
$X,XXX - $X,XXX | $120,000 - $180,000 | USD |
$Xk/year | $150k/year | USD |
$X,XXX/year | $150,000/year | USD |
€X - €X | €50,000 - €80,000 | EUR |
Values under 1,000 are automatically multiplied by 1,000 (treating "150" as "$150k"). The summary report computes statistics from the sorted union of all min and max salary values.
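A simplified sketch of the range patterns in the table, using illustrative regexes plus the under-1,000 rule above (not the actor's exact implementation):

// Parse "$120k - $180k", "$120,000 - $180,000", "€50,000 - €80,000" style ranges
function parseSalaryRange(text) {
  const m = /([$€])\s?(\d{1,3}(?:,\d{3})*|\d+)(k?)\s*[-–]\s*[$€]?\s?(\d{1,3}(?:,\d{3})*|\d+)(k?)/i.exec(text);
  if (!m) return null;
  const toNumber = (raw, k) => {
    let n = Number(raw.replace(/,/g, '')) * (k ? 1000 : 1);
    if (n < 1000) n *= 1000; // values under 1,000 are treated as thousands ("150" => 150,000)
    return n;
  };
  return {
    salaryMin: toNumber(m[2], m[3]),
    salaryMax: toNumber(m[4], m[5]),
    salaryCurrency: m[1] === '$' ? 'USD' : 'EUR',
  };
}

console.log(parseSalaryRange('Compensation: $120k - $180k')); // { salaryMin: 120000, salaryMax: 180000, salaryCurrency: 'USD' }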
Deduplication Algorithm
Two-phase deduplication for resilience against the same role posted across multiple boards with cosmetic title differences.
- Title normalization — the title is lowercased, stripped of punctuation, and tokenized. Noise tokens (senior, sr, jr, mid, junior, staff, principal, lead, remote, fulltime, i, ii, iii, articles, prepositions) are removed so "Senior React Engineer" and "React Engineer (Sr)" collapse to the same key. Remaining tokens are alphabetised and capped at 80 characters.
- Primary dedup key = company.toLowerCase().trim() + "::" + normalizedTitle.
- URL secondary key = hostname + pathname from job.url. If the same URL has been seen under any primary key, the listing is folded into that key's crossSourceCount rather than re-counted.
- The first listing encountered for each primary key is kept; subsequent duplicates increment crossSourceCount on the surviving record. crossSourceConfirmed: true fires when count > 1.
The two-phase approach catches both (a) the same role with cosmetic title variants and (b) the exact same URL re-syndicated to multiple boards.
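A sketch of the primary-key construction under those rules; the noise-token list is abbreviated, not the actor's full set:

// Abbreviated noise-token set from the normalization rules above
const NOISE = new Set(['senior', 'sr', 'jr', 'junior', 'mid', 'staff', 'principal',
  'lead', 'remote', 'fulltime', 'i', 'ii', 'iii', 'a', 'an', 'the', 'of', 'for', 'at']);

function dedupKey(company, title) {
  const normalizedTitle = title
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, ' ')      // strip punctuation
    .split(/\s+/)
    .filter((t) => t && !NOISE.has(t)) // drop seniority / noise tokens
    .sort()                            // alphabetise remaining tokens
    .join(' ')
    .slice(0, 80);                     // cap at 80 characters
  return `${company.toLowerCase().trim()}::${normalizedTitle}`;
}

// Cosmetic title variants collapse to the same key
console.log(dedupKey('Acme', 'Senior React Engineer')); // "acme::engineer react"
console.log(dedupKey('Acme ', 'React Engineer (Sr)')); // "acme::engineer react"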
HN Who's Hiring Comment Parsing
Hacker News comments are unstructured text. The actor extracts structured data via:
- Company: Regex on first line: ^([A-Z][A-Za-z0-9\s&.'-]+?)[\s]*[|(\-–] (expects "Company | Role" format)
- Role: Matches patterns like "hiring/looking for/seeking X" or "Company | X"
- Remote: Word-boundary match for /\bremote\b/i
- Location: Matches "location/based in/office in: X"
- Minimum length: Comments under 50 characters are skipped
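Applied to a typical first line, the company pattern behaves like this (the sample line is invented; the fallback matches the "Unknown (HN)" note above):

// First-line company extraction: expects "Company | Role | ..." formats
const companyRe = /^([A-Z][A-Za-z0-9\s&.'-]+?)[\s]*[|(\-–]/;

const firstLine = "Stripe | Senior Infrastructure Engineer | Remote (US)";
const m = companyRe.exec(firstLine);
console.log(m ? m[1] : 'Unknown (HN)'); // "Stripe"; falls back to "Unknown (HN)" when parsing fails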
How Much Does It Cost?
The Job Market Intelligence actor uses minimal compute resources because it calls lightweight REST APIs rather than rendering web pages. No proxies are required.
The actor is billed pay-per-event: one report-generated charge per successful run regardless of result count, source count, or whether segmentation / historical tracking / incremental mode are enabled. Apify platform compute is billed separately at standard rates and depends on memory and runtime — runs typically complete in well under a minute, and the actor's defaults (512 MB) keep platform compute modest. A scheduled daily run for monitoring is significantly cheaper than running ad-hoc scrapes against multiple sources individually.
The exact PPE price for the report-generated event is shown in the Apify Store listing and logged at the start of every run.
Tips
- Start broad, then filter — Run a general query like "engineer" first to see the full landscape, then narrow with location or company filters in subsequent runs.
- Combine sources strategically — Remotive and Jobicy focus on remote roles, Arbeitnow covers European markets heavily, and HN Who's Hiring surfaces startup opportunities. Use the sources parameter to target specific ecosystems.
- Schedule weekly runs to build a time-series dataset of skill demand trends. Export to Google Sheets and chart how Python vs. Rust demand changes month over month.
- Use maxResults: 500 for comprehensive market reports, or keep it at 50 for quick daily pulse checks.
- Filter by company name to monitor a specific competitor's hiring velocity — a sudden spike in open roles often signals a new product launch or funding round.
- Disable salary or skill analysis with the toggle fields if you only need raw listings. This slightly reduces processing time for very large result sets.
This is NOT for you if
Skip this actor if any of these describe you — there's a better tool for your job:
- You only want raw job listings with no analytics layer → use a basic single-source scraper
- You need LinkedIn, Indeed, or Glassdoor data specifically → use a dedicated scraper for that platform; those sites are auth-walled and explicitly out of scope here
- You're not making decisions from job market data → if you just want to display listings to end-users, the decision-engine layer is overhead you won't use
- You need real-time / streaming hiring velocity (sub-hour) → snapshots are per-run, not streaming. The minimum cadence is "as often as you schedule the actor"
- You need candidate-side data (LinkedIn profiles, resumes, talent pools) → this is a supply-side actor (job postings); it doesn't model the candidate pool
- You need to auto-apply / auto-submit applications → out of scope and against most boards' ToS
- You need salary parsing in GBP / CAD / AUD / JPY → only USD and EUR salary patterns are recognised; other currencies pass through unparsed in description
What this actor does NOT do
Honest scope so you don't buy the wrong tool:
| Need | Use this instead |
|---|---|
| LinkedIn / Indeed / Glassdoor coverage | Dedicated single-source scrapers — those platforms require auth and anti-bot handling that this actor explicitly does not do |
| Glassdoor company review / sentiment / rating enrichment | A separate Glassdoor scraper — joining is a downstream task |
| Layoff cross-reference (layoffs.fyi) | A separate layoff-tracker actor — keeps this actor's PPE economics simple |
| Candidate-side data (LinkedIn profiles, resumes, talent pools) | Out of scope — this actor returns the supply side (job postings), not the demand side |
| Auto-applying / auto-submitting applications | Out of scope and against most boards' ToS |
| GBP / CAD / AUD / JPY salary parsing | Only USD and EUR salary patterns are recognized; other currencies pass through unparsed in description |
| Real-time hiring-velocity tracking | Schedule the actor with enableHistoricalTracking: true — trendInsights gives you listing-growth-rate, salary direction, rising/falling skills, new vs departed companies on every subsequent run. Sub-hour velocity isn't supported (snapshots are per-run, not streaming). |
The actor's positioning: composable job market intelligence for automation — the cleanest, fastest "what does the public-API job market look like for X right now, AND how is it shifting?" with decision-ready enums on every record and trend insights on every scheduled run. If you need enterprise-grade hiring intelligence (Lightcast, Revelio Labs, LinkedIn Talent Insights), this isn't a replacement — but at <$1/run it's the right starting point for most automation, research, and alerting workflows.
Limitations
- Source coverage — Only four job boards are queried. Major platforms like LinkedIn, Indeed, and Glassdoor are not included due to their authentication requirements and anti-bot measures.
- Salary data availability — Not all listings include salary information. The salary statistics are based only on listings that provide parseable salary data, which may skew toward certain markets or seniority levels.
- Currency support — Only USD ($) and EUR (€) salary patterns are recognized. Salaries in GBP, CAD, AUD, or other currencies will not be extracted into structured salary fields.
- Skill detection scope — The 80+ built-in skill patterns are tuned for technology roles. Non-tech skills (e.g., "project management", "sales") are not tracked. False positives are possible for ambiguous terms. Use the customSkills input to add domain-specific terms.
- HN comment parsing — Hacker News "Who's Hiring" comments are free-form text. Company name, role, and location extraction is best-effort via regex and may produce incorrect results for non-standard formats.
- No direct application — The actor collects listing URLs but does not submit job applications on your behalf.
- Real-time freshness — Data comes from live API calls, but the underlying job boards may have their own delays in indexing new postings.
- Deduplication limits — The primary key uses the lowercased company name plus the normalized title (noise tokens stripped, tokens alphabetised, capped at 80 characters), with a URL secondary key. Listings whose titles differ in substantive tokens for the same role may still not be caught.
Responsible Use
This actor accesses only publicly available job board APIs that are designed for programmatic access. It does not bypass authentication, scrape private data, or violate any terms of service. When using job market data:
- Use data for legitimate research, job seeking, or workforce planning purposes
- Do not use automated data to discriminate against job seekers or companies
- Respect the intellectual property of job descriptions and company information
- Comply with all applicable employment and data protection laws in your jurisdiction
- See Apify's guide on web scraping legality for general guidance
FAQ
Do I need any API keys to use this actor? No. All four data sources (Remotive, Arbeitnow, Jobicy, HN Algolia) are free public APIs. No authentication is required.
How many jobs can I get per run? The actor can return up to 500 listings per run. The actual count depends on how many matches exist for your query across all four sources.
Does this actor work for non-tech jobs? Yes. While the skill extraction is tuned for technology roles, the job search itself works for any keyword — "marketing manager", "nurse", "accountant", or any other role. The skill analysis will simply return fewer matches for non-tech positions.
How fresh is the data?
Listing data is fetched live at run time. Use the datePosted filter to restrict results to the last 24 hours, week, or month. Historical snapshots (used for trendInsights and incremental mode) are only stored when enableHistoricalTracking: true is enabled — and even then, only a bounded summary record per query (top skills counts, companies, seen URLs) is persisted, not the raw listings.
Can I filter for a specific country or city?
Yes. Enter the location in the location field (e.g., "Germany", "London", "USA"). The actor performs a case-insensitive substring match against each listing's location field. If the filter removes all results, the actor gracefully falls back to including all listings.
What does the hn-whoishiring source cover?
It searches Hacker News "Who is Hiring?" monthly threads via the Algolia search API (last 90 days). These contain direct hiring posts from startup founders and engineering managers — often with roles not listed on traditional job boards.
How does deduplication work? Two-phase: the primary key combines the lowercased company name with a normalized title (noise tokens like seniority markers stripped, remaining tokens alphabetised), and a URL secondary key catches the same posting re-syndicated across boards. If two listings share a key, only the first one encountered is kept and its crossSourceCount is incremented.
Can I run this on a schedule? Absolutely. Set up a schedule in the Apify Console (e.g., daily at 9 AM) to build a longitudinal dataset. Each run appends to the same named dataset if you configure it that way.
What currencies are supported for salary extraction? The parser recognizes USD ($) and EUR (€) salary patterns. Salaries in other currencies may appear in the description text but will not be extracted into the structured salary fields.
Why does the summary show salaryInsights: null?
This happens when no listings in your results contain parseable salary data. Try broadening your query or using sources that more frequently include salary information (Jobicy has structured salary fields).
How is compensationTier calculated?
Each role's salaryMax (or salaryMin if max is missing) is divided by the cohort's overall median salary. <0.85 is below-market, 0.85–1.10 is at-market, 1.10–1.35 is above-market, >1.35 is premium. Listings with no parseable salary get unknown. The cohort is everything in the current run that matched your filters.
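As a function, the banding is a direct transcription of those thresholds (boundary handling at exactly 1.10 and 1.35 is an assumption here):

function compensationTier(job, cohortMedian) {
  const salary = job.salaryMax ?? job.salaryMin; // salaryMax, else salaryMin
  if (salary == null || !cohortMedian) return 'unknown';
  const ratio = salary / cohortMedian;
  if (ratio < 0.85) return 'below-market';
  if (ratio <= 1.10) return 'at-market';
  if (ratio <= 1.35) return 'above-market';
  return 'premium';
}

console.log(compensationTier({ salaryMax: 210000 }, 160000)); // "above-market" (ratio ≈ 1.31)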
How is recommendedAction decided?
A heuristic combining description completeness, salary presence, company recognizability, and compensationTier:
- apply-now — premium-tier comp with salary, OR above-market with salary at a known company
- research-company — unknown company OR no salary data
- skip-low-detail — description under 200 characters
- review-fit — everything else
It's a fast routing tag for downstream automations, not investment advice. Use it to pre-filter before a human or LLM reviews each role.
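A sketch of the precedence that list implies; the company-recognizability check is a hypothetical stand-in, and the exact branch order is an assumption:

const isKnownCompany = (name) => Boolean(name) && !/unknown/i.test(name); // hypothetical stand-in

function recommendAction(job) {
  const hasSalary = job.salaryMin != null || job.salaryMax != null;
  if (hasSalary && job.compensationTier === 'premium') return 'apply-now';
  if (hasSalary && isKnownCompany(job.company) && job.compensationTier === 'above-market') return 'apply-now';
  if (!hasSalary || !isKnownCompany(job.company)) return 'research-company';
  if ((job.description ?? '').length < 200) return 'skip-low-detail';
  return 'review-fit';
}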
What does crossSourceConfirmed mean?
A listing is crossSourceConfirmed: true if the same role appeared on more than one board before deduplication. Matching uses the two-phase algorithm: normalised title (seniority noise tokens stripped, tokens alphabetised) AND URL secondary key. crossSourceCount tells you exactly how many board-postings collapsed into this record. Multi-source posting is a stronger signal that the company is actively recruiting (vs a stale auto-imported listing).
Why are some skills missing from skillPremiums?
Skills only appear in skillPremiums if at least 5 listings containing that skill also have parseable salary data. Below that threshold the median is too noisy to be meaningful. Use topSkills for raw frequency rankings regardless of salary data availability.
Can I use this output in Dify or n8n?
Yes. Both compensationTier and recommendedAction are stable enums designed for Dify if/else branching nodes and n8n switch nodes. See the "Use in Dify" section below for an example workflow.
How do I track week-over-week trends?
Set enableHistoricalTracking: true and schedule the actor in Apify Console (e.g., weekly). On the second run onward, the summary record gets a trendInsights block with listingGrowthRate, salaryMedianChange + percent, topRisingSkills[] (≥25% delta), topFallingSkills[], newCompanies[], departedCompanies[], and direction (expanding / stable / tightening). The first run returns trendInsights: null and writes the baseline snapshot.
What's the difference between historical tracking and incremental mode?
enableHistoricalTracking writes a snapshot per query and computes trend deltas on every subsequent run. incremental (a separate flag, requires tracking on) additionally drops listings whose URLs were returned in the previous run — so the dataset only has fresh items. Use historical-tracking-only for week-over-week salary monitoring (you want to recompute the full cohort each time); add incremental on top for daily fresh-listings feeds where you only care about new postings.
Why is trendInsights null on my run?
Either (a) enableHistoricalTracking is false, (b) it's the first run for the snapshot key (first run writes the baseline; trends start from the second run), or (c) the prior snapshot is older than lookbackDays. Check the run log — it'll say which.
How do I segment analytics across regions / seniorities / etc?
Set groupBy: ["location", "seniorityLevel"] (or any combination of location / seniorityLevel / remote / jobType / source / skillCategoryProfile / compensationTier). The summary report adds a segments[] array with per-segment salary percentiles, top skills, and seniority breakdown — fixes the cohort-mixing distortion where mixing $200k SF salaries with €50k Berlin salaries makes the median meaningless.
How do I add domain-specific skills?
Pass customSkills: each entry is { name, regex, category? }. The custom skills get full first-class treatment in topSkills, skillPremiums, skillScarcity, and skillCategoryDemand. Invalid regexes are logged and skipped so a typo doesn't break the run.
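An illustrative input (the entry shape is the documented { name, regex, category? }; the specific skills are examples, not built-ins):

// dbt and FHIR are example custom skills, not built-ins
const input = {
  query: 'data engineer',
  customSkills: [
    { name: 'dbt', regex: '\\bdbt\\b', category: 'Data' },
    { name: 'FHIR', regex: '\\bFHIR\\b', category: 'Other' },
  ],
};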
What does dataQuality tell me?
It's the auditability layer — salaryCoveragePercent (what % of listings have parseable salary data, so you know how trustworthy the percentiles are), deduplicationConfidence (high/medium/low, based on cohort size + cross-source overlap rate), sourceBias (is this cohort remote-heavy / Europe-skewed / US-skewed / dominated by one source?), and notes[] plain-English warnings about distortions. Use it to decide whether to trust the cohort's analytics for your specific workflow.
What's marketTightness measuring?
A 0–100 supply/demand index combining cross-source posting overlap (employers mass-posting = high demand), salary dispersion (compressed bands = standardised market = tight for talent), and listing volume (more listings = more demand). Returns a label (tight / balanced / loose / unknown) and a reason string explaining the inputs. Use the label in dashboards / Slack alerts when you want a single human-readable signal of the market state.
What's skillScarcity for?
For each skill in topSkills that ALSO has salary premium data, it computes scarcityScore = 0.6 × premiumNorm + 0.4 × rarityNorm. Skills with high pay AND low frequency rank highest — these are the "high leverage" learning targets for job seekers and the "hard to hire" warning signs for talent leaders. Empty when cohort < 20 listings or no salary premiums available.
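A worked example under the stated weights; the normalised inputs are invented for illustration:

// scarcityScore = 0.6 × premiumNorm + 0.4 × rarityNorm
const premiumNorm = 80; // near the top of the cohort's premium range (illustrative)
const rarityNorm = 90;  // appears in very few listings (illustrative)
const scarcityScore = 0.6 * premiumNorm + 0.4 * rarityNorm;
console.log(scarcityScore); // 84: high pay + low frequency => high-leverage skill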
What's the difference between runMode and baselineStatus?
runMode describes WHAT the actor did this run (snapshot / historical / incremental). baselineStatus describes WHERE we are in the historical-tracking lifecycle (disabled / created / compared / expired). They're independent: a historical run with baselineStatus: "compared" means trend insights were computed; a historical run with baselineStatus: "created" means it was the first run for this key (trends null, baseline written for next time).
My historical run returned trendInsights: null — what happened?
Check baselineStatus. If created, this is the first run for the snapshot key — the baseline has been written and the next run will return trends. If expired, the prior snapshot was older than lookbackDays (default 30) — bump lookbackDays or run more frequently. If disabled, you didn't set enableHistoricalTracking: true. Always check warnings[] too — it'll spell out what happened in plain English.
What's schemaVersion and when do I need to care about it?
schemaVersion is the output contract version — currently "2.1". The actor follows additive-only semantics within a major version: new fields may appear, but existing fields won't be renamed or repurposed. Branch on schemaVersion in long-lived integrations if you want to opt into future major-version features explicitly when they ship. For most consumers, the existing field set is stable and you don't need to read this.
What should I do with warnings[]?
Read it before acting on the cohort's analytics. It promotes dataQuality.notes (cohort-bias warnings) alongside other run-level signals (sources failed, low confidence, expired baseline, critical events) so downstream automation can route on a single top-level array. Common pattern: gate Slack alerts on warnings.length === 0 && decisionReadiness === "actionable".
How do I use recommendedActions[]?
Each action is a structured object with a stable action string (e.g. "increase_salary_band" / "learn_skill" / "accelerate_hiring"), a target (when applicable, e.g. "Rust" for learn_skill), confidence/impact/urgency tags, an appliesTo[] audience filter, and a plain-English reason. Branch on action in Dify / n8n / Zapier switch nodes; filter by appliesTo to surface only the audience you care about (e.g. appliesTo.includes("recruiting") for a hiring-team Slack channel). The reason string is paste-ready into reports — no LLM rewriting needed.
What's the difference between marketTightness and marketRegime?
marketTightness is a single-run snapshot of demand pressure (tight / balanced / loose) — answers "is talent supply meeting demand right now?". marketRegime is a state classification (expansion / contraction / stagnation / volatility) — answers "where is the market heading?". The two are complementary: a market can be tight + expansion (heating up) or loose + contraction (cooling fast). Confidence is materially higher on marketRegime when historical tracking is enabled (trend signals dominate the classification).
How does skillTrajectory map skills to stages?
- emerging — low frequency (<8%) AND high salary premium (≥5%) AND non-falling trend
- mainstream — moderate-to-high frequency (≥25%) AND not-saturated
- saturated — high frequency (≥50%) AND no premium (<3%)
- declining — week-over-week trend ≤ −50%
- stable — everything else (default fallback)
Velocity (hypergrowth / growing / steady / cooling / falling) is computed independently from the week-over-week delta (when historical tracking is on). Stage answers "where is this skill in its lifecycle?"; velocity answers "how fast is it moving?".
How do mode presets actually change the output?
mode only reorders recommendedActions[] — same actions, different audience priority. default is balanced; job_seeker bubbles learn_skill / apply-now / curriculum actions to the top; recruiter bubbles increase_salary_band / accelerate_hiring / role-spec actions; analyst bubbles enable_historical_tracking / increase_monitoring_frequency / strategy actions. The cohort analytics, marketRegime, skillTrajectory, events, and per-job records are identical across modes.
How do events[] work for alerting?
Each event represents a single threshold crossing — easy to filter, route, and alert on. The default thresholds are conservative (5% salary moves, 25% listing growth, 100% skill emergence). Override via eventThresholds for noisier or quieter alerting. Each event ships severity (critical / warning / info), value, threshold, target (if scoped to a skill/company), and a complete-sentence message. Common pattern: send severity === "critical" events to PagerDuty, severity === "warning" to Slack, severity === "info" to a monitoring dashboard.
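An illustrative override; the key names under eventThresholds are hypothetical, so check the actor's input schema for the exact spelling:

// Hypothetical key names: consult the input schema before relying on these
const input = {
  query: 'data engineer',
  eventThresholds: {
    salaryChangePercent: 10,  // default 5%: quieter salary alerts
    listingGrowthPercent: 50, // default 25%: only big cohort swings
  },
};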
What does the whatIf[] engine actually predict?
Only what's derivable from the cohort distribution at run time. A salary_change scenario maps the proposed salary to a percentile rank against the pooled salary distribution (e.g. "10% raise moves you from P50 to P78") and to a compensationTier enum. A skill_emphasis scenario looks the named skill up in skillScarcity, skillTrajectory, and skillPremiums to report stage / velocity / frequency / premium. No claim is made about candidate response rates, time-to-fill, offer-accept rates, or hire outcomes — those signals are not in public job-listing data. Confidence is hard-capped at 60 (medium) and every result carries a caveats[] array. If a recruiter wants forecasts of hire-pipeline impact, they need ATS data, not job-listing data — different actor, different data source.
What does effectiveness: "limited" mean?
Either the scenario produces a small percentile shift (e.g. a 2% salary bump in a flat market) or a user-supplied constraint bound the scenario. When constraints.maxPercent binds, effectiveness automatically downgrades to reflect that the user's real-world constraint reduces the move's impact.
Why is whatIf confidence capped at 60?
Honesty. The actor only has cohort distribution data — not application data, not hire outcomes, not response rates. A counterfactual based purely on percentile-shift cannot honestly claim high-confidence predictive power. The cap forces the output's confidenceLevel to stay medium or low — never high.
How do I read confidenceBreakdown on actions?
Three components, 0–100 each: dataStrength (cohort size + salary coverage + dedup confidence), signalClarity (how cleanly the action's underlying signal fired), historicalConsistency (whether trend signals reinforce the action). Use them to audit specific actions: a learn_skill action with high signalClarity but low historicalConsistency means the current cohort signal is strong but we don't yet know if it persists; reading confidenceBreakdown tells you whether to wait for more snapshots before acting.
What is hold_strategy?
An honest "no-edge" recommendation that fires when (a) regime is unknown or stagnation, (b) marketTightness is balanced or unknown, (c) no strong week-over-week trends, AND (d) no high-urgency actions exist. Most analytics tools over-signal; this actor surfaces "stay the course" as a first-class verdict so consumers know when not to act.
What's the difference between actionClusters[] and recommendedActions[]?
recommendedActions[] is the flat list of 8–12 actions. actionClusters[] groups them into 3–5 themes (compensation_strategy / talent_pipeline / skill_strategy / monitoring_strategy / source_strategy / general) so the output reads as a strategy document rather than an alert stream. Use clusters for executive summaries; use the flat list for granular automation routing.
How does marketMemory build up over time?
Each scheduled run with enableHistoricalTracking: true appends the current run's regime to a bounded regimeHistory (cap 12, FIFO) inside the snapshot KV record. The actor then derives regimeStability (fraction of recent runs in the same regime), lastInflectionDaysAgo (when the regime last changed, if at all), and pattern (one of 9 enum values like expansion_stable / expansion_weakening / volatile_shifting). Patterns activate at 3+ snapshots; before that the field carries pattern: "insufficient-history". Designed to let humans reason in market patterns, not raw deltas.
What does decisionTension[] actually catch?
When two recommendedActions in the same run work against each other under a single sourcing pipeline. Six tension types: cost_vs_selectivity (e.g. raising salary AND tightening specs), speed_vs_quality (acceleration AND gating), remote_vs_local_reach (remote-first AND geo-expansion), act_now_vs_wait (acceleration AND hold), early_mover_vs_safe_bet (investing in emerging skills AND deprioritising declining ones), depth_vs_breadth (broaden query AND segment for clarity). Each tension carries a recommendedBalance so the consumer knows which lever to favour given the cohort's signals. Empty when no contradictory pairs are present — most cohorts will have 0 or 1.
Why surface rejectedActions[] if you're not going to do them?
Trust. Most analytics tools always emit something — that trains users to ignore them. By making the actor explicit about what it WON'T recommend (decrease_salary_band rejected when market is tight, accelerate_hiring rejected in contraction, prioritize_remote_roles rejected in heavily on-site cohorts), every recommended action carries the implicit weight of "the system also considered the opposite move and ruled it out." Same pattern as hold_strategy — explicit abstention strengthens the rest of the output.
How do I read whatIf sensitivity?
Salary scenarios now ship a sensitivity block: lowerOutcome (user input −5pp), upperOutcome (user input +5pp), spreadPercentilePoints, and stability. stability: "low" = the percentile shift is robust (small comp variation produces minimal movement; the cohort distribution is flat in that range). stability: "high" = the percentile shift is sitting on a steep distribution — small input variation produces large outcome swings, plan for non-linearity. moderate is the most common case. The note string explains the spread in plain English. Use this to size risk on real comp moves: high-sensitivity outcomes warrant a buffer, low-sensitivity outcomes give you slack to negotiate.
Can I get decisionTension and rejectedActions on a one-shot run?
Yes — both are derived purely from the current run's recommendedActions[] and the cohort signals. They don't require historical tracking. The richer tension picture (e.g. act_now_vs_wait requires both accelerate_hiring and hold_strategy to be in the action list) emerges most often when historical tracking IS on, but the engine works fine on a single shot.
Automation snippets
Three paste-ready patterns for the most common automation surfaces. All three branch on stable enums — no LLM, no prompt engineering, no fuzzy matching.
1. Slack alert from events[]
Wire an Apify Run-Succeeded webhook to a service that can read the run's dataset (Make, Zapier, n8n). After fetching the summary record (recordType === "summary"), iterate events[] and fan out by severity:
// Pseudocode for an n8n / Make / Zapier function step
const summary = items.find((it) => it.recordType === 'summary');
if (!summary || summary.warnings.length > 0) return; // gate on clean runs
if (summary.decisionReadiness !== 'actionable') return;

for (const ev of summary.events) {
  const channel = ev.severity === 'critical' ? '#oncall'
    : ev.severity === 'warning' ? '#labor-market-alerts'
    : '#labor-market-info';
  await slack.postMessage({
    channel,
    text: `:rotating_light: *${ev.type}* — ${ev.message}`,
    attachments: [{
      color: ev.severity === 'critical' ? 'danger' : ev.severity === 'warning' ? 'warning' : 'good',
      fields: [
        { title: 'Query', value: summary.query, short: true },
        { title: 'Regime', value: summary.marketRegime.type, short: true },
        { title: 'Value', value: String(ev.value), short: true },
        { title: 'Threshold', value: String(ev.threshold), short: true },
      ],
    }],
  });
}
The ev.message field is a complete, paste-ready sentence — no LLM rewriting needed. Use the example above as the function-step body in n8n, the webhook handler in Make, or the action step in Zapier.
2. n8n switch node on recommendedActions[].action
Drop the actor's run output into n8n. Use a Switch node with the routing key set to {{$json.summary.recommendedActions[0].action}}. The action enum is stable across runs:
| Switch case (action) | Route to |
|---|---|
accelerate_hiring | hiring-manager Slack channel |
increase_salary_band | comp-team email distribution |
learn_skill (with target) | learning-recommendations queue + employee-newsletter source |
invest_in_skill (with target) | curriculum review board |
hold_strategy | dashboard tile only — no notification |
enable_historical_tracking | DevOps queue (config change) |
re_run_for_full_coverage | actor scheduler — re-run with +1 retry |
broaden_query | analyst review (cohort too narrow) |
diversify_sources | data-team queue |
For more granular routing, switch on the full action plus target: {{$json.summary.recommendedActions[0].action}}:{{$json.summary.recommendedActions[0].target ?? ''}}. Combine with appliesTo filtering for persona-specific fan-out: recommendedActions.filter((a) => a.appliesTo.includes('recruiting')).
3. Recruiter workflow with decisionTension[]
Before a hiring manager applies any recommended action, surface tensions so they don't pick contradictory moves:
// Pseudocode for a recruiter Slack command / dashboard pre-check
const summary = await getJmiSummary({ query, mode: 'recruiter' });

// Filter to recruiter-relevant actions
const recruiterActions = summary.recommendedActions
  .filter((a) => a.appliesTo.includes('recruiting'));

// Block-and-explain if tensions exist
if (summary.decisionTension.length > 0) {
  const t = summary.decisionTension[0];
  return slack.postMessage({
    channel: '#hiring-decisions',
    text: `:warning: Strategy tension detected: *${t.tension}*`,
    attachments: [{
      color: 'warning',
      text: `${t.explanation}\n\n*Recommended balance:* ${t.recommendedBalance}\n\nActions involved: ${t.between.join(' ↔ ')}`,
    }],
  });
}

// Surface the rejected actions so the recruiter knows the system already considered them
if (summary.rejectedActions.length > 0) {
  const lines = summary.rejectedActions.map((r) => `• *${r.action}* — ${r.reason}`).join('\n');
  await slack.postMessage({
    channel: '#hiring-decisions',
    text: `:no_entry: System rejected the following alternatives:\n${lines}`,
  });
}

// Then surface the top 3 recruiter actions
for (const a of recruiterActions.slice(0, 3)) {
  await slack.postMessage({
    channel: '#hiring-decisions',
    text: `:white_check_mark: *${a.action}* (urgency: ${a.urgency}, confidence: ${a.confidence}/100) — ${a.reason}`,
  });
}
This pattern catches the most common hiring mistake: a recruiter applies multiple recommended actions sequentially without realising they trade off against each other (e.g. raising comp AND tightening role specs in the same week). The decisionTension[] array surfaces those pairs explicitly so the conversation happens BEFORE the spec is changed.
Use in Dify
Drop this actor into Dify workflows via the Apify plugin's Run Actor node. Each job comes back scored, classified, and tagged with recommendedAction as structured JSON — apply-now / research-company / review-fit / skip-low-detail plus compensationTier (below-market / at-market / above-market / premium / unknown) that your downstream node branches on. A generic job scraper pointed at the same boards returns raw HTML; this returns decisions.
The summary record carries decisionReadiness (actionable / monitor / insufficient-data) — gate your automation on that scalar so it only fires when the cohort is statistically meaningful. confidenceLevel (high / medium / low) is the secondary lever. claim and marketSnapshot strings are usable verbatim in Slack messages, email subjects, and agent prompts — no LLM rewriting needed. actionReason on every job is the equivalent for per-listing routing.
- Actor ID: `ryanclinton/job-market-intelligence`
- Sample input (recurring senior+ Python market scan with full analytics):
```json
{
  "query": "senior python engineer",
  "remoteOnly": true,
  "datePosted": "week",
  "analyzeSkills": true,
  "analyzeSalaries": true,
  "maxResults": 200
}
```
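Outside Dify, the same input runs via the Apify JavaScript client; a sketch assuming an `APIFY_TOKEN` environment variable:

```js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Run the actor with the sample input above and wait for it to finish.
const run = await client.actor('ryanclinton/job-market-intelligence').call({
  query: 'senior python engineer',
  remoteOnly: true,
  datePosted: 'week',
  analyzeSkills: true,
  analyzeSalaries: true,
  maxResults: 200,
});

// The dataset holds one summary record plus one record per job.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
const summary = items.find((i) => i.recordType === 'summary');
const jobs = items.filter((i) => i.recordType === 'job');
```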
Dify branching example
A typical Dify workflow consumes the dataset in three stages:
1. Gate the run — read the `summary` record (the first dataset item where `recordType === "summary"`). Use an if/else node:
   - `decisionReadiness === "actionable"` → continue to per-job routing
   - `decisionReadiness === "monitor"` → log the cohort but skip per-job notifications
   - `decisionReadiness === "insufficient-data"` → escalate / re-run with broader filters
2. Surface cohort-level decisions — iterate `summary.recommendedActions[]` and route by `action`:
   - `"increase_salary_band"` / `"accelerate_hiring"` → notify hiring manager
   - `"learn_skill"` (with `target`) → push to a learning-recommendations channel
   - `"diversify_sources"` → log to monitoring channel for the data team
   - Filter by `appliesTo.includes("recruiting")` etc. to fan out only the actions the recipient cares about
3. Route each job — iterate the `recordType === "job"` records. Use a switch node on `recommendedAction` (a code equivalent follows after this list):
   - `"apply-now"` → push to a "high-priority" Slack channel with the job's `title`, `company`, salary range, and `actionReason`
   - `"research-company"` → push to a "needs-research" queue (often `compensationTier === "unknown"`)
   - `"review-fit"` → write to a spreadsheet for batch human review
   - `"skip-low-detail"` → drop silently
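A minimal code equivalent of stage 3, reusing the `jobs` array from the client sketch above; `pushTo(destination, payload)` is a hypothetical dispatcher, and `salaryRange` is an assumed field name:

```js
// Per-job routing: the Dify switch node, inlined as plain code.
// pushTo(destination, payload) is a stand-in for your own queue/Slack/sheet clients.
for (const job of jobs) {
  switch (job.recommendedAction) {
    case 'apply-now':
      await pushTo('slack:#high-priority', {
        title: job.title,
        company: job.company,
        salary: job.salaryRange, // assumed field name for the parsed salary range
        reason: job.actionReason,
      });
      break;
    case 'research-company': // often compensationTier === 'unknown'
      await pushTo('queue:needs-research', job);
      break;
    case 'review-fit':
      await pushTo('sheet:batch-review', job);
      break;
    case 'skip-low-detail':
      break; // drop silently
  }
}
```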
Because `recommendedAction`, `compensationTier`, `decisionReadiness`, `confidenceLevel`, `marketRegime.type`, and the `recommendedActions[].action` strings are all stable enums, branching is exact-match equality — no fuzzy matching, no LLM classification, no prompt engineering. The same enums work in n8n switch nodes, Zapier filters, Make routers, and SQL WHERE clauses.
For event-driven workflows, gate on `summary.events[]`: `severity === "critical"` → PagerDuty / on-call, `severity === "warning"` → Slack, `severity === "info"` → dashboard tile. Every event ships a complete-sentence `message` so notification copy is paste-ready.
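As a sketch, with `pagerduty`, `slack`, and `dashboard` as stand-in clients:

```js
// Fan out summary.events[] by severity; each message is already a complete sentence.
for (const e of summary.events ?? []) {
  if (e.severity === 'critical') await pagerduty.trigger(e.message); // on-call
  else if (e.severity === 'warning') await slack.postMessage({ channel: '#alerts', text: e.message });
  else dashboard.addTile(e.message); // info tile
}
```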
`actionReason`, `recommendedActions[].reason`, `marketRegime.note`, and `claim` are emitted as plain-English sentences from deterministic templates — no LLM was called to write them, so they're free of hallucination and stable across runs. Pipe them straight into notification copy, agent tool-call summaries, or LLM prompts as authoritative ground-truth context.
Integrations
Connect the Job Market Intelligence actor to your existing tools and workflows:
- Zapier — Trigger actions in 5,000+ apps when new job listings are found
- Make — Build complex job monitoring automation workflows
- Google Sheets — Export job data directly to spreadsheets for analysis
- Slack — Get instant notifications when new jobs matching your criteria appear
- The Apify API — Programmatic access to results via REST API
- Apify Webhooks — Trigger custom actions when a run finishes
Related Actors
| Actor | Use Case |
|---|---|
| ryanclinton/website-contact-scraper | Extract emails, phone numbers, and social links from company websites found in job listings |
| ryanclinton/b2b-lead-gen-suite | Combine multiple data sources to build enriched B2B lead lists |
| ryanclinton/company-deep-research | Deep-dive into a specific company with financial, social, and web data |
| ryanclinton/github-repo-search | Find open-source projects from companies that appear in your job market results |
| ryanclinton/website-tech-stack-detector | Identify the technology stack a hiring company actually uses on their website |
| ryanclinton/serp-rank-tracker | Monitor search engine rankings for job-related keywords |
Ready to try Job Market Intelligence?
Run it on your own Apify account. Apify offers a free tier with $5 of monthly credits.
Open on Apify Store