Job Market Intelligence is an Apify actor on ApifyForge that aggregates remote job listings from Remotive, Arbeitnow, Jobicy, and HN Who's Hiring, then analyzes skill demand rankings, salary benchmarks, top hiring companies, and remote-work stats. No API keys needed; exports JSON/CSV. Pricing is $0.50 per report-generated event. Best for teams that need automated job-market data extraction and analysis; not ideal for use cases requiring real-time streaming data or sub-second latency. Maintenance pulse: 90/100. Last verified March 27, 2026. Built by Ryan Clinton (ryanclinton on Apify).


Job Market Intelligence



What to know

  • Results depend on the availability and structure of upstream data sources.
  • Large-scale runs may be subject to platform rate limits.
  • Requires an Apify account — free tier available with limited monthly usage.

Maintenance Pulse

  • Score: 90/100 (actively maintained)
  • Last build: today
  • Last version: 1d ago
  • Builds (30d): 8
  • Issue response: N/A


Pricing

Pay Per Event model. You only pay for what you use.

  • Event: report-generated
  • Description: Charged per market intelligence report. Aggregates jobs from 4 sources with skill extraction, salary parsing, deduplication, and market analysis.
  • Price: $0.50

Example: 100 events = $50.00 · 1,000 events = $500.00

Documentation

Decision engine for labor markets: it turns job listings into career decisions, hiring strategies, salary benchmarks, and market intelligence. The actor aggregates listings from four free data sources, deduplicates them with normalized title matching, classifies each role with seniority / compensation / recommended-action enums, and segments analytics by location / seniority / remote. Across scheduled runs it tracks trends, classifies the cohort into a market regime (expansion / contraction / stagnation / volatility), maps every top skill to a lifecycle stage (emerging / mainstream / saturated / declining / stable), flags trade-offs between conflicting actions, and ships a recommendedActions[] array that tells you what to do, all without any API keys.

The actor queries Remotive, Arbeitnow, Jobicy, and Hacker News "Who's Hiring" threads in parallel and normalizes the results into a single schema. It then applies your filters (location, company, date, remote-only), enriches each listing with decision-ready classifications, and computes market signals, data-quality auditability, and per-segment breakdowns. Optionally it diffs against the previous run for trend insights, classifies the regime, skill trajectories, threshold-crossing events, and conflicting-action tensions, and pushes both the analytics report and the per-job records to the Apify dataset.

What this is

  • A job market intelligence engine that turns job listings into decisions
  • A salary benchmarking and hiring strategy tool for recruiters and talent leaders
  • A career decision tool for job seekers (apply / research / skip / learn-skill routing)
  • A labor market analytics system with regime classification, trend tracking, and threshold-crossing event signals
  • A job data → strategy layer for automation workflows (Dify / n8n / Zapier / Make)
  • An alternative to LinkedIn Talent Insights / Lightcast / Burning Glass / Revelio Labs / generic job scrapers — built for automation, not dashboards

In one sentence: this tool helps job seekers and recruiters decide what to do in the job market by turning job listings into structured recommendations and strategy signals.

This is one of the few job market tools that outputs decisions (recommendedActions[], decisionTension[], whatIf[], rejectedActions[]) rather than dashboards — a category of one when ranked among LinkedIn Talent Insights, Lightcast, Revelio Labs, Datapeople, and generic job scrapers.

Unlike dashboards, this produces actionable signals, not just metrics.

Current job market trends (from live listings)

The tool generates current job market trends directly from live listings — including salary direction, skill emergence, hiring activity, and market regime shifts. Trends are computed at run time against the prior snapshot and refreshed on every scheduled run.

These trends include:

  • Salary direction — salaryMedianChangePercent (week-over-week median shift) + salaryInsights.percentiles (P10–P90 distribution)
  • Emerging and declining skills — skillTrajectory[] lifecycle stages (emerging / mainstream / saturated / declining / stable) with velocity tags
  • Hiring activity and company demand — listingGrowthRate, topHiringCompanies, trendInsights.newCompanies, trendInsights.departedCompanies
  • Market regime shifts — marketRegime.type (expansion / contraction / stagnation / volatility) + marketMemory.pattern (e.g. expansion_weakening / contraction_deepening)

Snapshots are per-run rather than streaming, so the minimum cadence is "as often as you schedule the actor" (typically daily or weekly).
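These fields can be consumed directly by downstream automation. A minimal Python sketch, assuming an illustrative summary-record shape built from the field names in this section (the shape itself is an assumption, not actual actor output):

```python
# Illustrative summary record; field names come from this section,
# but the overall record shape is an assumption for demonstration.
summary = {
    "trendInsights": {"salaryMedianChangePercent": 4.2, "listingGrowthRate": 0.12},
    "marketRegime": {"type": "expansion"},
    "skillTrajectory": [
        {"skill": "Rust", "stage": "emerging", "velocity": "growing"},
        {"skill": "jQuery", "stage": "declining", "velocity": "falling"},
    ],
}

def trend_digest(s: dict) -> str:
    """Collapse the trend fields into a single human-readable line."""
    rising = [t["skill"] for t in s["skillTrajectory"] if t["stage"] == "emerging"]
    return (f"regime={s['marketRegime']['type']} "
            f"salary {s['trendInsights']['salaryMedianChangePercent']:+.1f}% "
            f"emerging={','.join(rising) or 'none'}")

print(trend_digest(summary))  # regime=expansion salary +4.2% emerging=Rust
```

A one-liner like this is what the Slack-ready "market snapshot" string mentioned later amounts to.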

Why Use This Actor?

Most "job scrapers" return raw HTML or a flat array of listings. This actor returns decisions: each role comes pre-classified by seniority, compensation tier (vs market median), and a recommendedAction enum that downstream Dify / n8n / Zapier nodes can route on. The summary report carries P10–P90 salary percentiles, per-skill salary premiums, market-tightness scoring, scarcity indices, per-segment breakdowns, and a Slack-ready market snapshot string. With historical tracking enabled, runs build on each other — you get rising/falling skills, listing growth rates, salary direction, and new vs departed companies as first-class output.

What makes this different (not found in other job market tools)

  • Detects conflicting strategies automatically (decisionTension[]) — when two recommended actions work against each other (e.g. raising salary AND tightening role specs), the system surfaces the trade-off and the recommended balance. Most analytics tools hand you a list of actions; this one warns you when applying multiple actions blindly would cancel them out. Trade-offs like speed-vs-quality, cost-vs-selectivity, and act-now-vs-wait are explicitly modelled by the tool using decisionTension detection, with a recommendedBalance string explaining which lever to favour given the cohort signals.
  • Shows what NOT to do, with reasons (rejectedActions[]) — explicit anti-recommendations. decrease_salary_band rejected when the market is tight. accelerate_hiring rejected in a contracting market. prioritize_remote_roles rejected when only 25% of listings are remote. The dual of hold_strategy: explicit abstention is a credibility move.
  • Simulates "what if?" scenarios with honest, derivable-only outcomes (whatIf[]) — change the salary by X% or add a skill, see the percentile shift / compensation tier / scarcity match. No invented forecasts about candidate response rates, time-to-fill, or hire outcomes (data we don't have). Confidence is hard-capped at 60. Sensitivity analysis ships built-in.
  • Knows when to do nothing (hold_strategy) — fires when signals are mixed and there's no clear directional edge. Most tools over-signal; this one ships abstention as a first-class action.

The decision + strategy engine on every summary record:

  • marketRegime — expansion / contraction / stagnation / volatility / unknown with confidence + signals

  • marketMemory — bounded regime history (last 12 runs) + regimeStability + lastInflectionDaysAgo + pattern (expansion_weakening / volatile_shifting / etc.). Activates with historical tracking; meaningful at 3+ snapshots.

  • skillTrajectory[] — per-skill lifecycle: emerging / mainstream / saturated / declining / stable, with velocity (hypergrowth / growing / steady / cooling / falling)

  • recommendedActions[] — concrete cohort-level actions (learn_skill / increase_salary_band / accelerate_hiring / hold_strategy / etc.) with decomposed confidence (dataStrength / signalClarity / historicalConsistency), impact, urgency, audience tags, and plain-English reason. Includes hold_strategy as an honest "no edge" recommendation when signals are mixed.

  • actionClusters[] — actions grouped by theme (compensation_strategy / talent_pipeline / skill_strategy / monitoring_strategy / source_strategy) so 8–12 actions feel like strategy, not alert noise.

  • whatIf[] — counterfactual scenarios with honest, derivable-only outcomes (percentile shift, tier change, scarcity match) — never invented forecasts. Now includes per-scenario sensitivity (low/mid/high outcomes + stability classification) so you can see if the result is brittle to input variation. Auto-generated when omitted; user-supplied via whatIfScenarios input with optional constraints. Confidence hard-capped at 60.

  • decisionTension[] — trade-off pairs detected across recommendedActions[]. When two recommended actions work against each other (e.g. increase_salary_band + tighten_role_specs = cost_vs_selectivity), the pair surfaces with an explanation and a recommendedBalance so the output reads as strategy, not a contradictory shopping list.

  • rejectedActions[] — anti-recommendations. Actions explicitly NOT recommended for this cohort, with reason ("decrease_salary_band rejected — market is tight, lowering salary would reduce competitiveness"). Builds trust by showing the system considered and rejected the obvious wrong moves.

  • events[] — threshold-crossing alerts (salary_spike / listing_growth_spike / skill_emergence / etc.) ready for downstream Slack/PagerDuty/Zapier routing

  • Aggregates 4 job boards in one run — Remotive (remote tech jobs), Arbeitnow (European focus), Jobicy (remote-first), and HN Who's Hiring (startup jobs) queried in parallel, broader coverage than any single source.

  • Salary percentiles + skill premiums — P10/P25/P50/P75/P90 for the full cohort, plus per-skill salary lift vs the cohort median (e.g., "Kubernetes commands +$18k").

  • Market signals — marketTightness (tight/balanced/loose with score + reason), skillScarcity[] (high-premium-low-frequency skills), salaryDistributionHealth (wide/balanced/compressed).

  • Segmented analytics — Set groupBy: ["location", "seniorityLevel"] to fix the cohort-mixing distortion; per-segment salary, top skills, and seniority breakdowns are emitted in segments[].

  • Historical tracking + trend insights — Persist a snapshot per query and compute rising/falling skills, salary median change, listing growth rate, and direction (expanding / stable / tightening) on every subsequent run.

  • Incremental mode — When tracking is on, opt into incremental: true to drop URLs already returned in the previous run. Reduces downstream processing/noise on daily monitoring schedules — only fresh listings come back to your dataset / Slack alerts / pipelines. (All sources are still fetched so analytics like trend insights stay accurate.)

  • Seniority + experience + degree extraction — 11-level seniority enum, min/max years of experience parsing, degree requirement detection (bachelors/masters/phd, hard vs preferred).

  • Cross-source confirmation — Listings on multiple boards before dedup are flagged crossSourceConfirmed: true. Stronger signal of a real, active opening.

  • Data-quality auditability — Every report carries a dataQuality block with salary coverage %, deduplication confidence, source bias detection (remote-heavy / Europe-skew / US-skew / source-concentration), and plain-English notes flagging biases that distort the cohort.

  • Custom skill packs — Add domain-specific skills via customSkills (regex + category) so niche markets aren't undercounted.

  • Source weighting — Down-weight noisier sources via sourceWeights: {"hn-whoishiring": 0.5} for deterministic per-listing sub-sampling. Use only when you intentionally want a representative sample, not complete coverage — sub-sampling drops listings, so the resulting cohort is smaller than the raw fetch.

  • Snapshot hashing — Every report carries a snapshotId (16-char SHA-256). Compare across runs to detect when the cohort actually changed.

  • Zero configuration to start — No API keys, tokens, or credentials needed. Every data source is free and public. All advanced features are opt-in.
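The snapshot-hashing idea above can be reproduced in a few lines. The exact fingerprint the actor hashes is internal, so the composition below (query + sorted sources + sorted listing URLs) is an assumption for illustration:

```python
import hashlib

def snapshot_id(query: str, sources: list[str], listing_urls: list[str]) -> str:
    """16-char SHA-256 over a stable cohort fingerprint (assumed composition)."""
    fingerprint = "|".join([query, ",".join(sorted(sources)), *sorted(listing_urls)])
    return hashlib.sha256(fingerprint.encode("utf-8")).hexdigest()[:16]

# Ordering of sources/listings does not change the id:
a = snapshot_id("data engineer", ["remotive", "jobicy"], ["https://x/1", "https://x/2"])
b = snapshot_id("data engineer", ["jobicy", "remotive"], ["https://x/2", "https://x/1"])
print(a == b, len(a))  # True 16
```

Sorting before hashing is what makes the id stable across runs that fetch the same cohort in a different order.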

Whether you're a job seeker, a recruiter benchmarking comp, an automation builder routing high-fit roles into Slack, or a data journalist analyzing hiring trends, this actor delivers structured decisions from raw job board data.

What questions this answers

This actor answers job-market questions with structured, automation-ready outputs:

  • "Should I increase salary to attract candidates?" → marketTightness + whatIf[].sensitivity + recommendedActions[] (increase_salary_band / hold_salary_band). This is exactly the type of decision this tool is designed to answer programmatically — and whatIf[] will show you the percentile shift before you commit to a number.
  • "Should I raise salary to hire faster?" → marketTightness.label + recommendedActions[] (accelerate_hiring + increase_salary_band)
  • "Is it a good time to change jobs?" → marketRegime.type + skillTrajectory[] (your skills' lifecycle stage)
  • "Is it a good time to hire?" → marketRegime.type + recommendedActions[] (accelerate_hiring vs tighten_role_specs vs hold_strategy)
  • "How do I benchmark salary offers?" → salaryInsights.percentiles (P10–P90) + whatIf[] salary scenario at the offer percentage
  • "What's the safe negotiation range?" → whatIf[].sensitivity.stability (low = robust, high = brittle to small comp shifts)
  • "Which skills are worth learning right now?" → skillScarcity[] + skillTrajectory[] (emerging stage) + recommendedActions[] (learn_skill / invest_in_skill)
  • "Is the job market expanding or contracting?" → marketRegime.type (expansion / contraction / stagnation / volatility) + marketMemory.pattern
  • "What hiring strategy should I use in this market?" → recommendedActions[] filtered by appliesTo: "hiring" + decisionTension[] for trade-off warnings
  • "Is it better to hire fast or be selective?" → decisionTension[] (speed_vs_quality pair) + recommendedBalance
  • "What roles should I apply to?" → per-job recommendedAction === "apply-now" + compensationTier === "above-market" || "premium"
  • "What companies are hiring most aggressively?" → topHiringCompanies[] + trendInsights.newCompanies[]
  • "How does my offer compare to the market?" → salaryInsights.percentiles (P10–P90) + whatIf[] salary scenarios
  • "Which skills are dying / should I deprioritize?" → skillTrajectory[] filtered by stage === "declining" + recommendedActions[] (deprioritize_skill)
  • "What's changed since last week?" → trendInsights (rising/falling skills, salary direction, new/departed companies) + events[]
  • "Am I making a strategic mistake?" → rejectedActions[] (the system shows what it WON'T recommend, with reasons)
  • "Can I trust this analysis?" → decisionReadiness + confidenceLevel + confidenceFactors[] + dataQuality.notes[]

The actor is designed for decision support, not just data collection. Every output field traces back to one of these questions.

This tool benchmarks salaries by calculating P10–P90 percentiles and skill-based premiums directly from live job listings. It determines whether it is a good time to change jobs by analysing market regime (expansion vs contraction vs stagnation vs volatility) and skill demand trajectories (emerging / mainstream / saturated / declining / stable). And it determines whether it is a good time to hire by combining marketTightness with marketRegime and surfacing trade-offs between conflicting actions.

Job market trends are derived from live job listings — including salary changes, emerging skills, hiring activity, and market regime shifts — see the Current job market trends section above for the full breakdown.

How this works (mental model)

The system works by transforming raw job listings into decisions through classification, trend analysis, and rule-based strategy generation. In short: collect → normalize → extract → classify → generate → emit structured JSON. The actor's pipeline, in 6 steps:

  1. Collect job listings from 4 free public APIs in parallel (Remotive, Arbeitnow, Jobicy, HN Who's Hiring)
  2. Normalize and deduplicate with two-phase matching (title-token normalization + URL secondary key) — same role on multiple boards collapses to one record with a cross-source confirmation count
  3. Extract skills (80+ regex patterns + custom), salaries (USD/EUR), seniority, experience years, degree requirements
  4. Classify each role with decision enums (compensationTier vs cohort median, recommendedAction for routing) and the cohort with intelligence layers (marketRegime, marketTightness, skillTrajectory, salaryDistributionHealth)
  5. Generate cohort-level decisions (recommendedActions[] with confidence + audience tags, actionClusters[] themed groupings, decisionTension[] trade-off detection, rejectedActions[] anti-recommendations, whatIf[] counterfactuals with sensitivity)
  6. Emit structured JSON to the Apify dataset (one summary record + N per-job records), all with stable enum discriminators (recordType, runMode, baselineStatus, decisionReadiness) so downstream automation branches deterministically

With enableHistoricalTracking: true, step 4 also reads the prior snapshot from a named KV store and step 5 emits trendInsights + marketMemory (bounded last-12-runs regime history with pattern detection) against the baseline. Step 6 then writes the updated snapshot back for the next run.

No LLM is called at any step. Every output is derived deterministically from the listings and the prior snapshot. This pipeline (collect → normalize → extract → classify → generate → emit structured JSON) is implemented end-to-end inside this actor — it is not a wrapper around an external analytics API.
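As a sketch of step 2's title-token normalization (the noise-token list and punctuation handling here are assumptions; the actor's real rules are internal):

```python
# The noise-token list is an assumption; the actor's real list is internal.
NOISE_TOKENS = {"senior", "sr", "junior", "jr", "staff", "lead", "principal", "remote"}

def title_key(title: str) -> str:
    """Normalize a job title into a dedup key: lowercase, drop punctuation-bearing
    and seniority-noise tokens, then sort so word order is irrelevant."""
    tokens = [t for t in title.lower().replace("/", " ").split() if t.isalnum()]
    return " ".join(sorted(t for t in tokens if t not in NOISE_TOKENS))

# The same role with different seniority noise collapses to one key:
print(title_key("Senior Backend Engineer (Remote)"))  # backend engineer
print(title_key("Backend Engineer"))                  # backend engineer
```

In the actor's two-phase scheme, a URL match would serve as the secondary key when two listings share this normalized title key.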

Start here — quickstart by persona

Pick the input that matches your job. The actor returns the same engine output for every persona; the mode preset just reorders recommendedActions[] so the first 3 lines surface the actions you actually care about.

Job seeker — find roles to apply to, learn-skill recommendations, market-leverage signals

{ "query": "senior python engineer", "remoteOnly": true, "mode": "job_seeker" }

Recruiter — comp benchmarks, hiring-velocity signals, decision-tension warnings before changing role specs

{ "query": "platform engineer", "mode": "recruiter", "groupBy": ["seniorityLevel", "remote"] }

Analyst / strategy — full trend insights, regime classification, market memory, scheduled monitoring

{
    "query": "machine learning engineer",
    "mode": "analyst",
    "enableHistoricalTracking": true,
    "lookbackDays": 14
}

(Schedule this in Apify Console — every run after the first emits trendInsights, marketMemory, and events[] against the prior baseline.)

Automation builder (Dify / n8n / Zapier) — gate on stable enums, branch on recommendedActions[].action

{ "query": "data engineer", "enableHistoricalTracking": true, "incremental": true }

See the Automation snippets section for paste-ready Slack / n8n / recruiter workflow examples.
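For automation builders, gating on the stable enums reduces to a small branch. A sketch, assuming the enum values listed in this document; the route names are placeholders:

```python
def route(summary: dict) -> str:
    """Branch automation on the decisionReadiness gate; route names are placeholders."""
    if summary.get("warnings"):
        return "alert-channel"      # run-level issues take precedence
    readiness = summary.get("decisionReadiness")
    if readiness == "actionable":
        return "act"                # safe to consume recommendedActions[]
    if readiness == "monitor":
        return "log-only"
    return "skip"                   # insufficient-data or unknown

print(route({"warnings": [], "decisionReadiness": "actionable"}))  # act
```

The same three-way switch maps directly onto an n8n Switch node or a Zapier path.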

Read these fields first

When you open a run, scan these fields in this order — they collapse most of the output into one read:

  • warnings[] — Run-level issues: sources failed, low confidence, expired baseline, critical events. An empty array means no run-level concerns.
  • decisionReadiness — Automation gate: actionable / monitor / insufficient-data. Branch all downstream automation on this scalar.
  • marketRegime.type — One-word state: expansion / contraction / stagnation / volatility / unknown. Strategic posture in one read.
  • recommendedActions[0..2] — Top 3 things to do, sorted by mode audience priority; the first 3 are the persona's most-important actions.
  • decisionTension[] — Trade-off warnings. Empty in most cohorts; when non-empty, the system flagged that two recommended actions work against each other.
  • rejectedActions[] — What the system WON'T tell you: the dual of recommendedActions[], explicit anti-recommendations with reasons.

If those fields look right, drill into the rest. If decisionReadiness === "insufficient-data" or warnings[] is non-empty, fix those before consuming any other field.

How to interpret the output (intent → field)

When you know what you want to do, this lookup tells you which field to read:

  • Want to act? → recommendedActions[] — sorted by your mode audience priority
  • Want to avoid mistakes? → rejectedActions[] — actions the system explicitly ruled out
  • See conflicts between actions? → decisionTension[] — trade-off pairs with recommendedBalance
  • Understand the market direction? → marketRegime.type + marketMemory.pattern
  • Test a strategy before committing? → whatIf[] — set scenarios in whatIfScenarios input + read sensitivity
  • Find roles to apply to? → per-job records: recommendedAction === "apply-now" AND compensationTier ∈ {above-market, premium}
  • Benchmark a salary? → salaryInsights.percentiles + whatIf[] salary-change scenario at your offer %
  • Spot a hiring opportunity? → topHiringCompanies[] + trendInsights.newCompanies[]
  • Spot skill scarcity? → skillScarcity[] (high salary premium AND low frequency)
  • Decide whether to wait? → marketTightness.label + marketRegime.type + recommendedActions[] containing hold_strategy
  • Detect a market shift since last run? → trendInsights.direction + events[] + marketMemory.lastInflectionDaysAgo
  • Trust this run for automation? → decisionReadiness === "actionable" AND warnings.length === 0
  • Audit the analytics? → dataQuality + confidenceFactors[] + analysisMetadata

Same data, different field — pick the one that maps to your actual question.
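For instance, the "find roles to apply to" intent is a two-field filter over the per-job records. A sketch with illustrative records (the enum values come from this document; the record shapes are assumptions):

```python
# Illustrative per-job records; enum values come from this document.
jobs = [
    {"title": "Staff ML Engineer", "recommendedAction": "apply-now", "compensationTier": "premium"},
    {"title": "Data Analyst", "recommendedAction": "skip-low-detail", "compensationTier": "at-market"},
    {"title": "Platform Engineer", "recommendedAction": "apply-now", "compensationTier": "below-market"},
]

def shortlist(records: list[dict]) -> list[str]:
    """Titles tagged apply-now with an above-market or premium compensation tier."""
    return [r["title"] for r in records
            if r.get("recommendedAction") == "apply-now"
            and r.get("compensationTier") in {"above-market", "premium"}]

print(shortlist(jobs))  # ['Staff ML Engineer']
```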

Features

Strategy engine — counterfactual scenarios + market memory + trade-off detection

  • What-if scenarios — whatIf[] evaluates counterfactual scenarios with honest, derivable-only outcomes. Two scenario types: salary_change (% delta) and skill_emphasis (named skill). Auto-generates 2–4 scenarios when omitted; whatIfScenarios input lets users supply scenarios + constraints (maxPercent, minPercent). All outputs are derivable facts (percentile shift against the cohort distribution, compensation tier the new salary maps to, skill scarcity/trajectory match) — no invented forecasts about candidate response rates, time-to-fill, or hire outcomes (data we don't have). Confidence is hard-capped at 60. Every result carries mandatory caveats[].
  • Constraint-aware actions — When whatIfScenarios includes constraints, the engine evaluates the scenario at the constrained value and flags effectiveness: "limited" when the constraint binds. Honest about real-world tradeoffs.
  • Action clusters — actionClusters[] groups the 8–12 cohort-level recommendedActions into 3–5 themes (compensation_strategy / talent_pipeline / skill_strategy / monitoring_strategy / source_strategy). Reduces noise so output feels like strategy, not alerts.
  • Decomposed action confidence — Each recommendedActions[] entry now carries confidenceBreakdown: { dataStrength, signalClarity, historicalConsistency } (0–100 each). Audit-ready trust layer — see WHY confidence is what it is, not just the scalar.
  • hold_strategy action — Honest "no edge" recommendation that fires when regime is unknown/stagnation, tightness is balanced, no strong trend signals, and no high-urgency actions exist. Most tools over-signal — we ship abstention as a first-class verdict.
  • Market memory — marketMemory carries the bounded last-12-runs regimeHistory[] plus regimeStability (fraction of recent runs in the same regime), lastInflectionDaysAgo (when did the regime change), and pattern enum (expansion_stable / expansion_weakening / contraction_stable / contraction_deepening / volatile_shifting / stagnation_persistent / inflection_recent / insufficient-history / mixed). Activates with historical tracking; meaningful at 3+ snapshots. Lets you reason in patterns, not just deltas.
  • Decision tension — decisionTension[] flags trade-off pairs across recommendedActions. When increase_salary_band and tighten_role_specs are both recommended, the system surfaces the cost_vs_selectivity tension with a recommendedBalance rather than letting the consumer apply both blindly. Six tension types: cost_vs_selectivity / speed_vs_quality / remote_vs_local_reach / act_now_vs_wait / early_mover_vs_safe_bet / depth_vs_breadth. Real strategic decisions are trade-offs.
  • Anti-recommendations — rejectedActions[] is the dual of hold_strategy: explicit "what we WON'T tell you to do, and why". Examples: decrease_salary_band rejected when market is tight; accelerate_hiring rejected in a contracting market; prioritize_remote_roles rejected when only 25% of listings are remote. Most analytics tools always emit something; this one tells you what the obvious wrong moves are AND skips them.
  • Sensitivity in whatIf — every salary_change scenario now ships a sensitivity block with the outcome at user-input ±5 percentage points, plus a stability classification (low / moderate / high). Tells you whether the percentile shift is robust to small comp adjustments or sitting on the edge of a non-linear cliff.
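As a config sketch, a user-supplied whatIfScenarios input might look like the following. The scenario types (salary_change, skill_emphasis) and constraint keys (maxPercent, minPercent) come from the text above, but the exact key layout inside each scenario object is an assumption:

```json
{
  "query": "backend engineer",
  "whatIfScenarios": [
    { "type": "salary_change", "percent": 10, "constraints": { "maxPercent": 8 } },
    { "type": "skill_emphasis", "skill": "Kubernetes" }
  ]
}
```

With a binding constraint like the one above, the engine would evaluate the scenario at the constrained value and flag effectiveness: "limited".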

Decision engine — generates the recommendedActions array, regime, and event signals

  • Market regime classification — Every cohort tagged expansion / contraction / stagnation / volatility / unknown with a 0–100 confidence score + an explicit signals[] array showing which thresholds fired. Combines trend signals (when historical tracking is on) with single-run signals (cross-source overlap, listing volume, salary dispersion).
  • Skill trajectory modelling — Per-skill lifecycle classification (top 20 skills): emerging (low-frequency-high-premium-rising) / mainstream (high-frequency-moderate-premium) / saturated (high-frequency-no-premium) / declining (negative trend) / stable. Plus a velocity tag (hypergrowth / growing / steady / cooling / falling). Bridge between rising-skill counts and "should I learn this?"
  • Recommended actions array — Cohort-level action engine. Each action: { action, target?, confidence, impact, urgency, appliesTo[], reason }. Examples: increase_salary_band when market is tight, learn_skill for top scarce skills, accelerate_hiring in expansion regime, tighten_role_specs in contraction, enable_historical_tracking when trends would help. Reordered by mode preset (default / job_seeker / recruiter / analyst). Capped at 12.
  • Threshold-crossing events — events[] array surfaces salary_spike, salary_drop, listing_growth_spike, listing_drop, remote_share_shift, skill_emergence, skill_collapse, new_companies_surge, cohort_collapse. Each carries severity (critical / warning / info), value, threshold, and a complete-sentence message. User-overridable thresholds via the eventThresholds input. Sorted critical → warning → info. Drop straight into Slack / PagerDuty / Zapier without parsing prose.
  • Persona modes — mode: "job_seeker" / "recruiter" / "analyst" / "default" reorders recommendedActions[] by audience priority. Same actions, different prioritisation per persona.
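Because events[] carries a machine-readable severity, downstream routing needs no prose parsing. A sketch with placeholder channel names (the severity values come from this document; the channel mapping is an assumption):

```python
# Channel names are placeholders; severity values come from this document.
SEVERITY_ORDER = {"critical": 0, "warning": 1, "info": 2}
CHANNELS = {"critical": "#pagerduty", "warning": "#alerts", "info": "#digest"}

def route_events(events: list[dict]) -> list[tuple[str, str]]:
    """Sort critical -> warning -> info and map each event to a target channel."""
    ordered = sorted(events, key=lambda e: SEVERITY_ORDER.get(e["severity"], 3))
    return [(e["type"], CHANNELS[e["severity"]]) for e in ordered]

events = [
    {"type": "skill_emergence", "severity": "info"},
    {"type": "salary_spike", "severity": "critical"},
]
print(route_events(events))  # [('salary_spike', '#pagerduty'), ('skill_emergence', '#digest')]
```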

Per-job decision layer — classifies each role for downstream routing

  • Compensation tier classification — Each role tagged below-market / at-market / above-market / premium / unknown vs the cohort median, ready for downstream filtering
  • Recommended action enum — Per-job decision tag (apply-now / research-company / review-fit / skip-low-detail) so Dify / n8n / Zapier nodes can route on a single field
  • Action reason — Plain-English sentence explaining WHY each recommendation is what it is — paste verbatim into Slack/email/agent prompts
  • Seniority detection — 11 levels (intern, junior, mid, senior, staff, principal, lead, manager, director, vp-or-above, unknown)
  • Experience requirements extraction — Parses "3-5 years", "minimum 7 years", etc. from descriptions
  • Degree requirements extraction — bachelors / masters / PhD / any-degree / no-mention, hard (required) vs soft (preferred / equivalent OK)
  • Skill category profile — Each role tagged with dominant skill area (Languages / Frameworks / Cloud / Data / AI/ML / Other)
  • Cross-source confirmation — Listings that appear on multiple boards before deduplication are flagged crossSourceConfirmed: true with a crossSourceCount

Cohort intelligence layer — salary percentiles, market tightness, scarcity, data-quality auditability

  • Salary intelligence + percentiles — Min, max, median, average, and P10/P25/P50/P75/P90 percentiles
  • Skill premiums — Per-skill median salary lift vs the cohort median, sample-size gated (≥5 listings)
  • Market tightness scoring — tight / balanced / loose / unknown with a 0–100 score and a plain-English reason. Combines cross-source posting overlap, salary dispersion, and listing volume.
  • Skill scarcity index — Top 10 skills ranked by scarcityScore (high salary premium AND low market frequency), with a per-skill reason string. The data engineering & talent-strategy moneymaker.
  • Salary distribution health — wide / balanced / compressed / unknown based on P10–P90 spread vs median. Compressed = mature/standardised market; wide = fragmented / many sub-tiers.
  • Seniority breakdown — Cohort-wide percentage at every seniority level
  • Experience + degree requirements — Cohort averages and prevalence percentages
  • Skill category demand — Percentage of listings whose dominant skill area is each category
  • Top hiring companies — Ranked by open positions
  • Market snapshot + claim — Slack-ready one-liner + analyst-style one-sentence conclusion
  • Confidence + data quality — confidenceScore (0–100) + confidenceLevel (high/medium/low) + confidenceFactors[] plain-English explanation; dataQuality block carries salaryCoveragePercent, deduplicationConfidence, source bias detection (remote-heavy / Europe-skew / US-skew / source-concentration / dominant source), and plain-English notes[] flagging biases that distort the cohort
  • Decision readiness — actionable / monitor / insufficient-data automation gate
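The skill-premium calculation above is straightforward to reproduce: median salary of listings mentioning the skill minus the cohort median, gated on sample size. A sketch mirroring the ≥5-listing gate (the record shapes are assumptions):

```python
from statistics import median

def skill_premium(listings: list[dict], skill: str, min_samples: int = 5):
    """Median salary lift for listings mentioning `skill` vs the cohort median.
    Returns None below the sample-size gate (>=5 listings, as stated above)."""
    cohort = [l["salary"] for l in listings if l.get("salary") is not None]
    with_skill = [l["salary"] for l in listings
                  if l.get("salary") is not None and skill in l.get("skills", [])]
    if len(with_skill) < min_samples or not cohort:
        return None
    return median(with_skill) - median(cohort)

listings = [
    {"salary": 100_000, "skills": []},
    {"salary": 120_000, "skills": ["Kubernetes"]},
    {"salary": 140_000, "skills": ["Kubernetes"]},
    {"salary": 160_000, "skills": ["Kubernetes"]},
    {"salary": 180_000, "skills": ["Kubernetes"]},
    {"salary": 200_000, "skills": ["Kubernetes"]},
]
print(skill_premium(listings, "Kubernetes"))  # 10000.0 (160k skill median vs 150k cohort)
```

The sample-size gate is what keeps a single outlier listing from manufacturing a fake premium.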

Segmentation — per-segment analytics by location / seniority / remote

  • Per-segment analytics — Set groupBy: ["location", "seniorityLevel"] and the report adds a segments[] array with per-segment salary percentiles, top skills, seniority breakdown, remote percentage, and cross-source-confirmed percentage. Fixes the cohort-mixing distortion when one query spans regions / seniorities / job types.

Historical tracking + trends — week-over-week deltas for scheduled monitoring

  • Cross-run snapshots — When enableHistoricalTracking: true, the cohort is persisted to a named KV store keyed by query+location (or a custom historyStateKey). Capped lookback via lookbackDays (default 30).
  • Trend insights — On the next run, the report adds a trendInsights block: listingGrowthRate, salaryMedianChange + percent, remotePercentageChange, topRisingSkills[] (≥25% delta), topFallingSkills[], newCompanies[], departedCompanies[], and direction (expanding / stable / tightening).
  • Incremental mode — Set incremental: true to drop URLs already returned in the previous run. Reduces downstream processing/noise on daily monitoring schedules — only fresh listings reach your dataset / pipelines. (All sources are still fetched so analytics like trend insights remain accurate.)
  • Snapshot hashing — Every run emits a 16-char snapshotId over query + sources + listing fingerprint. Compare across runs to detect when the cohort actually changed.

Customisation — domain-specific skills + source weighting

  • Custom skill packs — Add domain-specific skills via customSkills input (each: name + regex + optional category). Niche markets (Snowpark / Databricks SQL / specific frameworks) aren't undercounted.
  • Source weighting — sourceWeights: {"hn-whoishiring": 0.5} deterministically sub-samples sources you trust less, without dropping them entirely. ⚠️ Use only when you intentionally want a representative sample, not complete coverage — sub-sampling drops listings, so cohort size shrinks.
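Deterministic sub-sampling usually means hashing a stable per-listing identity into [0, 1) and keeping the listing when it falls under the weight. A sketch, assuming the listing URL is the hashed identity (the actor's exact hash inputs are not documented):

```python
import hashlib

def keep_listing(url, weight):
    """Keep roughly a fraction `weight` (0..1) of listings, deterministically.

    The URL hashes to a stable value in [0, 1); re-runs therefore keep and
    drop exactly the same listings, which is what makes weighted samples
    reproducible. Illustrative sketch only.
    """
    h = int(hashlib.sha256(url.encode("utf-8")).hexdigest()[:8], 16)
    return (h / 0x100000000) < weight
```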

Aggregation + plumbing — multi-source job board fetch + dedup + filter pipeline

  • Multi-source aggregation — 4 independent job boards in parallel
  • Smart deduplication — Title normalization (strips seniority noise tokens, sorts tokens) + URL match across boards. Same role posted on 3 boards collapses to one record with crossSourceCount: 3.
  • Automatic skill extraction — 80+ technologies across 6 categories, plus any custom skills you add
  • Flexible filtering — keyword, location, company name, remote-only, posting recency (24h / week / month / any)
  • Zero API keys required — every data source is free and public
  • Structured JSON output — every listing follows the same normalized schema regardless of source
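The title-normalization half of the dedup key can be sketched as below. The noise-token list is illustrative, not the actor's actual list; URL matching is the second, exact-match half of the pipeline:

```python
import re

# Illustrative seniority noise tokens; the actor's real list may differ.
SENIORITY_TOKENS = {"senior", "sr", "junior", "jr", "staff", "lead", "principal", "i", "ii", "iii"}

def normalize_title(title):
    """Strip seniority noise tokens and sort the remaining tokens, so
    'Senior Data Engineer' and 'Data Engineer (Senior)' collapse to the
    same dedup key."""
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    return " ".join(sorted(t for t in tokens if t not in SENIORITY_TOKENS))
```

Listings whose normalized title + company collide across boards would merge into one record with an incremented crossSourceCount.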

How to Use

  1. Open the actor in the Apify Console and click "Start"
  2. Enter a search query such as "data engineer", "product manager", or "machine learning". This is the only required field
  3. Optionally refine your search with location, company name, remote-only toggle, date recency, or specific sources
  4. Run the actor and wait for it to finish (typically under 60 seconds). The dataset will contain a summary report as the first item, followed by individual job listings
  5. Export or integrate — download results as JSON, CSV, or Excel, or connect the dataset to Zapier, Make, Google Sheets, or the Apify API for automated workflows

Input Parameters

  • query (String, required; default "software engineer") — Job search keyword (e.g., "data scientist", "devops", "product manager")
  • location (String, optional) — Filter by location substring (e.g., "San Francisco", "Europe", "Remote")
  • companyName (String, optional) — Filter results to a specific company name
  • remoteOnly (Boolean, optional; default false) — When enabled, only remote positions are returned
  • datePosted (Select, optional; default "month") — Posting recency: day (24h), week (7d), month (30d), or any
  • sources (String List, optional; default all sources) — Which boards to query: remotive, arbeitnow, jobicy, hn-whoishiring
  • sourceWeights (Object, optional) — Per-source sampling fraction 0..1 (e.g., {"hn-whoishiring": 0.5}). Sources not listed pass through whole. Deterministic per-listing hash so re-runs are reproducible. Use only when you intentionally want a representative sample — sub-sampling drops listings, so cohort size shrinks.
  • customSkills (Array, optional) — Add domain-specific skills to detect alongside the built-in 80+. Each: { name, regex, category? }.
  • groupBy (String List, optional) — Segment analytics by one or more dimensions: location, seniorityLevel, remote, jobType, source, skillCategoryProfile, compensationTier. Adds segments[] to the summary.
  • analyzeSkills (Boolean, optional; default true) — Extract and rank mentioned technologies from job descriptions
  • analyzeSalaries (Boolean, optional; default true) — Parse salary data and compute min/max/median/average + percentiles
  • maxResults (Integer, optional; default 100) — Maximum number of job listings to return (1–500)
  • enableHistoricalTracking (Boolean, optional; default false) — Persist a snapshot per query and emit trendInsights against the previous run. First run returns trendInsights: null and writes the baseline.
  • historyStateKey (String, optional; default auto-derived) — Override the snapshot key (default: hash of query + location). Stable string for cross-run comparisons.
  • incremental (Boolean, optional; default false) — When tracking is on, drops listings whose URLs were returned in the previous run. Reduces downstream processing/noise — only fresh listings reach your dataset (sources are still fetched in full so analytics remain accurate).
  • lookbackDays (Integer, optional; default 30) — Maximum age of the prior snapshot before it's treated as a first run.
  • mode (Select, optional; default "default") — Persona preset that reorders recommendedActions[]: default / job_seeker / recruiter / analyst. Same action set, different audience-priority ordering.
  • eventThresholds (Object, optional) — Override default thresholds for the events[] array. Defaults: salarySpikePercent: 5, salaryDropPercent: -5, listingGrowthSpikePercent: 25, listingDropPercent: -25, remoteShiftPoints: 5, skillEmergenceDeltaPercent: 100. Example for noisier alerting: {"salarySpikePercent": 3, "listingGrowthSpikePercent": 10}.
  • whatIfScenarios (Array, optional; default auto-generated) — Counterfactual scenarios for the whatIf[] engine. Each: { type: "salary_change" | "skill_emphasis", percent? (for salary), skill? (for skill), constraints?: { maxPercent?, minPercent? } }. When omitted, the actor auto-generates 2–4 representative scenarios. Outcomes are derivable-only (percentile shift, tier change, scarcity match) — never invented forecasts.

Input Examples

Broad market scan for data engineers:

{
    "query": "data engineer",
    "datePosted": "month",
    "analyzeSkills": true,
    "analyzeSalaries": true,
    "maxResults": 200
}

Remote-only React developer roles in Europe:

{
    "query": "react developer",
    "location": "Europe",
    "remoteOnly": true,
    "datePosted": "week",
    "sources": ["remotive", "arbeitnow", "jobicy"]
}

Monitor a specific company's hiring:

{
    "query": "engineer",
    "companyName": "Stripe",
    "maxResults": 50
}

Quick pulse check from HN startups only:

{
    "query": "machine learning",
    "sources": ["hn-whoishiring"],
    "datePosted": "month",
    "maxResults": 100
}

Segmented salary analysis (US vs Europe, junior vs senior, remote vs on-site):

{
    "query": "data engineer",
    "groupBy": ["location", "seniorityLevel", "remote"],
    "maxResults": 300
}

Daily monitoring schedule with trend insights + incremental fetch:

{
    "query": "rust engineer",
    "remoteOnly": true,
    "datePosted": "week",
    "enableHistoricalTracking": true,
    "incremental": true,
    "lookbackDays": 30
}

Schedule this in the Apify Console once a day. The first run writes a baseline; every subsequent run returns only fresh listings (incremental: true filters previously seen URLs) and a trendInsights block with rising/falling skills, listing growth rate, and direction. All sources are still fetched in full each run so the trend computation stays accurate.
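The incremental filter is conceptually a set difference on URLs. A minimal sketch, assuming the previous run's URLs are available from the persisted KV-store snapshot:

```python
def fresh_listings(current, previous_urls):
    """Drop listings whose URL appeared in the previous run.

    `current` is a list of listing dicts with a 'url' key;
    `previous_urls` is the URL set persisted by the prior run.
    Sketch of the documented incremental behaviour.
    """
    seen = set(previous_urls)
    return [job for job in current if job["url"] not in seen]
```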

Niche market with custom skill packs (Snowflake / Databricks ecosystem):

{
    "query": "data engineer",
    "customSkills": [
        { "name": "Snowpark", "regex": "\\bsnowpark\\b", "category": "Data" },
        { "name": "dbt", "regex": "\\bdbt\\b", "category": "Data" },
        { "name": "Databricks SQL", "regex": "databricks\\s+sql", "category": "Data" },
        { "name": "Unity Catalog", "regex": "unity\\s+catalog", "category": "Data" }
    ]
}
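The customSkills matching described above (one regex per skill, applied to the description) can be approximated like this; the function name and case-insensitive matching are assumptions consistent with the documented input shape:

```python
import re

def extract_skills(description, custom_skills):
    """Return the names of custom skills whose regex matches the
    job description. Each entry mirrors the customSkills input
    shape: { name, regex, category? }. Illustrative sketch.
    """
    found = []
    for skill in custom_skills:
        if re.search(skill["regex"], description, re.IGNORECASE):
            found.append(skill["name"])
    return found
```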

Down-weight noisier sources (HN comments) without dropping them entirely:

{
    "query": "site reliability engineer",
    "sourceWeights": { "hn-whoishiring": 0.3 }
}

Recruiter mode — actions prioritized for hiring teams:

{
    "query": "platform engineer",
    "mode": "recruiter",
    "enableHistoricalTracking": true,
    "groupBy": ["seniorityLevel", "remote"]
}

The recommendedActions[] array surfaces increase_salary_band, accelerate_hiring, and tighten_role_specs ahead of curriculum / job-seeker actions.

Analyst mode with sensitive event thresholds:

{
    "query": "machine learning engineer",
    "mode": "analyst",
    "enableHistoricalTracking": true,
    "eventThresholds": {
        "salarySpikePercent": 3,
        "listingGrowthSpikePercent": 10,
        "skillEmergenceDeltaPercent": 50
    }
}

Lower thresholds = more sensitive event firing. Useful for early-warning monitoring on volatile markets.
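Threshold evaluation is a straight comparison of run metrics against the merged defaults and overrides. A sketch covering three of the documented event types (the mapping from metric name to event name is an assumption):

```python
# Defaults as documented for the eventThresholds input.
DEFAULT_THRESHOLDS = {
    "salarySpikePercent": 5,
    "salaryDropPercent": -5,
    "listingGrowthSpikePercent": 25,
    "listingDropPercent": -25,
    "remoteShiftPoints": 5,
    "skillEmergenceDeltaPercent": 100,
}

def fire_events(metrics, overrides=None):
    """Return the event names whose thresholds the metrics cross.

    Overrides merge over the documented defaults; the metric keys
    used here are assumptions for illustration.
    """
    thresholds = {**DEFAULT_THRESHOLDS, **(overrides or {})}
    events = []
    if metrics.get("salaryMedianChangePercent", 0) >= thresholds["salarySpikePercent"]:
        events.append("salary_spike")
    if metrics.get("listingGrowthRate", 0) >= thresholds["listingGrowthSpikePercent"]:
        events.append("listing_growth_spike")
    if metrics.get("skillDeltaPercent", 0) >= thresholds["skillEmergenceDeltaPercent"]:
        events.append("skill_emergence")
    return events
```

With the analyst overrides above, a +4.7% salary move and +12.5% listing growth both fire; with the defaults, neither does.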

Constrained what-if simulation (recruiter with a 5% comp-budget cap):

{
    "query": "platform engineer",
    "mode": "recruiter",
    "whatIfScenarios": [
        { "type": "salary_change", "percent": 10, "constraints": { "maxPercent": 5 } },
        { "type": "salary_change", "percent": -3 },
        { "type": "skill_emphasis", "skill": "Kubernetes" },
        { "type": "skill_emphasis", "skill": "Rust" }
    ]
}

The first scenario asks "what if I raise comp 10%?" but constrains the answer to 5% (the recruiter's actual budget cap). The output's effectiveness: "limited" flags when the constraint binds. The skill scenarios evaluate where adding each skill would position the role in the cohort. Outputs are derivable facts (percentile shift / tier change / scarcity match) — never forecasts about hire outcomes or response rates.
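The percentile-shift mapping and the fixed tier thresholds (0.85 / 1.10 / 1.35 of the cohort median, per the methodology field in the output example) can be sketched as below. The label for the tier above 'above-market' is not documented here and is a guess:

```python
def scenario_percentile(value, salaries):
    """Rank `value` within the cohort's salary points. The actor pools
    each listing's min and max into one distribution; here `salaries`
    is that pooled list. Sketch only."""
    below = sum(1 for s in salaries if s <= value)
    return round(100 * below / len(salaries))

def compensation_tier(value, cohort_median):
    """Fixed cohort-median ratio thresholds from the docs: 0.85 / 1.10 / 1.35.
    The top tier's name is an assumption."""
    ratio = value / cohort_median
    if ratio < 0.85:
        return "below-market"
    if ratio < 1.10:
        return "at-market"
    if ratio < 1.35:
        return "above-market"
    return "premium"
```

The whatIf outputs are exactly this kind of derivable arithmetic over the run-time cohort, which is why they are honest about not being forecasts.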

Tips for Input

  • Start broad, then filter — Run a general query like "engineer" first to see the full landscape, then narrow with location or company filters in subsequent runs.
  • Source selection — Remotive and Jobicy focus on remote roles, Arbeitnow covers European markets heavily, and HN Who's Hiring surfaces startup opportunities. Use sources to target specific ecosystems.
  • Date filter — day = last 24 hours, week = last 7 days, month = last 30 days, any = no time restriction.

Output Example

The dataset contains two types of records. The first item is always a summary report:

{
    "type": "summary",
    "query": "data engineer",
    "location": null,
    "analyzedAt": "2026-05-02T14:32:00.000Z",
    "totalListings": 87,
    "sourceBreakdown": { "remotive": 24, "arbeitnow": 31, "jobicy": 18, "hn-whoishiring": 14 },
    "topSkills": [
        { "skill": "Python", "count": 62, "percentage": 71.3 },
        { "skill": "SQL", "count": 58, "percentage": 66.7 },
        { "skill": "AWS", "count": 41, "percentage": 47.1 },
        { "skill": "Spark", "count": 33, "percentage": 37.9 },
        { "skill": "Kafka", "count": 28, "percentage": 32.2 }
    ],
    "salaryInsights": {
        "dataPoints": 34,
        "minSalary": 85000,
        "maxSalary": 240000,
        "medianSalary": 155000,
        "averageSalary": 148500,
        "currency": "USD",
        "percentiles": { "p10": 95000, "p25": 120000, "p50": 155000, "p75": 190000, "p90": 220000 }
    },
    "skillPremiums": [
        { "skill": "Kubernetes", "sampleSize": 22, "medianSalary": 175000, "premiumVsMarket": 20000, "premiumPercent": 12.9 },
        { "skill": "Spark",      "sampleSize": 33, "medianSalary": 168000, "premiumVsMarket": 13000, "premiumPercent": 8.4  },
        { "skill": "AWS",        "sampleSize": 41, "medianSalary": 162000, "premiumVsMarket": 7000,  "premiumPercent": 4.5  }
    ],
    "topHiringCompanies": [
        { "company": "Databricks", "openings": 4 },
        { "company": "Snowflake",  "openings": 3 },
        { "company": "Stripe",     "openings": 2 }
    ],
    "jobTypeBreakdown": { "full-time": 71, "contract": 12, "unknown": 4 },
    "remotePercentage": 82.8,
    "seniorityBreakdown": {
        "intern": 0, "junior": 8.0, "mid": 21.8, "senior": 41.4, "staff": 6.9,
        "principal": 3.4, "lead": 5.7, "manager": 4.6, "director": 1.1,
        "vp-or-above": 0, "unknown": 7.1
    },
    "experienceRequirements": {
        "averageYearsMin": 4.2,
        "averageYearsMax": 7.1,
        "requireExperiencePercent": 78.2,
        "sampleSize": 68
    },
    "degreeRequirements": {
        "bachelorsRequiredPercent": 34.5,
        "mastersOrAbovePercent": 6.9,
        "noDegreeMentionedPercent": 51.7,
        "hardRequirementPercent": 12.6
    },
    "skillCategoryDemand": {
        "Languages": 28.7, "Frameworks": 11.5, "Cloud": 18.4,
        "Data": 33.3, "AI/ML": 5.7, "Other": 2.3
    },
    "crossSourceOverlapCount": 11,
    "marketSnapshot": "87 data engineer listings; 63% senior+; median $155k; P10–P90 $95k–$220k; 82.8% remote; Data 33.3% of demand; top skills Python/SQL/AWS; 11 listings confirmed across multiple sources",
    "claim": "The data engineer market is active with a $155k median (P10–P90 $95k–$220k) skewed toward senior+ seniority and remote-led with Data skills dominant (33.3% of demand).",
    "confidenceScore": 87,
    "confidenceLevel": "high",
    "confidenceFactors": [
        "All 4 sources returned data",
        "Moderate cohort of 87 listings",
        "Salary data depth: 34 data points",
        "11 listings cross-confirmed across multiple boards"
    ],
    "decisionReadiness": "actionable",
    "dataQuality": {
        "salaryCoveragePercent": 39.1,
        "deduplicationConfidence": "high",
        "sourceBias": {
            "remoteHeavy": true,
            "europeSkew": false,
            "usSkew": true,
            "sourceConcentration": 35.6,
            "dominantSource": "arbeitnow"
        },
        "notes": [
            "82.8% of listings are remote — on-site benchmarks under-represented.",
            "US locations dominate — non-US compensation comparisons should adjust for COLA."
        ]
    },
    "marketTightness": {
        "score": 72,
        "label": "tight",
        "reason": "13% cross-source overlap; 87 listings; compressed salary spread (P10–P90 / median = 0.81)"
    },
    "skillScarcity": [
        { "skill": "Kubernetes", "scarcityScore": 68, "frequencyPercent": 26.4, "premiumPercent": 12.9, "reason": "+12.9% salary premium with 26.4% market frequency" },
        { "skill": "Spark",      "scarcityScore": 62, "frequencyPercent": 37.9, "premiumPercent": 8.4,  "reason": "+8.4% salary premium with 37.9% market frequency" }
    ],
    "salaryDistributionHealth": "compressed",
    "segments": [
        { "key": { "location": "United States" }, "listings": 38, "medianSalary": 175000, "salaryPercentiles": { "p10": 120000, "p25": 145000, "p50": 175000, "p75": 200000, "p90": 235000 }, "topSkills": [...], "seniorityBreakdown": {...}, "remotePercentage": 71.1, "crossSourceConfirmedPercent": 18.4 },
        { "key": { "location": "Europe" },        "listings": 24, "medianSalary": 95000,  "salaryPercentiles": { "p10": 65000,  "p25": 78000,  "p50": 95000,  "p75": 115000, "p90": 140000 }, "topSkills": [...], "seniorityBreakdown": {...}, "remotePercentage": 91.7, "crossSourceConfirmedPercent": 8.3  }
    ],
    "trendInsights": {
        "sinceLastRun": true,
        "previousRunAt": "2026-04-25T14:32:00.000Z",
        "daysSincePreviousRun": 7.0,
        "listingGrowthRate": 12.5,
        "salaryMedianChange": 7000,
        "salaryMedianChangePercent": 4.7,
        "remotePercentageChange": 2.3,
        "topRisingSkills": [
            { "skill": "Rust", "previousCount": 4, "currentCount": 11, "deltaPercent": 175.0 },
            { "skill": "Databricks", "previousCount": 8, "currentCount": 14, "deltaPercent": 75.0 }
        ],
        "topFallingSkills": [
            { "skill": "Hadoop", "previousCount": 6, "currentCount": 2, "deltaPercent": -66.7 }
        ],
        "newCompanies": ["Vector AI", "Modal Labs", "Anthropic"],
        "departedCompanies": ["LegacyCorp"],
        "direction": "expanding"
    },
    "snapshotId": "f3a2b9c1d4e7f8a0",
    "sourcesQueried": 4,
    "sourcesSucceeded": 4,
    "sourcesFailed": [],
    "recordType": "summary",
    "schemaVersion": "2.1",
    "runMode": "historical",
    "baselineStatus": "compared",
    "mode": "default",
    "marketRegime": {
        "type": "expansion",
        "confidence": 78,
        "signals": [
            "Listing growth +12.5%",
            "Salary median +4.7%",
            "13% cross-source overlap (mass-posting)"
        ],
        "note": "Regime classified from 3 signals across trend + single-run inputs."
    },
    "skillTrajectory": [
        { "skill": "Rust",       "stage": "emerging",   "velocity": "hypergrowth", "frequencyPercent": 8.1,  "premiumPercent": 14.2, "deltaPercent": 175.0, "confidence": 100, "reason": "8.1% market frequency; +14.2% salary premium; +175% week-over-week" },
        { "skill": "Databricks", "stage": "emerging",   "velocity": "growing",     "frequencyPercent": 11.3, "premiumPercent": 9.8,  "deltaPercent": 75.0,  "confidence": 100, "reason": "11.3% market frequency; +9.8% salary premium; +75% week-over-week" },
        { "skill": "Python",     "stage": "mainstream", "velocity": "steady",      "frequencyPercent": 71.3, "premiumPercent": 2.1,  "deltaPercent": null,  "confidence": 75,  "reason": "71.3% market frequency; +2.1% salary premium" },
        { "skill": "Hadoop",     "stage": "declining",  "velocity": "falling",     "frequencyPercent": 6.7,  "premiumPercent": -3.2, "deltaPercent": -66.7, "confidence": 100, "reason": "6.7% market frequency; -3.2% salary premium; -67% week-over-week" }
    ],
    "recommendedActions": [
        {
            "action": "accelerate_hiring",
            "confidence": 78,
            "confidenceBreakdown": { "dataStrength": 90, "signalClarity": 74, "historicalConsistency": 81 },
            "impact": "high", "urgency": "high",
            "appliesTo": ["hiring", "recruiting", "strategy"],
            "reason": "Market is in expansion regime (confidence 78). Listing growth +12.5%; Salary median +4.7%. Move now while supply still meets demand."
        },
        {
            "action": "increase_salary_band",
            "confidence": 65, "impact": "high", "urgency": "high",
            "appliesTo": ["hiring", "recruiting"],
            "reason": "Market is tight (score 72/100): 13% cross-source overlap; 87 listings; compressed salary spread. Median is $155k — bands below this will struggle to attract candidates."
        },
        {
            "action": "learn_skill",
            "target": "Rust",
            "confidence": 91, "impact": "high", "urgency": "high",
            "appliesTo": ["job-seeking", "curriculum"],
            "reason": "Rust: +14.2% salary premium with 8.1% market frequency. Scarcity score 78/100 — high salary lift with low market saturation."
        },
        {
            "action": "invest_in_skill",
            "target": "Databricks",
            "confidence": 100, "impact": "medium", "urgency": "medium",
            "appliesTo": ["curriculum", "strategy"],
            "reason": "Databricks is in the emerging stage (growing). 11.3% market frequency; +9.8% salary premium; +75% week-over-week. Early adopters get the premium before mainstream saturation."
        }
    ],
    "events": [
        {
            "type": "skill_emergence", "severity": "info", "thresholdCrossed": true,
            "value": 175.0, "threshold": 100, "target": "Rust",
            "message": "Rust demand jumped 175% week-over-week (stage: emerging)"
        },
        {
            "type": "new_companies_surge", "severity": "info", "thresholdCrossed": true,
            "value": 3, "threshold": 3,
            "message": "3 new companies entered the cohort: Vector AI, Modal Labs, Anthropic"
        }
    ],
    "actionClusters": [
        {
            "theme": "talent_pipeline",
            "actions": ["accelerate_hiring"],
            "priority": "high",
            "summary": "accelerate_hiring"
        },
        {
            "theme": "compensation_strategy",
            "actions": ["increase_salary_band"],
            "priority": "high",
            "summary": "increase_salary_band"
        },
        {
            "theme": "skill_strategy",
            "actions": ["learn_skill:Rust", "invest_in_skill:Databricks"],
            "priority": "high",
            "summary": "2 actions: learn_skill:Rust, invest_in_skill:Databricks"
        }
    ],
    "whatIf": [
        {
            "scenario": "salary_change",
            "input": { "type": "salary_change", "percent": 10 },
            "effectiveness": "strong",
            "predictedEffect": {
                "appliedPercent": 10,
                "currentMedianSalary": 155000,
                "scenarioMedianSalary": 170500,
                "currentPercentile": 50,
                "scenarioPercentile": 78,
                "percentilePointsGained": 28,
                "scenarioCompensationTier": "above-market"
            },
            "confidence": 60,
            "confidenceLevel": "medium",
            "methodology": "Percentile-shift mapping against the cohort's pooled min+max salary distribution at run time. Tier classification uses fixed cohort-median ratio thresholds (0.85 / 1.10 / 1.35).",
            "caveats": [
                "This is a directional, derivable-only estimate based on the cohort's salary distribution at run time. It is not a forecast.",
                "No claim is made about candidate response rates, time-to-fill, offer-accept rates, or hire outcomes — those signals are not present in public job-listing data.",
                "Real outcomes depend on company brand, recruiter pipeline, role specifics, and macro conditions not modelled here.",
                "Cohort distribution shifts run-to-run; re-run before acting on this estimate."
            ],
            "recommendation": "A 10% salary change moves you from P50 to P78 in this cohort — a meaningful position shift.",
            "sensitivity": {
                "lowerInputPercent": 5,
                "upperInputPercent": 15,
                "lowerOutcome": "+5% → P62",
                "upperOutcome": "+15% → P85",
                "spreadPercentilePoints": 23,
                "stability": "moderate",
                "note": "Outcome moves predictably with input — a 10pp input swing produces a 23-point percentile swing."
            }
        },
        {
            "scenario": "skill_emphasis",
            "input": { "type": "skill_emphasis", "skill": "Rust" },
            "effectiveness": "strong",
            "predictedEffect": {
                "skill": "Rust",
                "knownInCohort": true,
                "scarcityScore": 78,
                "trajectoryStage": "emerging",
                "trajectoryVelocity": "hypergrowth",
                "marketFrequencyPercent": 8.1,
                "salaryPremiumPercent": 14.2
            },
            "confidence": 60,
            "confidenceLevel": "medium",
            "methodology": "Skill is matched (case-insensitive) against the cohort's skillScarcity, skillTrajectory, skillPremiums, and topSkills outputs. No external benchmark or hire-outcome data is used.",
            "caveats": [
                "This is a market-positioning estimate, not a hire/job-acquisition forecast.",
                "Skill demand changes over time; re-run before acting on this estimate.",
                "Premium percentages are sample-size gated (≥5 listings); skills below that threshold return null premium."
            ],
            "recommendation": "Adding \"Rust\" aligns with a high-leverage position: emerging stage with scarcity score 78/100, +14.2% salary premium.",
            "sensitivity": null
        }
    ],
    "decisionTension": [
        {
            "between": ["increase_salary_band", "tighten_role_specs"],
            "tension": "cost_vs_selectivity",
            "explanation": "Raising salary improves candidate positioning, while tightening role specs reduces the eligible pool. Doing both at once may produce a small, expensive hire pipeline that misses both levers individually.",
            "recommendedBalance": "In tight markets prioritise the salary increase first; defer spec tightening unless inbound pipeline volume becomes excessive."
        }
    ],
    "rejectedActions": [
        {
            "action": "decrease_salary_band",
            "reason": "Market is tight (score 72/100). Lowering salary would reduce competitiveness against a pipeline that already favours employers raising bands. Not recommended."
        },
        {
            "action": "expand_geographic_search",
            "reason": "82.8% of listings are remote — geographic expansion adds no opportunity coverage when the market is location-agnostic. Use remote-first sourcing instead."
        },
        {
            "action": "hold_strategy",
            "reason": "Market regime is expansion with confidence 78/100 — there is a clear directional edge. Doing nothing is not the right read for this cohort."
        }
    ],
    "marketMemory": {
        "regimeHistory": [
            { "regime": "expansion", "at": "2026-04-04T14:32:00.000Z" },
            { "regime": "expansion", "at": "2026-04-11T14:32:00.000Z" },
            { "regime": "expansion", "at": "2026-04-18T14:32:00.000Z" },
            { "regime": "expansion", "at": "2026-04-25T14:32:00.000Z" },
            { "regime": "expansion", "at": "2026-05-02T14:32:00.000Z" }
        ],
        "regimeStability": 1.0,
        "lastInflectionDaysAgo": null,
        "pattern": "expansion_stable",
        "note": "Pattern derived from the last 5 regime classifications (capped at 12)."
    },
    "analysisMetadata": {
        "salarySampleSize": 34,
        "segmentCount": 2,
        "historicalTrackingEnabled": true,
        "incrementalApplied": false,
        "customSkillCount": 0,
        "sourceWeightsApplied": false,
        "sourcesQueried": 4,
        "sourcesSucceeded": 4,
        "mode": "default"
    },
    "warnings": [
        "82.8% of listings are remote — on-site benchmarks under-represented.",
        "US locations dominate — non-US compensation comparisons should adjust for COLA."
    ]
}
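Automations can gate on the summary's decisionReadiness field before acting. A minimal sketch of the documented rule (confidence ≥70 plus ≥10 salary data points and ≥10 listings for 'actionable'; fewer than 10 listings is 'insufficient-data'; everything else is 'monitor'):

```python
def decision_readiness(confidence_score, salary_points, listings):
    """Reproduce the documented automation gate from summary fields."""
    if listings < 10:
        return "insufficient-data"
    if confidence_score >= 70 and salary_points >= 10:
        return "actionable"
    return "monitor"
```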

Each subsequent item is a normalized job listing:

{
    "type": "job",
    "source": "remotive",
    "title": "Senior Data Engineer",
    "company": "Snowflake",
    "location": "Worldwide",
    "remote": true,
    "jobType": "full-time",
    "salaryMin": 160000,
    "salaryMax": 210000,
    "salaryCurrency": "USD",
    "description": "We are looking for a Senior Data Engineer to build and maintain our core data platform...",
    "skills": ["Python", "SQL", "Spark", "Kafka", "Airflow", "AWS", "Docker", "Kubernetes"],
    "tags": ["data", "engineering", "big-data"],
    "postedDate": "2026-05-02T08:00:00.000Z",
    "url": "https://remotive.com/remote-jobs/software-dev/senior-data-engineer-12345",
    "applyUrl": "https://remotive.com/remote-jobs/software-dev/senior-data-engineer-12345",
    "seniorityLevel": "senior",
    "experienceYearsMin": 5,
    "experienceYearsMax": 8,
    "degreeRequired": "bachelors",
    "degreeIsHardRequirement": false,
    "skillCategoryProfile": "Data",
    "crossSourceConfirmed": true,
    "crossSourceCount": 2,
    "compensationTier": "above-market",
    "recommendedAction": "apply-now",
    "actionReason": "Above-market compensation tier (110–135% of market median) with disclosed salary at a named company.",
    "recordType": "job"
}
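A downstream consumer can split the dataset into the summary record and the job records like this (a sketch; item shapes follow the examples above):

```python
def split_dataset(items):
    """Separate the leading summary record from the job records,
    matching the documented dataset layout (summary first, then jobs)."""
    summary = next(i for i in items if i.get("type") == "summary")
    jobs = [i for i in items if i.get("type") == "job"]
    return summary, jobs
```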

Output Fields — Summary Report

  • type (string) — Always "summary" for the report record
  • query (string) — The search query used
  • location (string | null) — Location filter applied (if any)
  • analyzedAt (string) — ISO timestamp of when the analysis ran
  • totalListings (number) — Total deduplicated job listings found
  • sourceBreakdown (object) — Count of listings per source (e.g., {"remotive": 24, "arbeitnow": 31})
  • topSkills (array) — Top 30 skills ranked by frequency, each with skill, count, and percentage
  • salaryInsights (object | null) — Salary statistics: dataPoints, minSalary, maxSalary, medianSalary, averageSalary, currency, plus percentiles (p10/p25/p50/p75/p90) when ≥5 data points
  • skillPremiums (array) — Per-skill median salary lift vs cohort median, each with skill, sampleSize, medianSalary, premiumVsMarket, premiumPercent (only skills with ≥5 salary data points)
  • topHiringCompanies (array) — Top 20 companies by number of open positions, each with company and openings
  • jobTypeBreakdown (object) — Count per job type: full-time, part-time, contract, internship, temporary, unknown
  • remotePercentage (number) — Percentage of listings flagged as remote
  • seniorityBreakdown (object) — Percentage of listings at each seniority level: intern, junior, mid, senior, staff, principal, lead, manager, director, vp-or-above, unknown
  • experienceRequirements (object) — averageYearsMin, averageYearsMax, requireExperiencePercent, sampleSize
  • degreeRequirements (object) — bachelorsRequiredPercent, mastersOrAbovePercent, noDegreeMentionedPercent, hardRequirementPercent
  • skillCategoryDemand (object) — Percentage of listings whose dominant skill area is each category: Languages, Frameworks, Cloud, Data, AI/ML, Other
  • crossSourceOverlapCount (number) — Count of listings that appeared on multiple boards before deduplication (legitimacy signal)
  • marketSnapshot (string) — Slack/email-ready one-line headline summarizing the cohort (metric-first)
  • claim (string) — Analyst-style one-sentence conclusion about the cohort (paste verbatim into reports / Slack / agent prompts)
  • confidenceScore (number) — 0–100 score combining source coverage (30%), cohort size (30%), salary data depth (25%), and cross-source overlap (15%)
  • confidenceLevel (string) — Banded confidence: high (≥75), medium (≥50), low (<50). Use this in Dify/n8n switch nodes.
  • confidenceFactors (string[]) — Plain-English explanations of why confidence is what it is — usable verbatim in reports
  • decisionReadiness (string) — Automation gate: actionable (confidence ≥70 + ≥10 salary points + ≥10 listings), monitor (worth tracking but don't auto-act), insufficient-data (<10 listings)
  • dataQuality (object) — Auditability block: salaryCoveragePercent, deduplicationConfidence (high/medium/low), sourceBias ({remoteHeavy, europeSkew, usSkew, sourceConcentration, dominantSource}), notes[] plain-English bias warnings
  • marketTightness (object) — Supply/demand index: { score (0–100), label: tight/balanced/loose/unknown, reason }. Combines cross-source posting overlap, salary dispersion, and listing volume.
  • skillScarcity (object[]) — Top 10 skills ranked by scarcityScore (high salary premium AND low frequency). Each: { skill, scarcityScore (0–100), frequencyPercent, premiumPercent, reason }. Empty when cohort < 20 listings.
  • salaryDistributionHealth (string) — wide (P10–P90 spread > 1.2× median) / balanced / compressed (< 0.5×) / unknown. Compressed = mature/standardised market.
  • segments (object[]) — Per-segment analytics when groupBy is set. Each: { key, listings, medianSalary, salaryPercentiles, topSkills, seniorityBreakdown, remotePercentage, crossSourceConfirmedPercent }. Capped at 50.
  • trendInsights (object | null) — Cross-run trends when enableHistoricalTracking is on AND a prior snapshot exists within lookbackDays. { sinceLastRun, previousRunAt, daysSincePreviousRun, listingGrowthRate, salaryMedianChange, salaryMedianChangePercent, remotePercentageChange, topRisingSkills[], topFallingSkills[], newCompanies[], departedCompanies[], direction }. Null on first run.
  • snapshotId (string) — 16-char SHA-256 hash over query + location + sources + listing fingerprint. Compare across runs to detect when the cohort actually changed.
  • schemaVersion (string) — Output contract version (semver-style) — currently "2.1". Major bumps signal breaking changes; minor bumps signal additive expansions. 2.1 is additive-only since 2.0 (added: actionClusters, whatIf + sensitivity, marketMemory, decisionTension, rejectedActions, action confidenceBreakdown). Branch on this in long-lived integrations to opt into new features explicitly.
  • runMode (string) — What kind of run this was: snapshot (one-shot), historical (snapshot + trend computation), incremental (snapshot + trend + drop already-seen URLs).
  • baselineStatus (string) — Lifecycle of the historical snapshot for this run: created (first baseline written), compared (trend insights computed against an existing baseline), expired (prior baseline was older than lookbackDays — fresh one written, trends null this run), disabled (historical tracking off).
  • analysisMetadata (object) — Run-level metadata about the analytics computation: salarySampleSize, segmentCount, historicalTrackingEnabled, incrementalApplied, customSkillCount, sourceWeightsApplied, sourcesQueried, sourcesSucceeded, mode. Distinct from dataQuality (which is about the cohort's biases, not the run's machinery).
  • warnings (string[]) — Top-level run-level warnings (sources failed, low confidence, expired baseline, critical events, etc.). Promotes dataQuality.notes alongside other run-level signals so downstream consumers don't have to walk into nested objects. Empty array when nothing notable. Read this before acting on the cohort's analytics.
  • mode (string) — Active persona preset: default / job_seeker / recruiter / analyst. Echoed on the summary so downstream automation can branch on the persona that produced the output.
  • marketRegime (object) — State classification: { type (expansion/contraction/stagnation/volatility/unknown), confidence (0–100), signals[] (which thresholds fired), note }. Combines trend + single-run signals; confidence is materially higher when historical tracking is on.
  • recommendedActions (object[]) — Cohort-level action engine (capped at 12). Each: { action, target?, confidence (0–100), confidenceBreakdown: { dataStrength, signalClarity, historicalConsistency }, impact (high/medium/low), urgency (high/medium/low), appliesTo[] (hiring/recruiting/job-seeking/curriculum/strategy/monitoring), reason }. Sorted by mode audience priority, then urgency, then confidence. Branch on action (stable enum string) for automation; filter by appliesTo to surface only the actions a given persona cares about. Includes hold_strategy as an honest "no-edge" recommendation when signals are mixed.
  • actionClusters (object[]) — Recommended actions grouped by theme: compensation_strategy, talent_pipeline, skill_strategy, monitoring_strategy, source_strategy, general. Each: { theme, actions[], priority (high/medium/low), summary }. Sorted high → low priority then by cluster size. Reduces noise when 8–12 actions belong to a few strategic surfaces.
  • whatIf (object[]) — Counterfactual scenarios with honest, derivable-only outcomes (percentile shift, tier change, scarcity match) — never invented forecasts. Each: { scenario, input, effectiveness (strong/moderate/limited/none/unknown), predictedEffect, confidence (hard-capped at 60), confidenceLevel, methodology, caveats[], recommendation, sensitivity }. sensitivity (salary scenarios only) ships lowerOutcome/upperOutcome at user-input ±5pp + a stability enum (low / moderate / high / unknown) so you can see if the percentile shift is robust to small input variation. Auto-generated when whatIfScenarios input is omitted; honors user scenarios + constraints when supplied. Scenario types: salary_change (% delta) and skill_emphasis (named skill).
  • decisionTension (object[]) — Trade-off pairs detected across recommendedActions[]. Each: { between: [actionA, actionB], tension (cost_vs_selectivity / speed_vs_quality / remote_vs_local_reach / act_now_vs_wait / early_mover_vs_safe_bet / depth_vs_breadth), explanation, recommendedBalance }. Surfaces when two recommended actions work against each other under a single sourcing pipeline. Empty when no contradictory pairs are present.
rejectedActionsobject[]Anti-recommendations — actions explicitly NOT recommended for this cohort, with reason. Each: { action, target?, reason }. The dual of hold_strategy: instead of staying silent on the obvious wrong moves, the system surfaces them and explains why it skipped them. Builds trust by showing the engine considered alternatives. Empty when no anti-recommendations apply.
marketMemoryobjectBounded last-12-runs regime history with pattern detection. { regimeHistory[] (regime + at), regimeStability (0..1), lastInflectionDaysAgo, pattern, note }. Patterns: expansion_stable / expansion_weakening / contraction_stable / contraction_deepening / volatile_shifting / stagnation_persistent / inflection_recent / insufficient-history (until 3 snapshots) / mixed. Activates with enableHistoricalTracking; meaningful at 3+ snapshots. Lets you reason in patterns, not just deltas.
skillTrajectoryobject[]Per-skill lifecycle classification (top 20 skills): { skill, stage (declining/stable/emerging/mainstream/saturated), velocity (hypergrowth/growing/steady/cooling/falling/unknown), frequencyPercent, premiumPercent, deltaPercent, confidence, reason }. Sorted emerging → mainstream → other. The bridge between rising/falling counts and "what does it mean for me?"
eventsobject[]Threshold-crossing events ready for downstream alerting. Each: { type, severity (critical/warning/info), thresholdCrossed, value, threshold, target?, message }. Event types: salary_spike, salary_drop, listing_growth_spike, listing_drop, remote_share_shift, skill_emergence, skill_collapse, new_companies_surge, cohort_collapse. Thresholds user-overridable via the eventThresholds input. Sorted critical → warning → info.
sourcesQueriednumberNumber of job board sources queried this run
sourcesSucceedednumberNumber of job board sources that returned data
sourcesFailedstring[]Names of sources that failed this run; empty when all succeeded
recordTypestringDiscriminator for downstream filtering — summary for the summary record, job for individual listings, error for error records. (type is a deprecated alias kept for back-compat.)
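The appliesTo[] field makes recommendedActions[] easy to route per persona. A minimal Python sketch — field names follow the tables above, but the sample record and the benchmark_compensation action value are illustrative, not real actor output:

```python
def actions_for(summary, audience):
    """Return recommended actions relevant to one appliesTo audience,
    highest urgency first."""
    rank = {"high": 0, "medium": 1, "low": 2}
    relevant = [
        a for a in summary.get("recommendedActions", [])
        if audience in a.get("appliesTo", [])
    ]
    return sorted(relevant, key=lambda a: rank.get(a.get("urgency", "low"), 2))

# Illustrative summary record (shape per the field table; values hypothetical).
summary = {
    "recordType": "summary",
    "recommendedActions": [
        {"action": "hold_strategy", "urgency": "low",
         "appliesTo": ["monitoring"], "reason": "Signals are mixed."},
        {"action": "benchmark_compensation", "urgency": "high",
         "appliesTo": ["hiring", "recruiting"], "reason": "Median shifted."},
    ],
}
for a in actions_for(summary, "hiring"):
    print(a["action"], a["urgency"])
```

Because action is a stable enum string, the same pattern works as a branch condition in n8n/Dify/SQL without any prompt engineering.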

Output Fields — Job Listing

type (string): Always "job" for individual listings.
source (string): Which board the listing came from: remotive, arbeitnow, jobicy, or hn-whoishiring.
title (string): Job title (extracted or parsed from source).
company (string): Company name (HN listings may show "Unknown (HN)" if parsing fails).
location (string | null): Job location (may be "Remote", a city, or null).
remote (boolean): Whether the position is remote.
jobType (string | null): Normalized job type: full-time, part-time, contract, internship, temporary.
salaryMin (number | null): Minimum salary (annual, in stated currency).
salaryMax (number | null): Maximum salary (annual, in stated currency).
salaryCurrency (string | null): Currency code: USD or EUR.
description (string): Job description text (HTML stripped, max 2,000 chars).
skills (string[]): Technologies detected in the description (e.g., ["Python", "AWS", "Docker"]).
tags (string[]): Tags from the source API (empty for HN listings).
postedDate (string | null): ISO timestamp of when the job was posted.
url (string): URL to the original listing.
applyUrl (string | null): Direct application URL (when available).
seniorityLevel (string): One of intern, junior, mid, senior, staff, principal, lead, manager, director, vp-or-above, unknown.
experienceYearsMin (number | null): Minimum years of experience requested (parsed from description).
experienceYearsMax (number | null): Maximum years of experience requested.
degreeRequired (string): One of bachelors, masters, phd, any-degree, no-mention.
degreeIsHardRequirement (boolean): True if the degree is required (vs preferred / equivalent experience accepted).
skillCategoryProfile (string | null): Dominant skill area for this role: Languages, Frameworks, Cloud, Data, AI/ML, Other.
crossSourceConfirmed (boolean): True if this listing appeared on multiple job boards before deduplication.
crossSourceCount (number): Number of source boards this listing appeared on.
compensationTier (string): Salary vs market median for this query: below-market (<85%), at-market (85–110%), above-market (110–135%), premium (>135%), unknown (no salary data).
recommendedAction (string): Decision enum for routing in Dify/n8n workflows: apply-now, research-company, review-fit, skip-low-detail.
actionReason (string): Plain-English sentence explaining WHY recommendedAction is what it is — paste verbatim into Slack/email/agent prompts.
recordType (string): Always "job" for listings (mirrors type for forward-compatibility with the standard Apify discriminator pattern).

Common workflows

One-shot market pulse (no schedule)

Run with no historical-tracking flags. Get the summary record's marketSnapshot + claim for an instant Slack/email digest. Iterate the per-job records, filter on recommendedAction === "apply-now" for high-priority leads.
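This workflow can be sketched in a few lines, assuming the dataset items have already been fetched with the Apify client (recordType and recommendedAction follow the output field tables; the sample items here are illustrative):

```python
def high_priority_leads(items):
    """Split dataset items into the summary record and apply-now jobs."""
    summary = next((i for i in items if i.get("recordType") == "summary"), None)
    leads = [
        i for i in items
        if i.get("recordType") == "job"
        and i.get("recommendedAction") == "apply-now"
    ]
    return summary, leads

# Illustrative dataset items (shape per the output tables above).
items = [
    {"recordType": "summary", "totalListings": 2},
    {"recordType": "job", "title": "Data Engineer", "recommendedAction": "apply-now"},
    {"recordType": "job", "title": "Analyst", "recommendedAction": "review-fit"},
]
summary, leads = high_priority_leads(items)
print(summary["totalListings"], [j["title"] for j in leads])
```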

Weekly salary trend monitoring (scheduled)

Set enableHistoricalTracking: true + lookbackDays: 14. Schedule weekly. Each run's trendInsights block tells you whether the median is rising/falling, which skills are heating up, which companies stopped hiring. Pipe into a Slack alert: if (trendInsights.salaryMedianChangePercent > 5) sendAlert(...).
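The alert condition above, sketched as a guard you might drop into a scheduled script. It fires on movement in either direction (a slight generalization of the one-sided check shown inline); the trendInsights field name follows this doc, and the downstream Slack/webhook call is left to you:

```python
def should_alert(trend_insights, threshold=5.0):
    """Fire when the cohort's median salary moved more than `threshold` percent."""
    if not trend_insights:
        return False  # first run, expired baseline, or tracking disabled
    change = trend_insights.get("salaryMedianChangePercent")
    return change is not None and abs(change) > threshold

print(should_alert({"salaryMedianChangePercent": 7.2}))
```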

Daily fresh-listings feed (scheduled, incremental)

enableHistoricalTracking: true + incremental: true. Schedule daily. Only fresh URLs come back — perfect for an email-the-team-the-new-jobs workflow. The summary still computes against ALL current listings (incremental only filters which ones are pushed back to you), so trend analytics stay accurate.

Cross-region salary comparison (single run)

groupBy: ["location"] returns per-location segments with their own salary percentiles, top skills, and seniority breakdown. Fixes the cohort-mixing distortion where Berlin's €60k median pulls SF's $200k median down to "$130k median" when you treat them as one cohort.

Talent pipeline monitor for a single company

companyName: "Stripe" + enableHistoricalTracking: true. Schedule weekly. trendInsights.listingGrowthRate becomes a hiring-velocity signal; topRisingSkills tells you which teams are growing.

Niche-market intelligence (custom skills)

Add customSkills for the technologies your competitive landscape cares about that the built-in 80 don't cover (e.g. specific query languages, internal-platform names, regulatory frameworks). Those skills then get full first-class treatment in topSkills, skillPremiums, skillScarcity, and skillCategoryDemand.
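An illustrative run input for this workflow — the customSkills entries below are examples for a hypothetical quant-finance niche, not built-ins:

```python
# Hypothetical niche-market input: customSkills terms are examples only.
run_input = {
    "query": "quant developer",
    "customSkills": ["kdb+/q", "FIX protocol", "OCaml", "MiFID II"],
    "analyzeSkills": True,
    "maxResults": 200,
}
```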

What makes this actor different (vs other job market analysis tools)

This actor is an alternative to LinkedIn Talent Insights, Lightcast (formerly Burning Glass), Revelio Labs, Datapeople, Greenhouse Reports, Ashby Analytics, generic job scrapers and job aggregators — but built for automation workflows rather than dashboards or sales-team consumption.

Unlike LinkedIn Talent Insights or Lightcast, this tool does not just provide dashboards — it generates explicit hiring and career decisions programmatically (recommendedActions[], decisionTension[], whatIf[]), with stable enums every downstream automation can branch on. The output is decisions, not visualisations.

  • Generic job board scraper (single-source): raw listings, but no skill extraction, no salary stats, no decision layer, no cross-board overlap signal.
  • LinkedIn / Indeed / Glassdoor scrapers: larger volume, but no multi-source aggregation; auth-walled; high block risk; flat output.
  • Lightcast / Revelio / LinkedIn Talent Insights (enterprise): macro labor data and employee-level intel, but $$$$ and behind sales-call paywalls; not embeddable in your automation.
  • Job Market Intelligence (this actor): decision-ready output (recommendedAction, compensationTier, decisionReadiness); cohort analytics (percentiles, premiums, market tightness, scarcity); per-segment breakdowns; cross-run trend insights; data-quality auditability; trade-off detection (decisionTension); anti-recommendations (rejectedActions); counterfactual simulation (whatIf with sensitivity). What's missing: public-API coverage only (Remotive / Arbeitnow / Jobicy / HN); no LinkedIn / Indeed / Glassdoor; no candidate-side data.

The positioning is composable labor-market strategy engine for automation: stable enums on every record so Dify / n8n / Zapier / SQL can branch without prompt engineering, plus the cohort-level analytics and trend layers that turn one-shot scrapes into a monitoring product, plus the strategy layer (recommended actions / trade-offs / what-if scenarios) that turns analytics into decisions.

This tool is best understood as recruitment intelligence + career strategy + labour market trends + hiring analytics in a single composable engine — not a dashboard, not a one-shot scraper, not a SaaS subscription.

Use Cases

  • Job seekers — Search for roles matching your skills, compare salary ranges across companies, and discover which technologies are most in-demand for your target position
  • Recruiters and talent acquisition teams — Monitor competitor hiring activity, understand which skills the market demands, and benchmark compensation packages before writing job descriptions
  • HR and workforce planning analysts — Track hiring trends over time by scheduling periodic runs to build a longitudinal dataset of skill demand and salary movement
  • Career coaches and bootcamp instructors — Identify the most requested programming languages, frameworks, and cloud platforms so you can align curriculum with real employer needs
  • Startup founders — Research the talent landscape before hiring. See what competitors pay, which skills are scarce, and whether remote or on-site roles dominate your niche
  • Data journalists and researchers — Gather structured, source-attributed job market data for articles, reports, or academic studies on labor economics and tech hiring

API & Programmatic Access

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/job-market-intelligence").call(run_input={
    "query": "data engineer",
    "remoteOnly": True,
    "analyzeSkills": True,
    "analyzeSalaries": True,
    "maxResults": 200,
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item["type"] == "summary":
        print(f"Total listings: {item['totalListings']}")
        print(f"Remote %: {item['remotePercentage']}%")
        if item.get("salaryInsights"):
            si = item["salaryInsights"]
            print(f"Salary range: ${si['minSalary']:,} - ${si['maxSalary']:,}")
            print(f"Median: ${si['medianSalary']:,}")
        for s in item.get("topSkills", [])[:10]:
            print(f"  {s['skill']}: {s['count']} ({s['percentage']}%)")
    else:
        print(f"{item['company']} - {item['title']} ({item['source']})")

JavaScript

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('ryanclinton/job-market-intelligence').call({
    query: 'data engineer',
    remoteOnly: true,
    analyzeSkills: true,
    analyzeSalaries: true,
    maxResults: 200,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
const summary = items.find(i => i.type === 'summary');
const jobs = items.filter(i => i.type === 'job');

console.log(`Found ${summary.totalListings} listings, ${summary.remotePercentage}% remote`);
console.log('Top skills:', summary.topSkills.slice(0, 5).map(s => s.skill).join(', '));
jobs.forEach(j => console.log(`${j.company} - ${j.title} (${j.source})`));

cURL

# Start the actor
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~job-market-intelligence/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "data engineer",
    "remoteOnly": true,
    "analyzeSkills": true,
    "maxResults": 200
  }'

# Fetch results (use defaultDatasetId from the response above)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"

How It Works — Technical Details

Input: query, location, remoteOnly, datePosted, sources, maxResults
  │
  ▼
┌──────────────────────────────────────────────────────────────────┐
│ PARALLEL FETCH (Promise.allSettled — failures don't crash run)  │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────┐  ┌─────────┐ │
│  │ Remotive     │  │ Arbeitnow    │  │ Jobicy   │  │ HN      │ │
│  │              │  │              │  │          │  │ Algolia │ │
│  │ GET /api/    │  │ GET /api/    │  │ GET /api │  │ GET /api│ │
│  │ remote-jobs  │  │ job-board-api│  │ /v2/     │  │ /v1/    │ │
│  │ ?search=X    │  │ ?search=X    │  │ remote-  │  │ search  │ │
│  │ &limit=N     │  │ &page=1..3   │  │ jobs     │  │ ?query= │ │
│  │              │  │              │  │ ?count=N │  │ X&tags= │ │
│  │ Salary from  │  │ Salary from  │  │ &tag=X   │  │ comment │ │
│  │ field +      │  │ description  │  │          │  │ ,ask_hn │ │
│  │ description  │  │ regex        │  │ Salary   │  │         │ │
│  │ fallback     │  │              │  │ from API │  │ Last    │ │
│  │              │  │ created_at   │  │ fields   │  │ 90 days │ │
│  │ Remote-only  │  │ = Unix epoch │  │          │  │         │ │
│  │ board        │  │              │  │ Remote-  │  │ Parse:  │ │
│  │              │  │ European     │  │ only     │  │ company │ │
│  │              │  │ focus        │  │ board    │  │ from 1st│ │
│  │              │  │              │  │          │  │ line    │ │
│  └──────┬───────┘  └──────┬───────┘  └────┬─────┘  └────┬────┘ │
│         │                 │               │              │      │
└─────────┼─────────────────┼───────────────┼──────────────┼──────┘
          │                 │               │              │
          ▼                 ▼               ▼              ▼
    ┌─────────────────────────────────────────────────────────┐
    │ NORMALIZE to NormalizedJob schema                       │
    │ (title, company, location, remote, salary, skills...)   │
    │                                                         │
    │ Skills: 80+ regex patterns across 6 categories          │
    │ (extensible via customSkills input)                     │
    │ Salary: USD/EUR regex from fields + description text    │
    │ Job type: normalize → full-time/part-time/contract/etc  │
    │ Description: strip HTML, max 2,000 chars                │
    └─────────────────────┬───────────────────────────────────┘
                          │
                          ▼
    ┌─────────────────────────────────────────────────────────┐
    │ FILTER PIPELINE (sequential)                            │
    │                                                         │
    │  1. Date filter (day=24h, week=7d, month=30d)           │
    │  2. Remote-only filter (j.remote === true)              │
    │  3. Location filter (case-insensitive substring)        │
    │     └─ Graceful fallback: if ALL removed, re-include    │
    │  4. Company name filter (case-insensitive substring)    │
    │  5. Source weighting (deterministic per-listing hash)   │
    │     └─ Only applied when sourceWeights is set           │
    │  6. Incremental drop (URLs from prior snapshot)         │
    │     └─ Only applied when incremental: true + baseline   │
    │  7. Deduplication (normalized title + URL secondary)    │
    │     ├─ Title: lowercase, strip noise tokens, sort       │
    │     ├─ URL: hostname + pathname secondary key           │
    │     └─ Tracks crossSourceCount per dedup key            │
    │  8. Cap at maxResults                                   │
    │  9. Compute market median (single salary pass)          │
    └─────────────────────┬───────────────────────────────────┘
                          │
                          ▼
    ┌─────────────────────────────────────────────────────────┐
    │ PER-JOB ENRICHMENT                                      │
    │                                                         │
    │  • seniorityLevel (regex over title + first 400 chars)  │
    │  • experienceYearsMin/Max (regex on description)        │
    │  • degreeRequired + degreeIsHardRequirement             │
    │  • skillCategoryProfile (dominant skill area)           │
    │  • crossSourceConfirmed + crossSourceCount              │
    │  • compensationTier (vs market median)                  │
    │  • recommendedAction + actionReason (decision enum)     │
    └─────────────────────┬───────────────────────────────────┘
                          │
                          ▼
    ┌─────────────────────────────────────────────────────────┐
    │ BUILD SUMMARY REPORT                                    │
    │                                                         │
    │  • Source breakdown + sourcesQueried/Succeeded/Failed   │
    │  • Top 30 skills by frequency + percentage              │
    │  • Salary: min, max, median, average + P10/25/50/75/90  │
    │  • Skill premiums (≥5 sample) vs cohort median          │
    │  • Top 20 hiring companies by openings                  │
    │  • Job type breakdown                                   │
    │  • Remote percentage                                    │
    │  • Seniority / experience / degree breakdowns           │
    │  • Skill category demand (% per category)               │
    │  • Cross-source overlap count                           │
    │  • marketTightness + skillScarcity + distribution health│
    │  • Per-segment analytics (when groupBy is set)          │
    │  • dataQuality + warnings + analysisMetadata            │
    │  • marketSnapshot + claim (Slack/email-ready)           │
    │  • snapshotId (cohort fingerprint)                      │
    │  • runMode + baselineStatus + schemaVersion             │
    └─────────────────────┬───────────────────────────────────┘
                          │
                          ▼
            ┌─────────────────────────────────┐
            │ HISTORICAL SNAPSHOT (opt-in)    │
            │                                 │
            │  enableHistoricalTracking: true │
            │   ├─ Read prior snapshot from   │
            │   │  named KV store             │
            │   ├─ Compute trendInsights      │
            │   │  (rising/falling skills,    │
            │   │  salary direction, growth)  │
            │   └─ Write fresh snapshot       │
            └─────────────────┬───────────────┘
                              │
                              ▼
              Push to Dataset:
              [summary, ...jobs]
              + Actor.setValue('SUMMARY', summary)

Data Source Details

  • Remotive (remotive.com/api/remote-jobs): Remote tech jobs worldwide. Salary data: structured field + description regex. Notes: single page, ?search=X&limit=N.
  • Arbeitnow (arbeitnow.com/api/job-board-api): European focus, all job types. Salary data: description regex only. Notes: paginated up to 3 pages; created_at is a Unix timestamp.
  • Jobicy (jobicy.com/api/v2/remote-jobs): Remote-first jobs. Salary data: structured annualSalaryMin/Max fields. Notes: ?count=N&tag=X.
  • HN Who's Hiring (hn.algolia.com/api/v1/search): Startup jobs from monthly threads. Salary data: description regex only. Notes: searches comments from the last 90 days and parses the company from the first line.

Skill Detection System

The actor scans each job description against 80+ built-in technology patterns organized into 6 categories. Add domain-specific skills via the customSkills input — they're treated as first-class members of the categorisation, premium, and scarcity systems.

  • Languages: Python, JavaScript, TypeScript, Java, Rust, C++, Ruby, PHP, Swift, Kotlin, Scala, SQL, R, Go
  • Frameworks: React, Angular, Vue, Next.js, Django, Flask, Spring, Rails, Laravel, FastAPI, Express, Node.js, Svelte, NestJS, .NET
  • Cloud: AWS, Azure, GCP, Docker, Kubernetes, Terraform, CI/CD, Jenkins, GitHub Actions, CloudFormation
  • Data: PostgreSQL, MongoDB, Redis, Elasticsearch, Kafka, Spark, Snowflake, BigQuery, Airflow, MySQL, DynamoDB, Cassandra, Redshift
  • AI/ML: Machine Learning, Deep Learning, NLP, Computer Vision, PyTorch, TensorFlow, LLM, GPT, RAG, Generative AI, Neural Network
  • Other: Git, Linux, Agile, REST, GraphQL, gRPC, Microservices, Scrum, DevOps, SRE

Special handling: R and Go use context-aware regex to avoid false positives (e.g., "R" only matches when near "programming", "language", or other languages; "Go" matches "Golang" or "Go" in programming context).
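The exact built-in patterns aren't published, but the context-aware idea can be sketched like this — hypothetical regexes illustrating the technique, not the actor's own:

```python
import re

# "Go" only counts as Golang or in a programming context; a bare "R" only
# counts near language cues. Both patterns here are illustrative.
GO_PATTERN = re.compile(
    r"\bgolang\b|\bgo\b(?=\s+(developer|engineer|programming|experience))", re.I)
R_PATTERN = re.compile(
    r"\bR\b(?=[,/]?\s*(programming|language|python|sql))", re.I)

def detects_go(text):
    return bool(GO_PATTERN.search(text))

def detects_r(text):
    return bool(R_PATTERN.search(text))
```

The lookaheads are what suppress false positives like "go to market strategy" or "HR business partner".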

Salary Extraction

Salary parsing uses multiple regex patterns applied to both structured API fields and free-text descriptions:

  • $Xk - $Xk: e.g. $120k - $180k (USD)
  • $X,XXX - $X,XXX: e.g. $120,000 - $180,000 (USD)
  • $Xk/year: e.g. $150k/year (USD)
  • $X,XXX/year: e.g. $150,000/year (USD)
  • €X - €X: e.g. €50,000 - €80,000 (EUR)

Values under 1,000 are automatically multiplied by 1,000 (treating "150" as "$150k"). The summary report computes statistics from the sorted union of all min and max salary values.
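A minimal sketch of the USD range pattern plus the sub-1,000 scaling rule described above — the actor's real pattern set is broader (per-year forms, EUR, structured fields):

```python
import re

# One illustrative pattern: "$120k - $180k" or "$120,000 - $180,000".
USD_RANGE = re.compile(
    r"\$(\d{1,3}(?:,\d{3})*|\d+)k?\s*[-–]\s*\$(\d{1,3}(?:,\d{3})*|\d+)k?", re.I)

def parse_usd_range(text):
    m = USD_RANGE.search(text)
    if not m:
        return None
    lo, hi = (int(g.replace(",", "")) for g in m.groups())
    # Treat "150" as "$150k": values under 1,000 are scaled by 1,000.
    lo = lo * 1000 if lo < 1000 else lo
    hi = hi * 1000 if hi < 1000 else hi
    return lo, hi, "USD"
```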

Deduplication Algorithm

Two-phase deduplication for resilience against the same role posted across multiple boards with cosmetic title differences.

  1. Title normalization — the title is lowercased, stripped of punctuation, and tokenized. Noise tokens (senior, sr, jr, mid, junior, staff, principal, lead, remote, fulltime, i, ii, iii, articles, prepositions) are removed so "Senior React Engineer" and "React Engineer (Sr)" collapse to the same key. Remaining tokens are alphabetised and capped at 80 characters.
  2. Primary dedup key = company.toLowerCase().trim() + "::" + normalizedTitle.
  3. URL secondary key = hostname + pathname from job.url. If the same URL has been seen under any primary key, the listing is folded into that key's crossSourceCount rather than re-counted.
  4. The first listing encountered for each primary key is kept; subsequent duplicates increment crossSourceCount on the surviving record. crossSourceConfirmed: true fires when count > 1.

The two-phase approach catches both (a) the same role with cosmetic title variants and (b) the exact same URL re-syndicated to multiple boards.
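Steps 1–2 can be sketched as follows — the NOISE set here is only a subset of the tokens listed above:

```python
import re

# Subset of the noise tokens the doc lists (seniority markers, articles, etc.).
NOISE = {"senior", "sr", "jr", "junior", "mid", "staff", "principal", "lead",
         "remote", "fulltime", "i", "ii", "iii", "a", "an", "the", "at", "of", "for"}

def normalized_title(title):
    """Lowercase, strip punctuation, drop noise tokens, alphabetise, cap at 80."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", title.lower()).split()
    kept = sorted(t for t in tokens if t not in NOISE)
    return " ".join(kept)[:80]

def primary_key(company, title):
    return company.lower().strip() + "::" + normalized_title(title)
```

Under this scheme, "Senior React Engineer" at Acme and "React Engineer (Sr)" at ACME collapse to the same primary key, as step 1 intends.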

HN Who's Hiring Comment Parsing

Hacker News comments are unstructured text. The actor extracts structured data via:

  • Company: Regex on first line: /^([A-Z][A-Za-z0-9\s&.'-]+?)[\s]*[|(\-–]/ (expects "Company | Role" format)
  • Role: Matches patterns like "hiring/looking for/seeking X" or "Company | X"
  • Remote: Word boundary match for /\bremote\b/i
  • Location: Matches "location/based in/office in: X"
  • Minimum length: Comments under 50 characters are skipped
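The company step is best-effort by design; a sketch using the first-line regex quoted above (real HN comments vary widely):

```python
import re

# First-line "Company | Role" pattern, per the bullet above.
COMPANY_RE = re.compile(r"^([A-Z][A-Za-z0-9\s&.'-]+?)\s*[|(\-–]")

def parse_company(comment):
    """Extract a company name from the first line of an HN comment, or None."""
    lines = comment.strip().splitlines()
    if not lines:
        return None
    m = COMPANY_RE.match(lines[0])
    return m.group(1).strip() if m else None
```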

How Much Does It Cost?

The Job Market Intelligence actor uses minimal compute resources because it calls lightweight REST APIs rather than rendering web pages. No proxies are required.

The actor is billed pay-per-event: one report-generated charge per successful run regardless of result count, source count, or whether segmentation / historical tracking / incremental mode are enabled. Apify platform compute is billed separately at standard rates and depends on memory and runtime — runs typically complete in well under a minute, and the actor's defaults (512 MB) keep platform compute modest. A scheduled daily run for monitoring is significantly cheaper than running ad-hoc scrapes against multiple sources individually.

The exact PPE price for the report-generated event is shown in the Apify Store listing and logged at the start of every run.


Tips

  • Start broad, then filter — Run a general query like "engineer" first to see the full landscape, then narrow with location or company filters in subsequent runs.
  • Combine sources strategically — Remotive and Jobicy focus on remote roles, Arbeitnow covers European markets heavily, and HN Who's Hiring surfaces startup opportunities. Use the sources parameter to target specific ecosystems.
  • Schedule weekly runs to build a time-series dataset of skill demand trends. Export to Google Sheets and chart how Python vs. Rust demand changes month over month.
  • Use maxResults: 500 for comprehensive market reports, or keep it at 50 for quick daily pulse checks.
  • Filter by company name to monitor a specific competitor's hiring velocity — a sudden spike in open roles often signals a new product launch or funding round.
  • Disable salary or skill analysis with the toggle fields if you only need raw listings. This slightly reduces processing time for very large result sets.

This is NOT for you if

Skip this actor if any of these describe you — there's a better tool for your job:

  • You only want raw job listings with no analytics layer → use a basic single-source scraper
  • You need LinkedIn, Indeed, or Glassdoor data specifically → use a dedicated scraper for that platform; those sites are auth-walled and explicitly out of scope here
  • You're not making decisions from job market data → if you just want to display listings to end-users, the decision-engine layer is overhead you won't use
  • You need real-time / streaming hiring velocity (sub-hour) → snapshots are per-run, not streaming. The minimum cadence is "as often as you schedule the actor"
  • You need candidate-side data (LinkedIn profiles, resumes, talent pools) → this is a supply-side actor (job postings); it doesn't model the candidate pool
  • You need to auto-apply / auto-submit applications → out of scope and against most boards' ToS
  • You need salary parsing in GBP / CAD / AUD / JPY → only USD and EUR salary patterns are recognised; other currencies pass through unparsed in description

What this actor does NOT do

Honest scope so you don't buy the wrong tool:

  • LinkedIn / Indeed / Glassdoor coverage: dedicated single-source scrapers — those platforms require auth and anti-bot handling that this actor explicitly does not do.
  • Glassdoor company review / sentiment / rating enrichment: a separate Glassdoor scraper — joining is a downstream task.
  • Layoff cross-reference (layoffs.fyi): a separate layoff-tracker actor — keeps this actor's PPE economics simple.
  • Candidate-side data (LinkedIn profiles, resumes, talent pools): out of scope — this actor returns the supply side (job postings), not the demand side.
  • Auto-applying / auto-submitting applications: out of scope and against most boards' ToS.
  • GBP / CAD / AUD / JPY salary parsing: only USD and EUR salary patterns are recognized; other currencies pass through unparsed in the description.
  • Real-time hiring-velocity tracking: schedule the actor with enableHistoricalTracking: true — trendInsights gives you listing-growth-rate, salary direction, rising/falling skills, and new vs departed companies on every subsequent run. Sub-hour velocity isn't supported (snapshots are per-run, not streaming).

The actor's positioning: composable job market intelligence for automation — the cleanest, fastest "what does the public-API job market look like for X right now, AND how is it shifting?" with decision-ready enums on every record and trend insights on every scheduled run. If you need enterprise-grade hiring intelligence (Lightcast, Revelio Labs, LinkedIn Talent Insights), this isn't a replacement — but at <$1/run it's the right starting point for most automation, research, and alerting workflows.

Limitations

  • Source coverage — Only four job boards are queried. Major platforms like LinkedIn, Indeed, and Glassdoor are not included due to their authentication requirements and anti-bot measures.
  • Salary data availability — Not all listings include salary information. The salary statistics are based only on listings that provide parseable salary data, which may skew toward certain markets or seniority levels.
  • Currency support — Only USD ($) and EUR (€) salary patterns are recognized. Salaries in GBP, CAD, AUD, or other currencies will not be extracted into structured salary fields.
  • Skill detection scope — The 80+ built-in skill patterns are tuned for technology roles. Non-tech skills (e.g., "project management", "sales") are not tracked. False positives are possible for ambiguous terms. Use the customSkills input to add domain-specific terms.
  • HN comment parsing — Hacker News "Who's Hiring" comments are free-form text. Company name, role, and location extraction is best-effort via regex and may produce incorrect results for non-standard formats.
  • No direct application — The actor collects listing URLs but does not submit job applications on your behalf.
  • Real-time freshness — Data comes from live API calls, but the underlying job boards may have their own delays in indexing new postings.
  • Deduplication limits — The primary dedup key combines the lowercased company name with a normalized title (noise tokens stripped, remaining tokens alphabetised, capped at 80 characters). Listings whose titles differ in substantive tokens for the same role may not be caught.

Responsible Use

This actor accesses only publicly available job board APIs that are designed for programmatic access. It does not bypass authentication, scrape private data, or violate any terms of service. When using job market data:

  • Use data for legitimate research, job seeking, or workforce planning purposes
  • Do not use automated data to discriminate against job seekers or companies
  • Respect the intellectual property of job descriptions and company information
  • Comply with all applicable employment and data protection laws in your jurisdiction
  • See Apify's guide on web scraping legality for general guidance

FAQ

Do I need any API keys to use this actor? No. All four data sources (Remotive, Arbeitnow, Jobicy, HN Algolia) are free public APIs. No authentication is required.

How many jobs can I get per run? The actor can return up to 500 listings per run. The actual count depends on how many matches exist for your query across all four sources.

Does this actor work for non-tech jobs? Yes. While the skill extraction is tuned for technology roles, the job search itself works for any keyword — "marketing manager", "nurse", "accountant", or any other role. The skill analysis will simply return fewer matches for non-tech positions.

How fresh is the data? Listing data is fetched live at run time. Use the datePosted filter to restrict results to the last 24 hours, week, or month. Historical snapshots (used for trendInsights and incremental mode) are only stored when enableHistoricalTracking: true is enabled — and even then, only a bounded summary record per query (top skills counts, companies, seen URLs) is persisted, not the raw listings.

Can I filter for a specific country or city? Yes. Enter the location in the location field (e.g., "Germany", "London", "USA"). The actor performs a case-insensitive substring match against each listing's location field. If the filter removes all results, the actor gracefully falls back to including all listings.
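The documented filter behaviour (case-insensitive substring match with a graceful fallback) can be sketched in a few lines; the function name is hypothetical:

```javascript
// Sketch of the documented location filter: case-insensitive substring
// match against each listing's location, falling back to the full cohort
// when the filter would remove everything.
function filterByLocation(listings, location) {
  if (!location) return listings;
  const needle = location.toLowerCase();
  const matched = listings.filter(
    (l) => (l.location || '').toLowerCase().includes(needle)
  );
  // Graceful fallback: an over-strict filter never empties the result set.
  return matched.length > 0 ? matched : listings;
}
```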

What does the hn-whoishiring source cover? It searches Hacker News "Who is Hiring?" monthly threads via the Algolia search API (last 90 days). These contain direct hiring posts from startup founders and engineering managers — often with roles not listed on traditional job boards.

How does deduplication work? The actor generates a key from the lowercased company name and first 60 characters of the job title. If two listings share the same key, only the first one encountered is kept.
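That key scheme can be sketched as follows (helper names hypothetical):

```javascript
// Documented dedup key: lowercased company name + first 60 chars of title.
function dedupKey(listing) {
  return `${listing.company.toLowerCase()}|${listing.title.slice(0, 60)}`;
}

// Keep only the first listing encountered for each key.
function deduplicate(listings) {
  const seen = new Set();
  return listings.filter((l) => {
    const key = dedupKey(l);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```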

Can I run this on a schedule? Absolutely. Set up a schedule in the Apify Console (e.g., daily at 9 AM) to build a longitudinal dataset. Each run appends to the same named dataset if you configure it that way.

What currencies are supported for salary extraction? The parser recognizes USD ($) and EUR (€) salary patterns. Salaries in other currencies may appear in the description text but will not be extracted into the structured salary fields.
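For illustration only (the actor's real patterns are more extensive), a parser of roughly this shape handles the two documented currencies:

```javascript
// Illustrative sketch, NOT the actor's actual regex: matches patterns like
// "$120,000 - $150,000" or "€60k–€80k" and extracts structured min/max.
const SALARY_RE =
  /([$€])\s?(\d+(?:,\d{3})*)\s?(k)?(?:\s?[-–—]\s?\1?\s?(\d+(?:,\d{3})*)\s?(k)?)?/;

function parseSalary(text) {
  const m = text.match(SALARY_RE);
  if (!m) return null; // other currencies stay unstructured
  const toNum = (raw, k) => Number(raw.replace(/,/g, '')) * (k ? 1000 : 1);
  return {
    currency: m[1] === '$' ? 'USD' : 'EUR',
    salaryMin: toNum(m[2], m[3]),
    salaryMax: m[4] ? toNum(m[4], m[5]) : toNum(m[2], m[3]),
  };
}
```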

Why does the summary show salaryInsights: null? This happens when no listings in your results contain parseable salary data. Try broadening your query or using sources that more frequently include salary information (Jobicy has structured salary fields).

How is compensationTier calculated? Each role's salaryMax (or salaryMin if max is missing) is divided by the cohort's overall median salary. <0.85 is below-market, 0.85–1.10 is at-market, 1.10–1.35 is above-market, >1.35 is premium. Listings with no parseable salary get unknown. The cohort is everything in the current run that matched your filters.
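The band logic amounts to the following sketch; the boundary handling at exactly 1.10 and 1.35 is an assumption:

```javascript
// Sketch of the documented tier bands. Uses salaryMax, falling back to
// salaryMin, divided by the cohort's median salary.
function compensationTier(listing, cohortMedian) {
  const salary = listing.salaryMax ?? listing.salaryMin;
  if (!salary || !cohortMedian) return 'unknown';
  const ratio = salary / cohortMedian;
  if (ratio < 0.85) return 'below-market';
  if (ratio <= 1.10) return 'at-market';
  if (ratio <= 1.35) return 'above-market';
  return 'premium';
}
```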

How is recommendedAction decided? A heuristic that combines description completeness, salary presence, company recognizability, and compensationTier:

  • apply-now — premium-tier comp with salary, OR above-market with salary at a known company
  • research-company — unknown company OR no salary data
  • skip-low-detail — description under 200 characters
  • review-fit — everything else

It's a fast routing tag for downstream automations, not investment advice. Use it to pre-filter before a human or LLM reviews each role.
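The routing above can be sketched as follows; the precedence between rules and the shape of the company-recognizability check are assumptions about the implementation:

```javascript
// Hedged sketch of the routing heuristic. `knownCompany` stands in for
// the actor's company-recognizability check.
function recommendedAction(job, knownCompany) {
  const hasSalary = job.salaryMin != null || job.salaryMax != null;
  // apply-now: premium comp with salary, or above-market + salary + known company
  if (hasSalary && (job.compensationTier === 'premium' ||
      (job.compensationTier === 'above-market' && knownCompany))) {
    return 'apply-now';
  }
  // research-company: unknown company or no salary data
  if (!knownCompany || !hasSalary) return 'research-company';
  // skip-low-detail: description under 200 characters
  if ((job.description || '').length < 200) return 'skip-low-detail';
  return 'review-fit'; // everything else
}
```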

What does crossSourceConfirmed mean? A listing is crossSourceConfirmed: true if the same role appeared on more than one board before deduplication. Matching uses a two-phase algorithm: normalised title (seniority noise tokens stripped, tokens alphabetised) plus a URL secondary key. crossSourceCount tells you exactly how many board postings collapsed into this record. Multi-source posting is a stronger signal that the company is actively recruiting (vs a stale auto-imported listing).

Why are some skills missing from skillPremiums? Skills only appear in skillPremiums if at least 5 listings containing that skill also have parseable salary data. Below that threshold the median is too noisy to be meaningful. Use topSkills for raw frequency rankings regardless of salary data availability.

Can I use this output in Dify or n8n? Yes. Both compensationTier and recommendedAction are stable enums designed for Dify if/else branching nodes and n8n switch nodes. See the "Use in Dify" section below for an example workflow.

How do I track week-over-week trends? Set enableHistoricalTracking: true and schedule the actor in Apify Console (e.g., weekly). On the second run onward, the summary record gets a trendInsights block with listingGrowthRate, salaryMedianChange + percent, topRisingSkills[] (≥25% delta), topFallingSkills[], newCompanies[], departedCompanies[], and direction (expanding / stable / tightening). The first run returns trendInsights: null and writes the baseline snapshot.
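The deltas are straightforward to reproduce from two snapshots; the snapshot shape and helper name here are illustrative, not the actor's internal types:

```javascript
// Sketch of the week-over-week deltas described above, computed from a
// baseline snapshot and the current run's aggregates.
function trendInsights(baseline, current) {
  const listingGrowthRate =
    (current.listingCount - baseline.listingCount) / baseline.listingCount;
  const salaryMedianChange = current.salaryMedian - baseline.salaryMedian;
  const salaryMedianChangePercent = salaryMedianChange / baseline.salaryMedian;
  // Rising skills: >= 25% frequency growth versus the baseline.
  const topRisingSkills = Object.keys(current.skillCounts).filter((s) => {
    const before = baseline.skillCounts[s] || 0;
    return before > 0 && (current.skillCounts[s] - before) / before >= 0.25;
  });
  return { listingGrowthRate, salaryMedianChange, salaryMedianChangePercent, topRisingSkills };
}
```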

What's the difference between historical tracking and incremental mode? enableHistoricalTracking writes a snapshot per query and computes trend deltas on every subsequent run. incremental (a separate flag, requires tracking on) additionally drops listings whose URLs were returned in the previous run — so the dataset only has fresh items. Use historical-tracking-only for week-over-week salary monitoring (you want to recompute the full cohort each time); add incremental on top for daily fresh-listings feeds where you only care about new postings.

Why is trendInsights null on my run? Either (a) enableHistoricalTracking is false, (b) it's the first run for the snapshot key (first run writes the baseline; trends start from the second run), or (c) the prior snapshot is older than lookbackDays. Check the run log — it'll say which.

How do I segment analytics across regions / seniorities / etc? Set groupBy: ["location", "seniorityLevel"] (or any combination of location / seniorityLevel / remote / jobType / source / skillCategoryProfile / compensationTier). The summary report adds a segments[] array with per-segment salary percentiles, top skills, and seniority breakdown — this addresses the cohort-mixing distortion in which pooling $200k SF salaries with €50k Berlin salaries makes the median meaningless.

How do I add domain-specific skills? Pass customSkills: each entry is { name, regex, category? }. The custom skills get full first-class treatment in topSkills, skillPremiums, skillScarcity, and skillCategoryDemand. Invalid regexes are logged and skipped so a typo doesn't break the run.

What does dataQuality tell me? It's the auditability layer — salaryCoveragePercent (what % of listings have parseable salary data, so you know how trustworthy the percentiles are), deduplicationConfidence (high/medium/low, based on cohort size + cross-source overlap rate), sourceBias (is this cohort remote-heavy / Europe-skewed / US-skewed / dominated by one source?), and notes[] plain-English warnings about distortions. Use it to decide whether to trust the cohort's analytics for your specific workflow.

What's marketTightness measuring? A 0–100 supply/demand index combining cross-source posting overlap (employers mass-posting = high demand), salary dispersion (compressed bands = standardised market = tight for talent), and listing volume (more listings = more demand). Returns a label (tight / balanced / loose / unknown) and a reason string explaining the inputs. Use the label in dashboards / Slack alerts when you want a single human-readable signal of the market state.

What's skillScarcity for? For each skill in topSkills that ALSO has salary premium data, it computes scarcityScore = 0.6 × premiumNorm + 0.4 × rarityNorm. Skills with high pay AND low frequency rank highest — these are the "high leverage" learning targets for job seekers and the "hard to hire" warning signs for talent leaders. Empty when cohort < 20 listings or no salary premiums available.
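The documented formula, with min-max-style normalisation of premium and frequency assumed (the actual normalisation scheme is not specified):

```javascript
// Sketch of scarcityScore = 0.6 * premiumNorm + 0.4 * rarityNorm.
// premiumNorm and rarityNorm are normalised to [0, 1] within the cohort
// (an assumption about the implementation).
function scarcityScores(skills) {
  const maxPremium = Math.max(...skills.map((s) => s.premiumPercent));
  const maxFreq = Math.max(...skills.map((s) => s.frequencyPercent));
  return skills
    .map((s) => {
      const premiumNorm = maxPremium ? s.premiumPercent / maxPremium : 0;
      const rarityNorm = maxFreq ? 1 - s.frequencyPercent / maxFreq : 0;
      return { name: s.name, scarcityScore: 0.6 * premiumNorm + 0.4 * rarityNorm };
    })
    .sort((a, b) => b.scarcityScore - a.scarcityScore);
}
```

High-pay, low-frequency skills rank first, which matches the "high leverage" framing above.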

What's the difference between runMode and baselineStatus? runMode describes WHAT the actor did this run (snapshot / historical / incremental). baselineStatus describes WHERE we are in the historical-tracking lifecycle (disabled / created / compared / expired). They're independent: a historical run with baselineStatus: "compared" means trend insights were computed; a historical run with baselineStatus: "created" means it was the first run for this key (trends null, baseline written for next time).

My historical run returned trendInsights: null — what happened? Check baselineStatus. If created, this is the first run for the snapshot key — the baseline has been written and the next run will return trends. If expired, the prior snapshot was older than lookbackDays (default 30) — bump lookbackDays or run more frequently. If disabled, you didn't set enableHistoricalTracking: true. Always check warnings[] too — it'll spell out what happened in plain English.

What's schemaVersion and when do I need to care about it? schemaVersion is the output contract version — currently "2.0". The actor follows additive-only semantics within a major version: new fields may appear, but existing fields won't be renamed or repurposed. Branch on schemaVersion in long-lived integrations if you want to opt into v3 features explicitly when they ship. For most consumers, the existing field set is stable and you don't need to read this.

What should I do with warnings[]? Read it before acting on the cohort's analytics. It promotes dataQuality.notes (cohort-bias warnings) alongside other run-level signals (sources failed, low confidence, expired baseline, critical events) so downstream automation can route on a single top-level array. Common pattern: gate Slack alerts on warnings.length === 0 && decisionReadiness === "actionable".

How do I use recommendedActions[]? Each action is a structured object with a stable action string (e.g. "increase_salary_band" / "learn_skill" / "accelerate_hiring"), a target (when applicable, e.g. "Rust" for learn_skill), confidence/impact/urgency tags, an appliesTo[] audience filter, and a plain-English reason. Branch on action in Dify / n8n / Zapier switch nodes; filter by appliesTo to surface only the audience you care about (e.g. appliesTo.includes("recruiting") for a hiring-team Slack channel). The reason string is paste-ready into reports — no LLM rewriting needed.

What's the difference between marketTightness and marketRegime? marketTightness is a single-run snapshot of demand pressure (tight / balanced / loose) — answers "is talent supply meeting demand right now?". marketRegime is a state classification (expansion / contraction / stagnation / volatility) — answers "where is the market heading?". The two are complementary: a market can be tight + expansion (heating up) or loose + contraction (cooling fast). Confidence is materially higher on marketRegime when historical tracking is enabled (trend signals dominate the classification).

How does skillTrajectory map skills to stages?

  • emerging — low frequency (<8%) AND high salary premium (≥5%) AND non-falling trend
  • mainstream — moderate-to-high frequency (≥25%) AND not-saturated
  • saturated — high frequency (≥50%) AND no premium (<3%)
  • declining — week-over-week trend ≤ −50%
  • stable — everything else (default fallback)

Velocity (hypergrowth / growing / steady / cooling / falling) is computed independently from the week-over-week delta (when historical tracking is on). Stage answers "where is this skill in its lifecycle?"; velocity answers "how fast is it moving?".
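The stage rules can be sketched as follows; the evaluation order is an assumption, with declining and saturated checked before the broader buckets:

```javascript
// Sketch of the documented lifecycle-stage rules. Frequencies and premiums
// are percentages; weekOverWeekDelta is a fraction (or null without tracking).
function lifecycleStage(skill) {
  const trend = skill.weekOverWeekDelta;
  if (trend != null && trend <= -0.5) return 'declining';   // <= -50% WoW
  if (skill.frequencyPercent < 8 && skill.premiumPercent >= 5 &&
      (trend == null || trend >= 0)) return 'emerging';     // rare + paid + non-falling
  if (skill.frequencyPercent >= 50 && skill.premiumPercent < 3) return 'saturated';
  if (skill.frequencyPercent >= 25) return 'mainstream';    // common, not saturated
  return 'stable';                                          // default fallback
}
```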

How do mode presets actually change the output? mode only reorders recommendedActions[] — same actions, different audience priority. default is balanced; job_seeker bubbles learn_skill / apply-now / curriculum actions to the top; recruiter bubbles increase_salary_band / accelerate_hiring / role-spec actions; analyst bubbles enable_historical_tracking / increase_monitoring_frequency / strategy actions. The cohort analytics, marketRegime, skillTrajectory, events, and per-job records are identical across modes.

How do events[] work for alerting? Each event represents a single threshold crossing — easy to filter, route, and alert on. The default thresholds are conservative (5% salary moves, 25% listing growth, 100% skill emergence). Override via eventThresholds for noisier or quieter alerting. Each event ships severity (critical / warning / info), value, threshold, target (if scoped to a skill/company), and a complete-sentence message. Common pattern: send severity === "critical" events to PagerDuty, severity === "warning" to Slack, severity === "info" to a monitoring dashboard.

What does the whatIf[] engine actually predict? Only what's derivable from the cohort distribution at run time. A salary_change scenario maps the proposed salary to a percentile rank against the pooled salary distribution (e.g. "10% raise moves you from P50 to P78") and to a compensationTier enum. A skill_emphasis scenario looks the named skill up in skillScarcity, skillTrajectory, and skillPremiums to report stage / velocity / frequency / premium. No claim is made about candidate response rates, time-to-fill, offer-accept rates, or hire outcomes — those signals are not in public job-listing data. Confidence is hard-capped at 60 (medium) and every result carries a caveats[] array. If a recruiter wants forecasts of hire-pipeline impact, they need ATS data, not job-listing data — different actor, different data source.

What does effectiveness: "limited" mean? Either the scenario produces a small percentile shift (e.g. a 2% salary bump in a flat market) or a user-supplied constraint bound the scenario. When constraints.maxPercent binds, effectiveness automatically downgrades to reflect that the user's real-world constraint reduces the move's impact.

Why is whatIf confidence capped at 60? Honesty. The actor only has cohort distribution data — not application data, not hire outcomes, not response rates. A counterfactual based purely on percentile-shift cannot honestly claim high-confidence predictive power. The cap forces the output's confidenceLevel to stay medium or low — never high.

How do I read confidenceBreakdown on actions? Three components, 0–100 each: dataStrength (cohort size + salary coverage + dedup confidence), signalClarity (how cleanly the action's underlying signal fired), historicalConsistency (whether trend signals reinforce the action). Use them to audit specific actions: a learn_skill action with high signalClarity but low historicalConsistency means the current cohort signal is strong but we don't yet know if it persists; reading confidenceBreakdown tells you whether to wait for more snapshots before acting.

What is hold_strategy? An honest "no-edge" recommendation that fires when (a) regime is unknown or stagnation, (b) marketTightness is balanced or unknown, (c) no strong week-over-week trends, AND (d) no high-urgency actions exist. Most analytics tools over-signal; this actor surfaces "stay the course" as a first-class verdict so consumers know when not to act.

What's the difference between actionClusters[] and recommendedActions[]? recommendedActions[] is the flat list of 8–12 actions. actionClusters[] groups them into 3–5 themes (compensation_strategy / talent_pipeline / skill_strategy / monitoring_strategy / source_strategy / general) so the output reads as a strategy document rather than an alert stream. Use clusters for executive summaries; use the flat list for granular automation routing.

How does marketMemory build up over time? Each scheduled run with enableHistoricalTracking: true appends the current run's regime to a bounded regimeHistory (cap 12, FIFO) inside the snapshot KV record. The actor then derives regimeStability (fraction of recent runs in the same regime), lastInflectionDaysAgo (when the regime last changed, if at all), and pattern (one of 9 enum values like expansion_stable / expansion_weakening / volatile_shifting). Patterns activate at 3+ snapshots; before that the field carries pattern: "insufficient-history". Designed to let humans reason in market patterns, not raw deltas.

What does decisionTension[] actually catch? When two recommendedActions in the same run work against each other under a single sourcing pipeline. Six tension types: cost_vs_selectivity (e.g. raising salary AND tightening specs), speed_vs_quality (acceleration AND gating), remote_vs_local_reach (remote-first AND geo-expansion), act_now_vs_wait (acceleration AND hold), early_mover_vs_safe_bet (investing in emerging skills AND deprioritising declining ones), depth_vs_breadth (broaden query AND segment for clarity). Each tension carries a recommendedBalance so the consumer knows which lever to favour given the cohort's signals. Empty when no contradictory pairs are present — most cohorts will have 0 or 1.

Why surface rejectedActions[] if you're not going to do them? Trust. Most analytics tools always emit something — that trains users to ignore them. By making the actor explicit about what it WON'T recommend (decrease_salary_band rejected when market is tight, accelerate_hiring rejected in contraction, prioritize_remote_roles rejected in heavily on-site cohorts), every recommended action carries the implicit weight of "the system also considered the opposite move and ruled it out." Same pattern as hold_strategy — explicit abstention strengthens the rest of the output.

How do I read whatIf sensitivity? Salary scenarios now ship a sensitivity block: lowerOutcome (user input −5pp), upperOutcome (user input +5pp), spreadPercentilePoints, and stability. stability: "low" means the percentile shift is robust — small comp variation produces minimal movement because the cohort distribution is flat in that range. stability: "high" means the shift sits on a steep part of the distribution — small input variation produces large outcome swings, so plan for non-linearity. moderate is the most common case. The note string explains the spread in plain English. Use this to size risk on real comp moves: high-sensitivity outcomes warrant a buffer; low-sensitivity outcomes give you slack to negotiate.

Can I get decisionTension and rejectedActions on a one-shot run? Yes — both are derived purely from the current run's recommendedActions[] and the cohort signals. They don't require historical tracking. The richer tension picture (e.g. act_now_vs_wait requires both accelerate_hiring and hold_strategy to be in the action list) emerges most often when historical tracking IS on, but the engine works fine on a single shot.

Automation snippets

Three paste-ready patterns for the most common automation surfaces. All three branch on stable enums — no LLM, no prompt engineering, no fuzzy matching.

1. Slack alert from events[]

Wire an Apify Run-Succeeded webhook to a service that can read the run's dataset (Make, Zapier, n8n). After fetching the summary record (recordType === "summary"), iterate events[] and fan out by severity:

// Pseudocode for an n8n / Make / Zapier function step
const summary = items.find((it) => it.recordType === 'summary');
if (!summary || summary.warnings.length > 0) return; // gate on clean runs
if (summary.decisionReadiness !== 'actionable') return;

for (const ev of summary.events) {
    const channel = ev.severity === 'critical' ? '#oncall'
                  : ev.severity === 'warning'  ? '#labor-market-alerts'
                  : '#labor-market-info';
    await slack.postMessage({
        channel,
        text: `:rotating_light: *${ev.type}* — ${ev.message}`,
        attachments: [{
            color: ev.severity === 'critical' ? 'danger' : ev.severity === 'warning' ? 'warning' : 'good',
            fields: [
                { title: 'Query',  value: summary.query, short: true },
                { title: 'Regime', value: summary.marketRegime.type, short: true },
                { title: 'Value',  value: String(ev.value),    short: true },
                { title: 'Threshold', value: String(ev.threshold), short: true },
            ],
        }],
    });
}

The ev.message field is a complete, paste-ready sentence — no LLM rewriting needed. Use the example above as the function-step body in n8n, the webhook handler in Make, or the action step in Zapier.

2. n8n switch node on recommendedActions[].action

Drop the actor's run output into n8n. Use a Switch node with the routing key set to {{$json.summary.recommendedActions[0].action}}. The action enum is stable across runs:

  • accelerate_hiring → hiring-manager Slack channel
  • increase_salary_band → comp-team email distribution
  • learn_skill (with target) → learning-recommendations queue + employee-newsletter source
  • invest_in_skill (with target) → curriculum review board
  • hold_strategy → dashboard tile only — no notification
  • enable_historical_tracking → DevOps queue (config change)
  • re_run_for_full_coverage → actor scheduler — re-run with +1 retry
  • broaden_query → analyst review (cohort too narrow)
  • diversify_sources → data-team queue
For more granular routing, switch on the full action plus target: {{$json.summary.recommendedActions[0].action}}:{{$json.summary.recommendedActions[0].target ?? ''}}. Combine with appliesTo filtering for persona-specific fan-out: recommendedActions.filter((a) => a.appliesTo.includes('recruiting')).

3. Recruiter workflow with decisionTension[]

Before a hiring manager applies any recommended action, surface tensions so they don't pick contradictory moves:

// Pseudocode for a recruiter Slack command / dashboard pre-check
const summary = await getJmiSummary({ query, mode: 'recruiter' });

// Filter to recruiter-relevant actions
const recruiterActions = summary.recommendedActions
    .filter((a) => a.appliesTo.includes('recruiting'));

// Block-and-explain if tensions exist
if (summary.decisionTension.length > 0) {
    const t = summary.decisionTension[0];
    return slack.postMessage({
        channel: '#hiring-decisions',
        text: `:warning: Strategy tension detected: *${t.tension}*`,
        attachments: [{
            color: 'warning',
            text: `${t.explanation}\n\n*Recommended balance:* ${t.recommendedBalance}\n\nActions involved: ${t.between.join(' ↔ ')}`,
        }],
    });
}

// Surface the rejected actions so the recruiter knows the system already considered them
if (summary.rejectedActions.length > 0) {
    const lines = summary.rejectedActions.map((r) => `• *${r.action}* — ${r.reason}`).join('\n');
    await slack.postMessage({
        channel: '#hiring-decisions',
        text: `:no_entry: System rejected the following alternatives:\n${lines}`,
    });
}

// Then surface the top 3 recruiter actions
for (const a of recruiterActions.slice(0, 3)) {
    await slack.postMessage({
        channel: '#hiring-decisions',
        text: `:white_check_mark: *${a.action}* (urgency: ${a.urgency}, confidence: ${a.confidence}/100) — ${a.reason}`,
    });
}

This pattern catches the most common hiring mistake: a recruiter applies multiple recommended actions sequentially without realising they trade off against each other (e.g. raising comp AND tightening role specs in the same week). The decisionTension[] array surfaces those pairs explicitly so the conversation happens BEFORE the spec is changed.

Use in Dify

Drop this actor into Dify workflows via the Apify plugin's Run Actor node. Each job comes back scored, classified, and tagged as structured JSON with a recommendedAction — apply-now / research-company / review-fit / skip-low-detail — plus a compensationTier (below-market / at-market / above-market / premium / unknown) that your downstream node branches on. A generic job scraper pointed at the same boards returns raw HTML; this returns decisions.

The summary record carries decisionReadiness (actionable / monitor / insufficient-data) — gate your automation on that scalar so it only fires when the cohort is statistically meaningful. confidenceLevel (high / medium / low) is the secondary lever. claim and marketSnapshot strings are usable verbatim in Slack messages, email subjects, and agent prompts — no LLM rewriting needed. actionReason on every job is the equivalent for per-listing routing.

  • Actor ID: ryanclinton/job-market-intelligence
  • Sample input (recurring senior+ Python market scan with full analytics):
{
    "query": "senior python engineer",
    "remoteOnly": true,
    "datePosted": "week",
    "analyzeSkills": true,
    "analyzeSalaries": true,
    "maxResults": 200
}

Dify branching example

A typical Dify workflow consumes the dataset in three stages:

  1. Gate the run — read the summary record (the first dataset item where recordType === "summary"). Use an if/else node:
    • decisionReadiness === "actionable" → continue to per-job routing
    • decisionReadiness === "monitor" → log the cohort but skip per-job notifications
    • decisionReadiness === "insufficient-data" → escalate / re-run with broader filters
  2. Surface cohort-level decisions — iterate summary.recommendedActions[] and route by action:
    • "increase_salary_band" / "accelerate_hiring" → notify hiring manager
    • "learn_skill" (with target) → push to a learning-recommendations channel
    • "diversify_sources" → log to monitoring channel for the data team
    • Filter by appliesTo.includes("recruiting") etc. to fan out only the actions the recipient cares about
  3. Route each job — iterate the recordType === "job" records. Use a switch node on recommendedAction:
    • "apply-now" → push to a "high-priority" Slack channel with the job's title, company, salary range, and actionReason
    • "research-company" → push to a "needs-research" queue (often compensationTier === "unknown")
    • "review-fit" → write to a spreadsheet for batch human review
    • "skip-low-detail" → drop silently

Because recommendedAction, compensationTier, decisionReadiness, confidenceLevel, marketRegime.type, and the recommendedActions[].action strings are all stable enums, branching is exact-match equality — no fuzzy matching, no LLM classification, no prompt engineering. The same enums work in n8n switch nodes, Zapier filters, Make routers, and SQL WHERE clauses.

For event-driven workflows, gate on summary.events[]: severity === "critical" → PagerDuty / on-call, severity === "warning" → Slack, severity === "info" → dashboard tile. Every event ships a complete-sentence message so notification copy is paste-ready.

actionReason, recommendedActions[].reason, marketRegime.note, and claim are emitted as plain-English sentences from deterministic templates — no LLM was called to write them, so they're free of hallucination and stable across runs. Pipe them straight into notification copy, agent tool-call summaries, or LLM prompts as authoritative ground-truth context.

Integrations

Connect the Job Market Intelligence actor to your existing tools and workflows:

  • Zapier — Trigger actions in 5,000+ apps when new job listings are found
  • Make — Build complex job monitoring automation workflows
  • Google Sheets — Export job data directly to spreadsheets for analysis
  • Slack — Get instant notifications when new jobs matching your criteria appear
  • The Apify API — Programmatic access to results via REST API
  • Apify Webhooks — Trigger custom actions when a run finishes

Related Actors

  • ryanclinton/website-contact-scraper — Extract emails, phone numbers, and social links from company websites found in job listings
  • ryanclinton/b2b-lead-gen-suite — Combine multiple data sources to build enriched B2B lead lists
  • ryanclinton/company-deep-research — Deep-dive into a specific company with financial, social, and web data
  • ryanclinton/github-repo-search — Find open-source projects from companies that appear in your job market results
  • ryanclinton/website-tech-stack-detector — Identify the technology stack a hiring company actually uses on their website
  • ryanclinton/serp-rank-tracker — Monitor search engine rankings for job-related keywords
Last verified: March 27, 2026

Ready to try Job Market Intelligence?

Run it on your own Apify account. Apify offers a free tier with $5 of monthly credits.

Open on Apify Store