Hacker News Search — Stories, Comments & Developer Sentiment is an Apify actor on ApifyForge. Search and extract Hacker News stories, comments, polls, Show HN, and Ask HN posts. Filter by date range, author, minimum points, and comment count. Export JSON/CSV. It costs $0.005 per `story-fetched` event. Best for teams that need automated Hacker News data extraction and developer-sentiment analysis. Not ideal for use cases requiring real-time streaming data or sub-second latency. Maintenance pulse: 92/100. Last verified March 26, 2026. Built by Ryan Clinton (ryanclinton on Apify).
Hacker News Search — Stories, Comments & Developer Sentiment
Hacker News Search — Stories, Comments & Developer Sentiment is an Apify actor available on ApifyForge at $0.005 per `story-fetched` event. Search and extract Hacker News stories, comments, polls, Show HN, and Ask HN posts. Filter by date range, author, minimum points, and comment count. Export JSON/CSV. Track tech trends, monitor brand mentions, analyze developer sentiment. No API key needed.
Best for teams that need automated Hacker News data extraction and developer-sentiment analysis.
Not ideal for use cases requiring real-time streaming data or sub-second latency.
What to know
- Results depend on the availability and structure of upstream data sources.
- Large-scale runs may be subject to platform rate limits.
- Requires an Apify account — free tier available with limited monthly usage.
Maintenance Pulse
92/100
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| story-fetched | Charged per Hacker News story or comment retrieved. | $0.005 |
Example: 100 events = $0.50 · 1,000 events = $5.00
Documentation
Hacker News Intelligence is the most complete way to analyze Hacker News data without building your own pipeline.
Hacker News Intelligence — turns HN search results into ranked signals, trends, thread intelligence, and smart alerts for founders, developer relations teams, researchers, and investors.
It is a developer sentiment monitoring tool, a Hacker News trend detection tool, and a social listening tool for developers — focused on high-signal discussions.
It turns raw discussions into ranked, explainable, actionable insights. Instead of reading hundreds of posts, you get the few that actually matter — and what to do about them.
Unlike simple HN scrapers, this actor does not just return posts — it ranks, explains, expands, compares, and alerts on developer-community signals. Every result gets a 0–100 signal score (engagement + velocity + author influence + recency). Detect rising keywords with built-in trend detection (current N-day window vs previous N-day window). Expand full comment threads via the HN Firebase API. Compare two periods side-by-side. Auto-split queries that exceed Algolia's 1,000-result cap. Pick a one-click mode for the job (brand monitor, competitor tracking, Who-Is-Hiring extractor, Show HN traction, discover). Schedule it, route smart-filtered alerts to Slack or Discord. Export as JSON, CSV, Excel, or stream through the Apify API. No HN API key required.
The actor combines Algolia HN search, Firebase thread expansion, and deterministic scoring to produce structured, ranked outputs with trend detection and action recommendations. Tools in this category typically combine Algolia HN search and Firebase APIs — this actor implements that pattern with structured outputs and decision signals. Unlike general monitoring tools like Brand24 or Mention, it is purpose-built for Hacker News and developer communities — defining a new category: developer-signal extraction from high-signal technical communities.
It is designed to reduce cognitive load: it surfaces only the discussions that matter, why they matter, and what to do next. It extracts signal from noise in the highest-signal developer community on the internet, making it the fastest way to understand what developers care about right now.
What is Hacker News Intelligence?
Hacker News Intelligence is a tool that analyzes Hacker News data and converts discussions into ranked, actionable developer signals.
It:
- Analyzes Hacker News discussions (Algolia search + HN Firebase API)
- Ranks every result by importance (0–100 signal score)
- Detects trends and developer sentiment (rising n-grams, heuristic insights)
- Suggests actions based on signal (engage / investigate / monitor / ignore)
- Expands full comment threads (reply tree + thread-level summary)
- Compares time periods (rising vs declining keywords)
- Alerts to Slack or Discord (smart-filtered to high-signal mentions only)
It is also the easiest way to monitor Hacker News, track mentions, and detect developer trends as they emerge.
What makes this different
Most Hacker News tools return posts. This actor returns decisions:
- `signalScore` — ranks every result 0–100 by importance (engagement + velocity + author influence + recency)
- `whyThisMatters` — explains in plain English why a result is high-signal
- `suggestedAction` — `engage` / `investigate` / `monitor` / `ignore` — the next step you should take
- `feedbackType` — `complaint` / `feature_request` / `praise` / `question` for product teams
- `trendStage` — `emerging` / `rising` / `peaked` / `declining` for trend records
- Thread summaries — full reply-tree expansion plus a one-paragraph aggregate (sentiment + top themes + risk level)
- Discover mode — zero-input front-page exploration with trends + insights pre-applied
This turns raw Hacker News data into actionable developer intelligence — fed straight into Slack alerts, AI agent tool calls, dashboards, or downstream automation, with no manual analysis step in between.
What problems this solves
Use this actor if you want to:
- Track mentions of your startup or product on Hacker News (brand monitoring with alerts) — daily Slack/Discord alerts, smart-filtered to high-signal mentions only
- Detect emerging developer trends before they go mainstream (rising n-grams, week-over-week growth)
- Identify complaints, feature requests, and praise from real users (heuristic feedback classification)
- Track competitor activity in the developer community (smart-alert mode filters noise)
- Analyze full discussion threads instead of just headlines (reply-tree expansion + thread-level sentiment)
- Discover high-signal startup ideas and technologies early (Show HN traction analytics, GitHub repo signals)
- Build datasets of developer sentiment and adoption signals for fine-tuning, RAG, or research
- Mine Who Is Hiring threads for structured job listings (company / location / remote / apply URL)
Capabilities at a glance
- Search and filter — full-text query against the entire HN archive (2007 → today) via the Algolia API
- Score and rank mentions — 0–100 `signalScore` on every result, sortable + filterable
- Detect emerging trends — n-gram analysis with current-vs-previous window comparison
- Classify feedback — complaint / feature_request / praise / question (heuristic regex)
- Suggest actions — engage / investigate / monitor / ignore
- Expand full comment threads — reply tree via the HN Firebase API
- Summarize threads — aggregate sentiment + themes + risk per thread
- Compare time periods — side-by-side delta metrics + topRisingTerms / topDecliningTerms
- Enrich with author data — karma, account age, submission count, 0–100 influence score
- Enrich with GitHub data — stars, language, last-push, plus freshness + maturity + signal classification
- Parse Who Is Hiring — structured job listings from monthly threads
- Brand-mention alerts — Slack/Discord webhooks on new mentions, smart-filtered by signal score
- Discover mode — zero-input HN front-page exploration with trends + insights pre-applied
- Auto-pagination beyond 1,000 — adaptive date-bucket splitting for archives
Decision layer (what makes this LLM-native)
This actor is designed to drop straight into LLM agents and automation pipelines without intermediate analysis steps:
| Field | Use it to |
|---|---|
| `signalScore` | Filter high-value mentions (sort DESC, threshold ≥ 50) |
| `signalLevel` | One-glance bucket (high / medium / low) for spreadsheet rules |
| `whyThisMatters` | Drop directly into Slack messages or LLM summaries — no reprocessing needed |
| `suggestedAction` | Branch downstream automation: engage / investigate / monitor / ignore |
| `feedbackType` | Route product feedback: complaints to support, feature_requests to PM, praise to marketing |
| `isInfluencerMention` / `influencerTier` | Identify high-credibility voices (top 10% / top 1%) |
| `recordType` | Discriminator across result / thread_comment / thread_summary / trend records |
| `commentText` / `text` | Feed raw text into your own LLM pipelines for deeper analysis |
A typical agent workflow: filter `WHERE recordType = 'result' AND suggestedAction IN ('engage', 'investigate')`, post the `whyThisMatters` line to Slack, link `hnUrl` for context, and route by `feedbackType`. Zero glue code.
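As an illustration, that workflow can be sketched in plain Python against an exported dataset. The field names follow the table above; the record dicts below stand in for rows exported from the actor's dataset, and the exact shape of real rows may differ.

```python
# Sketch: filter exported dataset records down to actionable mentions and
# build Slack-ready alert lines from the pre-written fields.
# The record dicts stand in for rows exported from the actor's dataset.

def actionable_alerts(records):
    """Keep 'result' records whose suggestedAction warrants follow-up."""
    alerts = []
    for rec in records:
        if rec.get("recordType") != "result":
            continue  # skip thread_comment / thread_summary / trend rows
        if rec.get("suggestedAction") not in ("engage", "investigate"):
            continue  # monitor / ignore rows never reach the channel
        alerts.append({
            "text": f"{rec['whyThisMatters']} ({rec['hnUrl']})",
            "route": rec.get("feedbackType") or "general",
        })
    return alerts
```

Each alert dict carries the pre-written `whyThisMatters` sentence plus a routing key, so the only remaining step is posting to a webhook.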
Also useful for
- Developer sentiment analysis
- Product feedback monitoring from engineers
- Startup idea validation
- Open source trend tracking
- Social listening in developer communities
- Early-stage technology discovery
- Competitive intelligence for technical products
- Developer Relations / DevRel signal monitoring
- Investor / VC trend research
- Show HN launch tracking
- Founder market validation
- HN influencer mapping
Works well with AI agents
This actor is built to plug directly into LLM workflows:
- Use `signalScore` and `signalLevel` to filter inputs to high-value mentions only
- Use `suggestedAction` as the routing key for agent tool selection — `engage` triggers a draft-reply tool, `investigate` opens a support ticket, `monitor` posts to a watchlist channel, `ignore` is dropped
- Use `whyThisMatters` and `insightSummary` for direct Slack / email / dashboard rendering — these are pre-written, LLM-quality sentences that need no rewriting
- Feed `commentText` and thread `text` into your own LLM pipelines when you need deeper analysis (sentiment with context, summarization, classification)
- The `recordType` discriminator lets agent tool calls cleanly route across the four record shapes
Ideal for:
- AI copilots that monitor developer communities
- Internal tooling for product / DevRel / support teams
- Automated brand-mention monitoring with Slack/Discord alerts
- RAG knowledge-base ingestion of high-signal HN discussions
- Fine-tuning datasets of structured developer feedback
- Scheduled trend reports for executive stakeholders
Quick start
I want brand alerts on Hacker News
```json
{
  "mode": "brand_monitor",
  "query": "MyProduct",
  "alertWebhookUrl": "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXX"
}
```
Schedule daily. Get a Slack message every time MyProduct hits HN.
I want to research a topic
```json
{
  "mode": "search",
  "query": "rust async",
  "detectTrends": true,
  "includeInsights": true,
  "expandThreads": true,
  "maxResults": 50
}
```
Top results + reply trees + heuristic sentiment + theme detection + rising keywords on the topic.
I want to discover what's hot on HN right now (no topic needed)
```json
{
  "mode": "discover",
  "query": ""
}
```
Front-page items + rising trends + heuristic insights, no topic specification. Leave the query empty for the full feed, or set a query to filter front-page items by topic.
I want full discussion context for a single thread
```json
{
  "query": "Show HN: my product",
  "tags": "story",
  "expandThreads": true,
  "threadMaxDepth": 5,
  "threadMaxComments": 500,
  "maxResults": 1
}
```
The first match's complete reply tree, capped at depth 5 and 500 comments total.
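How the two caps interact can be shown with a local sketch. The real actor fetches each child comment by ID from the HN Firebase API; here the `kids` are nested dicts so the capping logic is runnable on its own, which is an illustrative simplification.

```python
# Sketch: how threadMaxDepth and threadMaxComments bound a reply-tree walk.
# The real actor fetches each child by ID from the HN Firebase API; here
# 'kids' are nested dicts so the capping logic can be shown locally.

def expand_thread(story, max_depth=3, max_comments=100):
    out = []

    def dfs(node, depth):
        if depth > max_depth or len(out) >= max_comments:
            return  # stop at either cap
        out.append({"id": node["id"], "depth": depth})
        for kid in node.get("kids", []):
            dfs(kid, depth + 1)

    for kid in story.get("kids", []):  # the story itself is not a comment
        dfs(kid, 1)
    return out
```

With `max_depth=1` only top-level comments come back; with a low `max_comments` the walk stops mid-tree, which mirrors why a deep HN thread can hit the comment cap before the depth cap.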
I want competitor activity, smart-filtered
```json
{
  "mode": "competitor_tracking",
  "query": "CompetitorName",
  "alertWebhookUrl": "https://hooks.slack.com/services/...",
  "includeAuthorProfile": true
}
```
Only signal-score-≥-50 mentions reach Slack; raw data still in the dataset for review.
I want a Show HN snapshot with traction analytics
```json
{
  "mode": "show_hn_analysis",
  "query": "AI",
  "detectTrends": true,
  "includeInsights": true,
  "maxResults": 100
}
```
Show HN posts about AI + trending keywords across them + sentiment + the SHOW_HN_SUMMARY aggregate KV record.
I want a side-by-side period comparison
```json
{
  "query": "kubernetes",
  "compareMode": "explicit",
  "compareDateFromA": "2026-04-01",
  "compareDateToA": "2026-04-30",
  "compareDateFromB": "2026-03-01",
  "compareDateToB": "2026-03-31"
}
```
COMPARISON_SUMMARY KV record with mention deltas and topRising/topDeclining terms.
I want more than 1,000 results
```json
{
  "query": "AI",
  "dateFrom": "2025-01-01",
  "dateTo": "2026-04-30",
  "autoSplitLargeQueries": true,
  "maxResults": 5000
}
```
The actor recursively halves the date range when a bucket would exceed 900 hits, fetches each bucket, and dedupes by HN object ID.
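That splitting strategy can be sketched as follows. `count_hits` is a hypothetical stand-in for a cheap Algolia count query; the actor's internal implementation may differ, and it additionally dedupes fetched items by HN object ID.

```python
# Sketch: recursive date-range halving so no single query exceeds the cap.
# count_hits(start, end) stands in for a cheap Algolia count query; the
# real actor also dedupes fetched results by HN object ID.
from datetime import date, timedelta

def split_range(start, end, count_hits, cap=900):
    span_days = (end - start).days
    if count_hits(start, end) <= cap or span_days <= 1:
        return [(start, end)]  # small enough: fetch this bucket directly
    mid = start + timedelta(days=span_days // 2)
    return (split_range(start, mid, count_hits, cap)
            + split_range(mid + timedelta(days=1), end, count_hits, cap))
```

The halves stay contiguous and non-overlapping, so concatenating the buckets covers the original range exactly once.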
Record types in the dataset
The actor emits four kinds of dataset records, distinguished by the recordType field:
| `recordType` | What it is | When emitted |
|---|---|---|
| `result` | A search hit from Algolia HN | Always (the standard dataset row) |
| `thread_comment` | A comment in a story's reply tree | When `expandThreads: true` |
| `thread_summary` | Aggregate summary of an expanded thread (count, sentiment, top themes) | When `expandThreads: true` (one per parent) |
| `trend` | A rising keyword/n-gram with stage + reason | When `detectTrends: true` |
Filter downstream with `WHERE recordType = 'result'` (or `'trend'`, etc.) for clean routing in SQL, Sheets, or LLM tool calls.
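The same routing in Python is a one-pass partition. Treating a missing `recordType` as `result` is an assumption here, based on `result` being the standard dataset row.

```python
# Sketch: route a mixed dataset export by its recordType discriminator.
# Assumes a missing recordType means 'result', the standard dataset row.
from collections import defaultdict

def partition_by_record_type(records):
    buckets = defaultdict(list)
    for rec in records:
        buckets[rec.get("recordType", "result")].append(rec)
    return buckets
```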
Best-results guidance
For best monitoring results:
- Use `searchType: "date"` to prioritize fresh signal over historical relevance ranking.
- Use exact brand names in quotes — `"\"Acme Corp\""` not `acme`.
- Enable `includeAuthorProfile` to filter out low-karma noise.
- Use `alertMode: "smart"` for noisy queries — only signal-score-≥-50 mentions reach the webhook.
- Schedule daily, not hourly — HN moves fast, but a daily cadence captures everything important without alert fatigue.
For research and trend detection:
- Use `detectTrends: true` with `trendWindowDays: 7` for week-over-week trend signals.
- Bump `trendMinMentions` to 5+ on broad queries to filter out one-off noise.
- Pair with `includeInsights: true` to see sentiment and themes alongside the trends.
For thread expansion (research mode):
- Keep `maxResults` low (1–10) when `expandThreads: true` — you'll get hundreds of comment records per parent.
- Set `threadMaxDepth: 2` for shallow context, `5` for deep dives.
- Set the `GITHUB_TOKEN` env var if you also enable GitHub enrichment, to avoid the 60/hr rate limit.
Why use Hacker News Search?
Hacker News is the highest-signal developer community on the web. Millions of posts on software, startups, AI, and policy — but the native search is basic and the Algolia API is bare-bones. Most third-party HN monitoring tools (Brand24, Mention, Syften) charge $50–$100 per month. This actor delivers the same job for $0.005 per result plus an intelligence layer those tools don't ship at all:
- Signal Score (0–100) on every result — composite of engagement (40%), velocity (25%), author influence (20%), and recency (15%). One field tells you whether a mention matters; sort by it, filter by it, route alerts on it.
- Velocity scoring — `pointsPerHour`, `commentsPerHour`, and an `isTrending` boolean for every item. Identify Show HN posts catching fire before they hit the front page.
- Author influence scoring — 0–100 score from karma + account age + submission count, so you can filter out low-reputation noise.
- Smart alerts — `alertMode: "smart"` routes to your Slack/Discord webhook only when signal score ≥ 50. No more "your brand was mentioned in a 0-point comment by a 3-day-old account" notifications.
- One-click modes — `brand_monitor`, `competitor_tracking`, `hiring_intelligence`, `show_hn_analysis` pre-configure the actor for the job. No 20-field configuration screen.
- Daily brand-mention monitoring — schedule it, get only the new mentions since the last run, formatted for Slack/Discord webhooks.
- Author reputation enrichment — karma, account age, submission count for every result.
- GitHub repo signals — stars, primary language, last-push date when a result links to a repo.
- "Who Is Hiring" parser — structured extraction of company, location, remote mode, and apply URL from the monthly HN hiring threads.
- Show HN traction analytics — aggregate report (count, average points/comments/signal, top 5) saved to the run's key-value store.
- Query expansion — type `"AI"` and the actor automatically searches `"artificial intelligence"`, `"AI"`, and `"LLM"`, deduplicating by HN object ID.
Sort by relevance or date. Restrict to specific content types. Set point and comment thresholds. Scope to date ranges. Filter by author. All structured. All cheap. All scheduled-friendly.
Key features
Intelligence layer (always-on)
- Signal Score (0–100) — composite of engagement, velocity, author influence, and recency. Sort the dataset by it for the highest-signal results first.
- `signalLevel` (high / medium / low) for one-glance filtering.
- Velocity scoring — `pointsPerHour`, `commentsPerHour`, and an `isTrending` boolean (true when < 24h old AND ≥ 5 pts/hr or ≥ 2 comments/hr).
- Author influence score — 0–100 derived from karma (50%), account age (25%), submissions (25%), all log-normalized. Available when `includeAuthorProfile: true`.
- `whyThisMatters` — plain-English explanation of why a result is high-signal, generated deterministically from the contributing fields. Null on low-signal results.
- `suggestedAction` — `engage` / `investigate` / `monitor` / `ignore`. Decision-tier output that bridges data → action.
- `feedbackType` — heuristic classification: `complaint` / `feature_request` / `praise` / `question`. Built from regex patterns on comment + story text.
- `influencerTier` + `isInfluencerMention` — tiers the author influence score: `top_1_percent` (≥90), `top_10_percent` (≥70), `active` (≥40), `new` (<40). The boolean fires on top-10%-or-better.
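The documented `isTrending` rule can be written out directly. This is a reading of the stated thresholds, not the actor's source; clamping age to one hour to avoid division by zero on brand-new items is an assumption.

```python
# Sketch of the documented isTrending rule: trending when the item is
# less than 24h old AND gains >= 5 points/hr or >= 2 comments/hr.
# Clamping age to 1 hour is an assumption, not documented behavior.

def velocity_signals(points, comments, age_hours):
    hours = max(age_hours, 1.0)
    points_per_hour = points / hours
    comments_per_hour = comments / hours
    is_trending = age_hours < 24 and (
        points_per_hour >= 5 or comments_per_hour >= 2
    )
    return points_per_hour, comments_per_hour, is_trending
```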
Analysis modes
- Trend detection — `detectTrends: true` runs two date-bounded searches (the current `trendWindowDays` window + the previous equal-length window), extracts 1/2/3-grams from titles + story bodies + comments, and surfaces rising terms with `trendScore` (40% growth + 30% mentions + 20% avg signal + 10% unique authors). Writes a `TREND_SUMMARY` key-value record AND pushes top trends as `recordType: 'trend'` dataset records.
- Historical compare — `compareMode: previous_period` auto-shifts dateFrom/dateTo back by the same length for period B. `compareMode: explicit` uses four date inputs. Outputs a `COMPARISON_SUMMARY` KV record with delta metrics + topRisingTerms + topDecliningTerms.
- Thread expansion — `expandThreads: true` walks the reply tree of every story result via the HN Firebase API and emits each comment as a separate `recordType: 'thread_comment'` dataset record. `threadMaxDepth` (default 3) and `threadMaxComments` (default 100) cap the recursion. Bundled in the existing per-result charge — no extra event.
- Heuristic insights — `includeInsights: true` adds `insightSummary`, `sentiment` (bullish/bearish/mixed/neutral), `riskLevel` (high/medium/low), and a `keyThemes` array to every result. Pure regex + keyword matching; no LLM.
- Adaptive auto-pagination — `autoSplitLargeQueries: true` halves the date range when an Algolia query would exceed 900 hits, fetching each bucket separately and deduping. Capped by `maxSplitRuns` (default 20).
- GitHub correlation — `correlateGithub: true` adds `githubFreshness` (active/recent/stale/dormant), `githubRepoMaturity` (nascent/emerging/established/mature), and a composite `githubSignal` (high/medium/low) on top of the basic stars/language/pushedAt enrichment.
Search + filtering
- Full-text search across stories, comments, polls, Show HN, Ask HN, and front-page posts via the Algolia HN API
- Sort by relevance or date — best-match for research, newest-first for monitoring
- Content type filtering — stories, comments, polls, Show HN, Ask HN, or front page
- Engagement thresholds — minimum points, minimum comments
- Date range filtering — `YYYY-MM-DD` start + end (UTC)
- Author filter — find every post and comment by a specific HN username
- Up to 1,000 results per run with automatic pagination (50 hits per page)
- Query expansion — `expandQuery: true` runs short forms like `"AI"`, `"k8s"`, `"agents"` against their canonical synonyms (`"artificial intelligence"`, `"Kubernetes"`, `"autonomous agents"`) and deduplicates results
Modes & output levels
- One-click modes — `mode: "brand_monitor"` / `"competitor_tracking"` / `"hiring_intelligence"` / `"show_hn_analysis"` configure the actor for common jobs. Your explicit input fields always win over the preset.
- Output levels — `outputLevel: "basic" | "enriched" | "intelligence"` is shorthand for the enrichment toggle bundle.
Monitoring & alerts
- Daily brand-mention monitor — `alertOnNewOnly: true` tracks IDs across runs and only outputs new mentions; pair with `alertWebhookUrl` to push a Slack/Discord alert
- Smart alerts — `alertMode: "smart"` filters webhook payloads to mentions with signal score ≥ 50 only. Cuts low-quality noise from the alert channel; raw data still appears in the dataset.
Enrichments (opt-in)
- Author profile — `includeAuthorProfile: true` adds karma, account age (days), submission count, and a 0–100 influence score via the HN Firebase API
- GitHub repo signals — `enrichGithubLinks: true` adds stars, primary language, and last-push timestamp when a result links to a GitHub repository
- Who Is Hiring parser — `parseHiringComments: true` extracts company, location, remote mode, and apply URL from comment bodies
- Show HN traction summary — auto-fires when `tags: show_hn`; writes count, average points/comments/signal, and top 5 to the `SHOW_HN_SUMMARY` key-value record
Reliability & cost
- Built-in retry + circuit breakers — Algolia 5xx and network blips retry with backoff; enrichment loops disable themselves after 5 consecutive failures so a dead upstream never burns your credit
- No HN API key required — works out of the box; an optional `GITHUB_TOKEN` env var raises the GitHub rate limit from 60/hr to 5,000/hr
- Multiple export formats — JSON, CSV, Excel, XML, HTML from the Apify dataset
Pricing (pay-per-event)
Pay only for what you actually fetch. Two events:
| Event | Price | When it fires |
|---|---|---|
| `apify-actor-start` | $0.00005 | Once when each run starts |
| `story-fetched` | $0.005 | Once per Hacker News result returned |
A 100-result search costs $0.50005. A 1,000-result search costs $5.00005. A daily brand-monitor that finds 5 new mentions per day costs $0.78 per month. Apify's spending-limit settings cap your bill — when you hit them, the actor stops mid-run cleanly.
The Algolia HN API itself is free, so there are no external data fees. The Apify Free plan includes $5 of monthly platform credits, which covers hundreds of HN searches at no extra cost.
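The per-run arithmetic above is simple enough to reproduce in a few lines, which is handy for budgeting scheduled monitors:

```python
# Sketch: reproduce the pay-per-event arithmetic from the pricing table.
ACTOR_START_USD = 0.00005   # apify-actor-start, once per run
STORY_FETCHED_USD = 0.005   # story-fetched, once per result

def run_cost(results_per_run, runs=1):
    """Total USD for `runs` runs returning `results_per_run` results each."""
    return runs * (ACTOR_START_USD + STORY_FETCHED_USD * results_per_run)
```

A daily monitor finding 5 new mentions over a 31-day month lands at roughly $0.78, matching the figure above.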
How to use Hacker News Search
Using the Apify Console
- Go to the Hacker News Search actor page on Apify.
- Click Try for free to open the actor in the Console.
- Enter your search query (e.g., `artificial intelligence`, `"large language models"`, `Rust programming`).
- Choose your sort order — Relevance for best matches or Date (newest first) for recent content.
- Optionally toggle enrichments: author profile, GitHub links, Who Is Hiring parser.
- For monitoring: enable Alert on new results only and paste your Slack / webhook URL.
- Set your maximum results (default 100, up to 1,000).
- Click Start and wait for the run to finish.
- Switch to the Dataset tab to preview, download, or export results.
Scheduling for daily brand monitoring
- Configure inputs once: query, `alertOnNewOnly: true`, `alertWebhookUrl: https://hooks.slack.com/services/...`.
- Save the configuration as an Apify task.
- Open Schedules in the Apify Console, point it at the task, and choose your cadence (`0 9 * * *` for 09:00 UTC daily).
- Each run posts only the new mentions since the prior run to your webhook. The first run primes the state and posts nothing.
Using the API
You can start a run programmatically and retrieve results via the Apify API. See the API & Integration section below for ready-to-use Python, JavaScript, and cURL examples.
Input parameters
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `query` | String | Yes | `artificial intelligence` | Search query to find on Hacker News |
| `mode` | String | No | `search` | Pre-configured workflow: `search`, `brand_monitor`, `competitor_tracking`, `hiring_intelligence`, `show_hn_analysis`. Your explicit fields always win over the preset. |
| `outputLevel` | String | No | `basic` | Shorthand for enrichment toggles: `basic` / `enriched` / `intelligence`. Intelligence scoring runs on every result regardless. |
| `searchType` | String | No | `relevance` | Sort order: `relevance` (best matches) or `date` (newest first) |
| `tags` | String | No | (all types) | Content type filter: `story`, `comment`, `poll`, `show_hn`, `ask_hn`, or `front_page` |
| `author` | String | No | (any) | Filter results by HN username (case-sensitive) |
| `minPoints` | Integer | No | (none) | Minimum number of upvotes/points |
| `minComments` | Integer | No | (none) | Minimum number of comments |
| `dateFrom` | String | No | (none) | Start date in `YYYY-MM-DD` format |
| `dateTo` | String | No | (none) | End date in `YYYY-MM-DD` format |
| `maxResults` | Integer | No | 100 | Maximum deduplicated results to return. Algolia HN caps at 1,000 hits per single query — values up to 10,000 are accepted but only useful when `autoSplitLargeQueries: true`. The actor logs a warning if exceeded without auto-split. |
| `expandQuery` | Boolean | No | false | Expands known short forms (e.g., "AI" → "artificial intelligence", "AI", "LLM") and dedupes results. Triples API calls when active. |
| `includeAuthorProfile` | Boolean | No | false | Adds karma, account age (days), submission count, and a 0–100 author influence score |
| `enrichGithubLinks` | Boolean | No | false | When a result links to a GitHub repo, adds stars, language, and last-push timestamp |
| `correlateGithub` | Boolean | No | false | Adds `githubFreshness` / `githubRepoMaturity` / `githubSignal` classifications. Auto-enables `enrichGithubLinks`. |
| `parseHiringComments` | Boolean | No | false | Extracts company / location / remote mode / apply URL from "Who Is Hiring" comments |
| `expandThreads` | Boolean | No | false | Walks the reply tree of each story result and emits `recordType: 'thread_comment'` records |
| `threadMaxDepth` | Integer | No | 3 | Max recursion depth for thread expansion |
| `threadMaxComments` | Integer | No | 100 | Hard cap on total thread comments emitted per run |
| `includeInsights` | Boolean | No | false | Adds `insightSummary` / `sentiment` / `riskLevel` / `keyThemes` per result (heuristic, no LLM) |
| `detectTrends` | Boolean | No | false | Runs current vs previous N-day windows, extracts rising n-grams, writes `TREND_SUMMARY` + `recordType: 'trend'` records |
| `trendWindowDays` | Integer | No | 7 | Window length (days) for each side of the trend comparison |
| `trendMinMentions` | Integer | No | 3 | Minimum current-window mentions for a term to qualify as a trend |
| `trendMinGrowthPercent` | Integer | No | 100 | Minimum growth % vs baseline (100 = doubled) |
| `trendMaxTerms` | Integer | No | 50 | Maximum trends to surface |
| `compareMode` | String | No | `none` | `none` / `previous_period` (auto-shift) / `explicit` (use the four `compareDate*` inputs) |
| `compareDateFromA` / `ToA` / `FromB` / `ToB` | String | No | (none) | Explicit period dates when `compareMode: explicit` |
| `autoSplitLargeQueries` | Boolean | No | false | Recursively halves the date range when a query exceeds 900 hits, fetching each bucket separately |
| `maxSplitRuns` | Integer | No | 20 | Maximum date-range buckets to fetch in auto-split mode |
| `alertOnNewOnly` | Boolean | No | false | Tracks IDs across runs and only outputs items new since the last run |
| `alertWebhookUrl` | String (secret) | No | (none) | Slack/Discord/HTTP webhook URL — POSTs new mentions when `alertOnNewOnly` is enabled |
| `alertMode` | String | No | `all` | `all` posts every new mention; `smart` filters to signal score ≥ 50 |
Modes (one-click workflows)
Pick a mode and the actor configures itself for the job. Your explicit input fields always win over the preset.
| Mode | What it sets | Job |
|---|---|---|
| `search` | Nothing — flexible defaults | Default; you configure everything |
| `discover` | `tags=front_page`, `searchType=date`, `detectTrends=true`, `includeInsights=true` | Zero-input front-page exploration with trends + insights pre-applied (clear the query for the full feed) |
| `brand_monitor` | `searchType=date`, `alertOnNewOnly=true`, `includeAuthorProfile=true`, `alertMode=all` | Daily brand alerts to Slack/Discord |
| `competitor_tracking` | Same as `brand_monitor` but `alertMode=smart` | Smart-filtered alerts (only signal ≥ 50) |
| `hiring_intelligence` | `tags=comment`, `author=whoishiring`, `parseHiringComments=true`, `maxResults=500` | Monthly Who Is Hiring → structured jobs |
| `show_hn_analysis` | `tags=show_hn`, `searchType=date`, `enrichGithubLinks=true` | Show HN traction snapshots |
Output levels
Shorthand for enrichment toggles:
| Level | Effect |
|---|---|
| `basic` | Raw search results only |
| `enriched` | Auto-enables `includeAuthorProfile` + `enrichGithubLinks` (where not explicitly set) |
| `intelligence` | Same as `enriched` (intelligence scoring runs on every result regardless) |
What is signalScore?
`signalScore` is a 0–100 metric that ranks how important a Hacker News mention is. It is the actor's signature output field. Higher = more important.
It combines four components, log-normalized so single outliers cannot dominate:
| Component | Weight | What it measures |
|---|---|---|
| Engagement | 40% | log10(points + 2 × comments) — saturates around 1,000 weighted engagement |
| Velocity | 25% | log10(pointsPerHour) — saturates around 100 pts/hr |
| Author influence | 20% | Author's 0–100 influence score (or 0.3 baseline if not enriched) |
| Recency | 15% | Linear decay over 168 hours (1 week) |
`signalLevel` tiers it: high (≥70), medium (40–69), low (<40). Sort by `signalScore` DESC for the highest-leverage results first.
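The weighting can be illustrated with a toy implementation. The normalization constants below (log10 scales chosen so engagement saturates near 1,000 weighted engagement and velocity near 100 pts/hr, per the table) are a reconstruction; the actor's exact formula is internal.

```python
# Illustrative reconstruction of signalScore from the documented weights.
# The normalization constants are assumptions chosen to match the stated
# saturation points (~1,000 weighted engagement, ~100 pts/hr).
import math

def signal_score(points, comments, points_per_hour, age_hours,
                 author_influence=None):
    engagement = min(1.0, math.log10(points + 2 * comments + 1) / 3)
    velocity = min(1.0, math.log10(points_per_hour + 1) / 2)
    author = author_influence / 100 if author_influence is not None else 0.3
    recency = max(0.0, 1 - age_hours / 168)  # linear decay over one week
    return 100 * (0.40 * engagement + 0.25 * velocity
                  + 0.20 * author + 0.15 * recency)

def signal_level(score):
    return "high" if score >= 70 else "medium" if score >= 40 else "low"
```

Note how the recency term zeroes out after a week, so an old thread must carry very strong engagement to stay high-signal.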
What is trendScore?
`trendScore` is a 0–100 metric on `recordType: 'trend'` records that ranks how strongly a keyword is rising. It combines:
| Component | Weight | What it measures |
|---|---|---|
| Growth rate | 40% | Percentage growth in mentions vs the baseline window |
| Current mentions | 30% | log10(mentionsCurrent) — absolute volume |
| Avg signal score | 20% | How high-quality the mentions are |
| Unique authors | 10% | log10(uniqueAuthors) — breadth of adoption |
Pair with `trendStage` (emerging / rising / peaked / declining) to know whether a trend is just starting, accelerating, plateauing, or fading.
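As with `signalScore`, the weighting can be sketched with assumed normalizations. The caps below (growth saturating at 500%, the volume and author log scales) are illustrative guesses, not the actor's internal formula.

```python
# Illustrative reconstruction of trendScore from the documented weights.
# The normalization caps (growth saturating at 500%, the volume and
# author scales) are assumptions; the actor's exact formula is internal.
import math

def trend_score(growth_percent, mentions_current, avg_signal_score,
                unique_authors):
    growth = min(1.0, growth_percent / 500)                   # assumed cap
    volume = min(1.0, math.log10(mentions_current + 1) / 2)   # ~100 mentions
    quality = avg_signal_score / 100
    breadth = min(1.0, math.log10(unique_authors + 1) / 1.5)  # ~30 authors
    return 100 * (0.40 * growth + 0.30 * volume
                  + 0.20 * quality + 0.10 * breadth)
```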
What is suggestedAction?
`suggestedAction` is the actor's decision-tier output. It tells you what to do with a result:
| Value | When it fires | What to do |
|---|---|---|
| `engage` | High signal (≥50) + question / feature_request / praise | Reply, draft a response, follow up |
| `investigate` | High signal (≥50) + bearish sentiment / high risk / complaint | Open a ticket, route to support or PM |
| `monitor` | Medium signal | Watch for escalation; no immediate action |
| `ignore` | Signal score < 25 | Skip — low-value or off-topic |
This is what makes the actor LLM-agent-native: downstream automations branch on `suggestedAction` directly, with no parsing of prose.
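Read as a decision tree, the table above looks roughly like this. The precedence when both the engage and investigate conditions hold (say, high-signal praise on a high-risk thread) is a guess; the actor's actual ordering isn't documented.

```python
# Sketch of the suggestedAction decision tree from the table above.
# Giving 'investigate' precedence over 'engage' is an assumption.

def suggested_action(signal_score, feedback_type=None,
                     sentiment=None, risk_level=None):
    if signal_score < 25:
        return "ignore"
    if signal_score >= 50:
        if (feedback_type == "complaint" or sentiment == "bearish"
                or risk_level == "high"):
            return "investigate"
        if feedback_type in ("question", "feature_request", "praise"):
            return "engage"
    return "monitor"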
JSON input example — basic search
```json
{
  "query": "large language models",
  "searchType": "date",
  "tags": "story",
  "minPoints": 50,
  "minComments": 10,
  "dateFrom": "2026-01-01",
  "dateTo": "2026-12-31",
  "maxResults": 200
}
```
JSON input example — daily brand monitor with Slack alerts
```json
{
  "query": "MyProduct",
  "searchType": "date",
  "alertOnNewOnly": true,
  "alertWebhookUrl": "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXX",
  "includeAuthorProfile": true,
  "maxResults": 100
}
```
Schedule this with cron `0 9 * * *` and you get a daily Slack message at 09:00 UTC listing only the mentions you haven't seen before. `includeAuthorProfile` adds the karma of each commenter so you can spot the high-signal voices.
JSON input example — Who Is Hiring parser
{
"query": "remote",
"tags": "comment",
"author": "whoishiring",
"parseHiringComments": true,
"maxResults": 500
}
The HN account whoishiring posts the monthly hiring thread on the first weekday of each month. This run pulls every "remote" comment from that thread and parses it into structured columns: hiringCompany, hiringLocation, hiringRemote, hiringApplyUrl. Drops straight into a recruiting CRM.
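The actor's exact extraction heuristics aren't published. As a rough illustration of the kind of parsing involved, here is a simplified sketch that assumes the conventional pipe-separated "Company | Location | ..." first line that many hiring comments use:

```python
import re

def parse_hiring_comment(text):
    """Best-effort parse of a Who Is Hiring comment (simplified sketch).

    Assumes the common 'Company | Location | ...' pipe-separated first
    line; the actor's real heuristics are more involved.
    """
    lines = (text or "").strip().splitlines() or [""]
    parts = [p.strip() for p in lines[0].split("|")]
    apply_url = re.search(r"https?://\S+|mailto:\S+", text or "")
    return {
        "hiringCompany": parts[0] or None,
        "hiringLocation": parts[1] if len(parts) > 1 else None,
        "hiringRemote": "Remote" if re.search(r"\bremote\b", text or "", re.I) else None,
        "hiringApplyUrl": apply_url.group(0) if apply_url else None,
    }
```

Formats vary wildly across the thread, which is why the actor's parser is documented as roughly 80% accurate and why the raw `commentText` is always preserved.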
JSON input example — Show HN traction snapshot
{
"query": "AI",
"tags": "show_hn",
"searchType": "date",
"dateFrom": "2026-04-01",
"maxResults": 100
}
This run pulls every recent Show HN post about AI. After the dataset is written, the actor saves an aggregate summary (count, average points, average comments, top 5 by points) to the run's SHOW_HN_SUMMARY key-value record. Open it from the Storage → Key-value store tab in the Apify Console.
JSON input example — trend detection (rising keywords)
{
"query": "AI",
"detectTrends": true,
"trendWindowDays": 7,
"trendMinMentions": 3,
"trendMinGrowthPercent": 100,
"maxResults": 200
}
Runs two date-bounded searches (last 7 days vs the previous 7 days), extracts 1/2/3-grams from titles + story bodies + comments, filters out stop words and HN boilerplate, and writes a TREND_SUMMARY key-value record + pushes top-20 trends as recordType: 'trend' dataset records. Each trend carries mentionsCurrent, mentionsPrevious, growthPercent, avgSignalScore, uniqueAuthors, and a composite trendScore (0–100).
JSON input example — historical comparison
{
"query": "rust",
"compareMode": "explicit",
"compareDateFromA": "2026-04-01",
"compareDateToA": "2026-04-30",
"compareDateFromB": "2026-03-01",
"compareDateToB": "2026-03-31",
"maxResults": 400
}
Fetches the same query against two date ranges, writes a COMPARISON_SUMMARY KV record with mentionsDelta, mentionsGrowthPercent, avgSignalScoreDelta, topRisingTerms, and topDecliningTerms. Use compareMode: "previous_period" to auto-shift backward by the same length without specifying period B explicitly.
JSON input example — thread expansion (research mode)
{
"query": "rust async",
"tags": "story",
"minPoints": 100,
"expandThreads": true,
"threadMaxDepth": 3,
"threadMaxComments": 100,
"maxResults": 5
}
For each of the top 5 matching stories, fetches the entire reply tree via the HN Firebase API (capped at depth 3 and 100 total thread comments per run) and emits each comment as a separate recordType: 'thread_comment' record with storyId, commentId, parentId, depth, author, text, createdAt, and hnUrl. Bundled in the existing per-result charge — no new event.
JSON input example — adaptive auto-pagination (>1000 results)
{
"query": "kubernetes",
"dateFrom": "2025-01-01",
"dateTo": "2026-04-30",
"autoSplitLargeQueries": true,
"maxSplitRuns": 20,
"maxResults": 5000
}
When a date range would exceed Algolia's 1,000-hit cap, the actor recursively halves the range until each bucket is below 900 hits, fetches each bucket independently, and dedupes by HN object ID across buckets.
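The recursive halving can be sketched as follows; `count_hits` is a hypothetical stand-in for an Algolia `nbHits` probe on a date window:

```python
def split_ranges(start, end, count_hits, cap=900):
    """Recursively bisect [start, end) (unix seconds) until each bucket
    is under `cap` hits.

    `count_hits(a, b)` is a hypothetical stand-in for querying Algolia
    with hitsPerPage=0 and reading nbHits for that window.
    """
    if count_hits(start, end) < cap or end - start <= 1:
        return [(start, end)]
    mid = (start + end) // 2
    return (split_ranges(start, mid, count_hits, cap)
            + split_ranges(mid, end, count_hits, cap))
```

Each bucket is then fetched independently and results are deduped by HN object ID, since items near a bucket boundary can appear in adjacent probes.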
JSON input example — GitHub-enriched developer story search
{
"query": "Rust",
"tags": "story",
"minPoints": 100,
"enrichGithubLinks": true,
"maxResults": 50
}
When a story's submitted URL is a GitHub repo, you get the star count, primary language, and last-pushed timestamp inline. Set the GITHUB_TOKEN environment variable in the actor's run options to raise the GitHub API rate limit from 60/hr to 5,000/hr.
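A sketch of what this enrichment step amounts to, using only the Python standard library. The field mapping mirrors the documented output; the error handling is deliberately minimal, and the actor's real implementation may differ:

```python
import json
import os
import re
import urllib.request

def enrich_github(url):
    """Match github.com/owner/repo in a submitted URL and fetch repo
    metadata from the GitHub REST API (illustrative sketch)."""
    m = re.match(r"https?://github\.com/([^/]+)/([^/#?]+)", url or "")
    if not m:
        return None
    headers = {"Accept": "application/vnd.github+json"}
    token = os.environ.get("GITHUB_TOKEN")  # raises the limit from 60/hr to 5,000/hr
    if token:
        headers["Authorization"] = f"Bearer {token}"
    req = urllib.request.Request(
        f"https://api.github.com/repos/{m.group(1)}/{m.group(2)}", headers=headers)
    with urllib.request.urlopen(req, timeout=10) as resp:
        repo = json.load(resp)
    return {
        "githubStars": repo.get("stargazers_count"),
        "githubLanguage": repo.get("language"),
        "githubPushedAt": repo.get("pushed_at"),
    }
```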
Tips
- Leave `tags` empty to search across all content types.
- Combine `minPoints` and `minComments` to surface only high-engagement discussions.
- Use `searchType: "date"` with `dateFrom`/`dateTo` for chronological feeds.
- Wrap your query in double quotes for exact phrase matching: `"machine learning"`.
- For brand monitoring, narrow `query` to a unique brand name (`"Acme Corp"`, not `acme`).
- Start with a small `maxResults` value (10–20) to test filters before scaling up.
- Pair `parseHiringComments: true` with `author: "whoishiring"` to get clean monthly job feeds.
Output
Each result is pushed to the default Apify dataset as a JSON object:
{
"recordType": "result",
"objectID": "39281042",
"title": "Show HN: Open-source LLM benchmark for real-world coding tasks",
"url": "https://github.com/example/llm-benchmark",
"author": "techfounder",
"points": 342,
"numComments": 87,
"createdAt": "2026-04-15T14:23:01.000Z",
"type": "show_hn",
"storyText": null,
"commentText": null,
"parentId": null,
"storyId": null,
"hnUrl": "https://news.ycombinator.com/item?id=39281042",
"signalScore": 87.2,
"signalLevel": "high",
"velocityScore": 0.71,
"pointsPerHour": 18.4,
"commentsPerHour": 4.7,
"isTrending": true,
"authorKarma": 12450,
"authorAccountAgeDays": 4520,
"authorSubmissionCount": 187,
"authorInfluenceScore": 78.4,
"isInfluencerMention": true,
"influencerTier": "top_10_percent",
"githubStars": 3400,
"githubLanguage": "Rust",
"githubPushedAt": "2026-04-12T08:14:22Z",
"feedbackType": null,
"whyThisMatters": "High-signal mention from a high-influence author with trending velocity (18.4 pts/hr) discussing developer-experience and ai with positive reception.",
"suggestedAction": "engage"
}
Output fields
| Field | Type | Description |
|---|---|---|
| objectID | String | Unique Hacker News item ID |
| title | String or null | Post title (null for comments) |
| url | String or null | External link URL (null for text posts and comments) |
| author | String | HN username of the poster |
| points | Number | Number of upvotes |
| numComments | Number | Number of comments |
| createdAt | String | ISO 8601 timestamp |
| type | String | Item type: story, comment, poll, show_hn, or ask_hn |
| storyText | String or null | Body text for Ask HN and text-only posts |
| commentText | String or null | Comment body (only for comment results) |
| parentId | String or null | Parent item ID (for comments) |
| storyId | String or null | Top-level story ID (for comments) |
| hnUrl | String | Direct link to the HN discussion |
| recordType | String | "result" for search hits; thread, trend, and summary records carry their own values (thread_comment, trend, trend_summary, comparison_summary) |
| signalScore | Number 0–100 | Composite signal score (engagement 40% + velocity 25% + author influence 20% + recency 15%) |
| signalLevel | String | Tier of signalScore: high (≥70) / medium (40–69) / low (<40) |
| velocityScore | Number 0–1 | Log-normalized engagement velocity (saturates around 100 pts/hr) |
| pointsPerHour | Number | Points earned per hour since posting (capped at 168-hour window) |
| commentsPerHour | Number | Comments per hour since posting (capped at 168-hour window) |
| isTrending | Boolean | True when item is < 24h old AND (≥ 5 pts/hr OR ≥ 2 comments/hr) |
| authorKarma | Number or null | Author's HN karma — only present when includeAuthorProfile: true |
| authorAccountAgeDays | Number or null | Days since the author's HN account was created |
| authorSubmissionCount | Number or null | Total stories + comments the author has submitted |
| authorInfluenceScore | Number 0–100 or null | Composite of karma (50%) + account age (25%) + submissions (25%) |
| githubStars | Number or null | Star count when url is a GitHub repo and enrichGithubLinks: true |
| githubLanguage | String or null | Primary language of the linked GitHub repo |
| githubPushedAt | String or null | ISO timestamp of the last commit pushed to the linked repo |
| hiringCompany | String or null | Company parsed from a Who Is Hiring comment when parseHiringComments: true |
| hiringLocation | String or null | Location parsed from a hiring comment |
| hiringRemote | String or null | Remote / Hybrid / On-site flag parsed from a hiring comment |
| hiringApplyUrl | String or null | Apply link or mailto: address parsed from a hiring comment |
| whyThisMatters | String or null | Plain-English reason this result is high-signal (deterministic, built from contributing fields) |
| suggestedAction | String or null | engage / investigate / monitor / ignore — decision tier |
| feedbackType | String or null | complaint / feature_request / praise / question — heuristic classification |
| isInfluencerMention | Boolean or null | True when author influence score is ≥ 70 (top 10%) |
| influencerTier | String or null | top_1_percent / top_10_percent / active / new |
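Using the weights from the signalScore row above, the composite can be sketched like this; the normalization constants are assumptions, since the actor's exact curves are not published:

```python
import math

def signal_score(points, num_comments, points_per_hour, author_influence, age_hours):
    """Illustrative reconstruction of the documented 40/25/20/15 blend.

    The normalization constants are assumptions for this sketch.
    """
    engagement = min(math.log10(max(points + num_comments, 1)) / 3.0, 1.0)  # 40%
    velocity = min(math.log10(max(points_per_hour, 0) + 1) / 2.0, 1.0)      # 25%, saturates ~100 pts/hr
    influence = (author_influence or 0) / 100.0                             # 20%
    recency = max(0.0, 1.0 - age_hours / 168.0)                             # 15%, 7-day window
    return round(100 * (0.40 * engagement + 0.25 * velocity
                        + 0.20 * influence + 0.15 * recency), 1)
```

Plugging in the sample record above (342 points, 87 comments, 18.4 pts/hr, influence 78.4, ~18h old) lands in the "high" tier (≥70), consistent with its documented signalLevel.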
Show HN summary output
When you search with tags: "show_hn", the actor additionally writes one aggregate record to the run's key-value store under the key SHOW_HN_SUMMARY:
{
"type": "show_hn_summary",
"query": "AI",
"totalPosts": 100,
"avgPoints": 84,
"avgComments": 41,
"top5ByPoints": [
{ "title": "Show HN: ChatGPT Plus alternative…", "points": 612, "hnUrl": "https://news.ycombinator.com/item?id=…" }
]
}
Retrieve it from the Storage → Key-value store → SHOW_HN_SUMMARY tab, or via the API:
https://api.apify.com/v2/key-value-stores/<storeId>/records/SHOW_HN_SUMMARY.
Use cases
- Daily brand and product mention alerts — schedule with `alertOnNewOnly: true` + Slack webhook to know the moment your name hits HN.
- Competitor watch — same setup, different query. Track each competitor on its own schedule.
- Show HN traction tracking for makers — daily snapshot of how Show HN posts in your category are trending.
- Recruiting from "Who Is Hiring" — monthly run on the `whoishiring` thread with `parseHiringComments: true` produces a clean leads CSV.
- Influencer / expert tracking — pair `author: "patio11"` (or any username) with `searchType: "date"` to follow specific high-signal HN users.
- Technology trend monitoring — track emerging topics with date ranges to see adoption curves.
- Repo discovery — `enrichGithubLinks: true` on a Rust/Python/Go query surfaces the highest-engagement repos developers are sharing right now.
- Author credibility filtering — `includeAuthorProfile: true` lets you ignore low-karma noise and focus on the voices the community trusts.
- Content curation — filter by points and comments to build a curated link feed.
- Sentiment / NLP datasets — collect comments about a topic for downstream sentiment scoring or topic modelling.
- Academic research — historical archive back to 2007 with date-range filtering.
API & Integration
Run Hacker News Search programmatically and retrieve structured results via the Apify API. Replace <YOUR_API_TOKEN> with your Apify API token.
Python
from apify_client import ApifyClient
client = ApifyClient("<YOUR_API_TOKEN>")
run_input = {
"query": "large language models",
"searchType": "relevance",
"tags": "story",
"minPoints": 100,
"includeAuthorProfile": True,
"maxResults": 50,
}
run = client.actor("ytQ2q81fedyAGvCEJ").call(run_input=run_input)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
print(f"{item['title']} — {item['points']} points — karma {item.get('authorKarma')} — {item['hnUrl']}")
JavaScript
import { ApifyClient } from "apify-client";
const client = new ApifyClient({ token: "<YOUR_API_TOKEN>" });
const input = {
query: "MyProduct",
searchType: "date",
alertOnNewOnly: true,
alertWebhookUrl: "https://hooks.slack.com/services/...",
maxResults: 100,
};
const run = await client.actor("ytQ2q81fedyAGvCEJ").call(input);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
console.log(`${item.title} — ${item.points} pts — ${item.hnUrl}`);
});
cURL
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ytQ2q81fedyAGvCEJ/runs?token=<YOUR_API_TOKEN>" \
-H "Content-Type: application/json" \
-d '{
"query": "large language models",
"searchType": "relevance",
"tags": "story",
"minPoints": 100,
"maxResults": 50
}'
# Retrieve results from the dataset (use defaultDatasetId from the run response)
curl "https://api.apify.com/v2/datasets/<DATASET_ID>/items?token=<YOUR_API_TOKEN>&format=json"
Slack webhook payload format
When alertOnNewOnly: true and alertWebhookUrl is set, the actor POSTs a Slack-compatible JSON payload to your webhook after each run that finds new mentions:
{
"text": "*3 new Hacker News mentions for \"MyProduct\"*\n• <https://news.ycombinator.com/item?id=…|Show HN: a faster MyProduct competitor> — 142 points, 38 comments\n…",
"query": "MyProduct",
"totalNew": 3,
"items": [ /* up to 10 full result objects */ ]
}
This works as-is with Slack incoming webhooks, Discord webhooks, and any HTTP endpoint that accepts JSON. For Zapier or Make.com, paste your webhook URL into the field — the JSON body fires on every run with new mentions.
Integrations
- Apify Schedules — built-in cron scheduling for daily / hourly monitoring runs
- Webhooks — fire HTTP callbacks when a run finishes (separate from the alert webhook above)
- Zapier / Make / n8n — pipe results into 5,000+ apps via the alert webhook or the Apify integration
- Google Sheets — export the dataset directly to a sheet for collaborative review
- Slack / Discord — paste your incoming-webhook URL into `alertWebhookUrl` for native channel alerts
- Python SDK — official Apify Python client
- JavaScript SDK — official Apify JS client
Use in Dify
Drop Hacker News Intelligence into Dify workflows via the Apify plugin's Run Actor node. Every result arrives scored, classified, and tagged with a recommendation as structured JSON — a recordType enum (result / thread_comment / thread_summary / trend / trend_summary / comparison_summary), signalScore (0–100), feedbackType enum (complaint / feature_request / praise / question), suggestedAction enum (engage / investigate / monitor / ignore), and trendStage enum (emerging / rising / peaked / declining) that your downstream nodes branch on. The raw Algolia HN API returns posts; this returns prioritised developer signals.
- Actor ID: `ryanclinton/hackernews-search`
- Sample input (brand-monitoring with thread-summary mode):
{
"query": "your-product-name",
"enrichTopThreads": true,
"monitorStateKey": "brand-monitor-q2"
}
- Branching example — a Dify if/else node reads `recordType` first, then routes per-type:
  - `recordType = "result"` AND `suggestedAction = "engage"` → notify community-team Slack with the comment thread + `feedbackType` for context
  - `recordType = "result"` AND `feedbackType = "complaint"` AND `signalScore > 70` → page support team + create Zendesk ticket
  - `recordType = "result"` AND `feedbackType = "feature_request"` AND `signalScore > 60` → product backlog
  - `recordType = "trend"` AND `trendStage = "emerging"` → notify growth/marketing team
  - `recordType = "thread_summary"` → pipe directly to Slack as a daily digest
- For developer feedback mining: filter `recordType = "result"` AND `feedbackType IN ("complaint", "feature_request")` to surface only actionable signal — Dify routes complaints to support, feature requests to product.
- For trend detection: filter `recordType = "trend_summary"` and read `top5ByPoints[]` — Dify alerts when a new topic enters the top-5 emerging trend list.
- Cross-run alerts: pass `monitorStateKey` and the actor flags new high-signal threads since the last run — Dify branches only on the deltas, not the full result set.
The suggestedAction enum + signalScore make this drop-in for any Dify automation that needs to triage Hacker News mentions of a product, person, or technology — branch on engage for sales/community, investigate for product, monitor for marketing, ignore for archive.
How it works
Hacker News Search uses the official Algolia HN Search API, which indexes the entire Hacker News archive in near real-time. The actor constructs API queries from your input, paginates with a 1-second polite delay, classifies each result by content type, optionally enriches via the HN Firebase API + the GitHub API, optionally diffs against a named key-value store for the brand monitor, and writes structured output to the dataset.
- Input validation — reads input, clamps `maxResults` to 1–1,000 (or 1–10,000 with `autoSplitLargeQueries: true`), normalizes filters.
- Endpoint selection — `relevance` → `hn.algolia.com/api/v1/search`; `date` → `/search_by_date`.
- Query construction — builds the full URL with query, tag filters, numeric filters, and pagination.
- Paginated fetching with retry — fetches 50 hits per page; retries 5xx and network errors up to 3 times with linear backoff; waits 1 second between pages.
- Type detection — inspects the `_tags` array on each hit (priority: comment → poll → show_hn → ask_hn → story).
- Per-item enrichment (when toggles are on):
  - `includeAuthorProfile` → fetch `hacker-news.firebaseio.com/v0/user/{username}.json` (cached per run)
  - `enrichGithubLinks` → match `github.com/owner/repo` URLs and fetch `api.github.com/repos/{owner}/{repo}` (cached per run)
  - `parseHiringComments` → run regex over the comment body to extract company / location / remote / apply URL
- Brand-monitor diff (when `alertOnNewOnly: true`) — load prior run's IDs from the named key-value store `hackernews-search-monitor`, skip seen items, save the merged ID set (FIFO 10,000 cap) before exit.
- Output transformation — every item normalized to the dataset schema with camelCase field names, null-safe values, and a constructed `hnUrl`.
- Webhook alert (when `alertWebhookUrl` is set and new mentions exist) — POST a Slack-compatible payload.
- Show HN summary (when `tags: "show_hn"`) — write the aggregate record to the `SHOW_HN_SUMMARY` key-value record.
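The paginated-fetch step can be sketched like this, with `fetch_page` standing in for one Algolia page request (a hypothetical helper, so the loop stays testable without the network):

```python
import time

def fetch_all(fetch_page, max_results, retries=3, delay=1.0):
    """Sketch of the pagination loop: 50 hits per page, up to 3 retries
    with linear backoff, and a 1-second polite delay between pages.

    `fetch_page(page)` is a stand-in returning a list of hits for that
    page (empty list when the results are exhausted).
    """
    hits, page = [], 0
    while len(hits) < max_results:
        for attempt in range(retries):
            try:
                batch = fetch_page(page)
                break
            except (OSError, ValueError):
                time.sleep(delay * (attempt + 1))  # linear backoff
        else:
            break                                   # retries exhausted, give up
        if not batch:
            break                                   # no more results
        hits.extend(batch)
        page += 1
        if len(hits) < max_results:
            time.sleep(delay)                       # polite inter-page delay
    return hits[:max_results]
```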
Hacker News Search — Pipeline
+-------------+ +---------------------+ +-------------------+
| User Input |---->| Query Construction |---->| Algolia HN API |
| (14 fields) | | tags + filters + | | /search or |
+-------------+ | dates + pagination | | /search_by_date |
+---------------------+ +-------------------+
|
v
+-------------+ +---------------------+ +-------------------+
| Webhook + |<----| Per-item Enrichment |<----| Pagination + retry|
| Show HN | | + Brand-monitor | | (5xx + 429 backoff|
| summary + | | dedup + Algolia | | 3 attempts) |
| Dataset | | _tags type detect | | |
+-------------+ +---------------------+ +-------------------+
^
| (optional)
+-----+-----+ +-------------------+
| HN user | | GitHub repo API |
| Firebase | | (stars, lang, |
| API | | pushed_at) |
+-----------+ +-------------------+
Performance & cost
| Scenario | Results | Run time | PPE charges |
|---|---|---|---|
| Quick test | 10 | ~3 s | $0.05005 |
| Default search | 100 | ~5 s | $0.50005 |
| Medium search | 250 | ~10 s | $1.25005 |
| Large search | 500 | ~15 s | $2.50005 |
| Maximum search | 1,000 | ~30 s | $5.00005 |
| Daily brand monitor (5 new) | 5 | ~3 s | $0.02505 / day = $0.78 / month |
| Show HN snapshot (100 + summary KV) | 100 | ~5 s | $0.50005 |
PPE charges = apify-actor-start ($0.00005) + story-fetched × N ($0.005 each). Apify platform-compute charges are billed separately by Apify based on RAM-seconds (this actor defaults to 256 MB, the lightest workable tier for HN-scale traffic).
The Algolia HN API is free. Author profile and GitHub enrichment use free public APIs — they don't add to your PPE bill, and the actor's circuit breakers stop calling them after 5 consecutive failures so a dead upstream never burns your time.
Limitations
- Maximum 1,000 results per run — hard limit imposed by the Algolia HN API. For larger datasets, run multiple searches with non-overlapping date ranges.
- 50 results per API page — pagination is automatic, but a 1,000-result search makes ~20 sequential API calls.
- Algolia indexing delay — very new posts (last few minutes) may not yet appear in search results.
- Comment text is plain text — HTML formatting from original HN comments is stripped by the Algolia API.
- Author filter is case-sensitive — usernames must match exactly as they appear on HN.
- No Boolean query operators — `query` is plain text. Algolia HN does not support `AND`/`OR`/`NOT` syntax.
- Single author per run — to track multiple authors, run the actor separately for each.
- Date filtering granularity — `dateFrom` is midnight UTC, `dateTo` is 23:59:59 UTC. Sub-day precision is not available.
- Rate limiting enforced — built-in 1-second delay between pages. Removing this is not recommended.
- Who Is Hiring parser is best-effort — comment formats vary; expect ~80% extraction accuracy on company/location, lower on tech-stack heuristics. Always review before downstream automation.
- GitHub enrichment unauthenticated rate limit is 60/hr — set the `GITHUB_TOKEN` environment variable to raise it to 5,000/hr.
What this actor does NOT do
- It does not crawl or render JavaScript. It only calls the Algolia HN API and the HN Firebase + GitHub APIs. There is no browser, no scraping, no JS execution.
- It does not detect official live front-page rank. The actor computes velocity (`pointsPerHour`, `commentsPerHour`, `isTrending`) from each item's posting time, but the Algolia API doesn't expose live HN front-page position — for exact "currently #3 on HN" tracking, pair this actor with a Firebase-based front-page poller.
- It does not run LLM-based sentiment or topic classification. The `includeInsights: true` toggle adds heuristic sentiment + theme detection via keyword regex — it's deterministic, fast, and free, but it is not AI sentiment. For nuanced sentiment, plug the raw text into your own LLM pipeline.
- It does not deduplicate near-duplicate submissions. If the same article was posted three times by three users, you get three results — use `objectID` to dedupe at the application layer.
- It does not bypass HN guidelines. Public data, polite rate, attribution-friendly. If you publish derivative analysis, credit Hacker News (Y Combinator).
- It does not aggregate Reddit, Lobsters, or other social platforms. This actor is HN-focused. For multi-platform developer-community signal aggregation, see the planned "Developer Signal Monitor" actor.
If you need any of the above, see the Related actors table at the bottom for sibling tools, or open an issue on the actor's GitHub — feature requests with concrete use cases regularly ship.
Responsible use
This actor accesses publicly available data through the official Algolia HN Search API, which is provided specifically for programmatic access. Please use it responsibly:
- Respect rate limits — the actor enforces a 1-second delay between API pages.
- Retrieve only what you need — use filters and reasonable `maxResults` values.
- Respect user privacy — HN usernames and posts are public, but aggregating personal activity should be done thoughtfully and in compliance with GDPR / CCPA.
- Attribute your sources — credit Hacker News (Y Combinator) in any published analysis.
- Review terms of service — see Hacker News guidelines and the Algolia HN Search API documentation.
FAQ
Q: Do I need an API key to use this actor? A: No. The Algolia HN Search API is free and open. No HN API key or authentication is required. You only need an Apify account.
Q: How much does it cost? A: Pay-per-event: $0.00005 per run start + $0.005 per result fetched. A 100-result search costs about 50 cents. A daily brand monitor that finds 5 new mentions per day costs about 78 cents per month.
Q: How do I set up daily Slack alerts for my brand on Hacker News?
A: Set alertOnNewOnly: true, paste your Slack incoming-webhook URL into alertWebhookUrl, and schedule the actor to run daily via Apify Schedules. The first run primes the state and posts nothing; every subsequent run posts only mentions you haven't seen before.
Q: Does it work with Discord webhooks?
A: Yes. The payload uses Slack's text field, which Discord webhooks render as a message. You can also paste any HTTP endpoint that accepts JSON.
Q: How accurate is the "Who Is Hiring" parser?
A: It uses deterministic regex over the comment body and is best-effort. Expect ~80% accuracy on company name and location, and lower on remote-mode and apply-URL extraction (formats vary wildly across the thread). Review the parsed columns before downstream automation. The raw commentText is always preserved so you can re-parse manually.
Q: How far back does the data go?
A: The Algolia index covers essentially the entire Hacker News archive, going back to 2007. Use dateFrom and dateTo to scope to any time period.
Q: How do I get more than 1,000 results from a single query? A: The Algolia API caps at 1,000 per query. Split the search across non-overlapping date ranges (e.g., one run per month) and concatenate the datasets.
Q: Can I run this on a schedule?
A: Yes. Use Apify's built-in scheduling. Combine with searchType: "date" and alertOnNewOnly: true for a clean monitoring feed.
Q: How do I raise the GitHub enrichment rate limit?
A: Set the GITHUB_TOKEN environment variable in the actor's run options to a GitHub personal access token with public_repo scope. The unauthenticated limit is 60/hr (per IP); authenticated is 5,000/hr (per token).
Q: What happens when an enrichment API is down? A: The actor tracks consecutive failures separately for the HN Firebase API and the GitHub API. After 5 consecutive failures on either, that enrichment disables itself for the rest of the run. Main results keep flowing — you don't pay for a dead upstream.
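The breaker logic can be sketched in a few lines; this is an illustration of the described behaviour, not the actor's actual code:

```python
class CircuitBreaker:
    """Per-upstream breaker: after `threshold` consecutive failures,
    the enrichment disables itself for the rest of the run."""

    def __init__(self, threshold=5):
        self.threshold = threshold
        self.failures = 0
        self.open = False        # open = stop calling this upstream

    def record(self, success):
        if success:
            self.failures = 0    # any success resets the streak
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.open = True

    def allow(self):
        return not self.open
```

One instance per upstream (HN Firebase, GitHub) keeps a dead GitHub API from disabling author-profile enrichment, and vice versa.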
Q: Can I search for an exact phrase?
A: Yes. Wrap your query in double quotes: "machine learning".
Q: How does trend detection work?
A: With detectTrends: true, the actor runs two date-bounded searches — the current trendWindowDays window and the previous equal-length window. It tokenizes titles + story bodies + comments into 1/2/3-grams, filters out stop words and HN boilerplate, counts occurrences in each window, and computes growth percent. Terms below trendMinMentions or trendMinGrowthPercent are dropped. The remaining trends are scored (40% growth + 30% mentions + 20% avg signal + 10% unique authors) and the top trendMaxTerms are surfaced in TREND_SUMMARY (KV) and as recordType: 'trend' dataset records.
Q: How does thread expansion work?
A: For each story, Show HN, Ask HN, or poll result, the actor fetches that item from the HN Firebase API (hacker-news.firebaseio.com/v0/item/{id}.json) and BFS-walks its kids array up to threadMaxDepth. Each comment is emitted as a separate recordType: 'thread_comment' dataset record. The threadMaxComments cap is enforced across all parents in the run, so a single very-deep thread can't exhaust the budget. Thread comments are bundled in the existing per-result charge — no additional event.
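The BFS walk described above can be sketched like this, with `fetch_item` standing in for a GET on the Firebase item endpoint (so the traversal logic is testable offline):

```python
from collections import deque

def expand_thread(story_id, fetch_item, max_depth=3, max_comments=100):
    """BFS over HN Firebase `kids` arrays, capped by depth and by a
    total comment budget.

    `fetch_item(id)` is a stand-in for a GET on
    hacker-news.firebaseio.com/v0/item/{id}.json.
    """
    out = []
    queue = deque((kid, 1) for kid in fetch_item(story_id).get("kids", []))
    while queue and len(out) < max_comments:
        item_id, depth = queue.popleft()
        item = fetch_item(item_id)
        if not item or item.get("deleted"):
            continue
        out.append({"recordType": "thread_comment", "commentId": item_id,
                    "parentId": item.get("parent"), "depth": depth,
                    "author": item.get("by"), "text": item.get("text")})
        if depth < max_depth:
            queue.extend((kid, depth + 1) for kid in item.get("kids", []))
    return out
```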
Q: What's the difference between compareMode: previous_period and detectTrends?
A: compareMode: previous_period compares whatever you specified in dateFrom/dateTo against the equal-length window immediately before. detectTrends always uses now as the end of the current window and looks back trendWindowDays. Use compare for ad-hoc "this month vs last month" reports; use trends for "what's rising right now."
Q: How does discover mode work without a query?
A: Discover mode pre-applies tags: front_page + searchType: date + detectTrends: true + includeInsights: true. If you leave the query empty, you get the full HN front page; set a query to filter front-page items by topic. Schedule it daily for a "what's hot today on HN" feed.
Q: What does whyThisMatters look like in practice?
A: It's a single sentence built deterministically from the result's other fields — examples: "High-signal mention from a high-influence author with trending velocity (18 pts/hr) discussing developer-experience and ai with positive reception." or "Moderate-signal mention from an experienced user discussing security with concerns raised (complaint)." Null on results below signalScore: 25 to avoid noise.
Q: How accurate is feedbackType classification?
A: It's deterministic regex against curated keyword patterns — accurate enough to triage automatically, not accurate enough to ship to customers without review. Expect ~80% accuracy on clear-cut cases (broken, wish you'd add X, love this). For ambiguous mixed feedback, classification falls through to null — better honest absence than confident wrong answer.
Q: Why does the dataset overview view lead with suggestedAction?
A: Because Apify Console previews are most users' first impression. Leading with suggestedAction + signalLevel + whyThisMatters means a customer can scan 5 rows and immediately see what to investigate, what to engage with, and what to ignore — without clicking into individual records or exporting to a spreadsheet.
Q: Is the includeInsights sentiment AI-powered?
A: No. It's heuristic: a curated bullish/bearish word list and a domain theme dictionary (performance, cost, developer-experience, security, reliability, scalability, open-source, AI, lock-in). Pure regex + keyword counting, deterministic, free of hallucinations, free at runtime. For nuanced sentiment, pipe the raw commentText into your own LLM pipeline.
Q: Why are some output fields null?
A: Fields like title, url, storyText, commentText, parentId, storyId are null when they don't apply to the content type. Comments have no title; stories have no commentText. Enrichment fields (authorKarma, githubStars, hiringCompany, etc.) are null when the corresponding toggle is off, when the data isn't available, or when the enrichment circuit-breaker has fired. Thread/trend-specific fields (depth, text, term, growthPercent, etc.) are only populated on recordType: 'thread_comment' and recordType: 'trend' records respectively.
Q: How does the brand-monitor remember which IDs it has seen?
A: The actor opens a named key-value store called hackernews-search-monitor and writes one record per query slug containing up to 10,000 prior objectIDs in FIFO order. Each scheduled run loads the prior IDs, skips already-seen items, and saves the merged set on exit. Different queries get separate state, so you can run multiple monitors in parallel.
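The diff-and-merge step can be sketched like this; again, an illustration of the described behaviour rather than the actor's code:

```python
def diff_and_merge(prior_ids, fetched_ids, cap=10_000):
    """Emit only unseen IDs, then merge them into the stored list with
    a FIFO cap (oldest IDs dropped first)."""
    seen = set(prior_ids)
    new = [i for i in fetched_ids if i not in seen]
    merged = prior_ids + new
    return new, merged[-cap:]    # keep the most recent `cap` IDs
```

Each scheduled run loads `prior_ids` from the named store, alerts on `new`, and writes the capped `merged` list back before exit.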
Q: Does this actor scrape the news.ycombinator.com website? A: No. It only calls the Algolia HN Search API and (optionally) the HN Firebase + GitHub APIs. There is no browser automation, no HTML parsing, no rate-limit risk against HN itself.
Q: Can I run this offline or self-hosted? A: Not fully. The actor relies on the Apify platform for PPE billing, scheduling, and key-value state. The Algolia HN API itself is free and you could replicate the search logic in any HTTP client, but features like brand-monitor state, scheduled alerts, and PPE billing are Apify-specific.
Summary
Hacker News Intelligence is the most complete way to analyze Hacker News data without building your own pipeline. It is a developer sentiment monitoring tool, a Hacker News trend detection tool, and a social listening tool for developers — focused on high-signal discussions. It turns raw discussions into ranked, explainable, actionable insights, and tells you what to do about each one.
Related actors
If you find Hacker News Search useful, check these related tools for the developer-community and web-monitoring stack:
| Actor | Description |
|---|---|
| Stack Overflow & StackExchange Search | Search questions and answers across the entire StackExchange network |
| GitHub Repository Search | Search GitHub repositories by keyword, language, stars, and more |
| Bluesky Social Search | Search posts and profiles on the Bluesky social network |
| Brand Protection Monitor | Monitor brand mentions and potential infringements across the web |
| Website Change Monitor | Track changes on any website and get notified of updates |
| Wayback Machine Search | Search the Internet Archive's Wayback Machine for historical snapshots |
| CrossRef Paper Search | Search the academic literature via the CrossRef API |
Compare this actor
Related articles
How to Analyze Hacker News Data Without Writing a Single Line of Code
Hacker News Intelligence ranks every result 0-100, explains why it matters, and routes alerts to Slack. 100 results cost 50 cents. No code required.
Stop Reading Stack Overflow Manually — Turn Developer Questions Into Your Backlog
Stop manual SO triage. A scheduled actor scores developer questions, infers root causes, and pushes Jira / Linear / GitHub tickets at $0.001/question.
Related actors
Bulk Email Verifier — MX, SMTP & Disposable Detection at Scale
Verify email deliverability in bulk — MX records, SMTP mailbox checks, disposable detection (55K+ domains), role-based flagging, catch-all detection, domain health scoring (SPF/DKIM/DMARC), and confidence scores. $0.005/email, no subscription.
CFPB Complaint Search — By Company, Product & State
Search the CFPB consumer complaint database with 5M+ complaints. Filter by company, product, state, date range, and keyword. Extract complaint details, company responses, and consumer narratives. Free US government data, no API key required.
Company Deep Research — SEC, GitHub, DNS & Social
Research any company from a domain. Get website metadata, Wikipedia summary, GitHub repos & stars, SEC EDGAR filings & ticker, academic papers, DNS records, and social media profiles in one JSON report.
SEC EDGAR Filing Search — 10-K, 10-Q, 8-K & More
Search SEC EDGAR filings by keyword, company name, or ticker symbol. Filter by form type (10-K, 10-Q, 8-K, S-1, DEF 14A, Form 4) and date range. Returns structured filing data with direct document URLs. Free, no API key.
Ready to try Hacker News Search — Stories, Comments & Developer Sentiment?
Run it on your own Apify account. Apify offers a free tier with $5 of monthly credits.
Open on Apify Store