Hacker News Search
Search Hacker News stories, comments, and polls via Algolia API. Filter by date, points, comments, author. Track tech trends, monitor brand mentions, find discussions. No API key needed.
Search and extract stories, comments, polls, Show HN, and Ask HN posts from Hacker News. This actor uses the Algolia HN Search API to find content by keyword, filter by author, date range, minimum points, and comment count -- then returns clean, structured JSON ready for analysis, monitoring, or integration into your data pipeline. No API key required. Export results as JSON, CSV, Excel, or connect directly via the Apify API.
Why use Hacker News Search?
Hacker News is one of the most influential technology communities on the internet, with millions of posts spanning software engineering, startups, artificial intelligence, science, and policy. The native site search is basic and does not support advanced filtering. This actor gives you programmatic access to the entire HN archive through the Algolia Search API, with filters for content type, author, engagement, and date range.
Sort by relevance or date. Restrict results to specific content types like stories, comments, or Show HN posts. Set minimum upvote and comment thresholds to surface only high-engagement content. Scope searches to exact date ranges. Filter by author to track specific users. All results come back as structured data with direct links to the original HN discussion pages.
Whether you are tracking brand mentions, researching technology trends, monitoring competitor discussions, curating content, or building datasets for NLP analysis, Hacker News Search delivers clean, structured output at scale with zero configuration and no API key.
Key features
- Full-text search across all Hacker News content -- stories, comments, polls, Show HN, Ask HN, and front page posts
- Sort by relevance or date to find the best matches or the most recent discussions
- Content type filtering -- restrict results to stories, comments, polls, Show HN, Ask HN, or front page items
- Author filtering -- find all posts and comments by a specific HN username
- Engagement thresholds -- set minimum points (upvotes) and minimum comments to surface only high-quality discussions
- Date range filtering -- scope searches to any time window using YYYY-MM-DD start and end dates
- Up to 1,000 results per run with automatic pagination (50 hits per page)
- Polite rate limiting -- 1-second delay between API pages to respect the Algolia service
- Automatic type detection -- classifies each result as story, comment, poll, show_hn, or ask_hn based on internal tag arrays
- Direct HN links -- every result includes a clickable link to the Hacker News discussion page
- No API key required -- works out of the box with zero setup or authentication
- Multiple export formats -- download results as JSON, CSV, Excel, XML, or HTML from the Apify dataset
How to use Hacker News Search
Using the Apify Console
- Go to the Hacker News Search actor page on Apify.
- Click Try for free to open the actor in the Console.
- Enter your search query (e.g., `artificial intelligence`, `"large language models"`, `Rust programming`).
- Choose your sort order -- Relevance for best matches or Date (newest first) for recent content.
- Optionally filter by content type, author, minimum points, minimum comments, or date range.
- Set your maximum results (default is 100, up to 1,000).
- Click Start and wait for the run to finish.
- Switch to the Dataset tab to preview, download, or export results in JSON, CSV, Excel, or other formats.
Using the API
You can start a run programmatically, send input as JSON, and retrieve results using the Apify API. See the API & Integration section below for ready-to-use code examples in Python, JavaScript, and cURL.
Input parameters
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `query` | String | Yes | artificial intelligence | Search query to find on Hacker News |
| `searchType` | String | No | relevance | Sort order: relevance (best matches) or date (newest first) |
| `tags` | String | No | (all types) | Content type filter: story, comment, poll, show_hn, ask_hn, or front_page |
| `author` | String | No | (any) | Filter results by HN username |
| `minPoints` | Integer | No | (none) | Minimum number of upvotes/points |
| `minComments` | Integer | No | (none) | Minimum number of comments |
| `dateFrom` | String | No | (none) | Start date in YYYY-MM-DD format |
| `dateTo` | String | No | (none) | End date in YYYY-MM-DD format |
| `maxResults` | Integer | No | 100 | Maximum results to return (1--1,000) |
JSON input example
{
    "query": "large language models",
    "searchType": "date",
    "tags": "story",
    "minPoints": 50,
    "minComments": 10,
    "dateFrom": "2025-01-01",
    "dateTo": "2025-12-31",
    "maxResults": 200
}
Tips
- Leave `tags` empty to search across all content types (stories, comments, polls, etc.).
- Combine `minPoints` and `minComments` to surface only high-engagement discussions and filter out noise.
- Use `searchType: "date"` with `dateFrom` and `dateTo` to get a chronological feed within a specific time window.
- The `author` filter matches exact HN usernames and is case-sensitive.
- Wrap your query in double quotes for exact phrase matching, e.g., `"machine learning"`.
- Start with a small `maxResults` value (10--20) to test your filters before scaling up.
Output
Each result is pushed to the default Apify dataset as a JSON object. Here is an example of a single output item:
{
    "objectID": "39281042",
    "title": "Show HN: Open-source LLM benchmark for real-world coding tasks",
    "url": "https://github.com/example/llm-benchmark",
    "author": "techfounder",
    "points": 342,
    "numComments": 87,
    "createdAt": "2025-06-15T14:23:01.000Z",
    "type": "show_hn",
    "storyText": null,
    "commentText": null,
    "parentId": null,
    "storyId": null,
    "hnUrl": "https://news.ycombinator.com/item?id=39281042"
}
Output fields
| Field | Type | Description |
|---|---|---|
| `objectID` | String | Unique Hacker News item ID |
| `title` | String or null | Post title (null for comments) |
| `url` | String or null | External link URL (null for text posts and comments) |
| `author` | String | HN username of the poster |
| `points` | Number | Number of upvotes the item received |
| `numComments` | Number | Number of comments on the post |
| `createdAt` | String | ISO 8601 timestamp of when the item was posted |
| `type` | String | Content type: story, comment, poll, show_hn, or ask_hn |
| `storyText` | String or null | Body text for Ask HN and text-only posts |
| `commentText` | String or null | Comment text (only present for comment-type results) |
| `parentId` | String or null | Parent item ID (for comments -- references the item being replied to) |
| `storyId` | String or null | Parent story ID (for comments -- references the top-level story) |
| `hnUrl` | String | Direct link to the Hacker News discussion page |
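The `storyId` and `parentId` fields are enough to regroup matching comments after a run. The following is a minimal sketch, assuming items shaped like the output example above; `group_comments_by_story` and `build_reply_index` are hypothetical helper names that are not part of the actor, and the sample IDs are made up.

```python
from collections import defaultdict


def group_comments_by_story(items):
    """Map each storyId to the comment items that belong to that story."""
    threads = defaultdict(list)
    for item in items:
        if item.get("type") == "comment" and item.get("storyId") is not None:
            threads[item["storyId"]].append(item)
    return dict(threads)


def build_reply_index(comments):
    """Map each parentId to the objectIDs of its direct replies."""
    replies = defaultdict(list)
    for comment in comments:
        replies[comment["parentId"]].append(comment["objectID"])
    return dict(replies)


# Made-up sample items (IDs are illustrative only).
sample_items = [
    {"objectID": "101", "type": "comment", "storyId": "100", "parentId": "100"},
    {"objectID": "102", "type": "comment", "storyId": "100", "parentId": "101"},
    {"objectID": "201", "type": "comment", "storyId": "200", "parentId": "200"},
    {"objectID": "100", "type": "story", "storyId": None, "parentId": None},
]

threads = group_comments_by_story(sample_items)
reply_index = build_reply_index(threads["100"])
```

Because a run returns only items that match the query, a regrouped thread may be partial: replies that did not match the search are simply absent.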
Use cases
- Technology trend monitoring -- track emerging topics like AI, Rust, WebAssembly, or HTMX by searching with date ranges and sorting by date to see adoption curves over time.
- Brand and product mentions -- monitor when your company, product, or competitor is discussed on Hacker News by running scheduled searches with keyword filters.
- Content curation -- find the highest-quality discussions on a topic by filtering for high points and comment counts to build curated link feeds.
- Sentiment research -- collect comments about a technology or company for qualitative analysis, NLP sentiment scoring, or opinion mining.
- Competitive intelligence -- discover what the developer community says about competing products, frameworks, or services.
- Recruiting signals -- find active HN users discussing specific technologies to identify knowledgeable potential candidates.
- Show HN monitoring -- track new product launches and side projects posted to the Show HN category.
- Academic research -- study how technical topics gain traction in the developer community over time using historical data going back to 2007.
- Author activity analysis -- retrieve all posts from a specific HN user to study their interests, expertise, and activity patterns.
- Link discovery -- extract external URLs shared in high-engagement HN posts as a curated reading list or research bibliography.
API & Integration
Run Hacker News Search programmatically and retrieve structured results using the Apify API. Replace <YOUR_API_TOKEN> with your Apify API token.
Python
from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("<YOUR_API_TOKEN>")

run_input = {
    "query": "large language models",
    "searchType": "relevance",
    "tags": "story",
    "minPoints": 100,
    "maxResults": 50,
}

# Start the actor and wait for the run to finish
run = client.actor("ytQ2q81fedyAGvCEJ").call(run_input=run_input)

# Iterate over the items stored in the run's default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['title']} -- {item['points']} points -- {item['hnUrl']}")
JavaScript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "<YOUR_API_TOKEN>" });

const input = {
    query: "large language models",
    searchType: "relevance",
    tags: "story",
    minPoints: 100,
    maxResults: 50,
};

const run = await client.actor("ytQ2q81fedyAGvCEJ").call(input);
const { items } = await client.dataset(run.defaultDatasetId).listItems();

items.forEach((item) => {
    console.log(`${item.title} -- ${item.points} points -- ${item.hnUrl}`);
});
cURL
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ytQ2q81fedyAGvCEJ/runs?token=<YOUR_API_TOKEN>" \
    -H "Content-Type: application/json" \
    -d '{
        "query": "large language models",
        "searchType": "relevance",
        "tags": "story",
        "minPoints": 100,
        "maxResults": 50
    }'

# Retrieve results from the dataset (use defaultDatasetId from the run response)
curl "https://api.apify.com/v2/datasets/<DATASET_ID>/items?token=<YOUR_API_TOKEN>&format=json"
Integrations
Hacker News Search connects with the full Apify ecosystem and popular external platforms:
- Webhooks -- trigger HTTP callbacks when a run finishes to pipe data into your own backend or database.
- Zapier -- connect HN search results to 5,000+ apps including Slack, Google Sheets, Airtable, and email.
- Make (Integromat) -- build automated workflows that process HN data on a schedule with complex routing logic.
- Google Sheets -- export results directly to a spreadsheet for collaborative analysis and reporting.
- Slack notifications -- get alerts when new mentions of your keywords appear on Hacker News.
- Python SDK -- use the official Apify Python client for seamless integration into data pipelines.
- JavaScript SDK -- use the official Apify JS client for Node.js applications and serverless functions.
- API polling -- schedule runs via the Apify API and poll for results in any programming language.
How it works
Hacker News Search uses the Algolia HN Search API, which indexes the entire Hacker News archive in near real-time. The actor constructs API queries from your input parameters, handles pagination automatically, classifies each result by content type, and outputs clean structured data.
- Input validation -- The actor reads your input configuration and clamps `maxResults` between 1 and 1,000.
- Endpoint selection -- Based on `searchType`, it selects either the relevance endpoint (`hn.algolia.com/api/v1/search`) or the date endpoint (`hn.algolia.com/api/v1/search_by_date`).
- Query construction -- It builds the full API URL with your search query, tag filters (content type + author), and numeric filters (points, comments, date range timestamps).
- Paginated fetching -- Results are fetched in pages of 50 hits each. After each page, the actor waits 1 second before requesting the next page. Pagination continues until the desired result count is reached or no more pages are available.
- Type detection -- Each hit's `_tags` array is inspected to classify the item. The actor checks in order: comment, poll, show_hn, ask_hn, then defaults to story.
- Output transformation -- Raw Algolia response fields are mapped to a clean output schema with camelCase field names, null-safe values, and a constructed `hnUrl` link for every item.
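The endpoint selection and query construction steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the actor's source code: `build_url` and `day_start_ts` are hypothetical names, and the use of `>=` / `<=` comparisons is assumed from the "minimum" wording of the inputs. The `tags`, `numericFilters`, `page`, and `hitsPerPage` query parameters and the `author_<username>` tag form come from the public Algolia HN Search API.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE_URL = "https://hn.algolia.com/api/v1"


def day_start_ts(day):
    """Unix timestamp for midnight UTC of a YYYY-MM-DD date string."""
    parsed = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def build_url(params, page=0, hits_per_page=50):
    """Build an Algolia HN search URL from actor-style input (assumed mapping)."""
    endpoint = "search_by_date" if params.get("searchType") == "date" else "search"

    # Tag filters: content type plus author, comma-separated (logical AND).
    tags = []
    if params.get("tags"):
        tags.append(params["tags"])
    if params.get("author"):
        tags.append(f"author_{params['author']}")

    # Numeric filters: engagement thresholds and date range as timestamps.
    numeric = []
    if params.get("minPoints") is not None:
        numeric.append(f"points>={params['minPoints']}")
    if params.get("minComments") is not None:
        numeric.append(f"num_comments>={params['minComments']}")
    if params.get("dateFrom"):
        numeric.append(f"created_at_i>={day_start_ts(params['dateFrom'])}")
    if params.get("dateTo"):
        # End of day: midnight UTC plus 23:59:59.
        numeric.append(f"created_at_i<={day_start_ts(params['dateTo']) + 86399}")

    query = {"query": params["query"], "page": page, "hitsPerPage": hits_per_page}
    if tags:
        query["tags"] = ",".join(tags)
    if numeric:
        query["numericFilters"] = ",".join(numeric)
    return f"{BASE_URL}/{endpoint}?{urlencode(query)}"


url = build_url({
    "query": "rust",
    "searchType": "date",
    "tags": "story",
    "author": "pg",
    "minPoints": 50,
    "dateFrom": "2025-01-01",
    "dateTo": "2025-01-01",
})
print(url)
```

Note that `urlencode` percent-encodes the commas and comparison operators, so the printed URL contains sequences such as `%2C` and `%3E%3D` rather than the raw characters.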
Hacker News Search -- Processing Pipeline

+-------------+     +---------------------+     +-------------------+
|             |     |                     |     |                   |
| User Input  |---->| Query Construction  |---->| Algolia HN API    |
| (9 fields)  |     | tags, numeric       |     | /search or        |
|             |     | filters, author     |     | /search_by_date   |
+-------------+     +---------------------+     +-------------------+
                                                          |
                                                          v
+-------------+     +---------------------+     +-------------------+
|             |     |                     |     |                   |
| Structured  |<----| Type Detection      |<----| Pagination Loop   |
| Output      |     | _tags inspection:   |     | 50 hits/page      |
| (Dataset)   |     | comment > poll >    |     | 1s polite delay   |
|             |     | show_hn > ask_hn >  |     | until maxResults  |
|             |     | story (default)     |     | or last page      |
+-------------+     +---------------------+     +-------------------+
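The type detection and output transformation stages might look something like the sketch below. It assumes the raw field names returned by the Algolia HN API (`num_comments`, `created_at`, `story_text`, `comment_text`, `parent_id`, `story_id`, `_tags`); the helper names, the choice to default missing counts to 0, and the conversion of IDs to strings are assumptions made for illustration, not confirmed actor behavior.

```python
# Priority order described above; anything else falls back to "story".
TYPE_PRIORITY = ("comment", "poll", "show_hn", "ask_hn")


def detect_type(tags):
    """Return the first matching type in priority order, else 'story'."""
    for candidate in TYPE_PRIORITY:
        if candidate in tags:
            return candidate
    return "story"


def _as_str_or_none(value):
    """Convert an ID to a string while preserving None."""
    return None if value is None else str(value)


def to_output_item(hit):
    """Map a raw Algolia hit to the camelCase output schema (assumed mapping)."""
    object_id = str(hit["objectID"])
    return {
        "objectID": object_id,
        "title": hit.get("title"),
        "url": hit.get("url"),
        "author": hit.get("author"),
        "points": hit.get("points") or 0,  # assumption: missing counts become 0
        "numComments": hit.get("num_comments") or 0,
        "createdAt": hit.get("created_at"),
        "type": detect_type(hit.get("_tags", [])),
        "storyText": hit.get("story_text"),
        "commentText": hit.get("comment_text"),
        "parentId": _as_str_or_none(hit.get("parent_id")),
        "storyId": _as_str_or_none(hit.get("story_id")),
        "hnUrl": f"https://news.ycombinator.com/item?id={object_id}",
    }


# Sample hit modeled on the output example in this document.
raw_hit = {
    "objectID": "39281042",
    "title": "Show HN: Open-source LLM benchmark for real-world coding tasks",
    "url": "https://github.com/example/llm-benchmark",
    "author": "techfounder",
    "points": 342,
    "num_comments": 87,
    "created_at": "2025-06-15T14:23:01.000Z",
    "_tags": ["story", "author_techfounder", "story_39281042", "show_hn"],
}
item = to_output_item(raw_hit)
```

Checking `show_hn` and `ask_hn` before falling back to `story` matters because such posts typically carry the `story` tag as well.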
Performance & cost
| Scenario | Results | API pages | Estimated run time | Platform cost |
|---|---|---|---|---|
| Quick test | 10 | 1 | ~3 seconds | < $0.001 |
| Default search | 100 | 2 | ~5 seconds | < $0.005 |
| Medium search | 250 | 5 | ~10 seconds | < $0.005 |
| Large search | 500 | 10 | ~15 seconds | < $0.01 |
| Maximum search | 1,000 | 20 | ~30 seconds | ~$0.01 |
Notes:
- Run times include the 1-second polite delay between paginated API requests.
- The Algolia HN API is free and requires no API key, so there are no external data costs.
- The actor uses minimal memory (256 MB is sufficient for all scenarios).
- Actual run time depends on network latency and the number of matching results available.
- The Apify Free plan includes $5 of monthly platform credits, enough for hundreds of HN searches at no cost.
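The "API pages" column follows directly from the 50-hits-per-page size, and the polite delay sets a floor on run time. A small sketch of that arithmetic is shown below; `page_count` and `min_delay_seconds` are hypothetical helpers, and the delay is assumed to apply only between pages (not after the last one).

```python
import math

HITS_PER_PAGE = 50
DELAY_SECONDS = 1.0
MAX_RESULTS_CAP = 1000


def page_count(max_results):
    """Number of API pages needed, after clamping maxResults to 1..1,000."""
    clamped = max(1, min(MAX_RESULTS_CAP, max_results))
    return math.ceil(clamped / HITS_PER_PAGE)


def min_delay_seconds(max_results):
    """Total polite delay, assuming one pause between consecutive pages."""
    return (page_count(max_results) - 1) * DELAY_SECONDS


for n in (10, 100, 250, 500, 1000):
    print(n, page_count(n), min_delay_seconds(n))
```

The estimated run times in the table are higher than this floor because they also include actor startup and the time taken by each HTTP request.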
Limitations
- Maximum 1,000 results per run -- this is a hard limit imposed by the Algolia HN API. For larger datasets, run multiple searches with non-overlapping date ranges.
- 50 results per API page -- pagination is handled automatically, but a 1,000-result search requires 20 sequential API calls.
- Algolia indexing delay -- very new posts (within the last few minutes) may not yet appear in search results.
- No full thread retrieval -- the actor returns individual matching items, not entire comment threads. Use the `storyId` and `parentId` fields to reconstruct threads programmatically.
- Comment text is plain text -- HTML formatting from the original HN comments is stripped by the Algolia API.
- Author filter is case-sensitive -- usernames must match exactly as they appear on Hacker News.
- No Boolean query operators -- the search query is a plain text string. Advanced Boolean syntax (AND, OR, NOT) is not supported by the Algolia HN endpoint.
- Single author per run -- to track multiple authors, run the actor separately for each username or filter results after collection.
- Date filtering granularity -- `dateFrom` starts at midnight UTC and `dateTo` ends at 23:59:59 UTC of the specified day. Sub-day precision is not available.
- Rate limiting enforced -- the actor includes a mandatory 1-second delay between pages. Removing this delay is not recommended and may result in API throttling.
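To work around the 1,000-result cap, a date range can be split into non-overlapping windows and each window submitted as its own run. The sketch below generates calendar-month windows; `monthly_windows` is a hypothetical helper, and it relies on the behavior described above, where `dateFrom` and `dateTo` are both inclusive at day granularity.

```python
from datetime import date, timedelta


def monthly_windows(start, end):
    """Split [start, end] into non-overlapping (dateFrom, dateTo) month windows."""
    windows = []
    current = start
    while current <= end:
        # First day of the following month.
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        last_day = min(next_month - timedelta(days=1), end)
        windows.append((current.isoformat(), last_day.isoformat()))
        current = next_month
    return windows


windows = monthly_windows(date(2025, 1, 15), date(2025, 3, 10))

# One actor input per window; each run can return up to 1,000 results.
run_inputs = [
    {"query": "large language models", "dateFrom": start, "dateTo": end, "maxResults": 1000}
    for start, end in windows
]
```

If a single month still matches more than 1,000 items, the same idea applies with shorter windows such as weeks or days.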
Responsible use
This actor accesses publicly available data from Hacker News through the official Algolia HN Search API, which is provided specifically for programmatic access. Please use this tool responsibly:
- Respect rate limits -- the actor includes a built-in 1-second delay between API requests. Do not modify or remove this delay.
- Retrieve only what you need -- use filters and reasonable `maxResults` values to minimize unnecessary API calls.
- Respect user privacy -- while HN usernames and posts are public, aggregating personal activity data about individuals should be done thoughtfully and in compliance with applicable privacy regulations (GDPR, CCPA, etc.).
- Attribute your sources -- if you publish analysis or research based on HN data, credit Hacker News (Y Combinator) as the data source.
- Review terms of service -- consult the Hacker News guidelines and the Algolia HN Search API documentation before large-scale or commercial data collection.
FAQ
Q: Do I need an API key to use this actor?
A: No. The Algolia HN Search API is free and open. No API key or authentication is required. You only need an Apify account to run the actor.
Q: How far back does the data go?
A: The Algolia index covers essentially the entire Hacker News archive, going back to the site's launch in 2007. You can use the dateFrom and dateTo parameters to search any time period.
Q: Can I search for an exact phrase?
A: Yes. Wrap your query in double quotes, e.g., "machine learning", to search for the exact phrase rather than individual words.
Q: How do I get only stories (no comments)?
A: Set the tags input parameter to story. This filters out comments, polls, and other content types.
Q: What is the difference between relevance and date search?
A: Relevance sort returns the best-matching results based on Algolia's ranking algorithm, which factors in text match quality, points, and recency. Date sort returns results in strict reverse chronological order (newest first), which is better for monitoring and scheduled runs.
Q: Can I search for posts by a specific user?
A: Yes. Set the author field to the exact HN username (case-sensitive). This can be combined with any other filters.
Q: How do I get more than 1,000 results?
A: The Algolia API limits results to 1,000 per query. To work around this, split your search into multiple runs with non-overlapping date ranges (e.g., one run per month or per week).
Q: Can I run this actor on a schedule?
A: Yes. Use Apify's built-in scheduling feature to run the actor at any interval -- hourly, daily, weekly. Combine with searchType: "date" and a narrow date range for an ongoing monitoring feed.
Q: Why are some fields null?
A: Fields like title, url, storyText, commentText, parentId, and storyId are null when they do not apply to the content type. For example, comments have no title, and stories have no commentText.
Q: How accurate is the type detection?
A: The actor reads the _tags array from the Algolia API to classify items. This is the same classification used by Hacker News itself, so it is highly accurate. Items are checked in priority order: comment, poll, show_hn, ask_hn, then story (default fallback).
Q: Can I filter by multiple content types at once?
A: The actor supports one content type filter per run. To collect both stories and comments, either leave the tags field empty (returns all types) or run the actor twice with different tag values.
Q: What happens if my query returns fewer results than maxResults?
A: The actor returns all available matching results and stops gracefully. If your query matches 50 items but you requested 500, you will receive 50 results.
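For scheduled monitoring, one possible pattern is to build the run input from a rolling date window and then drop items already seen in earlier runs. The sketch below is only an illustration: `rolling_input` and `only_new` are hypothetical helpers, and how the set of seen IDs is persisted between runs is left to the caller.

```python
from datetime import date, timedelta


def rolling_input(query, today, days_back=1):
    """Actor input covering the last `days_back` days up to and including today."""
    return {
        "query": query,
        "searchType": "date",
        "dateFrom": (today - timedelta(days=days_back)).isoformat(),
        "dateTo": today.isoformat(),
    }


def only_new(items, seen_ids):
    """Return items whose objectID is not in seen_ids, and record them as seen."""
    fresh = [item for item in items if item["objectID"] not in seen_ids]
    seen_ids.update(item["objectID"] for item in fresh)
    return fresh


run_input = rolling_input("rust", date(2025, 6, 15))

# Made-up IDs standing in for results from two consecutive scheduled runs.
seen = set()
first = only_new([{"objectID": "1"}, {"objectID": "2"}], seen)
second = only_new([{"objectID": "2"}, {"objectID": "3"}], seen)
```

Overlapping the window by a day, as here, trades a few duplicate hits for protection against items that were indexed late; the `objectID` check removes the duplicates.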
Related actors
If you find Hacker News Search useful, check out these related actors for developer community and web monitoring data:
| Actor | Description |
|---|---|
| GitHub Repository Search | Search GitHub repositories by keyword, language, stars, and more |
| Stack Overflow & StackExchange Search | Search questions and answers across the entire StackExchange network |
| Bluesky Social Search | Search posts and profiles on the Bluesky social network |
| Brand Protection Monitor | Monitor brand mentions and potential infringements across the web |
| Website Change Monitor | Track changes on any website and get notified of updates |
| Wayback Machine Search | Search the Internet Archive's Wayback Machine for historical snapshots |