OSHA Accident & Injury Intelligence
OSHA accident data is public — but retrieving it in a usable form requires joining 7 separate DOL datasets, handling pagination, and writing your own risk assessment logic. This actor does all of that in a single run. It searches the Department of Labor OSHA accident investigation database, joins incident narratives, individual injury records, emphasis programs, and general duty citations, and returns each case with a composite risk score.
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| result-returned | Charged per result returned. Includes data transformation and structured output. | $0.03 |
Example: 100 events = $3.00 · 1,000 events = $30.00
Documentation
Whether you are a safety consultant benchmarking industry hazard patterns, an insurance underwriter profiling employer risk, or a legal researcher building a litigation file, you get structured, enriched data ready for analysis — not raw API responses you still have to clean.
What data can you extract?
| Data Point | Source | Example |
|---|---|---|
| 📋 Summary number | OSHA/accident | 202451234 |
| 📅 Event date | OSHA/accident | 2024-09-15 |
| 🏭 Employer name | OSHA/accident | SUMMIT CONSTRUCTION LLC |
| 📍 Site location | OSHA/accident | 1200 Industrial Blvd, Houston TX 77001 |
| 🏷️ NAICS / SIC code | OSHA/accident | 236220 (Commercial construction) |
| 🔑 Event keyword | OSHA/accident | Fall, Chemical, Struck, Electrocution |
| 📝 Incident narrative | OSHA/accident_abstract | Full-text description of what happened |
| 👤 Injury demographics | OSHA/accident_injury | Age 34, Male, Construction laborer |
| 🦴 Nature and body part | OSHA/accident_injury | Fractures — Head |
| ⬇️ Fall distance / height | OSHA/accident_injury | 35 feet |
| ☣️ Hazardous substance | OSHA/accident_injury | Hydrogen chloride |
| 🚨 Emphasis programs | OSHA/emphasis_codes | Falls, PSM Covered Chemical Facilities |
| ⚖️ General duty citations | OSHA/violation_gen_duty_std | Section 5(a)(1) text |
| 🔗 Related inspections | OSHA/related_activity | Linked inspection activity numbers |
| 📊 Risk score / level | Calculated | 95 — Critical |
| 🔍 Risk factors | Calculated | 4 hospitalization(s) (+60), 2 emphasis programs (+10) |
Why use OSHA Accident & Injury Intelligence?
Accessing OSHA accident data manually means working through the DOL data portal, running separate queries for each dataset, figuring out how to join on summary_nr and activity_nr, and building your own logic to identify high-severity cases. For a researcher covering 200 employers, that is a multi-day project with significant opportunity for error.
This actor automates the entire data pipeline. You specify filters — company name, state, NAICS code, date range, event keyword, severity level — and receive enriched accident case records sorted by risk score, ready for immediate analysis.
- Scheduling — run weekly or monthly to track emerging accident patterns in a target industry or region
- API access — trigger runs from Python, JavaScript, or any HTTP client to power dashboards or automated reports
- Proxy rotation — not applicable (DOL API is public), but Apify's infrastructure handles rate-limit retries automatically
- Monitoring — get Slack or email alerts when a run completes or returns unexpected result counts
- Integrations — connect to Google Sheets, Zapier, Make, HubSpot, or any webhook destination
Features
- 7-dataset join — automatically correlates `accident`, `accident_abstract`, `accident_injury`, `related_activity`, `emphasis_codes`, and `violation_gen_duty_std` on shared keys
- Composite risk scoring — assigns each case a score from 0 to 100+ across 9 weighted factors: fatalities (+40 each), hospitalizations (+15 each, capped at +60), amputations (+25), multiple injuries (+10 for 2–5), mass casualties (+20 for 6 or more), falls over 20 feet (+15), emphasis programs (+5 each, capped at +15), general duty citations (+10), and minor workers (+20)
- Four risk tiers — Critical (80+), High (50–79), Medium (25–49), Low (below 25) for fast triage
- Injury-level demographics — per-victim records with age, sex, occupation, nature of injury, body part, source, event type, environment factor, human factor, and fall distance
- Injury summary aggregation — per-case totals: average age, male/female breakdown, top nature of injury, top body part, fall-related count
- Full incident narratives — free-text abstract from `accident_abstract` describing exactly what happened
- Hazardous substance tracking — identifies chemical exposures at the individual injury level
- Emphasis program detection — flags active OSHA National Emphasis Programs (NEPs) such as Falls, Trenching, PSM Chemical Facilities, and Heat
- General duty standard citations — includes Section 5(a)(1) citation text for novel hazards with no specific standard
- Batch enrichment in groups of 50 — injury and related data are fetched in batches using `in` filter operators to minimize API calls
- Paginated retrieval — supports up to 5,000 accident cases per run via 1,000-record pages with a 500 ms delay between pages
- Exponential backoff — 4-attempt retry with waits of 2, 4, 8, and 16 seconds handles rate limits and transient errors
- Sorted output — results delivered highest risk score first so the most critical cases appear at the top of your dataset
- Dry run mode — returns two realistic sample records without an API key, so you can inspect the schema before committing
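The injury summary aggregation listed above can be pictured with a short Python sketch. This is an illustration only, not the actor's code: the function names are hypothetical, rounding the average age to a whole number is an assumption, and the field names are taken from the output schema documented further down.

```python
from collections import Counter
from statistics import mean


def most_common_value(values):
    """Return the most frequent non-empty value, or None if there is none."""
    counts = Counter(v for v in values if v)
    return counts.most_common(1)[0][0] if counts else None


def summarize_injuries(injuries):
    """Aggregate per-victim records into case-level totals (hypothetical helper)."""
    ages = [i["age"] for i in injuries if i.get("age") is not None]
    return {
        "total": len(injuries),
        # Assumption: the average is rounded to a whole number of years.
        "averageAge": round(mean(ages)) if ages else None,
        "maleCount": sum(1 for i in injuries if i.get("sex") == "M"),
        "femaleCount": sum(1 for i in injuries if i.get("sex") == "F"),
        "topNatureOfInjury": most_common_value(i.get("natureOfInjury") for i in injuries),
        "topBodyPart": most_common_value(i.get("partOfBody") for i in injuries),
        # A record counts as fall-related when either fall field is populated.
        "fallRelated": sum(
            1 for i in injuries
            if i.get("fallDistance") is not None or i.get("fallHeight") is not None
        ),
    }


sample = [
    {"age": 42, "sex": "M", "natureOfInjury": "Chemical burns", "partOfBody": "Respiratory system"},
    {"age": 28, "sex": "F", "natureOfInjury": "Chemical burns", "partOfBody": "Skin"},
]
summary = summarize_injuries(sample)
print(summary)
```

Missing fields are treated as absent rather than as errors, which matches the note under Limitations that older injury records are often only partially populated.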
Use cases for OSHA accident data
Safety consulting and EHS benchmarking
Environmental health and safety professionals need industry-specific accident patterns to build benchmarking reports and advise clients. Filter by NAICS code to retrieve all construction, manufacturing, or petrochemical incidents in a state for the past 12 months, then analyze fall heights, chemical exposures, and injury types to identify the hazards with the highest severity concentration.
Insurance underwriting and workplace risk assessment
Underwriters assessing workers' compensation or general liability policies need to understand an employer's safety history and the risk profile of their industry segment. Pull all accident investigations involving a named employer, or screen an industry vertical by NAICS prefix, to quantify fatality and hospitalization rates before pricing a policy.
Legal research and litigation support
Attorneys and expert witnesses researching workplace injury litigation need detailed incident narratives, injury demographics, and inspection linkages for cases involving similar fact patterns. Filter by event keyword (Fall, Electrocution, Caught, Struck) and severity flag to build a set of comparable investigations with full narrative text.
Supply chain and contractor due diligence
Procurement and vendor risk teams screening contractors or suppliers for workplace safety standards can query by company name and correlate the resulting risk scores against contract decisions. A Critical-rated supplier with recurring fall fatalities or chemical releases presents a different risk profile than one with only non-hospitalized injuries.
Investigative journalism and public health research
Journalists covering workplace safety and public health researchers studying occupational injury trends can retrieve mass casualty events, identify states or industries with disproportionate fatality rates, and filter for emphasis program co-designations that signal OSHA focus areas. Narratives provide the qualitative detail behind the statistics.
Regulatory compliance monitoring
Compliance teams at multi-site employers can monitor their own OSHA accident record and compare it against industry peers, tracking whether emphasis programs active at their sites align with sectors OSHA is currently targeting for enforcement.
How to search OSHA accident data
- Enter your search criteria — provide any combination of company name (e.g., "Apex Chemical"), state abbreviation (e.g., "TX"), NAICS code prefix (e.g., "23" for all construction), and date range. Leave all filters blank to retrieve the most recent cases across all industries.
- Configure severity and data depth — set `fatalitiesOnly` or `hospitalizationsOnly` to focus on high-severity cases. Leave `includeNarratives` and `includeInjuryDetails` enabled (the defaults) to get the full enriched record. Enable `includeRelatedInspections` only if you need related inspections, emphasis program data, or general duty citations.
- Run the actor — click "Start". A run retrieving 100 cases with narratives and injury details typically completes in 2–4 minutes. Runs retrieving 1,000 cases with full enrichment take approximately 15–25 minutes depending on DOL API response times.
- Download results — open the Dataset tab, then export as JSON, CSV, or Excel. Records are sorted by risk score descending so the most critical cases are first.
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `apiKey` | string | No | — | DOL Open Data Portal API key. Free registration at dataportal.dol.gov. Without a key, dry run mode activates. |
| `dryRun` | boolean | No | true | Return sample data without calling the API. Automatically set to false when `apiKey` is provided. |
| `companyName` | string | No | — | Partial employer name match (case-insensitive). e.g., "Summit Construction" |
| `state` | string | No | — | Two-letter state abbreviation. e.g., "TX", "CA", "NY" |
| `city` | string | No | — | Partial city name match. e.g., "Houston", "Detroit" |
| `naicsCode` | string | No | — | NAICS prefix match. e.g., "23" (construction), "31" (manufacturing), "48" (transportation) |
| `dateFrom` | string | No | — | Earliest event date in YYYY-MM-DD format. e.g., "2023-01-01" |
| `dateTo` | string | No | — | Latest event date in YYYY-MM-DD format. e.g., "2024-12-31" |
| `fatalitiesOnly` | boolean | No | false | Restrict results to cases where at least one fatality occurred |
| `hospitalizationsOnly` | boolean | No | false | Restrict results to cases involving at least one hospitalization. Ignored if `fatalitiesOnly` is true. |
| `keyword` | string | No | — | Event keyword match. Common values: Fall, Chemical, Struck, Caught, Electrocution, Explosion, Amputation |
| `includeNarratives` | boolean | No | true | Join the incident narrative text from `accident_abstract` |
| `includeInjuryDetails` | boolean | No | true | Join per-victim injury records from `accident_injury` |
| `includeRelatedInspections` | boolean | No | false | Join related inspections, emphasis programs, and general duty citations from three additional datasets |
| `maxResults` | integer | No | 100 | Maximum accident cases to return (1–5,000) |
| `maxInjuriesPerAccident` | integer | No | 50 | Maximum injury records fetched per case (1–200). Increase for mass casualty events. |
Input examples
Construction fatalities in Texas (past 2 years):
{
  "apiKey": "YOUR_DOL_API_KEY",
  "dryRun": false,
  "state": "TX",
  "naicsCode": "23",
  "dateFrom": "2023-01-01",
  "fatalitiesOnly": true,
  "includeNarratives": true,
  "includeInjuryDetails": true,
  "maxResults": 200
}
Full enrichment for a specific employer (with emphasis programs):
{
  "apiKey": "YOUR_DOL_API_KEY",
  "dryRun": false,
  "companyName": "Apex Chemical",
  "includeNarratives": true,
  "includeInjuryDetails": true,
  "includeRelatedInspections": true,
  "maxResults": 500,
  "maxInjuriesPerAccident": 100
}
Quick keyword scan without enrichment (fastest run):
{
  "apiKey": "YOUR_DOL_API_KEY",
  "dryRun": false,
  "keyword": "Electrocution",
  "includeNarratives": false,
  "includeInjuryDetails": false,
  "maxResults": 50
}
Input tips
- Start with the dry run — the default settings return two realistic sample records showing the full output schema before you commit an API key.
- Use NAICS prefixes for broad industry sweeps — `"23"` matches all 6-digit codes starting with 23, covering all construction subsectors.
- Disable enrichment for large date-range scans — set `includeNarratives` and `includeInjuryDetails` to `false` to retrieve 1,000+ accident headers quickly, then re-run targeted queries on high-risk cases.
- Enable `includeRelatedInspections` for compliance research — emphasis programs and general duty citations add significant context for legal and regulatory work but increase run time by 30–50%.
- Combine filters — all conditions use AND logic, so `state="CA"` plus `naicsCode="31"` plus `fatalitiesOnly=true` returns only fatal manufacturing incidents in California.
Output example
{
"summaryNumber": "202456789",
"eventDate": "2024-11-03",
"eventDescription": "Chemical release in manufacturing plant",
"eventKeyword": "Chemical",
"employerName": "APEX CHEMICAL PROCESSING INC",
"siteAddress": "5500 REFINERY RD",
"siteCity": "BAYTOWN",
"siteState": "TX",
"siteZip": "77520",
"naicsCode": "325199",
"sicCode": null,
"isFatality": false,
"isHospitalization": true,
"isAmputation": false,
"numberOfFatalities": 0,
"numberOfHospitalizations": 4,
"numberOfInjuries": 7,
"activityNumber": "1690567",
"narrative": "A reactor vessel pressure relief valve failed during a batch chemical reaction, releasing a cloud of hydrogen chloride gas into the processing area. Seven employees were exposed to the gas, four of whom required hospitalization for chemical burns to the respiratory tract and skin. The remaining three employees were treated and released from the emergency department.",
"injuries": [
{
"age": 42,
"sex": "M",
"degreeOfInjury": "2",
"degreeDescription": "Hospitalized injuries",
"natureOfInjury": "Chemical burns",
"partOfBody": "Respiratory system",
"sourceOfInjury": "Chemicals",
"eventType": "Exposure to harmful substances",
"environmentFactor": "Pressure vessel failure",
"humanFactor": null,
"occupation": "Chemical plant operator",
"taskAssigned": "Reactor monitoring",
"hazardousSubstance": "Hydrogen chloride",
"constructionOperation": null,
"fallDistance": null,
"fallHeight": null
},
{
"age": 28,
"sex": "F",
"degreeOfInjury": "2",
"degreeDescription": "Hospitalized injuries",
"natureOfInjury": "Chemical burns",
"partOfBody": "Skin",
"sourceOfInjury": "Chemicals",
"eventType": "Exposure to harmful substances",
"environmentFactor": "Pressure vessel failure",
"humanFactor": null,
"occupation": "Chemical technician",
"taskAssigned": "Quality sampling",
"hazardousSubstance": "Hydrogen chloride",
"constructionOperation": null,
"fallDistance": null,
"fallHeight": null
}
],
"injurySummary": {
"total": 2,
"fatalities": 0,
"hospitalizations": 2,
"amputations": 0,
"averageAge": 35,
"maleCount": 1,
"femaleCount": 1,
"topNatureOfInjury": "Chemical burns",
"topBodyPart": "Respiratory system",
"topSource": "Chemicals",
"fallRelated": 0
},
"relatedInspections": null,
"emphasisPrograms": [
{ "programType": "N", "programValue": "PSM Covered Chemical Facilities" },
{ "programType": "N", "programValue": "Chemical Exposure Health Hazards" }
],
"generalDutyCitations": [
"Employer did not ensure pressure relief valves were inspected and maintained per manufacturer specifications."
],
"riskScore": 95,
"riskLevel": "Critical",
"riskFactors": [
"4 hospitalization(s) (+60)",
"7 injuries — mass casualty event (+20)",
"2 emphasis program(s) active (+10)",
"General duty standard (5a1) citation (+10)"
],
"extractedAt": "2026-03-20T14:22:10.000Z"
}
Output fields
| Field | Type | Description |
|---|---|---|
| `summaryNumber` | string | OSHA accident summary number (primary key) |
| `eventDate` | string \| null | Date of the incident (YYYY-MM-DD) |
| `eventDescription` | string \| null | Short description of the event |
| `eventKeyword` | string \| null | Standardized event keyword (Fall, Chemical, Struck, etc.) |
| `employerName` | string | Establishment name from OSHA records |
| `siteAddress` | string \| null | Street address of the incident site |
| `siteCity` | string \| null | City |
| `siteState` | string \| null | Two-letter state abbreviation |
| `siteZip` | string \| null | ZIP code |
| `naicsCode` | string \| null | 6-digit NAICS industry code |
| `sicCode` | string \| null | Legacy SIC code (older records) |
| `isFatality` | boolean | True if at least one fatality occurred |
| `isHospitalization` | boolean | True if at least one hospitalization occurred |
| `isAmputation` | boolean | True if an amputation was involved |
| `numberOfFatalities` | number | Count of fatalities |
| `numberOfHospitalizations` | number | Count of hospitalizations |
| `numberOfInjuries` | number | Total injury count |
| `activityNumber` | string \| null | OSHA inspection activity number (join key for related datasets) |
| `narrative` | string \| null | Full-text incident description from `accident_abstract` |
| `injuries[]` | array \| null | Per-victim injury records. Each record contains: `age`, `sex`, `degreeOfInjury`, `degreeDescription`, `natureOfInjury`, `partOfBody`, `sourceOfInjury`, `eventType`, `environmentFactor`, `humanFactor`, `occupation`, `taskAssigned`, `hazardousSubstance`, `constructionOperation`, `fallDistance`, `fallHeight`. |
| `injurySummary.total` | number | Total injury records in this case |
| `injurySummary.fatalities` | number | Fatality count from injury records |
| `injurySummary.hospitalizations` | number | Hospitalization count from injury records |
| `injurySummary.amputations` | number | Amputation count from injury records |
| `injurySummary.averageAge` | number \| null | Mean age of injured workers |
| `injurySummary.maleCount` | number | Number of male workers injured |
| `injurySummary.femaleCount` | number | Number of female workers injured |
| `injurySummary.topNatureOfInjury` | string \| null | Most frequent injury type in the case |
| `injurySummary.topBodyPart` | string \| null | Most frequently affected body part |
| `injurySummary.topSource` | string \| null | Most frequent injury source |
| `injurySummary.fallRelated` | number | Count of injuries with fall distance or fall height data |
| `relatedInspections[]` | array \| null | Linked inspection activities (requires `includeRelatedInspections`) |
| `emphasisPrograms[]` | array \| null | Active OSHA emphasis program designations |
| `emphasisPrograms[].programType` | string | Program type code |
| `emphasisPrograms[].programValue` | string | Program name (Falls, Trenching, PSM, etc.) |
| `generalDutyCitations[]` | array \| null | Section 5(a)(1) general duty standard citation text |
| `riskScore` | number | Composite risk score (0–100+) |
| `riskLevel` | string | Critical (80+), High (50–79), Medium (25–49), Low (<25) |
| `riskFactors` | string[] | Explanation of each scoring component with point values |
| `extractedAt` | string | ISO 8601 timestamp when the record was assembled |
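Because `injuries[]` is nested and may be `null`, spreadsheet-style analysis usually needs one row per injured worker. The sketch below shows one possible way to flatten a record; `victim_rows` is a hypothetical helper written for this example, not part of the actor, and the choice of case-level columns to repeat on each row is an assumption.

```python
def victim_rows(record):
    """Yield one flat dict per injured worker, with case-level fields repeated."""
    case = {
        "summaryNumber": record["summaryNumber"],
        "employerName": record.get("employerName"),
        "eventDate": record.get("eventDate"),
        "riskScore": record.get("riskScore"),
    }
    # injuries may be null when no accident_injury records matched the case.
    for injury in record.get("injuries") or []:
        yield {**case, **injury}


record = {
    "summaryNumber": "202456789",
    "employerName": "APEX CHEMICAL PROCESSING INC",
    "eventDate": "2024-11-03",
    "riskScore": 95,
    "injuries": [
        {"age": 42, "sex": "M", "partOfBody": "Respiratory system"},
        {"age": 28, "sex": "F", "partOfBody": "Skin"},
    ],
}
rows = list(victim_rows(record))
print(len(rows), rows[0]["employerName"], rows[1]["partOfBody"])
```

Cases without injury records produce no rows, so keep the case-level dataset alongside the flattened one if you need to count every investigation.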
How much does it cost to search OSHA accident data?
OSHA Accident & Injury Intelligence uses pay-per-result pricing — you pay $0.005 per accident case returned. Platform compute costs are included.
| Scenario | Cases | Cost per case | Total cost |
|---|---|---|---|
| Quick test | 10 | $0.005 | $0.05 |
| Small batch | 100 | $0.005 | $0.50 |
| Medium batch | 500 | $0.005 | $2.50 |
| Large batch | 1,000 | $0.005 | $5.00 |
| Maximum batch | 5,000 | $0.005 | $25.00 |
You can set a maximum spending limit per run to control costs. The actor stops when your budget is reached.
The DOL API key required to run the actor is free. Compare this to commercial occupational safety databases that charge $2,000–10,000 per year for similar OSHA investigation data access. Most users running weekly monitoring queries for a specific industry or state spend under $5 per month.
Search OSHA accident data using the API
Python
from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish
run = client.actor("ryanclinton/osha-accident-intel").call(run_input={
    "apiKey": "YOUR_DOL_API_KEY",
    "dryRun": False,
    "state": "TX",
    "naicsCode": "23",
    "dateFrom": "2023-01-01",
    "fatalitiesOnly": True,
    "includeNarratives": True,
    "includeInjuryDetails": True,
    "maxResults": 200
})

# Print one summary line per case, plus the start of the narrative if present
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['employerName']} ({item['siteState']}) — {item['eventDate']} — Risk: {item['riskLevel']} ({item['riskScore']})")
    if item.get("narrative"):
        print(f"  {item['narrative'][:120]}...")
JavaScript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/osha-accident-intel").call({
    apiKey: "YOUR_DOL_API_KEY",
    dryRun: false,
    state: "TX",
    naicsCode: "23",
    dateFrom: "2023-01-01",
    fatalitiesOnly: true,
    includeNarratives: true,
    includeInjuryDetails: true,
    maxResults: 200
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    console.log(`${item.employerName} (${item.siteState}) — ${item.eventDate} — Risk: ${item.riskLevel} (${item.riskScore})`);
    const topInjury = item.injurySummary?.topNatureOfInjury;
    if (topInjury) console.log(`  Top injury: ${topInjury}`);
}
cURL
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~osha-accident-intel/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "apiKey": "YOUR_DOL_API_KEY",
    "dryRun": false,
    "state": "TX",
    "naicsCode": "23",
    "fatalitiesOnly": true,
    "includeNarratives": true,
    "includeInjuryDetails": true,
    "maxResults": 200
  }'
# Fetch results (replace DATASET_ID from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
How OSHA Accident & Injury Intelligence works
Phase 1 — Accident case retrieval
The actor queries the OSHA/accident endpoint at https://apiprod.dol.gov/v4/get/OSHA/accident/json using a dynamically built filter object. Each filter condition maps to a DOL API operator: eq for exact state matches, like for partial employer name and city matches, gte/lte for date range bounds, and in for batch lookups. Multiple conditions are combined using the DOL API's and compound operator. Results are retrieved in pages of 1,000 records with a 500ms inter-page delay, sorted by event_date descending so the most recent incidents come first.
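As a rough illustration of Phase 1, the Python sketch below builds a compound filter and the page windows for a run. It is not the actor's source: the `field`/`operator`/`value` object shape, the `state` and `estab_name` column names, and both helper names are assumptions made for this example, so check the DOL API documentation before relying on them. Only the operator names, `event_date`, and the 1,000-record page size come from the description above.

```python
def build_filter(state=None, company_name=None, date_from=None, date_to=None):
    """Combine whichever criteria were supplied into a single filter object."""
    conditions = []
    if state:
        # Exact match on the two-letter state code (column name is assumed).
        conditions.append({"field": "state", "operator": "eq", "value": state.upper()})
    if company_name:
        # Partial match; input is uppercased because names are stored in uppercase.
        conditions.append({"field": "estab_name", "operator": "like", "value": company_name.upper()})
    if date_from:
        conditions.append({"field": "event_date", "operator": "gte", "value": date_from})
    if date_to:
        conditions.append({"field": "event_date", "operator": "lte", "value": date_to})
    if not conditions:
        return None
    # A single condition needs no wrapper; several are joined with "and".
    return conditions[0] if len(conditions) == 1 else {"and": conditions}


def page_windows(max_results, page_size=1000):
    """Return (offset, limit) pairs that together cover max_results records."""
    return [
        (offset, min(page_size, max_results - offset))
        for offset in range(0, max_results, page_size)
    ]


print(build_filter(state="tx", date_from="2023-01-01"))
print(page_windows(2500))
```

A run capped at 2,500 cases therefore needs three page requests, the last one asking for only 500 records.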
Phase 2 — Parallel dataset enrichment
After collecting accident records, the actor extracts two join key sets: summary_nr values (shared across abstract and injury datasets) and activity_nr values (shared across related activity, emphasis codes, and general duty datasets). Enrichment fetches are batched in groups of 50 using the DOL in filter operator. Abstract, injury, and related fetches are staggered by 1-second increments and run concurrently via Promise.allSettled so a failure in one dataset does not block the others.
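The batching and failure isolation described in Phase 2 can be sketched in Python as follows. The actor itself uses `Promise.allSettled`; `asyncio.gather(..., return_exceptions=True)` is used here as the closest Python analogue. The helper names and the stand-in fetchers are hypothetical, and the 1-second stagger between fetches is left out for brevity.

```python
import asyncio


def chunked(keys, size=50):
    """Split join keys into batches small enough for one `in` filter each."""
    keys = list(keys)
    return [keys[i:i + size] for i in range(0, len(keys), size)]


async def settle_all(*coroutines):
    """Run fetches concurrently; a failed fetch yields None instead of raising."""
    results = await asyncio.gather(*coroutines, return_exceptions=True)
    return [None if isinstance(r, Exception) else r for r in results]


async def _demo():
    # Stand-ins for real dataset fetches: one succeeds, one fails.
    async def fetch_abstracts():
        return {"202456789": "narrative text"}

    async def fetch_injuries():
        raise TimeoutError("simulated DOL API failure")

    return await settle_all(fetch_abstracts(), fetch_injuries())


batches = chunked(range(120))
demo_result = asyncio.run(_demo())
print([len(b) for b in batches], demo_result)
```

With this shape, a record whose injury fetch failed can still be emitted with its narrative, which is the behavior the phase description implies.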
Phase 3 — Risk scoring and record assembly
For each accident, the actor joins enrichment maps by key, transforms raw injury records through a DEGREE_MAP lookup (numeric codes to human-readable labels), and aggregates the injury summary with frequency distributions of nature-of-injury, body part, and source. The composite risk score accumulates from 9 factors: 40 per fatality, 15 per hospitalization (capped at 60), 25 for amputations, 10 for 2–5 injuries, 20 for 6+ injuries, 15 for falls over 20 feet, 5 per emphasis program (capped at 15), 10 for general duty citations, and 20 for minor workers. The final dataset is sorted by risk score descending.
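The scoring rules in Phase 3 translate fairly directly into code. The Python sketch below applies the stated weights and tier thresholds; it is an illustration rather than the actor's implementation. The function names, the wording of the factor labels, and the treatment of "falls over 20 feet" as strictly greater than 20 are assumptions, and how the actor decides that a minor worker was involved is not specified, so it is passed in as a flag.

```python
def score_case(fatalities=0, hospitalizations=0, amputation=False, injuries=0,
               max_fall_feet=None, emphasis_programs=0, general_duty=False,
               minor_involved=False):
    """Return (score, factors) using the weights listed in Phase 3."""
    parts = [
        (40 * fatalities, f"{fatalities} fatality(ies)"),
        (min(15 * hospitalizations, 60), f"{hospitalizations} hospitalization(s)"),
        (25 if amputation else 0, "amputation"),
        # 2-5 injuries and 6+ injuries are mutually exclusive bands.
        (20 if injuries >= 6 else 10 if injuries >= 2 else 0, f"{injuries} injuries"),
        (15 if max_fall_feet is not None and max_fall_feet > 20 else 0, "fall over 20 feet"),
        (min(5 * emphasis_programs, 15), f"{emphasis_programs} emphasis program(s)"),
        (10 if general_duty else 0, "general duty citation"),
        (20 if minor_involved else 0, "minor worker"),
    ]
    factors = [f"{label} (+{points})" for points, label in parts if points]
    return sum(points for points, _ in parts), factors


def risk_level(score):
    """Map a score onto the four documented tiers."""
    if score >= 80:
        return "Critical"
    if score >= 50:
        return "High"
    if score >= 25:
        return "Medium"
    return "Low"


score, factors = score_case(fatalities=1, max_fall_feet=35, emphasis_programs=1)
print(score, risk_level(score), factors)
```

Because fatalities are uncapped, two or more deaths alone push a case into the Critical tier, which is why the score range is described as 0 to 100+.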
Phase 4 — Error handling and rate limiting
All DOL API calls go through a dolFetch function implementing 4-attempt exponential backoff: wait times of 2, 4, 8, and 16 seconds on consecutive failures. HTTP 429 (rate limit) responses trigger the same backoff. HTTP 401 and 403 responses log a clear API key error and halt immediately. HTTP 204 (no content) and empty response bodies return null gracefully. A 30-second per-request timeout prevents stalled fetches from blocking runs indefinitely.
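A minimal Python sketch of this retry policy is shown below. It is not the actor's `dolFetch`: `send` is a hypothetical callable returning a `(status, body)` pair, `AuthError` is a name invented for the example, and the sketch assumes the four delays apply to four retries after the initial request, which the description above does not state explicitly. The 30-second per-request timeout would live inside `send` and is omitted.

```python
import time


class AuthError(Exception):
    """Raised for HTTP 401/403, where retrying cannot help."""


def fetch_with_backoff(send, delays=(2, 4, 8, 16), sleep=time.sleep):
    """Call send() until it succeeds or the retry delays are used up."""
    for delay in (*delays, None):
        try:
            status, body = send()
        except OSError:
            status, body = None, None  # treat network errors as transient
        if status in (401, 403):
            raise AuthError("DOL API key rejected; check the apiKey input")
        if status == 204 or (status == 200 and not body):
            return None  # no content is a valid, empty answer
        if status == 200:
            return body
        if delay is None:
            raise RuntimeError("request failed after all retries")
        sleep(delay)  # 429 and other failures share the same backoff


# Simulated responses: rate limited, server error, then success.
responses = iter([(429, None), (500, None), (200, "ok")])
waits = []
result = fetch_with_backoff(lambda: next(responses), sleep=waits.append)
print(result, waits)
```

Injecting `sleep` keeps the example fast to run and makes the delay sequence easy to inspect.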
Tips for best results
- Get your free DOL API key first. Register at dataportal.dol.gov/registration — it takes under 2 minutes and is required for real data. The dry run mode is only useful for inspecting the output schema.
- Filter by NAICS prefix to target an industry. The two-digit prefix `"23"` covers all construction, `"31"` through `"33"` cover manufacturing, and `"48"` and `"49"` cover transportation and warehousing. More specific 4- or 6-digit codes narrow to a sub-industry.
- Disable enrichment for exploratory scans. If you want to understand the volume of accidents for a filter before committing to a full enriched run, set `includeNarratives=false` and `includeInjuryDetails=false`. The base accident record still includes fatality and hospitalization counts, and risk scoring works from the raw accident data.
- Enable `includeRelatedInspections` for compliance and legal use cases. Emphasis programs and general duty citations add 3 additional API dataset fetches but provide the context needed to understand whether the employer is under active OSHA scrutiny.
- Increase `maxInjuriesPerAccident` for mass casualty research. The default of 50 covers the vast majority of cases. If you are specifically studying incidents with high victim counts (chemical releases, building collapses), set this to 100 or 200.
- Schedule monthly runs with a narrow date range. A monthly run with `dateFrom` set to 30 days prior provides a rolling feed of new accident investigations, suitable for continuous monitoring dashboards.
- Sort and filter by `riskLevel` in your downstream tool. All records arrive sorted by `riskScore` descending — filter to `riskLevel == "Critical"` or `riskLevel == "High"` in Excel, Google Sheets, or your database to focus on cases that warrant immediate attention.
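The last two tips can be combined in a small Python sketch: one helper builds the input for a rolling 30-day run, and another keeps only the cases that warrant immediate attention. Both helper names are hypothetical; the input and output field names are the ones documented above.

```python
from datetime import date, timedelta


def rolling_input(api_key, today=None, days=30, **filters):
    """Build actor input covering the last `days` days up to `today`."""
    today = today or date.today()
    return {
        "apiKey": api_key,
        "dryRun": False,
        "dateFrom": (today - timedelta(days=days)).isoformat(),
        "dateTo": today.isoformat(),
        **filters,  # e.g. state="TX", naicsCode="23"
    }


def urgent_cases(items, levels=("Critical", "High")):
    """Keep only records whose riskLevel is in the given tiers."""
    return [item for item in items if item.get("riskLevel") in levels]


run_input = rolling_input("YOUR_DOL_API_KEY", today=date(2024, 3, 31), state="TX")
sample_items = [
    {"summaryNumber": "1", "riskLevel": "Critical"},
    {"summaryNumber": "2", "riskLevel": "Low"},
    {"summaryNumber": "3", "riskLevel": "High"},
]
print(run_input["dateFrom"], [i["summaryNumber"] for i in urgent_cases(sample_items)])
```

The resulting dictionary can be passed as `run_input` to the Apify client call shown in the API section, and `urgent_cases` can be applied to the items read back from the run's dataset.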
Combine with other Apify actors
| Actor | How to combine |
|---|---|
| Company Deep Research | After identifying high-risk employers from OSHA data, run deep research to get company financials, leadership, and corporate structure for a complete due diligence profile |
| Website Contact Scraper | Extract safety officer and HR contact details from high-risk employer websites identified through OSHA accident searches |
| B2B Lead Qualifier | Score employers flagged in OSHA data as leads for safety consulting, training, or compliance software services |
| Trustpilot Review Analyzer | Correlate OSHA accident history with employee reviews to identify companies where safety culture issues appear in both regulatory records and worker sentiment |
| Website Tech Stack Detector | Identify the EHS software platforms used by high-risk employers to target them with safety technology alternatives |
| WHOIS Domain Lookup | Verify employer domain ownership when building outreach lists from OSHA accident data |
| HubSpot Lead Pusher | Push high-risk employers directly into HubSpot as leads for safety consulting or insurance sales workflows |
Limitations
- Data freshness — OSHA investigation data is published on a delay of weeks to months after the incident date. This actor reflects what is available in the DOL public API, not real-time incident reports. Do not use this for same-day incident monitoring.
- Narrative availability — Not all accident records have an associated abstract. `narrative` will be `null` for cases where OSHA has not yet published the investigation text or where no abstract was filed.
- Injury record completeness — The DOL API may return partial injury records for some cases, particularly older incidents. Fields such as `occupation`, `environmentFactor`, and `hazardousSubstance` are frequently null for records before 2015.
- NAICS code gaps — Approximately 5–10% of accident records in the DOL database have null or malformed NAICS codes, meaning a NAICS filter will not capture 100% of incidents in a target industry.
- Rate limiting at scale — Runs retrieving 5,000 cases with full enrichment make several hundred API calls. The exponential backoff handles occasional rate limits, but sustained heavy usage may encounter DOL API throttling that extends run times significantly.
- No real-time streaming — Results are delivered as a batch dataset at run completion, not as a streaming feed. For near-real-time monitoring, schedule frequent short runs rather than one long continuous run.
- Emphasis program and general duty data requires `includeRelatedInspections=true` — These fields are null by default. The extra API calls are worth enabling for compliance and legal research but add meaningful time to large runs.
- Activity number linkage is not universal — Some accident records have a null `activity_nr`, which means related inspection, emphasis, and general duty data cannot be joined for those cases even with `includeRelatedInspections` enabled.
Integrations
- Zapier — trigger OSHA accident data runs on a schedule and push Critical-risk cases to Slack, email, or a project management tool automatically
- Make — build multi-step safety monitoring workflows: run OSHA queries, filter by risk level, and route results to different teams or systems
- Google Sheets — export accident case records to a live spreadsheet for safety dashboards, underwriting models, or compliance reporting
- Apify API — integrate OSHA data pulls into internal risk management platforms, EHS software, or insurance rating engines
- Webhooks — receive a notification with a dataset link the moment a run completes, for automated pipeline triggers
- LangChain / LlamaIndex — feed incident narratives and risk-scored records into LLM pipelines for automated safety report generation or hazard pattern analysis
Troubleshooting
- No results returned despite filters matching known incidents — The DOL API uses case-insensitive `LIKE` matching on employer names and cities, but the database stores names in uppercase. The actor uppercases your input automatically. If results are still empty, try a shorter partial name (e.g., `"Summit"` instead of `"Summit Construction LLC"`) or remove one filter at a time to identify which constraint is too narrow.
- Run returns sample data only, not real records — This happens when no `apiKey` is provided or when `dryRun` is set to `true`. Provide your DOL API key in the `apiKey` field and ensure `dryRun` is either omitted or set to `false`.
- Run takes longer than expected — Large `maxResults` values (1,000+) combined with all enrichment options enabled generate hundreds of staggered API calls. Reduce `maxResults`, disable `includeRelatedInspections`, or split your date range into smaller windows and run multiple focused queries.
- `injuries` array is null despite the case having victims — The case was retrieved successfully but no matching records were found in `accident_injury` for that `summary_nr`. This is expected for older records and for cases where injury data has not yet been published. The `numberOfInjuries` field from the base accident record still reflects the count.
- HTTP 401 or 403 error in run logs — Your DOL API key is invalid, expired, or not yet activated. Re-register at dataportal.dol.gov/registration and allow a few minutes for the key to become active after registration.
Responsible use
- This actor only accesses publicly available U.S. Department of Labor OSHA accident investigation data published under the DOL Open Data Portal.
- The data reflects official regulatory records. Do not represent extracted data as real-time incident reports or as the complete universe of workplace accidents — only OSHA-investigated cases are included.
- Comply with applicable laws when using employer-level safety data for commercial purposes, including adverse action restrictions in credit and employment contexts.
- Do not use this data to publicly defame employers based on historical investigations without verification against the official OSHA website.
- For guidance on data use and public records, see Apify's guide on web scraping legality.
FAQ
How many OSHA accident records can I retrieve in one run?
The actor supports up to 5,000 accident case records per run. The DOL database contains hundreds of thousands of historical investigations going back several decades. Use date range and NAICS filters to scope your query to the subset most relevant to your analysis.
Does OSHA accident data include the full incident narrative for every case?
Not always. Narratives come from the OSHA/accident_abstract dataset, which is only populated when OSHA publishes the investigation text. Approximately 60–70% of records in recent years have narratives. Older records have lower coverage. The narrative field will be null when no abstract is available.
What OSHA accident keywords can I filter by?
The most common event keywords in the database are: Fall, Chemical, Struck, Caught, Electrocution, Explosion, Collapse, Fire, Drowning, and Amputation. The keyword filter uses a partial match, so "Fall" also captures "Fall from ladder" and similar variants. Keywords are stored in uppercase in the DOL database; the actor normalizes your input automatically.
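The matching behavior described above (uppercase normalization plus partial match) can be approximated locally when post-filtering exported results. This is only a sketch under the assumption that the match is a plain substring test; the actor's actual filter may differ, and `keyword_matches` is a hypothetical helper.

```python
def keyword_matches(user_keyword: str, stored_keywords: str) -> bool:
    """Case-insensitive partial match, mirroring the described filter."""
    needle = user_keyword.strip().upper()
    # An empty keyword matches nothing rather than everything.
    return bool(needle) and needle in stored_keywords.upper()


print(keyword_matches("Fall", "FALL FROM LADDER"))
print(keyword_matches("fall", "STRUCK BY VEHICLE"))
```

A substring test is deliberately broad: it would also match a keyword embedded in a longer word, so narrow terms give cleaner results.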
How accurate is the risk score?
The risk score is a composite heuristic based on objective OSHA data fields — fatality and hospitalization counts, fall heights, emphasis program designations, and citation types. It is useful for prioritizing cases for review, not for making regulatory or legal determinations. Cases with incomplete injury data (null injuries array) score only from the base accident record fields, which may undercount the true severity.
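Because records with a null injuries array are scored from base accident fields only, it can help to separate them before ranking by score. The sketch below assumes each result is a dict with the `injuries` and `numberOfInjuries` fields named in this documentation; the helper name and the sample records are made up for illustration.

```python
def split_by_injury_coverage(records):
    """Separate records with injury detail from those scored on base fields only."""
    complete, base_only = [], []
    for record in records:
        if record.get("injuries") is None:
            base_only.append(record)  # score may undercount true severity
        else:
            complete.append(record)
    return complete, base_only


# Illustrative records, not real OSHA data.
sample = [
    {"numberOfInjuries": 2, "injuries": None},
    {"numberOfInjuries": 1, "injuries": [{"degree": "Hospitalized"}]},
]
complete, base_only = split_by_injury_coverage(sample)
print(len(complete), len(base_only))
```

Records in `base_only` are candidates for manual review, since their `numberOfInjuries` count is present but the per-victim detail that feeds the score is not.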
How is this different from searching the OSHA website directly?
The OSHA website provides case-by-case lookup and limited filtering. This actor retrieves up to 5,000 cases per run with filters across 9 dimensions, joins 7 related datasets, computes risk scores, and exports structured JSON/CSV. It takes a manual researcher hours to compile what this actor returns in minutes.
Can I search OSHA accident data for a specific company by name?
Yes. Use the companyName field with a partial match. For example, "Apex Chemical" will match APEX CHEMICAL PROCESSING INC and any other establishment with those words in the name. If a company operates under multiple names or subsidiaries, run separate queries for each name variant.
Is it legal to use OSHA accident data for commercial purposes?
Yes. OSHA accident investigation data is published by the U.S. Department of Labor under the DOL Open Data Portal and is explicitly made available for public use. There are no licensing restrictions on commercial use. Standard data protection and anti-defamation considerations apply when publishing conclusions derived from this data.
Can I use this actor to monitor a competitor or supplier's safety record on a schedule?
Yes. Set up a scheduled run on Apify with companyName set to the employer you want to monitor and a rolling dateFrom 30 or 90 days in the past. You will receive a dataset of any new OSHA investigations involving that employer each time the schedule runs.
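If you trigger the monitoring run from your own script rather than a fixed schedule input, the rolling dateFrom can be computed at call time. The sketch below only builds the input object; `rolling_input` is a hypothetical helper, and the ISO `YYYY-MM-DD` date format is an assumption based on the event dates shown in the output.

```python
from datetime import date, timedelta
from typing import Optional


def rolling_input(company_name: str, lookback_days: int = 30,
                  today: Optional[date] = None) -> dict:
    """Build a run input whose dateFrom trails today by lookback_days."""
    if lookback_days < 0:
        raise ValueError("lookback_days must not be negative")
    today = today or date.today()
    return {
        "companyName": company_name,
        "dateFrom": (today - timedelta(days=lookback_days)).isoformat(),
    }


# A fixed "today" keeps the example reproducible.
print(rolling_input("Apex Chemical", 30, today=date(2024, 9, 15)))
```

Passing the resulting dict as the run input on each scheduled trigger keeps the lookback window moving forward without editing the saved configuration.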
What is an OSHA emphasis program and why does it matter?
Emphasis programs are OSHA National or Regional Emphasis Programs (NEPs/REPs) that concentrate inspection resources on high-hazard industries or specific hazard types. An active emphasis program on an accident record means OSHA was already focused on that hazard type at that establishment or industry. Examples include the Falls NEP, the Trenching and Excavation NEP, and the PSM Covered Chemical Facilities NEP. Their presence is a strong signal of elevated regulatory risk.
How long does a typical run take?
A run retrieving 100 cases with narratives and injury details (default settings) takes approximately 2–4 minutes. A run at 1,000 cases with all enrichment options enabled takes 15–25 minutes. A run at 5,000 cases without enrichment takes 10–15 minutes. Run times extend when the DOL API is under load or when rate limiting triggers the exponential backoff.
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.