MSHA Mining Safety & Health Data
MSHA mining safety data for all 86,000+ US mines — coal and metal/non-metal — sourced directly from the Mine Safety and Health Administration via the DOL Open Data Portal. This actor searches mines by name, operator, state, or commodity and automatically joins violation citations, inspection records, accident histories, and penalty assessments into a single structured output per mine.
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| result-returned | Charged per result returned. Includes data transformation and structured output. | $0.03 |
Example: 100 events = $3.00 · 1,000 events = $30.00
Documentation
The output includes a weighted risk score (0-200+) that surfaces the most dangerous operations first. Analysts, insurers, ESG teams, and legal researchers use this to assess mine safety performance without manually querying five separate government databases. One run returns complete mine profiles with every S&S violation, withdrawal order, fatality, and MSHA inspection on record.
What data can you extract?
| Data Point | Source | Example |
|---|---|---|
| 📋 Mine ID and name | MSHA Mines dataset | 4601432 / EAGLE BUTTE MINE |
| 🏭 Operator and controller | MSHA Mines dataset | EAGLE SPECIALTY MATERIALS LLC |
| 📍 Location (state, county, coordinates) | MSHA Mines dataset | WY / Campbell / 44.21, -105.38 |
| ⚙️ Mine type and classification | MSHA Mines dataset | Coal / Surface |
| 👷 Employee count and hours worked | MSHA Mines dataset | 320 employees / 640,000 hrs/yr |
| ⚠️ Violations (citations, orders, safeguards) | MSHA Violations dataset | S&S = true, Penalty = $4,416 |
| 🔍 Inspections (type, violations found, hours) | MSHA Inspections dataset | Regular Safety/Health / 2 violations / 48 hrs |
| 🚑 Accidents (injuries, fatalities, narratives) | MSHA Accidents dataset | DAFW / Powered Haulage / 45 days lost |
| 💰 Proposed and paid penalties | MSHA Assessed Violations | $27,872 proposed / $19,110 paid |
| 📊 Violation summary by type and negligence | Computed | Citations: 8 / Orders: 1 / S&S: 6 / High: 2 |
| 📈 Inspection summary with trend data | Computed | 12 inspections / 24 violations found / 186 hrs |
| 🔴 Risk score and risk level | Computed | Score: 67 / Level: High |
Why use MSHA Mining Safety & Health Data?
Researching mine safety by hand means jumping between five separate DOL portal queries, exporting separate spreadsheets per dataset, manually matching records by mine ID, and building your own scoring logic from scratch. For a single mine with full history, that takes 2-3 hours. For a portfolio of 50 operations, it becomes a week-long project.
This actor automates the entire process. Provide a search filter — state, operator, commodity, or a specific mine ID — and get back structured JSON with violations, inspections, accidents, and a calculated risk score for every matching mine. Runs sort output by risk score descending, so the highest-risk operations appear first.
- Scheduling — run monthly to track how safety profiles change over time and flag mines crossing risk thresholds
- API access — trigger runs from Python, JavaScript, or any HTTP client for programmatic integration with your research pipeline
- Monitoring — get Slack or email alerts when runs produce unexpected results or when API connectivity changes
- Integrations — connect to Google Sheets, Zapier, Make, or webhooks for downstream workflows
- Structured output — every run produces clean JSON compatible with Excel, Power BI, Tableau, and any database
Features
- Five-dataset join: automatically correlates mines, violations, inspections, accidents, and assessed penalties by `mine_id` in a single run
- Weighted risk scoring: 12-factor algorithm scoring fatalities (+40 each), permanent disabilities (+20 each), withdrawal orders (+10 each, cap 80), S&S violations (+3 each, cap 60), high negligence (+8 each, cap 40), penalty thresholds, violation volume, contested counts, days lost, and violation-to-employee ratio
- Four risk tiers: Critical (≥100), High (≥60), Medium (≥30), Low (<30), with itemized risk factor breakdowns explaining every point
- 17 inspection type codes decoded: maps raw codes (AAA, E01, E13, etc.) to human-readable labels including Regular Safety/Health, Spot, Impact, Complaint, Hazard Complaint, and Imminent Danger
- 10 injury severity codes decoded: Fatality, Permanent Total or Partial Disability, DAFW, DART, Days Restricted, First Aid, and more
- Partial-match mine search: search by mine name, operator, or controller using `LIKE` matching against MSHA's 86,000+ mine registry
- Batch pagination: fetches related records in batches of 30 mine IDs per API call to respect DOL rate limits; up to 1,000 records per page
- Exponential backoff on 429: automatically retries with delays of 4s, 8s, 16s, 32s before failing gracefully
- 500ms request delay: staggered requests between dataset fetches prevent concurrent rate limiting across multi-mine searches
- Date range filtering: `dateFrom` / `dateTo` applied to violations, inspections, and accidents simultaneously to scope historical queries
- Flexible result caps: configurable `maxResults` (up to 5,000 mines) and per-mine limits for violations (up to 1,000), inspections (up to 500), and accidents (up to 500)
- Sort by employee count: when no filter is applied, mines are returned largest-first (by current employee count), then re-sorted by risk score after processing
- Dry run mode: returns realistic sample data without any API calls; safe for testing workflows and UI development
Use cases for MSHA mining safety data
Mining company due diligence
Private equity analysts and M&A advisors researching acquisition targets in the extractive industries need to quantify safety liability before closing. This actor returns complete violation and accident histories with penalty totals, so buyers can model regulatory risk and estimate remediation costs before bidding. Run the operator name filter across all states to surface every mine in a target's portfolio.
Insurance underwriting for mining operations
Workers' compensation and general liability underwriters need objective safety track records when quoting mining clients. The risk score — factoring in fatalities, withdrawal orders, S&S violations, and injury frequency — gives underwriters a reproducible numerical basis for rate adjustment. Pulling the last 36 months of accident data using dateFrom and dateTo narrows the analysis to the coverage period.
ESG risk assessment for mining investments
ESG analysts scoring mining companies on social and safety metrics require standardized, auditable data. This actor provides quantifiable inputs: fatality counts, lost workday totals, S&S violation rates, and penalty history. The violation-to-employee ratio and high-negligence breakdowns map directly to GRI 403 occupational health and safety indicators used in ESG reporting frameworks.
Investigative journalism and public interest research
Journalists covering mine safety, worker advocacy groups, and government watchdogs need quick access to enforcement histories for specific operations or regions. Search by state and commodity to map the worst-performing mines in a geographic area, then use accident narratives (included in the output) to identify patterns in injury causes, occupations, and equipment involved.
Legal research and litigation support
Attorneys handling MSHA citation defense or personal injury cases involving mining operations need comprehensive violation and penalty histories. The actor returns violationNumber, section, citationOrderSafeguard, negligence, proposedPenalty, contested status, and terminatedDate — all the fields needed to reconstruct an enforcement timeline for discovery or expert analysis.
Regulatory compliance monitoring
Mining safety officers and compliance managers can schedule monthly runs against their own operator name to track how their safety profile is trending. Alert on runs where riskLevel has moved from Medium to High, or where new withdrawal orders appear in the output, to trigger internal review before MSHA escalates.
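A minimal sketch of the alerting idea above, assuming you keep the dataset items from the previous scheduled run and compare them with the newest run. The function name `find_escalations` and the alert record shape are illustrative, not part of the actor; only `mineId`, `riskLevel`, and `violationSummary.orders` come from the documented output.

```python
# Tier order from the documented risk levels; a higher index is worse.
TIER_ORDER = ["Low", "Medium", "High", "Critical"]


def find_escalations(previous, current):
    """Compare two runs and list mines that look worse in the newer one."""
    prev_by_id = {mine["mineId"]: mine for mine in previous}
    alerts = []
    for mine in current:
        before = prev_by_id.get(mine["mineId"])
        if before is None:
            continue  # first time this mine appears; nothing to compare against
        tier_rose = TIER_ORDER.index(mine["riskLevel"]) > TIER_ORDER.index(before["riskLevel"])
        new_orders = mine["violationSummary"]["orders"] - before["violationSummary"]["orders"]
        if tier_rose or new_orders > 0:
            alerts.append({
                "mineId": mine["mineId"],
                "from": before["riskLevel"],
                "to": mine["riskLevel"],
                "newOrders": max(new_orders, 0),
            })
    return alerts
```

Order counts are only comparable between runs that use the same date filters and per-mine caps, so keep the scheduled input fixed.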
How to search MSHA mining safety data
- Register for a free DOL API key: go to dataportal.dol.gov/registration, create an account, and copy your key. The same key works for all DOL datasets including OSHA and WHD.
- Enter your search criteria: type a state code like `WV`, an operator name like `PEABODY`, a commodity like `Bituminous`, or paste a specific 7-digit MSHA mine ID. You can combine multiple filters.
- Click "Start" and wait: runs with 10-20 mines and default dataset limits typically complete in 3-5 minutes. Larger batches (100+ mines with full violation history) may take 20-30 minutes.
- Download results: export as JSON, CSV, or Excel from the Dataset tab. Each row is one mine with all related records nested inside.
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `apiKey` | string | No | - | DOL Open Data Portal API key (register free). If omitted with `dryRun: false`, the run will fail with a 401 error. |
| `dryRun` | boolean | No | `true` | Return realistic sample data without API calls. Set to `false` with a valid `apiKey` to fetch live data. |
| `mineId` | string | No | - | Exact 7-digit MSHA mine identification number, e.g., `4601432`. Takes precedence over all other filters. |
| `mineName` | string | No | - | Partial mine name match (case-insensitive), e.g., `EAGLE BUTTE` or `NORTH ANTELOPE`. |
| `operatorName` | string | No | - | Partial operator name match, e.g., `PEABODY`, `ARCH`, `CONSOL`. |
| `controllerName` | string | No | - | Partial controller / parent company name match. |
| `state` | string | No | - | Two-letter state code from dropdown (`WV`, `WY`, `PA`, `KY`, etc.). |
| `mineType` | string | No | - | `C` for Coal; `M` for Metal / Non-Metal. |
| `mineStatus` | string | No | - | `Active`, `Abandoned`, `Abandoned and Sealed`, `Intermittent`, `NonProducing`, `New Mine`, or `Temporarily Idled`. |
| `commodity` | string | No | - | Primary SIC description match, e.g., `Bituminous`, `Gold`, `Limestone`, `Sand and Gravel`. |
| `dateFrom` | string | No | - | Filter violations, inspections, and accidents from this date (YYYY-MM-DD). |
| `dateTo` | string | No | - | Filter violations, inspections, and accidents up to this date (YYYY-MM-DD). |
| `includeViolations` | boolean | No | `true` | Join MSHA violation citation records for each mine. |
| `includeInspections` | boolean | No | `true` | Join MSHA inspection records for each mine. |
| `includeAccidents` | boolean | No | `true` | Join MSHA accident and injury records for each mine. |
| `maxResults` | integer | No | `100` | Maximum number of mines to return (1–5,000). |
| `maxViolationsPerMine` | integer | No | `100` | Maximum violations to fetch per mine (1–1,000). |
| `maxInspectionsPerMine` | integer | No | `50` | Maximum inspections to fetch per mine (1–500). |
| `maxAccidentsPerMine` | integer | No | `50` | Maximum accidents to fetch per mine (1–500). |
Input examples
Find all active coal mines in West Virginia:
```json
{
  "apiKey": "YOUR_DOL_API_KEY",
  "dryRun": false,
  "state": "WV",
  "mineType": "C",
  "mineStatus": "Active",
  "maxResults": 200,
  "maxViolationsPerMine": 100,
  "maxInspectionsPerMine": 50,
  "maxAccidentsPerMine": 50
}
```
Full safety history for a specific mine:
```json
{
  "apiKey": "YOUR_DOL_API_KEY",
  "dryRun": false,
  "mineId": "4601432",
  "maxViolationsPerMine": 500,
  "maxInspectionsPerMine": 200,
  "maxAccidentsPerMine": 200
}
```
Operator portfolio audit with recent incidents only:
```json
{
  "apiKey": "YOUR_DOL_API_KEY",
  "dryRun": false,
  "operatorName": "PEABODY",
  "dateFrom": "2023-01-01",
  "dateTo": "2025-12-31",
  "includeViolations": true,
  "includeInspections": true,
  "includeAccidents": true,
  "maxResults": 100
}
```
Input tips
- Start with dry run: leave `dryRun: true` (the default) to verify your downstream pipeline handles the output format correctly before spending API quota.
- Use `mineId` for the fastest lookups: a direct 7-digit mine ID bypasses all filtering and returns exactly one mine at maximum speed.
- Narrow date ranges for large operators: searching `PEABODY` or `ARCH` without a date filter will fetch thousands of violation records. Use `dateFrom` to limit to a recent period first.
- Raise per-mine limits for litigation work: the default of 100 violations per mine may truncate long enforcement histories. Set `maxViolationsPerMine: 500` for comprehensive legal research.
- Disable unused datasets to reduce run time: set `includeInspections: false` and `includeAccidents: false` if you only need violation counts for a quick risk screen.
Output example
```json
{
  "mineId": "4601432",
  "mineName": "EAGLE BUTTE MINE",
  "mineType": "Coal",
  "mineTypeCode": "C",
  "mineClassification": "Surface",
  "operatorName": "EAGLE SPECIALTY MATERIALS LLC",
  "operatorId": "OP8823441",
  "controllerName": "EAGLE SPECIALTY MATERIALS LLC",
  "controllerId": "CT8823441",
  "state": "WY",
  "county": "CAMPBELL",
  "latitude": 44.21,
  "longitude": -105.38,
  "status": "Active",
  "statusCode": "Active",
  "statusDate": "2019-03-01",
  "employeeCount": 320,
  "averageEmployeeCount": 298,
  "hoursPerYear": 640000,
  "daysPerWeek": 7,
  "sicCode": "1221",
  "sicDescription": "Bituminous Coal and Lignite Surface Mining",
  "naicsCode": "212111",
  "primaryCommodity": "Bituminous",
  "secondaryCommodity": null,
  "district": "Western District",
  "congressionalDistrict": "WY-01",
  "nearestTown": "GILLETTE",
  "violations": [
    {
      "violationNumber": "8765432",
      "eventNumber": "E012341",
      "issuedDate": "2024-11-14",
      "section": "77.1607(b)",
      "sectionTitle": "Loading and haulage - moving equipment",
      "actionType": "Written",
      "citationOrderSafeguard": "Citation",
      "likelihood": "Reasonably Likely",
      "injuryIllness": "Lost Workdays or Restricted Duty",
      "personsAffected": 1,
      "negligence": "Moderate",
      "significantAndSubstantial": true,
      "proposedPenalty": 4416,
      "amountPaid": 3533,
      "violatorName": "EAGLE SPECIALTY MATERIALS LLC",
      "violatorType": "Operator",
      "contested": false,
      "terminatedDate": "2024-12-02",
      "terminationType": "Abated",
      "specialAssessment": false
    },
    {
      "violationNumber": "8765433",
      "eventNumber": "E012341",
      "issuedDate": "2024-11-14",
      "section": "77.1710(g)",
      "sectionTitle": "Protective clothing - eye protection",
      "actionType": "Written",
      "citationOrderSafeguard": "Order",
      "likelihood": "Highly Likely",
      "injuryIllness": "Fatal",
      "personsAffected": 2,
      "negligence": "High",
      "significantAndSubstantial": true,
      "proposedPenalty": 23456,
      "amountPaid": 0,
      "violatorName": "EAGLE SPECIALTY MATERIALS LLC",
      "violatorType": "Operator",
      "contested": true,
      "terminatedDate": null,
      "terminationType": null,
      "specialAssessment": true
    }
  ],
  "violationSummary": {
    "total": 2,
    "citations": 1,
    "orders": 1,
    "safeguards": 0,
    "significantAndSubstantial": 2,
    "totalProposedPenalties": 27872,
    "totalPaid": 3533,
    "contested": 1,
    "byNegligence": { "Moderate": 1, "High": 1 },
    "byLikelihood": { "Reasonably Likely": 1, "Highly Likely": 1 }
  },
  "inspections": [
    {
      "eventNumber": "E012341",
      "beginDate": "2024-11-10",
      "endDate": "2024-11-14",
      "inspectionType": "E01",
      "inspectionTypeDescription": "Regular Safety/Health",
      "violationsFound": 2,
      "totalProposedPenalties": 27872,
      "onSiteHours": 48,
      "operatorName": "EAGLE SPECIALTY MATERIALS LLC"
    }
  ],
  "inspectionSummary": {
    "total": 1,
    "byType": { "Regular Safety/Health": 1 },
    "totalViolationsFound": 2,
    "totalProposedPenalties": 27872,
    "totalOnSiteHours": 48,
    "mostRecentDate": "2024-11-10"
  },
  "accidents": [
    {
      "documentNumber": "ACC202406120012",
      "accidentDate": "2024-06-12",
      "accidentTime": "0930",
      "degreeOfInjury": "3",
      "degreeOfInjuryDescription": "Days Away From Work (DAFW)",
      "classification": "Powered Haulage",
      "accidentType": "Collision",
      "numberOfInjuries": 1,
      "occupation": "Truck Driver",
      "activity": "Traveling",
      "injurySource": "Vehicle / Mobile Equipment",
      "natureOfInjury": "Fractures",
      "bodyPartAffected": "Lower Extremities",
      "daysRestricted": 0,
      "daysLost": 45,
      "totalExperience": 8,
      "mineExperience": 3,
      "narrative": "Employee was operating a 240-ton haul truck on the main haul road when the truck collided with a parked water truck. Employee sustained fractures to the lower leg and was transported to hospital."
    }
  ],
  "accidentSummary": {
    "total": 1,
    "fatalities": 0,
    "permanentDisabilities": 0,
    "daysAwayFromWork": 1,
    "restrictedActivity": 0,
    "noLostTime": 0,
    "totalDaysLost": 45,
    "totalDaysRestricted": 0,
    "mostRecentDate": "2024-06-12"
  },
  "riskScore": 37,
  "riskLevel": "Medium",
  "riskFactors": [
    "1 withdrawal order(s) (+10)",
    "2 S&S violation(s) (+6)",
    "1 high negligence violation(s) (+8)",
    "45 total days lost from accidents (+15)"
  ],
  "extractedAt": "2025-03-21T14:22:00.000Z"
}
```
Output fields
| Field | Type | Description |
|---|---|---|
| `mineId` | string | 7-digit MSHA mine identification number |
| `mineName` | string | Current mine name as registered with MSHA |
| `mineType` | string | Coal or Metal/Non-Metal |
| `mineTypeCode` | string | Raw code: C or M |
| `mineClassification` | string | Surface, Underground, Facility, or SurfaceUnderground |
| `operatorName` | string | Current mine operator |
| `operatorId` | string | MSHA operator identification number |
| `controllerName` | string | Current controller / parent company |
| `controllerId` | string | MSHA controller identification number |
| `state` | string | Two-letter state abbreviation |
| `county` | string | FIPS county name |
| `latitude` | number \| null | Mine latitude (WGS84) |
| `longitude` | number \| null | Mine longitude (WGS84) |
| `status` | string | Current operational status |
| `statusCode` | string | Raw MSHA status code |
| `statusDate` | string \| null | Date status last changed (YYYY-MM-DD) |
| `employeeCount` | number | Current employee count |
| `averageEmployeeCount` | number \| null | Average employee count over time |
| `hoursPerYear` | number \| null | Total employee hours worked per year |
| `daysPerWeek` | number \| null | Operating days per week |
| `sicCode` | string \| null | SIC industry code |
| `sicDescription` | string \| null | SIC industry description |
| `naicsCode` | string \| null | NAICS industry code |
| `primaryCommodity` | string \| null | Primary commodity mined (e.g., Bituminous, Gold) |
| `secondaryCommodity` | string \| null | Secondary commodity if applicable |
| `district` | string \| null | MSHA district name |
| `congressionalDistrict` | string \| null | US Congressional district |
| `nearestTown` | string \| null | Nearest town to the mine |
| `violations[]` | array | Violation records (see below) |
| `violations[].violationNumber` | string | Unique MSHA violation number |
| `violations[].eventNumber` | string | Inspection event number linking to inspections |
| `violations[].issuedDate` | string \| null | Date citation or order was issued |
| `violations[].section` | string | CFR section cited (e.g., 77.1607(b)) |
| `violations[].sectionTitle` | string | Human-readable CFR section title |
| `violations[].citationOrderSafeguard` | string | Type: Citation, Order, or Safeguard |
| `violations[].likelihood` | string \| null | Likelihood of injury: No Likelihood, Unlikely, Reasonably Likely, Highly Likely |
| `violations[].negligence` | string \| null | Negligence level: None, Low, Moderate, High, Reckless Disregard |
| `violations[].significantAndSubstantial` | boolean | Whether the violation is S&S (Significant and Substantial) |
| `violations[].proposedPenalty` | number | Proposed penalty amount in USD |
| `violations[].amountPaid` | number | Amount of penalty actually paid |
| `violations[].contested` | boolean | Whether the operator contested the violation |
| `violations[].terminatedDate` | string \| null | Date the violation was abated/terminated |
| `violations[].specialAssessment` | boolean | Whether a special (enhanced) assessment was applied |
| `violationSummary.total` | number | Total violation count |
| `violationSummary.citations` | number | Number of citations |
| `violationSummary.orders` | number | Number of withdrawal orders |
| `violationSummary.safeguards` | number | Number of safeguards |
| `violationSummary.significantAndSubstantial` | number | Number of S&S violations |
| `violationSummary.totalProposedPenalties` | number | Sum of all proposed penalties in USD |
| `violationSummary.totalPaid` | number | Sum of all penalties paid in USD |
| `violationSummary.contested` | number | Number of contested violations |
| `violationSummary.byNegligence` | object | Violation counts grouped by negligence level |
| `violationSummary.byLikelihood` | object | Violation counts grouped by likelihood |
| `inspections[]` | array | Inspection records |
| `inspections[].eventNumber` | string | Unique MSHA inspection event number |
| `inspections[].beginDate` | string \| null | Inspection start date |
| `inspections[].endDate` | string \| null | Inspection end date |
| `inspections[].inspectionTypeDescription` | string | Human-readable inspection type |
| `inspections[].violationsFound` | number | Number of violations issued during this inspection |
| `inspections[].totalProposedPenalties` | number | Total penalties from this inspection |
| `inspections[].onSiteHours` | number | Inspector hours spent on-site |
| `inspectionSummary.total` | number | Total inspection count |
| `inspectionSummary.byType` | object | Inspection counts by type |
| `inspectionSummary.totalViolationsFound` | number | Total violations found across all inspections |
| `inspectionSummary.totalOnSiteHours` | number | Total inspector hours across all inspections |
| `inspectionSummary.mostRecentDate` | string \| null | Most recent inspection date |
| `accidents[]` | array | Accident and injury records |
| `accidents[].documentNumber` | string | Unique MSHA accident document number |
| `accidents[].accidentDate` | string \| null | Date of accident |
| `accidents[].degreeOfInjuryDescription` | string | Severity: Fatality, DAFW, DART, Restricted, First Aid, etc. |
| `accidents[].classification` | string \| null | Accident classification (e.g., Powered Haulage, Fall of Roof) |
| `accidents[].daysLost` | number | Workdays lost due to injury |
| `accidents[].daysRestricted` | number | Days of restricted duty |
| `accidents[].narrative` | string \| null | MSHA investigator's written accident narrative |
| `accidentSummary.fatalities` | number | Total fatality count |
| `accidentSummary.permanentDisabilities` | number | Total permanent disability count |
| `accidentSummary.totalDaysLost` | number | Total workdays lost across all accidents |
| `riskScore` | number | Composite risk score (0–200+) |
| `riskLevel` | string | Critical (≥100), High (≥60), Medium (≥30), Low (<30) |
| `riskFactors` | string[] | Itemized list of risk factors with point contributions |
| `extractedAt` | string | ISO 8601 timestamp of the run |
How much does it cost to search MSHA mining safety data?
This actor runs on Apify's standard compute pricing — you pay for the platform resources used during the run. There are no per-record or per-mine charges beyond compute time. A DOL API key is required and is free to register.
| Scenario | Mines | Datasets included | Estimated run time | Estimated cost |
|---|---|---|---|---|
| Quick test (dry run) | 1 (sample) | All | < 30 seconds | < $0.01 |
| Single mine full history | 1 | Violations + Inspections + Accidents | 1-2 minutes | ~$0.01 |
| State screen (active coal, WV) | ~50 | All | 5-10 minutes | ~$0.05 |
| Operator portfolio audit | ~100 | All | 15-25 minutes | ~$0.10 |
| National commodity scan | 500+ | Violations only | 45-90 minutes | ~$0.30 |
You can set a maximum spending limit per run to control costs. The actor stops when your budget is reached. For most due diligence and ESG workflows, total monthly spend stays well under $5.
Compare this to commercial mining data providers and safety analytics platforms that charge $500–$2,000/month for subscriptions covering similar MSHA data. This actor uses the same underlying government source at a fraction of the cost.
Search MSHA mining safety data using the API
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("ryanclinton/msha-mining-safety").call(run_input={
    "apiKey": "YOUR_DOL_API_KEY",
    "dryRun": False,
    "state": "WV",
    "mineType": "C",
    "mineStatus": "Active",
    "maxResults": 50,
    "dateFrom": "2023-01-01",
})

# Each dataset item is one mine, already sorted by risk score descending.
for mine in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(
        f"{mine['mineName']} ({mine['state']}) - "
        f"Risk: {mine['riskLevel']} ({mine['riskScore']}) - "
        f"S&S violations: {mine['violationSummary']['significantAndSubstantial']}"
    )
```
JavaScript
```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

// Start the actor and wait for the run to finish.
const run = await client.actor("ryanclinton/msha-mining-safety").call({
    apiKey: "YOUR_DOL_API_KEY",
    dryRun: false,
    operatorName: "PEABODY",
    includeViolations: true,
    includeAccidents: true,
    maxResults: 100
});

// Each dataset item is one mine with nested summaries.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const mine of items) {
    const fatalities = mine.accidentSummary.fatalities;
    const penalties = mine.violationSummary.totalProposedPenalties;
    console.log(`${mine.mineName} - Fatalities: ${fatalities} - Total penalties: $${penalties.toLocaleString()} - Level: ${mine.riskLevel}`);
}
```
cURL
```bash
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~msha-mining-safety/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "apiKey": "YOUR_DOL_API_KEY",
    "dryRun": false,
    "commodity": "Gold",
    "mineType": "M",
    "mineStatus": "Active",
    "maxResults": 30
  }'

# Fetch results (replace DATASET_ID from the run response above)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
```
How MSHA Mining Safety & Health Data works
Phase 1: Mine search and filter construction
The actor builds a structured filter_object for the DOL API v4 endpoint at https://apiprod.dol.gov/v4/get/MSHA/mines/json. Text-based filters (mine name, operator, controller, commodity) use LIKE matching with %value% wildcards applied against the MSHA database. Exact matches (mine ID, state, mine type, status) use equality operators. Multiple conditions are combined with an and clause. When no filter is provided, the actor fetches mines sorted by curr_empl_cnt descending, returning the largest operations first. Pagination runs in 1,000-record pages with 500ms delays between requests.
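The filter construction described above might look roughly like the following sketch. It is not the actor's source. The endpoint URL, `mine_id`, and `curr_empl_cnt` come from this page; every other column name, the `field`/`operator`/`value` JSON shape, and the query parameter names (`X-API-KEY`, `limit`, `offset`, `sort_by`, `sort`, `filter_object`) are assumptions to check against the DOL API documentation.

```python
import json
from urllib.parse import urlencode

MINES_URL = "https://apiprod.dol.gov/v4/get/MSHA/mines/json"

# ASSUMPTION: apart from mine_id, these DOL column names are placeholders.
EXACT_FIELDS = {
    "mineId": "mine_id",
    "state": "state",
    "mineType": "mine_type",
    "mineStatus": "mine_status",
}
LIKE_FIELDS = {
    "mineName": "mine_name",
    "operatorName": "operator_name",
    "controllerName": "controller_name",
    "commodity": "primary_sic",
}


def condition(field, operator, value):
    return {"field": field, "operator": operator, "value": value}


def build_filter(inputs):
    """Return a filter dict, or None when no search input is set."""
    if inputs.get("mineId"):
        # mineId takes precedence over all other filters.
        return condition("mine_id", "eq", inputs["mineId"])
    conditions = [
        condition(col, "eq", inputs[key])
        for key, col in EXACT_FIELDS.items() if inputs.get(key)
    ]
    conditions += [
        condition(col, "like", f"%{inputs[key]}%")
        for key, col in LIKE_FIELDS.items() if inputs.get(key)
    ]
    if not conditions:
        return None
    return conditions[0] if len(conditions) == 1 else {"and": conditions}


def build_url(inputs, api_key, limit=1000, offset=0):
    """Build one page request URL; callers advance offset by limit."""
    params = {"X-API-KEY": api_key, "limit": limit, "offset": offset}
    flt = build_filter(inputs)
    if flt is None:
        # No filter: largest operations first.
        params.update({"sort_by": "curr_empl_cnt", "sort": "desc"})
    else:
        params["filter_object"] = json.dumps(flt)
    return f"{MINES_URL}?{urlencode(params)}"
```

For example, `build_filter({"state": "WV", "operatorName": "PEABODY"})` yields an `and` clause with one equality condition and one `%PEABODY%` pattern.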
Phase 2: Batch dataset joining
Once mine IDs are collected, violations, inspections, and accidents are fetched using a batched approach: up to 30 mine IDs per API request using mine_id IN [...] filters. This reduces total API calls by ~30x compared to per-mine lookups. Date filters are injected into each batch query when dateFrom or dateTo are provided. Each dataset (violations, inspections, accidents) is fetched sequentially with 500ms delays between batches to avoid concurrent rate limiting. On HTTP 429 responses, the actor retries with exponential backoff: 4s, 8s, 16s, and 32s before failing gracefully with a warning.
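As a rough illustration of the batching and retry behavior described above, the sketch below splits mine IDs into groups of 30 and retries a request on HTTP 429 with the stated delays. The function names and the `(status, records)` return convention of the `request` callable are hypothetical; the actor's real implementation may differ.

```python
import time

BATCH_SIZE = 30                  # mine IDs per API request
BACKOFF_DELAYS = (4, 8, 16, 32)  # seconds to wait after each 429


def chunk(ids, size=BATCH_SIZE):
    """Split a list of mine IDs into consecutive batches."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def fetch_with_backoff(request, delays=BACKOFF_DELAYS, sleep=time.sleep):
    """Call request() until it stops returning 429 or the delays run out.

    request is any callable returning (status_code, records).
    """
    for delay in (*delays, None):
        status, records = request()
        if status == 200:
            return records
        if status == 204:
            return []  # no matching records for this query
        if status != 429:
            raise RuntimeError(f"unexpected HTTP status {status}")
        if delay is not None:
            sleep(delay)
    # Every attempt was rate limited: warn and return nothing for this batch.
    print("warning: still rate limited after all retries; skipping batch")
    return []
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. With 65 mine IDs, `chunk` produces batches of 30, 30, and 5.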
Phase 3: Record transformation and decoding
Raw API field names (e.g., cit_ord_safe, sig_sub, inj_illness) are mapped to structured camelCase output fields. Coded values are decoded using reference maps: 17 inspection type codes (AAA, E01–E17) map to full descriptions; 10 injury degree codes (1–10) map to severity labels; negligence levels are normalized through a 5-tier map. Summary objects are computed for each mine across all three datasets — violation summaries count by type, negligence, and likelihood; inspection summaries aggregate by type and total on-site hours; accident summaries count by severity tier and total lost days.
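To make the summary step concrete, here is a small sketch that recomputes a `violationSummary`-shaped object from already transformed violation records. It works on the documented output fields rather than the raw API fields, and the function name `summarize_violations` is illustrative only.

```python
from collections import Counter


def summarize_violations(violations):
    """Aggregate a mine's violation records into summary counts."""
    kinds = Counter(v["citationOrderSafeguard"] for v in violations)
    return {
        "total": len(violations),
        "citations": kinds["Citation"],
        "orders": kinds["Order"],
        "safeguards": kinds["Safeguard"],
        "significantAndSubstantial": sum(
            1 for v in violations if v["significantAndSubstantial"]
        ),
        "totalProposedPenalties": sum(v["proposedPenalty"] for v in violations),
        "totalPaid": sum(v["amountPaid"] for v in violations),
        "contested": sum(1 for v in violations if v["contested"]),
        # Skip null values so they do not become keys in the grouped counts.
        "byNegligence": dict(
            Counter(v["negligence"] for v in violations if v["negligence"])
        ),
        "byLikelihood": dict(
            Counter(v["likelihood"] for v in violations if v["likelihood"])
        ),
    }
```

Applied to the two violations in the output example, this reproduces the sample `violationSummary`: 1 citation, 1 order, 2 S&S violations, and $27,872 in proposed penalties.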
Phase 4: Risk scoring and output assembly
The risk scoring function applies 12 weighted factors to each mine's summary data. Fatalities carry the highest weight (+40 each, no cap) because a single fatality represents a catastrophic safety failure. Permanent disabilities add +20 each. Withdrawal orders — mandatory cessation orders issued when MSHA finds imminent danger — contribute +10 each up to a cap of 80 points. S&S violations add +3 each up to 60 points. High negligence and reckless disregard violations add +8 each up to 40 points. Penalty thresholds, violation volume, contested counts, days lost, and the violation-to-employee ratio contribute additional factors. The final score determines one of four risk tiers: Critical (≥100), High (≥60), Medium (≥30), or Low (<30). Output is sorted by risk score descending.
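The sketch below covers only the five factors whose weights and caps are stated above, plus the tier thresholds. The remaining factors (penalty thresholds, violation volume, contested counts, days lost, violation-to-employee ratio) are not specified on this page, so they are left out and the result is a partial score, not the actor's `riskScore`. Treating permanent disabilities as uncapped is an assumption.

```python
def capped(count, points, cap=None):
    """Points for a factor: count * points, limited to cap when one applies."""
    total = count * points
    return total if cap is None else min(total, cap)


def partial_risk_score(fatalities, permanent_disabilities, orders,
                       ss_violations, high_negligence):
    """Sum of the five documented factors only."""
    return (
        capped(fatalities, 40)                # no cap
        + capped(permanent_disabilities, 20)  # ASSUMPTION: no cap stated
        + capped(orders, 10, cap=80)
        + capped(ss_violations, 3, cap=60)
        + capped(high_negligence, 8, cap=40)
    )


def risk_level(score):
    """Map a score to the documented tiers."""
    if score >= 100:
        return "Critical"
    if score >= 60:
        return "High"
    if score >= 30:
        return "Medium"
    return "Low"
```

For the mine in the output example (1 order, 2 S&S violations, 1 high-negligence violation, no fatalities), these five factors contribute 10 + 6 + 8 = 24 points, which matches the first three entries in its `riskFactors` list.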
Tips for best results
- Start with a dry run to test your pipeline. The default `dryRun: true` returns a fully structured sample mine record without consuming any DOL API quota. Use it to verify your downstream pipeline, spreadsheet import, or webhook before running live data.
- Register your DOL API key in advance. Key registration at dataportal.dol.gov/registration is instant and free, but first-time API calls sometimes take a few minutes to activate. Test your key with a single `mineId` lookup before running large batches.
- Use date filters for large operators. National operators like Peabody Energy or Arch Resources have hundreds of mines with decades of violation history. A query like `operatorName: "PEABODY"` without date filters will attempt to fetch all historical records. Add `dateFrom: "2022-01-01"` to limit to recent activity.
- Combine with Company Deep Research to augment MSHA safety profiles with corporate structure, financial data, and ownership history when preparing due diligence reports.
- Raise `maxViolationsPerMine` for litigation work. The default cap of 100 violations per mine truncates full enforcement histories. Set `maxViolationsPerMine: 500` or higher when preparing comprehensive citation timelines for legal proceedings.
- Schedule monthly runs for compliance monitoring. Use Apify's built-in scheduler to run against your own `operatorName` or `mineId` list each month. Compare `riskScore` and `riskLevel` trends over time to identify deteriorating safety performance before an MSHA enforcement escalation.
- Use `includeInspections: false` and `includeAccidents: false` for fast portfolio screens. If you only need violation counts and risk scores across a large mine portfolio, disabling the two optional datasets cuts run time by roughly 60%.
- Cross-reference the `eventNumber` field. Violations and inspections share the same `eventNumber`, so you can match every citation to the specific inspection event that generated it for a complete enforcement timeline.
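The `eventNumber` cross-reference from the last tip can be sketched as follows, assuming one mine record shaped like the output example. The function name `inspection_timeline` and the returned record shape are illustrative.

```python
from collections import defaultdict


def inspection_timeline(mine):
    """List inspections in date order with the violations each one produced."""
    by_event = defaultdict(list)
    for violation in mine.get("violations", []):
        by_event[violation["eventNumber"]].append(violation["violationNumber"])

    # beginDate can be null, so fall back to an empty string when sorting.
    inspections = sorted(
        mine.get("inspections", []), key=lambda i: i["beginDate"] or ""
    )
    return [
        {
            "eventNumber": insp["eventNumber"],
            "beginDate": insp["beginDate"],
            "inspectionType": insp["inspectionTypeDescription"],
            "violationNumbers": by_event.get(insp["eventNumber"], []),
        }
        for insp in inspections
    ]
```

If per-mine caps or date filters trimmed either dataset, some violations may have no matching inspection in the same run, and the reverse can also happen.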
Combine with other Apify actors
| Actor | How to combine |
|---|---|
| Company Deep Research | Run MSHA safety profiles first, then feed the operator or controller name into Company Deep Research for corporate structure, litigation history, and financial details |
| B2B Lead Qualifier | Use MSHA risk scores as an input signal in B2B lead scoring for mining equipment, safety training, and workers' comp insurance sales |
| Trustpilot Review Analyzer | Combine MSHA safety records with employee and contractor reviews of mining operators for a holistic workforce safety picture |
| SEC EDGAR Filing Analyzer | Cross-reference MSHA violation histories with public company disclosures about regulatory risk and enforcement actions |
| Website Contact Scraper | Extract contact information from mining operator websites to reach out following safety audits or M&A screening |
| B2B Lead Gen Suite | Use high-risk mine operators identified by MSHA data as the seed list for outbound safety consulting or insurance sales campaigns |
| WHOIS Domain Lookup | Verify domain ownership of mine operator websites when building contact lists for safety-related outreach |
Limitations
- Data freshness depends on DOL update cadence — all five MSHA datasets are updated weekly by the DOL, not in real time. There is typically a 7–14 day lag between an MSHA inspection and its appearance in the API.
- Historical records only; no predictive data — the actor reports what MSHA has documented. It does not predict future violations or accidents.
- No free-text search across violation narratives — violation records include section codes and titles but not full inspector narratives. Accident records include full narratives but cannot be filtered by text.
- DOL API rate limits apply — aggressive querying across many mines in parallel will trigger 429 responses. The actor handles this with exponential backoff, but very large runs (500+ mines with full history) may take 60+ minutes.
- Contractor violations require separate queries: MSHA records violations against both operators and contractors. The actor joins violations by mine ID, which includes operator violations. Contractor-specific breakdowns require additional filtering by violatorType.
- Coordinate data is incomplete: latitude and longitude are present for most but not all mines in the MSHA database. Some older records have null coordinates.
- The DOL API occasionally returns HTTP 204 (no content) — the actor handles 204 responses gracefully as empty results, but it means that dataset had no matching records for that query, not that the mine doesn't exist.
- includeAssessments is defined in the type interface but not yet wired up: the assessed violations dataset join is not fully implemented in the current version. Use the proposedPenalty and amountPaid fields in the violations output for penalty data.
Integrations
- Zapier — trigger an MSHA safety lookup whenever a new mining company is added to a CRM or deal pipeline, and push results to a spreadsheet or Slack channel
- Make — schedule monthly MSHA runs and route high-risk mine profiles (Critical or High) to specific team members via email or project management tools
- Google Sheets — export mine safety results directly to a shared Google Sheet for ESG reporting, insurance underwriting, or portfolio tracking
- Apify API — integrate MSHA safety queries into due diligence platforms, compliance dashboards, or risk management systems via REST API
- Webhooks — receive a POST notification when a run completes, enabling real-time downstream processing of safety profiles in your own application
- LangChain / LlamaIndex — feed MSHA mine records into an LLM pipeline for natural language safety analysis, ESG report generation, or automated risk narrative writing
Troubleshooting
- Run returns 0 mines despite valid search criteria: check that your mineType matches the commodity. Gold mines use mineType: "M" (Metal/Non-Metal), not "C" (Coal). Also verify state codes are two uppercase letters with no spaces.
- Authentication error (401 or 403): your DOL API key is missing, invalid, or not yet activated. Verify the key at dataportal.dol.gov and wait a few minutes after first registration before retrying.
- Run times out or takes much longer than expected: large batches with high per-mine limits are the most common cause. Reduce maxViolationsPerMine to 50, disable inspections or accidents with includeInspections: false or includeAccidents: false, or split the search into smaller date-range windows.
- Violation counts appear lower than expected: the maxViolationsPerMine cap may be truncating results. Increase to 500 or 1,000 for mines with long enforcement histories. Also confirm your dateFrom/dateTo window covers the period you expect.
- riskScore is 0 despite known violations: the risk score is calculated from the violations, inspections, and accidents fetched in this run. If all three include flags are false, or if dateFrom/dateTo filters exclude historical records, the summary data will be empty and the score will be 0.
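The first troubleshooting item can be caught before a run starts. The sketch below is a hypothetical pre-flight check, not part of the actor: `mineType` and its "C"/"M" values come from this page, while the `state` key name is an assumption about the input schema.

```python
import re


def preflight_warnings(run_input):
    """Return human-readable warnings for inputs that commonly yield 0 mines."""
    warnings = []

    mine_type = run_input.get("mineType")
    if mine_type not in (None, "", "C", "M"):
        warnings.append(f'mineType should be "C", "M", or blank, got {mine_type!r}')

    state = run_input.get("state")  # assumed key name for the state filter
    if state and not re.fullmatch(r"[A-Z]{2}", state):
        warnings.append(f"state should be two uppercase letters, got {state!r}")

    return warnings


print(preflight_warnings({"mineType": "M", "state": "NV"}))   # no warnings
print(preflight_warnings({"mineType": "G", "state": "nv "}))  # two warnings
```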
Responsible use
- This actor only accesses publicly available data published by the US Department of Labor on the DOL Open Data Portal.
- All MSHA data accessed through this actor is government-produced public information maintained for regulatory transparency.
- Compliance with the DOL Open Data Portal's terms of service is required.
- Do not use extracted data in ways that misrepresent mine safety records or violate fair use standards in regulatory, legal, or financial contexts.
- For guidance on web scraping legality, see Apify's guide.
FAQ
How many mines can I search with MSHA Mining Safety & Health Data in one run?
Up to 5,000 mines per run using the maxResults parameter. The DOL API covers 86,000+ mines going back to 1970. For national surveys, consider splitting by state or commodity to keep individual run times manageable.
What is a Significant and Substantial (S&S) violation in MSHA data?
An S&S violation is one where an MSHA inspector determines the violation is reasonably likely to cause a reasonably serious injury or illness. S&S violations carry higher penalties and are a primary signal in the risk scoring algorithm. The significantAndSubstantial field in each violation record is a boolean, and violationSummary.significantAndSubstantial gives the total count per mine.
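As a rough illustration of how the per-violation boolean relates to a per-mine count, the following Python sketch recounts S&S violations from a list of violation records. Only the `significantAndSubstantial` field name comes from this page; the sample records are invented, and the recount only covers violations actually present in the output.

```python
def ss_stats(violations):
    """Return (S&S count, S&S share) for a list of violation records."""
    ss_count = sum(1 for v in violations if v.get("significantAndSubstantial") is True)
    share = ss_count / len(violations) if violations else 0.0
    return ss_count, share


# Invented sample records for illustration only.
sample_violations = [
    {"significantAndSubstantial": True},
    {"significantAndSubstantial": False},
    {"significantAndSubstantial": True},
]

count, share = ss_stats(sample_violations)
print(f"{count} S&S violations ({share:.0%} of fetched citations)")
```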
How current is the MSHA data returned by this actor?
The DOL updates all five MSHA datasets weekly. Expect a 7–14 day lag between an on-the-ground inspection and its appearance in the API. For the most recent federal enforcement action at a specific mine, cross-check with the MSHA Enforcement Actions website.
Does this actor cover both coal and metal/non-metal mines?
Yes. Set mineType: "C" for coal mines (bituminous, lignite, anthracite) or mineType: "M" for metal and non-metal mines (gold, copper, limestone, sand and gravel, potash, etc.). Leave it blank to search across both types.
How does the risk score compare to official MSHA enforcement designations?
The risk score is a computed heuristic based on the same underlying data MSHA uses, but it is not an official MSHA designation. MSHA has its own Pattern of Violations (POV) program with different criteria. The risk score in this actor is designed for internal triage and comparative analysis, not regulatory compliance determinations.
Can I use MSHA Mining Safety & Health Data to monitor a competitor's safety record?
Yes. Set operatorName or controllerName to a competitor's name and schedule monthly runs. Track changes in riskScore, riskLevel, and violation counts over time. All data returned is public information published by the US government.
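One way to track changes between scheduled runs is to diff two exported result sets by mineId. The sketch below assumes each result is a dictionary with `mineId`, `riskScore`, and `riskLevel` keys, as named on this page; the sample scores, levels, and IDs are made up.

```python
def risk_changes(previous, current):
    """List mines whose riskScore or riskLevel changed since the previous run."""
    previous_by_id = {mine["mineId"]: mine for mine in previous}
    changes = []
    for mine in current:
        old = previous_by_id.get(mine["mineId"])
        if old is None:
            continue  # first appearance, no baseline to compare against
        delta = mine["riskScore"] - old["riskScore"]
        if delta != 0 or mine["riskLevel"] != old["riskLevel"]:
            changes.append({
                "mineId": mine["mineId"],
                "scoreDelta": delta,
                "riskLevel": (old["riskLevel"], mine["riskLevel"]),
            })
    # Largest increases first, so deteriorating mines surface at the top.
    return sorted(changes, key=lambda change: change["scoreDelta"], reverse=True)


# Made-up results from two monthly runs.
last_month = [
    {"mineId": "0000001", "riskScore": 80, "riskLevel": "High"},
    {"mineId": "0000002", "riskScore": 120, "riskLevel": "High"},
]
this_month = [
    {"mineId": "0000001", "riskScore": 80, "riskLevel": "High"},
    {"mineId": "0000002", "riskScore": 150, "riskLevel": "Critical"},
    {"mineId": "0000003", "riskScore": 10, "riskLevel": "High"},
]

for change in risk_changes(last_month, this_month):
    print(change)
```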
How is this different from using the MSHA public website directly?
The MSHA website requires separate searches for each mine, each dataset, and each date range. This actor joins all five datasets automatically, computes risk scores, and delivers structured JSON or CSV ready for analysis. What takes hours of manual data gathering and spreadsheet work runs in minutes.
Is it legal to use MSHA data for insurance underwriting or investment research?
MSHA data is public information produced by the US federal government and is not subject to copyright. Using it for insurance underwriting, investment analysis, or due diligence is standard practice in the mining industry. Consult your legal team regarding the specific use of safety data in regulated financial contexts.
Can I get accident narratives for a specific mine?
Yes. Accident records include the narrative field, which contains the MSHA investigator's written description of the accident. Set includeAccidents: true (the default) and search by mineId or operatorName to retrieve these narratives.
What happens if the DOL API is down or rate-limits my run?
The actor retries failed requests up to 4 times with exponential backoff (4s, 8s, 16s, 32s). If the API remains unavailable after all retries, the actor logs a warning and continues with whatever data was already collected. Individual mine records may have empty violation, inspection, or accident arrays if those specific dataset fetches failed.
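The retry schedule described here (4 retries at 4s, 8s, 16s, 32s) follows a plain doubling pattern. The Python sketch below reproduces that pattern under stated assumptions: it is not the actor's source code, `fetch` is a hypothetical callable standing in for one dataset request, and returning None on final failure mirrors the "continue with whatever was collected" behavior only loosely.

```python
import time


def backoff_delays(retries=4, base_seconds=4):
    """Delays before each retry: base, 2x base, 4x base, ..."""
    return [base_seconds * 2 ** attempt for attempt in range(retries)]


def fetch_with_retries(fetch, retries=4, base_seconds=4, sleep=time.sleep):
    """Call fetch(); on any exception wait and retry, returning None if every attempt fails."""
    delays = backoff_delays(retries, base_seconds)
    for attempt in range(retries + 1):  # one initial attempt plus the retries
        try:
            return fetch()
        except Exception:  # a real client would catch only HTTP or network errors
            if attempt == retries:
                return None  # give up; the caller keeps data collected so far
            sleep(delays[attempt])


print(backoff_delays())  # [4, 8, 16, 32]
```

Passing `sleep` as a parameter keeps the function easy to exercise without real waiting, for example by supplying a function that records the requested delays.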
Can I schedule this actor to run automatically?
Yes. Use Apify's built-in scheduler to run on any interval: daily, weekly, or monthly. Scheduling is configured from the actor's page in the Apify Console without any code.
How do I find a mine's 7-digit MSHA mine ID?
Search by mine name, operator, or state first with dryRun: false. Each result includes the mineId field. Alternatively, look up the mine ID on the MSHA Mine Data Retrieval System and use it directly in mineId for subsequent runs.
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.