MSHA Mining Safety & Health Data

MSHA mining safety data for all 86,000+ US mines — coal and metal/non-metal — sourced directly from the Mine Safety and Health Administration via the DOL Open Data Portal. This actor searches mines by name, operator, state, or commodity and automatically joins violation citations, inspection records, accident histories, and penalty assessments into a single structured output per mine.

Pricing

Pay Per Event model. You only pay for what you use.

| Event | Description | Price |
| --- | --- | --- |
| result-returned | Charged per result returned. Includes data transformation and structured output. | $0.03 |

Example: 100 events = $3.00 · 1,000 events = $30.00

Documentation

The output includes a weighted risk score (0-200+) that surfaces the most dangerous operations first. Analysts, insurers, ESG teams, and legal researchers use this to assess mine safety performance without manually querying five separate government databases. One run returns complete mine profiles with every S&S violation, withdrawal order, fatality, and MSHA inspection on record.

What data can you extract?

| Data Point | Source | Example |
| --- | --- | --- |
| 📋 Mine ID and name | MSHA Mines dataset | 4601432 / EAGLE BUTTE MINE |
| 🏭 Operator and controller | MSHA Mines dataset | EAGLE SPECIALTY MATERIALS LLC |
| 📍 Location (state, county, coordinates) | MSHA Mines dataset | WY / Campbell / 44.21, -105.38 |
| ⚙️ Mine type and classification | MSHA Mines dataset | Coal / Surface |
| 👷 Employee count and hours worked | MSHA Mines dataset | 320 employees / 640,000 hrs/yr |
| ⚠️ Violations (citations, orders, safeguards) | MSHA Violations dataset | S&S = true, Penalty = $4,416 |
| 🔍 Inspections (type, violations found, hours) | MSHA Inspections dataset | Regular Safety/Health / 2 violations / 48 hrs |
| 🚑 Accidents (injuries, fatalities, narratives) | MSHA Accidents dataset | DAFW / Powered Haulage / 45 days lost |
| 💰 Proposed and paid penalties | MSHA Assessed Violations | $27,872 proposed / $19,110 paid |
| 📊 Violation summary by type and negligence | Computed | Citations: 8 / Orders: 1 / S&S: 6 / High: 2 |
| 📈 Inspection summary with trend data | Computed | 12 inspections / 24 violations found / 186 hrs |
| 🔴 Risk score and risk level | Computed | Score: 67 / Level: High |

Why use MSHA Mining Safety & Health Data?

Researching mine safety by hand means jumping between five separate DOL portal queries, exporting separate spreadsheets per dataset, manually matching records by mine ID, and building your own scoring logic from scratch. For a single mine with full history, that takes 2-3 hours. For a portfolio of 50 operations, it becomes a week-long project.

This actor automates the entire process. Provide a search filter — state, operator, commodity, or a specific mine ID — and get back structured JSON with violations, inspections, accidents, and a calculated risk score for every matching mine. Runs sort output by risk score descending, so the highest-risk operations appear first.

  • Scheduling — run monthly to track how safety profiles change over time and flag mines crossing risk thresholds
  • API access — trigger runs from Python, JavaScript, or any HTTP client for programmatic integration with your research pipeline
  • Monitoring — get Slack or email alerts when runs produce unexpected results or when API connectivity changes
  • Integrations — connect to Google Sheets, Zapier, Make, or webhooks for downstream workflows
  • Structured output — every run produces clean JSON compatible with Excel, Power BI, Tableau, and any database

Features

  • Five-dataset join — automatically correlates mines, violations, inspections, accidents, and assessed penalties by mine_id in a single run
  • Weighted risk scoring — 12-factor algorithm scoring fatalities (+40 each), permanent disabilities (+20 each), withdrawal orders (+10 each, cap 80), S&S violations (+3 each, cap 60), high negligence (+8 each, cap 40), penalty thresholds, violation volume, contested counts, days lost, and violation-to-employee ratio
  • Four risk tiers — Critical (≥100), High (≥60), Medium (≥30), Low (<30) — with itemized risk factor breakdowns explaining every point
  • 17 inspection type codes decoded — maps raw codes (AAA, E01, E13, etc.) to human-readable labels including Regular Safety/Health, Spot, Impact, Complaint, Hazard Complaint, and Imminent Danger
  • 10 injury severity codes decoded — Fatality, Permanent Total or Partial Disability, DAFW, DART, Days Restricted, First Aid, and more
  • Partial-match mine search — search by mine name, operator, or controller using LIKE matching against MSHA's 86,000+ mine registry
  • Batch pagination — fetches related records in batches of 30 mine IDs per API call to respect DOL rate limits; up to 1,000 records per page
  • Exponential backoff on 429 — automatically retries with delays of 4s, 8s, 16s, 32s before failing gracefully
  • 500ms request delay — staggered requests between dataset fetches prevent concurrent rate limiting across multi-mine searches
  • Date range filtering: dateFrom / dateTo are applied to violations, inspections, and accidents simultaneously to scope historical queries
  • Flexible result caps — configurable maxResults (up to 5,000 mines) and per-mine limits for violations (up to 1,000), inspections (up to 500), and accidents (up to 500)
  • Sort by employee count — when no filter is applied, mines are returned largest-first (by current employee count), then re-sorted by risk score after processing
  • Dry run mode — returns realistic sample data without any API calls; safe for testing workflows and UI development

Use cases for MSHA mining safety data

Mining company due diligence

Private equity analysts and M&A advisors researching acquisition targets in the extractive industries need to quantify safety liability before closing. This actor returns complete violation and accident histories with penalty totals, so buyers can model regulatory risk and estimate remediation costs before bidding. Run the operator name filter across all states to surface every mine in a target's portfolio.

Insurance underwriting for mining operations

Workers' compensation and general liability underwriters need objective safety track records when quoting mining clients. The risk score — factoring in fatalities, withdrawal orders, S&S violations, and injury frequency — gives underwriters a reproducible numerical basis for rate adjustment. Pulling the last 36 months of accident data using dateFrom and dateTo narrows the analysis to the coverage period.

ESG risk assessment for mining investments

ESG analysts scoring mining companies on social and safety metrics require standardized, auditable data. This actor provides quantifiable inputs: fatality counts, lost workday totals, S&S violation rates, and penalty history. The violation-to-employee ratio and high-negligence breakdowns map directly to GRI 403 occupational health and safety indicators used in ESG reporting frameworks.

Investigative journalism and public interest research

Journalists covering mine safety, worker advocacy groups, and government watchdogs need quick access to enforcement histories for specific operations or regions. Search by state and commodity to map the worst-performing mines in a geographic area, then use accident narratives (included in the output) to identify patterns in injury causes, occupations, and equipment involved.

Legal research and litigation support

Attorneys handling MSHA citation defense or personal injury cases involving mining operations need comprehensive violation and penalty histories. The actor returns violationNumber, section, citationOrderSafeguard, negligence, proposedPenalty, contested status, and terminatedDate — all the fields needed to reconstruct an enforcement timeline for discovery or expert analysis.

Regulatory compliance monitoring

Mining safety officers and compliance managers can schedule monthly runs against their own operator name to track how their safety profile is trending. Alert on runs where riskLevel has moved from Medium to High, or where new withdrawal orders appear in the output, to trigger internal review before MSHA escalates.
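A comparison between two scheduled runs could be sketched as follows. This is only an illustration: it assumes you have saved the dataset items from the previous run yourself, and the helper name `find_escalations` and the alert fields are hypothetical, not part of the actor. The tier order follows the four documented risk levels.

```python
# Risk tiers as documented for this actor, ordered lowest to highest.
TIER_ORDER = ["Low", "Medium", "High", "Critical"]


def find_escalations(previous_items, current_items):
    """Return alerts for mines whose riskLevel rose or that gained withdrawal orders.

    Both arguments are lists of mine records in the actor's output shape.
    (Hypothetical helper; the actor does not ship this function.)
    """
    previous_by_id = {item["mineId"]: item for item in previous_items}
    alerts = []
    for mine in current_items:
        before = previous_by_id.get(mine["mineId"])
        if before is None:
            continue  # Mine was not in the previous run, so there is no baseline.
        level_rose = (
            TIER_ORDER.index(mine["riskLevel"]) > TIER_ORDER.index(before["riskLevel"])
        )
        new_orders = mine["violationSummary"]["orders"] - before["violationSummary"]["orders"]
        if level_rose or new_orders > 0:
            alerts.append({
                "mineId": mine["mineId"],
                "previousLevel": before["riskLevel"],
                "currentLevel": mine["riskLevel"],
                "newOrders": max(new_orders, 0),
            })
    return alerts
```

Order counts are only comparable between runs that use the same date range and the same maxViolationsPerMine, so keep those inputs fixed in the schedule.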

How to search MSHA mining safety data

  1. Register for a free DOL API key — go to dataportal.dol.gov/registration, create an account, and copy your key. The same key works for all DOL datasets including OSHA and WHD.
  2. Enter your search criteria — type a state code like WV, an operator name like PEABODY, a commodity like Bituminous, or paste a specific 7-digit MSHA mine ID. You can combine multiple filters.
  3. Click "Start" and wait — runs with 10-20 mines and default dataset limits typically complete in 3-5 minutes. Larger batches (100+ mines with full violation history) may take 20-30 minutes.
  4. Download results — export as JSON, CSV, or Excel from the Dataset tab. Each row is one mine with all related records nested inside.

Input parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| apiKey | string | No | | DOL Open Data Portal API key (register free). If omitted with dryRun: false, the run will fail with a 401 error. |
| dryRun | boolean | No | true | Return realistic sample data without API calls. Set to false with a valid apiKey to fetch live data. |
| mineId | string | No | | Exact 7-digit MSHA mine identification number, e.g., 4601432. Takes precedence over all other filters. |
| mineName | string | No | | Partial mine name match (case-insensitive), e.g., EAGLE BUTTE or NORTH ANTELOPE. |
| operatorName | string | No | | Partial operator name match, e.g., PEABODY, ARCH, CONSOL. |
| controllerName | string | No | | Partial controller / parent company name match. |
| state | string | No | | Two-letter state code from dropdown (WV, WY, PA, KY, etc.). |
| mineType | string | No | | C for Coal; M for Metal / Non-Metal. |
| mineStatus | string | No | | Active, Abandoned, Abandoned and Sealed, Intermittent, NonProducing, New Mine, or Temporarily Idled. |
| commodity | string | No | | Primary SIC description match, e.g., Bituminous, Gold, Limestone, Sand and Gravel. |
| dateFrom | string | No | | Filter violations, inspections, and accidents from this date (YYYY-MM-DD). |
| dateTo | string | No | | Filter violations, inspections, and accidents up to this date (YYYY-MM-DD). |
| includeViolations | boolean | No | true | Join MSHA violation citation records for each mine. |
| includeInspections | boolean | No | true | Join MSHA inspection records for each mine. |
| includeAccidents | boolean | No | true | Join MSHA accident and injury records for each mine. |
| maxResults | integer | No | 100 | Maximum number of mines to return (1–5,000). |
| maxViolationsPerMine | integer | No | 100 | Maximum violations to fetch per mine (1–1,000). |
| maxInspectionsPerMine | integer | No | 50 | Maximum inspections to fetch per mine (1–500). |
| maxAccidentsPerMine | integer | No | 50 | Maximum accidents to fetch per mine (1–500). |

Input examples

Find all active coal mines in West Virginia:

{
    "apiKey": "YOUR_DOL_API_KEY",
    "dryRun": false,
    "state": "WV",
    "mineType": "C",
    "mineStatus": "Active",
    "maxResults": 200,
    "maxViolationsPerMine": 100,
    "maxInspectionsPerMine": 50,
    "maxAccidentsPerMine": 50
}

Full safety history for a specific mine:

{
    "apiKey": "YOUR_DOL_API_KEY",
    "dryRun": false,
    "mineId": "4601432",
    "maxViolationsPerMine": 500,
    "maxInspectionsPerMine": 200,
    "maxAccidentsPerMine": 200
}

Operator portfolio audit with recent incidents only:

{
    "apiKey": "YOUR_DOL_API_KEY",
    "dryRun": false,
    "operatorName": "PEABODY",
    "dateFrom": "2023-01-01",
    "dateTo": "2025-12-31",
    "includeViolations": true,
    "includeInspections": true,
    "includeAccidents": true,
    "maxResults": 100
}

Input tips

  • Start with dry run — leave dryRun: true (the default) to verify your downstream pipeline handles the output format correctly before spending API quota.
  • Use mineId for the fastest lookups — a direct 7-digit mine ID bypasses all filtering and returns exactly one mine at maximum speed.
  • Narrow date ranges for large operators — searching PEABODY or ARCH without a date filter will fetch thousands of violation records. Use dateFrom to limit to a recent period first.
  • Raise per-mine limits for litigation work — the default of 100 violations per mine may truncate long enforcement histories. Set maxViolationsPerMine: 500 for comprehensive legal research.
  • Disable unused datasets to reduce run time — set includeInspections: false and includeAccidents: false if you only need violation counts for a quick risk screen.

Output example

{
    "mineId": "4601432",
    "mineName": "EAGLE BUTTE MINE",
    "mineType": "Coal",
    "mineTypeCode": "C",
    "mineClassification": "Surface",
    "operatorName": "EAGLE SPECIALTY MATERIALS LLC",
    "operatorId": "OP8823441",
    "controllerName": "EAGLE SPECIALTY MATERIALS LLC",
    "controllerId": "CT8823441",
    "state": "WY",
    "county": "CAMPBELL",
    "latitude": 44.21,
    "longitude": -105.38,
    "status": "Active",
    "statusCode": "Active",
    "statusDate": "2019-03-01",
    "employeeCount": 320,
    "averageEmployeeCount": 298,
    "hoursPerYear": 640000,
    "daysPerWeek": 7,
    "sicCode": "1221",
    "sicDescription": "Bituminous Coal and Lignite Surface Mining",
    "naicsCode": "212111",
    "primaryCommodity": "Bituminous",
    "secondaryCommodity": null,
    "district": "Western District",
    "congressionalDistrict": "WY-01",
    "nearestTown": "GILLETTE",
    "violations": [
        {
            "violationNumber": "8765432",
            "eventNumber": "E012341",
            "issuedDate": "2024-11-14",
            "section": "77.1607(b)",
            "sectionTitle": "Loading and haulage - moving equipment",
            "actionType": "Written",
            "citationOrderSafeguard": "Citation",
            "likelihood": "Reasonably Likely",
            "injuryIllness": "Lost Workdays or Restricted Duty",
            "personsAffected": 1,
            "negligence": "Moderate",
            "significantAndSubstantial": true,
            "proposedPenalty": 4416,
            "amountPaid": 3533,
            "violatorName": "EAGLE SPECIALTY MATERIALS LLC",
            "violatorType": "Operator",
            "contested": false,
            "terminatedDate": "2024-12-02",
            "terminationType": "Abated",
            "specialAssessment": false
        },
        {
            "violationNumber": "8765433",
            "eventNumber": "E012341",
            "issuedDate": "2024-11-14",
            "section": "77.1710(g)",
            "sectionTitle": "Protective clothing - eye protection",
            "actionType": "Written",
            "citationOrderSafeguard": "Order",
            "likelihood": "Highly Likely",
            "injuryIllness": "Fatal",
            "personsAffected": 2,
            "negligence": "High",
            "significantAndSubstantial": true,
            "proposedPenalty": 23456,
            "amountPaid": 0,
            "violatorName": "EAGLE SPECIALTY MATERIALS LLC",
            "violatorType": "Operator",
            "contested": true,
            "terminatedDate": null,
            "terminationType": null,
            "specialAssessment": true
        }
    ],
    "violationSummary": {
        "total": 2,
        "citations": 1,
        "orders": 1,
        "safeguards": 0,
        "significantAndSubstantial": 2,
        "totalProposedPenalties": 27872,
        "totalPaid": 3533,
        "contested": 1,
        "byNegligence": { "Moderate": 1, "High": 1 },
        "byLikelihood": { "Reasonably Likely": 1, "Highly Likely": 1 }
    },
    "inspections": [
        {
            "eventNumber": "E012341",
            "beginDate": "2024-11-10",
            "endDate": "2024-11-14",
            "inspectionType": "E01",
            "inspectionTypeDescription": "Regular Safety/Health",
            "violationsFound": 2,
            "totalProposedPenalties": 27872,
            "onSiteHours": 48,
            "operatorName": "EAGLE SPECIALTY MATERIALS LLC"
        }
    ],
    "inspectionSummary": {
        "total": 1,
        "byType": { "Regular Safety/Health": 1 },
        "totalViolationsFound": 2,
        "totalProposedPenalties": 27872,
        "totalOnSiteHours": 48,
        "mostRecentDate": "2024-11-10"
    },
    "accidents": [
        {
            "documentNumber": "ACC202406120012",
            "accidentDate": "2024-06-12",
            "accidentTime": "0930",
            "degreeOfInjury": "3",
            "degreeOfInjuryDescription": "Days Away From Work (DAFW)",
            "classification": "Powered Haulage",
            "accidentType": "Collision",
            "numberOfInjuries": 1,
            "occupation": "Truck Driver",
            "activity": "Traveling",
            "injurySource": "Vehicle / Mobile Equipment",
            "natureOfInjury": "Fractures",
            "bodyPartAffected": "Lower Extremities",
            "daysRestricted": 0,
            "daysLost": 45,
            "totalExperience": 8,
            "mineExperience": 3,
            "narrative": "Employee was operating a 240-ton haul truck on the main haul road when the truck collided with a parked water truck. Employee sustained fractures to the lower leg and was transported to hospital."
        }
    ],
    "accidentSummary": {
        "total": 1,
        "fatalities": 0,
        "permanentDisabilities": 0,
        "daysAwayFromWork": 1,
        "restrictedActivity": 0,
        "noLostTime": 0,
        "totalDaysLost": 45,
        "totalDaysRestricted": 0,
        "mostRecentDate": "2024-06-12"
    },
    "riskScore": 39,
    "riskLevel": "Medium",
    "riskFactors": [
        "1 withdrawal order(s) (+10)",
        "2 S&S violation(s) (+6)",
        "1 high negligence violation(s) (+8)",
        "45 total days lost from accidents (+15)"
    ],
    "extractedAt": "2025-03-21T14:22:00.000Z"
}

Output fields

| Field | Type | Description |
| --- | --- | --- |
| mineId | string | 7-digit MSHA mine identification number |
| mineName | string | Current mine name as registered with MSHA |
| mineType | string | Coal or Metal/Non-Metal |
| mineTypeCode | string | Raw code: C or M |
| mineClassification | string | Surface, Underground, Facility, or SurfaceUnderground |
| operatorName | string | Current mine operator |
| operatorId | string | MSHA operator identification number |
| controllerName | string | Current controller / parent company |
| controllerId | string | MSHA controller identification number |
| state | string | Two-letter state abbreviation |
| county | string | FIPS county name |
| latitude | number \| null | Mine latitude (WGS84) |
| longitude | number \| null | Mine longitude (WGS84) |
| status | string | Current operational status |
| statusCode | string | Raw MSHA status code |
| statusDate | string \| null | Date status last changed (YYYY-MM-DD) |
| employeeCount | number | Current employee count |
| averageEmployeeCount | number \| null | Average employee count over time |
| hoursPerYear | number \| null | Total employee hours worked per year |
| daysPerWeek | number \| null | Operating days per week |
| sicCode | string \| null | SIC industry code |
| sicDescription | string \| null | SIC industry description |
| naicsCode | string \| null | NAICS industry code |
| primaryCommodity | string \| null | Primary commodity mined (e.g., Bituminous, Gold) |
| secondaryCommodity | string \| null | Secondary commodity if applicable |
| district | string \| null | MSHA district name |
| congressionalDistrict | string \| null | US Congressional district |
| nearestTown | string \| null | Nearest town to the mine |
| violations[] | array | Violation records (see below) |
| violations[].violationNumber | string | Unique MSHA violation number |
| violations[].eventNumber | string | Inspection event number linking to inspections |
| violations[].issuedDate | string \| null | Date citation or order was issued |
| violations[].section | string | CFR section cited (e.g., 77.1607(b)) |
| violations[].sectionTitle | string | Human-readable CFR section title |
| violations[].citationOrderSafeguard | string | Type: Citation, Order, or Safeguard |
| violations[].likelihood | string \| null | Likelihood of injury: No Likelihood, Unlikely, Reasonably Likely, Highly Likely |
| violations[].negligence | string \| null | Negligence level: None, Low, Moderate, High, Reckless Disregard |
| violations[].significantAndSubstantial | boolean | Whether the violation is S&S (Significant and Substantial) |
| violations[].proposedPenalty | number | Proposed penalty amount in USD |
| violations[].amountPaid | number | Amount of penalty actually paid |
| violations[].contested | boolean | Whether the operator contested the violation |
| violations[].terminatedDate | string \| null | Date the violation was abated/terminated |
| violations[].specialAssessment | boolean | Whether a special (enhanced) assessment was applied |
| violationSummary.total | number | Total violation count |
| violationSummary.citations | number | Number of citations |
| violationSummary.orders | number | Number of withdrawal orders |
| violationSummary.safeguards | number | Number of safeguards |
| violationSummary.significantAndSubstantial | number | Number of S&S violations |
| violationSummary.totalProposedPenalties | number | Sum of all proposed penalties in USD |
| violationSummary.totalPaid | number | Sum of all penalties paid in USD |
| violationSummary.contested | number | Number of contested violations |
| violationSummary.byNegligence | object | Violation counts grouped by negligence level |
| violationSummary.byLikelihood | object | Violation counts grouped by likelihood |
| inspections[] | array | Inspection records |
| inspections[].eventNumber | string | Unique MSHA inspection event number |
| inspections[].beginDate | string \| null | Inspection start date |
| inspections[].endDate | string \| null | Inspection end date |
| inspections[].inspectionTypeDescription | string | Human-readable inspection type |
| inspections[].violationsFound | number | Number of violations issued during this inspection |
| inspections[].totalProposedPenalties | number | Total penalties from this inspection |
| inspections[].onSiteHours | number | Inspector hours spent on-site |
| inspectionSummary.total | number | Total inspection count |
| inspectionSummary.byType | object | Inspection counts by type |
| inspectionSummary.totalViolationsFound | number | Total violations found across all inspections |
| inspectionSummary.totalOnSiteHours | number | Total inspector hours across all inspections |
| inspectionSummary.mostRecentDate | string \| null | Most recent inspection date |
| accidents[] | array | Accident and injury records |
| accidents[].documentNumber | string | Unique MSHA accident document number |
| accidents[].accidentDate | string \| null | Date of accident |
| accidents[].degreeOfInjuryDescription | string | Severity: Fatality, DAFW, DART, Restricted, First Aid, etc. |
| accidents[].classification | string \| null | Accident classification (e.g., Powered Haulage, Fall of Roof) |
| accidents[].daysLost | number | Workdays lost due to injury |
| accidents[].daysRestricted | number | Days of restricted duty |
| accidents[].narrative | string \| null | MSHA investigator's written accident narrative |
| accidentSummary.fatalities | number | Total fatality count |
| accidentSummary.permanentDisabilities | number | Total permanent disability count |
| accidentSummary.totalDaysLost | number | Total workdays lost across all accidents |
| riskScore | number | Composite risk score (0–200+) |
| riskLevel | string | Critical (≥100), High (≥60), Medium (≥30), Low (<30) |
| riskFactors | string[] | Itemized list of risk factors with point contributions |
| extractedAt | string | ISO 8601 timestamp of the run |

How much does it cost to search MSHA mining safety data?

This actor runs on Apify's standard compute pricing — you pay for the platform resources used during the run. There are no per-record or per-mine charges beyond compute time. A DOL API key is required and is free to register.

| Scenario | Mines | Datasets included | Estimated run time | Estimated cost |
| --- | --- | --- | --- | --- |
| Quick test (dry run) | 1 (sample) | All | < 30 seconds | < $0.01 |
| Single mine full history | 1 | Violations + Inspections + Accidents | 1-2 minutes | ~$0.01 |
| State screen (active coal, WV) | ~50 | All | 5-10 minutes | ~$0.05 |
| Operator portfolio audit | ~100 | All | 15-25 minutes | ~$0.10 |
| National commodity scan | 500+ | Violations only | 45-90 minutes | ~$0.30 |

You can set a maximum spending limit per run to control costs. The actor stops when your budget is reached. For most due diligence and ESG workflows, total monthly spend stays well under $5.

Compare this to commercial mining data providers and safety analytics platforms that charge $500–$2,000/month for subscriptions covering similar MSHA data. This actor uses the same underlying government source at a fraction of the cost.

Search MSHA mining safety data using the API

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/msha-mining-safety").call(run_input={
    "apiKey": "YOUR_DOL_API_KEY",
    "dryRun": False,
    "state": "WV",
    "mineType": "C",
    "mineStatus": "Active",
    "maxResults": 50,
    "dateFrom": "2023-01-01"
})

for mine in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{mine['mineName']} ({mine['state']}) — Risk: {mine['riskLevel']} ({mine['riskScore']}) — S&S violations: {mine['violationSummary']['significantAndSubstantial']}")

JavaScript

import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/msha-mining-safety").call({
    apiKey: "YOUR_DOL_API_KEY",
    dryRun: false,
    operatorName: "PEABODY",
    includeViolations: true,
    includeAccidents: true,
    maxResults: 100
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const mine of items) {
    const fatalities = mine.accidentSummary.fatalities;
    const penalties = mine.violationSummary.totalProposedPenalties;
    console.log(`${mine.mineName} — Fatalities: ${fatalities} — Total penalties: $${penalties.toLocaleString()} — Level: ${mine.riskLevel}`);
}

cURL

# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~msha-mining-safety/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "apiKey": "YOUR_DOL_API_KEY",
    "dryRun": false,
    "commodity": "Gold",
    "mineType": "M",
    "mineStatus": "Active",
    "maxResults": 30
  }'

# Fetch results (replace DATASET_ID from the run response above)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"

How MSHA Mining Safety & Health Data works

Phase 1: Mine search and filter construction

The actor builds a structured filter_object for the DOL API v4 endpoint at https://apiprod.dol.gov/v4/get/MSHA/mines/json. Text-based filters (mine name, operator, controller, commodity) use LIKE matching with %value% wildcards applied against the MSHA database. Exact matches (mine ID, state, mine type, status) use equality operators. Multiple conditions are combined with an and clause. When no filter is provided, the actor fetches mines sorted by curr_empl_cnt descending, returning the largest operations first. Pagination runs in 1,000-record pages with 500ms delays between requests.
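The filter construction described above might look roughly like the sketch below. Treat it as an illustration under stated assumptions: the `field` / `operator` / `value` entry shape, the `and` wrapper key, and every column name other than `mine_id` are guesses made for this example and should be checked against the DOL API documentation. The helper name `build_filter` is hypothetical.

```python
# Hypothetical mapping from actor inputs to MSHA column names.
# Only mine_id is named in this document; the other columns are assumptions.
EXACT_FIELDS = {
    "state": "state",
    "mineType": "coal_metal_ind",
    "mineStatus": "current_mine_status",
}
LIKE_FIELDS = {
    "mineName": "current_mine_name",
    "operatorName": "current_operator_name",
    "controllerName": "current_controller_name",
    "commodity": "primary_sic",
}


def build_filter(params):
    """Build a filter object from actor-style input, or None when no filter is set."""
    # A mine ID takes precedence over all other filters.
    if params.get("mineId"):
        return {"field": "mine_id", "operator": "eq", "value": params["mineId"]}

    conditions = []
    for key, column in EXACT_FIELDS.items():
        if params.get(key):
            conditions.append({"field": column, "operator": "eq", "value": params[key]})
    for key, column in LIKE_FIELDS.items():
        if params.get(key):
            # Partial matches wrap the value in % wildcards.
            conditions.append(
                {"field": column, "operator": "like", "value": f"%{params[key]}%"}
            )

    if not conditions:
        return None  # Caller falls back to sorting by employee count.
    if len(conditions) == 1:
        return conditions[0]
    return {"and": conditions}
```

Returning None for an empty input keeps the "no filter" case explicit, so the caller can decide to request the largest mines first instead of sending an empty filter.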

Phase 2: Batch dataset joining

Once mine IDs are collected, violations, inspections, and accidents are fetched using a batched approach: up to 30 mine IDs per API request using mine_id IN [...] filters. This reduces total API calls by ~30x compared to per-mine lookups. Date filters are injected into each batch query when dateFrom or dateTo are provided. Each dataset (violations, inspections, accidents) is fetched sequentially with 500ms delays between batches to avoid concurrent rate limiting. On HTTP 429 responses, the actor retries with exponential backoff: 4s, 8s, 16s, and 32s before failing gracefully with a warning.
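The batching and retry behaviour can be illustrated with a small sketch. The batch size of 30 and the 4s, 8s, 16s, 32s delays come from the description above; everything else is an assumption made for the example, including the `RateLimited` exception, the injected `fetch` callable, and the choice to make one final attempt after the 32s wait before returning an empty result.

```python
import time

BATCH_SIZE = 30                   # Mine IDs per API request, as described above.
BACKOFF_DELAYS = (4, 8, 16, 32)   # Seconds to wait after each HTTP 429.


class RateLimited(Exception):
    """Raised by the caller-supplied fetch function on an HTTP 429 (hypothetical)."""


def chunk_mine_ids(mine_ids, size=BATCH_SIZE):
    """Split a list of mine IDs into consecutive batches of at most `size`."""
    return [mine_ids[i:i + size] for i in range(0, len(mine_ids), size)]


def fetch_with_backoff(fetch, batch, sleep=time.sleep):
    """Call fetch(batch), retrying on RateLimited with the documented delays.

    Returns an empty list if the request is still rate limited after the
    last delay (assumed meaning of "failing gracefully").
    """
    for delay in BACKOFF_DELAYS:
        try:
            return fetch(batch)
        except RateLimited:
            sleep(delay)
    try:
        return fetch(batch)  # Final attempt after the 32s wait.
    except RateLimited:
        return []
```

Passing `sleep` as a parameter makes the retry logic testable without real waiting; in production the default `time.sleep` applies.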

Phase 3: Record transformation and decoding

Raw API field names (e.g., cit_ord_safe, sig_sub, inj_illness) are mapped to structured camelCase output fields. Coded values are decoded using reference maps: 17 inspection type codes (AAA, E01–E17) map to full descriptions; 10 injury degree codes (1–10) map to severity labels; negligence levels are normalized through a 5-tier map. Summary objects are computed for each mine across all three datasets — violation summaries count by type, negligence, and likelihood; inspection summaries aggregate by type and total on-site hours; accident summaries count by severity tier and total lost days.
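As an illustration of the summary step, the sketch below recomputes a violationSummary from already-decoded violation records. It uses only field names from the documented output; the function name `summarize_violations` is hypothetical, and the actor's own implementation may differ.

```python
from collections import Counter


def summarize_violations(violations):
    """Aggregate decoded violation records into a violationSummary-shaped dict."""
    kinds = Counter(v["citationOrderSafeguard"] for v in violations)
    return {
        "total": len(violations),
        "citations": kinds["Citation"],
        "orders": kinds["Order"],
        "safeguards": kinds["Safeguard"],
        "significantAndSubstantial": sum(
            1 for v in violations if v["significantAndSubstantial"]
        ),
        "totalProposedPenalties": sum(v["proposedPenalty"] for v in violations),
        "totalPaid": sum(v["amountPaid"] for v in violations),
        "contested": sum(1 for v in violations if v["contested"]),
        # Null negligence / likelihood values are skipped rather than counted.
        "byNegligence": dict(Counter(v["negligence"] for v in violations if v["negligence"])),
        "byLikelihood": dict(Counter(v["likelihood"] for v in violations if v["likelihood"])),
    }
```

Applied to the two violations in the output example, this yields one citation, one order, two S&S violations, and $27,872 in proposed penalties, matching the sample violationSummary.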

Phase 4: Risk scoring and output assembly

The risk scoring function applies 12 weighted factors to each mine's summary data. Fatalities carry the highest weight (+40 each, no cap) because a single fatality represents a catastrophic safety failure. Permanent disabilities add +20 each. Withdrawal orders — mandatory cessation orders issued when MSHA finds imminent danger — contribute +10 each up to a cap of 80 points. S&S violations add +3 each up to 60 points. High negligence and reckless disregard violations add +8 each up to 40 points. Penalty thresholds, violation volume, contested counts, days lost, and the violation-to-employee ratio contribute additional factors. The final score determines one of four risk tiers: Critical (≥100), High (≥60), Medium (≥30), or Low (<30). Output is sorted by risk score descending.
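A partial sketch of the scoring arithmetic is shown below. It covers only the five factors whose weights and caps are stated above; the remaining factors (penalty thresholds, violation volume, contested counts, days lost, violation-to-employee ratio) have no published weights here, so they are deliberately left out and the result is a lower bound on the real score. Function names are hypothetical.

```python
def capped(count, points_each, cap=None):
    """Points for one factor: count * points_each, limited to cap when given."""
    points = count * points_each
    return points if cap is None else min(points, cap)


def partial_risk_score(fatalities, permanent_disabilities, orders,
                       ss_violations, high_negligence):
    """Sum of the five factors with documented weights (not the full 12-factor score)."""
    return (
        capped(fatalities, 40)                  # +40 each, no cap
        + capped(permanent_disabilities, 20)    # +20 each, no cap stated
        + capped(orders, 10, cap=80)            # +10 each, cap 80
        + capped(ss_violations, 3, cap=60)      # +3 each, cap 60
        + capped(high_negligence, 8, cap=40)    # +8 each, cap 40
    )


def risk_level(score):
    """Map a score to the four documented tiers."""
    if score >= 100:
        return "Critical"
    if score >= 60:
        return "High"
    if score >= 30:
        return "Medium"
    return "Low"
```

For the mine in the output example (1 withdrawal order, 2 S&S violations, 1 high negligence violation, no fatalities or permanent disabilities), these five factors contribute 10 + 6 + 8 = 24 points; the rest of that record's score comes from factors not modelled in this sketch.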

Tips for best results

  1. Start with a dry run to test your pipeline. The default dryRun: true returns a fully structured sample mine record without consuming any DOL API quota. Use it to verify your downstream pipeline, spreadsheet import, or webhook before running live data.

  2. Register your DOL API key in advance. Key registration at dataportal.dol.gov/registration is instant and free, but first-time API calls sometimes take a few minutes to activate. Test your key with a single mineId lookup before running large batches.

  3. Use date filters for large operators. National operators like Peabody Energy or Arch Resources have hundreds of mines with decades of violation history. A query like operatorName: "PEABODY" without date filters will attempt to fetch all historical records. Add dateFrom: "2022-01-01" to limit to recent activity.

  4. Combine with Company Deep Research to augment MSHA safety profiles with corporate structure, financial data, and ownership history when preparing due diligence reports.

  5. Raise maxViolationsPerMine for litigation work. The default cap of 100 violations per mine truncates full enforcement histories. Set maxViolationsPerMine: 500 or higher when preparing comprehensive citation timelines for legal proceedings.

  6. Schedule monthly runs for compliance monitoring. Use Apify's built-in scheduler to run against your own operatorName or mineId list each month. Compare riskScore and riskLevel trends over time to identify deteriorating safety performance before an MSHA enforcement escalation.

  7. Use includeInspections: false and includeAccidents: false for fast portfolio screens. If you only need violation counts and risk scores across a large mine portfolio, disabling the two optional datasets cuts run time by roughly 60%.

  8. Cross-reference the eventNumber field — violations and inspections share the same eventNumber, so you can match every citation to the specific inspection event that generated it for a complete enforcement timeline.
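The eventNumber cross-reference from the last tip can be sketched as a small grouping step over one mine record. The helper name `violations_by_inspection` and the shape of the returned timeline entries are illustrative choices, not part of the actor's output.

```python
from collections import defaultdict


def violations_by_inspection(mine):
    """Attach violation numbers to the inspection event that produced them."""
    grouped = defaultdict(list)
    for violation in mine.get("violations", []):
        grouped[violation["eventNumber"]].append(violation["violationNumber"])

    timeline = []
    # Sort by start date; inspections with a null beginDate sort first.
    for inspection in sorted(mine.get("inspections", []),
                             key=lambda item: item["beginDate"] or ""):
        timeline.append({
            "eventNumber": inspection["eventNumber"],
            "beginDate": inspection["beginDate"],
            "inspectionType": inspection["inspectionTypeDescription"],
            "violationNumbers": grouped.get(inspection["eventNumber"], []),
        })
    return timeline
```

If per-mine caps or date filters truncate either list, some violations may reference an inspection that is not in the output, so raise maxInspectionsPerMine when a complete timeline matters.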

Combine with other Apify actors

| Actor | How to combine |
| --- | --- |
| Company Deep Research | Run MSHA safety profiles first, then feed the operator or controller name into Company Deep Research for corporate structure, litigation history, and financial details |
| B2B Lead Qualifier | Use MSHA risk scores as an input signal in B2B lead scoring for mining equipment, safety training, and workers' comp insurance sales |
| Trustpilot Review Analyzer | Combine MSHA safety records with employee and contractor reviews of mining operators for a holistic workforce safety picture |
| SEC EDGAR Filing Analyzer | Cross-reference MSHA violation histories with public company disclosures about regulatory risk and enforcement actions |
| Website Contact Scraper | Extract contact information from mining operator websites to reach out following safety audits or M&A screening |
| B2B Lead Gen Suite | Use high-risk mine operators identified by MSHA data as the seed list for outbound safety consulting or insurance sales campaigns |
| WHOIS Domain Lookup | Verify domain ownership of mine operator websites when building contact lists for safety-related outreach |

Limitations

  • Data freshness depends on DOL update cadence — all five MSHA datasets are updated weekly by the DOL, not in real time. There is typically a 7–14 day lag between an MSHA inspection and its appearance in the API.
  • Historical records only; no predictive data — the actor reports what MSHA has documented. It does not predict future violations or accidents.
  • No free-text search across violation narratives — violation records include section codes and titles but not full inspector narratives. Accident records include full narratives but cannot be filtered by text.
  • DOL API rate limits apply — aggressive querying across many mines in parallel will trigger 429 responses. The actor handles this with exponential backoff, but very large runs (500+ mines with full history) may take 60+ minutes.
  • Contractor violations require separate queries — MSHA records violations against both operators and contractors. The actor joins violations by mine ID, which includes operator violations. Contractor-specific breakdowns require additional filtering by violatorType.
  • Coordinate data is incomplete — latitude and longitude are present for most but not all mines in the MSHA database. Some older records have null coordinates.
  • The DOL API occasionally returns HTTP 204 (no content): the actor treats a 204 response as an empty result. A 204 means that the dataset had no matching records for that query, not that the mine does not exist.
  • includeAssessments is defined in the type interface but not yet wired up — the assessed violations dataset join is not fully implemented in the current version. Use the proposedPenalty and amountPaid fields in the violations output for penalty data.
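
Since penalty data currently comes from the violations output, a small post-processing step can total it per mine and break counts out by violatorType. The sketch below is only an illustration: it assumes `proposedPenalty` and `amountPaid` are numeric (or null) and that `violatorType` is a string. The sample values, including the "Operator" and "Contractor" labels, are made up, and the simple difference between proposed and paid amounts ignores any later penalty adjustments.

```python
from collections import Counter


def summarize_penalties(violations):
    """Total proposed and paid penalties and count violations per violatorType."""
    # Missing or null amounts are treated as 0 (an assumption about the data).
    proposed = sum(v.get("proposedPenalty") or 0 for v in violations)
    paid = sum(v.get("amountPaid") or 0 for v in violations)
    by_violator = Counter(v.get("violatorType") or "unknown" for v in violations)
    return {
        "proposedTotal": proposed,
        "paidTotal": paid,
        "proposedMinusPaid": proposed - paid,  # naive difference, not an official balance
        "countByViolatorType": dict(by_violator),
    }


# Made-up sample rows; the violatorType labels are placeholders.
sample = [
    {"proposedPenalty": 500, "amountPaid": 500, "violatorType": "Operator"},
    {"proposedPenalty": 1200, "amountPaid": None, "violatorType": "Contractor"},
    {"proposedPenalty": 300, "amountPaid": 100, "violatorType": "Operator"},
]

print(summarize_penalties(sample))
```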

Integrations

  • Zapier — trigger an MSHA safety lookup whenever a new mining company is added to a CRM or deal pipeline, and push results to a spreadsheet or Slack channel
  • Make — schedule monthly MSHA runs and route high-risk mine profiles (Critical or High) to specific team members via email or project management tools
  • Google Sheets — export mine safety results directly to a shared Google Sheet for ESG reporting, insurance underwriting, or portfolio tracking
  • Apify API — integrate MSHA safety queries into due diligence platforms, compliance dashboards, or risk management systems via REST API
  • Webhooks — receive a POST notification when a run completes, enabling real-time downstream processing of safety profiles in your own application
  • LangChain / LlamaIndex — feed MSHA mine records into an LLM pipeline for natural language safety analysis, ESG report generation, or automated risk narrative writing

Troubleshooting

  • Run returns 0 mines despite valid search criteria — check that your mineType matches the commodity. Gold mines use mineType: "M" (Metal/Non-Metal), not "C" (Coal). Also verify state codes are two uppercase letters with no spaces.

  • Authentication error (401 or 403) — your DOL API key is missing, invalid, or not yet activated. Verify the key at dataportal.dol.gov and wait a few minutes after first registration before retrying.

  • Run times out or takes much longer than expected — large batches with high per-mine limits are the most common cause. Reduce maxViolationsPerMine to 50, disable inspections or accidents with includeInspections: false, or split the search into smaller date-range windows.

  • Violation counts appear lower than expected — the maxViolationsPerMine cap may be truncating results. Increase to 500 or 1,000 for mines with long enforcement histories. Also confirm your dateFrom / dateTo window covers the period you expect.

  • riskScore is 0 despite known violations — the risk score is calculated from the violations, inspections, and accidents fetched in this run. If all three include flags are false, or if dateFrom / dateTo filters exclude historical records, the summary data will be empty and the score will be 0.
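
Several of the problems above can be caught before a run starts. The following pre-flight check is a rough sketch under stated assumptions: the state parameter name (`state` here) is hypothetical because this section does not name it, and the date comparison assumes `dateFrom` and `dateTo` are ISO `YYYY-MM-DD` strings, which compare correctly as text.

```python
import re

VALID_MINE_TYPES = {"C", "M"}  # Coal, Metal/Non-Metal


def check_input(run_input, state_key="state"):
    """Return a list of human-readable problems found in an actor input dict."""
    problems = []

    mine_type = run_input.get("mineType")
    if mine_type and mine_type not in VALID_MINE_TYPES:
        problems.append(f"mineType must be 'C' or 'M', got {mine_type!r}")

    # state_key is a hypothetical parameter name; adjust it to the real input schema.
    state = run_input.get(state_key)
    if state is not None and not re.fullmatch(r"[A-Z]{2}", str(state)):
        problems.append(f"state code must be two uppercase letters, got {state!r}")

    # Assumes ISO date strings, so plain string comparison matches date order.
    date_from = run_input.get("dateFrom")
    date_to = run_input.get("dateTo")
    if date_from and date_to and date_from > date_to:
        problems.append("dateFrom is later than dateTo")

    return problems


bad = {"mineType": "G", "state": "wy ", "dateFrom": "2024-06-01", "dateTo": "2024-01-01"}
for problem in check_input(bad):
    print(problem)
```

An empty list means none of these checks failed; it does not guarantee that the search will match any mines.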

Responsible use

  • This actor only accesses publicly available data published by the US Department of Labor on the DOL Open Data Portal.
  • All MSHA data accessed through this actor is government-produced public information maintained for regulatory transparency.
  • Compliance with the DOL Open Data Portal's terms of service is required.
  • Do not use extracted data in ways that misrepresent mine safety records or violate fair use standards in regulatory, legal, or financial contexts.
  • For guidance on web scraping legality, see Apify's guide.

FAQ

How many mines can I search with MSHA Mining Safety & Health Data in one run? Up to 5,000 mines per run using the maxResults parameter. The DOL API covers 86,000+ mines going back to 1970. For national surveys, consider splitting by state or commodity to keep individual run times manageable.

What is a Significant and Substantial (S&S) violation in MSHA data? An S&S violation is one where an MSHA inspector determines the violation is reasonably likely to cause a reasonably serious injury or illness. S&S violations carry higher penalties and are a primary signal in the risk scoring algorithm. The significantAndSubstantial field in each violation record is a boolean, and violationSummary.significantAndSubstantial gives the total count per mine.

How current is the MSHA data returned by this actor? The DOL updates all five MSHA datasets weekly. Expect a 7–14 day lag between an on-the-ground inspection and its appearance in the API. For the most recent federal enforcement action at a specific mine, cross-check with the MSHA Enforcement Actions website.

Does this actor cover both coal and metal/non-metal mines? Yes. Set mineType: "C" for coal mines (bituminous, lignite, anthracite) or mineType: "M" for metal and non-metal mines (gold, copper, limestone, sand and gravel, potash, etc.). Leave it blank to search across both types.

How does the risk score compare to official MSHA enforcement designations? The risk score is a computed heuristic based on the same underlying data MSHA uses, but it is not an official MSHA designation. MSHA has its own Pattern of Violations (POV) program with different criteria. The risk score in this actor is designed for internal triage and comparative analysis, not regulatory compliance determinations.

Can I use MSHA Mining Safety & Health Data to monitor a competitor's safety record? Yes. Set operatorName or controllerName to a competitor's name and schedule monthly runs. Track changes in riskScore, riskLevel, and violation counts over time. All data returned is public information published by the US government.
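
Tracking changes between scheduled runs can be as simple as joining two result sets on mineId. The sketch below assumes each record exposes `mineId`, `riskScore`, and `riskLevel` at the top level; the sample scores and mine IDs are invented, and mines that appear in only one of the two runs are skipped.

```python
def risk_changes(previous, current):
    """List mines whose riskScore or riskLevel changed between two runs."""
    prev_by_id = {mine["mineId"]: mine for mine in previous}
    changes = []
    for mine in current:
        before = prev_by_id.get(mine["mineId"])
        if before is None:
            continue  # new in this run, nothing to compare against
        delta = mine["riskScore"] - before["riskScore"]
        if delta != 0 or mine.get("riskLevel") != before.get("riskLevel"):
            changes.append({
                "mineId": mine["mineId"],
                "delta": delta,
                "previousLevel": before.get("riskLevel"),
                "currentLevel": mine.get("riskLevel"),
            })
    # Largest increases first, so deteriorating mines surface at the top.
    return sorted(changes, key=lambda change: change["delta"], reverse=True)


# Invented sample data standing in for two monthly runs.
last_month = [
    {"mineId": "0000001", "riskScore": 40, "riskLevel": "High"},
    {"mineId": "0000002", "riskScore": 55, "riskLevel": "High"},
]
this_month = [
    {"mineId": "0000001", "riskScore": 72, "riskLevel": "Critical"},
    {"mineId": "0000002", "riskScore": 55, "riskLevel": "High"},
    {"mineId": "0000003", "riskScore": 10, "riskLevel": "High"},
]

for change in risk_changes(last_month, this_month):
    print(change)
```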

How is this different from using the MSHA public website directly? The MSHA website requires separate searches for each mine, each dataset, and each date range. This actor joins all five datasets automatically, computes risk scores, and delivers structured JSON or CSV ready for analysis. What takes hours of manual data gathering and spreadsheet work runs in minutes.

Is it legal to use MSHA data for insurance underwriting or investment research? MSHA data is public information produced by the US federal government and is not subject to copyright. Using it for insurance underwriting, investment analysis, or due diligence is standard practice in the mining industry. Consult your legal team regarding the specific use of safety data in regulated financial contexts.

Can I get accident narratives for a specific mine? Yes. Accident records include the narrative field, which contains the MSHA investigator's written description of the accident. Set includeAccidents: true (the default) and search by mineId or operatorName to retrieve these narratives.

What happens if the DOL API is down or rate-limits my run? The actor retries failed requests up to 4 times with exponential backoff (4s, 8s, 16s, 32s). If the API remains unavailable after all retries, the actor logs a warning and continues with whatever data was already collected. Individual mine records may have empty violation, inspection, or accident arrays if those specific dataset fetches failed.
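
The retry schedule described above (up to 4 retries at 4s, 8s, 16s, 32s) follows a standard exponential backoff pattern. The sketch below illustrates that pattern only; it is not the actor's source code, and the function and parameter names are hypothetical. The demo passes a fake `sleep` so it records the delays instead of waiting.

```python
import time


def fetch_with_backoff(fetch, retries=4, base_delay=4.0, sleep=time.sleep):
    """Call fetch(); on failure wait base_delay * 2**attempt seconds and retry."""
    for attempt in range(retries + 1):  # one initial attempt plus `retries` retries
        try:
            return fetch()
        except Exception as exc:  # broad on purpose for this sketch
            if attempt == retries:
                # Give up: log a warning and let the caller continue with partial data.
                print(f"warning: giving up after {retries} retries: {exc}")
                return None
            sleep(base_delay * 2 ** attempt)


def always_fails():
    raise ConnectionError("simulated outage")


delays = []
result = fetch_with_backoff(always_fails, sleep=delays.append)
print(delays, result)
```

Returning `None` after the final retry mirrors the behavior described above, where a failed dataset fetch leaves that mine's array empty rather than aborting the whole run.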

Can I schedule this actor to run automatically? Yes. Use Apify's built-in scheduler to run on any interval — daily, weekly, or monthly. Scheduling is configured from the actor's page in the Apify Console without any code.

How do I find a mine's 7-digit MSHA mine ID? Search by mine name, operator, or state first with dryRun: false. Each result includes the mineId field. Alternatively, look up the mine ID on the MSHA Mine Data Retrieval System and use it directly in mineId for subsequent runs.

Help us improve

If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:

  1. Go to Account Settings > Privacy
  2. Enable Share runs with public Actor creators

This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.

Support

Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.
