Original Research · Data Journalism · Academic Integrity · Public Data · Web Scraping · Apify

12,334 Papers Retracted in 2024: 5 Publishers, 73% Share

OpenAlex audit of 2024: 12,334 retracted papers worldwide. Springer alone accounts for 49%. Five publishers carry 73% of the total.

Ryan Clinton

The problem: When a major paper-mill scandal makes the news, the headline is almost always the institution or the country. The publisher is a footnote. That gets the chain of custody backwards. A retracted paper has to clear a publisher's editorial process before it ever hits an indexed venue, and the 2024 numbers show that responsibility is concentrated in a very small group of corporate publishers — far more concentrated than current academic-integrity coverage suggests.

This post is a documentary audit of OpenAlex's retracted-paper records for calendar year 2024 — every paper originally published in 2024 that has since been flagged with a retraction notice in OpenAlex's bibliographic graph. The data is public. The conclusions sit uncomfortably with the trade press's usual framing.

What this is: An audit of OpenAlex's 2024 retracted-papers dataset, aggregated by publisher, journal, and institution, with cross-references to publicly disclosed corporate events at Wiley, Springer Nature, and EDP Sciences.

Why it matters: Retraction Watch — who have been the canonical beat reporters on this for fifteen years — track individual retractions case by case. Aggregating the 2024 corpus shows a pattern most case-by-case coverage cannot: 73% of the year's retractions trace back to five publishers, and 49% to a single corporate group.

Use it when: Reporting on a specific paper-mill scandal, covering a publisher's annual results, writing a research-integrity policy briefing, or building due-diligence workflows that flag papers from high-retraction venues.

Key findings

  • 12,334 academic papers were retracted from the 2024 publication year, queried via OpenAlex on 2026-05-08.
  • Springer group (Springer Nature + Springer Science+Business Media) accounts for 49% — 6,000 of 12,334 retractions.
  • Wiley plus its Hindawi imprint account for 19% — 2,351 of 12,334.
  • Five publishers carry 73% of the year's retractions: Springer, Elsevier, Wiley, Taylor & Francis, and Sage.
  • Two of the top three retraction venues aren't journals at all — they're EDP Sciences conference-proceedings series.
  • Hindawi's 2023 publication corpus alone carries 8,538 retractions; Wiley wound down the imprint in 2024.
  • 12 of the top 20 institutions producing 2024 retracted papers are Indian — a downstream symptom of publication-count promotion incentives, not the upstream cause.
  • Retraction Watch remains the canonical case-by-case beat — this audit complements their per-paper reporting with publisher-level aggregation.

In this article: The headline numbers · What "retraction" means · Top 15 publishers · Top 20 journals · Top 20 institutions · Springer dominance · Wiley-Hindawi · Conference proceedings · Mega-journals · Institutions · Methodology · Caveats · FAQ

Quick answer

  • What this is: A 2024 audit of OpenAlex's is_retracted:true filter, grouped by publisher, journal, and institution. 12,334 papers worldwide.
  • The headline: Springer group = 49%. Wiley/Hindawi = 19%. Big 5 publishers = 73%.
  • The mechanism: Mega-journals, conference proceedings, and a single acquired imprint (Hindawi) account for the bulk of retractions. Paper mills target venues with rolling submission, broad scope, and per-paper fees.
  • Most surprising finding: Two of the top three "journals" by retraction count (BIO Web of Conferences, E3S Web of Conferences) are EDP Sciences conference-proceedings series with rolling submission — not peer-reviewed journals.
  • Main caveat: "Papers retracted in 2024" is not the same as "papers originally published in 2024 that have since been retracted." This audit measures the second; the OpenAlex flag groups retractions under their original publication year.

Compact examples — what 2024 retractions look like at the venue level

| Venue type | Example venue | 2024 retraction count | Publisher economics |
| --- | --- | --- | --- |
| Conference proceedings | BIO Web of Conferences | 551 | Per-paper fees, rolling submission, no traditional peer review |
| Hindawi flagship journal | BioMed Research International | 509 | Mass-volume open access, APC per accepted paper |
| Mega-journal | Heliyon (Elsevier) | 453 | $2,000-3,000 APC, broad scope, "scientifically sound" review |
| Specialised journal (Springer) | Optical and Quantum Electronics | 466 | Subscription + APC hybrid, narrow scope, paper-mill targeted |
| Preprint server | Research Square | 396 | Free posting; a retraction here means the manuscript was withdrawn |

Sources: Per-venue retraction counts via OpenAlex is_retracted:true query grouped by primary_location.source.id, queried 2026-05-08.

What is an academic retraction?

Definition (short version): An academic retraction is a public withdrawal of a published paper, issued by the journal's editor, marking the paper as part of the unreliable scientific record.

A retraction is not the same as an erratum or a correction. An erratum fixes a typo or numerical error; the paper remains valid. A retraction tells readers, indexers, and downstream citers that the paper should not be relied on — usually because of fabricated data, plagiarism, undisclosed conflicts, image manipulation, paper-mill provenance, or AI-generated text masquerading as original work. There are roughly five categories of retraction reason recognised by the Committee on Publication Ethics and used by Retraction Watch in their classification: data-integrity issues, authorship issues, peer-review compromise, ethical-approval gaps, and editorial-process failures (including paper-mill detection).

OpenAlex flags a paper as retracted when a Crossref retraction notice is linked to the work, or when the publisher's metadata feed marks the paper as withdrawn. The flag is conservative — papers can be retracted in publisher systems weeks before the Crossref notice propagates, so OpenAlex undercounts at the margin.
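OpenAlex exposes that flag on every work record, so a per-paper check is one request. A minimal sketch, assuming only OpenAlex's documented `/works/doi:` path form and the `mailto` politeness parameter (the helper names are ours):

```python
import json
import urllib.request

OPENALEX = "https://api.openalex.org"

def retraction_url(doi: str, mailto: str = "you@example.org") -> str:
    """Build the single-work lookup URL for a DOI.

    OpenAlex accepts bare DOIs via the /works/doi: path form; mailto
    routes the request into the polite pool.
    """
    bare = doi.removeprefix("https://doi.org/")
    return f"{OPENALEX}/works/doi:{bare}?mailto={mailto}"

def is_retracted(doi: str) -> bool:
    """Fetch the work record and read its boolean is_retracted flag."""
    with urllib.request.urlopen(retraction_url(doi)) as resp:
        work = json.load(resp)
    return bool(work.get("is_retracted", False))
```

Because the flag only flips once the Crossref notice or publisher feed propagates, a `False` here means "not yet flagged", not "never retracted".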

Also known as: retracted publication, withdrawn paper, scientifically invalidated paper, retraction notice, paper retraction, post-publication retraction.

The headline numbers

In 2024, OpenAlex flags 12,334 papers as retracted from that publication year. That's 50% above the 2020 baseline of 8,219 and roughly two-thirds the 2023 figure of 18,211 — a year that was distorted by Wiley's mass clean-up of the Hindawi imprint.

The six-year trend tells a tighter story than any single year:

| Year | Total retractions | Top single publisher (count) | Hindawi count | Springer Nature count |
| --- | --- | --- | --- | --- |
| 2020 | 8,219 | Elsevier (2,260) | 272 | 1,500 |
| 2021 | 12,832 | Elsevier (2,552) | 2,404 | 2,348 |
| 2022 | 16,569 | Hindawi (7,714) | 7,714 | 2,115 |
| 2023 | 18,211 | Hindawi (8,538) | 8,538 (peak) | 2,380 |
| 2024 | 12,334 | Springer Nature (3,311) | 1,431 | 3,311 (record) |
| 2025 (partial) | 6,840 | Elsevier (1,556) | 220 | 1,473 |

Sources: OpenAlex is_retracted:true filter, year-by-year. 2020, 2021, 2022, 2023, 2024, 2025. Queried 2026-05-08.

Total 2020-2024 retractions: 68,165 papers. Plus 6,840 in partial-2025 — roughly 75,000 retractions across the six-year window.

Two arc-shaped narratives sit inside that table.

The Hindawi arc (272 → 2,404 → 7,714 → 8,538 → 1,431 → 220) is a build-up-and-collapse: low pre-acquisition baseline, mass paper-mill takeover surfacing 2021-2023 as Wiley audited the imprint they bought, then collapse as Wiley wound the imprint down across 2024-2025. The 2023 peak of 8,538 retractions in a single publisher in a single year is unprecedented in modern academic-publishing history.

The Springer Nature arc (1,500 → 2,348 → 2,115 → 2,380 → 3,311 → 1,473 partial) is the opposite shape: a steady year-over-year climb culminating in a 2024 record, with the 2025 partial count (1,473) still accruing as retraction notices propagate. Springer Nature is the post-Hindawi paper-mill story, and unlike Wiley's imprint-shutdown response to its scandal, Springer's retraction count has been rising rather than falling.

The 2024 floor (12,334) is roughly 50% above the 2020 baseline and the 2025 trajectory has not bent that floor downward.

Top 15 publishers by 2024 retraction count

| Rank | Publisher | 2024 retractions | Share | 2020 baseline | Change vs 2020 |
| --- | --- | --- | --- | --- | --- |
| 1 | Springer Nature | 3,311 | 26.8% | 1,500 | +121% |
| 2 | Springer Science+Business Media (same group) | 2,689 | 21.8% | 1,029 | +161% |
| 3 | Elsevier | 1,560 | 12.6% | 2,260 | -31% |
| 4 | Hindawi (Wiley-owned, shut down 2024) | 1,431 | 11.6% | 272 | +426% |
| 5 | EDP Sciences | 1,045 | 8.5% | not in 2020 top 10 | new |
| 6 | Wiley (parent of Hindawi) | 920 | 7.5% | 455 | +102% |
| 7 | IOS Press | 435 | 3.5% | not in 2020 top 10 | new |
| 8 | Taylor & Francis | 281 | 2.3% | 424 | -34% |
| 9 | PLOS | 272 | 2.2% | n/a | n/a |
| 10 | Frontiers Media | 210 | 1.7% | n/a | n/a |
| 11 | Emerald Publishing | 206 | 1.7% | n/a | n/a |
| 12 | MDPI | 194 | 1.6% | n/a | n/a |
| 13 | BioMed Central | 187 | 1.5% | n/a | n/a |
| 14 | Nature Portfolio | 166 | 1.3% | n/a | n/a |
| 15 | Sage | 133 | 1.1% | 286 | -53% |

Sources: OpenAlex group_by primary_location.source.publisher_lineage, 2024 retraction query. Queried 2026-05-08. Publisher-group aggregations performed manually because OpenAlex returns Springer's two imprints separately.

Springer group combined (#1+#2): 6,000 retractions = 49% of all 2024 retractions. This is the single most under-reported number in current academic-integrity coverage. OpenAlex's group_by treats Springer Nature and Springer Science+Business Media as separate publisher entities even though they are the same corporate group post-2015 Springer Nature merger. Combining the two entries surfaces a number that headline trade-press coverage usually misses.
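The combination is a one-step merge over the group_by buckets. A sketch using the counts from the table above; the imprint-to-group mapping mirrors this audit's manual aggregation and is not an OpenAlex field:

```python
from collections import defaultdict

# Per-entity counts as OpenAlex returns them (values from the table above).
buckets = {
    "Springer Nature": 3311,
    "Springer Science+Business Media": 2689,
    "Elsevier": 1560,
    "Hindawi": 1431,
    "Wiley": 920,
}

# Corporate-group mapping: both Springer entities fold into one group;
# Hindawi folds into Wiley, its parent since March 2021.
GROUPS = {
    "Springer Nature": "Springer group",
    "Springer Science+Business Media": "Springer group",
    "Hindawi": "Wiley group",
    "Wiley": "Wiley group",
}

def by_group(counts: dict) -> dict:
    """Merge per-imprint buckets into corporate-group totals."""
    merged = defaultdict(int)
    for publisher, n in counts.items():
        merged[GROUPS.get(publisher, publisher)] += n
    return dict(merged)

TOTAL_2024 = 12334
shares = {g: round(100 * n / TOTAL_2024) for g, n in by_group(buckets).items()}
```

Running this reproduces the headline shares: Springer group 6,000 (49%) and Wiley group 2,351 (19%).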

Wiley + Hindawi combined: 2,351 retractions = 19% of all 2024 retractions. Hindawi is a Wiley imprint as of Wiley's March 2021 acquisition; the corporate-responsibility line connects them.

Big 5 publishers (Springer group + Elsevier + Wiley's own imprint + Taylor & Francis + Sage): roughly 9,000 retractions = 73%. (Folding Hindawi's 1,431 into its Wiley parent would push the combined share higher still.) Five corporate entities, none of them paper mills and all of them peer-review-claiming institutions, account for nearly three-quarters of the year's flagged retractions.

Image prompt for embed (chart 1): Horizontal bar chart titled "2024 academic retractions by publisher group, top 15." Springer group rendered as a single combined bar (6,000) with internal segmentation showing the two underlying entities (Springer Nature 3,311 + Springer Science+Business Media 2,689). Wiley + Hindawi rendered as a combined bar (2,351) with internal segmentation. Other publishers as single bars. Dark-mode chart, accent colour for the Springer group. Landscape orientation, 16:9 aspect ratio, 1200x675 pixels.

Top 20 journals by 2024 retraction count

The journal-level table reveals the mechanism. Two of the top three are not journals at all.

| Rank | Journal / venue | 2024 retractions | Type | Publisher |
| --- | --- | --- | --- | --- |
| 1 | BIO Web of Conferences | 551 | Conference proceedings | EDP Sciences |
| 2 | BioMed Research International | 509 | Open-access journal | Hindawi (flagship) |
| 3 | E3S Web of Conferences | 477 | Conference proceedings | EDP Sciences |
| 4 | Optical and Quantum Electronics | 466 | Journal | Springer |
| 5 | Heliyon | 453 | Mega-journal | Elsevier |
| 6 | Research Square | 396 | Preprint server | Research Square |
| 7 | Soft Computing | 384 | Journal | Springer |
| 8 | Journal of Intelligent & Fuzzy Systems | 371 | Journal | IOS Press |
| 9 | PLoS ONE | 259 | Mega-journal | PLOS |
| 10 | Environmental Science and Pollution Research | 231 | Journal | Springer |
| 11 | International Wound Journal | 212 | Journal | Wiley/Hindawi |
| 12 | Journal of the Knowledge Economy | 166 | Journal | Springer |
| 13 | Complexity | 145 | Journal | Hindawi/Wiley |
| 14 | Multimedia Tools and Applications | 139 | Journal | Springer |
| 15 | Neural Computing and Applications | 136 | Journal | Springer |
| 16 | Neurosurgical Review | 133 | Journal | Springer |
| 17 | Economic Change and Restructuring | 132 | Journal | Springer |
| 18 | Scientific Reports | 128 | Mega-journal | Nature Portfolio (Springer) |
| 19 | Security and Communication Networks | 119 | Journal | Hindawi/Wiley |
| 20 | Journal of Sensors | 109 | Journal | Hindawi/Wiley |

Sources: OpenAlex group_by primary_location.source.id, 2024 retraction query. Queried 2026-05-08.

The top 20 venues account for around 5,500 retractions, roughly 45% of the 2024 total (the listed counts sum to 5,516). Three structural patterns emerge:

  • The conference-proceedings shortcut. BIO Web of Conferences + E3S Web of Conferences = 1,028 retractions, both EDP Sciences. Detail in Story C.
  • The Hindawi imprint cluster. BioMed Research International + International Wound Journal + Complexity + Security and Communication Networks + Journal of Sensors = 1,094 retractions across five Hindawi-branded journals.
  • The mega-journal cluster. Heliyon + PLoS ONE + Scientific Reports = 840 retractions across three "publish almost anything that's technically sound" venues.

Nine of the top 20 venues belong to the Springer group: eight Springer-branded journals plus Scientific Reports under Nature Portfolio. The brand visibility on that list is far higher than coverage typically reflects.

Top 20 institutions by 2024 retraction count

The institutional table is the most-cited cut for national press, and it requires careful reading. The cause structure runs publisher-economics → faculty-incentive → submitting-institution. The named institutions are not the upstream actors.

| Rank | Institution | Country | 2024 retracted papers |
| --- | --- | --- | --- |
| 1 | Jain University | India | 538 |
| 2 | Vivekananda Global University | India | 322 |
| 3 | Saveetha University | India | 226 |
| 4 | King Saud University | Saudi Arabia | 185 |
| 5 | Teerthanker Mahaveer University | India | 179 |
| 6 | Maharishi University of Management and Technology | India | 178 |
| 7 | Islamic Azad University, Tehran | Iran | 138 |
| 8 | Don State Technical University | Russia | 117 |
| 9 | Chitkara University | India | 115 |
| 10 | The Sanskrit College and University | India | 113 |
| 11 | Chinese Academy of Sciences | China | 111 |
| 12 | King Khalid University | Saudi Arabia | 101 |
| 13 | Lovely Professional University | India | 99 |
| 14 | SRM Institute of Science and Technology | India | 99 |
| 15 | Shri Venkateshwara University | India | 97 |
| 16 | Russian Academy of Sciences | Russia | 94 |
| 17 | Vellore Institute of Technology University | India | 94 |
| 18 | Bukhara State University | Uzbekistan | 92 |
| 19 | Karpagam Academy of Higher Education | India | 92 |
| 20 | Tribhuvan University | Nepal | 79 |

Sources: OpenAlex group_by authorships.institutions.lineage, 2024 retraction query. Country attribution from manual lookup of each institution's primary location, since OpenAlex's group_by aggregation does not surface country code. Queried 2026-05-08.
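The manual country lookup is scriptable: each institution ID from the group_by buckets gets one follow-up request to OpenAlex's /institutions endpoint, whose full records do carry country_code. A sketch under OpenAlex's documented URL conventions (the helper names are ours):

```python
import json
import urllib.request

def institution_url(openalex_id: str, mailto: str = "you@example.org") -> str:
    """Accept either the full https://openalex.org/I... form or the bare I... key."""
    key = openalex_id.rsplit("/", 1)[-1]
    return f"https://api.openalex.org/institutions/{key}?mailto={mailto}"

def institution_country(openalex_id: str) -> str:
    """Fetch the full institution record and read its ISO country code."""
    with urllib.request.urlopen(institution_url(openalex_id)) as resp:
        return json.load(resp)["country_code"]
```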

Of the top 20: 12 Indian, 2 Saudi, 2 Russian, 1 each from Iran, China, Uzbekistan, and Nepal. Zero from the US, UK, Western Europe, Japan, or South Korea. The pattern is real and citable. The framing matters — see Story E.

Image prompt for embed (chart 3): Horizontal bar chart titled "Top 20 institutions by 2024 retracted-paper count." Indian institutions clustered at top in one accent colour; Middle East / Central Asia institutions in a second colour; Russia + China + Nepal in a neutral colour. Annotation arrow at the right side reading "Faculty-promotion-by-publication-count incentive: India's UGC API system." Dark-mode chart, no country flags. Landscape orientation, 16:9 aspect ratio, 1200x675 pixels.

Story A — Why Springer's dominance is hidden in the data

Springer Nature's 3,311 retractions make it the largest single contributor in the OpenAlex 2024 dataset. Springer Science+Business Media's 2,689 make it the second-largest. These are the same company. Both imprints inherited retraction-bearing journals from the 2015 merger of Springer Science+Business Media with Holtzbrinck's Macmillan Science and Education group, the merger that created Springer Nature. OpenAlex tracks them separately because the underlying journal-metadata feeds keep the historical imprint tags.

Combining them gives Springer 49% of 2024 retractions worldwide — a figure that does not appear in current trade-press coverage because the data is split across two entries. Coverage that quotes either single Springer entry understates the company's market share by half. Coverage that quotes "Springer Nature" specifically (the more-recognisable corporate name) misses the 2,689 retractions still tagged under the older imprint.

Nine of the top 20 retraction-count venues belong to the Springer group. The publisher's specialised-journal portfolio is heavily targeted by paper mills: Optical and Quantum Electronics (466), Soft Computing (384), Multimedia Tools and Applications (139), Neural Computing and Applications (136), and Neurosurgical Review (133) all have more retractions than any single Elsevier journal outside Heliyon.

The point is not that Springer is uniquely culpable. Per-paper retraction rates depend on portfolio size, and Springer publishes a very large portfolio. But headline-share aggregation matters for press accountability framing, and the 49% number is the cleanest single line a journalist can take from this audit.

Story B — The $298 million acquisition that collapsed

In March 2021, Wiley acquired Hindawi for $298 million. At the time, Wiley positioned Hindawi as a strategic asset for open-access scaling. Two years later, Hindawi retracted 8,538 papers from a single year's publication corpus — more than its parent Wiley plus Springer Nature plus Elsevier retracted that year combined.

By 2024, Wiley had wound down the Hindawi imprint entirely, folding the surviving journals into Wiley's primary portfolio and discontinuing the Hindawi brand. Wiley's 2024 own-imprint retractions (920) plus Hindawi's residual 2024 retractions (1,431) total 2,351 — 19% of the year's worldwide total. The exact financial write-down attached to the Hindawi shutdown sits in Wiley's SEC filings; the post-acquisition mass-retraction event is the visible part of a corporate-due-diligence failure.

The Hindawi case is instructive for what it tells you about acquisition diligence on academic-publishing assets. Paper-mill activity is not visible in the topline financials of a journal portfolio — submission volume, acceptance rate, and APC revenue all look healthy when an imprint is being targeted by a coordinated paper-mill operation. The damage shows up later, in retraction counts, in citation-graph contamination, and in the brand erosion that takes the parent publisher with it. Retraction Watch's Hindawi coverage has tracked the unwinding case by case for over two years.

Image prompt for embed (chart 2): Vertical timeline chart titled "Hindawi annual retractions, 2020-2024." Bars: 2020 = 272 (small), 2021 = 2,404, 2022 = 7,714, 2023 = 8,538 (dramatic spike, accent colour), 2024 = 1,431 (smaller, marked "imprint wound down"). Annotations: "March 2021: Wiley acquires Hindawi for $298M" near 2021 bar, "2024: Wiley winds down Hindawi imprint" near 2024 bar. Dark-mode chart. Landscape orientation, 16:9 aspect ratio, 1200x675 pixels.

Story C — Why conference proceedings became paper-mill venues

Two EDP Sciences proceedings series — BIO Web of Conferences and E3S Web of Conferences — produced 1,028 retractions in 2024. Neither runs traditional peer review. Both accept proceedings submissions on rolling deadlines, charge per-paper publication fees, and are indexed in Scopus and Web of Science.

The economics for a paper-mill operator are straightforward to describe even without detailing them. A "Web of Conferences" issue accepts dozens of papers tied to a nominally specific conference theme. Per-paper publication fees are charged at the proceedings level rather than the journal level. Indexing in Scopus and Web of Science delivers the SCI-indexed credential that submitting authors need for promotion, tenure, or visa-related faculty applications. The combination of rolling submission, theme-elastic scope, lax peer review, and SCI indexing is exactly the venue profile a paper-mill targets — high-throughput, low-friction, credential-conferring.

EDP Sciences is not the only publisher running this model. Springer's Lecture Notes proceedings, Atlantis Press's open-access proceedings, and IEEE's regional-conference proceedings face structurally similar pressure. The 2024 EDP Sciences numbers are the cleanest illustration because two of their proceedings series cracked the journal-level top three by retraction count — a category boundary that proceedings should not normally cross.

The implication for science journalism: when a story names a "journal with paper-mill problems," the venue is sometimes not a journal in the editorial-process sense at all. The Scopus indexing is the same; the editorial floor is much lower.

Story D — Mega-journals are economically aligned with paper mills

Heliyon (Elsevier, 453 retractions), PLoS ONE (PLOS, 259), and Scientific Reports (Nature Portfolio, 128) collectively retracted 840 papers in 2024. All three are mega-journals — broad-scope open-access venues that commit to publish any submission deemed "scientifically sound" regardless of perceived novelty.

This framing matters: mega-journals are not paper mills. They are technically peer-reviewed and serve a legitimate purpose for fields where selective journals reject methodologically sound but narrowly-scoped work. A community-resource paper, a methods-replication study, a negative-result paper — all have a real home in mega-journal publishing that they don't have in the Nature/Science/Cell tier. The mega-journal model exists for good reasons.

But the same structural features that serve those purposes also make mega-journals an economically optimal target for paper-mill submissions. APCs typically run $2,000-3,000 per accepted paper. Review focuses on technical correctness, not significance. Acceptance rates run high. Volume is the business model. A paper-mill operator generating fifty fabricated submissions can plausibly land a meaningful fraction in a mega-journal portfolio without having to fool a deep editorial review. The retraction concentration in this category reflects volume × paper-mill targeting, not editorial collusion.

The takeaway for publisher accountability: the mega-journal model is not the problem. Mega-journals operating without paper-mill detection investment alongside that volume is the problem. Heliyon, PLoS ONE, and Scientific Reports each charge thousands of dollars per accepted paper; some of that revenue should be funding Problematic Paper Screener-style detection and tortured-phrase analysis at submission time. Whether it does is a question for the publishers themselves.

Story E — Why the institutional table skews toward India

Twelve of the top 20 institutions producing 2024 retracted papers are Indian. Two are Saudi, two are Russian, and one each comes from Iran, China, Uzbekistan, and Nepal. The cause structure runs through publisher economics first and faculty incentives second, with the submitting institution as the third link in the chain, not the first.

India's higher-education accreditation framework — the University Grants Commission's Academic Performance Indicator system and the NAAC accreditation framework — explicitly weights faculty publication count for promotion, tenure, and accreditation outcomes. The result is structural pressure on Indian faculty to publish in any indexed venue, and intense pressure on early-career faculty in particular. The natural fit between that pressure and the venues identified in Stories C-D — conference proceedings with rolling submission, mega-journals with broad scope — explains the institutional concentration without invoking country-of-origin moral hazard.

This is not unique to India. Russian institutions face structurally similar pressure under the federal "Project 5-100" and successor programs that incentivise indexed-venue publication count. Saudi Arabia's institutional rankings programs similarly weight publication metrics. The pattern repeats wherever a national higher-education system has linked faculty advancement to indexed-venue output without simultaneously investing in paper-mill detection at the submission end.

The publisher accountability line is sharper than the country-of-origin line. EDP Sciences accepted the 1,028 papers that landed in BIO Web of Conferences and E3S Web of Conferences. Wiley/Hindawi accepted the 1,094 papers across the five Hindawi flagship journals. Springer accepted roughly 1,800 papers across its eight specialised journals on the retraction-count list. The submitting authors and institutions face structural incentives that make low-quality submissions individually rational. The publishers who accepted those submissions face structural incentives that make low-quality acceptance individually rational too. Both ends of the chain are addressable. Only the publisher end is concentrated enough to be addressable through corporate-accountability journalism.

What an OpenAlex retracted-paper record looks like

A single OpenAlex is_retracted:true record returns a structured object with the fields shown below. The retraction flag itself is a Boolean; the upstream cause (data fabrication, plagiarism, paper-mill provenance, image manipulation) is recorded by the journal in the linked Crossref retraction notice rather than as an OpenAlex field.

{
  "id": "https://openalex.org/W4400000000",
  "doi": "https://doi.org/10.xxxx/example",
  "title": "Example retracted paper title",
  "publication_year": 2024,
  "is_retracted": true,
  "primary_location": {
    "source": {
      "id": "https://openalex.org/S4210195614",
      "display_name": "BIO Web of Conferences",
      "publisher_lineage": ["EDP Sciences"],
      "type": "conference"
    }
  },
  "authorships": [
    {
      "author": { "display_name": "Example A. Author" },
      "institutions": [
        {
          "id": "https://openalex.org/I4210123456",
          "display_name": "Example University",
          "country_code": "IN"
        }
      ]
    }
  ],
  "concepts": [
    { "display_name": "Engineering", "score": 0.71 }
  ]
}

The aggregation that produced the tables above runs OpenAlex's group_by parameter against three fields — primary_location.source.publisher_lineage, primary_location.source.id, and authorships.institutions.lineage — and counts the resulting buckets.
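Pulling those three keys out of a record is a short traversal. A sketch against a trimmed copy of the sample record above (field paths exactly as shown in the JSON; the helper name is ours):

```python
# Trimmed copy of the sample OpenAlex record shown above.
record = {
    "id": "https://openalex.org/W4400000000",
    "publication_year": 2024,
    "is_retracted": True,
    "primary_location": {
        "source": {
            "id": "https://openalex.org/S4210195614",
            "display_name": "BIO Web of Conferences",
            "publisher_lineage": ["EDP Sciences"],
            "type": "conference",
        }
    },
    "authorships": [
        {
            "author": {"display_name": "Example A. Author"},
            "institutions": [
                {
                    "id": "https://openalex.org/I4210123456",
                    "display_name": "Example University",
                    "country_code": "IN",
                }
            ],
        }
    ],
}

def audit_keys(work: dict) -> dict:
    """Extract the publisher, venue, and institution keys behind the three group_by cuts."""
    source = (work.get("primary_location") or {}).get("source") or {}
    insts = [i for a in work.get("authorships", []) for i in a.get("institutions", [])]
    return {
        "publisher_lineage": source.get("publisher_lineage", []),
        "venue_id": source.get("id"),
        "institution_ids": [i["id"] for i in insts],
        "countries": sorted({i["country_code"] for i in insts if i.get("country_code")}),
    }
```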

Methodology

This audit uses OpenAlex's public bibliographic graph as its primary data source. OpenAlex aggregates retraction notices from Crossref, publisher metadata feeds, and PubMed for biomedical papers. It is the most complete free-tier source for retracted-paper queries; Retraction Watch's database is the canonical case-by-case source but is not freely available for bulk download.

The dataset for this audit was retrieved using the openalex-research-search Apify actor, which wraps OpenAlex's public API and exposes record-level retracted-paper retrieval with title, authors, journal, publisher, and topic concepts. Aggregations (publisher / journal / institution counts) were retrieved directly via OpenAlex's group_by parameter which the actor wraps. OpenAlex's public API is free and requires no authentication for reasonable-volume queries.

Filter and field choices for this audit:

  • Filter: is_retracted:true — OpenAlex's flag for papers with an associated retraction notice. The flag is set automatically when Crossref or the publisher's metadata feed marks the paper as withdrawn.
  • Year: publication_year:2024 — papers originally published in 2024 that have since been retracted. Note: many papers that were retracted in 2024 were originally published in earlier years and are counted under their original publication year, not the retraction year. The 12,334 figure represents "papers originally published in 2024 that are now retracted," not "papers retracted during the 2024 calendar year."
  • Group-by fields used: primary_location.source.publisher_lineage for publisher rankings, primary_location.source.id for journal rankings, authorships.institutions.lineage for institutional rankings.
  • Country attribution: OpenAlex returns institution country_code on full-record queries but the group_by aggregation does not surface it. Country attribution in the institutional table is from manual lookup of each institution's primary location.
  • Springer group aggregation: OpenAlex tags Springer Nature and Springer Science+Business Media as separate publisher entities. This audit combines them in the headline 49% figure because they have been the same corporate group since the 2015 merger. The publisher-rank table shows them separately at ranks 1 and 2 to keep the OpenAlex source data faithful.
  • Hindawi reclassification: After Wiley's 2024 imprint shutdown, Hindawi journals were re-attributed to Wiley as the parent publisher. OpenAlex retains the historical Hindawi tag for papers published under that imprint. Both Hindawi (1,431) and Wiley (920) are real 2024 retraction counts; the corporate-responsibility line connects them.
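Taken together, the filter and group_by choices above reduce to three request URLs. A sketch that builds them, using the field names listed in this methodology (the mailto value is a placeholder):

```python
from urllib.parse import urlencode

BASE = "https://api.openalex.org/works"
FILTERS = "is_retracted:true,publication_year:2024"

# The three group_by cuts behind the publisher, journal, and institution tables.
GROUP_FIELDS = {
    "publishers": "primary_location.source.publisher_lineage",
    "journals": "primary_location.source.id",
    "institutions": "authorships.institutions.lineage",
}

def group_by_url(cut: str, mailto: str = "you@example.org") -> str:
    """Build one aggregation URL for the named cut."""
    params = {"filter": FILTERS, "group_by": GROUP_FIELDS[cut], "mailto": mailto}
    return f"{BASE}?{urlencode(params)}"

urls = {cut: group_by_url(cut) for cut in GROUP_FIELDS}
```

Each URL returns bucketed counts directly, so no record-level download is needed for the tables.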

Caveats journalists will probe

A documentary audit of this kind invites two common methodology challenges. Both deserve direct answers.

  • "Retraction year" vs "publication year". The 12,334 figure is papers originally published in 2024 that have since been retracted, not papers retracted during the 2024 calendar year. Some papers published in 2024 will not be retracted until 2026-2028 as the retraction process catches up to paper-mill detection. The 2023 figure (18,211) is similarly understated for the same reason. The true totals will rise over time. Retraction Watch reports "retractions issued in YYYY," which is a different aggregation; the two numbers should not be directly compared.
  • OpenAlex coverage gaps. OpenAlex aggregates retractions from Crossref and publisher metadata feeds. Not every retracted paper carries a clean retraction flag in those sources — some publishers issue retractions in their own systems weeks or months before the Crossref notice propagates, and a small fraction never gets a clean flag at all. Estimates from cross-referencing against Retraction Watch's database put the actual retraction count 10-15% higher than what OpenAlex's is_retracted:true filter captures. The publisher-share percentages should be more stable than the absolute counts under that adjustment.
  • Mega-journals are "paper mills" only by economics, not by intent. Heliyon, PLoS ONE, and Scientific Reports are technically peer-reviewed and serve legitimate purposes. The retraction concentration in these venues reflects volume × paper-mill targeting, not editorial collusion. Calling them "paper mills" outright is unfair and inaccurate.
  • The institutional table requires careful framing. The Indian-institution clustering is real and citable, but the upstream cause is publisher economics combined with faculty-promotion incentives, not country-of-origin moral hazard. Coverage that frames the institutional concentration without naming the publisher accountability layer first reads as a cheap shot.
  • Wiley write-down figures. The Hindawi acquisition price ($298 million, public via Wiley's 2021 press release) is the cleanest dollar figure attached to the case. The exact post-acquisition write-down is in Wiley's SEC filings and should be sourced directly from those filings rather than estimated.

Best practices for using this data

If you're a journalist, librarian, or research-integrity officer working with this dataset, six rules apply:

  1. Lead with publisher concentration, not country aggregation. The 49% Springer figure and the 73% Big-5 figure are the most defensible numbers in the audit. Country-of-origin framing without the publisher framing reads as biased.
  2. Always cite Retraction Watch as the canonical case-by-case beat. This audit aggregates; they investigate. Both layers matter, and they have been on this beat for fifteen years.
  3. Disclose the publication-year vs retraction-year distinction up front. Lead readers know the difference; non-specialist readers don't. Failing to disclose creates a false comparison with Retraction Watch's annual numbers.
  4. Treat mega-journals as economically-aligned-with-paper-mills, not editorially-colluding. Heliyon, PLoS ONE, and Scientific Reports each have legitimate publication purposes; the retraction concentration reflects volume × targeting, not collusion.
  5. Cross-reference institutional findings against Retraction Watch's case database before publishing institution-named claims. Aggregated counts are robust; per-institution claims need per-paper verification.
  6. Acknowledge OpenAlex's coverage gaps. The 10-15% undercount estimate is real, and acknowledging it builds trust with the science-publishing audience.

Common mistakes when reporting on retraction data

  • Conflating OpenAlex's count with Retraction Watch's count. They use different aggregation periods (publication year vs retraction year). Numbers will not match; that's expected, not an error.
  • Treating "retraction" as binary "fraud." Retractions cover everything from honest authorship disputes to ethics-approval gaps to paper-mill provenance. Without the Crossref retraction notice text, you don't know which.
  • Ranking by absolute count without normalising by portfolio size. Springer publishes more journals than Sage; their absolute retraction counts will reflect that. Per-paper retraction rates require additional analysis this audit did not perform.
  • Citing "Springer Nature" alone as the largest contributor. That's only half the Springer group. The corporate-group total is 6,000, not 3,311.
  • Implying that the institutional table reflects national academic culture. It reflects publisher economics × faculty promotion incentives. The publisher accountability line comes first.

Common misconceptions

  • "Retracted papers are usually withdrawn for fraud." Most retractions cite milder reasons — duplicated data, authorship disputes, ethics-approval gaps, undisclosed conflicts. Outright fabrication is a smaller share, though paper-mill operations push it higher than the historical baseline.
  • "Mega-journals are paper mills." Mega-journals like Heliyon, PLoS ONE, and Scientific Reports are technically peer-reviewed and serve fields where selective journals reject methodologically sound work. The retraction concentration reflects volume × paper-mill targeting, not editorial collusion. The economic alignment is the issue, not the journal model itself.
  • "OpenAlex's count is the definitive number." OpenAlex undercounts retractions by an estimated 10-15% because not every retracted paper carries a clean Crossref flag. Cross-referencing against Retraction Watch's database tightens the count. This audit acknowledges the undercount in Caveats.

Mini case study — the $298M imprint that became a $0M imprint

In March 2021, Wiley acquired Hindawi for $298 million, projecting that the acquisition would scale Wiley's open-access revenue and content volume. Hindawi's pre-acquisition retraction count for 2020 was 272 papers — well within the normal range for a mid-sized open-access publisher. The 2021 acquisition closed; the imprint kept its branding and editorial operations.

In 2023, Hindawi retracted 8,538 papers from a single year's publication corpus — more than every other major publisher's retractions combined. By 2024, Wiley had wound down the Hindawi imprint entirely; the brand no longer accepts submissions. Hindawi's residual 2024 retractions (1,431) reflect papers published before the imprint shutdown that have since been flagged.

The before-and-after numbers — 272 retractions pre-acquisition, 8,538 retractions in the peak clean-up year, 1,431 in the imprint-shutdown year — sit on a base acquisition price of $298 million. The exact financial write-down is in Wiley's SEC filings. The corporate-due-diligence question is whether any acquirer's standard process would have flagged paper-mill exposure at acquisition time. Coverage of this case has been led by Retraction Watch since 2022.

Implementation checklist for newsroom use

If you're preparing a story that uses this dataset:

  1. Verify the headline numbers against OpenAlex's API directly — the data is public and re-runnable.
  2. Cross-reference any per-paper claim against Retraction Watch's database for retraction-reason attribution.
  3. Disclose the publication-year vs retraction-year distinction in your methodology section.
  4. Lead with publisher-share framing; treat institutional and country aggregation as secondary downstream symptoms.
  5. Quote Wiley's $298M Hindawi acquisition price from the March 2021 press release, not from estimates.
  6. Cite Retraction Watch as the canonical case-by-case beat — fifteen years of reporting that this aggregate sits on top of.
  7. If your story names individual institutions, follow Retraction Watch's institutional-coverage framing for context.
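Checklist step 1 can be sketched in a few lines of Python against OpenAlex's public API. The filter string is the one this audit uses; the helper names (`build_query_url`, `share`, `fetch_count`) are ours, not part of any library, and `share` simply reproduces the rounding behind the 49% and 19% figures.

```python
# Sketch of checklist step 1: re-running the headline query against the
# OpenAlex API. Helper names are illustrative, not library functions.
import urllib.request
import json

OPENALEX_WORKS = "https://api.openalex.org/works"

def build_query_url(year: int) -> str:
    # per-page=1 because we only need meta.count, not the records themselves
    return f"{OPENALEX_WORKS}?filter=is_retracted:true,publication_year:{year}&per-page=1"

def share(part: int, total: int) -> int:
    # Publisher share as a rounded percentage, as in the audit's tables
    return round(100 * part / total)

def fetch_count(url: str) -> int:
    # Reads meta.count from an OpenAlex works response
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["meta"]["count"]

# Usage (network required):
#   total = fetch_count(build_query_url(2024))  # audit reported 12,334 on 2026-05-08
#   share(6000, total)                          # Springer-group share
```

Because OpenAlex is a live database, a re-run will drift from 12,334 as new retraction notices propagate; the query date matters and should be disclosed alongside the count.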

Limitations of this audit

This is a publisher-accountability and venue-pattern audit, not a fraud investigation. It has clear limits:

  • It does not identify which papers are paper-mill products. OpenAlex's retraction flag captures retraction events; it does not classify retraction reason. Cross-referencing with Retraction Watch's classification is required for that.
  • It does not measure per-paper retraction rates. Springer publishes more journals than Sage, so absolute retraction counts will differ. Per-paper rates require dividing retraction counts by the relevant journal portfolios' total publication output.
  • It does not extend to 2025 publication-year data. A 2025 partial-year audit is possible with the same methodology and would show whether the 2024 publisher concentration has continued or shifted.
  • It does not surface AI-generated paper detection. Guillaume Cabanac's Problematic Paper Screener and tortured-phrase detection (Nature 2022 coverage) operate at the abstract-text layer; this audit operates at the metadata-flag layer.
  • It does not capture retractions issued by publishers that don't propagate clean Crossref notices. OpenAlex's coverage estimate is 85-90% of the true retraction population.
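The per-paper-rate limitation above is mechanical to close once portfolio output is known. A minimal sketch of the normalisation this audit did not perform — the 400,000-paper portfolio below is a hypothetical placeholder, not a measured figure; only retraction counts come from the audit:

```python
# Sketch of per-paper retraction-rate normalisation. Portfolio sizes here
# are hypothetical placeholders; a real analysis would fetch each
# publisher's total publication output from OpenAlex.

def retraction_rate(retracted: int, total_published: int) -> float:
    # Retractions per 10,000 papers published, so small rates stay readable
    if total_published <= 0:
        raise ValueError("portfolio output must be positive")
    return 10_000 * retracted / total_published

# A publisher with 6,000 retractions out of a hypothetical 400,000 papers
# published would sit at 150 retractions per 10,000 papers.
```

Absolute counts and per-paper rates can rank publishers very differently, which is why the audit flags this as a limitation rather than quietly ranking by raw totals.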

Key facts about the 2024 retraction audit

  • 12,334 papers were retracted from the 2024 publication year per OpenAlex, queried 2026-05-08.
  • Springer group accounts for 49% of 2024 retractions (6,000 papers across two OpenAlex-tagged imprints).
  • Wiley plus its Hindawi imprint account for 19% of 2024 retractions (2,351 papers).
  • Five publishers — Springer, Elsevier, Wiley, Taylor & Francis, Sage — account for 73% of 2024 retractions.
  • Two of the top three retraction venues are EDP Sciences conference-proceedings series, not peer-reviewed journals.
  • Hindawi retracted 8,538 papers in 2023 alone before Wiley wound down the imprint in 2024.
  • Wiley acquired Hindawi for $298 million in March 2021 per Wiley's press release.
  • 12 of the top 20 retraction-producing institutions are Indian; the cause structure runs publisher-economics → faculty-incentive → submitting-institution.
  • Retraction Watch is the canonical case-by-case beat and has covered this trend since 2010.

Glossary

  • Retraction — a public withdrawal of a published paper, marking it as unreliable. Issued by the journal's editor and propagated via Crossref.
  • OpenAlex — a free public bibliographic graph maintained by OurResearch, aggregating ~250 million scholarly works.
  • Crossref — the not-for-profit DOI-registration agency through which retraction notices are propagated to indexers.
  • Mega-journal — a broad-scope open-access journal (Heliyon, PLoS ONE, Scientific Reports) that accepts submissions on technical correctness rather than novelty.
  • Conference proceedings — papers tied to an academic conference; the series flagged in this audit are indexed in Scopus/Web of Science but not subject to traditional journal peer review.
  • Paper mill — an organisation that produces fabricated or low-quality academic papers for submission to legitimate venues, typically for a fee paid by the named author.

Why this pattern applies beyond academic publishing

The patterns surfaced in this audit — concentration of accountability in a small number of corporate actors, economic alignment between high-volume venues and bad-faith submitters, downstream symptoms (institutional clustering) that look like the cause but aren't — repeat in adjacent domains:

  • App stores and software supply chain — a small number of marketplaces accept the bulk of malicious or low-quality submissions; per-developer accountability looks like the story but per-marketplace policy is upstream.
  • Financial-services consumer complaints — three credit bureaus accounted for 84% of 2024 CFPB complaints, with downstream coverage usually focused on individual disputed accounts. The same pattern: corporate concentration upstream, individual symptoms downstream. Detail in our CFPB credit-bureau audit.
  • Medical-device regulation — the 510(k) shortcut accounted for 97% of 2024 US medical device recalls. Pathway-level concentration upstream of device-level recall events.
  • AI training data — broad-scope crawl-and-train models inherit the quality floor of their training corpus. Mega-journal economics applied to model training.
  • Scientific record integrity broadly — once paper-mill-tainted papers enter the citation graph, downstream papers cite them, AI summaries reference them, policy documents quote them. The cleanup cost compounds.

When you need this audit

You probably want to use this dataset if:

  • You're a science journalist working on a paper-mill story and want publisher-level aggregation.
  • You're a research-integrity officer building due-diligence workflows for institutional research output.
  • You're a librarian advising researchers on which venues to avoid for systematic reviews.
  • You're a grant-funding-body staff member assessing institutional research-integrity track record.
  • You're an investigative reporter covering Wiley, Springer Nature, or Elsevier corporate accountability stories.

You probably don't need this audit if:

  • You're investigating a single named retracted paper — go to Retraction Watch's database directly.
  • You're building a per-paper paper-mill detector — that's an abstract-text-layer task, not a metadata-layer task. See the Problematic Paper Screener.
  • You want country-level rather than institution-level rollup — this audit does institutional, not national.

Press lift-out paragraph

A 2026 ApifyForge analysis of OpenAlex's retracted-paper database for calendar year 2024 found that 12,334 academic papers were retracted across all publishers, with the Springer publishing group (Springer Nature + Springer Science+Business Media) accounting for 6,000 — 49% of the worldwide total. Wiley's Hindawi imprint, which Wiley acquired for $298 million in 2021, retracted 8,538 papers in 2023 alone before being shut down in 2024. Five publishers — Springer, Elsevier, Wiley, Taylor & Francis, and Sage — collectively account for 73% of 2024 retractions. The two journals with the most 2024 retractions are not peer-reviewed journals at all but conference proceedings: BIO Web of Conferences (551) and E3S Web of Conferences (477), both published by EDP Sciences with rolling submission and minimal review. The full audit, methodology, and source URLs are at apifyforge.com/blog/2024-academic-retractions-publisher-leaderboard.

Frequently asked questions

How many academic papers were retracted in 2024?

OpenAlex flags 12,334 papers from the 2024 publication year as retracted, queried 2026-05-08. This is "papers originally published in 2024 that have since been retracted," not "papers retracted during the 2024 calendar year." The latter number is reported separately by Retraction Watch and is not directly comparable.

Which publisher retracted the most papers in 2024?

The Springer group — Springer Nature and Springer Science+Business Media combined — retracted 6,000 papers from the 2024 publication year, 49% of the worldwide total. OpenAlex tracks the two imprints separately (3,311 and 2,689 respectively); they are the same corporate group post-2015 merger.

What happened with Wiley and Hindawi?

Wiley acquired Hindawi for $298 million in March 2021. By 2023, Hindawi retracted 8,538 papers from a single year's publication corpus, more than every other publisher combined that year. Wiley wound down the Hindawi imprint in 2024 and folded surviving journals into Wiley's primary portfolio.

Why are conference proceedings on the journal-retraction list?

Two of the top three "journals" by 2024 retraction count — BIO Web of Conferences and E3S Web of Conferences, both published by EDP Sciences — are not peer-reviewed journals. They're rolling-submission conference proceedings with per-paper publication fees and Scopus indexing. Paper mills target them because they offer Scopus-indexed credentials with low editorial friction.

Are mega-journals paper mills?

No. Heliyon (Elsevier), PLoS ONE (PLOS), and Scientific Reports (Nature Portfolio) are technically peer-reviewed and serve legitimate purposes, particularly for fields where selective journals reject methodologically sound but narrowly scoped work. The retraction concentration reflects volume × paper-mill targeting, not editorial collusion. Calling them "paper mills" outright is inaccurate.

Why are so many of the top institutions Indian?

Twelve of the top 20 retraction-producing institutions are Indian, but the cause structure runs publisher-economics → faculty-incentive → submitting-institution. India's UGC Academic Performance Indicator system explicitly weights faculty publication count for promotion and tenure decisions. Combined with the rolling-submission conference-proceedings venues and mega-journal economics in Stories C-D, the institutional pattern is a downstream symptom of publisher acceptance practices, not country-of-origin moral hazard.

How does this audit differ from Retraction Watch's coverage?

Retraction Watch reports case by case — individual retractions, the reasons given, the named authors. This audit aggregates 2024 OpenAlex data by publisher, journal, and institution. The two layers complement each other: case-level investigation plus corpus-level pattern recognition. Retraction Watch has been the canonical beat since 2010 and any serious coverage of paper-mill issues should cite them.

Where does the data come from and is it free?

OpenAlex is a free public bibliographic graph maintained by OurResearch, aggregating retraction notices from Crossref and publisher metadata feeds. The 2024 retraction query is re-runnable at api.openalex.org/works?filter=is_retracted:true,publication_year:2024. Retrieval and aggregation for this audit are described in the methodology section above.

Ryan Clinton publishes Apify actors and MCP servers as ryanclinton and writes about the data layer behind public-record audits at ApifyForge.


Last updated: May 2026

This audit focuses on OpenAlex's 2024 retraction corpus, but the same publisher-concentration patterns apply broadly to any domain where a small number of corporate gatekeepers accept high-volume submissions on per-unit fees with limited upstream verification.