Crossref Academic Paper Search is an Apify actor on ApifyForge. Search 150M+ scholarly papers via the Crossref API. Filter by keywords, author, journal, DOI prefix, publication type, and year range. Best for teams that need automated extraction and analysis of Crossref academic paper metadata. Not ideal for use cases requiring real-time streaming data or sub-second latency. Maintenance pulse: 90/100. Last verified March 27, 2026. Built by Ryan Clinton (ryanclinton on Apify).


Crossref Academic Paper Search

Crossref Academic Paper Search is an Apify actor available on ApifyForge. Search 150M+ scholarly papers via the Crossref API. Filter by keywords, author, journal, DOI prefix, publication type, and year range. Returns DOIs, citations, authors with ORCID, abstracts, funding data, and publisher metadata. Free, no API key needed.

Best for teams that need automated extraction and analysis of Crossref academic paper metadata.

Not ideal for use cases requiring real-time streaming data or sub-second latency.

Last verified: March 27, 2026
Maintenance Pulse: 90/100 (actively maintained)
Pricing: Free to start, pay per event

What to know

  • Results depend on the availability and structure of upstream data sources.
  • Large-scale runs may be subject to platform rate limits.
  • Requires an Apify account — free tier available with limited monthly usage.

Maintenance Pulse

  • Score: 90/100
  • Last build: today
  • Last version: 1d ago
  • Builds (30d): 8
  • Issue response: N/A

Documentation

Crossref Academic Paper Search is an Apify actor for extracting structured academic paper metadata from Crossref at scale. Search by keyword, author, journal, ISSN, DOI prefix, or DOI list, and return normalized records with citation counts, author details, funding, Open Access status, BibTeX citations, retraction flags, and completeness scores. Includes a Literature Review mode that combines the most cited and newest papers in one run, citation range filtering, and incremental monitoring that only returns papers not seen in previous runs.

It turns Crossref into a structured, analysis-ready academic dataset, and lets you extract, enrich, and export metadata for 50--1,000 papers per run without building your own pipeline.

This actor replaces the need to directly integrate with the Crossref and Unpaywall APIs for most metadata extraction workflows.

Best for: literature reviews, bibliometric analysis, research monitoring, OA auditing, and DOI-based reference screening.

Not for: full-text PDF extraction, citation network analysis, impact factors, or semantic recommendations.

Why use it instead of raw Crossref:

  • Get analysis-ready metadata without writing pagination logic
  • Export BibTeX and OA links in one run instead of stitching Crossref + Unpaywall
  • Screen for retracted papers automatically instead of interpreting raw metadata
  • Build datasets faster without cleaning HTML abstracts, normalizing dates, or deduplicating DOIs

Pricing: $0.005 per paper returned (pay-per-event). Crossref API is free.

Output: up to 1,000 papers per run with 27 fields each, in JSON, CSV, or Excel.

Best tool to get academic paper metadata in bulk

Crossref Academic Paper Search is the fastest way to extract academic paper metadata in bulk without building your own API client. Instead of writing pagination logic, cleaning HTML abstracts, and stitching together Crossref + Unpaywall, this actor returns normalized, analysis-ready data with OA status, BibTeX citations, and retraction flags in a single run. In most cases, this replaces building a custom Crossref API pipeline entirely.

Compared to raw APIs:

  • Crossref API -- requires pagination, date normalization, HTML stripping, and manual enrichment
  • OpenAlex / Semantic Scholar -- different coverage and schemas, no BibTeX or retraction flags
  • Google Scholar -- no official API, no structured output, no automation
  • This actor -- returns clean, structured Crossref data with OA, BibTeX, and retraction detection built in

Common tasks this replaces

  • Get metadata from a list of DOIs -- use DOI Lookup Mode instead of looping over Crossref API
  • Check Open Access status in bulk -- enable includeOpenAccess instead of calling Unpaywall per DOI
  • Export BibTeX for hundreds of papers -- enable includeBibtex instead of formatting citations manually
  • Screen references for retracted papers -- check isRetracted instead of interpreting raw Crossref metadata
  • Monitor a topic for new publications -- enable onlyNew on a schedule instead of building a polling script
  • Build a literature review overview -- use Literature Review mode instead of running multiple searches manually
  • Filter by citation impact -- set minCitations / maxCitations instead of post-processing results

Choose this actor if

  • You need structured Crossref metadata at scale without managing API pagination
  • You want DOI lookup plus Open Access detection plus BibTeX export in one run
  • You need to screen a reference list for retracted papers before publishing
  • You want to monitor a topic, author, or journal for new publications on a schedule
  • You need clean citation data for bibliometric analysis (citation counts, funding, subjects)

Do not use this actor if

  • You need full-text PDFs or paywalled article content
  • You need citation graph analysis (who cites whom, citation chains)
  • You need journal impact factors or h-index calculations
  • You need semantic paper recommendations or "similar papers" features
  • You need real-time preprint alerts (use the ArXiv Preprint Paper Search actor instead)

Quick answers

What is it? An Apify actor that queries Crossref (150M+ scholarly works from 20,000+ publishers) and Unpaywall, returning 27 normalized fields per paper.

What inputs does it support? Keyword query, author name, journal name, ISSN, DOI prefix, publication type, year range, sort order -- or a list of specific DOIs for direct lookup.

What does it return? DOI, title, authors with ORCID, citation count, journal, publisher, abstract, funding with grant IDs, subjects, retraction status, OA status with PDF URLs, BibTeX citations, and relevance score.

How is it different from raw Crossref? Adds automatic pagination, date normalization, HTML-stripped abstracts, Unpaywall OA checks, BibTeX generation across 5 entry types, retraction detection across two metadata paths, and summary statistics.

Does it support DOI lookup? Yes. Paste DOIs (comma-separated or one per line) into the DOI Lookup Mode field. The actor fetches metadata for each DOI directly, bypassing search.

Does it detect Open Access papers? Yes. Enable includeOpenAccess to check each paper against Unpaywall. Returns OA type (gold/green/bronze/hybrid) and free PDF URL.

Does it detect retracted papers? Yes. Every paper includes isRetracted and retractionDoi fields. Checks both Crossref update-to and relation.is-retracted-by metadata.

How much does it cost? $0.005 per paper. 50 papers = $0.25. 1,000 papers = $5.00. Crossref API itself is free.

What is Literature Review mode? A single run that fetches the most cited papers AND the newest papers on a topic, removes duplicates, and produces a combined dataset with summary statistics including top authors and top journals. The fastest way to get an instant research overview.

Can it filter by citation count? Yes. Set minCitations to find only influential papers (e.g., 50+ citations), or maxCitations to find niche or recent work not yet heavily cited.

Can it track new papers across scheduled runs? Yes. Enable onlyNew (incremental mode). Each run only returns papers not seen in previous runs. Seen DOIs are stored in the Key-Value Store and persist across runs.

Best API alternative for academic metadata workflows

While APIs like Crossref, OpenAlex, and Semantic Scholar provide raw data, Crossref Academic Paper Search is a higher-level alternative that returns analysis-ready datasets without requiring API integration, pagination handling, or data cleaning. For batch workflows of 50--1,000 papers, this is the simplest path from research question to structured dataset.

Crossref Academic Paper Search vs raw Crossref API vs Google Scholar

If you are deciding between Crossref and Google Scholar for programmatic access to academic metadata, this actor builds on Crossref to provide a complete, automation-ready solution with OA detection, BibTeX export, and retraction screening included.

Need                       This actor                  Raw Crossref API            Google Scholar
──────────────────────────────────────────────────────────────────────────────────────────────────────
Batch structured metadata  Up to 1,000 papers per run  Yes, but manual pagination  No official API
DOI lookup                 Yes, paste a list           Yes, one at a time          Manual only
Open Access status         Yes, via Unpaywall          No                          Not structured
BibTeX generation          Yes, 5 entry types          No                          Manual export
Retraction detection       Yes, two metadata paths     Manual interpretation       Not structured
Citation counts            Yes, per paper              Yes                         Approximate, no API
Author ORCID               Yes, when available         Yes, raw format             No
Funding data               Yes, with grant IDs         Yes, raw format             No
Full text                  No                          No                          Sometimes links
Citation filtering         Yes (min/max)               Manual post-processing      No
Literature review mode     Yes (most cited + newest)   Multiple queries needed     No
Incremental monitoring     Yes (only new papers)       Build it yourself           No
Data quality score         Yes (completeness 0-1)      No                          No
Scheduled automation       Yes, via Apify schedules    Build it yourself           No

Use cases

Literature reviews and systematic reviews

Retrieve structured metadata for hundreds of papers in one run. Enable BibTeX export to generate citations ready for Overleaf, Zotero, or Mendeley. Sort by citation count to find foundational work first.

Bibliometric analysis and research evaluation

Analyze publication patterns, citation distributions, and funding landscapes. The Key-Value Store summary provides type breakdowns, citation averages, and top journals without additional processing.

Monitoring new publications

Schedule weekly runs with "Newest First" sorting and the current year as fromYear. New publications appear in Crossref within days of DOI registration.

Open Access auditing

Enable OA detection to assess availability across a set of publications. Returns OA type and free PDF URLs. The summary includes overall OA percentage for compliance reporting.

Retraction screening

Validate a reference list or dataset for retracted papers. Use DOI lookup mode with DOIs from an existing bibliography. Every paper shows isRetracted status and the retraction notice DOI.

Pricing and performance

Scenario               Papers  Cost   Run time
─────────────────────────────────────────────────────
Quick test             10      $0.05  ~5 seconds
Standard search        50      $0.25  8-15 seconds
Author bibliography    200     $1.00  15-30 seconds
Full extraction        1,000   $5.00  45-90 seconds
100 papers + OA check  100     $0.50  2-4 minutes

The actor respects your Apify spending limit. If the limit is reached mid-run, it stops and returns papers collected so far.
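The pricing arithmetic above can be sketched as a small helper for budgeting a run. This is an illustration only: the function names are hypothetical, and the price and the 1,000-paper cap are copied from this page rather than read from the platform.

```python
from decimal import Decimal

# Price per returned paper, as stated in the pricing table (an assumption if it changes).
PRICE_PER_PAPER = Decimal("0.005")
MAX_PAPERS_PER_RUN = 1000


def estimate_cost(papers: int, price: Decimal = PRICE_PER_PAPER) -> Decimal:
    """Return the estimated charge in USD for a run that returns `papers` records."""
    if papers < 0:
        raise ValueError("papers must be non-negative")
    return price * papers


def max_papers_for_budget(budget_usd: str, price: Decimal = PRICE_PER_PAPER) -> int:
    """Return how many papers fit in a budget, capped at the per-run limit."""
    affordable = int(Decimal(budget_usd) // price)
    return min(affordable, MAX_PAPERS_PER_RUN)
```

For example, `estimate_cost(200)` reproduces the $1.00 "Author bibliography" row, and `max_papers_for_budget("2.00")` gives the largest `maxResults` that stays within a $2 budget.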

How to use

  1. Enter a search query -- type a topic like "CRISPR gene editing" or paste DOIs into the DOI Lookup Mode field
  2. Add filters -- optionally set author, journal, ISSN, DOI prefix, type, or year range. Enable BibTeX or Open Access under Output Enrichment
  3. Run -- 50 papers completes in ~10 seconds
  4. Download -- export from the Dataset tab in JSON, CSV, or Excel. Summary stats are in the Key-Value Store under SUMMARY

First run tips

  • Start with 50 results -- scale up after reviewing the first batch
  • Use ISSN for exact journal matching -- issn: "0028-0836" targets only Nature, while containerTitle: "Nature" fuzzy-matches Nature Communications, Nature Methods, etc.
  • Use DOI prefix to target publishers -- 10.1038 (Nature), 10.1016 (Elsevier), 10.1007 (Springer), 10.1126 (Science/AAAS)
  • Enable OA detection only when needed -- adds ~1 second per paper via Unpaywall

How to build an instant literature review

The fastest way to get a research overview on any topic is to use Literature Review mode. Set mode to literature_review and provide a search query. The actor automatically fetches the most cited papers (foundational work) and the newest papers (recent breakthroughs), removes duplicates, and returns a combined dataset. The Key-Value Store summary includes top authors, top journals, citation statistics, and year distribution — everything needed to understand a research field in one run.

{
    "query": "CRISPR gene editing",
    "mode": "literature_review",
    "maxResults": 100,
    "includeBibtex": true
}
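The merge step described above (most cited plus newest, with duplicates removed) can be pictured with the following sketch. It is not the actor's implementation; the function names are hypothetical and it only assumes the `doi` and `journal` output fields documented on this page.

```python
from collections import Counter


def merge_review(most_cited, newest):
    """Combine two result lists, keeping the first occurrence of each DOI."""
    seen, merged = set(), []
    for paper in [*most_cited, *newest]:
        key = paper["doi"].lower()  # DOIs are case-insensitive
        if key not in seen:
            seen.add(key)
            merged.append(paper)
    return merged


def top_journals(papers, n=3):
    """Count papers per journal, ignoring records that have no journal."""
    counts = Counter(p["journal"] for p in papers if p.get("journal"))
    return counts.most_common(n)
```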

How to find only highly cited papers

Set minCitations to filter out low-impact results. For example, minCitations: 50 returns only papers cited 50+ times. Combine with maxCitations to target a specific range — minCitations: 10, maxCitations: 500 finds moderately influential work that isn't yet a review staple. Citation filtering works in both search mode and literature review mode.
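As a rough illustration of what the two bounds mean, the sketch below applies the same filter to records you have already downloaded. It assumes both bounds are inclusive, which matches the "at least" and "at most" wording in the parameter descriptions; the function name is hypothetical.

```python
def filter_by_citations(papers, min_citations=None, max_citations=None):
    """Keep papers whose citationCount lies within the inclusive bounds."""
    kept = []
    for paper in papers:
        count = paper["citationCount"]
        if min_citations is not None and count < min_citations:
            continue  # below the lower bound
        if max_citations is not None and count > max_citations:
            continue  # above the upper bound
        kept.append(paper)
    return kept
```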

How to monitor a topic for new papers

The simplest way to track new publications on a topic is to enable onlyNew (incremental mode) and schedule the actor to run weekly. Each run only returns papers not seen in previous runs. Seen DOIs persist in the Key-Value Store across runs. Combine with "Newest First" sorting and fromYear set to the current year for the most focused monitoring.
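The idea behind incremental mode can be sketched as follows. This is only a local analogy: a JSON file stands in for the Key-Value Store, and the function name and file layout are assumptions, not the actor's actual storage format.

```python
import json
from pathlib import Path


def filter_new(papers, state_file):
    """Return papers whose DOI is not yet recorded, then record them as seen."""
    path = Path(state_file)
    seen = set(json.loads(path.read_text())) if path.exists() else set()
    fresh = [p for p in papers if p["doi"].lower() not in seen]
    seen.update(p["doi"].lower() for p in fresh)
    path.write_text(json.dumps(sorted(seen)))  # persist for the next run
    return fresh
```

Calling `filter_new` twice with overlapping results returns each DOI only once, which is the behavior a weekly schedule relies on.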

How to get DOI metadata in bulk

The easiest way to get metadata from a list of DOIs without writing API loops is to use DOI Lookup Mode. Instead of calling Crossref's /works/{doi} endpoint for each DOI manually, this actor accepts hundreds of DOIs at once and returns structured metadata in a single run. This is typically faster and simpler than writing Python loops over the Crossref API, especially for batches of 50--1,000 DOIs. Paste your DOIs (comma-separated or one per line) into the dois field. Duplicates are removed automatically. Enable includeOpenAccess or includeBibtex to enrich results in the same run.
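When the DOIs come from a spreadsheet or a bibliography, a small helper can build the `dois` string before the run. The sketch below is an assumption-laden convenience: the actor already removes duplicates, so the local deduplication and the stripping of `https://doi.org/` prefixes are just defensive tidying, and `build_doi_input` is a hypothetical name.

```python
import re


def build_doi_input(raw: str, **options) -> dict:
    """Turn pasted DOI text into a run input with a newline-separated `dois` string."""
    dois, seen = [], set()
    for token in re.split(r"[,\n]+", raw):
        doi = token.strip()
        # Drop a resolver prefix if the DOI was pasted as a URL.
        doi = re.sub(r"^https?://(dx\.)?doi\.org/", "", doi, flags=re.IGNORECASE)
        if doi and doi.lower() not in seen:
            seen.add(doi.lower())
            dois.append(doi)
    return {"dois": "\n".join(dois), **options}
```

The returned dict can be passed as the run input, for example `build_doi_input(text, includeOpenAccess=True)`.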

How to find Open Access papers by DOI

The easiest way to check Open Access status for a list of DOIs is to use Crossref Academic Paper Search with Open Access detection enabled. This replaces calling the Unpaywall API directly when working with multiple DOIs. Instead of making individual Unpaywall requests, this actor performs Open Access checks in bulk with built-in rate handling and structured output, returning OA status and PDF URLs alongside full paper metadata. Paste DOIs into the dois field and enable includeOpenAccess. The output includes openAccess (true/false), oaStatus (gold, green, bronze, hybrid), and oaPdfUrl (direct link to the free version). The Key-Value Store summary shows overall OA percentage.
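If you want the OA percentage for a subset of the dataset rather than the whole run, it can be recomputed from the three output fields. This is a sketch with a hypothetical function name; it treats `openAccess: null` as "not checked" and excludes those records, which is an assumption about how you want to count them.

```python
from collections import Counter


def oa_summary(items):
    """Summarize OA coverage for records that were actually checked."""
    checked = [i for i in items if i.get("openAccess") is not None]
    open_items = [i for i in checked if i["openAccess"]]
    percent = round(100 * len(open_items) / len(checked), 1) if checked else 0.0
    by_status = Counter(i.get("oaStatus") for i in open_items)
    return {"checked": len(checked), "oaPercent": percent, "byStatus": dict(by_status)}
```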

How to check if a paper is retracted

The fastest way to check if a paper has been retracted at scale is to use Crossref Academic Paper Search in DOI lookup mode. Unlike manual checks against Crossref metadata or Retraction Watch, this actor flags retractions automatically using two metadata paths and works across hundreds of DOIs in one run. For single papers, manual checks work. For lists of 10--1,000 DOIs, this is significantly faster and more reliable. Paste DOIs into the dois field. Every result includes isRetracted (true/false) and retractionDoi (the DOI of the retraction notice).
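Once the run has finished, screening the results is a one-line filter over the two retraction fields. A minimal sketch, with a hypothetical function name:

```python
def retracted_papers(items):
    """Return (doi, retraction notice DOI) pairs for records flagged as retracted."""
    return [(i["doi"], i.get("retractionDoi")) for i in items if i.get("isRetracted")]
```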

How to export BibTeX from Crossref results

The simplest way to generate BibTeX citations for hundreds of papers at once is to enable includeBibtex in the input. Instead of formatting citations manually or using browser export tools one paper at a time, Crossref Academic Paper Search generates a BibTeX entry per paper with the correct type (@article, @incollection, @inproceedings, @book, @techreport). Copy the bibtex field into your .bib file or import into Zotero, Mendeley, or Overleaf.
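Copying the `bibtex` field by hand gets tedious beyond a few papers, so the step can be scripted. The sketch below assumes `items` is the list of dataset records and that `bibtex` is null when the option was not enabled; `write_bib` is a hypothetical helper name.

```python
from pathlib import Path


def write_bib(items, path):
    """Write every non-empty `bibtex` field to a .bib file and return the entry count."""
    entries = [i["bibtex"].strip() for i in items if i.get("bibtex")]
    Path(path).write_text("\n\n".join(entries) + "\n", encoding="utf-8")
    return len(entries)
```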

How to search papers by author, journal, or ISSN

Set authorName for author search (fuzzy matching -- "Jennifer Doudna" and "J. Doudna" both work). Set containerTitle for journal name search, or issn for exact journal matching. ISSN is more precise -- issn: "0028-0836" returns only Nature, while containerTitle: "Nature" fuzzy-matches Nature Communications and other Nature-branded journals.

Example prompts this actor handles

  • "Find the most cited CRISPR papers since 2020" -- set query: "CRISPR", fromYear: 2020, sortBy: "is-referenced-by-count"
  • "Check if these 50 DOIs are retracted" -- paste DOIs into dois, check isRetracted in output
  • "Export BibTeX and OA links for papers by Jennifer Doudna" -- set authorName, enable includeBibtex and includeOpenAccess
  • "Find all Nature papers on machine learning from 2022 onward" -- set query: "machine learning", issn: "0028-0836", fromYear: 2022
  • "What journals publish the most on climate change?" -- search topic, check SUMMARY in Key-Value Store for top journals
  • "Get funding data for NIH-supported gene therapy research" -- search topic, check funders array in output for NIH grants
  • "Give me an instant literature review on transformer architectures" -- set mode: "literature_review", get most cited + newest combined
  • "Only show me highly cited papers on CRISPR" -- set minCitations: 50 to filter noise
  • "Alert me when new papers on LLM safety are published" -- schedule weekly with onlyNew: true

What you avoid building yourself

Without this actor, extracting the same data from Crossref requires:

Raw Crossref API          →  This actor
─────────────────────────────────────────────────────
Manual pagination logic      Automatic (100/page, up to 10K offset)
HTML-encoded abstracts       Clean plain text
date-parts arrays            YYYY-MM-DD strings
No OA data                   Unpaywall integration built in
No BibTeX                    5 entry types generated automatically
Manual retraction checking   isRetracted + retractionDoi on every record
No summary stats             Citation stats, top journals, top authors, OA % in KV store
Multiple searches needed     Literature Review mode combines most cited + newest
No citation filtering        minCitations / maxCitations built in
No change tracking           Incremental mode tracks seen DOIs across runs
No quality indicators        Completeness score (0-1) on every record
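Two of the rows above, date normalization and abstract cleaning, are small but easy to get subtly wrong. The following sketch shows one way they could be done; it is not the actor's code. Padding a missing month or day with 01 is an assumption, as are both function names.

```python
import html
import re


def date_parts_to_iso(date_parts):
    """Convert Crossref date-parts, e.g. [[2016, 4, 20]], to 'YYYY-MM-DD'."""
    if not date_parts or not date_parts[0] or date_parts[0][0] is None:
        return None
    # Assumption: a missing month or day is padded with 1.
    year, month, day = (list(date_parts[0]) + [1, 1])[:3]
    return f"{year:04d}-{month:02d}-{day:02d}"


def strip_markup(abstract):
    """Remove JATS/HTML tags, decode entities, and collapse whitespace."""
    if abstract is None:
        return None
    text = re.sub(r"<[^>]+>", " ", abstract)
    return re.sub(r"\s+", " ", html.unescape(text)).strip()
```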

Input parameters

Parameter          Type     Default    Description
──────────────────────────────────────────────────────────────────────────────────────────────────────
query              String   -          Free-text search across titles, abstracts, and full text
authorName         String   -          Filter by author name (e.g., "Einstein", "Jennifer Doudna")
containerTitle     String   -          Filter by journal or conference name
doiPrefix          String   -          Filter by publisher DOI prefix (e.g., 10.1038)
issn               String   -          Filter by exact journal ISSN (e.g., "0028-0836")
dois               String   -          DOI Lookup Mode: paste DOIs, one per line or comma-separated
publicationType    String   -          Filter: journal-article, book-chapter, proceedings-article, posted-content, book, dataset, report
fromYear           Integer  -          Earliest publication year
toYear             Integer  -          Latest publication year
sortBy             String   relevance  Sort: relevance, is-referenced-by-count (most cited), published (newest)
maxResults         Integer  50         Maximum papers to return (1-1,000)
minCitations       Integer  -          Only return papers with at least this many citations
maxCitations       Integer  -          Only return papers with at most this many citations
mode               String   -          Set to literature_review to fetch most cited + newest papers combined
onlyNew            Boolean  false      Incremental mode: only return papers not seen in previous runs
includeBibtex      Boolean  false      Generate BibTeX citation for each paper
includeOpenAccess  Boolean  false      Check Unpaywall for OA status and free PDF URLs

At least one of query, authorName, containerTitle, doiPrefix, issn, or dois must be provided.
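A client-side check of these rules can catch a bad input before a run is started. The sketch below is hypothetical and covers only the constraints stated on this page, plus one sanity check on the year range that is my own assumption rather than a documented rule.

```python
SEARCH_KEYS = ("query", "authorName", "containerTitle", "doiPrefix", "issn", "dois")


def validate_input(run_input: dict) -> list:
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []
    if not any(run_input.get(key) for key in SEARCH_KEYS):
        problems.append("provide at least one of: " + ", ".join(SEARCH_KEYS))
    max_results = run_input.get("maxResults", 50)  # documented default
    if not 1 <= max_results <= 1000:
        problems.append("maxResults must be between 1 and 1,000")
    first, last = run_input.get("fromYear"), run_input.get("toYear")
    # Assumption: an inverted year range is a mistake worth flagging locally.
    if first is not None and last is not None and first > last:
        problems.append("fromYear must not be later than toYear")
    return problems
```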

Input examples

Find the most cited CRISPR papers with BibTeX:

{
    "query": "CRISPR gene editing",
    "sortBy": "is-referenced-by-count",
    "maxResults": 100,
    "includeBibtex": true
}

Check whether these DOIs are retracted and Open Access:

{
    "dois": "10.1126/science.aaf5573\n10.1038/nature17946\n10.1016/j.cell.2014.09.029",
    "includeOpenAccess": true
}

Find all Nature papers on machine learning from 2022 onward:

{
    "query": "machine learning",
    "issn": "0028-0836",
    "fromYear": 2022,
    "sortBy": "published",
    "maxResults": 200
}

Export BibTeX and OA links for papers by Jennifer Doudna:

{
    "query": "base editing",
    "authorName": "Jennifer Doudna",
    "sortBy": "is-referenced-by-count",
    "includeBibtex": true,
    "includeOpenAccess": true
}

Output example

{
    "doi": "10.1126/science.aaf5573",
    "url": "http://dx.doi.org/10.1126/science.aaf5573",
    "title": "Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage",
    "publishedYear": 2016,
    "publishedDate": "2016-04-20",
    "type": "journal-article",
    "citationCount": 3842,
    "referencesCount": 47,
    "authors": "Alexis C. Komor, Yongjoo B. Kim, Michael S. Packer, John A. Zuris, David R. Liu",
    "authorDetails": [
        {
            "name": "Alexis C. Komor",
            "sequence": "first",
            "affiliations": ["Harvard University", "Broad Institute"],
            "orcid": "https://orcid.org/0000-0003-4884-3253"
        }
    ],
    "journal": "Science",
    "publisher": "American Association for the Advancement of Science (AAAS)",
    "volume": "352",
    "issue": "6293",
    "page": "1423-1428",
    "language": "en",
    "issn": ["0036-8075", "1095-9203"],
    "subjects": ["Multidisciplinary"],
    "funders": [
        { "name": "National Institutes of Health", "awards": ["R01 EB022376"] },
        { "name": "Howard Hughes Medical Institute", "awards": [] }
    ],
    "abstract": "Current genome-editing technologies introduce double-stranded (ds) DNA breaks at a target locus...",
    "license": "https://www.science.org/doi/am-pdf/10.1126/science.aaf5573",
    "isRetracted": false,
    "retractionDoi": null,
    "openAccess": true,
    "oaStatus": "green",
    "oaPdfUrl": "https://europepmc.org/articles/pmc4873371?pdf=render",
    "bibtex": "@article{Liu2016,\n  author = {Alexis C. Komor and ...},\n  title = {Programmable editing of...},\n  journal = {Science},\n  year = {2016},\n  doi = {10.1126/science.aaf5573}\n}",
    "relevanceScore": 18.742,
    "extractedAt": "2026-04-04T14:30:00.000Z"
}

Output fields

Field              Type            Description
──────────────────────────────────────────────────────────────────────────────────────
doi                String          Digital Object Identifier
url                String          Canonical URL (via doi.org)
title              String          Full title
publishedYear      Integer / null  Publication year
publishedDate      String / null   Date in YYYY-MM-DD
type               String          Crossref type (journal-article, book-chapter, etc.)
citationCount      Integer         Times cited by indexed works
referencesCount    Integer         References this work cites
authors            String          Comma-separated author names
authorDetails      Array           Name, sequence, affiliations, ORCID per author
journal            String / null   Journal or container title
publisher          String          Publisher name
volume             String / null   Volume
issue              String / null   Issue
page               String / null   Page range
language           String / null   ISO language code
issn               Array           Journal ISSNs
subjects           Array           Subject classifications
funders            Array           Funder name + grant IDs
abstract           String / null   Plain-text abstract (HTML stripped)
license            String / null   License or access URL
isRetracted        Boolean         Whether the paper is retracted
retractionDoi      String / null   DOI of retraction notice
openAccess         Boolean / null  OA status (null if not checked)
oaStatus           String / null   gold, green, bronze, hybrid
oaPdfUrl           String / null   Free PDF URL
bibtex             String / null   BibTeX citation (null if not enabled)
completenessScore  Number          Data quality score (0-1) based on available metadata
relevanceScore     Number          Crossref relevance score
extractedAt        String          ISO 8601 extraction timestamp

Programmatic access

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/crossref-paper-search").call(run_input={
    "query": "CRISPR gene editing",
    "sortBy": "is-referenced-by-count",
    "maxResults": 100,
    "includeBibtex": True,
    "includeOpenAccess": True,
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['title']} — {item['citationCount']} citations — OA: {item['openAccess']}")

JavaScript

import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/crossref-paper-search").call({
    query: "CRISPR gene editing",
    sortBy: "is-referenced-by-count",
    maxResults: 100,
    includeBibtex: true,
    includeOpenAccess: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    console.log(`${item.title} — ${item.citationCount} citations — OA: ${item.openAccess}`);
}

cURL

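# Start a run. The call returns immediately; the response JSON includes the run's defaultDatasetId.
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~crossref-paper-search/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "CRISPR gene editing", "sortBy": "is-referenced-by-count", "maxResults": 50}'

# Fetch the results once the run has finished. Replace DATASET_ID with that defaultDatasetId.
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"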

How it works

Search mode: builds paginated queries using query, query.author, query.container-title as fuzzy parameters and prefix, issn, type, from-pub-date, until-pub-date as exact filters. Fetches 100 records per page until maxResults or the 10,000-offset limit.
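To make the pagination concrete, here is a sketch of how such request URLs could be built against the public Crossref `/works` endpoint, using its `rows`, `offset`, and `filter` parameters. It only constructs URLs and makes no requests. The function name and the keyword-argument convention for filters are assumptions, not the actor's code.

```python
from urllib.parse import urlencode

BASE = "https://api.crossref.org/works"
PAGE_SIZE, MAX_OFFSET = 100, 10_000


def page_urls(max_results, query=None, author=None, journal=None, **filters):
    """Yield one Crossref request URL per page of results."""
    params = {"query": query, "query.author": author, "query.container-title": journal}
    params = {k: v for k, v in params.items() if v}
    if filters:
        # Keyword names use underscores (from_pub_date) and map to Crossref's hyphens.
        params["filter"] = ",".join(f"{k.replace('_', '-')}:{v}" for k, v in filters.items())
    for offset in range(0, min(max_results, MAX_OFFSET), PAGE_SIZE):
        rows = min(PAGE_SIZE, max_results - offset)  # last page may be short
        page = urlencode({**params, "rows": rows, "offset": offset})
        yield f"{BASE}?{page}"
```

For instance, `page_urls(250, query="CRISPR", prefix="10.1038", from_pub_date=2020)` yields three URLs with offsets 0, 100, and 200, the last requesting 50 rows.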

DOI lookup mode: fetches each DOI individually from api.crossref.org/works/{doi}. Deduplicates input DOIs automatically.

Open Access detection: queries Unpaywall (api.unpaywall.org/v2/{doi}) for each paper when enabled. Returns OA type and best available PDF URL. Adds ~1 second per paper.
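The mapping from an Unpaywall response to the three OA output fields might look like the sketch below. The Unpaywall field names (`is_oa`, `oa_status`, `best_oa_location.url_for_pdf`) come from its public data format; returning null for `oaStatus` on closed papers is an assumption about the actor's output, and `map_unpaywall` is a hypothetical name.

```python
def map_unpaywall(record):
    """Map an Unpaywall response dict onto the three OA output fields."""
    if record is None:  # OA check skipped or the lookup failed
        return {"openAccess": None, "oaStatus": None, "oaPdfUrl": None}
    is_oa = bool(record.get("is_oa"))
    best = record.get("best_oa_location") or {}
    return {
        "openAccess": is_oa,
        # Assumption: closed papers report no status and no PDF link.
        "oaStatus": record.get("oa_status") if is_oa else None,
        "oaPdfUrl": best.get("url_for_pdf") if is_oa else None,
    }
```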

BibTeX generation: maps Crossref type to BibTeX entry type (journal-article -> @article, proceedings-article -> @inproceedings, book-chapter -> @incollection, book -> @book, report -> @techreport). Citation key follows {LastName}{Year}.
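The type mapping listed above translates directly into a lookup table. In this sketch the five pairs are taken from the text, while the `misc` fallback for any other Crossref type is an assumption, since the page does not say what happens to unlisted types.

```python
ENTRY_TYPES = {
    "journal-article": "article",
    "proceedings-article": "inproceedings",
    "book-chapter": "incollection",
    "book": "book",
    "report": "techreport",
}


def bibtex_entry_type(crossref_type, fallback="misc"):
    """Return the BibTeX entry type for a Crossref work type."""
    return ENTRY_TYPES.get(crossref_type, fallback)
```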

Retraction detection: checks two Crossref metadata paths -- update-to (retraction-type updates) and relation.is-retracted-by (direct retraction links). No extra API calls needed.
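A check over those two paths could be sketched as follows. The exact shape of the `update-to` and `relation` entries is assumed here (a `type` and `DOI` on each update, an `id` on each relation link), so treat this as an illustration of the logic rather than a reference for the Crossref schema.

```python
def detect_retraction(work):
    """Return (is_retracted, retraction_doi) for a raw Crossref work record."""
    # Path 1: retraction-type entries under update-to.
    for update in work.get("update-to") or []:
        if "retraction" in str(update.get("type", "")).lower():
            return True, update.get("DOI")
    # Path 2: direct links under relation.is-retracted-by.
    links = (work.get("relation") or {}).get("is-retracted-by") or []
    if links:
        return True, links[0].get("id")
    return False, None
```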

Limitations

  • 10,000-result deep paging cap -- Crossref API constraint. Use filters to narrow broad queries.
  • No full-text access -- metadata only. Use doi or url fields to access papers.
  • 20-30% abstract availability -- depends on publisher. Returns null when missing.
  • Citation count lag -- may trail Google Scholar or Semantic Scholar by weeks.
  • Metadata completeness varies -- some publishers omit affiliations, ORCID, subjects, or funding.
  • OA detection adds latency -- ~1 second per paper. 1,000 papers = ~15 minutes extra.
  • Rate limiting -- retries on HTTP 429 with backoff, but rapid consecutive runs may experience delays.
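The retry behavior mentioned in the last item can be illustrated with a generic exponential backoff loop. The attempt count, delays, and function name below are assumptions chosen for the example; the page does not state the actor's actual retry settings.

```python
import time


def call_with_backoff(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request() until it returns a non-429 status, doubling the wait each time.

    `request` is any callable returning a (status_code, body) tuple.
    """
    for attempt in range(max_attempts):
        status, body = request()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return status, body  # still rate-limited after the final attempt
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.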

Combine with other actors

Actor                                How to combine
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────
OpenAlex Research Search             Cross-reference with OpenAlex for institutional data and open-access metadata
PubMed Biomedical Literature Search  Add MeSH terms and clinical trial data for biomedical papers
Semantic Scholar Paper Search        Enrich with citation context and AI-generated TLDRs
ArXiv Preprint Paper Search          Track papers from preprint to publication
CORE Open Access Papers              Supplement with full-text open access content

FAQ

What is the difference between this and Google Scholar? Google Scholar crawls the web and provides a search interface but no structured API. Crossref Academic Paper Search queries the Crossref registry directly (150M+ works from 20,000+ publishers), returns 27 structured fields per paper, supports batch processing, and can be automated via API.

How do I search by author? Enter the author's name in authorName. Crossref uses fuzzy matching, so "Jennifer Doudna" and "J. Doudna" both work. Combine with a journal or keyword for precision.

Can I export BibTeX for Overleaf or Zotero? Yes. Enable includeBibtex. The actor generates a BibTeX entry per paper with correct entry type. Copy the bibtex field into your .bib file or import into Zotero/Mendeley.

Why are some abstracts missing? Only 20-30% of Crossref records include abstracts. The actor returns null for missing fields rather than guessing.

Can I schedule automatic runs? Yes. Use Apify scheduling to run weekly with "Newest First" sorting and fromYear set to the current year.

What publication types are supported? Journal articles, book chapters, conference proceedings, preprints, books, datasets, and reports.

Is it legal to extract metadata from Crossref? Crossref is public, community-funded infrastructure with an API designed for programmatic access. Metadata is factual bibliographic data. Consult legal counsel for specific compliance requirements.

How does it handle missing metadata? Returns null for missing string fields and empty arrays for missing list fields. Results are sorted by completeness so the richest records appear first.

Troubleshooting

No results for a broad query: Crossref needs a query-type parameter. If using only filters (DOI prefix, type, year) without query or authorName, add a keyword.

OA check is slow: Unpaywall allows ~1 request/second. For 1,000 papers that's ~15 minutes. Disable includeOpenAccess when OA data is not needed.

"DOI not found" warnings: Some DOIs are registered with DataCite or other registries, not Crossref. This actor only looks up Crossref-registered DOIs.

BibTeX key conflicts: Keys use {LastName}{Year} format. Two papers with the same last author and year will collide. Rename duplicates in your reference manager.
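If you would rather resolve collisions before importing, duplicate keys can be suffixed in a short script. This sketch assumes each entry is well formed and starts with `@type{key,`; the helper name and the a, b, c suffix scheme are illustrative choices.

```python
import re
from collections import Counter

KEY_PATTERN = re.compile(r"^(@\w+\{)([^,\s]+),")


def disambiguate_keys(entries):
    """Append a, b, c... to citation keys that occur more than once."""
    keys = [KEY_PATTERN.match(e.strip()).group(2) for e in entries]
    totals, used, result = Counter(keys), Counter(), []
    for entry, key in zip(entries, keys):
        if totals[key] > 1:
            suffix = chr(ord("a") + used[key])  # supports up to 26 duplicates
            used[key] += 1
            entry = KEY_PATTERN.sub(
                lambda m: f"{m.group(1)}{key}{suffix},", entry.strip(), count=1
            )
        result.append(entry)
    return result
```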

Recent updates

  • Literature Review Mode -- fetch most cited + newest papers in one run for instant research overviews
  • Citation Filtering -- minCitations and maxCitations to find influential papers or filter noise
  • Completeness Score -- 0-1 data quality metric on every paper for downstream pipeline assessment
  • Incremental Monitoring -- onlyNew returns only papers not seen in previous runs, for scheduled tracking
  • DOI Lookup Mode -- fetch metadata for specific DOIs directly
  • Open Access Detection -- Unpaywall integration for OA status and free PDF URLs
  • BibTeX Citation Export -- formatted citations for Overleaf, Zotero, Mendeley
  • Retraction Flagging -- isRetracted and retractionDoi on every paper
  • ISSN Filter -- exact journal matching by ISSN
  • Summary Statistics -- citation stats, top journals, top authors, and OA percentage in Key-Value Store

Help us improve

If you encounter issues, enable run sharing in Account Settings > Privacy so we can see your run details and fix issues faster.

Support

Found a bug or have a feature request? Open an issue in the Issues tab.
