How much does Academic Research Intelligence MCP Server cost?

Academic Research Intelligence MCP Server uses pay-per-event pricing: $0.05 per single-database tool call (for example, search-pubmed) and $0.15 per literature-review call. At $0.05 per event, 100 events cost $5.00 and 1,000 events cost $50.00. You pay only for what you use; there are no monthly fees.

How do I use Academic Research Intelligence MCP Server?

Add the Academic Research Intelligence MCP Server endpoint to Claude Desktop, Cursor, Windsurf, or any MCP-compatible AI client, using your Apify API token for authentication. The server exposes 7 billable tools, plus a free research_list_sources tool, that your AI assistant can call directly via natural language prompts. Results return as structured JSON within the conversation. Single-database tool calls cost $0.05 each; a literature-review call costs $0.15.

Is Academic Research Intelligence MCP Server reliable?

Academic Research Intelligence MCP Server has a maintenance pulse score of 90/100, with 8 builds in the last 30 days and the most recent build today.

What output format does Academic Research Intelligence MCP Server return?

Academic Research Intelligence MCP Server returns structured data in JSON format by default. You can also export results as CSV or Excel from the Apify Console. Each result includes all extracted fields in a flat, machine-readable structure that integrates directly with spreadsheets, CRMs, and automation tools via Apify integrations.

Are there alternatives to Academic Research Intelligence MCP Server?

Yes. ApifyForge lists multiple actors in each category with different strengths. Browse related actors on the Academic Research Intelligence MCP Server page or use the ApifyForge actor recommender to find the best fit for your use case. The right choice depends on your input data, budget, and required output fields.

Academic Research Intelligence MCP Server

Academic Research Intelligence MCP Server is an MCP (Model Context Protocol) server available on ApifyForge, priced from $0.05 per tool call. It provides multi-database academic literature search by wrapping 6 specialized actors: PubMed (biomedical), Semantic Scholar (all disciplines with AI summaries), ArXiv (preprints), Crossref (DOI metadata with citations/funders), OpenAlex (250M+ works), and ORCID (researcher profiles). It also includes a unified literature review tool with cross-database deduplication.

Best for AI developers and agent builders who need structured real-world data inside Claude, Cursor, or other MCP-compatible clients.

Not ideal for non-AI workflows or use cases that don't involve an MCP-compatible client.

Coming soon on Apify Store

$0.05 per event

Tools exposed

Each pricing event corresponds to a tool your AI agent can call through MCP.

search-pubmed: Search biomedical literature on PubMed. · $0.05/call

search-semantic-scholar: Search academic papers on Semantic Scholar. · $0.05/call

search-arxiv: Search preprint papers on ArXiv. · $0.05/call

search-crossref: Search academic papers via Crossref DOI registry. · $0.05/call

search-openalex: Search research works on OpenAlex. · $0.05/call

find-researcher: Look up researcher profiles via ORCID. · $0.05/call

literature-review: Composite literature review across multiple academic databases. · $0.15/call

Example prompts

Natural language queries you can ask your AI assistant that would trigger this MCP server.

"Run a PubMed search on CRISPR off-target effects and summarize the findings"

"Can you search Semantic Scholar for papers on transformer attention mechanisms and highlight the most-cited ones?"

"What tools does the Academic Research Intelligence MCP Server have available?"

Last verified: March 27, 2026

Actively maintained

What to know

Requires an MCP-compatible client (Claude Desktop, Cursor, Windsurf, or similar).
Tool call results depend on the availability of upstream public APIs.
Requires an Apify account and API token for authentication.

Maintenance Pulse: 90/100
Last Build: Today
Last Version: 1d ago
Builds (30d): 8
Issue Response: N/A

Pricing

Pay Per Event model. You only pay for what you use.

Event	Description	Price
search-pubmed	Search biomedical literature on PubMed.	$0.05
search-semantic-scholar	Search academic papers on Semantic Scholar.	$0.05
search-arxiv	Search preprint papers on ArXiv.	$0.05
search-crossref	Search academic papers via Crossref DOI registry.	$0.05
search-openalex	Search research works on OpenAlex.	$0.05
find-researcher	Look up researcher profiles via ORCID.	$0.05
literature-review	Composite literature review across multiple academic databases.	$0.15

Example (at $0.05 per event): 100 events = $5.00 · 1,000 events = $50.00
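
The arithmetic behind these figures can be sketched in a few lines of Python. This is an illustrative estimator, not part of the server: the PRICES mapping is copied from the pricing table above, and estimate_cost is a hypothetical helper name.

```python
from decimal import Decimal

# Per-event prices copied from the pricing table above.
PRICES = {
    "search-pubmed": Decimal("0.05"),
    "search-semantic-scholar": Decimal("0.05"),
    "search-arxiv": Decimal("0.05"),
    "search-crossref": Decimal("0.05"),
    "search-openalex": Decimal("0.05"),
    "find-researcher": Decimal("0.05"),
    "literature-review": Decimal("0.15"),
}


def estimate_cost(calls: dict) -> Decimal:
    """Return the total charge for a mapping of event name -> number of calls."""
    total = Decimal("0")
    for event, count in calls.items():
        if event not in PRICES:
            raise KeyError(f"unknown event: {event}")
        if count < 0:
            raise ValueError("call counts must be non-negative")
        total += PRICES[event] * count
    return total


print(estimate_cost({"search-pubmed": 100}))  # prints 5.00
print(estimate_cost({"literature-review": 10, "search-arxiv": 20}))  # prints 2.50
```

Decimal is used instead of float so that sums of cent-level prices stay exact.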

Documentation

Academic Research Intelligence MCP, multi-database academic literature search for AI agents

Academic Research Intelligence MCP is multi-database scholarly search infrastructure for AI agents and research workflows.

It wraps six free academic data sources (PubMed, Semantic Scholar, ArXiv, Crossref, OpenAlex, ORCID) behind one MCP endpoint and adds a composite literature review tool that queries multiple databases in parallel and deduplicates results by DOI. No API keys required. Built for AI research assistants, systematic-review teams, biomedical analysts, science writers, and any agent that needs a single tool for "find me the papers."

The category

Academic Research Intelligence MCP is multi-database scholarly search infrastructure. Unlike single-database wrappers (which force the agent to pick the right source before searching) or scraping tools (which break when site HTML changes), it exposes six official academic APIs as one MCP toolset and adds DOI-based cross-database deduplication. Agents see one literature surface instead of six, and the composite literature review returns coverage statistics so a thin search is never mistaken for a complete one.

In one sentence

Search PubMed, Semantic Scholar, ArXiv, Crossref, OpenAlex, and ORCID through a single MCP server, or run one composite literature review that queries three databases in parallel and deduplicates by DOI.

Category: Academic research MCP. Multi-database literature search. AI agent tooling.
Primary use case: Give an AI agent one tool that covers all the major free academic databases. Can also be used for systematic-review compilation, researcher discovery, and DOI cross-referencing.

Also known as: academic research MCP, scholarly search MCP server, multi-database literature search, PubMed MCP, ArXiv MCP, Semantic Scholar MCP, OpenAlex MCP, Crossref MCP, ORCID MCP, literature review agent tool.

What this actor does

What it is: A standby-mode MCP server exposing 8 tools that wrap 6 free academic data sources.
What it checks: Biomedical literature, all-discipline papers with AI summaries, preprints, DOI metadata, broad academic works, and researcher profiles.
What it returns: Structured JSON paper lists with titles, authors, year, journal, DOI, citations, abstracts, plus cross-database coverage stats for the composite literature review tool.
What it does NOT do: No full-text PDF download, no peer-review verdict, no plagiarism checking, no integrity scoring (see Research Integrity Screening MCP for that).
Who it's for: AI research assistants, systematic-review teams, biomedical analysts, science journalists, R&D analysts, academic recruiters.

What you get from one call

research_literature_review fans out to PubMed, Semantic Scholar, and Crossref (and optionally ArXiv) in parallel and returns:

papers[]: ranked by source count, then citation count (papers found in multiple databases sort first)
coverage.sourcesSearched: the exact list of databases that ran
coverage.resultsPerSource: per-database hit counts
coverage.totalBeforeDedup: raw result count across all sources
coverage.uniquePapersWithDoi: deduplicated paper count
coverage.papersFoundInMultipleSources: corroboration count (the strongest "this paper is real and relevant" signal)
papersWithoutDoi[]: results that lacked a DOI, kept separate so dedup confidence stays clean

Each paper carries doi, title, authors, year, journal, citationCount, abstract (or AI TLDR from Semantic Scholar), isOpenAccess, url, plus foundIn[] listing every database that returned it.

What you also get: 6 free databases, no API keys, DOI deduplication, AI TLDRs from Semantic Scholar, parallel fan-out

What makes this different

Six databases, one MCP toolset, zero credentials. PubMed, Semantic Scholar, ArXiv, Crossref, OpenAlex, and ORCID all run through one endpoint with no API keys to provision.
DOI-based cross-database deduplication. research_literature_review collapses the same paper appearing in three databases into one ranked record with sourceCount, so coverage and corroboration are visible.
Coverage-honest. Every literature review returns sourcesSearched and resultsPerSource, so a thin search (one database delivered, two empty) is never mistaken for a complete one.

Before vs after

Without this MCP	With this MCP
Open PubMed, Semantic Scholar, ArXiv, Crossref, OpenAlex, and ORCID in six tabs	One MCP endpoint, six databases reachable as tools
Provision API keys per service (Semantic Scholar, Crossref polite pool, etc.)	No keys required, all sources are free public APIs
Manually deduplicate paper lists by title across databases	DOI-based dedup with multi-source confidence ranking
Write custom retry and rate-limit code per database	Parallel `Promise.all` fan-out with per-source timeouts
Lose track of which databases were actually checked	`coverage.sourcesSearched` and `resultsPerSource` returned every time

Architecture

agent prompt
   ↓
MCP /mcp endpoint (StreamableHTTP)
   ↓
8 registered tools (6 single-source + literature review + list sources)
   ↓
6 sub-actors called in parallel via apify-client
   ↓
PubMed, Semantic Scholar, ArXiv, Crossref, OpenAlex, ORCID
   ↓
literature review: DOI normalization → dedup map → multi-source ranking
   ↓
structured JSON response with coverage stats

The MCP runs in Apify Standby mode with a configurable idle-shutdown window (default 300s) so platform compute stops billing when no tools are firing.

Built for

AI research assistants embedded in Claude Desktop, Cursor, Windsurf, or LangChain pipelines
systematic-review teams running PubMed-plus-others searches under Cochrane guidelines
biomedical analysts and science journalists who need a single "find me the papers" tool
academic recruiters and tenure committees verifying publication records through ORCID
R&D and competitive-intelligence teams tracking ArXiv preprints and conference proceedings
agent builders who want one academic-search interface instead of six

This server runs in Standby mode on the Apify platform. It orchestrates six official academic APIs through six sibling Apify actors, normalizes the response shapes, and returns structured JSON your AI agent reasons over directly.

Questions this MCP answers

"Find me all papers on CAR-T cell therapy efficacy across PubMed, Semantic Scholar, and Crossref."
"What are the latest ArXiv preprints on transformer attention mechanisms?"
"Which biomedical papers cite this NIH-funded study?"
"Has this researcher published in predatory venues? Show their full publication record from ORCID."
"Give me a literature review of CRISPR off-target effects with cross-database coverage stats."
"What is the AI TLDR for this Semantic Scholar paper, and how many influential citations does it have?"
"Find me Crossref-registered papers from this journal between 2020 and 2024 with funder information."
"Which authors at MIT have published on protein folding since 2022?"

For AI agents

First tool to reach for when a prompt mentions academic papers, scholarly literature, systematic reviews, citations, DOIs, preprints, or researcher publication records.
Call research_list_sources first (it is free). It enumerates the 8 tools, the 6 sources, and the record counts behind each, so the agent can decide which single-database tool to call before paying for a multi-database review.
Use research_literature_review when the user wants coverage, not just speed. It runs PubMed + Semantic Scholar + Crossref in parallel and deduplicates by DOI, so one call replaces three. Add include_arxiv: true if the topic is physics, math, CS, or stats.
Use a single-database tool when the database is named. "Find PubMed papers on..." goes to research_search_pubmed, not the composite review. Single-database calls cost $0.05, compared with $0.15 for the review.
Read coverage.sourcesSearched and papersFoundInMultipleSources before summarizing. A literature review where only one of three databases returned papers is low-corroboration; treat it as preliminary.
For researcher lookups, use research_find_researcher. Set fetch_works: true only when the user explicitly asks for the full publication list (it is slower).
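
The "read coverage before summarizing" rule above can be sketched as a small check an agent runs on a parsed literature-review response. The function name and the three labels it returns are hypothetical; only the coverage field names come from this document.

```python
def corroboration_level(review: dict) -> str:
    """Classify a parsed research_literature_review response by its coverage block.

    The labels are illustrative, not values the server returns.
    """
    coverage = review["coverage"]
    # Count the databases that actually delivered at least one paper.
    sources_with_hits = [
        name for name, hits in coverage["resultsPerSource"].items() if hits > 0
    ]
    if len(sources_with_hits) <= 1:
        return "preliminary"  # only one database (or none) returned papers
    if coverage["papersFoundInMultipleSources"] == 0:
        return "uncorroborated"  # several databases answered, but nothing overlapped
    return "corroborated"


example = {
    "coverage": {
        "sourcesSearched": ["PubMed", "Semantic Scholar", "Crossref"],
        "resultsPerSource": {"PubMed": 47, "Semantic Scholar": 50, "Crossref": 50},
        "papersFoundInMultipleSources": 28,
    }
}
print(corroboration_level(example))  # prints corroborated
```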

Use this MCP when an AI agent needs to:

run a literature review across multiple academic databases
look up biomedical, AI/ML, physics, or cross-discipline papers
find researcher profiles, affiliations, and external IDs
pull DOI metadata with funder, ORCID, and licensing information
discover ArXiv preprints before journal publication
get AI-generated paper summaries and influential citation counts
build a corroborated paper list with cross-database confidence scores

What data can you access?

Data Point	Source	Example
🧬 Biomedical citations with MeSH terms and abstracts	PubMed / MEDLINE	36M+ citations, "CRISPR gene editing" returns ~12k papers
📄 All-discipline papers with AI TLDR and influential citations	Semantic Scholar	200M+ papers, TLDR field is one-sentence AI summary
📑 Open-access preprints in physics, math, CS, biology, stats	ArXiv	2.4M+ preprints, prefix syntax (ti:, au:, abs:, cat:)
🔗 DOI-registered works with funders and ORCID-linked authors	Crossref	150M+ works, funder names + grant numbers + license URLs
📚 Broad academic index with concept tagging and institution data	OpenAlex	250M+ works, concept hierarchy + institution affiliations
👤 Researcher profiles with career history and external IDs	ORCID	18M+ profiles, Scopus / ResearcherID linkage, employment history
🧠 AI-generated TLDR (one-sentence paper summary)	Semantic Scholar	"Transformers replace recurrence with self-attention for sequence modeling"
📈 Influential citation count (citations central to the citing paper)	Semantic Scholar	influentialCitationCount: 1,247 (of 18,392 total citations)
🏛️ Author affiliations and institution employment history	ORCID + OpenAlex	"Dr. Y. Bengio, Mila / Université de Montréal, 2016-present"
💰 Funder names with grant numbers and licensing URLs	Crossref	"NIH R01CA123456, CC-BY 4.0"

Why use Academic Research Intelligence MCP?

Most agent-driven academic search is:

single-database (the agent calls only the source it knows about, missing cross-database corroboration)
credential-heavy (Semantic Scholar API keys, Crossref polite pool, ORCID token rotation, all separately provisioned)
inconsistent in response shape (every database returns a different paper schema)
silent on coverage (the agent does not know which databases actually returned data)

This MCP turns that into one tool surface. A single literature-review call queries three databases in parallel, normalizes the paper shapes, deduplicates by DOI, and returns explicit coverage stats your agent acts on directly. Single-database tools stay available when the user names the database.
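
To illustrate what "normalizes the paper shapes" can involve, here is a minimal sketch that maps two differently shaped records onto one schema. The input field names are assumptions based on the public Crossref and Semantic Scholar APIs; the sub-actors behind this server may already reshape their output, and the function names are hypothetical.

```python
def from_crossref(item: dict) -> dict:
    """Map a Crossref-style work record onto a common paper schema."""
    date_parts = (item.get("issued") or {}).get("date-parts") or [[None]]
    first = date_parts[0] if date_parts[0] else [None]
    return {
        "doi": (item.get("DOI") or "").lower(),
        "title": (item.get("title") or [""])[0],  # Crossref titles are arrays
        "year": first[0],
        "journal": (item.get("container-title") or [""])[0],
        "citationCount": item.get("is-referenced-by-count") or 0,
        "source": "Crossref",
    }


def from_semantic_scholar(item: dict) -> dict:
    """Map a Semantic Scholar-style paper record onto the same schema."""
    external_ids = item.get("externalIds") or {}
    return {
        "doi": (external_ids.get("DOI") or "").lower(),
        "title": item.get("title") or "",
        "year": item.get("year"),
        "journal": item.get("venue") or "",
        "citationCount": item.get("citationCount") or 0,
        "source": "Semantic Scholar",
    }
```

Once every source produces the same keys, deduplication and ranking can operate on one list without source-specific branches.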

Scheduling: run periodic literature sweeps on Apify Scheduler; pipe new papers to Slack or email via webhooks
API access: trigger searches from Python, JavaScript, or any HTTP client using standard MCP protocol
Parallel fan-out: literature review queries three databases simultaneously, not sequentially
No API keys: all six data sources are free public academic APIs, no credentials to provision
Integrations: pipe results into Notion, Airtable, Google Sheets, or any webhook-compatible knowledge base

Features

Multi-database search (8 MCP tools, 6 academic sources)

PubMed (36M+ biomedical citations) with field-tag syntax ([Title], [MeSH Terms], [Author]), boolean AND/OR/NOT, article-type filter, date range.
Semantic Scholar (200M+ papers) with AI-generated TLDR summaries, influential citation counts, venue and field-of-study filters, citation-sorted results.
ArXiv (2.4M+ preprints) with prefix syntax (ti:, au:, abs:, cat:), category filter, submission-date sort. Rate-limited at the source (1 request per 3s).
Crossref (150M+ DOI-registered works) with funder names, grant numbers, ORCID author IDs, publication-type filter (journal-article, book-chapter, dataset, etc.).
OpenAlex (250M+ works) with concept tagging, institution affiliations, citation-count sort, open-access filter.
ORCID (18M+ researchers) with name, affiliation, keyword, or raw Lucene query. Optional fetch_works flag pulls full publication lists.

Composite literature review

research_literature_review queries PubMed + Semantic Scholar + Crossref in parallel (and optionally ArXiv).
DOI normalization (strips https://doi.org/ prefix, lowercases) before dedup.
Multi-source ranking: papers found in more databases sort higher, ties broken by citation count.
Coverage stats (sourcesSearched, resultsPerSource, totalBeforeDedup, uniquePapersWithDoi, papersFoundInMultipleSources) returned on every call.
Papers without a DOI kept in a separate papersWithoutDoi[] array so the dedup count stays honest.
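
The normalization, dedup, and ranking steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: the function names are hypothetical, and the choices to keep the first-seen metadata and the larger citation count when merging are assumptions.

```python
DOI_URL_PREFIX = "https://doi.org/"


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip a leading https://doi.org/ prefix."""
    lowered = doi.strip().lower()
    if lowered.startswith(DOI_URL_PREFIX):
        lowered = lowered[len(DOI_URL_PREFIX):]
    return lowered


def merge_results(results_by_source: dict) -> tuple:
    """Merge per-source paper lists into (ranked papers, papers without a DOI)."""
    by_doi = {}
    without_doi = []
    for source, papers in results_by_source.items():
        for paper in papers:
            raw_doi = paper.get("doi")
            if not raw_doi:
                # No DOI: keep the record aside so it cannot skew dedup counts.
                without_doi.append({**paper, "foundIn": [source]})
                continue
            key = normalize_doi(raw_doi)
            if key not in by_doi:
                by_doi[key] = {**paper, "doi": key, "foundIn": [source]}
                continue
            record = by_doi[key]
            if source not in record["foundIn"]:
                record["foundIn"].append(source)
            # Assumption: when counts differ between sources, keep the larger one.
            record["citationCount"] = max(
                record.get("citationCount") or 0, paper.get("citationCount") or 0
            )
    ranked = list(by_doi.values())
    for record in ranked:
        record["sourceCount"] = len(record["foundIn"])
    # More sources first; citation count breaks ties.
    ranked.sort(key=lambda p: (-p["sourceCount"], -(p.get("citationCount") or 0)))
    return ranked, without_doi
```

Keying the map on the normalized DOI is what lets "10.1/A" from one database and "https://doi.org/10.1/a" from another collapse into a single record.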

Operational layer

Apify Standby mode with configurable idle-shutdown (default 300s, env var STANDBY_IDLE_TIMEOUT_SECS).
Failure-webhook registration on every container start, so customer-side failures are pushed to the operator's webhook handler automatically.
Per-tool timeouts (120s default, 180s for Semantic Scholar and ORCID, 300s for ArXiv) so a slow source degrades the result instead of blocking it.
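
The per-source timeout behaviour can be illustrated with Python's asyncio, as an analogue of the Promise.all fan-out this document mentions. This is a sketch under assumptions: the searcher callables are hypothetical stand-ins for the sub-actor calls, and returning an empty list on timeout is one possible way to let a slow source degrade the result.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

# Timeouts in seconds, taken from the list above; other sources use the default.
TIMEOUT_SECS = {"Semantic Scholar": 180, "ORCID": 180, "ArXiv": 300}
DEFAULT_TIMEOUT_SECS = 120

Searcher = Callable[[], Awaitable[list]]


async def _guarded(name: str, search: Searcher, timeout: float):
    """Run one search; a timeout yields an empty result instead of an error."""
    try:
        return name, await asyncio.wait_for(search(), timeout)
    except asyncio.TimeoutError:
        return name, []


async def fan_out(
    searchers: Dict[str, Searcher],
    timeouts: Optional[Dict[str, float]] = None,
) -> Dict[str, list]:
    """Query all sources concurrently and collect results per source name."""
    timeouts = TIMEOUT_SECS if timeouts is None else timeouts
    pairs = await asyncio.gather(
        *(
            _guarded(name, search, timeouts.get(name, DEFAULT_TIMEOUT_SECS))
            for name, search in searchers.items()
        )
    )
    return dict(pairs)
```

Because each search is wrapped individually, one slow database costs at most its own timeout and the remaining sources still contribute their papers.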

Quickstart workflows

Systematic review (Cochrane-style)

topic from user
 → research_literature_review (PubMed + S2 + Crossref + ArXiv)
 → sort papers by sourceCount desc, then citationCount desc
 → for each paper in top 50: include if foundIn.length >= 2
 → export DOIs + titles + abstracts for full-text retrieval
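
The screening steps in this workflow can be sketched as a function over the papers[] array of a parsed response. The function name and defaults are hypothetical; the field names (foundIn, sourceCount, citationCount) are those listed in this document.

```python
def screen_candidates(papers: list, top_n: int = 50, min_sources: int = 2) -> list:
    """Return DOI, title, and abstract for top-ranked papers seen in enough sources."""
    ranked = sorted(
        papers,
        key=lambda p: (
            -p.get("sourceCount", len(p.get("foundIn", []))),
            -(p.get("citationCount") or 0),
        ),
    )
    shortlist = [
        p for p in ranked[:top_n] if len(p.get("foundIn", [])) >= min_sources
    ]
    # Export only what full-text retrieval needs.
    return [
        {"doi": p["doi"], "title": p["title"], "abstract": p.get("abstract", "")}
        for p in shortlist
    ]
```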

Single-database lookup (named source)

user names a database, e.g. "find me PubMed papers on..."
 → research_search_pubmed with field tags
 → return papers with PubMed-specific fields (MeSH terms, article type, pubmed URL)

Researcher discovery

researcher name from user
 → research_find_researcher (family_name + affiliation)
 → if disambiguation needed: fetch_works=true on the top match
 → return ORCID profile + career history + publication list

Use cases for multi-database academic search

AI research assistants embedded in chat interfaces

Conversational AI assistants embedded in Claude Desktop, Cursor, ChatGPT, or custom agents need a single tool for "find me the papers." Without this MCP, an agent has to pick one database before searching and may miss corroborating results from other sources. The research_literature_review tool queries three databases in parallel, returns a unified DOI-deduplicated paper list with source coverage stats, and lets the agent answer "what does the literature say about X" with cross-database confidence. One MCP call replaces three database lookups plus a manual dedup step.

Systematic-review and meta-analysis teams

Systematic reviewers working under Cochrane or PRISMA guidance are expected to search several databases (typically PubMed plus at least two others) and to document the search strategy for each one. research_literature_review returns coverage.sourcesSearched and coverage.resultsPerSource on every call, so the per-database record of what was searched and how many hits came back is produced automatically. DOI-based dedup means the team starts the screening phase with a unique-record list, not a raw union with duplicates. Adding include_arxiv: true extends coverage into preprints, which is useful for fast-moving fields such as AI/ML and quantitative biology (bioRxiv itself is not among the six sources).

Biomedical analysts and science journalists

Journalists tracking a breaking science story or analysts compiling a clinical-evidence brief need PubMed (peer-reviewed biomedical), Crossref (DOI metadata with funders), and Semantic Scholar (AI TLDRs for fast skimming) in one place. The composite review tool runs all three in parallel, ranks papers by cross-database corroboration, and surfaces highly-cited or open-access work first. The Semantic Scholar TLDR field is particularly useful when triaging dozens of paper titles into a shortlist.

R&D and competitive-intelligence teams tracking preprints

Industry R&D teams in biotech, pharma, AI, and quantum need early signals from preprints before journal publication. research_search_arxiv with sort_by: submittedDate returns the latest preprints in a category (cs.AI, cs.CL, math.OC, q-bio.GN). Pair it with research_search_semantic_scholar filtered to venue: ArXiv and sort_by: citationCount to find which preprints are already accumulating citations, a useful early signal that a result is gaining traction. Schedule the search weekly via Apify Scheduler and pipe new high-citation preprints to Slack.

Academic recruiters and tenure committees

Recruiters, provosts, and tenure committees screening candidates need verified publication records, not LinkedIn summaries. research_find_researcher queries ORCID by name and affiliation, and with fetch_works: true returns the full publication list with external IDs (Scopus, ResearcherID) for cross-referencing. Combine with research_search_crossref filtered to the candidate's author name for funder and grant data. The ORCID profile carries employment history, useful for verifying claimed positions and start dates.

Knowledge-base builders for RAG pipelines

Teams building retrieval-augmented generation pipelines for a research domain need a clean, structured paper corpus, not scraped HTML. research_literature_review returns title, authors, year, journal, abstract (or Semantic Scholar TLDR), DOI, citation count, and open-access status as structured JSON ready to chunk and embed. Schedule monthly to keep the corpus fresh. The papersFoundInMultipleSources count is a useful pre-filter for trust-weighted retrieval.

How to connect this academic research MCP

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "academic-research": {
      "url": "https://academic-research-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}

Cursor, Windsurf, or Cline

Use the same URL and token in your MCP server settings panel. The server communicates via standard MCP protocol over HTTP POST to /mcp.

Python (via requests)

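import json

import requests

# Call the composite literature review tool via a JSON-RPC "tools/call" request.
response = requests.post(
    "https://academic-research-mcp.apify.actor/mcp",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_APIFY_TOKEN",
    },
    json={
        "jsonrpc": "2.0",
        "method": "tools/call",
        "params": {
            "name": "research_literature_review",
            "arguments": {
                "query": "CRISPR off-target effects",
                "year_from": 2022,
                "max_per_source": 50,
                "include_arxiv": False,
            },
        },
        "id": 1,
    },
    timeout=300,  # the review fans out to several sources and can be slow
)
response.raise_for_status()  # surface HTTP errors (e.g. a bad token) early
result = response.json()
# The tool result arrives as a JSON string inside the first content item.
review = json.loads(result["result"]["content"][0]["text"])
print(review["coverage"])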

JavaScript

const response = await fetch(
  "https://academic-research-mcp.apify.actor/mcp",
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": "Bearer YOUR_APIFY_TOKEN"
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      method: "tools/call",
      params: {
        name: "research_search_semantic_scholar",
        arguments: {
          query: "transformer attention mechanism",
          year_from: 2020,
          min_citations: 50,
          sort_by: "citationCount",
          max_results: 25
        }
      },
      id: 1
    })
  }
);
const data = await response.json();
// Single-database tools return a { total, source, papers } object as a JSON string.
const results = JSON.parse(data.result.content[0].text);
console.log(`Found ${results.total} papers from ${results.source}`);

cURL

# Run a multi-database literature review
curl -X POST "https://academic-research-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "research_literature_review",
      "arguments": {
        "query": "CAR-T cell therapy efficacy",
        "year_from": 2021,
        "year_to": 2025,
        "max_per_source": 50,
        "include_arxiv": false
      }
    },
    "id": 1
  }'

Environment variables

All six data sources are free public academic APIs and need no key.

Variable	Required	Purpose
`STANDBY_IDLE_TIMEOUT_SECS`	Optional	Standby idle-shutdown window in seconds (default 300, minimum 60). The instance exits after this idle period to release platform compute; the next request cold-starts a fresh one.
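
How a process might read this variable can be sketched as follows. The default (300) and minimum (60) come from the table above; clamping values below the minimum and falling back to the default on unparseable input are assumptions, and idle_timeout_secs is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional

DEFAULT_IDLE_SECS = 300
MIN_IDLE_SECS = 60


def idle_timeout_secs(env: Optional[Mapping[str, str]] = None) -> int:
    """Resolve the standby idle-shutdown window from the environment."""
    env = os.environ if env is None else env
    raw = env.get("STANDBY_IDLE_TIMEOUT_SECS")
    if raw is None or not raw.strip():
        return DEFAULT_IDLE_SECS
    try:
        value = int(raw)
    except ValueError:
        # Assumption: unparseable input falls back to the default.
        return DEFAULT_IDLE_SECS
    # Assumption: values below the documented minimum are raised to it.
    return max(value, MIN_IDLE_SECS)
```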

MCP tools

Available tools: the full MCP tool catalogue with per-call pricing

Tool	PPE event	Price	What it returns
`research_search_pubmed`	`search-pubmed`	$0.05	PubMed biomedical citations with title, authors, journal, MeSH terms, abstract, pubmed URL. Field tags ([Title], [MeSH Terms], [Author]), boolean AND/OR/NOT, article-type filter, date range, max 500.
`research_search_semantic_scholar`	`search-semantic-scholar`	$0.05	Semantic Scholar papers with AI-generated TLDR, influentialCitationCount, venue, field-of-study, open-access PDF link. Year range, venue, field, min_citations, sort by citationCount or publicationDate, max 500.
`research_search_arxiv`	`search-arxiv`	$0.05	ArXiv preprints with title, authors, abstract, category, PDF URL. Prefix syntax (ti:, au:, abs:, cat:), category filter (cs.AI, math.CO, stat.ML, physics.hep-th, etc.), sort by relevance or date, max 500. Rate-limited at source (1 req per 3s).
`research_search_crossref`	`search-crossref`	$0.05	Crossref DOI-registered works with funders, grant numbers, ORCID author IDs, licensing, publication type. Filter by query / author / journal / DOI prefix / type, year range, sort by relevance / citation count / publication date, max 500.
`research_search_openalex`	`search-openalex`	$0.05	OpenAlex works with concept tagging, institution affiliations, citation counts, open-access status. Year filter, min_citations, sort by relevance / cited_by_count / publication_date, max 500.
`research_find_researcher`	`find-researcher`	$0.05	ORCID researcher profiles with career history, employment, external IDs (Scopus, ResearcherID). Family name, given names, affiliation, keyword, or raw Lucene query. Optional `fetch_works: true` pulls full publication list. Max 100.
`research_literature_review`	`literature-review`	$0.15	Composite review: queries PubMed + Semantic Scholar + Crossref in parallel (+ optional ArXiv), deduplicates by DOI, ranks by source count then citation count. Returns `papers[]` with `foundIn[]` and `sourceCount`, plus `coverage` block (sourcesSearched, resultsPerSource, totalBeforeDedup, uniquePapersWithDoi, papersFoundInMultipleSources).
`research_list_sources`	(none, free)	Free	Enumerates the 8 tools, 6 sources, and record counts. No upstream fetch, no charge. Useful for agent planning before paying for a search.

Tool input reference

Tool	Parameter	Type	Required	Description
`research_search_pubmed`	`query`	string	No (one of three)	Search query, supports PubMed field tags (e.g. "diabetes AND metformin[MeSH Terms]")
`research_search_pubmed`	`author`	string	No	Author name (e.g. "Doudna JA")
`research_search_pubmed`	`journal`	string	No	Journal name (e.g. "Nature", "JAMA", "Lancet")
`research_search_pubmed`	`date_from` / `date_to`	string	No	YYYY/MM/DD or YYYY
`research_search_pubmed`	`article_type`	enum	No	Review / Clinical Trial / Randomized Controlled Trial / Meta-Analysis / Systematic Review / Case Reports
`research_search_pubmed`	`sort_by`	enum	No	relevance (default) or pub_date
`research_search_pubmed`	`max_results`	number	No	1 to 500, default 50
`research_search_semantic_scholar`	`query`	string	Yes	Search query (e.g. "transformer attention mechanism")
`research_search_semantic_scholar`	`year_from` / `year_to`	number	No	Year range
`research_search_semantic_scholar`	`venue`	string	No	Journal or conference (e.g. "NeurIPS", "Nature")
`research_search_semantic_scholar`	`field`	enum	No	Computer Science / Medicine / Biology / Physics / Chemistry / Mathematics / Engineering / Economics / Psychology / Sociology
`research_search_semantic_scholar`	`open_access_only`	boolean	No	Only papers with free PDF, default false
`research_search_semantic_scholar`	`min_citations`	number	No	Minimum citation count
`research_search_semantic_scholar`	`sort_by`	enum	No	relevance / citationCount / publicationDate
`research_search_arxiv`	`query`	string	No (one of two)	Search query with optional prefixes (e.g. "ti:attention AND au:vaswani")
`research_search_arxiv`	`category`	string	No	ArXiv category (e.g. "cs.AI", "math.CO", "stat.ML")
`research_search_arxiv`	`sort_by`	enum	No	relevance / lastUpdatedDate / submittedDate
`research_search_arxiv`	`sort_order`	enum	No	descending (default) or ascending
`research_search_crossref`	`query`	string	No (one of four)	Full-text search across title and abstract
`research_search_crossref`	`author`	string	No	Author name filter
`research_search_crossref`	`journal`	string	No	Journal or conference name
`research_search_crossref`	`doi_prefix`	string	No	DOI prefix (e.g. "10.1038" for Nature, "10.1126" for Science)
`research_search_crossref`	`type`	enum	No	journal-article / book-chapter / proceedings-article / posted-content / book / dataset / report
`research_search_crossref`	`year_from` / `year_to`	number	No	Year range
`research_search_crossref`	`sort_by`	enum	No	relevance / is-referenced-by-count / published
`research_search_openalex`	`query`	string	Yes	Search query across title, abstract, and full text
`research_search_openalex`	`year`	number	No	Filter to a single publication year
`research_search_openalex`	`min_citations`	number	No	Minimum citation count
`research_search_openalex`	`open_access_only`	boolean	No	Only open-access papers, default false
`research_search_openalex`	`sort_by`	enum	No	relevance_score:desc / cited_by_count:desc / publication_date:desc
`research_find_researcher`	`family_name`	string	No (one of five)	Last name (e.g. "Hinton", "LeCun")
`research_find_researcher`	`given_names`	string	No	First name(s)
`research_find_researcher`	`affiliation`	string	No	University or organization (e.g. "MIT", "Google DeepMind")
`research_find_researcher`	`keyword`	string	No	Research keyword (e.g. "deep learning", "CRISPR")
`research_find_researcher`	`query`	string	No	Raw ORCID Lucene query (overrides individual fields)
`research_find_researcher`	`fetch_works`	boolean	No	Fetch full publication list per researcher (slower), default false
`research_find_researcher`	`max_results`	number	No	1 to 100, default 25
`research_literature_review`	`query`	string	Yes	Research topic or question (e.g. "CAR-T cell therapy efficacy")
`research_literature_review`	`year_from` / `year_to`	number	No	Year range applied to all databases
`research_literature_review`	`max_per_source`	number	No	1 to 200, default 50
`research_literature_review`	`include_arxiv`	boolean	No	Also search ArXiv preprints (adds time due to source rate limit), default false
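
A client can check arguments against this reference before paying for a call. The sketch below covers research_literature_review only; the function name is hypothetical and the server may validate differently, but the ranges and defaults are the ones listed in the table above.

```python
from typing import Optional


def literature_review_args(
    query: str,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
    max_per_source: int = 50,
    include_arxiv: bool = False,
) -> dict:
    """Build the arguments object for research_literature_review."""
    if not query or not query.strip():
        raise ValueError("query is required")
    if not 1 <= max_per_source <= 200:
        raise ValueError("max_per_source must be between 1 and 200")
    if year_from is not None and year_to is not None and year_from > year_to:
        raise ValueError("year_from must not be later than year_to")
    args = {
        "query": query.strip(),
        "max_per_source": max_per_source,
        "include_arxiv": include_arxiv,
    }
    # Optional year bounds are only sent when set.
    if year_from is not None:
        args["year_from"] = year_from
    if year_to is not None:
        args["year_to"] = year_to
    return args
```

The returned dictionary can be placed under params.arguments in a tools/call request like the ones shown in the connection examples.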

Output example

{
  "query": "CRISPR off-target effects",
  "yearRange": { "from": 2022, "to": "present" },
  "coverage": {
    "sourcesSearched": ["PubMed", "Semantic Scholar", "Crossref"],
    "resultsPerSource": {
      "PubMed": 47,
      "Semantic Scholar": 50,
      "Crossref": 50
    },
    "totalBeforeDedup": 147,
    "uniquePapersWithDoi": 112,
    "papersWithoutDoi": 3,
    "papersFoundInMultipleSources": 28
  },
  "papers": [
    {
      "doi": "10.1038/s41587-023-01918-1",
      "title": "Prime editing with genome-wide off-target evaluation",
      "authors": "Anzalone AV, Gao XD, Podracky CJ, Nelson AT, Koblan LW, Raguram A, Levy JM, Mercer JAM, Liu DR",
      "year": 2023,
      "journal": "Nature Biotechnology",
      "citationCount": 482,
      "abstract": "Prime editing enables precise installation of substitutions, insertions, and deletions without requiring double-strand breaks. Here we develop a high-throughput off-target evaluation pipeline...",
      "isOpenAccess": false,
      "url": "https://pubmed.ncbi.nlm.nih.gov/37640944/",
      "foundIn": ["PubMed", "Semantic Scholar", "Crossref"],
      "sourceCount": 3
    },
    {
      "doi": "10.1016/j.cell.2022.10.012",
      "title": "Genome-wide specificity profiling of CRISPR-Cas9 base editors in human cells",
      "authors": "Kim D, Lim K, Kim S, Yoon S, Kim JS",
      "year": 2022,
      "journal": "Cell",
      "citationCount": 318,
      "abstract": "Base editors enable targeted single-nucleotide conversions. We profile genome-wide off-target activity of cytosine and adenine base editors using GUIDE-seq and Digenome-seq...",
      "isOpenAccess": true,
      "url": "https://www.semanticscholar.org/paper/abc123",
      "foundIn": ["PubMed", "Semantic Scholar"],
      "sourceCount": 2
    },
    {
      "doi": "10.1126/science.add8643",
      "title": "Engineered Cas12a variants with reduced off-target activity",
      "authors": "Liu Y, Wang J, Zhang H, Chen L, Doudna JA",
      "year": 2023,
      "journal": "Science",
      "citationCount": 156,
      "abstract": "We engineer Cas12a variants through directed evolution to reduce off-target cleavage while preserving on-target efficiency...",
      "isOpenAccess": false,
      "url": "https://doi.org/10.1126/science.add8643",
      "foundIn": ["Crossref"],
      "sourceCount": 1
    }
  ],
  "papersWithoutDoi": [
    {
      "title": "Conference talk: CRISPR safety in clinical translation",
      "authors": "Chen L, Doudna JA",
      "year": 2023,
      "journal": "ASH Annual Meeting Abstracts",
      "citationCount": 4,
      "url": "https://www.semanticscholar.org/paper/xyz789",
      "foundIn": ["Semantic Scholar"]
    }
  ]
}

The papers[] array sorts by sourceCount descending (papers in more databases first), with citation count as the tiebreaker. papersWithoutDoi is capped at the first 20 entries so the response stays compact. Single-database tools (research_search_pubmed, research_search_semantic_scholar, etc.) return a simpler { total, source, papers } shape with the source-specific field set.
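
A minimal sketch of how a client might tell the two response shapes apart and keep only corroborated papers. The interfaces are inferred from the output example and type only the fields used here; which fields are optional in practice is an assumption.

```typescript
// Assumed partial shapes, inferred from the output example above.
interface ReviewPaper {
  doi: string;
  title: string;
  citationCount: number;
  foundIn: string[];
  sourceCount: number;
}

interface ReviewResult {
  query: string;
  coverage: { sourcesSearched: string[] };
  papers: ReviewPaper[];
}

interface SingleSourceResult {
  total: number;
  source: string;
  papers: Record<string, unknown>[];
}

// Only the composite review carries a coverage block.
function isReviewResult(r: ReviewResult | SingleSourceResult): r is ReviewResult {
  return "coverage" in r;
}

// Keep papers returned by at least `minSources` databases.
function corroborated(r: ReviewResult, minSources: number = 2): ReviewPaper[] {
  return r.papers.filter((p) => p.sourceCount >= minSources);
}

const sampleReview: ReviewResult = {
  query: "CRISPR off-target effects",
  coverage: { sourcesSearched: ["PubMed", "Semantic Scholar", "Crossref"] },
  papers: [
    { doi: "10.1038/s41587-023-01918-1", title: "Prime editing with genome-wide off-target evaluation", citationCount: 482, foundIn: ["PubMed", "Semantic Scholar", "Crossref"], sourceCount: 3 },
    { doi: "10.1016/j.cell.2022.10.012", title: "Genome-wide specificity profiling of CRISPR-Cas9 base editors in human cells", citationCount: 318, foundIn: ["PubMed", "Semantic Scholar"], sourceCount: 2 },
    { doi: "10.1126/science.add8643", title: "Engineered Cas12a variants with reduced off-target activity", citationCount: 156, foundIn: ["Crossref"], sourceCount: 1 },
  ],
};

console.log(corroborated(sampleReview).map((p) => p.doi));
// [ '10.1038/s41587-023-01918-1', '10.1016/j.cell.2022.10.012' ]
```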

Output fields

Coverage block (returned by research_literature_review only)

Field	Type	Description
`coverage.sourcesSearched`	string[]	Exact list of databases that ran (e.g. `["PubMed", "Semantic Scholar", "Crossref"]`)
`coverage.resultsPerSource`	object	Per-database hit count; branch on this to detect thin searches
`coverage.totalBeforeDedup`	number	Raw count across all sources before DOI dedup
`coverage.uniquePapersWithDoi`	number	Deduplicated paper count
`coverage.papersWithoutDoi`	number	Count of results that lacked a DOI
`coverage.papersFoundInMultipleSources`	number	Corroboration count, the strongest "this is real" signal
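
Branching on the coverage block could look like the sketch below. The function name `thinSources` and the `minHits` threshold are illustrative assumptions; only the two coverage fields it reads come from the table above.

```typescript
// Assumed subset of the coverage block; only the fields used here are typed.
interface CoverageBlock {
  sourcesSearched: string[];
  resultsPerSource: Record<string, number>;
}

// Returns the sources that ran but delivered fewer than `minHits` results.
function thinSources(coverage: CoverageBlock, minHits: number = 1): string[] {
  return coverage.sourcesSearched.filter(
    (source) => (coverage.resultsPerSource[source] ?? 0) < minHits,
  );
}

const sampleCoverage: CoverageBlock = {
  sourcesSearched: ["PubMed", "Semantic Scholar", "Crossref"],
  resultsPerSource: { PubMed: 47, "Semantic Scholar": 50, Crossref: 0 },
};

console.log(thinSources(sampleCoverage)); // [ 'Crossref' ]
console.log(thinSources(sampleCoverage, 50)); // [ 'PubMed', 'Crossref' ]
```

An agent can use the returned list to decide whether to retry a source through its single-database tool.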

Per-paper fields (composite review)

Field	Type	Description
`doi`	string	DOI (normalized, no `https://doi.org/` prefix, lowercased)
`title` / `authors` / `year` / `journal`	string / string / number / string	Standard bibliographic fields
`citationCount`	number	Citation count from the most-detailed source (Semantic Scholar or OpenAlex preferred)
`abstract`	string	Full abstract, or Semantic Scholar AI TLDR when only S2 returned the paper
`isOpenAccess`	boolean | null	Open-access flag (null when no source returned it)
`url`	string	Best landing URL (pubmedUrl > semanticScholarUrl > crossref url > arxiv absUrl)
`foundIn`	string[]	List of databases that returned this paper
`sourceCount`	number	Length of `foundIn`, the rank-sort key
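
The landing-URL precedence in the `url` row can be sketched as a simple fallback chain. The `SourceUrls` shape is an assumption for illustration (in particular `crossrefUrl` and `arxivAbsUrl` are made-up names for the Crossref `url` and ArXiv `absUrl` fields); the ordering is the one documented above.

```typescript
// Hypothetical merged record holding whichever URLs the sources returned.
interface SourceUrls {
  pubmedUrl?: string;
  semanticScholarUrl?: string;
  crossrefUrl?: string;
  arxivAbsUrl?: string;
}

// Precedence: PubMed > Semantic Scholar > Crossref > ArXiv. Empty strings are skipped.
function bestLandingUrl(urls: SourceUrls): string | null {
  return (
    urls.pubmedUrl ||
    urls.semanticScholarUrl ||
    urls.crossrefUrl ||
    urls.arxivAbsUrl ||
    null
  );
}

console.log(
  bestLandingUrl({
    semanticScholarUrl: "https://www.semanticscholar.org/paper/abc123",
    crossrefUrl: "https://doi.org/10.1016/j.cell.2022.10.012",
  }),
); // https://www.semanticscholar.org/paper/abc123
```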

Single-database tool envelope

Field	Type	Description
`total`	number	Number of papers returned, counted after status-message rows are filtered out
`source`	string	Database name (e.g. "PubMed", "Semantic Scholar")
`papers`	object[]	Raw paper records with source-specific field sets

How much does it cost to run academic research searches?

Academic Research Intelligence MCP uses pay-per-event pricing: $0.05 per single-database search, $0.15 per multi-database literature review, free for research_list_sources. Platform compute is included.

Scenario	Tool calls	Cost per call	Total cost
Quick test, single-database lookup	1	$0.05	$0.05
Multi-database literature review (3 sources)	1	$0.15	$0.15
Multi-database review with ArXiv (4 sources)	1	$0.15	$0.15
Systematic review pipeline: review + 2 single-source supplemental	3	mixed	$0.25
Researcher discovery + full works fetch	1	$0.05	$0.05
50 literature reviews per month (active research team)	50	$0.15	$7.50
500 single-database lookups per month (RAG indexer)	500	$0.05	$25.00

You can set a maximum spending limit per run to control costs. The actor stops when your budget is reached, returning a structured error your pipeline can handle gracefully.

Apify's free tier includes $5 of monthly platform credits: enough for 100 single-database searches or 33 multi-database reviews before you need to add payment.
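
The arithmetic behind the pricing table and the free-tier figures can be checked with a small estimator. The prices are the ones stated in this section; the function names and the use of integer cents (to avoid floating-point drift) are choices made for this sketch.

```typescript
// Prices per event in cents, as stated above.
const PRICE_CENTS = { single: 5, review: 15, listSources: 0 } as const;
type CallKind = keyof typeof PRICE_CENTS;

// Total cost in USD for a mix of calls.
function estimateCostUsd(calls: Partial<Record<CallKind, number>>): number {
  let cents = 0;
  for (const kind of Object.keys(PRICE_CENTS) as CallKind[]) {
    cents += (calls[kind] ?? 0) * PRICE_CENTS[kind];
  }
  return cents / 100;
}

// How many calls of one kind fit in a budget.
function callsWithinBudget(budgetUsd: number, kind: CallKind): number {
  const price = PRICE_CENTS[kind];
  return price === 0 ? Infinity : Math.floor(Math.round(budgetUsd * 100) / price);
}

console.log(estimateCostUsd({ review: 1, single: 2 })); // 0.25
console.log(estimateCostUsd({ review: 50 })); // 7.5
console.log(callsWithinBudget(5, "single")); // 100
console.log(callsWithinBudget(5, "review")); // 33
```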

How it works

Standby request received. Apify routes the MCP POST to /mcp on the standby instance. The activity timer resets, so the idle-shutdown countdown restarts.
MCP tool dispatch. The McpServer matches the tool name (research_literature_review, research_search_pubmed, etc.) and validates input against the Zod schema. Invalid inputs return a structured { error: ... } response without charging.
PPE charge. Actor.charge({ eventName }) fires before any upstream call (so a failed sub-actor still bills, matching Apify PPE semantics).
Sub-actor call via apify-client. Each tool calls one or more sibling actors (ryanclinton/pubmed-research-search, ryanclinton/semantic-scholar-search, etc.) with memory: 256 and per-tool waitSecs timeout (120s default, 180s for S2 / ORCID, 300s for ArXiv).
Literature review fan-out. research_literature_review runs PubMed + Semantic Scholar + Crossref (+ optional ArXiv) in parallel via Promise.all. Each sub-actor returns a dataset; the MCP iterates items, filters out status-message rows, and normalizes the paper shape.
DOI deduplication. DOIs are normalized (strip https?://doi.org/ prefix, lowercase, trim), then collapsed in a Map<doi, { paper, sources[] }>. Papers without a DOI go to a separate papersWithoutDoi list so they do not pollute the dedup count.
Multi-source ranking. Deduplicated papers sort by sources.length descending, then citationCount descending. Coverage stats (sourcesSearched, resultsPerSource, papersFoundInMultipleSources) are computed from the raw per-source counts.
Idle shutdown. A 30-second interval checks Date.now() - lastRequestAt. If the gap exceeds STANDBY_IDLE_TIMEOUT_SECS (default 300), the actor calls Actor.exit() to release platform compute. The next request cold-starts a fresh instance.
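
The DOI deduplication and multi-source ranking steps above can be sketched as follows. This is a simplified illustration, not the actor's code: the record shapes are assumed, and the merged entry simply keeps the first record seen for each DOI rather than choosing the most detailed source.

```typescript
interface RawPaper {
  doi?: string;
  title: string;
  citationCount?: number;
}

interface MergedPaper extends RawPaper {
  doi: string;
  foundIn: string[];
  sourceCount: number;
}

// Trim, strip the doi.org prefix, lowercase.
function normalizeDoi(doi: string): string {
  return doi.trim().replace(/^https?:\/\/doi\.org\//i, "").toLowerCase();
}

function dedupeAndRank(bySource: Record<string, RawPaper[]>) {
  const merged = new Map<string, { paper: RawPaper; sources: string[] }>();
  const papersWithoutDoi: RawPaper[] = [];

  for (const source of Object.keys(bySource)) {
    for (const paper of bySource[source]) {
      if (!paper.doi) {
        papersWithoutDoi.push(paper); // cannot be deduplicated
        continue;
      }
      const key = normalizeDoi(paper.doi);
      const entry = merged.get(key);
      if (entry) {
        if (entry.sources.indexOf(source) === -1) entry.sources.push(source);
      } else {
        merged.set(key, { paper, sources: [source] });
      }
    }
  }

  const papers: MergedPaper[] = Array.from(merged.entries()).map(
    ([doi, { paper, sources }]) => ({
      ...paper,
      doi,
      foundIn: sources,
      sourceCount: sources.length,
    }),
  );
  // Most-corroborated first, citation count as the tiebreaker.
  papers.sort(
    (a, b) =>
      b.sourceCount - a.sourceCount ||
      (b.citationCount ?? 0) - (a.citationCount ?? 0),
  );
  return { papers, papersWithoutDoi };
}

// Sample input: the same DOI written two different ways by two sources.
const ranked = dedupeAndRank({
  PubMed: [{ doi: "10.1038/S41587-023-01918-1", title: "Prime editing", citationCount: 482 }],
  "Semantic Scholar": [{ title: "Conference talk", citationCount: 4 }],
  Crossref: [
    { doi: "https://doi.org/10.1038/s41587-023-01918-1", title: "Prime editing", citationCount: 482 },
    { doi: "10.1126/science.add8643", title: "Engineered Cas12a variants", citationCount: 156 },
  ],
});
console.log(ranked.papers.map((p) => [p.doi, p.sourceCount]));
// [ [ '10.1038/s41587-023-01918-1', 2 ], [ '10.1126/science.add8643', 1 ] ]
```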

Tips for best results

Call research_list_sources once per session. It is free, and the agent gets a current map of which database covers which discipline. Saves a wasted paid call when the user asks for a database that does not match the topic (e.g. ArXiv for biomedical-only research).
Use research_literature_review for "what does the literature say" prompts. One review call covers three databases for $0.15, the same price as three single-database calls at $0.05 each, and adds deduplication and coverage stats. Single-database tools are for when the user names the database.
Add include_arxiv: true only for STEM topics. ArXiv covers physics, math, CS, q-bio, q-fin, and stats. For biomedical, chemistry-only, or social science queries, ArXiv adds time (1 req per 3s rate limit) without adding coverage.
Use field tags in PubMed queries for precision. "BRCA1[Title/Abstract] AND breast cancer[MeSH Terms]" returns far fewer false positives than "BRCA1 breast cancer". The PubMed source supports the full field-tag and boolean syntax.
Sort Semantic Scholar by citationCount for established topics, publicationDate for emerging ones. Citation-sort surfaces canonical papers in a mature field; date-sort surfaces fresh work in fast-moving areas like LLM research.
Use ORCID IDs when known, not just names. Common researcher names ("Wei Zhang", "Sarah Kim") return mixed results from multiple people. research_find_researcher with family_name + affiliation disambiguates; passing the raw ORCID ID via the query parameter is even more precise.
Set fetch_works: false for researcher discovery, true for verification. Discovery (finding the right person) only needs profile metadata. Verification (confirming publication record) needs the full works list. Default is false so casual lookups stay fast.
Tune STANDBY_IDLE_TIMEOUT_SECS for traffic pattern. Bursty agent traffic benefits from a longer idle window (600-900s) to avoid cold starts. Always-on workloads can use the 300s default.
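
For the PubMed field-tag tip, a small query builder keeps tagged terms consistent when an agent or script assembles queries. The helper `pubmedQuery` is hypothetical; `[Title/Abstract]` and `[MeSH Terms]` are standard PubMed field tags used here as examples.

```typescript
interface TaggedTerm {
  term: string;
  field?: string; // PubMed field tag without brackets, e.g. "MeSH Terms"
}

// Joins terms with a boolean operator, appending [field] where given.
function pubmedQuery(terms: TaggedTerm[], op: "AND" | "OR" = "AND"): string {
  return terms
    .map(({ term, field }) => (field ? `${term}[${field}]` : term))
    .join(` ${op} `);
}

console.log(
  pubmedQuery([
    { term: "BRCA1", field: "Title/Abstract" },
    { term: "breast neoplasms", field: "MeSH Terms" },
  ]),
); // BRCA1[Title/Abstract] AND breast neoplasms[MeSH Terms]
```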

Combine with other Apify actors

Actor	How to combine
PubMed Research Search	The biomedical sub-actor. Call directly for batch jobs that do not need the MCP overhead, or to pull full datasets for downstream processing.
Semantic Scholar Search	The all-discipline sub-actor with AI TLDRs. Use directly when you need citation-sorted results across all fields with paper summaries.
ArXiv Paper Search	The preprint sub-actor. Call directly for high-volume preprint indexing where the 1-req-per-3s rate limit needs its own run isolation.
Crossref Paper Search	The DOI metadata sub-actor with funder and ORCID data. Use directly for funder-tracking or grant-paper-linkage workflows.
OpenAlex Research Search	The broad academic index sub-actor with concept tagging. Use directly for concept-based exploration and institution-level analytics.
ORCID Researcher Search	The researcher profile sub-actor. Use directly for batch researcher verification across a candidate list.
Research Integrity Screening MCP	The companion integrity tool. Use this MCP for literature discovery, then pipe candidate authors into the integrity MCP for retraction, paper-mill, and citation-anomaly screening.
NIH Research Grants	Cross-reference paper authors against NIH PI records to surface funding context for biomedical literature reviews.
Company Deep Research	Pair when researching biotech, pharma, or AI companies whose leadership has academic publishing histories.

Limitations

No full-text PDF download. This MCP returns metadata, abstracts, and source URLs. To retrieve PDFs, follow the url field per paper or use a separate full-text retrieval tool.
ArXiv rate limit at the source. ArXiv enforces 1 request per 3 seconds. Large max_results values on research_search_arxiv are correspondingly slow (a 300-result query takes ~15 minutes). The sub-actor handles the pacing, but the wait is real.
DOI coverage varies by database. Crossref always has DOIs (it is the DOI registry). Semantic Scholar and OpenAlex have high DOI coverage. PubMed has DOI coverage on most modern records, missing on older citations and some grey literature. ArXiv preprints have DOIs only after the corresponding journal publication.
Single-database tools return database-native field sets. A PubMed paper has mesh, articleType, pubmedUrl; a Semantic Scholar paper has tldr, influentialCitationCount. Only research_literature_review normalizes to a unified shape; single-database calls preserve source-specific fields.
ORCID fetch_works: true is slow. Pulling the full publication list adds one upstream call per researcher. For 25 researchers with fetch_works: true, expect 60-180 seconds.
Child sub-actor timeout is 120-300 seconds depending on tool. If a source is slow, it returns an empty array and the composite review still completes with available data. coverage.resultsPerSource shows which sources delivered, so a thin result is visible.
Semantic Scholar polite-pool rate limits apply. High-volume callers (1000+ requests per hour) may see throttling at the source. Spread calls across runs or schedule rather than burst.
OpenAlex indexes content from other sources. OpenAlex pulls from Crossref, PubMed, and others, so its results largely duplicate papers those sources already return. research_literature_review does not query OpenAlex for this reason. If you merge research_search_openalex output with a review yourself, deduplicate by DOI first, or the combined counts will look inflated.
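
The timeout behaviour described above (a slow source yields an empty array and the review still completes) can be sketched with Promise.all and a per-source fallback. The fetchers below are stand-ins for sub-actor calls, and `withFallback` and `fanOut` are hypothetical names; the real server's error handling may differ.

```typescript
type Fetcher = () => Promise<string[]>; // stand-in for one sub-actor call

// Resolves with `fallback` if the promise rejects or takes longer than `ms`.
function withFallback<T>(p: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve) => {
    const timer = setTimeout(() => resolve(fallback), ms);
    p.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      () => {
        clearTimeout(timer);
        resolve(fallback);
      },
    );
  });
}

// Runs all sources in parallel and reports per-source hit counts.
async function fanOut(sources: Record<string, Fetcher>, timeoutMs: number) {
  const names = Object.keys(sources);
  const results = await Promise.all(
    names.map((n) => withFallback(sources[n](), timeoutMs, [] as string[])),
  );
  const resultsPerSource: Record<string, number> = {};
  names.forEach((n, i) => {
    resultsPerSource[n] = results[i].length;
  });
  return { sourcesSearched: names, resultsPerSource };
}

const demoSources: Record<string, Fetcher> = {
  fast: async () => ["10.1/a", "10.1/b"],
  slow: () => new Promise<string[]>((r) => setTimeout(() => r(["10.1/c"]), 200)),
  broken: async () => {
    throw new Error("upstream failure");
  },
};

fanOut(demoSources, 50).then((coverage) => console.log(coverage.resultsPerSource));
// { fast: 2, slow: 0, broken: 0 }
```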

Integrations

Apify API: trigger academic searches programmatically from research-knowledge-base builders, literature-monitoring tools, or systematic-review software.
Webhooks: push new high-citation papers to Slack, email, or knowledge-base ingestion pipelines the moment a scheduled search completes.
Zapier: connect to Airtable or Google Sheets paper trackers; auto-log new literature-review results when topics are added.
Make: build research-monitoring workflows that re-run searches weekly and diff new papers against the prior run.
LangChain / LlamaIndex: embed the MCP as a tool in agent pipelines for automated literature search, RAG ingestion, and research synthesis.

Troubleshooting

Literature review returns coverage.resultsPerSource with one or two sources at 0. One or more sub-actors timed out or returned no matches. Check coverage.sourcesSearched to confirm which databases ran. If a specific source consistently times out, call its single-database tool directly with the same query to isolate the issue.

Tool returns { "error": "Provide at least one of: query, author, or journal" }. The tool requires at least one search field. PubMed, ArXiv, Crossref, and ORCID accept multiple optional fields but at least one must be set. Semantic Scholar and OpenAlex require query.

research_find_researcher returns many unrelated profiles. The name was too common. Add affiliation ("MIT", "Google DeepMind") or pass the ORCID ID via the query parameter for exact match.

ArXiv search is slow. Source rate limit (1 req per 3s). Lower max_results or run the search as a scheduled job. The actor handles the pacing; the wait is at ArXiv, not the MCP.

Cold-start delay on first call. Standby mode shuts the instance down after idle (default 300s) to release platform compute. First request after idle takes ~10-20 seconds to spin up. Subsequent calls in the same window are instant. Increase STANDBY_IDLE_TIMEOUT_SECS if you need longer warm windows.
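
The idle check behind this behaviour amounts to comparing the time since the last request with the timeout. The class below is a minimal sketch under that assumption; `IdleWatchdog` is a made-up name, and the interval wiring is shown only as a comment so the example terminates.

```typescript
class IdleWatchdog {
  private lastRequestAt: number;
  private readonly timeoutSecs: number;

  constructor(timeoutSecs: number = 300, now: number = Date.now()) {
    this.timeoutSecs = timeoutSecs;
    this.lastRequestAt = now;
  }

  // Call on every incoming request to restart the countdown.
  touch(now: number = Date.now()): void {
    this.lastRequestAt = now;
  }

  // True once the gap since the last request exceeds the timeout.
  isIdle(now: number = Date.now()): boolean {
    return now - this.lastRequestAt > this.timeoutSecs * 1000;
  }
}

// Wiring as described in this document (not executed here):
//   setInterval(() => { if (watchdog.isIdle()) Actor.exit(); }, 30000);

const watchdog = new IdleWatchdog(300, 0);
console.log(watchdog.isIdle(299000)); // false
console.log(watchdog.isIdle(301000)); // true
watchdog.touch(301000);
console.log(watchdog.isIdle(500000)); // false
```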

Tool returns { "error": true, "message": "Spending limit reached" }. Your Apify run has hit the maximum charge limit configured for the run. Increase maxTotalChargeUsd in your run configuration, or purchase additional platform credits.
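
The two error shapes shown in this section (a string `error`, and `error: true` with a `message`) can be folded into one check on the client side. The helper `toolError` is hypothetical, and whether the server emits other error shapes is not covered here.

```typescript
type ToolResult = Record<string, unknown>;

// Returns the error message, or null when the result is not an error.
function toolError(result: ToolResult): string | null {
  const err = result["error"];
  if (typeof err === "string") return err;
  if (err === true) {
    const msg = result["message"];
    return typeof msg === "string" ? msg : "Unknown error";
  }
  return null;
}

console.log(toolError({ error: "Provide at least one of: query, author, or journal" }));
// Provide at least one of: query, author, or journal
console.log(toolError({ error: true, message: "Spending limit reached" }));
// Spending limit reached
console.log(toolError({ total: 0, source: "PubMed", papers: [] })); // null
```

A pipeline can branch on the returned message, for example pausing further calls once the spending-limit error appears.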

Responsible use

All data accessed by this server comes from publicly available academic databases (PubMed, Semantic Scholar, ArXiv, Crossref, OpenAlex, ORCID) operating under open or polite-pool access policies.
Citation counts, AI TLDRs, and paper rankings are computed signals, not endorsements of paper quality. Always read the underlying papers before drawing conclusions.
When using ORCID profile data, comply with applicable data-protection regulations in your jurisdiction (GDPR, CCPA, etc.) when storing or sharing researcher information.
Do not use this tool to harass, defame, or discriminate against researchers based on publication record alone.
For guidance on web scraping and data use legality, see Apify's guide.

FAQ

What is the difference between this MCP and calling the six sub-actors directly? The sub-actors return source-specific paper shapes (different field names, different field sets). This MCP's research_literature_review tool normalizes them into a unified schema with DOI-based deduplication, and every tool is exposed over MCP so AI agents (Claude, Cursor, Windsurf, custom agents) can discover and call them through the standard protocol. If you are running batch jobs from a fixed script, calling the sub-actors directly may be a better fit; if you are building agent workflows, this MCP is the integration surface.

Why does research_literature_review only query three databases (PubMed, Semantic Scholar, Crossref) by default? These three give the highest cross-database coverage for most topics with the lowest latency. PubMed covers biomedical, Semantic Scholar covers all disciplines with AI summaries, Crossref covers DOI-registered publications across all fields. Adding ArXiv (set include_arxiv: true) is useful for STEM topics but adds noticeable time due to ArXiv's 1-req-per-3s rate limit. OpenAlex and ORCID are skipped from the composite review because OpenAlex aggregates from Crossref / PubMed (mostly duplicates) and ORCID is researcher-focused, not paper-focused.

How does the DOI deduplication work? Each paper's DOI is normalized (stripped of https?://doi.org/ prefix, lowercased, trimmed) and used as a map key. When the same DOI appears from multiple sources, the source names are appended to a sources[] array. The final papers[] is sorted by sources.length descending (most-corroborated first), then citationCount descending. Papers without a DOI cannot be deduplicated and go to a separate papersWithoutDoi[] list to keep the dedup count honest.

Do I need any API keys to use this MCP? No. All six data sources (PubMed, Semantic Scholar, ArXiv, Crossref, OpenAlex, ORCID) are free public academic APIs with no authentication required for the access patterns this MCP uses. You only need an Apify token (for authentication and billing) configured in your MCP client.

How long does a typical literature review take? The default three-database review (PubMed + Semantic Scholar + Crossref) typically completes in 60-120 seconds, depending on result counts and source response times. Adding ArXiv extends this to 90-300 seconds for the same query (ArXiv rate limit). Single-database searches typically complete in 20-60 seconds.

Can I get the full PDF of each paper? Not directly. This MCP returns paper metadata, abstracts (or AI TLDRs from Semantic Scholar), and source URLs. The url field per paper points to the landing page (PubMed, Semantic Scholar, ArXiv abstract, Crossref DOI). For open-access papers, follow the URL to the PDF; for paywalled papers, you will hit the publisher's access wall. A separate full-text retrieval step is required for actual PDF content.

How is this MCP different from web search or Google Scholar? Web search returns ranked pages, not structured paper records. Google Scholar returns paper records but has no API and is unfriendly to programmatic access. This MCP returns clean structured JSON from six official academic data APIs, with cross-database deduplication and explicit coverage stats. For RAG pipelines, agent workflows, and systematic reviews, structured API access is more reliable than scraping search results.

Can I schedule searches to run periodically? Yes. Use the Apify Scheduler to trigger the actor on a daily, weekly, or monthly cadence. Configure a webhook to push new papers (or papers above a citation threshold) to your notification system or knowledge base. This is useful for literature monitoring, RAG corpus refresh, and competitive-intelligence tracking of preprints.
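
For scheduled monitoring, the "what is new since the last run" step can be sketched as a DOI set difference with an optional citation threshold. Where the previous run's DOIs are stored is left open; `newPapers` and the `TrackedPaper` shape are assumptions for this example.

```typescript
interface TrackedPaper {
  doi: string;
  title: string;
  citationCount: number;
}

// Papers in `current` whose DOI was not seen in the previous run.
function newPapers(
  previousDois: string[],
  current: TrackedPaper[],
  minCitations: number = 0,
): TrackedPaper[] {
  const seen = new Set(previousDois.map((d) => d.toLowerCase()));
  return current.filter(
    (p) => !seen.has(p.doi.toLowerCase()) && p.citationCount >= minCitations,
  );
}

const previousRun = ["10.1038/s41587-023-01918-1"];
const currentRun: TrackedPaper[] = [
  { doi: "10.1038/s41587-023-01918-1", title: "Prime editing with genome-wide off-target evaluation", citationCount: 482 },
  { doi: "10.1126/science.add8643", title: "Engineered Cas12a variants with reduced off-target activity", citationCount: 156 },
];

console.log(newPapers(previousRun, currentRun).map((p) => p.doi));
// [ '10.1126/science.add8643' ]
console.log(newPapers(previousRun, currentRun, 200).length); // 0
```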

Is it legal to use this tool for academic research? All six underlying data sources (PubMed, Semantic Scholar, ArXiv, Crossref, OpenAlex, ORCID) are publicly available academic databases that explicitly support programmatic access. Accessing and analyzing public scholarly records is a standard practice in academic research, systematic reviews, and grant management. See Apify's guide on web scraping legality for broader context.

Why does the agent need to call research_list_sources if it is free? It saves wasted paid calls. The agent can check which databases cover which disciplines (e.g. ArXiv has little clinical biomedical content, PubMed has little CS content) before paying for a search that would return few or no results from a mismatched source. This is especially useful for the first call in a session, before the agent has built a mental model of the toolset.

Help us improve

If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:

Go to Account Settings > Privacy
Enable Share runs with public Actor creators

This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.

Support

Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.

Last verified: March 27, 2026

Ready to try Academic Research Intelligence MCP Server?

This actor is coming soon to the Apify Store.
