Drug Pipeline Report

Drug Pipeline Report gives pharmaceutical strategists, biotech investors, and pharmacovigilance teams a composite risk score (0-100) for any drug or therapeutic area in under two minutes — no data subscriptions required. It queries 7 public data sources simultaneously and synthesizes the results into four specialized scoring models: Pipeline Threat, First-Mover Advantage, Adverse Event Divergence, and Literature Momentum.

Pricing

Pay Per Event model. You only pay for what you use.

| Event | Description | Price |
| --- | --- | --- |
| analysis-run | Full intelligence analysis run | $0.40 |

Example: 100 events = $40.00 · 1,000 events = $400.00
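
The arithmetic behind these examples can be sketched in a few lines. This is only an illustration: `estimate_cost` is a hypothetical helper, and the only figure taken from this page is the $0.40 price per analysis-run event.

```python
from decimal import Decimal

# Price per analysis-run event, taken from the pricing table above.
PRICE_PER_ANALYSIS_RUN = Decimal("0.40")


def estimate_cost(runs: int) -> Decimal:
    """Return the estimated charge in USD for a number of analysis-run events."""
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return PRICE_PER_ANALYSIS_RUN * runs


print(estimate_cost(100))    # 40.00
print(estimate_cost(1000))   # 400.00
```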

Documentation

The actor runs all 7 sub-actors in parallel — clinical trial registries, FDA and EMA databases, PubMed, patent records, and adverse event reports — then applies weighted scoring algorithms to produce a single, structured intelligence report. You get one JSON record per run with every signal, score, raw record count, and top reaction term included. No scraping expertise needed.

What data can you extract?

| Data Point | Source | Example |
| --- | --- | --- |
| 🎯 Composite risk score | All 7 sources | 58 (HIGH) |
| ⚠️ Risk level classification | Scoring models | HIGH, CRITICAL, MODERATE, LOW |
| 🏭 Pipeline threat score | ClinicalTrials.gov | 65 (4 Phase 3 competitors detected) |
| 🔬 Phase distribution of competitors | ClinicalTrials.gov | { "Phase 3": 4, "Phase 2": 6, "Phase 1": 3 } |
| 🛡️ First-mover advantage score | Patent DB + FDA | 72 (5.3 years exclusivity remaining) |
| 📅 Earliest patent expiry date | Patent database | 2031-06-15 |
| 🚨 Adverse event divergence score | openFDA FAERS | 35 (ELEVATED): 85 reports, 1 death |
| 💊 Top adverse reactions | openFDA FAERS | Nausea (24 reports), Vomiting (15 reports) |
| 📈 Literature momentum score | PubMed | 78 (publication rate accelerating) |
| 📰 Top publishing journals | PubMed | NEJM, The Lancet, Diabetes Care |
| 📊 Data source record counts | All 7 sources | { clinicalTrials: 13, patents: 8, ... } |
| 🔔 All detected signals | Scoring models | ["4 Phase 3 competitors in pipeline", ...] |

Why use Drug Pipeline Report?

Manual pharmaceutical competitive intelligence means pulling ClinicalTrials.gov exports, running PubMed searches, cross-referencing FDA Orange Book data, and navigating the EMA product database — across 7 separate platforms, for each drug you track. At a research analyst's hourly rate, a single ad-hoc report can take 4-8 hours and still miss emerging signals from patent filings or adverse event trends.

This actor automates the entire data collection and scoring pipeline in a single run. You get the same intelligence that a pharma strategy team builds over a day, delivered in structured JSON in under two minutes.

  • Scheduling — run weekly competitive intelligence sweeps on your full pipeline portfolio to catch new Phase 3 entrants the moment they register
  • API access — trigger reports from Python, JavaScript, or any HTTP client to embed pipeline intelligence into internal dashboards or research tools
  • Proxy rotation — all sub-actor calls use Apify's built-in infrastructure with automatic retries for reliable data collection at scale
  • Monitoring — configure Slack or email alerts when runs complete so your team is notified the instant new pipeline data is available
  • Integrations — push results to Google Sheets for weekly pipeline tracking, or connect to Zapier and Make for automated report distribution

Features

  • 7 data sources queried in parallel — ClinicalTrials.gov, openFDA adverse events (FAERS), FDA drug approvals, FDA drug recalls, PubMed research publications, patent databases, and EMA medicine authorizations — all called simultaneously with Promise.allSettled for fault-tolerant execution
  • Composite risk score (0-100) — weighted synthesis across four models: Pipeline Threat (30%), Adverse Event Divergence (25%), Literature Momentum (25%), and inverted First-Mover Advantage (20%)
  • Pipeline Threat scoring — Phase 3 competitors are the primary signal at 8 points each (max 40). Trial density contributes up to 20 points. FDA and EMA approvals add up to 25 combined. Competitor recalls reduce the score by up to 15 points
  • First-Mover Advantage scoring — patent portfolio strength (max 30 points), years of exclusivity remaining at 2.5 points per year (max 25), trial phase lead (max 25), and existing FDA approvals (max 20). This dimension is inverted in the composite — higher advantage means lower overall risk
  • Adverse Event Divergence scoring: death reports contribute 7 points each (max 35). A serious-event ratio above 30% is treated as exceeding the class average and scores up to 25 points. Hospitalization burden adds up to 20 points, and report volume adds up to 20 points on a logarithmic scale
  • Literature Momentum scoring — publication volume (max 30), recency of publications in the last 2 years (max 30), acceleration at the 20%-growth threshold (max 25), and journal diversity (max 15)
  • Patent cliff detection: flags when less than 2 years of patent exclusivity remain and emits an explicit warning signal
  • Publication acceleration detection — compares recent 2-year publication output against the prior 2-year window; 20%+ growth triggers an accelerating signal
  • Clinical trial phase filter — restrict analysis to Phase 1, 2, 3, or 4 trials to focus on the most relevant competitive window
  • Company name filter: combine the drug name with a company name to sharpen the clinical trial and PubMed searches toward that company's pipeline
  • Raw data included — up to 50 clinical trials, 20 adverse event reports, 30 PubMed publications, and 30 patents are included in the output for downstream inspection
  • Fault-tolerant execution — if any individual sub-actor fails, the run continues and scores with the remaining data sources rather than failing completely
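
As an illustration of the publication acceleration check listed above, here is a minimal sketch. It assumes the "recent" window is the current and previous calendar year and the "prior" window is the two years before that; the function name, the calendar-year bucketing, and the handling of an empty prior window are assumptions, not the actor's confirmed implementation.

```python
def is_accelerating(yearly_counts: dict, current_year: int, growth_pct: int = 20) -> bool:
    """Compare the most recent 2-year publication count with the prior 2-year count.

    yearly_counts maps a publication year to a record count, in the same
    shape as the literatureMomentum.yearlyTrend output field.
    """
    recent = yearly_counts.get(current_year, 0) + yearly_counts.get(current_year - 1, 0)
    prior = yearly_counts.get(current_year - 2, 0) + yearly_counts.get(current_year - 3, 0)
    if prior == 0:
        # Assumption: any recent output counts as acceleration when there is no baseline.
        return recent > 0
    # Integer arithmetic avoids floating-point edge cases at exactly 20% growth.
    return recent * 100 >= prior * (100 + growth_pct)


print(is_accelerating({2022: 5, 2023: 5, 2024: 6, 2025: 6}, current_year=2025))
```

Using integer percentages keeps the boundary case (exactly 20% growth) deterministic.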

Use cases for drug pipeline intelligence

Pharmaceutical competitive strategy

Strategy and commercial teams at pharmaceutical companies need to know the competitive density in a therapeutic area before a pipeline go/no-go decision. This actor surfaces how many Phase 3 competitors are registered, how many have already received FDA approval, and whether EMA authorizations compound the market pressure. A CRITICAL Pipeline Threat score with 5+ Phase 3 competitors is a data-backed argument for reconsidering R&D allocation or repositioning the indication.

Biotech investment due diligence

Investors evaluating early-stage drug candidates need a fast competitive moat assessment. The First-Mover Advantage score quantifies exactly that: patent coverage depth, years of exclusivity remaining, and how far ahead in clinical development the candidate is relative to rivals. Combine this with the Pipeline Threat score to produce a defensible position/threat matrix for investment memos — in minutes, not weeks.

Pharmacovigilance and safety signal screening

Medical safety teams monitoring class-level adverse event trends can use the Adverse Event Divergence score as an early screening layer on top of their internal systems. When the FAERS death report count or serious event ratio exceeds class norms, the score and signals surface that pattern immediately. This does not replace established pharmacovigilance processes but provides a publicly-sourced cross-check that can be run on any drug or therapeutic class.

Patent expiry and lifecycle management

Patent attorneys and lifecycle management teams tracking exclusivity windows can query any active drug to get the earliest estimated patent expiry date and years of exclusivity remaining. The patent cliff signal fires automatically when that window drops below 2 years — giving teams a data trigger to accelerate formulation extensions, line extensions, or authorized generic strategies.
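
The exclusivity arithmetic can be sketched as follows. The 2.5 points per year (capped at 25) and the 2-year cliff threshold come from the feature list; the 365.25-day year, the function names, and treating an already-expired patent as 0 years are assumptions made for this sketch.

```python
from datetime import date


def years_remaining(expiry: date, today: date) -> float:
    """Fractional years between today and the earliest patent expiry (never negative)."""
    return max(0.0, (expiry - today).days / 365.25)


def exclusivity_points(years: float) -> float:
    """2.5 points per year of remaining exclusivity, capped at 25."""
    return min(25.0, 2.5 * years)


def patent_cliff(years: float) -> bool:
    """True when fewer than 2 years of exclusivity remain."""
    return years < 2.0


remaining = years_remaining(date(2030, 1, 1), today=date(2020, 1, 1))
print(round(remaining, 1), exclusivity_points(remaining), patent_cliff(remaining))
```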

Medical affairs and KOL landscape mapping

Medical affairs teams tracking research momentum in their therapeutic area can use the Literature Momentum score and top publishing journals list to understand where research activity is concentrated. Accelerating publication rates signal growing academic and clinical interest, and the journal diversity score indicates whether interest is broad-based or concentrated in specialty publications.

Business development and licensing

BD teams scouting in-licensing candidates can run a rapid pipeline threat assessment before entering due diligence. The composite risk score gives an objective baseline for comparing multiple candidates, and the signal list provides the specific competitive factors driving the score — making it easier to prioritize which assets to pursue.

How to generate a drug pipeline report

  1. Enter a drug name or therapeutic area — Type a specific drug name ("semaglutide"), active ingredient ("liraglutide"), or therapeutic area description ("GLP-1 receptor agonists" or "oncology checkpoint inhibitors") into the Query field.
  2. Optionally narrow by company and phase — Enter a company name like "Novo Nordisk" or "Pfizer" to focus searches, and select a clinical trial phase (Phase 1-4) to filter the competitive landscape to a specific development window.
  3. Click Start — The actor calls all 7 data sources simultaneously. Most runs complete in 60-120 seconds depending on result volume.
  4. Download your report — Go to the Dataset tab and export as JSON, CSV, or Excel. The single output record contains all four scoring models, all detected signals, and raw data slices.

Input parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| query | string | Yes | semaglutide | Drug name, active ingredient, or therapeutic area (e.g., "semaglutide", "GLP-1", "oncology checkpoint inhibitors") |
| company | string | No | | Pharmaceutical company name used to focus the clinical trial and PubMed searches (e.g., "Novo Nordisk", "Pfizer") |
| phase | string | No | | Clinical trial phase filter. One of: PHASE1, PHASE2, PHASE3, PHASE4 |
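
If you assemble the input programmatically, a small validation step can catch a mistyped phase before a paid run starts. The sketch below is illustrative only: `build_run_input` is a hypothetical helper, and upper-casing the phase is a convenience assumption. The parameter names and allowed phase values come from the table above.

```python
from typing import Optional

VALID_PHASES = {"PHASE1", "PHASE2", "PHASE3", "PHASE4"}


def build_run_input(query: str, company: Optional[str] = None, phase: Optional[str] = None) -> dict:
    """Build the actor input dict, rejecting an empty query or an unknown phase."""
    query = query.strip()
    if not query:
        raise ValueError("query is required")
    run_input = {"query": query}
    if company and company.strip():
        run_input["company"] = company.strip()
    if phase is not None:
        normalized = phase.strip().upper()
        if normalized not in VALID_PHASES:
            raise ValueError(f"phase must be one of {sorted(VALID_PHASES)}")
        run_input["phase"] = normalized
    return run_input


print(build_run_input("semaglutide", company="Novo Nordisk", phase="phase3"))
```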

Input examples

Competitive intelligence on a specific drug:

{
  "query": "semaglutide",
  "company": "Novo Nordisk"
}

Therapeutic area scan with Phase 3 focus:

{
  "query": "oncology checkpoint inhibitors",
  "phase": "PHASE3"
}

Minimal run — drug name only:

{
  "query": "pembrolizumab"
}

Input tips

  • Start with the INN (generic name) — using the International Nonproprietary Name ("semaglutide" not "Ozempic") produces more consistent results across ClinicalTrials.gov, PubMed, and patent databases, which index by active ingredient
  • Add company for sharper trial and literature results: the company name is appended to the search query for clinical trials and PubMed, narrowing results toward that company's pipeline
  • Use Phase 3 filter for near-term competitive threats — Phase 3 is the strongest threat signal in the scoring model; filtering to PHASE3 removes earlier-stage noise when you only need late-stage competitive picture
  • Use therapeutic area queries for market landscape — broad queries like "GLP-1 receptor agonists" or "VEGF inhibitors" give you the full competitive density picture rather than tracking a single drug
  • Run without a phase filter first — the phase distribution in the pipeline threat output shows you where competitor density is highest before deciding to filter

Output example

{
  "company": "Novo Nordisk",
  "drug": "semaglutide",
  "compositeScore": 58,
  "riskLevel": "HIGH",
  "query": "semaglutide",
  "phaseFilter": null,
  "pipelineThreat": {
    "score": 65,
    "competitorCount": 18,
    "phaseDistribution": {
      "Phase 3": 4,
      "Phase 2": 6,
      "Phase 1": 3,
      "Phase 4": 2
    },
    "sameIndicationTrials": 13,
    "recentApprovals": 2,
    "recentRecalls": 0,
    "threatLevel": "HIGH",
    "signals": [
      "4 Phase 3 competitors in pipeline",
      "13 active clinical trials in therapeutic area",
      "2 recent FDA approvals in class"
    ]
  },
  "firstMoverAdvantage": {
    "score": 72,
    "patentsCovering": 8,
    "earliestPatentExpiry": "2031-06-15",
    "yearsOfExclusivity": 5.3,
    "trialPhaseLead": 4,
    "approvalPathwayClear": true,
    "signals": [
      "5.3 years patent exclusivity remaining",
      "Already has 2 FDA approval(s)"
    ]
  },
  "adverseEventDivergence": {
    "score": 35,
    "totalReports": 85,
    "seriousEvents": 18,
    "deathReports": 1,
    "hospitalizationReports": 12,
    "seriousRatio": 0.212,
    "divergenceLevel": "ELEVATED",
    "topReactions": [
      { "term": "Nausea", "count": 24 },
      { "term": "Vomiting", "count": 15 },
      { "term": "Diarrhoea", "count": 11 },
      { "term": "Decreased appetite", "count": 8 },
      { "term": "Abdominal pain", "count": 6 }
    ],
    "signals": []
  },
  "literatureMomentum": {
    "score": 78,
    "publicationCount": 42,
    "recentPublications": 28,
    "yearlyTrend": {
      "2022": 8,
      "2023": 14,
      "2024": 14,
      "2025": 6
    },
    "accelerating": true,
    "topJournals": [
      "The New England Journal of Medicine",
      "The Lancet",
      "Diabetes Care",
      "Nature Medicine",
      "JAMA"
    ],
    "signals": [
      "28 publications in last 2 years",
      "Publication rate accelerating (28 recent vs 14 prior 2yr)"
    ]
  },
  "allSignals": [
    "4 Phase 3 competitors in pipeline",
    "13 active clinical trials in therapeutic area",
    "2 recent FDA approvals in class",
    "5.3 years patent exclusivity remaining",
    "Already has 2 FDA approval(s)",
    "28 publications in last 2 years",
    "Publication rate accelerating (28 recent vs 14 prior 2yr)"
  ],
  "dataSources": {
    "clinicalTrials": 13,
    "fdaAdverseEvents": 85,
    "fdaApprovals": 2,
    "fdaRecalls": 0,
    "pubmedPublications": 42,
    "patents": 8,
    "emaAuthorizations": 3
  },
  "rawData": {
    "clinicalTrials": ["...up to 50 trial records..."],
    "fdaAdverseEvents": ["...up to 20 FAERS records..."],
    "fdaApprovals": ["...all approval records..."],
    "fdaRecalls": [],
    "pubmedPublications": ["...up to 30 PubMed records..."],
    "patents": ["...up to 30 patent records..."],
    "emaAuthorizations": ["...all EMA records..."]
  }
}

Output fields

| Field | Type | Description |
| --- | --- | --- |
| company | string | Company name from input (or "Unknown") |
| drug | string | Drug name from the query field |
| compositeScore | number | Weighted composite risk score 0-100. Higher = greater overall risk |
| riskLevel | string | CRITICAL (75+), HIGH (50-74), MODERATE (25-49), LOW (0-24) |
| query | string | Original query string |
| phaseFilter | string or null | Clinical trial phase filter applied, or null |
| pipelineThreat.score | number | Pipeline competitive threat score 0-100 |
| pipelineThreat.competitorCount | number | Total competitor count across trials, FDA, and EMA |
| pipelineThreat.phaseDistribution | object | Count of competing trials by clinical phase |
| pipelineThreat.sameIndicationTrials | number | Active clinical trials in the same therapeutic area |
| pipelineThreat.recentApprovals | number | Recent FDA approvals in the same drug class |
| pipelineThreat.recentRecalls | number | Competitor recalls detected (reduces threat score) |
| pipelineThreat.threatLevel | string | CRITICAL, HIGH, MODERATE, or LOW |
| pipelineThreat.signals | string[] | Human-readable threat signals detected |
| firstMoverAdvantage.score | number | First-mover advantage score 0-100 (inverted in composite) |
| firstMoverAdvantage.patentsCovering | number | Number of relevant patents found |
| firstMoverAdvantage.earliestPatentExpiry | string or null | Earliest patent expiration date found |
| firstMoverAdvantage.yearsOfExclusivity | number | Estimated years of patent exclusivity remaining |
| firstMoverAdvantage.trialPhaseLead | number | Highest clinical phase (1-4) found in own trials |
| firstMoverAdvantage.approvalPathwayClear | boolean | True if Phase 3 or above, or existing FDA approval |
| firstMoverAdvantage.signals | string[] | First-mover signals (exclusivity warnings, approval status) |
| adverseEventDivergence.score | number | Adverse event divergence score 0-100 |
| adverseEventDivergence.totalReports | number | Total FAERS adverse event reports found |
| adverseEventDivergence.seriousEvents | number | Number of reports classified as serious |
| adverseEventDivergence.deathReports | number | Reports flagged with seriousnessdeath = 1 |
| adverseEventDivergence.hospitalizationReports | number | Reports flagged with seriousnesshospitalization = 1 |
| adverseEventDivergence.seriousRatio | number | Ratio of serious to total reports (0-1) |
| adverseEventDivergence.divergenceLevel | string | CRITICAL, CONCERNING, ELEVATED, or NORMAL |
| adverseEventDivergence.topReactions | array | Top 10 MedDRA reaction terms by frequency |
| adverseEventDivergence.signals | string[] | Safety signals exceeding class-average thresholds |
| literatureMomentum.score | number | Literature momentum score 0-100 |
| literatureMomentum.publicationCount | number | Total PubMed publications found |
| literatureMomentum.recentPublications | number | Publications in the last 2 years |
| literatureMomentum.yearlyTrend | object | Publication count per year |
| literatureMomentum.accelerating | boolean | True if recent 2-year output exceeds prior 2-year by 20%+ |
| literatureMomentum.topJournals | string[] | Top 5 publishing journals by article count |
| literatureMomentum.signals | string[] | Momentum signals (acceleration, volume thresholds) |
| allSignals | string[] | Combined signal list from all four scoring models |
| dataSources | object | Record counts from each of the 7 data sources |
| rawData.clinicalTrials | array | Up to 50 raw clinical trial records |
| rawData.fdaAdverseEvents | array | Up to 20 raw FAERS event records |
| rawData.fdaApprovals | array | All FDA approval records found |
| rawData.fdaRecalls | array | All FDA recall records found |
| rawData.pubmedPublications | array | Up to 30 raw PubMed publication records |
| rawData.patents | array | Up to 30 raw patent records |
| rawData.emaAuthorizations | array | All EMA authorization records found |
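
For spreadsheet or dashboard use, the nested record is often easier to handle as a flat summary row. The sketch below is one possible flattening using the field names documented above. `summarize` and the column names are hypothetical, and missing sections are tolerated so that a partial record does not raise.

```python
def summarize(record: dict) -> dict:
    """Flatten one report record into a single-level summary row."""
    threat = record.get("pipelineThreat") or {}
    first_mover = record.get("firstMoverAdvantage") or {}
    adverse = record.get("adverseEventDivergence") or {}
    literature = record.get("literatureMomentum") or {}
    return {
        "drug": record.get("drug"),
        "compositeScore": record.get("compositeScore"),
        "riskLevel": record.get("riskLevel"),
        "phase3Competitors": (threat.get("phaseDistribution") or {}).get("Phase 3", 0),
        "yearsOfExclusivity": first_mover.get("yearsOfExclusivity"),
        "divergenceLevel": adverse.get("divergenceLevel"),
        "accelerating": literature.get("accelerating"),
        "signalCount": len(record.get("allSignals") or []),
    }


sample = {
    "drug": "semaglutide",
    "compositeScore": 58,
    "riskLevel": "HIGH",
    "pipelineThreat": {"phaseDistribution": {"Phase 3": 4}},
    "firstMoverAdvantage": {"yearsOfExclusivity": 5.3},
    "allSignals": ["4 Phase 3 competitors in pipeline"],
}
print(summarize(sample))
```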

How much does it cost to generate a drug pipeline report?

Drug Pipeline Report uses pay-per-event pricing: each run is billed as one analysis-run event at $0.40, and platform compute costs for the 7 parallel sub-actor calls are included.

| Scenario | Runs | Cost per run | Total cost |
| --- | --- | --- | --- |
| Quick test | 1 | $0.40 | $0.40 |
| Weekly drug tracking (5 drugs) | 5 | $0.40 | $2.00 |
| Monthly pipeline sweep (20 drugs) | 20 | $0.40 | $8.00 |
| Quarterly portfolio review (50 drugs) | 50 | $0.40 | $20.00 |
| Annual competitive monitoring (200 drugs) | 200 | $0.40 | $80.00 |

You can set a maximum spending limit per run to control costs. The actor stops when your budget is reached.

Compare this to commercial pharmaceutical intelligence platforms like Cortellis, Citeline Pharmaprojects, or GlobalData Pharma, which charge $15,000-$50,000 per year for subscription access to similar pipeline data. For teams that run periodic competitive reviews rather than continuous monitoring, this actor delivers comparable public-data intelligence at a fraction of the cost with no subscription commitment.

Drug pipeline intelligence using the API

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/drug-pipeline-report").call(run_input={
    "query": "semaglutide",
    "company": "Novo Nordisk",
    "phase": "PHASE3"
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"Composite Risk: {item['compositeScore']}/100 ({item['riskLevel']})")
    print(f"Pipeline Threat: {item['pipelineThreat']['score']}/100 — {item['pipelineThreat']['threatLevel']}")
    print(f"First-Mover Advantage: {item['firstMoverAdvantage']['score']}/100")
    print(f"Patent Exclusivity: {item['firstMoverAdvantage']['yearsOfExclusivity']} years remaining")
    print(f"Phase 3 Competitors: {item['pipelineThreat']['phaseDistribution'].get('Phase 3', 0)}")
    print(f"Key Signals: {', '.join(item['allSignals'][:3])}")

JavaScript

import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/drug-pipeline-report").call({
    query: "semaglutide",
    company: "Novo Nordisk",
    phase: "PHASE3"
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    console.log(`Composite Risk: ${item.compositeScore}/100 (${item.riskLevel})`);
    console.log(`Pipeline Threat: ${item.pipelineThreat.threatLevel} — ${item.pipelineThreat.sameIndicationTrials} trials`);
    console.log(`First-Mover: ${item.firstMoverAdvantage.score}/100 — ${item.firstMoverAdvantage.yearsOfExclusivity} years exclusivity`);
    console.log(`Adverse Events: ${item.adverseEventDivergence.divergenceLevel} — ${item.adverseEventDivergence.totalReports} reports`);
    console.log(`Literature Momentum: ${item.literatureMomentum.score}/100 — accelerating: ${item.literatureMomentum.accelerating}`);
}

cURL

# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~drug-pipeline-report/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "semaglutide",
    "company": "Novo Nordisk",
    "phase": "PHASE3"
  }'

# Fetch results (replace DATASET_ID from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"

How Drug Pipeline Report works

Phase 1: Parallel data collection across 7 sources

The actor builds a tailored search query for each sub-actor. When a company name is provided, it appends the company to the clinical trial and PubMed queries (e.g., "semaglutide Novo Nordisk") to sharpen those searches toward the company's own pipeline. The drug name alone is used for FDA adverse events, FDA approvals, FDA recalls, and EMA searches — where company-level specificity would narrow results too aggressively. All 7 sub-actor calls execute via Promise.allSettled, which means a failure in any single source (network timeout, empty results, sub-actor error) does not abort the run — the remaining sources complete and the scoring models work with whatever data was successfully retrieved.
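
The actor itself uses JavaScript's Promise.allSettled. For readers working in Python, a roughly equivalent pattern is `asyncio.gather(..., return_exceptions=True)`. The sketch below shows that pattern only. `collect_sources`, the stand-in source functions, and the choice to substitute an empty list for a failed source are assumptions for illustration, and the 120-second default mirrors the per-sub-actor timeout mentioned under Troubleshooting.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


async def collect_sources(
    sources: Dict[str, Callable[[], Awaitable[List[Any]]]],
    timeout_s: float = 120.0,
) -> Dict[str, List[Any]]:
    """Run every source concurrently; a failure or timeout yields an empty list."""
    names = list(sources)
    outcomes = await asyncio.gather(
        *(asyncio.wait_for(sources[name](), timeout_s) for name in names),
        return_exceptions=True,  # collect errors instead of aborting the whole batch
    )
    return {
        name: ([] if isinstance(outcome, Exception) else outcome)
        for name, outcome in zip(names, outcomes)
    }


# Hypothetical stand-ins for two of the seven sub-actor calls.
async def _trials() -> List[Any]:
    return [{"id": "trial-1"}]


async def _patents() -> List[Any]:
    raise RuntimeError("simulated sub-actor failure")


data = asyncio.run(collect_sources({"clinicalTrials": _trials, "patents": _patents}))
print(data)
```

Because a failed source becomes an empty list, downstream scoring still runs, which is why checking the dataSources counts matters.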

Phase 2: Four-model scoring

Each scoring function receives the full data object — a dictionary keyed by sub-actor name — and extracts only the fields it needs. The Pipeline Threat model parses phases and phase fields from clinical trial records, normalizing both numeric ("3") and Roman numeral ("III") representations into a consistent phase string before building the phaseDistribution map. The Adverse Event Divergence model reads seriousnessdeath, seriousnesshospitalization, and seriousnessother fields from FAERS records, then aggregates MedDRA reaction terms from the nested patient.reaction array. The Literature Momentum model parses publication dates from publicationDate, pubDate, or date fields across PubMed records, buckets them by year, and compares the two most recent years against the two prior years. The First-Mover model normalizes patent expiration dates from either expirationDate or patentExpireDate fields and calculates fractional years remaining.
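
The phase normalization step can be pictured with a short sketch. It assumes a trial record carries either a `phases` list or a single `phase` string, as the paragraph above suggests, and that combined labels such as "Phase 2/Phase 3" are simply skipped. The function names and these record shapes are assumptions, not the actor's actual code.

```python
import re
from collections import Counter
from typing import Optional

_ROMAN = {"I": "1", "II": "2", "III": "3", "IV": "4"}


def normalize_phase(raw: object) -> Optional[str]:
    """Map '3', 'III', 'PHASE3', or 'Phase III' to 'Phase 3'; return None if unrecognized."""
    token = re.sub(r"phase", "", str(raw), flags=re.IGNORECASE).strip(" _-").upper()
    token = _ROMAN.get(token, token)
    return f"Phase {token}" if token in {"1", "2", "3", "4"} else None


def phase_distribution(trials: list) -> dict:
    """Count trials per normalized phase, counting each trial once per distinct phase."""
    counts: Counter = Counter()
    for trial in trials:
        raw_values = trial.get("phases") or []
        if isinstance(raw_values, str):
            raw_values = [raw_values]
        raw_values = list(raw_values)
        if trial.get("phase"):
            raw_values.append(trial["phase"])
        for label in {normalize_phase(value) for value in raw_values}:
            if label:
                counts[label] += 1
    return dict(counts)


print(phase_distribution([{"phases": ["PHASE3"]}, {"phase": "III"}, {"phase": "2"}]))
```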

Phase 3: Composite score assembly

The composite score applies a fixed weight vector: Pipeline Threat (30%), Adverse Event Divergence (25%), Literature Momentum (25%), First-Mover Advantage inverted (20%). The First-Mover score is subtracted from 100 before weighting — a strong patent and approval position reduces overall composite risk. The final score is clamped to [0, 100] and mapped to four risk tiers: CRITICAL (75+), HIGH (50-74), MODERATE (25-49), LOW (0-24). All signals from all four models are concatenated into allSignals for a single-field summary of every threshold that fired during the run.
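
The weighting and tier mapping described above can be written out as a short sketch. The weights, the inversion of the First-Mover score, the clamp, and the tier boundaries come from this section; rounding to the nearest integer and the input values in the example are assumptions.

```python
def composite_score(pipeline_threat: float, adverse_event: float,
                    literature: float, first_mover: float) -> int:
    """Weighted composite: 30% threat, 25% adverse events, 25% literature, 20% inverted first-mover."""
    raw = (
        30 * pipeline_threat
        + 25 * adverse_event
        + 25 * literature
        + 20 * (100 - first_mover)  # strong first-mover position lowers risk
    ) / 100
    # Clamp to [0, 100]; rounding to an integer is an assumption of this sketch.
    return round(min(100.0, max(0.0, raw)))


def risk_level(score: float) -> str:
    """Map a composite score to its risk tier."""
    if score >= 75:
        return "CRITICAL"
    if score >= 50:
        return "HIGH"
    if score >= 25:
        return "MODERATE"
    return "LOW"


# Hypothetical model scores, chosen only to illustrate the arithmetic.
score = composite_score(pipeline_threat=80, adverse_event=40, literature=60, first_mover=50)
print(score, risk_level(score))
```

With these inputs the terms are 24 + 10 + 15 + 10, so the composite is 59 and the tier is HIGH.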

Phase 4: Output assembly with raw data slices

The final dataset record includes the full scored report, all signals, and raw data slices: up to 50 clinical trial records, up to 20 FAERS event records, up to 30 PubMed records, and up to 30 patent records. FDA approval, recall, and EMA authorization records are included in full (not sliced) because these datasets are typically small. Source record counts are included in dataSources for quick validation that data collection succeeded across all 7 sources.

Tips for best results

  1. Use INN names for cross-source consistency. The International Nonproprietary Name ("semaglutide" not "Ozempic") is indexed consistently across ClinicalTrials.gov, PubMed, and patent databases. Trade names are inconsistently indexed and will produce lower record counts in clinical trial and patent sub-actors.

  2. Cross-check the dataSources counts before trusting the score. If clinicalTrials: 0 and patents: 0 appear in dataSources, it likely means the query term was too specific or the sub-actor returned no results. Try a broader query term before drawing conclusions from the composite score.

  3. Compare multiple candidates by running in sequence. The composite score is designed for relative comparison. Run the actor for 3-5 competing drugs in the same therapeutic area and rank by composite score to identify which has the most defensible position.

  4. Use the phase filter to stress-test a specific competitive window. Filter to PHASE3 to see only late-stage competitive pressure. The trial-based components of the Pipeline Threat score will then reflect only Phase 3 trials, giving a focused view of near-term market entry risk; FDA approvals and EMA authorizations are not filtered by phase.

  5. Schedule weekly runs on your top-priority drugs. New Phase 3 trial registrations appear on ClinicalTrials.gov within days of submission. A weekly scheduled run on your lead drug candidate will surface new competitive entrants before they appear in analyst reports.

  6. Pair the output with Company Deep Research to add corporate financial and organizational context to the pipeline scoring data — useful for BD and licensing assessments.

  7. For broad therapeutic area scans, start with no company filter. Omitting the company name returns the full competitive landscape. Then re-run with specific competitor company names to see their individual pipeline positions within that landscape.

  8. Export raw data for secondary analysis. The rawData fields include structured records from all 7 sources. Export as JSON and load into a notebook or analysis tool to build custom visualizations of the phase distribution or publication trend data.
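
Tips 2 and 3 can be combined into a small post-processing step: flag records whose dataSources counts include zeros, then rank the rest. The sketch below is illustrative; `empty_sources` and `rank_by_composite` are hypothetical helpers, and sorting ascending (lowest composite risk first) is an interpretation of "most defensible position".

```python
def empty_sources(record: dict) -> list:
    """Return the names of data sources that reported zero records."""
    counts = record.get("dataSources") or {}
    return sorted(name for name, count in counts.items() if not count)


def rank_by_composite(records: list) -> list:
    """Order records from lowest to highest composite risk score."""
    return sorted(records, key=lambda record: record["compositeScore"])


# Hypothetical records for two candidates in the same therapeutic area.
reports = [
    {"drug": "drug-a", "compositeScore": 62, "dataSources": {"clinicalTrials": 13, "patents": 8}},
    {"drug": "drug-b", "compositeScore": 41, "dataSources": {"clinicalTrials": 0, "patents": 0}},
]
for report in rank_by_composite(reports):
    print(report["drug"], report["compositeScore"], "empty sources:", empty_sources(report))
```

A low score paired with empty sources, as with the second record here, is a prompt to broaden the query rather than a sign of low risk.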

Combine with other Apify actors

| Actor | How to combine |
| --- | --- |
| Company Deep Research | Run after Drug Pipeline Report to add corporate financial health, executive team, and M&A history context to companies surfaced in the pipeline threat analysis |
| Patent Search | Query individual patent numbers from rawData.patents to get full claim text and assignee details for legal review |
| Competitor Analysis Report | Cross-reference pipeline threat signals against competitor commercial performance data for a full competitive picture |
| SEC EDGAR Filing Analyzer | Pull 10-K and 10-Q filings for competitors identified in the pipeline threat output to understand their R&D spend and pipeline investment priorities |
| Website Content to Markdown | Convert competitor pipeline pages or drug database records to clean markdown for LLM-powered analysis pipelines |
| B2B Lead Qualifier | Score pharma companies identified in the pipeline analysis as potential BD partners or acquisition targets based on 30+ firmographic signals |

Limitations

  • Data is publicly sourced only. The actor queries ClinicalTrials.gov, openFDA, PubMed, public patent databases, and the EMA product database. Proprietary pipeline databases (Citeline, Cortellis, GlobalData) contain additional confidential trial data that this actor cannot access.
  • Patent expiry dates are estimates. Patent expiry is calculated from expirationDate or patentExpireDate fields in the patent database records. These dates do not account for patent term extensions (PTE), supplementary protection certificates (SPC), or litigation-related delays that may extend or shorten the actual exclusivity window.
  • Adverse event scoring is a screening tool, not a clinical assessment. FAERS data reflects voluntary and mandatory reports submitted to the FDA and includes duplicate records, confounding medications, and incomplete case narratives. The Adverse Event Divergence score surfaces patterns, not causality. It must not replace formal signal detection methods.
  • ClinicalTrials.gov registration lag. Registration can trail trial initiation: U.S. rules under FDAAA 801 allow up to 21 days after the first participant is enrolled. Very recently initiated Phase 3 trials may not appear in results.
  • Publication dates may be inconsistent across PubMed records. The momentum model parses publicationDate, pubDate, and date fields. Records with missing or malformed date fields are excluded from the yearly trend calculation, which may slightly undercount recent publications.
  • EMA data covers European authorizations only. Products authorized exclusively in other non-FDA, non-EMA markets (Japan PMDA, Health Canada, TGA Australia) are not captured in the competitive threat scoring.
  • Sub-actor failures reduce scoring confidence. If a sub-actor returns zero results due to a temporary API issue, the scoring model treats that dimension as zero data. Always check dataSources record counts to confirm successful collection before acting on a score.
  • Not a substitute for expert analysis. The composite score and signals are decision-support tools. Material strategic decisions should be reviewed by qualified pharmaceutical strategists, patent attorneys, and medical affairs professionals.

Integrations

  • Zapier — trigger an automated pipeline report whenever a new drug enters your company's development watchlist, and route the results to Slack or email
  • Make — build weekly competitive intelligence workflows that run Drug Pipeline Report on a schedule and push high-risk scores to a shared dashboard
  • Google Sheets — export composite scores and signal lists to a portfolio tracking spreadsheet for regular pipeline review meetings
  • Apify API — integrate drug pipeline scoring directly into internal research tools, BD platforms, or clinical operations systems
  • Webhooks — post completed report data to any internal endpoint, triggering downstream analysis or notification workflows
  • LangChain / LlamaIndex — feed pipeline report output into LLM-powered competitive intelligence agents for automated narrative summaries and strategic recommendations

Troubleshooting

Composite score seems unexpectedly low despite known competitive activity. Check the dataSources counts in the output. If clinicalTrials or patents show 0 records, the query term likely returned no results from those sub-actors. Try the drug's INN name instead of a brand name, or broaden the query to the mechanism of action (e.g., "PD-1 inhibitor" instead of "pembrolizumab").

Run times out or takes longer than 2 minutes. The actor calls 7 sub-actors in parallel with a 120-second timeout per sub-actor. Queries that return very large result sets (broad therapeutic area queries) may hit this limit for some sub-actors. Those sub-actors will return partial or empty results but the run will still complete. Use a more specific drug name or add a company filter to reduce result volume.

phaseFilter is set in output but phase distribution includes other phases. The phase filter applies only to the clinical-trial-tracker sub-actor query. FDA approvals, EMA authorizations, and adverse events are not filtered by phase — they are always retrieved for all phases regardless of the phase input.

allSignals is empty despite a high composite score. Signals are generated only when data crosses specific thresholds (e.g., 3+ Phase 3 competitors, 2+ recent FDA approvals). A high score driven by sub-threshold contributions across multiple dimensions may not trigger any individual signal while still producing an elevated composite score. Review the individual model scores to understand which dimensions are driving the composite.

EMA authorization count is always 0. The EMA Medicines Search sub-actor uses the drug name directly. Some drugs are authorized under different names in the EU. Try querying with the INN rather than a brand name, or check the EMA Product database manually to confirm the drug's EU authorization status.

Responsible use

  • This actor queries publicly available regulatory databases, clinical trial registries, and academic literature indexes.
  • Adverse event data from FAERS represents reported events and does not establish causation or constitute clinical guidance.
  • Do not use pipeline intelligence derived from this actor as the sole basis for clinical, regulatory, or patient-safety decisions.
  • Patent expiry estimates are informational only and do not constitute legal advice. Consult a qualified patent attorney before making lifecycle management decisions based on exclusivity calculations.
  • For guidance on web scraping legality and data use, see Apify's guide.

FAQ

How many data sources does the drug pipeline report query? The actor queries 7 sources simultaneously: ClinicalTrials.gov (via clinical-trial-tracker), FDA adverse events (FAERS via openFDA), FDA drug approvals, FDA drug recalls, PubMed research publications, patent databases, and EMA medicine authorizations. All 7 run in parallel using Promise.allSettled, so a single source failure does not abort the run.

How is the composite drug pipeline risk score calculated? The composite score (0-100) is a weighted average of four models: Pipeline Threat (30%), Adverse Event Divergence (25%), Literature Momentum (25%), and inverted First-Mover Advantage (20%). The First-Mover score is subtracted from 100 before weighting — a strong patent position reduces overall risk. All four model scores contribute to the final number even if individual data sources returned partial results.
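The weighting described above can be written out as a short calculation. This is a minimal sketch, not the actor's source code: the function and parameter names are hypothetical, and the sketch does not round or clamp the result, since the text does not say whether the actor does either.

```python
def composite_score(pipeline_threat: float,
                    adverse_event_divergence: float,
                    literature_momentum: float,
                    first_mover_advantage: float) -> float:
    """Weighted average of the four model scores (each on a 0-100 scale).

    Weights are taken from the FAQ text: 30 / 25 / 25 / 20 percent.
    First-Mover Advantage is inverted (100 - score) before weighting,
    so a strong patent position lowers the composite risk.
    """
    inverted_first_mover = 100 - first_mover_advantage
    weighted_sum = (
        30 * pipeline_threat
        + 25 * adverse_event_divergence
        + 25 * literature_momentum
        + 20 * inverted_first_mover
    )
    return weighted_sum / 100


# Illustrative inputs only, not real output from the actor.
example = composite_score(
    pipeline_threat=65,
    adverse_event_divergence=50,
    literature_momentum=60,
    first_mover_advantage=40,
)
print(example)  # 59.0
```

In this example, raising first_mover_advantage from 40 to 90 while holding the other inputs fixed lowers the composite by 10 points, which is the effect of the inversion.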

What does the Pipeline Threat score measure specifically? Pipeline Threat quantifies competitive pressure from other drugs in the same therapeutic area. Phase 3 competitors are the strongest signal at 8 points each (maximum 40 points). Active trial density adds up to 20 points. Recent FDA approvals in the same class contribute up to 15 points. EMA authorizations add up to 10 points. Competitor recalls reduce the score by up to 15 points because they indicate competitors are losing ground.
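The caps listed above can be sketched as follows. Only the Phase 3 rate (8 points each) is stated in the text; the per-unit rates for the other components are not given, so this hypothetical function takes those components as already-computed point values and only applies the stated caps. Flooring the result at 0 is also an assumption.

```python
def pipeline_threat(phase3_competitors: int,
                    trial_density_points: float,
                    fda_approval_points: float,
                    ema_authorization_points: float,
                    recall_penalty_points: float) -> float:
    """Combine the Pipeline Threat components using the caps from the FAQ."""
    phase3 = min(8 * phase3_competitors, 40)      # 8 points each, max 40
    density = min(trial_density_points, 20)       # up to 20 points
    approvals = min(fda_approval_points, 15)      # up to 15 points
    ema = min(ema_authorization_points, 10)       # up to 10 points
    penalty = min(recall_penalty_points, 15)      # reduces score by up to 15
    total = phase3 + density + approvals + ema - penalty
    return max(total, 0)  # assumption: the score is not allowed to go negative


# Illustrative inputs only.
print(pipeline_threat(4, 20, 10, 3, 0))  # 65
```

With these caps, the positive components sum to at most 85 points.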

How accurate is the First-Mover Advantage patent exclusivity calculation? The exclusivity estimate is based on the earliest expirationDate or patentExpireDate field found in patent records. It does not account for patent term extensions (PTE), supplementary protection certificates in EU markets, or ongoing litigation that could shorten or lengthen the actual exclusivity window. Treat it as a directional indicator, not a legal determination. The patent cliff warning fires automatically when estimated exclusivity drops below 2 years.
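To make the limits of this estimate concrete, here is a rough sketch of an "earliest expiry" calculation. It assumes the date fields hold ISO-formatted strings (YYYY-MM-DD), which the text does not confirm, and the function names are hypothetical. Like the estimate described above, it ignores extensions, SPCs, and litigation.

```python
from datetime import date
from typing import Optional


def estimated_exclusivity_years(patents: list[dict], today: date) -> Optional[float]:
    """Years from `today` until the earliest patent expiry, or None if no dates exist."""
    expiries = []
    for patent in patents:
        # Field names come from the FAQ; the ISO date format is an assumption.
        raw = patent.get("expirationDate") or patent.get("patentExpireDate")
        if raw:
            expiries.append(date.fromisoformat(raw[:10]))
    if not expiries:
        return None
    return (min(expiries) - today).days / 365.25


def patent_cliff_warning(years: Optional[float]) -> bool:
    """Mirror the stated rule: warn when estimated exclusivity is below 2 years."""
    return years is not None and years < 2


# Illustrative records only.
records = [
    {"expirationDate": "2031-03-15"},
    {"patentExpireDate": "2026-06-30"},
    {"title": "record with no expiry field"},
]
years = estimated_exclusivity_years(records, today=date(2025, 1, 1))
print(patent_cliff_warning(years))  # True
```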

Is it legal to query FDA, PubMed, and ClinicalTrials.gov data? Yes. All data sources queried by this actor are publicly accessible government databases and academic indexes. ClinicalTrials.gov, openFDA (FAERS), FDA drug approvals and recalls, PubMed, and the EMA medicines database are all publicly available under open data policies. See Apify's guide on web scraping legality for broader context.

How long does a typical drug pipeline report run take? Most runs complete in 60-120 seconds. Because all 7 sub-actors run in parallel, total time is determined by the single slowest call rather than the sum of all calls. Broad therapeutic area queries that return large result sets may take slightly longer.

Can I compare two competing drugs using this actor? Yes. Run the actor once for each drug and compare their composite scores, Pipeline Threat levels, and First-Mover Advantage scores side by side. The structured JSON output is designed for programmatic comparison. A typical competitive analysis runs 3-5 drugs in a therapeutic area, then ranks them by composite score and pipeline threat level.
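A side-by-side comparison like this can be scripted once each run's JSON record has been downloaded. The sketch below is illustrative: the field names (query, compositeScore, pipelineThreatScore) are assumptions, so check them against the actual output record, and the drug labels and scores are made up.

```python
def rank_reports(reports: list[dict]) -> list[dict]:
    """Sort reports from highest to lowest risk.

    Ranks by composite score first, then by pipeline threat score
    to break ties. Both key names are hypothetical.
    """
    return sorted(
        reports,
        key=lambda r: (r["compositeScore"], r["pipelineThreatScore"]),
        reverse=True,
    )


# One record per actor run; values are invented for illustration.
reports = [
    {"query": "drug-a", "compositeScore": 58, "pipelineThreatScore": 65},
    {"query": "drug-b", "compositeScore": 71, "pipelineThreatScore": 40},
    {"query": "drug-c", "compositeScore": 58, "pipelineThreatScore": 72},
]

for report in rank_reports(reports):
    print(report["query"], report["compositeScore"], report["pipelineThreatScore"])
```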

How is this different from commercial pharma intelligence platforms like Cortellis or GlobalData? Commercial platforms include proprietary trial data, confidential licensing agreements, and human-curated pipeline entries that go beyond public registrations. This actor uses only publicly available data sources and produces a quantitative composite score rather than qualitative analyst commentary. It is best suited for teams that need fast, objective, public-data assessments — not a replacement for full platform subscriptions in regulatory or clinical contexts.

What does the Literature Momentum score indicate about competition? A high Literature Momentum score means the therapeutic area is generating a large and growing volume of academic research. This signals that more labs and companies are investigating the space, which typically precedes an increase in clinical trial activity 2-4 years later. An accelerating publication rate (20%+ growth) is an early warning that competitive density will likely increase in the coming years.

Can I schedule this actor to monitor my pipeline automatically? Yes. Use Apify's built-in scheduler to run the actor daily, weekly, or monthly. Configure Slack or email alerts so your team is notified when a run completes. This is the recommended approach for ongoing competitive monitoring of priority drug candidates.

What happens if one of the 7 sub-actors fails during a run? The actor uses Promise.allSettled for parallel execution, which means individual sub-actor failures are caught and recorded without aborting the overall run. The affected data source will show 0 in dataSources, and the scoring models will proceed with data from the remaining sources. Check dataSources counts in the output to confirm whether all 7 sources returned data successfully.
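Because a failed source shows up only as a zero count, it can help to check each report before trusting its scores. The following is a small sketch of such a check. The dataSources key and the clinicalTrials and patents names appear in this documentation; the pubmed key in the example is an assumption.

```python
def empty_sources(report: dict) -> list[str]:
    """Return the names of data sources that reported zero (or missing) records."""
    counts = report.get("dataSources", {})
    return sorted(name for name, count in counts.items() if not count)


# Illustrative record only; real records contain all 7 sources.
report = {"dataSources": {"clinicalTrials": 0, "patents": 0, "pubmed": 42}}

missing = empty_sources(report)
if missing:
    print("No records from:", ", ".join(missing))
```

A zero count does not distinguish a failed sub-actor from a query that matched nothing, so an empty source is a prompt to re-run or rephrase the query rather than proof of an error.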

Does the Adverse Event Divergence score replace pharmacovigilance systems? No. The score aggregates publicly available FAERS data as a screening-level indicator. FAERS combines voluntary reports from healthcare professionals and consumers with mandatory reports from manufacturers, and its records may include duplicates, confounding factors, and incomplete narratives. The score is designed to surface patterns worth investigating through formal pharmacovigilance processes, not to replace them.

Help us improve

If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:

  1. Go to Account Settings > Privacy
  2. Enable Share runs with public Actor creators

This lets us see your run details when something goes wrong. Your data is visible only to the actor developer, not publicly.

Support

Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.
