Drug Pipeline Report
Drug Pipeline Report gives pharmaceutical strategists, biotech investors, and pharmacovigilance teams a composite risk score (0-100) for any drug or therapeutic area in under two minutes — no data subscriptions required. It queries 7 public data sources simultaneously and synthesizes the results into four specialized scoring models: Pipeline Threat, First-Mover Advantage, Adverse Event Divergence, and Literature Momentum.
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| analysis-run | Full intelligence analysis run | $0.40 |
Example: 100 events = $40.00 · 1,000 events = $400.00
Documentation
The actor runs all 7 sub-actors in parallel — clinical trial registries, FDA and EMA databases, PubMed, patent records, and adverse event reports — then applies weighted scoring algorithms to produce a single, structured intelligence report. You get one JSON record per run with every signal, score, raw record count, and top reaction term included. No scraping expertise needed.
What data can you extract?
| Data Point | Source | Example |
|---|---|---|
| 🎯 Composite risk score | All 7 sources | 58 (HIGH) |
| ⚠️ Risk level classification | Scoring models | HIGH, CRITICAL, MODERATE, LOW |
| 🏭 Pipeline threat score | ClinicalTrials.gov | 65 — 4 Phase 3 competitors detected |
| 🔬 Phase distribution of competitors | ClinicalTrials.gov | { "Phase 3": 4, "Phase 2": 6, "Phase 1": 3 } |
| 🛡️ First-mover advantage score | Patent DB + FDA | 72 — 5.3 years exclusivity remaining |
| 📅 Earliest patent expiry date | Patent database | 2031-06-15 |
| 🚨 Adverse event divergence score | openFDA FAERS | 35 (ELEVATED) — 85 reports, 1 death |
| 💊 Top adverse reactions | openFDA FAERS | Nausea (24 reports), Vomiting (15 reports) |
| 📈 Literature momentum score | PubMed | 78 — publication rate accelerating |
| 📰 Top publishing journals | PubMed | NEJM, The Lancet, Diabetes Care |
| 📊 Data source record counts | All 7 sources | { clinicalTrials: 13, patents: 8, ... } |
| 🔔 All detected signals | Scoring models | ["4 Phase 3 competitors in pipeline", ...] |
Why use Drug Pipeline Report?
Manual pharmaceutical competitive intelligence means pulling ClinicalTrials.gov exports, running PubMed searches, cross-referencing FDA Orange Book data, and navigating the EMA product database — across 7 separate platforms, for each drug you track. At a research analyst's hourly rate, a single ad-hoc report can take 4-8 hours and still miss emerging signals from patent filings or adverse event trends.
This actor automates the entire data collection and scoring pipeline in a single run. You get the same intelligence that a pharma strategy team builds over a day, delivered in structured JSON in under two minutes.
- Scheduling — run weekly competitive intelligence sweeps on your full pipeline portfolio to catch new Phase 3 entrants the moment they register
- API access — trigger reports from Python, JavaScript, or any HTTP client to embed pipeline intelligence into internal dashboards or research tools
- Proxy rotation — all sub-actor calls use Apify's built-in infrastructure with automatic retries for reliable data collection at scale
- Monitoring — configure Slack or email alerts when runs complete so your team is notified the instant new pipeline data is available
- Integrations — push results to Google Sheets for weekly pipeline tracking, or connect to Zapier and Make for automated report distribution
Features
- 7 data sources queried in parallel — ClinicalTrials.gov, openFDA adverse events (FAERS), FDA drug approvals, FDA drug recalls, PubMed research publications, patent databases, and EMA medicine authorizations, all called simultaneously with `Promise.allSettled` for fault-tolerant execution
- Composite risk score (0-100) — weighted synthesis across four models: Pipeline Threat (30%), Adverse Event Divergence (25%), Literature Momentum (25%), and inverted First-Mover Advantage (20%)
- Pipeline Threat scoring — Phase 3 competitors are the primary signal at 8 points each (max 40). Trial density contributes up to 20 points. FDA and EMA approvals add up to 25 combined. Competitor recalls reduce the score by up to 15 points
- First-Mover Advantage scoring — patent portfolio strength (max 30 points), years of exclusivity remaining at 2.5 points per year (max 25), trial phase lead (max 25), and existing FDA approvals (max 20). This dimension is inverted in the composite — higher advantage means lower overall risk
- Adverse Event Divergence scoring — death reports contribute 7 points each (max 35). A serious-event ratio above 30% is treated as exceeding the class average and adds up to 25 points. Hospitalization burden adds up to 20 points. Report volume is scored logarithmically, up to 20 points
- Literature Momentum scoring — publication volume (max 30), recency of publications in the last 2 years (max 30), acceleration at the 20%-growth threshold (max 25), and journal diversity (max 15)
- Phase cliff detection — flags when patent exclusivity is less than 2 years remaining with an explicit warning signal
- Publication acceleration detection — compares recent 2-year publication output against the prior 2-year window; 20%+ growth triggers an accelerating signal
- Clinical trial phase filter — restrict analysis to Phase 1, 2, 3, or 4 trials to focus on the most relevant competitive window
- Company name filter — combine drug name with a company name to sharpen searches across all 7 sub-actors simultaneously
- Raw data included — up to 50 clinical trials, 20 adverse event reports, 30 PubMed publications, and 30 patents are included in the output for downstream inspection
- Fault-tolerant execution — if any individual sub-actor fails, the run continues and scores with the remaining data sources rather than failing completely
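The point rules that the feature list states explicitly can be written down directly. The sketch below covers only those rules (Phase 3 competitors, exclusivity years, death reports, and the patent cliff flag). The function names are hypothetical, clamping negative exclusivity to zero is an assumption, and components whose formulas are not documented here (trial density, hospitalization burden, report volume, and so on) are left out on purpose.

```python
def phase3_points(phase3_competitors: int) -> int:
    """8 points per Phase 3 competitor, capped at 40."""
    return min(8 * phase3_competitors, 40)


def exclusivity_points(years_remaining: float) -> float:
    """2.5 points per year of exclusivity remaining, capped at 25.

    Negative values (already expired) are clamped to zero; that is an assumption.
    """
    return min(2.5 * max(years_remaining, 0.0), 25.0)


def death_report_points(death_reports: int) -> int:
    """7 points per death report, capped at 35."""
    return min(7 * death_reports, 35)


def patent_cliff(years_remaining: float) -> bool:
    """Flag the patent cliff when less than 2 years of exclusivity remain."""
    return years_remaining < 2.0


print(phase3_points(4))         # 4 Phase 3 competitors
print(exclusivity_points(5.3))  # 5.3 years of exclusivity remaining
print(death_report_points(1))   # 1 death report
print(patent_cliff(5.3))
```

These are partial contributions, not full model scores: each model adds further components before its 0-100 score is produced.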
Use cases for drug pipeline intelligence
Pharmaceutical competitive strategy
Strategy and commercial teams at pharmaceutical companies need to know the competitive density in a therapeutic area before a pipeline go/no-go decision. This actor surfaces how many Phase 3 competitors are registered, how many have already received FDA approval, and whether EMA authorizations compound the market pressure. A CRITICAL Pipeline Threat score with 5+ Phase 3 competitors is a data-backed argument for reconsidering R&D allocation or repositioning the indication.
Biotech investment due diligence
Investors evaluating early-stage drug candidates need a fast competitive moat assessment. The First-Mover Advantage score quantifies exactly that: patent coverage depth, years of exclusivity remaining, and how far ahead in clinical development the candidate is relative to rivals. Combine this with the Pipeline Threat score to produce a defensible position/threat matrix for investment memos — in minutes, not weeks.
Pharmacovigilance and safety signal screening
Medical safety teams monitoring class-level adverse event trends can use the Adverse Event Divergence score as an early screening layer on top of their internal systems. When the FAERS death report count or serious event ratio exceeds class norms, the score and signals surface that pattern immediately. This does not replace established pharmacovigilance processes but provides a publicly-sourced cross-check that can be run on any drug or therapeutic class.
Patent expiry and lifecycle management
Patent attorneys and lifecycle management teams tracking exclusivity windows can query any active drug to get the earliest estimated patent expiry date and years of exclusivity remaining. The patent cliff signal fires automatically when that window drops below 2 years — giving teams a data trigger to accelerate formulation extensions, line extensions, or authorized generic strategies.
Medical affairs and KOL landscape mapping
Medical affairs teams tracking research momentum in their therapeutic area can use the Literature Momentum score and top publishing journals list to understand where research activity is concentrated. Accelerating publication rates signal growing academic and clinical interest, and the journal diversity score indicates whether interest is broad-based or concentrated in specialty publications.
Business development and licensing
BD teams scouting in-licensing candidates can run a rapid pipeline threat assessment before entering due diligence. The composite risk score gives an objective baseline for comparing multiple candidates, and the signal list provides the specific competitive factors driving the score — making it easier to prioritize which assets to pursue.
How to generate a drug pipeline report
- Enter a drug name or therapeutic area — Type a specific drug name ("semaglutide"), active ingredient ("liraglutide"), or therapeutic area description ("GLP-1 receptor agonists" or "oncology checkpoint inhibitors") into the Query field.
- Optionally narrow by company and phase — Enter a company name like "Novo Nordisk" or "Pfizer" to focus searches, and select a clinical trial phase (Phase 1-4) to filter the competitive landscape to a specific development window.
- Click Start — The actor calls all 7 data sources simultaneously. Most runs complete in 60-120 seconds depending on result volume.
- Download your report — Go to the Dataset tab and export as JSON, CSV, or Excel. The single output record contains all four scoring models, all detected signals, and raw data slices.
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `query` | string | Yes | semaglutide | Drug name, active ingredient, or therapeutic area (e.g., "semaglutide", "GLP-1", "oncology checkpoint inhibitors") |
| `company` | string | No | — | Pharmaceutical company name to focus the analysis across all 7 sub-actors (e.g., "Novo Nordisk", "Pfizer") |
| `phase` | string | No | — | Clinical trial phase filter. One of: PHASE1, PHASE2, PHASE3, PHASE4 |
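If you build inputs programmatically, it can help to validate them before starting a run. The helper below is a minimal client-side sketch based only on the table above; `build_run_input` is a hypothetical name, not part of the actor or the Apify client, and the actor may apply its own validation.

```python
from typing import Optional

# Allowed values for the phase parameter, as listed in the table above
VALID_PHASES = {"PHASE1", "PHASE2", "PHASE3", "PHASE4"}


def build_run_input(query: str, company: Optional[str] = None,
                    phase: Optional[str] = None) -> dict:
    """Return an actor input dict, raising ValueError on invalid values."""
    if not query or not query.strip():
        raise ValueError("query is required")
    run_input = {"query": query.strip()}
    if company:
        run_input["company"] = company.strip()
    if phase is not None:
        if phase not in VALID_PHASES:
            raise ValueError(f"phase must be one of {sorted(VALID_PHASES)}")
        run_input["phase"] = phase
    return run_input


print(build_run_input("semaglutide", company="Novo Nordisk", phase="PHASE3"))
```

Optional fields are omitted rather than sent as empty strings, which keeps the input identical to the minimal examples in this section.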
Input examples
Competitive intelligence on a specific drug:
```json
{
  "query": "semaglutide",
  "company": "Novo Nordisk"
}
```
Therapeutic area scan with Phase 3 focus:
```json
{
  "query": "oncology checkpoint inhibitors",
  "phase": "PHASE3"
}
```
Minimal run — drug name only:
```json
{
  "query": "pembrolizumab"
}
```
Input tips
- Start with the INN (generic name) — using the International Nonproprietary Name ("semaglutide" not "Ozempic") produces more consistent results across ClinicalTrials.gov, PubMed, and patent databases, which index by active ingredient
- Add company for sharper patent and trial results — the company name is appended to the search query for clinical trials and PubMed, narrowing results toward that company's pipeline
- Use the Phase 3 filter for near-term competitive threats — Phase 3 is the strongest threat signal in the scoring model; filtering to PHASE3 removes earlier-stage noise when you only need the late-stage competitive picture
- Use therapeutic area queries for market landscape — broad queries like "GLP-1 receptor agonists" or "VEGF inhibitors" give you the full competitive density picture rather than tracking a single drug
- Run without a phase filter first — the phase distribution in the pipeline threat output shows you where competitor density is highest before deciding to filter
Output example
```json
{
  "company": "Novo Nordisk",
  "drug": "semaglutide",
  "compositeScore": 58,
  "riskLevel": "HIGH",
  "query": "semaglutide",
  "phaseFilter": null,
  "pipelineThreat": {
    "score": 65,
    "competitorCount": 18,
    "phaseDistribution": {
      "Phase 3": 4,
      "Phase 2": 6,
      "Phase 1": 3,
      "Phase 4": 2
    },
    "sameIndicationTrials": 13,
    "recentApprovals": 2,
    "recentRecalls": 0,
    "threatLevel": "HIGH",
    "signals": [
      "4 Phase 3 competitors in pipeline",
      "13 active clinical trials in therapeutic area",
      "2 recent FDA approvals in class"
    ]
  },
  "firstMoverAdvantage": {
    "score": 72,
    "patentsCovering": 8,
    "earliestPatentExpiry": "2031-06-15",
    "yearsOfExclusivity": 5.3,
    "trialPhaseLead": 4,
    "approvalPathwayClear": true,
    "signals": [
      "5.3 years patent exclusivity remaining",
      "Already has 2 FDA approval(s)"
    ]
  },
  "adverseEventDivergence": {
    "score": 35,
    "totalReports": 85,
    "seriousEvents": 18,
    "deathReports": 1,
    "hospitalizationReports": 12,
    "seriousRatio": 0.212,
    "divergenceLevel": "ELEVATED",
    "topReactions": [
      { "term": "Nausea", "count": 24 },
      { "term": "Vomiting", "count": 15 },
      { "term": "Diarrhoea", "count": 11 },
      { "term": "Decreased appetite", "count": 8 },
      { "term": "Abdominal pain", "count": 6 }
    ],
    "signals": []
  },
  "literatureMomentum": {
    "score": 78,
    "publicationCount": 42,
    "recentPublications": 28,
    "yearlyTrend": {
      "2022": 8,
      "2023": 14,
      "2024": 14,
      "2025": 6
    },
    "accelerating": true,
    "topJournals": [
      "The New England Journal of Medicine",
      "The Lancet",
      "Diabetes Care",
      "Nature Medicine",
      "JAMA"
    ],
    "signals": [
      "28 publications in last 2 years",
      "Publication rate accelerating (28 recent vs 14 prior 2yr)"
    ]
  },
  "allSignals": [
    "4 Phase 3 competitors in pipeline",
    "13 active clinical trials in therapeutic area",
    "2 recent FDA approvals in class",
    "5.3 years patent exclusivity remaining",
    "Already has 2 FDA approval(s)",
    "28 publications in last 2 years",
    "Publication rate accelerating (28 recent vs 14 prior 2yr)"
  ],
  "dataSources": {
    "clinicalTrials": 13,
    "fdaAdverseEvents": 85,
    "fdaApprovals": 2,
    "fdaRecalls": 0,
    "pubmedPublications": 42,
    "patents": 8,
    "emaAuthorizations": 3
  },
  "rawData": {
    "clinicalTrials": ["...up to 50 trial records..."],
    "fdaAdverseEvents": ["...up to 20 FAERS records..."],
    "fdaApprovals": ["...all approval records..."],
    "fdaRecalls": [],
    "pubmedPublications": ["...up to 30 PubMed records..."],
    "patents": ["...up to 30 patent records..."],
    "emaAuthorizations": ["...all EMA records..."]
  }
}
```
Output fields
| Field | Type | Description |
|---|---|---|
| `company` | string | Company name from input (or "Unknown") |
| `drug` | string | Drug name from the query field |
| `compositeScore` | number | Weighted composite risk score 0-100. Higher = greater overall risk |
| `riskLevel` | string | CRITICAL (75+), HIGH (50-74), MODERATE (25-49), LOW (0-24) |
| `query` | string | Original query string |
| `phaseFilter` | string \| null | Clinical trial phase filter applied, or null |
| `pipelineThreat.score` | number | Pipeline competitive threat score 0-100 |
| `pipelineThreat.competitorCount` | number | Total competitor count across trials, FDA, and EMA |
| `pipelineThreat.phaseDistribution` | object | Count of competing trials by clinical phase |
| `pipelineThreat.sameIndicationTrials` | number | Active clinical trials in the same therapeutic area |
| `pipelineThreat.recentApprovals` | number | Recent FDA approvals in the same drug class |
| `pipelineThreat.recentRecalls` | number | Competitor recalls detected (reduces threat score) |
| `pipelineThreat.threatLevel` | string | CRITICAL, HIGH, MODERATE, or LOW |
| `pipelineThreat.signals` | string[] | Human-readable threat signals detected |
| `firstMoverAdvantage.score` | number | First-mover advantage score 0-100 (inverted in composite) |
| `firstMoverAdvantage.patentsCovering` | number | Number of relevant patents found |
| `firstMoverAdvantage.earliestPatentExpiry` | string \| null | Earliest patent expiration date found |
| `firstMoverAdvantage.yearsOfExclusivity` | number | Estimated years of patent exclusivity remaining |
| `firstMoverAdvantage.trialPhaseLead` | number | Highest clinical phase (1-4) found in own trials |
| `firstMoverAdvantage.approvalPathwayClear` | boolean | True if Phase 3 or above, or existing FDA approval |
| `firstMoverAdvantage.signals` | string[] | First-mover signals (exclusivity warnings, approval status) |
| `adverseEventDivergence.score` | number | Adverse event divergence score 0-100 |
| `adverseEventDivergence.totalReports` | number | Total FAERS adverse event reports found |
| `adverseEventDivergence.seriousEvents` | number | Number of reports classified as serious |
| `adverseEventDivergence.deathReports` | number | Reports flagged with `seriousnessdeath = 1` |
| `adverseEventDivergence.hospitalizationReports` | number | Reports flagged with `seriousnesshospitalization = 1` |
| `adverseEventDivergence.seriousRatio` | number | Ratio of serious to total reports (0-1) |
| `adverseEventDivergence.divergenceLevel` | string | CRITICAL, CONCERNING, ELEVATED, or NORMAL |
| `adverseEventDivergence.topReactions` | array | Top 10 MedDRA reaction terms by frequency |
| `adverseEventDivergence.signals` | string[] | Safety signals exceeding class-average thresholds |
| `literatureMomentum.score` | number | Literature momentum score 0-100 |
| `literatureMomentum.publicationCount` | number | Total PubMed publications found |
| `literatureMomentum.recentPublications` | number | Publications in the last 2 years |
| `literatureMomentum.yearlyTrend` | object | Publication count per year |
| `literatureMomentum.accelerating` | boolean | True if recent 2-year output exceeds prior 2-year by 20%+ |
| `literatureMomentum.topJournals` | string[] | Top 5 publishing journals by article count |
| `literatureMomentum.signals` | string[] | Momentum signals (acceleration, volume thresholds) |
| `allSignals` | string[] | Combined signal list from all four scoring models |
| `dataSources` | object | Record counts from each of the 7 data sources |
| `rawData.clinicalTrials` | array | Up to 50 raw clinical trial records |
| `rawData.fdaAdverseEvents` | array | Up to 20 raw FAERS event records |
| `rawData.fdaApprovals` | array | All FDA approval records found |
| `rawData.fdaRecalls` | array | All FDA recall records found |
| `rawData.pubmedPublications` | array | Up to 30 raw PubMed publication records |
| `rawData.patents` | array | Up to 30 raw patent records |
| `rawData.emaAuthorizations` | array | All EMA authorization records found |
How much does it cost to generate a drug pipeline report?
Drug Pipeline Report uses pay-per-event pricing — you pay for the Apify platform compute time consumed across 7 parallel sub-actor calls. Each run costs approximately $0.25-$0.60 depending on result volume and sub-actor run times. Platform compute costs are included.
| Scenario | Runs | Approx. cost per run | Total cost |
|---|---|---|---|
| Quick test | 1 | $0.30 | $0.30 |
| Weekly drug tracking (5 drugs) | 5 | $0.35 | $1.75 |
| Monthly pipeline sweep (20 drugs) | 20 | $0.40 | $8.00 |
| Quarterly portfolio review (50 drugs) | 50 | $0.40 | $20.00 |
| Annual competitive monitoring (200 drugs) | 200 | $0.40 | $80.00 |
You can set a maximum spending limit per run to control costs. The actor stops when your budget is reached.
Compare this to commercial pharmaceutical intelligence platforms like Cortellis, Citeline Pharmaprojects, or GlobalData Pharma, which charge $15,000-$50,000 per year for subscription access to similar pipeline data. For teams that run periodic competitive reviews rather than continuous monitoring, this actor delivers comparable public-data intelligence at a fraction of the cost with no subscription commitment.
Drug pipeline intelligence using the API
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish
run = client.actor("ryanclinton/drug-pipeline-report").call(run_input={
    "query": "semaglutide",
    "company": "Novo Nordisk",
    "phase": "PHASE3",
})

# Each run writes a single report record to the default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"Composite Risk: {item['compositeScore']}/100 ({item['riskLevel']})")
    print(f"Pipeline Threat: {item['pipelineThreat']['score']}/100 — {item['pipelineThreat']['threatLevel']}")
    print(f"First-Mover Advantage: {item['firstMoverAdvantage']['score']}/100")
    print(f"Patent Exclusivity: {item['firstMoverAdvantage']['yearsOfExclusivity']} years remaining")
    print(f"Phase 3 Competitors: {item['pipelineThreat']['phaseDistribution'].get('Phase 3', 0)}")
    print(f"Key Signals: {', '.join(item['allSignals'][:3])}")
```
JavaScript
```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

// Start the actor and wait for the run to finish
const run = await client.actor("ryanclinton/drug-pipeline-report").call({
  query: "semaglutide",
  company: "Novo Nordisk",
  phase: "PHASE3"
});

// Each run writes a single report record to the default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
  console.log(`Composite Risk: ${item.compositeScore}/100 (${item.riskLevel})`);
  console.log(`Pipeline Threat: ${item.pipelineThreat.threatLevel} — ${item.pipelineThreat.sameIndicationTrials} trials`);
  console.log(`First-Mover: ${item.firstMoverAdvantage.score}/100 — ${item.firstMoverAdvantage.yearsOfExclusivity} years exclusivity`);
  console.log(`Adverse Events: ${item.adverseEventDivergence.divergenceLevel} — ${item.adverseEventDivergence.totalReports} reports`);
  console.log(`Literature Momentum: ${item.literatureMomentum.score}/100 — accelerating: ${item.literatureMomentum.accelerating}`);
}
```
cURL
```shell
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~drug-pipeline-report/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "semaglutide",
    "company": "Novo Nordisk",
    "phase": "PHASE3"
  }'

# Fetch results (replace DATASET_ID from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
```
How Drug Pipeline Report works
Phase 1: Parallel data collection across 7 sources
The actor builds a tailored search query for each sub-actor. When a company name is provided, it appends the company to the clinical trial and PubMed queries (e.g., "semaglutide Novo Nordisk") to sharpen those searches toward the company's own pipeline. The drug name alone is used for FDA adverse events, FDA approvals, FDA recalls, and EMA searches — where company-level specificity would narrow results too aggressively. All 7 sub-actor calls execute via Promise.allSettled, which means a failure in any single source (network timeout, empty results, sub-actor error) does not abort the run — the remaining sources complete and the scoring models work with whatever data was successfully retrieved.
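The collection step described above can be illustrated in Python, where `asyncio.gather(..., return_exceptions=True)` plays roughly the role that `Promise.allSettled` plays in the actor. This is a sketch of the pattern, not the actor's code: `fetch_source` is a hypothetical stand-in for a sub-actor call, the source keys are borrowed from the `dataSources` output field, and using the bare drug name for patents is an assumption, since the text above does not state which query the patent source receives.

```python
import asyncio
from typing import Optional

SOURCES = [
    "clinicalTrials", "fdaAdverseEvents", "fdaApprovals", "fdaRecalls",
    "pubmedPublications", "patents", "emaAuthorizations",
]
# Sources whose query gets the company name appended, per the text above
COMPANY_SCOPED = {"clinicalTrials", "pubmedPublications"}


def build_queries(query: str, company: Optional[str] = None) -> dict:
    """Build one search string per source."""
    return {
        name: f"{query} {company}" if company and name in COMPANY_SCOPED else query
        for name in SOURCES
    }


async def fetch_source(name: str, query: str) -> list:
    """Hypothetical stand-in for one sub-actor call."""
    await asyncio.sleep(0)
    if name == "patents":  # simulate a single failing source
        raise TimeoutError(f"{name} timed out")
    return [{"source": name, "query": query}]


async def collect_all(queries: dict) -> dict:
    """Run every source concurrently; a failed source yields an empty list."""
    results = await asyncio.gather(
        *(fetch_source(name, q) for name, q in queries.items()),
        return_exceptions=True,
    )
    return {
        name: ([] if isinstance(result, BaseException) else result)
        for name, result in zip(queries, results)
    }


data = asyncio.run(collect_all(build_queries("semaglutide", "Novo Nordisk")))
print({name: len(records) for name, records in data.items()})
```

Mapping a failure to an empty list is what lets the scoring step proceed with partial data, and it is also why a zero record count in `dataSources` can mean either "no results" or "source failed".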
Phase 2: Four-model scoring
Each scoring function receives the full data object — a dictionary keyed by sub-actor name — and extracts only the fields it needs. The Pipeline Threat model parses phases and phase fields from clinical trial records, normalizing both numeric ("3") and Roman numeral ("III") representations into a consistent phase string before building the phaseDistribution map. The Adverse Event Divergence model reads seriousnessdeath, seriousnesshospitalization, and seriousnessother fields from FAERS records, then aggregates MedDRA reaction terms from the nested patient.reaction array. The Literature Momentum model parses publication dates from publicationDate, pubDate, or date fields across PubMed records, buckets them by year, and compares the two most recent years against the two prior years. The First-Mover model normalizes patent expiration dates from either expirationDate or patentExpireDate fields and calculates fractional years remaining.
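Two of the parsing steps above, phase normalization and the 2-year publication comparison, can be sketched as follows. The record shapes are assumptions: `phases` is taken to be a list of strings and `phase` a single string, and the function names are hypothetical. Treating a zero prior window as accelerating whenever there are recent publications is also an assumption.

```python
import re
from collections import Counter

ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4}


def normalize_phase(raw: str):
    """Map '3', 'III', 'Phase 3', 'PHASE3', 'Phase III' to 'Phase 3'; else None."""
    token = re.sub(r"^PHASE[\s_-]*", "", raw.strip().upper())
    if token in ROMAN:
        return f"Phase {ROMAN[token]}"
    if token in {"1", "2", "3", "4"}:
        return f"Phase {token}"
    return None


def phase_distribution(trials: list) -> dict:
    """Count trials per normalized phase, reading 'phases' or 'phase'."""
    counts = Counter()
    for trial in trials:
        raw_values = trial.get("phases") or [trial.get("phase")]
        if isinstance(raw_values, str):
            raw_values = [raw_values]
        for raw in raw_values:
            label = normalize_phase(str(raw)) if raw is not None else None
            if label:
                counts[label] += 1
    return dict(counts)


def is_accelerating(yearly_trend: dict, latest_year: int) -> bool:
    """True if the latest 2 years exceed the prior 2 years by 20% or more."""
    counts = {int(year): n for year, n in yearly_trend.items()}
    recent = counts.get(latest_year, 0) + counts.get(latest_year - 1, 0)
    prior = counts.get(latest_year - 2, 0) + counts.get(latest_year - 3, 0)
    if prior == 0:
        return recent > 0  # assumption: any output after none counts as growth
    return recent * 5 >= prior * 6  # integer form of recent >= 1.2 * prior


trials = [{"phases": ["PHASE2", "PHASE3"]}, {"phase": "III"}, {"phase": "N/A"}]
print(phase_distribution(trials))
print(is_accelerating({"2022": 5, "2023": 9, "2024": 14, "2025": 14}, 2025))
```

The integer comparison avoids floating-point edge cases exactly at the 20% threshold.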
Phase 3: Composite score assembly
The composite score applies a fixed weight vector: Pipeline Threat (30%), Adverse Event Divergence (25%), Literature Momentum (25%), First-Mover Advantage inverted (20%). The First-Mover score is subtracted from 100 before weighting — a strong patent and approval position reduces overall composite risk. The final score is clamped to [0, 100] and mapped to four risk tiers: CRITICAL (75+), HIGH (50-74), MODERATE (25-49), LOW (0-24). All signals from all four models are concatenated into allSignals for a single-field summary of every threshold that fired during the run.
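As a worked illustration of the weighting and tier mapping just described, here is a minimal sketch. The function names are hypothetical, and the actor's own rounding is not documented, so none is applied here.

```python
def composite_score(pipeline_threat: float, first_mover: float,
                    adverse_event: float, literature: float) -> float:
    """Weighted composite; First-Mover is inverted (100 - score) before weighting."""
    raw = (
        0.30 * pipeline_threat
        + 0.25 * adverse_event
        + 0.25 * literature
        + 0.20 * (100 - first_mover)
    )
    return max(0.0, min(100.0, raw))  # clamp to [0, 100]


def risk_level(score: float) -> str:
    """Map a composite score to the four documented tiers."""
    if score >= 75:
        return "CRITICAL"
    if score >= 50:
        return "HIGH"
    if score >= 25:
        return "MODERATE"
    return "LOW"


score = composite_score(pipeline_threat=65, first_mover=72,
                        adverse_event=35, literature=78)
print(round(score, 2), risk_level(score))
```

With the model scores from the output example (65, 72, 35, 78) this sketch gives roughly 53, not the 58 shown there, so treat it as an illustration of the stated weights rather than a reproduction of the actor's exact arithmetic. Both values fall in the HIGH tier.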
Phase 4: Output assembly with raw data slices
The final dataset record includes the full scored report, all signals, and raw data slices: up to 50 clinical trial records, up to 20 FAERS event records, up to 30 PubMed records, and up to 30 patent records. FDA approval, recall, and EMA authorization records are included in full (not sliced) because these datasets are typically small. Source record counts are included in dataSources for quick validation that data collection succeeded across all 7 sources.
Tips for best results
- Use INN names for cross-source consistency. The International Nonproprietary Name ("semaglutide" not "Ozempic") is indexed consistently across ClinicalTrials.gov, PubMed, and patent databases. Trade names are inconsistently indexed and will produce lower record counts in clinical trial and patent sub-actors.
- Cross-check the `dataSources` counts before trusting the score. If `clinicalTrials: 0` and `patents: 0` appear in `dataSources`, it likely means the query term was too specific or the sub-actor returned no results. Try a broader query term before drawing conclusions from the composite score.
- Compare multiple candidates by running in sequence. The composite score is designed for relative comparison. Run the actor for 3-5 competing drugs in the same therapeutic area and rank by composite score to identify which has the most defensible position.
- Use the `phase` filter to stress-test a specific competitive window. Filter to PHASE3 to see only late-stage competitive pressure. The Pipeline Threat score will reflect only Phase 3 trial density, giving a focused view of near-term market entry risk.
- Schedule weekly runs on your top-priority drugs. New Phase 3 trial registrations appear on ClinicalTrials.gov within days of submission. A weekly scheduled run on your lead drug candidate will surface new competitive entrants before they appear in analyst reports.
- Pair the output with Company Deep Research to add corporate financial and organizational context to the pipeline scoring data, which is useful for BD and licensing assessments.
- For broad therapeutic area scans, start with no company filter. Omitting the company name returns the full competitive landscape. Then re-run with specific competitor company names to see their individual pipeline positions within that landscape.
- Export raw data for secondary analysis. The `rawData` fields include structured records from all 7 sources. Export as JSON and load into a notebook or analysis tool to build custom visualizations of the phase distribution or publication trend data.
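The `dataSources` cross-check from the tips above is easy to automate. The helper below is a small sketch that lists sources with a zero or missing count; `empty_sources` is a hypothetical name, and the sample counts are copied from the output example.

```python
EXPECTED_SOURCES = (
    "clinicalTrials", "fdaAdverseEvents", "fdaApprovals", "fdaRecalls",
    "pubmedPublications", "patents", "emaAuthorizations",
)


def empty_sources(report: dict) -> list:
    """Return the names of sources that reported zero or no records."""
    counts = report.get("dataSources", {})
    return [name for name in EXPECTED_SOURCES if not counts.get(name)]


report = {
    "dataSources": {
        "clinicalTrials": 13, "fdaAdverseEvents": 85, "fdaApprovals": 2,
        "fdaRecalls": 0, "pubmedPublications": 42, "patents": 8,
        "emaAuthorizations": 3,
    }
}
print(empty_sources(report))
```

A zero is not always a problem: no recalls is a plausible result for many drugs, whereas zero clinical trials or patents for a well-known drug is a stronger hint that the query or a sub-actor needs attention.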
Combine with other Apify actors
| Actor | How to combine |
|---|---|
| Company Deep Research | Run after Drug Pipeline Report to add corporate financial health, executive team, and M&A history context to companies surfaced in the pipeline threat analysis |
| Patent Search | Query individual patent numbers from rawData.patents to get full claim text and assignee details for legal review |
| Competitor Analysis Report | Cross-reference pipeline threat signals against competitor commercial performance data for a full competitive picture |
| SEC EDGAR Filing Analyzer | Pull 10-K and 10-Q filings for competitors identified in the pipeline threat output to understand their R&D spend and pipeline investment priorities |
| Website Content to Markdown | Convert competitor pipeline pages or drug database records to clean markdown for LLM-powered analysis pipelines |
| B2B Lead Qualifier | Score pharma companies identified in the pipeline analysis as potential BD partners or acquisition targets based on 30+ firmographic signals |
Limitations
- Data is publicly sourced only. The actor queries ClinicalTrials.gov, openFDA, PubMed, public patent databases, and the EMA product database. Proprietary pipeline databases (Citeline, Cortellis, GlobalData) contain additional confidential trial data that this actor cannot access.
- Patent expiry dates are estimates. Patent expiry is calculated from `expirationDate` or `patentExpireDate` fields in the patent database records. These dates do not account for patent term extensions (PTE), supplementary protection certificates (SPC), or litigation-related delays that may extend or shorten the actual exclusivity window.
- Adverse event scoring is a screening tool, not a clinical assessment. FAERS data reflects voluntary and mandatory reports submitted to the FDA and includes duplicate records, confounding medications, and incomplete case narratives. The Adverse Event Divergence score surfaces patterns, not causality. It must not replace formal signal detection methods.
- ClinicalTrials.gov registration lag. There is typically a 21-day gap between trial initiation and ClinicalTrials.gov registration. Very recently initiated Phase 3 trials may not appear in results.
- Publication dates may be inconsistent across PubMed records. The momentum model parses `publicationDate`, `pubDate`, and `date` fields. Records with missing or malformed date fields are excluded from the yearly trend calculation, which may slightly undercount recent publications.
- EMA data covers European authorizations only. Products authorized exclusively in other non-FDA, non-EMA markets (Japan PMDA, Health Canada, TGA Australia) are not captured in the competitive threat scoring.
- Sub-actor failures reduce scoring confidence. If a sub-actor returns zero results due to a temporary API issue, the scoring model treats that dimension as zero data. Always check `dataSources` record counts to confirm successful collection before acting on a score.
- Not a substitute for expert analysis. The composite score and signals are decision-support tools. Material strategic decisions should be reviewed by qualified pharmaceutical strategists, patent attorneys, and medical affairs professionals.
Integrations
- Zapier — trigger an automated pipeline report whenever a new drug enters your company's development watchlist, and route the results to Slack or email
- Make — build weekly competitive intelligence workflows that run Drug Pipeline Report on a schedule and push high-risk scores to a shared dashboard
- Google Sheets — export composite scores and signal lists to a portfolio tracking spreadsheet for regular pipeline review meetings
- Apify API — integrate drug pipeline scoring directly into internal research tools, BD platforms, or clinical operations systems
- Webhooks — post completed report data to any internal endpoint, triggering downstream analysis or notification workflows
- LangChain / LlamaIndex — feed pipeline report output into LLM-powered competitive intelligence agents for automated narrative summaries and strategic recommendations
Troubleshooting
Composite score seems unexpectedly low despite known competitive activity. Check the dataSources counts in the output. If clinicalTrials or patents show 0 records, the query term likely returned no results from those sub-actors. Try the drug's INN name instead of a brand name, or broaden the query to the mechanism of action (e.g., "PD-1 inhibitor" instead of "pembrolizumab").
Run times out or takes longer than 2 minutes. The actor calls 7 sub-actors in parallel with a 120-second timeout per sub-actor. Queries that return very large result sets (broad therapeutic area queries) may hit this limit for some sub-actors. Those sub-actors will return partial or empty results but the run will still complete. Use a more specific drug name or add a company filter to reduce result volume.
phaseFilter is set in output but phase distribution includes other phases. The phase filter applies only to the clinical-trial-tracker sub-actor query. FDA approvals, EMA authorizations, and adverse events are not filtered by phase — they are always retrieved for all phases regardless of the phase input.
allSignals is empty despite a high composite score. Signals are generated only when data crosses specific thresholds (e.g., 3+ Phase 3 competitors, 2+ recent FDA approvals). A high score driven by sub-threshold contributions across multiple dimensions may not trigger any individual signal while still producing an elevated composite score. Review the individual model scores to understand which dimensions are driving the composite.
EMA authorization count is always 0. The EMA Medicines Search sub-actor uses the drug name directly. Some drugs are authorized under different names in the EU. Try querying with the INN rather than a brand name, or check the EMA Product database manually to confirm the drug's EU authorization status.
Responsible use
- This actor queries publicly available regulatory databases, clinical trial registries, and academic literature indexes.
- Adverse event data from FAERS represents reported events and does not establish causation or constitute clinical guidance.
- Do not use pipeline intelligence derived from this actor as the sole basis for clinical, regulatory, or patient-safety decisions.
- Patent expiry estimates are informational only and do not constitute legal advice. Consult a qualified patent attorney before making lifecycle management decisions based on exclusivity calculations.
- For guidance on web scraping legality and data use, see Apify's guide.
FAQ
How many data sources does the drug pipeline report query?
The actor queries 7 sources simultaneously: ClinicalTrials.gov (via clinical-trial-tracker), FDA adverse events (FAERS via openFDA), FDA drug approvals, FDA drug recalls, PubMed research publications, patent databases, and EMA medicine authorizations. All 7 run in parallel using Promise.allSettled, so a single source failure does not abort the run.
How is the composite drug pipeline risk score calculated?
The composite score (0-100) is a weighted average of four models: Pipeline Threat (30%), Adverse Event Divergence (25%), Literature Momentum (25%), and inverted First-Mover Advantage (20%). The First-Mover score is subtracted from 100 before weighting, so a strong patent position reduces overall risk. All four model scores contribute to the final number even if individual data sources returned partial results.
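The weighting described above can be written out as a short calculation. This is a minimal sketch of the stated formula, not the actor's source code: the function and parameter names are assumptions, the input scores are made up, and the final clamp to 0-100 is a defensive assumption.

```python
def composite_score(
    pipeline_threat: float,
    adverse_event_divergence: float,
    literature_momentum: float,
    first_mover_advantage: float,
) -> float:
    """Weighted average of the four model scores, each on a 0-100 scale."""
    # A strong first-mover position lowers risk, so invert it before weighting.
    inverted_first_mover = 100.0 - first_mover_advantage
    score = (
        0.30 * pipeline_threat
        + 0.25 * adverse_event_divergence
        + 0.25 * literature_momentum
        + 0.20 * inverted_first_mover
    )
    # Assumption: keep the result inside the documented 0-100 range.
    return max(0.0, min(100.0, score))


# Made-up inputs: 19.5 + 12.5 + 15.0 + 5.6
print(round(composite_score(65, 50, 60, 72), 1))  # 52.6
```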
What does the Pipeline Threat score measure specifically?
Pipeline Threat quantifies competitive pressure from other drugs in the same therapeutic area. Phase 3 competitors are the strongest signal at 8 points each (maximum 40 points). Active trial density adds up to 20 points. Recent FDA approvals in the same class contribute up to 15 points. EMA authorizations add up to 10 points. Competitor recalls reduce the score by up to 15 points because they indicate competitors are losing ground.
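The caps above can be sketched as follows. Only the Phase 3 rule (8 points each, capped at 40) is fully specified in this document. How the other components earn their points is not described, so this sketch takes them as already-computed inputs and applies only the stated caps. All names are hypothetical, and the final clamp to 0-100 is an assumption.

```python
def phase3_points(phase3_competitors: int) -> int:
    """8 points per Phase 3 competitor, capped at 40."""
    return min(8 * phase3_competitors, 40)


def pipeline_threat(
    phase3_competitors: int,
    trial_density_points: int,
    fda_approval_points: int,
    ema_points: int,
    recall_reduction: int,
) -> int:
    """Combine the components using the caps stated in the FAQ answer."""
    total = (
        phase3_points(phase3_competitors)
        + min(trial_density_points, 20)   # active trial density: up to 20
        + min(fda_approval_points, 15)    # recent FDA approvals: up to 15
        + min(ema_points, 10)             # EMA authorizations: up to 10
        - min(recall_reduction, 15)       # competitor recalls: up to -15
    )
    # Assumption: the model score stays within 0-100.
    return max(0, min(100, total))


print(phase3_points(4))                  # 32
print(pipeline_threat(4, 20, 15, 10, 0)) # 77
```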
How accurate is the First-Mover Advantage patent exclusivity calculation?
The exclusivity estimate is based on the earliest expirationDate or patentExpireDate field found in patent records. It does not account for patent term extensions (PTE), supplementary protection certificates in EU markets, or ongoing litigation that could shorten or lengthen the actual exclusivity window. Treat it as a directional indicator, not a legal determination. The patent cliff warning fires automatically when estimated exclusivity drops below 2 years.
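A minimal sketch of that estimate, assuming each patent record is a dictionary whose expirationDate or patentExpireDate value is an ISO 8601 date string (the date format is an assumption, as are the function names). It takes the earliest date found and flags a patent cliff when fewer than 2 years remain.

```python
from datetime import date
from typing import Optional

CLIFF_THRESHOLD_YEARS = 2.0


def years_of_exclusivity(patents: list[dict], today: date) -> Optional[float]:
    """Years until the earliest expiry date found, or None if no dates exist."""
    expiries = []
    for record in patents:
        raw = record.get("expirationDate") or record.get("patentExpireDate")
        if raw:
            # Assumption: ISO 8601 strings; keep only the YYYY-MM-DD part.
            expiries.append(date.fromisoformat(raw[:10]))
    if not expiries:
        return None
    return (min(expiries) - today).days / 365.25


def patent_cliff_warning(years: Optional[float]) -> bool:
    """True when estimated exclusivity has dropped below the 2-year threshold."""
    return years is not None and years < CLIFF_THRESHOLD_YEARS


# Hypothetical records for illustration only.
records = [{"expirationDate": "2030-01-01"}, {"patentExpireDate": "2027-01-01"}]
remaining = years_of_exclusivity(records, today=date(2026, 1, 1))
print(patent_cliff_warning(remaining))  # True
```

Because the earliest date wins, a single early-expiring secondary patent can trigger the warning even when the core patent runs longer, which is one more reason to treat the result as directional.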
Is it legal to query FDA, PubMed, and ClinicalTrials.gov data?
Yes. All data sources queried by this actor are publicly accessible government databases and academic indexes. ClinicalTrials.gov, openFDA (FAERS), FDA drug approvals and recalls, PubMed, and the EMA medicines database are all publicly available under open data policies. See Apify's guide on web scraping legality for broader context.
How long does a typical drug pipeline report run take?
Most runs complete in 60-120 seconds. The actor is gated by the slowest sub-actor call. Broad therapeutic area queries that return large result sets may take slightly longer. All 7 sub-actors run in parallel, so total time is determined by the single longest call rather than the sum of all calls.
Can I compare two competing drugs using this actor?
Yes. Run the actor once for each drug and compare their composite scores, Pipeline Threat levels, and First-Mover Advantage scores side by side. The structured JSON output is designed for programmatic comparison. A typical competitive analysis runs 3-5 drugs in a therapeutic area, then ranks them by composite score and pipeline threat level.
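One way to do that ranking, once each run's JSON record has been loaded, is sketched below. The field names (query, compositeScore, pipelineThreatScore) and the drug labels are hypothetical placeholders; substitute the keys that appear in your own output records.

```python
def rank_reports(reports: list[dict]) -> list[dict]:
    """Sort run records by composite score, then pipeline threat, highest first."""
    return sorted(
        reports,
        key=lambda r: (r["compositeScore"], r["pipelineThreatScore"]),
        reverse=True,
    )


# Hypothetical records, one per actor run; all values are made up.
runs = [
    {"query": "drug-a", "compositeScore": 58, "pipelineThreatScore": 65},
    {"query": "drug-b", "compositeScore": 71, "pipelineThreatScore": 40},
    {"query": "drug-c", "compositeScore": 58, "pipelineThreatScore": 72},
]
for report in rank_reports(runs):
    print(report["query"], report["compositeScore"], report["pipelineThreatScore"])
```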
How is this different from commercial pharma intelligence platforms like Cortellis or GlobalData?
Commercial platforms include proprietary trial data, confidential licensing agreements, and human-curated pipeline entries that go beyond public registrations. This actor uses only publicly available data sources and produces a quantitative composite score rather than qualitative analyst commentary. It is best suited for teams that need fast, objective, public-data assessments. It is not a replacement for full platform subscriptions in regulatory or clinical contexts.
What does the Literature Momentum score indicate about competition?
A high Literature Momentum score means the therapeutic area is generating a large and growing volume of academic research. This signals that more labs and companies are investigating the space, which typically precedes an increase in clinical trial activity 2-4 years later. An accelerating publication rate (20%+ growth) is an early warning that competitive density will likely increase in the coming years.
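The 20% growth threshold can be illustrated with a simple period-over-period comparison. This is a sketch only: the document does not state which time windows the actor compares, so the two publication counts here are assumed to come from consecutive, equal-length periods, and the function names are hypothetical.

```python
from typing import Optional

ACCELERATION_THRESHOLD = 0.20  # the "20%+ growth" figure from the answer above


def publication_growth(previous_count: int, recent_count: int) -> Optional[float]:
    """Fractional growth between two periods, or None without a baseline."""
    if previous_count <= 0:
        return None
    return (recent_count - previous_count) / previous_count


def is_accelerating(previous_count: int, recent_count: int) -> bool:
    """True when growth meets or exceeds the acceleration threshold."""
    growth = publication_growth(previous_count, recent_count)
    return growth is not None and growth >= ACCELERATION_THRESHOLD


print(is_accelerating(100, 125))  # True
print(is_accelerating(100, 110))  # False
```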
Can I schedule this actor to monitor my pipeline automatically?
Yes. Use Apify's built-in scheduler to run the actor daily, weekly, or monthly. Configure Slack or email alerts so your team is notified when a run completes. This is the recommended approach for ongoing competitive monitoring of priority drug candidates.
What happens if one of the 7 sub-actors fails during a run?
The actor uses Promise.allSettled for parallel execution, which means individual sub-actor failures are caught and recorded without aborting the overall run. The affected data source will show 0 in dataSources, and the scoring models will proceed with data from the remaining sources. Check dataSources counts in the output to confirm whether all 7 sources returned data successfully.
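The failure-isolation pattern can be sketched outside JavaScript as well. The example below is a Python analogue, not the actor's code: it uses asyncio.gather with return_exceptions=True in place of Promise.allSettled, and the two fetch functions and source names are stand-ins for real sub-actor calls.

```python
import asyncio


async def fetch_ok() -> list[dict]:
    """Stand-in for a sub-actor call that succeeds."""
    return [{"id": 1}, {"id": 2}]


async def fetch_fail() -> list[dict]:
    """Stand-in for a sub-actor call that fails."""
    raise RuntimeError("sub-actor failed")


async def run_all(sources: dict) -> dict[str, int]:
    """Run every source concurrently; a failed source counts as 0 records."""
    names = list(sources)
    # return_exceptions=True collects errors as results instead of raising,
    # so one failure does not abort the other calls.
    results = await asyncio.gather(
        *(sources[name]() for name in names), return_exceptions=True
    )
    return {
        name: 0 if isinstance(result, BaseException) else len(result)
        for name, result in zip(names, results)
    }


counts = asyncio.run(run_all({"clinicalTrials": fetch_ok, "patents": fetch_fail}))
print(counts)  # {'clinicalTrials': 2, 'patents': 0}
```

Note that a count of 0 is ambiguous in this scheme: it can mean a failed call or a query with no matches, which is why checking the counts after each run is worthwhile.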
Does the Adverse Event Divergence score replace pharmacovigilance systems?
No. The score aggregates publicly available FAERS data as a screening-level indicator. FAERS combines voluntary reports from healthcare professionals and consumers with mandatory reports from manufacturers, and its records may include duplicates, confounding factors, and incomplete narratives. The score is designed to surface patterns worth investigating through formal pharmacovigilance processes, not to replace them.
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.