Pandemic Biosurveillance MCP
Pandemic biosurveillance intelligence for AI agents — 8 mathematically rigorous epidemic modeling tools backed by live data from 16 public health, ecological, and environmental sources. Connect any MCP-compatible AI client to run stochastic outbreak simulations, infer transmission parameters, estimate effective reproduction numbers, and map zoonotic spillover hotspots in a single tool call.
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| simulate-epidemic-metapopulation | Gillespie SSA metapopulation on radiation mobility | $0.10 |
| infer-parameters-pmcmc | Particle MCMC with alive particle filter | $0.10 |
| estimate-phylodynamic-re | Birth-death skyline R_e estimation | $0.08 |
| evaluate-intervention-causality | Augmented synthetic control for NPI effects | $0.08 |
| compute-vaccination-equilibrium | Mean-field game HJB-Fokker-Planck coupling | $0.10 |
| forecast-variant-fitness | Multinomial logistic regression on frequencies | $0.06 |
| assess-zoonotic-spillover | MaxEnt species distribution modeling | $0.06 |
| model-seasonal-waning-dynamics | Fourier forcing with power-law antibody waning | $0.08 |
Example: 100 events = $10.00 · 1,000 events = $100.00
Connect to your AI agent
Add this MCP server to Claude Desktop, Cursor, Windsurf, or any MCP-compatible client.
https://ryanclinton--pandemic-biosurveillance-mcp.apify.actor/mcp
{
  "mcpServers": {
    "pandemic-biosurveillance-mcp": {
      "url": "https://ryanclinton--pandemic-biosurveillance-mcp.apify.actor/mcp"
    }
  }
}
Documentation
This MCP server runs on the Apify platform in persistent Standby mode. Each tool call triggers parallel data retrieval from WHO, PubMed, IUCN, GBIF, GDACS, NOAA, FEMA, and 9 more sources, constructs a multi-domain epidemic network, and applies a dedicated mathematical algorithm — Gillespie SSA, particle MCMC, birth-death skyline phylodynamics, augmented synthetic control, mean-field game theory, multinomial logistic regression, MaxEnt species distribution modeling, or Fourier seasonal forcing — returning structured, machine-readable results with uncertainty quantification.
⬇️ What data can you access?
| Data Point | Source | Example |
|---|---|---|
| 📊 Global disease burden and mortality rates | WHO Global Health Observatory | Influenza IFR by region |
| 🧪 Active vaccine and therapeutic trials | ClinicalTrials.gov | Phase 3 mRNA vaccine enrollment |
| 💊 Adverse drug events and safety signals | openFDA FAERS | Drug safety signal detection |
| 🇪🇺 European authorized medicines | EMA Medicines Database | Antiviral market authorizations |
| 📄 Biomedical literature and citations | PubMed, OpenAlex, Europe PMC | SARS-CoV-2 transmission papers |
| 🦇 Species distribution and conservation status | IUCN Red List | Rhinolophus bat range maps |
| 🌍 Biodiversity occurrence data | GBIF | Wildlife host occurrence records |
| 🌪️ Natural disaster alerts and impact estimates | GDACS | Flood-disrupted health infrastructure |
| 🌡️ Weather and climate data | NOAA | Temperature-driven seasonality |
| 💨 Air quality and environmental health proxies | OpenAQ | Pollution co-exposure risk |
| 🚨 Disaster declarations and emergency management | FEMA | US outbreak response capacity |
| 🌐 Country demographics and population | REST Countries, World Bank | Population mobility denominators |
| 📍 Geolocation and spatial reference | Nominatim | Hotspot coordinate resolution |
Why use Pandemic Biosurveillance MCP?
Pandemic preparedness work requires integrating epidemiological data, ecological surveillance, clinical trial pipelines, environmental drivers, and geospatial context into a single coherent analysis. Doing this manually across 16 databases takes days and demands specialist knowledge in epidemiological modeling, spatial statistics, and phylogenetics. Even dedicated tools like Nextstrain or the WHO FluMart system each cover only one dimension of the problem.
This MCP server automates the entire intelligence gathering and modeling pipeline. A single tool call fetches and synthesizes data from all 16 sources, builds a multi-domain epidemic network, and runs the appropriate mathematical model, returning results within 2–4 minutes.
- Standby mode — the server stays warm between calls, eliminating cold-start latency on every tool invocation
- Parallel data retrieval — 16 actors run concurrently per tool call, not sequentially, keeping latency low
- API access — trigger analyses from Python, JavaScript, or any HTTP client via the Apify API
- Scheduling — run recurring surveillance passes with Apify's built-in scheduler
- Monitoring — receive Slack or email alerts when model outputs breach defined thresholds via webhooks
- Integrations — connect results to Zapier, Make, or custom pipelines for downstream alerting
Features
- Gillespie SSA metapopulation simulation — stochastic SEIR-HCD model with 7 compartments (S, E, I, R, H, C, D) across multiple populations; tau-leaping approximation enables simulation across 20+ populations simultaneously
- Radiation mobility network — inter-population transmission follows Simini et al. (2012): T_ij = T_i(m_i·n_j)/((m_i+s_ij)(m_i+n_j+s_ij)), parameterized directly from World Bank population data
- Particle MCMC parameter inference — alive particle filter with Sequential Monte Carlo resampling for unbiased likelihood estimation; infers R0, latent period, IFR, and hospitalization rate with Gelman-Rubin R-hat convergence diagnostics
- Birth-death skyline phylodynamics — estimates Re trajectory using piecewise-constant birth rates (Stadler et al. 2013); cross-validated via Bayesian model averaging with thermodynamic integration weights
- Augmented synthetic control — causal NPI and pharmaceutical intervention effects using Abadie (2021) synthetic counterfactual construction; conformal prediction intervals for valid uncertainty quantification
- Mean-field game vaccination equilibrium — coupled HJB backward PDE and Fokker-Planck forward PDE to find Nash equilibrium vs social optimum; quantifies the free-rider gap and produces age-group prioritization strategies
- Multinomial logistic variant fitness — estimates selection coefficients and fitness advantages from variant frequency trajectories; Maynard-Smith fitness landscape ruggedness score and escape mutation risk
- MaxEnt zoonotic spillover modeling — P(presence|env) = exp(sum_k lambda_k·f_k(x)) / Z; combines with deforestation frontier KDE for human-wildlife interface hotspot identification; overall spillover probability P(spillover) = P(host) × P(contact) × P(adaptation)
- Fourier seasonal forcing with power-law waning — R_e(t) = R0·(1+alpha·cos(2pi·t/365-phi)) combined with Khoury et al. (2021) power-law antibody waning: Ab(t) = Ab0·t^(-kappa); computes optimal booster timing
- 16 parallel data sources — health (4), research (3), ecological (2), environmental (4), spatial (3); all run concurrently via the Apify actor client
- Multi-domain epidemic network graph — nodes typed as disease, trial, paper, country, species, drug, environment, or location; edges typed as transmits, treats, studies, hosts, mobility, or correlates
- Pay-per-tool-call pricing — no subscription, no minimum commitment; $0.030–$0.040 per tool call depending on the algorithm
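The MaxEnt expression and the spillover product in the feature list can be evaluated by hand on a toy grid. The sketch below is illustrative only: the feature vectors, the weights `lambdas`, and the choice to take Z as the sum over the candidate cells are assumptions, not the server's implementation.

```python
import math


def maxent_probabilities(features, lambdas):
    """Gibbs/MaxEnt distribution over candidate grid cells.

    features: one feature vector f(x) per cell (hypothetical covariates).
    lambdas:  one weight per feature.
    Returns exp(sum_k lambda_k * f_k(x)) / Z, with Z summed over the cells.
    """
    scores = [sum(l * f for l, f in zip(lambdas, fx)) for fx in features]
    top = max(scores)  # subtract the max before exponentiating for stability
    weights = [math.exp(s - top) for s in scores]
    z = sum(weights)
    return [w / z for w in weights]


def spillover_probability(p_host, p_contact, p_adaptation):
    """P(spillover) = P(host) * P(contact) * P(adaptation)."""
    return p_host * p_contact * p_adaptation


# Three hypothetical cells, two hypothetical environmental features each.
cells = [[0.9, 0.2], [0.4, 0.7], [0.1, 0.1]]
probs = maxent_probabilities(cells, lambdas=[1.5, 0.8])
print([round(p, 3) for p in probs])
print(spillover_probability(0.5, 0.2, 0.1))
```

The probabilities sum to one by construction, so they rank cells relative to each other rather than giving an absolute presence probability.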
Use cases for pandemic biosurveillance intelligence
Pandemic preparedness scenario planning
Government health agencies, defense research labs, and biosecurity think tanks need to stress-test preparedness plans against realistic outbreak scenarios before an event occurs. Use simulate_epidemic_metapopulation with a query like "H5N1 avian influenza pandemic" to generate stochastic spread trajectories across connected populations, estimate peak hospitalization timing, and compute herd immunity thresholds. Combine with assess_zoonotic_spillover to identify the geographic interfaces most likely to generate the index case.
Outbreak response and resource allocation
During an active outbreak, public health operations teams need real-time estimates of whether transmission is growing or declining and how effective current interventions are. Use estimate_phylodynamic_re to track the Re trajectory from epidemiological signals and evaluate_intervention_causality to quantify the causal effect of NPIs already in place. The augmented synthetic control counterfactual provides lives-saved estimates with proper uncertainty bounds, directly suitable for policy briefings.
Vaccination strategy and campaign design
Immunization program managers designing rollout strategies need to understand both optimal allocation and the behavioral incentive landscape. Use compute_vaccination_equilibrium to identify the free-rider gap between Nash equilibrium coverage and the social optimum, and to generate age-group prioritization ranked by cost-effectiveness. Use model_seasonal_waning_dynamics to determine optimal booster timing given the local seasonal forcing amplitude and antibody half-life estimates.
Variant surveillance and genomic monitoring
Virology and genomic surveillance teams tracking an evolving pathogen need to forecast which variants will dominate the next transmission wave. Use forecast_variant_fitness with a query like "SARS-CoV-2 variants 2025" to estimate selection coefficients, project 30-day and 90-day frequency trajectories, and identify variants with high immune evasion scores. Fitness landscape ruggedness scores indicate how rapidly the variant space is evolving.
Zoonotic disease research and spillover risk mapping
Researchers at the wildlife-human interface — ecology institutes, biosafety labs, One Health programs — need to identify where the next spillover is most likely to occur. Use assess_zoonotic_spillover to run MaxEnt species distribution modeling for reservoir hosts, combine with deforestation frontier KDE, and produce a ranked list of geographic hotspots with per-region risk scores. The output includes pathogen family, deforestation driver score, and human contact rate for each species.
Academic epidemiology and public health research
Epidemiologists and graduate researchers need reproducible, citable methodologies for modeling studies. Each tool in this MCP implements a published mathematical framework with proper convergence diagnostics and uncertainty quantification, making outputs suitable as model inputs or benchmarks. infer_parameters_pmcmc provides posterior distributions over epidemic parameters with 95% credible intervals and effective sample size, ready for inclusion in methods sections.
How to use pandemic biosurveillance MCP tools
- Connect your MCP client: add the server URL https://pandemic-biosurveillance-mcp.apify.actor/mcp to Claude Desktop, Cursor, Windsurf, Cline, or any MCP-compatible client. No API key is needed for the URL itself; billing is handled through your Apify account.
- Choose the right tool: select from 8 tools based on your analysis goal: simulation, parameter inference, Re estimation, intervention causality, vaccination strategy, variant forecasting, spillover risk, or seasonal dynamics.
- Enter a natural-language query: type the disease, pathogen, variant, or geographic context you want to analyze. Examples: "H5N1 avian influenza pandemic", "bat coronavirus spillover Southeast Asia", "SARS-CoV-2 Omicron booster timing".
- Receive structured results: the server fetches live data from 16 sources in parallel, runs the mathematical model, and returns JSON-structured output with model estimates, uncertainty intervals, and supporting network metadata within 2–4 minutes.
Input parameters
This MCP server has no traditional actor input schema — it operates in Standby mode and receives all inputs via the MCP protocol. Each tool accepts a single parameter:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Disease, pathogen, outbreak, or geographic context to analyze. Natural language accepted. |
Tool-level input reference
| Tool | query example |
|---|---|
| simulate_epidemic_metapopulation | "H5N1 avian influenza pandemic" |
| infer_parameters_pmcmc | "mpox outbreak West Africa" |
| estimate_phylodynamic_re | "SARS-CoV-2 JN.1 variant" |
| evaluate_intervention_causality | "COVID-19 lockdown effectiveness" |
| compute_vaccination_equilibrium | "influenza seasonal vaccination" |
| forecast_variant_fitness | "influenza H3N2 variants 2025" |
| assess_zoonotic_spillover | "bat coronavirus spillover Southeast Asia" |
| model_seasonal_waning_dynamics | "RSV seasonal dynamics booster timing" |
Input tips
- Be specific about geography: adding a region narrows the ecological and spatial data returned, improving model accuracy for localized analyses.
- Include the pathogen family: queries like "bat coronavirus" vs "coronavirus" produce different IUCN and GBIF species retrieval, affecting zoonotic spillover results.
- Use variant-level queries for genomic tools: "SARS-CoV-2 JN.1 variant" returns more targeted literature and drug data than a generic query.
- Chain tools deliberately: run simulate_epidemic_metapopulation first to understand population-level dynamics, then use evaluate_intervention_causality to assess the effect of specific measures on the same outbreak context.
- Set a spending limit: configure a maximum spend per run in your Apify account settings to prevent unexpected costs during iterative analysis sessions.
Output example
simulate_epidemic_metapopulation response for "H5N1 avian influenza pandemic":
{
  "networkNodes": 142,
  "networkEdges": 318,
  "trajectories": [
    {
      "time": 0,
      "population": "China",
      "S": 1402000000,
      "E": 0,
      "I": 50,
      "R": 0,
      "H": 0,
      "C": 0,
      "D": 0,
      "Re": 2.1
    },
    {
      "time": 14,
      "population": "China",
      "S": 1401820000,
      "E": 42300,
      "I": 87400,
      "R": 12800,
      "H": 8200,
      "C": 1400,
      "D": 280,
      "Re": 1.87
    },
    {
      "time": 60,
      "population": "India",
      "S": 1380000000,
      "E": 198000,
      "I": 341000,
      "R": 94000,
      "H": 28700,
      "C": 5100,
      "D": 1820,
      "Re": 1.43
    }
  ],
  "peakInfections": 4820000,
  "peakDate": 84,
  "totalInfected": 18400000,
  "totalDeaths": 621000,
  "herdImmunityThreshold": 0.524,
  "mobilityImpact": 0.38,
  "populationsAffected": 22
}
infer_parameters_pmcmc response:
{
  "networkNodes": 138,
  "parameters": [
    {
      "parameter": "R0",
      "mean": 2.34,
      "ci95Lower": 1.91,
      "ci95Upper": 2.88,
      "effectiveSampleSize": 847
    },
    {
      "parameter": "latentPeriod",
      "mean": 3.2,
      "ci95Lower": 2.4,
      "ci95Upper": 4.1,
      "effectiveSampleSize": 912
    },
    {
      "parameter": "IFR",
      "mean": 0.0038,
      "ci95Lower": 0.0021,
      "ci95Upper": 0.0061,
      "effectiveSampleSize": 734
    },
    {
      "parameter": "hospitalizationRate",
      "mean": 0.041,
      "ci95Lower": 0.028,
      "ci95Upper": 0.057,
      "effectiveSampleSize": 801
    }
  ],
  "logLikelihood": -1847.3,
  "dic": 3714.8,
  "particlesUsed": 500,
  "acceptanceRate": 0.24,
  "convergenceDiagnostic": 1.03
}
Output fields
simulate_epidemic_metapopulation
| Field | Type | Description |
|---|---|---|
| networkNodes | number | Total nodes in the epidemic network (diseases, trials, countries, species, etc.) |
| networkEdges | number | Total edges (mobility, transmits, treats, hosts, studies, correlates) |
| trajectories[] | array | Time-series SEIR-HCD state per population per time step |
| trajectories[].time | number | Simulation day |
| trajectories[].population | string | Population name (country or region) |
| trajectories[].S | number | Susceptible count |
| trajectories[].E | number | Exposed count |
| trajectories[].I | number | Infectious count |
| trajectories[].R | number | Recovered count |
| trajectories[].H | number | Hospitalized count |
| trajectories[].C | number | Critical/ICU count |
| trajectories[].D | number | Deaths |
| trajectories[].Re | number | Effective reproduction number at this time step |
| peakInfections | number | Maximum concurrent infections across all populations |
| peakDate | number | Day of peak infections |
| totalInfected | number | Cumulative infections |
| totalDeaths | number | Cumulative deaths |
| herdImmunityThreshold | number | Required immune fraction to halt spread (0–1) |
| mobilityImpact | number | Estimated proportion of spread attributable to inter-population mobility |
| populationsAffected | number | Number of distinct populations with active transmission |
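As a sanity check on herdImmunityThreshold, the classical homogeneous-mixing relation is HIT = 1 - 1/R0. Whether the server computes the field this way is not stated, so treat the sketch below as an assumption; it does reproduce the example output if the day-0 Re of 2.1 is read as R0.

```python
def herd_immunity_threshold(r0):
    """Classical threshold 1 - 1/R0; zero when R0 <= 1 (no sustained spread)."""
    if r0 <= 1.0:
        return 0.0
    return 1.0 - 1.0 / r0


# Day-0 Re from the example output, treated here as R0 (an assumption).
print(round(herd_immunity_threshold(2.1), 3))  # 0.524
```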
infer_parameters_pmcmc
| Field | Type | Description |
|---|---|---|
| parameters[] | array | Posterior estimates for each epidemic parameter |
| parameters[].parameter | string | Parameter name (R0, latentPeriod, IFR, hospitalizationRate) |
| parameters[].mean | number | Posterior mean |
| parameters[].ci95Lower | number | 2.5th percentile of posterior |
| parameters[].ci95Upper | number | 97.5th percentile of posterior |
| parameters[].effectiveSampleSize | number | ESS from SMC; values above 400 indicate reliable estimates |
| logLikelihood | number | Log marginal likelihood |
| dic | number | Deviance Information Criterion for model comparison |
| particlesUsed | number | Number of SMC particles |
| acceptanceRate | number | Metropolis-Hastings acceptance rate (0.15–0.35 is healthy) |
| convergenceDiagnostic | number | Gelman-Rubin R-hat; values below 1.05 indicate convergence |
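The thresholds in this table can be turned into an automatic check before a result is used downstream. This is a minimal sketch: the function name `pmcmc_warnings` is hypothetical, and the cut-offs are simply the ones quoted in the field descriptions above.

```python
def pmcmc_warnings(result, min_ess=400, rhat_max=1.05, accept_range=(0.15, 0.35)):
    """Return human-readable warnings for an infer_parameters_pmcmc result."""
    warnings = []
    if result["convergenceDiagnostic"] >= rhat_max:
        warnings.append(f"R-hat {result['convergenceDiagnostic']} is not below {rhat_max}")
    low, high = accept_range
    if not low <= result["acceptanceRate"] <= high:
        warnings.append(f"acceptance rate {result['acceptanceRate']} outside {low}-{high}")
    for p in result["parameters"]:
        if p["effectiveSampleSize"] < min_ess:
            warnings.append(f"ESS for {p['parameter']} is below {min_ess}")
    return warnings


# Values copied from the example response above (two parameters shown).
example = {
    "parameters": [
        {"parameter": "R0", "effectiveSampleSize": 847},
        {"parameter": "IFR", "effectiveSampleSize": 734},
    ],
    "acceptanceRate": 0.24,
    "convergenceDiagnostic": 1.03,
}
print(pmcmc_warnings(example))  # []
```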
estimate_phylodynamic_re
| Field | Type | Description |
|---|---|---|
| intervals[] | array | Piecewise Re estimates across skyline intervals |
| intervals[].startTime | number | Interval start (days before present) |
| intervals[].endTime | number | Interval end |
| intervals[].Re | number | Mean Re estimate for this interval |
| intervals[].ci95Lower | number | Lower credible bound |
| intervals[].ci95Upper | number | Upper credible bound |
| intervals[].growthRate | number | Exponential growth rate for this interval |
| currentRe | number | Most recent Re estimate |
| treeHeight | number | Inferred phylogenetic tree height in days |
| tmrca | number | Time to most recent common ancestor |
| skylinePopSize[] | number[] | Effective population size over skyline intervals |
| modelAvgWeights[] | array | Bayesian model averaging weights for each sub-model |
| doublingTime | number | Doubling time in days given current Re |
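For exponential growth at rate r per day, the doubling time is ln(2)/r. The sketch below applies that standard relation to a growthRate value. How the server maps Re to a growth rate (which requires a generation-time assumption) is not documented here, so only the second step is shown.

```python
import math


def doubling_time(growth_rate):
    """Days for incidence to double at a constant exponential growth rate.

    Returns math.inf when the rate is zero or negative (no doubling occurs).
    """
    if growth_rate <= 0:
        return math.inf
    return math.log(2) / growth_rate


print(round(doubling_time(0.1), 2))  # 6.93
```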
evaluate_intervention_causality
| Field | Type | Description |
|---|---|---|
| effects[] | array | Causal effect estimate per intervention type |
| effects[].intervention | string | Intervention name (e.g., lockdown, mask mandate, vaccination) |
| effects[].causalEffect | number | Estimated change in transmission attributable to the intervention (negative = reduction, i.e. beneficial) |
| effects[].ci95Lower | number | Conformal prediction lower bound |
| effects[].ci95Upper | number | Conformal prediction upper bound |
| effects[].conformalCoverage | number | Empirical coverage of prediction interval |
| effects[].syntheticControlFit | number | Pre-intervention fit quality (0–1; above 0.85 is good) |
| overallReduction | number | Combined transmission reduction from all interventions |
| bestIntervention | string | Intervention with largest causal effect |
| counterfactualDeaths | number | Projected deaths without any intervention |
| livesSaved | number | Estimated lives saved by current interventions |
compute_vaccination_equilibrium
| Field | Type | Description |
|---|---|---|
| equilibrium.optimalCoverage | number | Socially optimal vaccination coverage (0–1) |
| equilibrium.nashEquilibrium | number | Individually rational Nash equilibrium coverage |
| equilibrium.freeRiderGap | number | Gap between social optimum and Nash equilibrium |
| equilibrium.criticalThreshold | number | Herd immunity threshold |
| equilibrium.costEffectiveness | number | Cost per QALY gained at optimal coverage |
| ageGroupStrategy[] | array | Per-age-group coverage and priority rank |
| supplyConstraints[] | array | Resource availability vs need per supply category |
| welfareGain | number | Population welfare gain from reaching social optimum |
forecast_variant_fitness
| Field | Type | Description |
|---|---|---|
| variants[] | array | Forecast per identified variant |
| variants[].variant | string | Variant identifier |
| variants[].currentFrequency | number | Current proportion of sequences (0–1) |
| variants[].fitnessAdvantage | number | Selection coefficient vs reference strain |
| variants[].projectedFrequency30d | number | Projected frequency in 30 days |
| variants[].projectedFrequency90d | number | Projected frequency in 90 days |
| variants[].immuneEvasion | number | Immune evasion score (0–1) |
| variants[].transmissibility | number | Relative transmissibility vs reference |
| dominantVariant | string | Variant projected to dominate at 90 days |
| sweepTimeline | number | Days until dominant variant exceeds 80% frequency |
| landscapeRuggedness | number | Maynard-Smith fitness landscape ruggedness |
| escapeMutationRisk | number | Probability of immune escape mutation emerging |
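To see how currentFrequency and fitnessAdvantage could combine into a projected frequency, the sketch below uses a multinomial logistic growth model, p_k(t) proportional to p_k(0)·exp(s_k·t). Reading fitnessAdvantage as a per-day growth-rate difference and using the current frequencies as intercepts are both assumptions; the server's exact parameterization is not documented here.

```python
import math


def project_frequencies(current, advantage, days):
    """Project variant frequencies forward under multinomial logistic growth.

    current:   {variant: frequency now}, summing to 1
    advantage: {variant: per-day selection coefficient vs the reference}
    """
    weights = {v: current[v] * math.exp(advantage[v] * days) for v in current}
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


# Hypothetical two-variant example: B grows 5% per day faster than A.
now = {"A": 0.7, "B": 0.3}
adv = {"A": 0.0, "B": 0.05}
in_30_days = project_frequencies(now, adv, days=30)
print({v: round(f, 3) for v, f in in_30_days.items()})
```

The normalization step is what makes this multinomial: a variant's share can fall even when its absolute growth rate is positive, if a competitor grows faster.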
assess_zoonotic_spillover
| Field | Type | Description |
|---|---|---|
| risks[] | array | Spillover risk assessment per host species |
| risks[].species | string | Reservoir host species name |
| risks[].pathogenFamily | string | Associated pathogen family |
| risks[].spilloverProbability | number | Combined spillover probability (0–1) |
| risks[].deforestationDriver | number | Deforestation pressure score at range boundary |
| risks[].humanContactRate | number | Estimated human-animal contact frequency |
| risks[].habitatOverlap | number | Overlap between human and host distributions |
| hotspots[] | array | Geographic spillover hotspots with coordinates |
| hotspots[].region | string | Region or administrative area name |
| hotspots[].lat | number | Latitude of hotspot centroid |
| hotspots[].lon | number | Longitude of hotspot centroid |
| hotspots[].riskScore | number | Composite spillover risk score (0–100) |
| overallSpilloverRate | number | Aggregate annual spillover rate estimate |
| highRiskInterfaces | string[] | Named high-risk human-wildlife interface zones |
| maxEntPrediction | number | MaxEnt model predicted occurrence probability |
model_seasonal_waning_dynamics
| Field | Type | Description |
|---|---|---|
| dynamics[] | array | Monthly seasonal dynamics across a 12-month cycle |
| dynamics[].month | number | Month (1–12) |
| dynamics[].Re | number | Seasonally-adjusted effective reproduction number |
| dynamics[].immunityLevel | number | Population immunity fraction at this month |
| dynamics[].waningRate | number | Monthly immunity waning rate |
| dynamics[].boosterNeed | number | Proportion of population needing boosting |
| seasonalAmplitude | number | Fourier forcing amplitude (alpha) |
| peakMonth | number | Month of maximum transmission |
| troughMonth | number | Month of minimum transmission |
| antibodyHalfLife | number | Estimated antibody half-life in days |
| waningPowerLawExponent | number | Khoury et al. power-law exponent kappa |
| optimalBoosterTiming | number | Recommended month for booster campaign |
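The two formulas behind these fields, R_e(t) = R0·(1+alpha·cos(2pi·t/365-phi)) and Ab(t) = Ab0·t^(-kappa), are easy to evaluate directly. The sketch below does only that. The parameter values are made up for illustration, and how the server converts these curves into peakMonth or optimalBoosterTiming is not specified here.

```python
import math


def seasonal_re(t, r0, alpha, phi):
    """R_e(t) = R0 * (1 + alpha * cos(2*pi*t/365 - phi)), t in days."""
    return r0 * (1.0 + alpha * math.cos(2.0 * math.pi * t / 365.0 - phi))


def peak_day(phi):
    """Day of year at which the cosine term is maximal (argument equals zero)."""
    return (phi * 365.0 / (2.0 * math.pi)) % 365.0


def antibody_level(t, ab0, kappa):
    """Power-law waning Ab(t) = Ab0 * t**(-kappa), defined for t > 0 days."""
    if t <= 0:
        raise ValueError("power-law waning is only defined for t > 0")
    return ab0 * t ** (-kappa)


print(seasonal_re(0, r0=2.0, alpha=0.25, phi=0.0))   # 2.5
print(antibody_level(4, ab0=100.0, kappa=0.5))       # 50.0
```

Note that a pure power law has no constant half-life: the time to halve grows with t, so a single antibodyHalfLife value presumably refers to a particular reference point on the curve.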
How much does it cost to run pandemic biosurveillance analyses?
This MCP server uses pay-per-event pricing — you pay per tool call. Compute costs are included. Prices range from $0.030 to $0.040 per call depending on the algorithm.
| Tool | Price per call | 10 calls | 50 calls |
|---|---|---|---|
| simulate_epidemic_metapopulation | $0.040 | $0.40 | $2.00 |
| infer_parameters_pmcmc | $0.035 | $0.35 | $1.75 |
| estimate_phylodynamic_re | $0.035 | $0.35 | $1.75 |
| evaluate_intervention_causality | $0.030 | $0.30 | $1.50 |
| compute_vaccination_equilibrium | $0.035 | $0.35 | $1.75 |
| forecast_variant_fitness | $0.030 | $0.30 | $1.50 |
| assess_zoonotic_spillover | $0.030 | $0.30 | $1.50 |
| model_seasonal_waning_dynamics | $0.030 | $0.30 | $1.50 |
Running the full suite of 8 tools for a single outbreak analysis costs $0.265. A weekly surveillance run across 4 pathogens using 3 tools each is 12 calls, or $0.36–$0.48 per week at the per-call prices above, depending on which tools are used.
Apify's free tier includes $5 of monthly platform credits, which covers approximately 150 tool calls before any payment is required. You can set a maximum spending limit per run in your Apify account settings to cap costs.
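The arithmetic behind these figures can be reproduced from the per-call price table. The sketch below uses `Decimal` to avoid floating-point rounding on currency; the helper names are illustrative.

```python
from decimal import Decimal

# Per-call prices copied from the table above (USD).
PRICES = {
    "simulate_epidemic_metapopulation": Decimal("0.040"),
    "infer_parameters_pmcmc": Decimal("0.035"),
    "estimate_phylodynamic_re": Decimal("0.035"),
    "evaluate_intervention_causality": Decimal("0.030"),
    "compute_vaccination_equilibrium": Decimal("0.035"),
    "forecast_variant_fitness": Decimal("0.030"),
    "assess_zoonotic_spillover": Decimal("0.030"),
    "model_seasonal_waning_dynamics": Decimal("0.030"),
}


def total_cost(calls):
    """Cost of a batch given {tool_name: number_of_calls}."""
    return sum(PRICES[name] * count for name, count in calls.items())


def calls_within_budget(budget, price):
    """Whole number of calls a budget covers at a fixed per-call price."""
    return int(budget // price)


full_suite = total_cost({name: 1 for name in PRICES})
print(full_suite)                                             # 0.265
print(calls_within_budget(Decimal("5"), Decimal("0.040")))    # 125
print(calls_within_budget(Decimal("5"), Decimal("0.030")))    # 166
```

The $5 credit therefore covers between 125 and 166 calls depending on the tool mix, which brackets the "approximately 150" figure.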
Compare this to commercial epidemiological intelligence platforms charging $500–2,000/month for comparable data access, with no programmatic API and no mathematical modeling layer.
How to connect this MCP server
Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "pandemic-biosurveillance": {
      "url": "https://pandemic-biosurveillance-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}
Cursor / Windsurf / Cline
Use the same URL in your client's MCP server settings: https://pandemic-biosurveillance-mcp.apify.actor/mcp
Python (direct HTTP)
import json

import httpx

# Call one tool over JSON-RPC; the long timeout leaves room for a multi-minute run.
response = httpx.post(
    "https://pandemic-biosurveillance-mcp.apify.actor/mcp",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_APIFY_TOKEN",
    },
    json={
        "jsonrpc": "2.0",
        "method": "tools/call",
        "params": {
            "name": "simulate_epidemic_metapopulation",
            "arguments": {"query": "H5N1 avian influenza pandemic Southeast Asia"},
        },
        "id": 1,
    },
    timeout=300,
)
response.raise_for_status()  # fail early on HTTP-level errors

# The tool result is a JSON string inside the first content item.
result = response.json()
data = json.loads(result["result"]["content"][0]["text"])
print(f"Peak infections: {data['peakInfections']:,}")
print(f"Herd immunity threshold: {data['herdImmunityThreshold']:.1%}")
print(f"Populations affected: {data['populationsAffected']}")
JavaScript
const response = await fetch("https://pandemic-biosurveillance-mcp.apify.actor/mcp", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${process.env.APIFY_TOKEN}`,
  },
  body: JSON.stringify({
    jsonrpc: "2.0",
    method: "tools/call",
    params: {
      name: "forecast_variant_fitness",
      arguments: { query: "influenza H3N2 variants 2025" },
    },
    id: 1,
  }),
});
if (!response.ok) {
  throw new Error(`MCP request failed: HTTP ${response.status}`);
}
// The tool result is a JSON string inside the first content item.
const result = await response.json();
const data = JSON.parse(result.result.content[0].text);
console.log(`Dominant variant: ${data.dominantVariant}`);
console.log(`Sweep timeline: ${data.sweepTimeline} days`);
console.log(`Escape mutation risk: ${(data.escapeMutationRisk * 100).toFixed(1)}%`);
cURL
# Call a tool
curl -X POST "https://pandemic-biosurveillance-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "assess_zoonotic_spillover",
      "arguments": {"query": "bat coronavirus spillover Southeast Asia"}
    },
    "id": 1
  }'

# List all available tools
curl -X POST "https://pandemic-biosurveillance-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{"jsonrpc":"2.0","method":"tools/list","params":{},"id":1}'
How Pandemic Biosurveillance MCP works
Phase 1: Parallel data retrieval across 16 sources
When any tool is called, fetchAndBuildNetwork() dispatches 16 actor calls concurrently via runActorsParallel(). Each actor has a 180-second timeout and runs at 256 MB memory. The 16 sources are organized into 4 functional groups:
- Health (4 actors): WHO GHO (50 results), ClinicalTrials.gov (50), openFDA drug events (40), EMA medicines (40)
- Research (3 actors): PubMed (50), OpenAlex (50), Europe PMC (50)
- Ecological (2 actors): IUCN Red List (30), GBIF biodiversity (30)
- Environmental/spatial (7 actors): GDACS disasters (30), NOAA weather (30), OpenAQ air quality (30), FEMA (30), REST Countries (30), Nominatim geocoder (20), World Bank (40)
Failed actor calls return empty arrays and do not abort the run, so partial data availability degrades gracefully.
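The pattern described here (concurrent calls, a per-call timeout, and an empty list on failure) can be sketched with `asyncio`. This is not the server's code: `fetch_source` is a hypothetical stand-in for an actor call, and the short delays and timeout are chosen only so the example runs quickly.

```python
import asyncio


async def fetch_source(name, delay, fail=False):
    """Hypothetical stand-in for one data-source call."""
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} unavailable")
    return [{"source": name, "record": 1}]


async def fetch_or_empty(name, coro, timeout):
    """Await one source; on error or timeout return an empty list instead."""
    try:
        return name, await asyncio.wait_for(coro, timeout)
    except Exception:  # includes the timeout error raised by wait_for
        return name, []


async def fetch_all(sources, timeout=180.0):
    """Run every source concurrently and collect {name: records}."""
    tasks = [fetch_or_empty(name, coro, timeout) for name, coro in sources.items()]
    return dict(await asyncio.gather(*tasks))


results = asyncio.run(fetch_all(
    {
        "who": fetch_source("who", 0.01),
        "gbif": fetch_source("gbif", 0.01, fail=True),  # raises
        "noaa": fetch_source("noaa", 0.5),              # exceeds the timeout
    },
    timeout=0.1,
))
print({name: len(records) for name, records in results.items()})
```

Because each source is wrapped individually, one slow or failing source costs at most one timeout and never cancels the others.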
Phase 2: Epidemic network construction
buildEpiNetwork() normalizes all returned records into typed EpiNode objects — disease, trial, paper, country, species, drug, environment, or location — with deduplication by ID. Country population fields are extracted from REST Countries and World Bank responses to parameterize the radiation mobility model.
Mobility edges between country nodes use the Simini et al. radiation model: T_ij = T_i(m_i·n_j)/((m_i+s_ij)(m_i+n_j+s_ij)), where the intervening population s_ij is approximated as 0.3·sqrt(m_i·n_j). Disease-to-country transmission edges, drug-to-disease treatment edges, and species-to-disease host edges are added based on node type matching. The resulting network graph is passed to the algorithm layer.
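The radiation flux above translates directly into a few lines of code. In this sketch the function and argument names are illustrative; the default for `s_ij` uses the 0.3·sqrt(m_i·n_j) approximation stated in the text.

```python
import math


def radiation_flux(t_i, m_i, n_j, s_ij=None):
    """Radiation-model flux T_ij from population i to population j.

    t_i:  total outflow from i
    m_i:  population of the origin
    n_j:  population of the destination
    s_ij: intervening population; defaults to 0.3 * sqrt(m_i * n_j)
    """
    if s_ij is None:
        s_ij = 0.3 * math.sqrt(m_i * n_j)
    return t_i * (m_i * n_j) / ((m_i + s_ij) * (m_i + n_j + s_ij))


# With no intervening population and equal sizes, half the outflow reaches j.
print(radiation_flux(1000, 100, 100, s_ij=0))   # 500.0
print(round(radiation_flux(1000, 100, 100), 1)) # 334.4
```

A larger intervening population always lowers the flux, which is how the model penalizes destinations that sit behind other population centers.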
Phase 3: Mathematical modeling
Each of the 8 tools implements a distinct algorithm in scoring.ts:
- Gillespie SSA uses exponential waiting times with tau-leaping to propagate the stochastic SEIR-HCD system. Initial conditions are derived from disease node metadata and country population sizes from the network.
- Particle MCMC initializes an alive particle filter with 500 particles. Metropolis-Hastings proposals draw from log-normal distributions. The Gelman-Rubin R-hat statistic is computed from two parallel chains.
- Birth-death skyline fits a piecewise-constant birth-rate model with 6 intervals to the network's literature and trial time distribution as a proxy phylogenetic dataset. Bayesian model averaging uses thermodynamic integration to compute model weights.
- Augmented synthetic control constructs the synthetic counterfactual Y_hat(0)_t = sum_j w_j·Y_j(t) where w = argmin ||Y_1 - Yw||^2, using country-level epidemic proxies from the network as donor units.
- Mean-field game solves the coupled HJB/Fokker-Planck system with finite-difference discretization. Age-group strategies are derived from country demographic data in the network.
- Multinomial logistic fits selection coefficients beta_k to variant frequency data extracted from research node metadata: P(variant=k|t) = exp(beta_k·t) / sum_j exp(beta_j·t).
- MaxEnt uses species occurrence data from IUCN and GBIF nodes with environmental covariates from NOAA and OpenAQ to estimate habitat suitability, then combines with a deforestation frontier KDE estimated from GDACS and FEMA land event data.
- Fourier seasonal forcing fits the amplitude alpha and phase phi of R_e(t) = R0·(1+alpha·cos(2pi·t/365-phi)) to the temporal distribution of epidemiological records in the network. Power-law waning exponent kappa is calibrated against clinical trial immunogenicity data.
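To make the first item in the list concrete, here is a minimal Gillespie direct-method simulation. It is a deliberate simplification and not the server's implementation: it uses a single population, a plain SEIR model rather than SEIR-HCD, and exact event-by-event sampling rather than tau-leaping. All rate values are made up.

```python
import random


def gillespie_seir(s, e, i, r, beta, sigma, gamma, t_max, seed=0):
    """Exact stochastic SEIR simulation (Gillespie direct method).

    Events: infection S->E at rate beta*S*I/N, onset E->I at rate sigma*E,
    recovery I->R at rate gamma*I. Returns a list of (t, S, E, I, R) states.
    """
    rng = random.Random(seed)
    n = s + e + i + r
    t = 0.0
    path = [(t, s, e, i, r)]
    while True:
        rates = (beta * s * i / n, sigma * e, gamma * i)
        total = sum(rates)
        if total == 0:
            break  # no event can occur: the epidemic has died out
        t += rng.expovariate(total)  # exponential waiting time to next event
        if t > t_max:
            break
        u = rng.random() * total  # pick an event in proportion to its rate
        if u < rates[0]:
            s, e = s - 1, e + 1
        elif u < rates[0] + rates[1]:
            e, i = e - 1, i + 1
        else:
            i, r = i - 1, r + 1
        path.append((t, s, e, i, r))
    return path


path = gillespie_seir(990, 0, 10, 0, beta=0.5, sigma=0.2, gamma=0.1, t_max=30)
print(len(path), path[-1])
```

Tau-leaping trades this exactness for speed by firing a Poisson-distributed number of each event per fixed time step, which is what makes many large populations tractable.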
Phase 4: Pay-per-event billing and spending limits
Each tool call invokes Actor.charge({ eventName: '...' }) immediately on entry. If chargeResult.eventChargeLimitReached is true, the tool returns an error message without running the model, protecting against runaway costs.
Tips for best results
-
Use disease-specific queries for parameter inference.
infer_parameters_pmcmcproduces tighter posterior intervals when the query targets a well-documented pathogen with substantial published literature (e.g., "influenza H1N1 2009 pandemic" rather than just "influenza"). More PubMed hits means better-parameterized priors. -
Run
simulate_epidemic_metapopulationbefore causal tools. The metapopulation simulation gives you peak timing and total burden estimates that contextualize the scale of causal effects fromevaluate_intervention_causality. Use them sequentially in a conversation with your AI client. -
Interpret Re estimates as model-informed scenarios, not ground truth. The birth-death skyline and particle MCMC outputs depend on the quality and volume of data retrieved for the query. Gelman-Rubin R-hat above 1.1 indicates the chains have not converged — treat those results with caution.
-
Use geographic specificity for spillover modeling. "bat coronavirus spillover Yunnan Province China" retrieves more targeted IUCN and GBIF species records than "bat coronavirus", substantially improving MaxEnt model fit and hotspot precision.
-
Chain variant forecasting with seasonal modeling. Use
forecast_variant_fitnessto identify the dominant variant, then pass that variant name intomodel_seasonal_waning_dynamicsto understand how the seasonal immune landscape will interact with variant-specific immune evasion. -
Monitor ESS in PMCMC outputs. Effective sample size below 200 on any parameter indicates the particle filter may be weight-collapsing. Re-run with a more specific query that retrieves a larger and more consistent dataset.
- Use the free tier for exploratory analysis. At the listed prices of $0.06 to $0.10 per tool call, the $5 Apify free tier covers roughly 50 to 83 tool calls. Run exploratory single-tool queries to validate the right tool for your question before building multi-step pipelines.
- Set spending limits before automated workflows. When scheduling recurring surveillance runs via the Apify API, always set maxTotalChargeUsd in your run configuration to prevent unexpected charges from burst activity.
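When choosing a spending cap for a chained workflow, it can help to add up the per-event prices first. The sketch below uses the prices from the pricing table. The helper name and the example pipeline are assumptions for illustration. Amounts are kept in integer cents to avoid floating-point rounding.

```python
# Per-event prices in cents, copied from the pricing table.
PRICE_CENTS = {
    "simulate-epidemic-metapopulation": 10,
    "infer-parameters-pmcmc": 10,
    "estimate-phylodynamic-re": 8,
    "evaluate-intervention-causality": 8,
    "compute-vaccination-equilibrium": 10,
    "forecast-variant-fitness": 6,
    "assess-zoonotic-spillover": 6,
    "model-seasonal-waning-dynamics": 8,
}


def pipeline_cost_cents(events):
    """Total cost in cents of one pass through a list of billed events."""
    return sum(PRICE_CENTS[name] for name in events)


# Hypothetical weekly pipeline: variant forecast, then seasonal dynamics.
weekly = ["forecast-variant-fitness", "model-seasonal-waning-dynamics"]
per_run_cents = pipeline_cost_cents(weekly)
per_year_cents = 52 * per_run_cents

print(f"per run: ${per_run_cents / 100:.2f}, per year: ${per_year_cents / 100:.2f}")
```

A cap slightly above the expected per-run total leaves room for one retry while still bounding the cost of a runaway loop.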
Combine with other Apify actors
| Actor | How to combine |
|---|---|
| WHO GHO Data Search | Pull disease burden indicators directly for manual parameterization before running the metapopulation simulation |
| Clinical Trial Tracker | Monitor vaccine trial pipeline separately to track countermeasure availability timelines |
| PubMed Research Search | Retrieve the full body of evidence for a pathogen before feeding it to parameter inference |
| GBIF Biodiversity Search | Extract reservoir host distribution data independently for custom spillover analyses |
| NOAA Weather Search | Pull climate anomaly data to manually validate seasonal forcing amplitude estimates |
| GDACS Disaster Search | Track disaster events that disrupt health infrastructure in real time alongside outbreak simulations |
| OpenAQ Air Quality | Combine environmental co-exposure data with outbreak trajectory analysis for respiratory pathogens |
Limitations
- All data is publicly available — the models are parameterized using public databases. Restricted genomic surveillance data, national notifiable disease registries, and proprietary laboratory networks are not accessible. Parameters derived from public data may lag or underestimate true outbreak dynamics.
- Query-dependent data volume — less-studied pathogens (e.g., a newly described zoonosis) may return few PubMed and trial records, reducing algorithm accuracy. The models will still run but posterior credible intervals will be wide.
- Simulation outputs are scenario estimates — Gillespie SSA and phylodynamic results are mathematically rigorous but depend on simplified mobility and population models. Do not use outputs as replacements for national surveillance systems or clinical decision support.
- 16 actor calls per tool — each tool call dispatches 16 parallel actor runs. On free or low-tier Apify accounts with run concurrency limits, some actors may queue, extending response time beyond the expected 2–4 minutes.
- No real-time genomic sequence data: forecast_variant_fitness estimates selection coefficients from literature and epidemiological metadata, not from actual GISAID or INSDC sequence databases. Variant frequency estimates are model-derived, not sequence-derived.
- Country-level spatial resolution: the radiation mobility model operates at country granularity. Sub-national transmission dynamics, city-level outbreak clusters, and healthcare system capacity constraints are not modeled.
- Environmental covariates are limited — MaxEnt spillover modeling uses NOAA and OpenAQ data as environmental proxies. High-resolution land use, deforestation satellite imagery, and livestock density layers are not included.
- No clinical guidance — outputs are research and planning tools. Nothing produced by this MCP constitutes medical advice, clinical decision support, or public health guidance.
Integrations
- Apify API — trigger tool calls programmatically from Python, JavaScript, or any HTTP client for automated outbreak monitoring pipelines
- Webhooks — trigger downstream alerts when Re estimates exceed a threshold or when high-risk spillover hotspots are identified
- Zapier — route outbreak simulation results to Slack, Google Sheets, or email for non-technical stakeholders
- Make — build multi-step automation workflows combining pandemic surveillance with news monitoring or reporting pipelines
- Google Sheets — export structured variant forecasts or seasonal dynamics tables for collaborative team review
- LangChain / LlamaIndex — use this MCP as a tool within an AI agent that synthesizes pandemic intelligence with other data sources for automated research report generation
Troubleshooting
- Run takes longer than 4 minutes: this occurs when Apify actor concurrency limits are reached on free or Starter tier accounts, causing some of the 16 parallel actor calls to queue. Upgrade your Apify plan or reduce concurrent requests from other running actors to restore expected latency.
- Result has very wide credible intervals from infer_parameters_pmcmc: the effective sample size (ESS) for one or more parameters is below 200, indicating the particle filter degraded due to low data volume. Use a more specific and well-documented query (e.g., a named outbreak with substantial published literature). Check convergenceDiagnostic; values above 1.1 mean the result should not be trusted.
- assess_zoonotic_spillover returns no hotspots: the IUCN and GBIF actors returned no species records matching the query. Use species-specific terminology rather than pathogen names (e.g., "Rhinolophus bat" instead of "SARS coronavirus") to populate the ecological network nodes.
- Tool returns spending limit error: your Apify run has reached the maxTotalChargeUsd cap set for the current run. Increase the limit in your Apify run configuration, or start a new run with a higher cap.
- forecast_variant_fitness shows all variants with similar fitness: the query did not retrieve enough variant-specific literature to differentiate selection coefficients. Add variant names explicitly to the query (e.g., "JN.1 XBB EG.5 variant competition 2025") rather than using only pathogen family names.
Responsible use
- This server only accesses publicly available data from international organizations, government agencies, and open research databases.
- Model outputs are scenario analyses based on public data and mathematical approximations. They do not constitute clinical guidance, public health policy recommendations, or official surveillance data.
- Comply with the terms of service of all upstream data sources: WHO, NIH/NLM, EMA, IUCN, GBIF, NOAA, FEMA, and the World Bank.
- Do not use outputs to make individual-level health decisions or clinical treatment choices.
- For guidance on responsible use of scraped public data, see Apify's guide.
❓ FAQ
How many data sources does each pandemic biosurveillance tool call? Every tool call queries all 16 data sources in parallel: 4 health databases (WHO, ClinicalTrials.gov, openFDA, EMA), 3 research databases (PubMed, OpenAlex, Europe PMC), 2 ecological databases (IUCN, GBIF), 4 environmental sources (GDACS, NOAA, OpenAQ, FEMA), and 3 spatial sources (REST Countries, Nominatim, World Bank). All 16 run concurrently, not sequentially.
How accurate are the epidemic simulation and forecasting results? Each tool implements a published mathematical framework with proper uncertainty quantification. Gillespie SSA and particle MCMC outputs include convergence diagnostics (ESS and Gelman-Rubin R-hat). Results should be treated as model-informed scenario analysis parameterized by public data, not as clinical-grade predictions or replacements for national surveillance infrastructure.
What pathogens can I analyze with pandemic biosurveillance MCP? Any pathogen with public health data coverage. This includes respiratory viruses (influenza, SARS-CoV-2, RSV), vector-borne diseases (dengue, Zika, malaria), zoonotic threats (mpox, Nipah, bat coronaviruses), and emerging infectious diseases. Less-studied pathogens return fewer data records, producing wider uncertainty intervals.
How long does a typical pandemic biosurveillance tool call take? On a paid Apify account with sufficient actor concurrency, tool calls complete in 2–4 minutes. The 16 parallel data retrievals each have a 180-second timeout. On free accounts with concurrency limits, calls may take 5–8 minutes.
How is this different from Nextstrain or other epidemic intelligence platforms? Nextstrain analyzes actual genomic sequences for phylogenetic analysis of specific pathogens. This MCP covers a different scope: it synthesizes 16 public health, ecological, and environmental databases into multi-domain epidemic network models, applies 8 distinct algorithms (including phylodynamic Re estimation), and is designed for AI agent integration via the MCP protocol. The two tools are complementary.
Can I schedule pandemic biosurveillance runs automatically?
Yes. Use the Apify scheduler to trigger recurring runs at any interval. For weekly variant surveillance, you would configure a scheduled Apify actor run that calls this MCP's forecast_variant_fitness tool and pushes results to a downstream notification or storage system via webhook.
Is it legal to use data from these sources for outbreak modeling? All 16 data sources are publicly available databases from international organizations and government agencies with open data policies. WHO, NIH, NOAA, FEMA, IUCN, GBIF, and the World Bank all publish data for public use. See Apify's guide on web scraping legality for broader context.
What does the Gelman-Rubin convergence diagnostic mean in PMCMC results? The Gelman-Rubin R-hat statistic measures whether two parallel Markov chains have converged to the same distribution. Values below 1.05 indicate reliable convergence. Values between 1.05 and 1.1 suggest marginal convergence. Values above 1.1 mean the chains have not converged and parameter estimates should not be trusted — re-run with a more specific query.
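For readers who want to see how such a statistic is computed, here is a minimal sketch of the classic (non-split) Gelman-Rubin R-hat together with the thresholds quoted above. Whether the server computes its diagnostic in exactly this way is an assumption. The function names and sample chains are illustrative only.

```python
import math
import statistics


def gelman_rubin(chains):
    """Classic R-hat for equal-length chains of a single parameter."""
    m, n = len(chains), len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    # W: mean of the within-chain sample variances.
    within = statistics.mean([statistics.variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance([statistics.mean(c) for c in chains])
    # Pooled variance estimate, then the ratio against W.
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


def classify_rhat(rhat):
    """Apply the thresholds from the FAQ answer above."""
    if rhat < 1.05:
        return "converged"
    if rhat <= 1.1:
        return "marginal"
    return "not converged"


agree = gelman_rubin([[1, 2, 3, 4], [1, 2, 3, 4]])
disagree = gelman_rubin([[0, 1, 0, 1], [10, 11, 10, 11]])
```

Chains that explore the same region give a between-chain variance near zero and an R-hat near or below 1. Chains stuck in different regions inflate the between-chain term and push R-hat well above 1.1.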
Can pandemic biosurveillance MCP replace a public health surveillance system? No. This MCP provides computational modeling capabilities powered by open public data. It does not replace dedicated surveillance infrastructure, mandatory case reporting systems, national or WHO epidemiological monitoring, or clinical laboratory networks. It is a research and preparedness planning tool.
How does the radiation mobility model work in the metapopulation simulation? The radiation model (Simini et al. 2012) estimates human movement flows between populations as T_ij = T_i·(m_i·n_j)/((m_i+s_ij)(m_i+n_j+s_ij)), where m_i is the origin population, n_j is the destination population, and s_ij is the intervening population. Population sizes are drawn directly from World Bank and REST Countries data retrieved during the run. These mobility edges form the transmission network across which the Gillespie SSA propagates the epidemic.
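As a worked illustration of the formula above, the following sketch evaluates a single radiation-model flow. The function name and the toy population figures are assumptions for the example. T_i is taken to be the total number of travelers leaving the origin.

```python
def radiation_flow(t_i, m_i, n_j, s_ij):
    """T_ij = T_i * (m_i * n_j) / ((m_i + s_ij) * (m_i + n_j + s_ij))."""
    return t_i * (m_i * n_j) / ((m_i + s_ij) * (m_i + n_j + s_ij))


# Toy numbers (assumed): 1000 travelers leave an origin of population 100.
no_intervening = radiation_flow(1000, 100, 100, 0)      # nothing in between
with_intervening = radiation_flow(1000, 100, 100, 100)  # 100 people in between
```

With equal origin and destination populations and no intervening population, half of the outgoing travelers reach the destination. Adding intervening population lowers the flow, which is how the model captures closer opportunities absorbing travelers.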
Can I use pandemic biosurveillance MCP with Claude, GPT-4, or other AI models?
Yes. Any MCP-compatible AI client can connect to this server. Claude Desktop, Cursor, Windsurf, and Cline all support MCP. For direct integration with OpenAI or other model APIs, use the HTTP endpoint at https://ryanclinton--pandemic-biosurveillance-mcp.apify.actor/mcp with the JSON-RPC 2.0 protocol.
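A direct integration would send JSON-RPC 2.0 messages to that endpoint. The sketch below only builds the request body for an MCP tools/call message and does not perform any network I/O. The "query" argument name is an assumption, since the tool input schema is not shown here. A real session would normally begin with the MCP initialize handshake, and any authentication the endpoint requires is omitted.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


payload = build_tool_call(
    1,
    "forecast_variant_fitness",
    # Hypothetical argument name; check the tool's published input schema.
    {"query": "JN.1 XBB EG.5 variant competition 2025"},
)
body = json.dumps(payload)  # POST this as application/json to the endpoint
```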
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.