Causal Panopticon MCP Server
Causal Panopticon is a cross-domain causal discovery and inference engine for AI agents, exposed via the Model Context Protocol. It orchestrates **18 heterogeneous data sources** across economics, health, environment, security, policy, finance, academia, and labor — applying 8 peer-reviewed causal algorithms to discover what actually causes what, not just what correlates.
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| discover-cross-domain-causes | PC/GES/NOTEARS meta-algorithm causal discovery | $0.12 |
| estimate-interventional-effect | Do-calculus back-door/front-door adjustment | $0.10 |
| compute-counterfactual | SCM abduction-action-prediction | $0.10 |
| transport-causal-effect | Selection diagram transportability | $0.10 |
| detect-confounders | Conditional independence confounder detection | $0.08 |
| simulate-causal-agents | Bayesian persuasion concavification game | $0.10 |
| optimize-causal-experiment | Causal Bayesian optimization EI acquisition | $0.10 |
| validate-causal-model | DAG Markov faithfulness BIC validation | $0.08 |
Example: 100 events = $12.00 · 1,000 events = $120.00
Connect to your AI agent
Add this MCP server to Claude Desktop, Cursor, Windsurf, or any MCP-compatible client.
Server endpoint: https://ryanclinton--causal-panopticon-mcp.apify.actor/mcp

```json
{
  "mcpServers": {
    "causal-panopticon-mcp": {
      "url": "https://ryanclinton--causal-panopticon-mcp.apify.actor/mcp"
    }
  }
}
```

Documentation
Built for researchers, policy analysts, and AI agents that need to move beyond correlation into causal structure. Each tool call fires up to 18 data actors in parallel, assembles a causal graph, and returns structured inference results — average treatment effects, counterfactual values, transportability verdicts, and validated DAG structures — over a single MCP connection.
What data sources can you access?
| Data Point | Source | Coverage |
|---|---|---|
| 📊 Economic time series | FRED Economic Data | 800K+ US series |
| 👷 Labor statistics | BLS Economic Data | US employment, wages, prices |
| 🌐 Global macroeconomic indicators | IMF Data | 190 countries |
| 🏥 Health statistics | WHO GHO Data | Global health metrics |
| 🧪 Clinical research | ClinicalTrials.gov | Registered trials worldwide |
| 🔬 Biomedical literature | PubMed Search | 35M+ articles |
| 🌦️ Weather and climate data | NOAA Weather | US and global observations |
| 💨 Air quality measurements | OpenAQ | Global monitoring stations |
| 🔐 Cyber vulnerabilities | NVD CVE Search | All published CVEs |
| 📜 US legislation | Congress Bill Search | Active and historical bills |
| 📋 US regulatory filings | Federal Register Search | Federal rules and notices |
| 🏢 Corporate SEC filings | SEC EDGAR Search | All public company filings |
| 📈 Market and equity data | Finnhub Financial Data | Global equities |
| 🎓 Academic publications | OpenAlex | 250M+ scholarly works |
| 🌪️ Disaster declarations | FEMA Disaster Search | US disaster records |
| ⚠️ Consumer product safety | CPSC Recall Search | Product recall alerts |
| 💊 Drug adverse events | FDA Drug Event Search | FAERS database |
| 💼 Job market trends | Job Market Intelligence | Employment signals |
Why use Causal Panopticon MCP Server?
Correlation-based analysis is fast but dangerous for decisions. A policy team that sees a correlation between air quality and hospital admissions cannot determine whether pollution causes illness, whether illness causes people to move to polluted areas, or whether poverty confounds both. Getting that distinction wrong means wasted interventions and missed opportunities.
Manual causal analysis — collecting data from 18 separate sources, applying conditional independence tests, building DAGs, checking back-door criteria — takes weeks for a trained economist. This MCP server does it in a single AI agent tool call.
Connecting via the MCP protocol means your AI agent in Claude Desktop, Cursor, or any MCP-compatible client gains causal reasoning capabilities without writing a line of data collection code.
- Scheduling — run causal discovery on a recurring schedule to detect structural shifts in real-world causal relationships over time
- API access — trigger runs from Python, JavaScript, or any HTTP client via the Apify platform API
- Proxy rotation — data collection across 18 sources uses Apify's built-in proxy infrastructure to avoid rate limits
- Monitoring — get Slack or email alerts when tool calls fail or return unexpected results
- Integrations — connect to Zapier, Make, Google Sheets, or any webhook-compatible workflow
MCP tools
| Tool | Algorithm | Best for | Cost per call |
|---|---|---|---|
| `discover_cross_domain_causes` | PC / GES / NOTEARS meta-algorithm + transfer entropy | Finding causal structure across domains | $0.04 |
| `estimate_interventional_effect` | Do-calculus (three rules), back-door, front-door, IV | Treatment effect estimation, policy evaluation | $0.04 |
| `compute_counterfactual` | Pearl's SCM: abduction → action → prediction | Retrospective attribution, what-if analysis | $0.04 |
| `transport_causal_effect` | Bareinboim-Pearl transportability, s-admissibility | Generalizing findings across populations | $0.04 |
| `detect_confounders` | Graph structure + partial correlation analysis | Identifying threats to causal inference | $0.04 |
| `simulate_causal_agents` | Bayesian persuasion, Kamenica-Gentzkow concavification | Expert disagreement modeling, policy persuasion | $0.04 |
| `optimize_causal_experiment` | Causal Bayesian optimization, Expected Improvement | Experiment design, active learning | $0.04 |
| `validate_causal_model` | DAG acyclicity, Markov compatibility, faithfulness, BIC | Model checking, algorithm comparison | $0.04 |
Features
- Meta-algorithm selection — `auto` mode runs all three structural learning algorithms (PC, GES, NOTEARS) and selects the result with the best BIC score, so you get the best-fitting DAG for your data without manual tuning
- PC algorithm — constraint-based causal discovery using Pearson partial correlation tests with Fisher's z-test (Abramowitz-Stegun approximation), with a controllable significance level via the `alpha` parameter
- GES algorithm — greedy equivalence search with forward and backward BIC scoring phases; well-suited for denser causal graphs where the PC skeleton may over-prune
- NOTEARS algorithm — continuous optimization with a trace-exponential acyclicity constraint (Taylor series approximation to order 8); converts the combinatorial DAG search into a differentiable problem
- Do-calculus identification — implements all three rules of Pearl's do-calculus; applies the back-door criterion first, the front-door criterion second, and instrumental variable identification third, returning which criterion identified the effect
- SCM counterfactuals — Pearl's three-step procedure: abduction (infer exogenous noise U via residualization), action (modify structural equations for the hypothetical), prediction (propagate in topological order through the modified SCM)
- Bareinboim-Pearl transportability — evaluates s-admissibility using selection diagrams; reports whether a causal effect identified in a source domain can be validly transported to a target domain and provides the adjusted estimate
- Transfer entropy matrix — kernel density estimation of time-lagged mutual information between all domain pairs; captures information flow that directed graphs may miss
- Colimit graph construction — merges per-domain causal graphs into a unified causal atlas using the category-theoretic colimit over directed graphs, preserving domain-specific edges as cross-domain causal claims
- Bayesian persuasion simulation — models 3,000 domain expert agents in a Kamenica-Gentzkow sender-receiver game; concavifies the sender's value function over the simplex of posterior beliefs to find the optimal information disclosure policy
- Confounder detection — reports each confounder's variable name, the pair it confounds, confounding strength (partial correlation magnitude), whether it is adjustable in the graph, and whether an instrumental variable exists for non-adjustable confounders
- Experiment optimization — causal Bayesian optimization selects the intervention variable that maximizes expected information gain via variance reduction; applies the Expected Improvement acquisition function and outputs a suggested sample size via power analysis
- Model validation — four independent validation criteria: DAG acyclicity (Kahn topological sort), Markov compatibility (implied independencies hold in data), faithfulness (no extra independencies), and BIC goodness of fit, with individual p-values per test
- Parallel data collection — all upstream actors fire simultaneously via `runActorsParallel`, with a 180-second per-actor timeout and graceful empty-array fallback on failure
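To make the PC edge test concrete, here is a minimal sketch of a Fisher's z significance check on a first-order partial correlation. It handles only a single conditioning variable and uses the exact normal CDF via `erf` rather than the Abramowitz-Stegun approximation, so it illustrates the idea rather than reproducing the server's implementation.

```python
import math

def partial_corr(rxy, rxz, ryz):
    # First-order partial correlation of X and Y given Z
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz**2) * (1 - ryz**2))

def fisher_z_pvalue(r, n, k):
    # Fisher z-transform; k = size of the conditioning set
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - k - 3) * abs(z)
    # Two-sided p-value from the standard normal CDF (via erf)
    return 2 * (1 - 0.5 * (1 + math.erf(stat / math.sqrt(2))))

def keep_edge(rxy, rxz, ryz, n, alpha=0.05):
    # PC keeps the X-Y edge only when independence given Z is rejected
    r = partial_corr(rxy, rxz, ryz)
    return fisher_z_pvalue(r, n, k=1) < alpha
```

Raising `alpha` keeps more edges, which is exactly why the input tips suggest `alpha: 0.10` for exploratory work.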
Use cases for cross-domain causal discovery
Policy impact research
Policy analysts and think tanks evaluating legislative proposals need to know whether a law caused an outcome, not merely whether the two are correlated. Use estimate_interventional_effect with domains: ["policy", "economics"] to estimate the average treatment effect of a legislative change on employment or GDP — with the identification criterion reported so you know whether back-door, front-door, or IV was used.
Cross-domain causal hypothesis generation
Research teams and AI agents building knowledge graphs often work within single data silos. discover_cross_domain_causes pulls from all 9 domain groups simultaneously, builds per-domain DAGs via the meta-algorithm, and merges them into a colimit graph. The result surfaces cross-domain causal edges — for example, a NOAA climate variable influencing a FRED unemployment series — that no single-domain analysis would find.
Clinical and public health attribution
Epidemiologists and public health researchers need counterfactual estimates to answer attribution questions. compute_counterfactual implements Pearl's abduction-action-prediction procedure using PubMed and clinical trials data: "What would the mortality rate have been if the intervention had not been introduced?" The exogenous noise terms are inferred from observed residuals, giving a statistically grounded retrospective estimate.
Generalizing findings across populations
Development economists and international health researchers routinely need to apply findings from one country's data to another. transport_causal_effect evaluates Bareinboim-Pearl s-admissibility using selection diagrams derived from the source and target domain graphs. If the effect is transportable, it returns the reweighted adjusted estimate; if not, it reports why transportability fails.
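One crude way to picture the s-admissibility screening is to ask whether the selection variables (the things that differ between populations) sit inside the outcome's Markov boundary. The sketch below is a toy proxy over a parents-dict DAG representation; the function names and the disjointness rule are illustrative simplifications, not the server's actual Bareinboim-Pearl machinery.

```python
def children_map(parents):
    # Invert a {node: set-of-parents} dict into a children dict
    ch = {n: set() for n in parents}
    for n, ps in parents.items():
        for p in ps:
            ch.setdefault(p, set()).add(n)
    return ch

def markov_boundary(node, parents):
    # Parents, children, and co-parents of each child
    ch = children_map(parents)
    mb = set(parents.get(node, set())) | ch.get(node, set())
    for c in ch.get(node, set()):
        mb |= parents.get(c, set()) - {node}
    return mb

def crude_s_admissible(outcome, parents, selection_vars):
    # Toy proxy: if selection variables sit inside the outcome's
    # Markov boundary, domain differences can leak into the effect
    # and naive transport is unsafe.
    return markov_boundary(outcome, parents).isdisjoint(selection_vars)
```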
AI agent experiment planning
AI agents tasked with causal learning pipelines — for example, an agent iterating on a research question over multiple runs — can use optimize_causal_experiment to decide which variable to intervene on next. The Expected Improvement acquisition function selects the variable whose intervention maximally reduces posterior variance over the target, with power analysis output for sample size planning.
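The Expected Improvement acquisition function mentioned here has a standard closed form under a Gaussian posterior. A minimal sketch using only the standard library (the maximization framing and variable names are illustrative, not the server's code):

```python
import math

def norm_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def norm_pdf(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def expected_improvement(mu, sigma, best):
    # EI for maximization under a Gaussian posterior N(mu, sigma^2),
    # where `best` is the best value observed so far
    if sigma == 0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    return (mu - best) * norm_cdf(z) + sigma * norm_pdf(z)
```

Picking the intervention is then just an argmax of `expected_improvement` over the candidate variables' posterior means and standard deviations.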
Causal model validation before downstream use
Before using a discovered DAG to make decisions or pass to a downstream analysis, use validate_causal_model to verify it satisfies DAG acyclicity, Markov compatibility, and faithfulness. Individual test p-values are returned per criterion, so you can report the statistical basis for accepting or rejecting the structure.
How to connect this MCP server
Claude Desktop
Add the following to your claude_desktop_config.json:
```json
{
  "mcpServers": {
    "causal-panopticon": {
      "url": "https://causal-panopticon-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}
```
Cursor
Add the following to your Cursor MCP settings:
```json
{
  "mcpServers": {
    "causal-panopticon": {
      "url": "https://causal-panopticon-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}
```
Any MCP-compatible client
The server endpoint is: https://causal-panopticon-mcp.apify.actor/mcp
The server uses Streamable HTTP transport and supports the standard MCP operations `tools/list` and `tools/call`.
Input parameters
This MCP server has no actor-level input. All parameters are passed directly to each tool when called by the AI agent.
Per-tool parameters
| Tool | Parameter | Type | Required | Default | Description |
|---|---|---|---|---|---|
| `discover_cross_domain_causes` | `query` | string | Yes | — | Search query sent to all domain actors (e.g. "inflation healthcare employment") |
| `discover_cross_domain_causes` | `domains` | string[] | No | all 9 | Subset of: economics, health, environment, security, policy, finance, academia, disaster, labor |
| `discover_cross_domain_causes` | `algorithm` | enum | No | `auto` | `auto`, `pc`, `ges`, or `notears` |
| `discover_cross_domain_causes` | `alpha` | number | No | 0.05 | Significance level for PC conditional independence tests |
| `discover_cross_domain_causes` | `maxResults` | number | No | 20 | Max results fetched per upstream actor |
| `estimate_interventional_effect` | `query` | string | Yes | — | Data collection query |
| `estimate_interventional_effect` | `treatment` | string | Yes | — | Treatment/intervention variable name |
| `estimate_interventional_effect` | `outcome` | string | Yes | — | Outcome variable name |
| `estimate_interventional_effect` | `domains` | string[] | No | all 9 | Domains for data collection |
| `estimate_interventional_effect` | `maxResults` | number | No | 15 | Max results per actor |
| `compute_counterfactual` | `query` | string | Yes | — | Data collection query |
| `compute_counterfactual` | `interventionVariable` | string | Yes | — | Variable to hypothetically change |
| `compute_counterfactual` | `interventionValue` | number | Yes | — | Hypothetical value for the intervention |
| `compute_counterfactual` | `queryVariable` | string | Yes | — | Variable whose counterfactual value to compute |
| `compute_counterfactual` | `domains` | string[] | No | all 9 | Domains for data collection |
| `compute_counterfactual` | `maxResults` | number | No | 15 | Max results per actor |
| `transport_causal_effect` | `query` | string | Yes | — | Query common to both domains |
| `transport_causal_effect` | `sourceDomain` | string | Yes | — | Domain where the effect was identified |
| `transport_causal_effect` | `targetDomain` | string | Yes | — | Domain to transport the effect to |
| `transport_causal_effect` | `treatment` | string | Yes | — | Treatment variable |
| `transport_causal_effect` | `outcome` | string | Yes | — | Outcome variable |
| `transport_causal_effect` | `selectionVariables` | string[] | No | — | Variables that differ between domains |
| `transport_causal_effect` | `maxResults` | number | No | 15 | Max results per actor |
| `detect_confounders` | `query` | string | Yes | — | Data collection query |
| `detect_confounders` | `domains` | string[] | No | all 9 | Domains for data collection |
| `detect_confounders` | `maxResults` | number | No | 15 | Max results per actor |
| `simulate_causal_agents` | `query` | string | Yes | — | Causal claim or hypothesis to evaluate |
| `simulate_causal_agents` | `numStates` | number | No | 3 | Number of possible causal states (2–10) |
| `simulate_causal_agents` | `numAgents` | number | No | 3000 | Number of domain expert agents to simulate |
| `simulate_causal_agents` | `domains` | string[] | No | all 9 | Domains for data collection |
| `simulate_causal_agents` | `maxResults` | number | No | 10 | Max results per actor |
| `optimize_causal_experiment` | `query` | string | Yes | — | Research question or target variable query |
| `optimize_causal_experiment` | `targetVariable` | string | Yes | — | Variable to learn about |
| `optimize_causal_experiment` | `budget` | number | No | 100 | Experiment budget (abstract units for cost estimation) |
| `optimize_causal_experiment` | `domains` | string[] | No | all 9 | Domains for data collection |
| `optimize_causal_experiment` | `maxResults` | number | No | 15 | Max results per actor |
| `validate_causal_model` | `query` | string | Yes | — | Data collection query |
| `validate_causal_model` | `domains` | string[] | No | all 9 | Domains for data collection |
| `validate_causal_model` | `algorithm` | enum | No | `auto` | Algorithm to validate: `auto`, `pc`, `ges`, `notears` |
| `validate_causal_model` | `maxResults` | number | No | 15 | Max results per actor |
Usage examples
Discover causal structure across economics and health:
```json
{
  "query": "inflation unemployment healthcare spending",
  "domains": ["economics", "health", "labor"],
  "algorithm": "auto",
  "alpha": 0.05,
  "maxResults": 20
}
```
Estimate policy treatment effect:
```json
{
  "query": "minimum wage employment",
  "treatment": "minimum_wage",
  "outcome": "employment_rate",
  "domains": ["economics", "policy", "labor"],
  "maxResults": 15
}
```
Compute counterfactual for health intervention:
```json
{
  "query": "air quality respiratory hospital admissions",
  "interventionVariable": "pm25_concentration",
  "interventionValue": 12.0,
  "queryVariable": "hospital_admissions",
  "domains": ["environment", "health"],
  "maxResults": 15
}
```
Input tips
- Narrow your domains — specify 2–3 focused domains rather than using all 9 to reduce cost and improve the signal-to-noise ratio in the causal graph
- Use descriptive queries — queries like "air quality respiratory hospital admissions" retrieve more relevant data than "health" from each upstream actor, improving the quality of extracted numeric features
- Match variable names to your data — the `findClosestNode` function uses substring matching, so treatment/outcome variable names like "employment" or "pm25" will match nodes even without exact ID matches
- Increase `alpha` for exploratory work — setting `alpha: 0.10` makes the PC algorithm less aggressive in pruning edges, giving a denser skeleton for hypothesis generation
- Use `validate_causal_model` after `discover_cross_domain_causes` — run discovery first, then validation with the same query and domain parameters to get p-values for each model criterion before drawing conclusions
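The discovery-then-validation tip amounts to two `tools/call` requests that share the same query and domains. A small helper that builds both JSON-RPC payloads (sending them is left to your HTTP client; the endpoint URL is the one documented on this page):

```python
import json

ENDPOINT = "https://causal-panopticon-mcp.apify.actor/mcp"  # documented endpoint

def tool_call(name, arguments, call_id):
    # Build a JSON-RPC 2.0 tools/call payload for the MCP endpoint
    return {
        "jsonrpc": "2.0",
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
        "id": call_id,
    }

# Identical query and domains across both calls, per the tip above
shared = {"query": "air quality respiratory hospital admissions",
          "domains": ["environment", "health"]}

discover = tool_call("discover_cross_domain_causes", {**shared, "algorithm": "auto"}, 1)
validate = tool_call("validate_causal_model", {**shared, "algorithm": "auto"}, 2)
body = json.dumps(discover)  # POST this first, then the validation payload
```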
Output examples
discover_cross_domain_causes output
```json
{
  "algorithm": "NOTEARS",
  "totalNodes": 24,
  "totalEdges": 31,
  "domains": ["economics", "health", "environment", "labor"],
  "perDomainGraphs": [
    {
      "algorithm": "GES",
      "nodes": 8,
      "edges": 11,
      "bicScore": -142.7,
      "dagValid": true
    },
    {
      "algorithm": "PC",
      "nodes": 6,
      "edges": 7,
      "bicScore": -98.4,
      "dagValid": true
    }
  ],
  "crossDomainEdges": [
    {
      "from": "economics_cpi_inflation",
      "to": "health_hospital_admissions",
      "weight": 0.64,
      "mechanism": "indirect",
      "confidence": 0.81
    },
    {
      "from": "environment_pm25",
      "to": "labor_absenteeism",
      "weight": 0.52,
      "mechanism": "direct",
      "confidence": 0.74
    }
  ],
  "colimitGraph": {
    "nodes": 24,
    "edges": 31,
    "dagValid": true
  },
  "transferEntropyTopPairs": [
    { "from": "economics", "to": "labor", "entropy": 0.43 },
    { "from": "environment", "to": "health", "entropy": 0.38 }
  ]
}
```
estimate_interventional_effect output
```json
{
  "treatment": "economics_minimum_wage",
  "outcome": "labor_employment_rate",
  "averageTreatmentEffect": -0.024,
  "confidence": 0.78,
  "identificationCriterion": "backdoor",
  "adjustmentSet": ["economics_gdp_growth", "policy_labor_regulation"],
  "confidenceInterval": [-0.041, -0.007],
  "graphAlgorithm": "GES",
  "graphNodes": 18,
  "graphEdges": 22
}
```
compute_counterfactual output
```json
{
  "query": "What would hospital admissions be if PM2.5 had been 12?",
  "factualValue": 0.47,
  "counterfactualValue": 0.31,
  "causalEffect": -0.16,
  "probability": 0.83,
  "mechanism": "direct structural pathway via environment_pm25 -> health_respiratory",
  "exogenousNoiseTerms": 6,
  "graphAlgorithm": "NOTEARS",
  "graphNodes": 14
}
```
transport_causal_effect output
```json
{
  "sourceDomain": "health",
  "targetDomain": "economics",
  "sourceEffect": 0.38,
  "transportable": true,
  "sAdmissible": true,
  "adjustedEffect": 0.29,
  "confidence": 0.71,
  "selectionNodes": ["gdp_per_capita", "healthcare_expenditure"],
  "sourceGraphNodes": 9,
  "targetGraphNodes": 11
}
```
validate_causal_model output
```json
{
  "overallValid": true,
  "dagConstraintSatisfied": true,
  "markovCompatible": true,
  "faithfulnessHolds": false,
  "bicScore": -214.3,
  "testResults": [
    { "test": "dag_acyclicity", "pValue": 1.0, "passed": true },
    { "test": "markov_compatibility", "pValue": 0.12, "passed": true },
    { "test": "faithfulness", "pValue": 0.03, "passed": false },
    { "test": "bic_goodness_of_fit", "pValue": 0.24, "passed": true }
  ],
  "graphAlgorithm": "PC",
  "graphNodes": 16,
  "graphEdges": 19
}
```
Output fields
Common fields (all tools)
| Field | Type | Description |
|---|---|---|
| `graphAlgorithm` | string | Algorithm selected: PC, GES, or NOTEARS |
| `graphNodes` | number | Number of nodes in the merged causal graph |
| `graphEdges` | number | Number of directed edges in the merged causal graph |
discover_cross_domain_causes
| Field | Type | Description |
|---|---|---|
| `algorithm` | string | Algorithm selected by the meta-algorithm (best BIC) |
| `totalNodes` | number | Total nodes in the colimit (merged) graph |
| `totalEdges` | number | Total edges in the colimit graph |
| `domains` | string[] | Domains included in this run |
| `perDomainGraphs[]` | object[] | Per-domain graph summary with algorithm, node/edge counts, BIC score, and DAG validity flag |
| `crossDomainEdges[]` | object[] | Edges whose endpoints span different domain graphs |
| `crossDomainEdges[].from` | string | Source node ID (domain-prefixed) |
| `crossDomainEdges[].to` | string | Target node ID (domain-prefixed) |
| `crossDomainEdges[].weight` | number | Edge weight (partial correlation magnitude) |
| `crossDomainEdges[].mechanism` | string | `direct` or `indirect` |
| `crossDomainEdges[].confidence` | number | Confidence in direction [0, 1] |
| `colimitGraph.nodes` | number | Total nodes in the merged colimit graph |
| `colimitGraph.edges` | number | Total edges in the merged colimit graph |
| `colimitGraph.dagValid` | boolean | Whether the colimit graph satisfies acyclicity |
| `transferEntropyTopPairs[]` | object[] | Top domain pairs by time-lagged transfer entropy |
estimate_interventional_effect
| Field | Type | Description |
|---|---|---|
| `treatment` | string | Matched treatment node ID in the graph |
| `outcome` | string | Matched outcome node ID in the graph |
| `averageTreatmentEffect` | number | ATE: `E[Y \| do(X=x)] - E[Y \| do(X=x')]` |
| `confidence` | number | Confidence in the ATE estimate [0, 1] |
| `identificationCriterion` | string | `backdoor`, `frontdoor`, `instrumental`, or `direct` |
| `adjustmentSet` | string[] | Variables used in the adjustment formula |
| `confidenceInterval` | [number, number] | 95% confidence interval bounds for the ATE |
compute_counterfactual
| Field | Type | Description |
|---|---|---|
| `query` | string | Human-readable counterfactual question |
| `factualValue` | number | Observed value of the query variable |
| `counterfactualValue` | number | Estimated value under the hypothetical intervention |
| `causalEffect` | number | Counterfactual effect: counterfactual minus factual |
| `probability` | number | Probability of the counterfactual scenario [0, 1] |
| `mechanism` | string | Description of the causal pathway |
| `exogenousNoiseTerms` | number | Number of exogenous noise variables inferred (abduction step) |
transport_causal_effect
| Field | Type | Description |
|---|---|---|
| `sourceDomain` | string | Domain where the effect was identified |
| `targetDomain` | string | Target domain for transport |
| `sourceEffect` | number | Original causal effect in the source domain |
| `transportable` | boolean | Whether the effect can be validly transported |
| `sAdmissible` | boolean | Whether the selection diagram is s-admissible |
| `adjustedEffect` | number | Reweighted effect estimate for the target domain |
| `selectionNodes` | string[] | Selection variables in the transportability diagram |
detect_confounders
| Field | Type | Description |
|---|---|---|
| `totalConfounders` | number | Total confounders found in the graph |
| `confounders[].variable` | string | Confounding variable ID |
| `confounders[].confounds` | [string, string] | The pair of variables being confounded |
| `confounders[].strength` | number | Confounding strength (partial correlation magnitude) |
| `confounders[].adjustable` | boolean | Whether the confounder is observed and adjustable |
| `confounders[].instrumentAvailable` | boolean | Whether an instrumental variable exists |
| `confounders[].instrument` | string or null | IV variable name, if available |
simulate_causal_agents
| Field | Type | Description |
|---|---|---|
| `agentsSimulated` | number | Number of expert agents in the simulation |
| `states` | number | Number of possible causal states |
| `convergence` | boolean | Whether the persuasion game reached equilibrium |
| `equilibriumBeliefs` | object | Final posterior belief distribution over states |
| `senderPayoff` | number | Optimal sender payoff under the equilibrium signal |
| `receiverPayoff` | number | Receiver payoff at equilibrium |
| `informationRent` | number | Information rent extracted by the sender |
| `optimalSignals` | number | Number of signals in the optimal disclosure policy |
| `posteriorCount` | number | Number of posteriors in the concavified value function |
optimize_causal_experiment
| Field | Type | Description |
|---|---|---|
| `targetVariable` | string | Matched target variable ID in the graph |
| `optimalIntervention` | string | Variable to intervene on for maximum information gain |
| `expectedInfoGain` | number | Expected reduction in posterior variance |
| `acquisitionFunction` | number | Expected Improvement acquisition function value |
| `suggestedSampleSize` | number | Recommended sample size from power analysis |
| `estimatedCost` | number | Estimated experiment cost in budget units |
| `candidateVariables` | string[] | All candidate intervention variables considered |
validate_causal_model
| Field | Type | Description |
|---|---|---|
| `overallValid` | boolean | True if all four criteria pass |
| `dagConstraintSatisfied` | boolean | Kahn topological sort succeeds (no cycles) |
| `markovCompatible` | boolean | Implied conditional independencies hold in the data |
| `faithfulnessHolds` | boolean | No extra conditional independencies in the data |
| `bicScore` | number | Bayesian Information Criterion (lower is better) |
| `testResults[]` | object[] | Individual test results with p-value and pass/fail |
| `testResults[].test` | string | Test name |
| `testResults[].pValue` | number | p-value for the test |
| `testResults[].passed` | boolean | Whether the test passed |
How much does it cost to use Causal Panopticon?
Causal Panopticon uses pay-per-event pricing — you pay $0.04 per tool call. Upstream actor execution costs are billed separately at Apify's standard rates (typically $0.01-0.02 per upstream call). Platform compute is included.
| Scenario | Tool calls | Cost per call | Estimated total |
|---|---|---|---|
| Quick test | 1 | $0.04 | $0.04 |
| Small research session | 10 | $0.04 | $0.40 |
| Full analysis pipeline | 50 | $0.04 | $2.00 |
| Daily automated runs | 200 | $0.04 | $8.00 |
| Enterprise research workflow | 1,000 | $0.04 | $40.00 |
You can set a maximum spending limit per run to control costs. The actor stops when your budget is reached.
The Apify free plan includes $5 of monthly platform credits — enough for approximately 125 tool calls at no cost. Compare this to commercial causal inference platforms that charge $200-2,000/month for access to a single data domain. Causal Panopticon covers 9 domains with no subscription commitment.
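The arithmetic behind the table above as a tiny helper (prices per this page; separately billed upstream actor costs excluded, and the variable names are illustrative):

```python
def estimated_cost(tool_calls, price_per_call=0.04):
    # Pay-per-event total for a given number of tool calls
    return round(tool_calls * price_per_call, 2)

# Calls covered by the $5 of monthly free-plan platform credits
free_tier_calls = round(5.00 / 0.04)
```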
How to call this server using the API
Python
```python
import json
import urllib.request

# Call a tool on the MCP server (running in standby mode) via a raw JSON-RPC POST
url = "https://causal-panopticon-mcp.apify.actor/mcp"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_APIFY_TOKEN"
}
payload = {
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
        "name": "discover_cross_domain_causes",
        "arguments": {
            "query": "inflation unemployment health outcomes",
            "domains": ["economics", "health", "labor"],
            "algorithm": "auto",
            "maxResults": 20
        }
    },
    "id": 1
}

data = json.dumps(payload).encode("utf-8")
req = urllib.request.Request(url, data=data, headers=headers, method="POST")
with urllib.request.urlopen(req) as response:
    result = json.loads(response.read())

content = result["result"]["content"][0]["text"]
atlas = json.loads(content)
print(f"Algorithm: {atlas['algorithm']}")
print(f"Total nodes: {atlas['totalNodes']}, edges: {atlas['totalEdges']}")
for edge in atlas.get("crossDomainEdges", []):
    print(f"  {edge['from']} -> {edge['to']} (confidence: {edge['confidence']:.2f})")
```
JavaScript
```javascript
const url = "https://causal-panopticon-mcp.apify.actor/mcp";

const response = await fetch(url, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_APIFY_TOKEN"
  },
  body: JSON.stringify({
    jsonrpc: "2.0",
    method: "tools/call",
    params: {
      name: "estimate_interventional_effect",
      arguments: {
        query: "minimum wage employment labor market",
        treatment: "minimum_wage",
        outcome: "employment_rate",
        domains: ["economics", "policy", "labor"],
        maxResults: 15
      }
    },
    id: 1
  })
});

const result = await response.json();
const content = JSON.parse(result.result.content[0].text);
console.log(`ATE: ${content.averageTreatmentEffect.toFixed(4)}`);
console.log(`Identification: ${content.identificationCriterion}`);
console.log(`Adjustment set: ${content.adjustmentSet.join(", ")}`);
console.log(`95% CI: [${content.confidenceInterval[0].toFixed(4)}, ${content.confidenceInterval[1].toFixed(4)}]`);
```
cURL
```bash
# Call discover_cross_domain_causes
curl -X POST "https://causal-panopticon-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "discover_cross_domain_causes",
      "arguments": {
        "query": "air quality respiratory disease employment",
        "domains": ["environment", "health", "labor"],
        "algorithm": "auto",
        "maxResults": 20
      }
    },
    "id": 1
  }'

# List all available tools
curl -X POST "https://causal-panopticon-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{"jsonrpc": "2.0", "method": "tools/list", "params": {}, "id": 1}'
```
How Causal Panopticon works
Phase 1: Parallel data collection across 18 sources
When a tool is called, the server identifies which domains are active and maps each domain to its upstream actor IDs using the `DOMAIN_MAP` constant. The economics domain maps to the `fred`, `bls`, and `imf` actors; health maps to `who`, `clinicalTrials`, and `pubmed`; and so on. All actor calls fire simultaneously via `runActorsParallel`, which wraps `Promise.all` with per-actor 180-second timeouts and empty-array fallbacks. Results are grouped by domain into a `domainData` record before graph construction.
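In Python terms, the described fire-everything-at-once behavior with per-actor timeouts and empty-array fallbacks might look like the following asyncio sketch. Here `call_actor` is a hypothetical stub standing in for a real upstream actor call; only the gather/timeout/fallback shape mirrors the description above.

```python
import asyncio

async def call_actor(actor_id, query):
    # Hypothetical stub for a real upstream actor call
    await asyncio.sleep(0)
    return [{"actor": actor_id, "query": query}]

async def run_actors_parallel(actor_ids, query, timeout=180.0):
    # Fire every actor at once, time-box each call, and fall
    # back to an empty list on timeout or error
    async def guarded(aid):
        try:
            return await asyncio.wait_for(call_actor(aid, query), timeout)
        except Exception:
            return []
    return await asyncio.gather(*(guarded(a) for a in actor_ids))

results = asyncio.run(run_actors_parallel(["fred", "bls", "imf"], "inflation"))
```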
Phase 2: Causal graph construction via meta-algorithm
`buildCrossDomainAtlas` extracts numeric features from each domain's data items, converting any JSON field resolvable to a float into a `CausalNode.values` array. For each domain with at least 2 nodes, the function runs all three structural learning algorithms and selects the result with the lowest BIC score. The PC algorithm starts with a complete undirected graph and iteratively removes edges where the partial correlation is not significant at `alpha` (Fisher's z-test, Abramowitz-Stegun CDF approximation). GES applies forward greedy edge addition followed by backward deletion, scoring each candidate DAG by `BIC = n*ln(RSS/n) + k*ln(n)`. NOTEARS builds the adjacency matrix using `trace(exp(W ∘ W)) - d = 0` as the continuous acyclicity constraint, approximated via an 8th-order Taylor series. Topological sort (Kahn's algorithm) validates DAG acyclicity before each graph is returned.
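The BIC-based meta-selection step follows directly from the formula above. A minimal sketch, where the candidate names and the `(name, rss, num_edges)` tuples are illustrative rather than the server's data structures:

```python
import math

def bic_score(rss, n, k):
    # BIC = n*ln(RSS/n) + k*ln(n); lower is better
    return n * math.log(rss / n) + k * math.log(n)

def pick_best(candidates, n):
    # candidates: list of (name, rss, num_edges) for the PC / GES / NOTEARS fits
    return min(candidates, key=lambda c: bic_score(c[1], n, c[2]))[0]
```

A graph with more edges (larger k) must buy its extra parameters with a large enough drop in residual sum of squares, which is how the penalty term keeps denser fits honest.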
Phase 3: Cross-domain graph merging and transfer entropy
Per-domain graphs are merged into a colimit graph by unioning all nodes and edges. Cross-domain edges are inferred by computing Pearson correlations between nodes from different domain graphs using their raw value arrays — pairs above a threshold receive a directed edge from the domain with higher data density to the lower. The transfer entropy matrix is computed for all domain pairs using kernel density estimation of time-lagged mutual information, giving a separate signal for directional information flow that does not depend on the structural equations.
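A toy version of the merge step: union the per-domain nodes and edges, then add a cross-domain edge wherever two nodes from different domains correlate above a threshold. The graph representation, direction heuristics, and the transfer entropy pass are all omitted or simplified here.

```python
from itertools import product

def pearson(xs, ys):
    # Plain Pearson correlation on two equal-length series
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

def merge_graphs(domain_graphs, threshold=0.5):
    # domain_graphs: {domain: {"nodes": {id: values}, "edges": [...]}}
    nodes = {nid: v for g in domain_graphs.values() for nid, v in g["nodes"].items()}
    edges = [e for g in domain_graphs.values() for e in g["edges"]]
    for (d1, g1), (d2, g2) in product(domain_graphs.items(), repeat=2):
        if d1 >= d2:  # visit each unordered domain pair once
            continue
        for a, b in product(g1["nodes"], g2["nodes"]):
            r = pearson(nodes[a], nodes[b])
            if abs(r) > threshold:
                edges.append((a, b, r))
    return {"nodes": nodes, "edges": edges}
```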
Phase 4: Causal inference computation
Depending on the tool called, the merged graph feeds into one of several inference procedures. doCalculus searches the graph for a back-door adjustment set (a set of nodes that blocks all back-door paths from treatment to outcome) and computes the adjustment formula; if none exists, it tries front-door paths, then instrumental variables. scmCounterfactual assigns each node a structural equation of the form X_j = sum(beta_ij * X_i) + U_j, infers the noise terms U by regression residualization (abduction), sets the intervention variable to its hypothetical value (action), and propagates in topological order (prediction). transportCausalEffect compares the Markov boundaries of treatment and outcome nodes across source and target graphs to evaluate s-admissibility. bayesianPersuasion applies Kamenica-Gentzkow concavification, iteratively computing the concave closure (the smallest concave function lying above the sender's value function) on the probability simplex.
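The abduction-action-prediction cycle for a linear SCM can be illustrated on a toy chain; the variable names, coefficients, and data here are invented, and the sketch assumes nodes are supplied in topological order:

```python
def counterfactual(parents, beta, observed, intervene_on, new_value):
    """Abduction-action-prediction on a linear SCM X_j = sum(beta_ij*X_i) + U_j.
    `parents[j]` lists j's parents; keys must be in topological order."""
    # Abduction: recover each node's exogenous noise from the observed values.
    noise = {j: observed[j] - sum(beta[(i, j)] * observed[i] for i in parents[j])
             for j in parents}
    # Action: override the intervened node. Prediction: propagate downstream.
    cf = {}
    for j in parents:
        if j == intervene_on:
            cf[j] = new_value
        else:
            cf[j] = sum(beta[(i, j)] * cf[i] for i in parents[j]) + noise[j]
    return cf

# Toy chain X -> Y -> Z.
parents = {"X": [], "Y": ["X"], "Z": ["Y"]}
beta = {("X", "Y"): 2.0, ("Y", "Z"): 0.5}
observed = {"X": 1.0, "Y": 2.5, "Z": 1.5}   # implies U_Y = 0.5, U_Z = 0.25
cf = counterfactual(parents, beta, observed, intervene_on="X", new_value=3.0)
print(cf)  # Y becomes 2*3 + 0.5 = 6.5; Z becomes 0.5*6.5 + 0.25 = 3.5
```

Keeping the abduced noise terms fixed while replaying the intervention is what distinguishes a counterfactual from a fresh interventional prediction.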
Tips for best results
- **Narrow the domain list for focused queries.** All-domain runs query all 18 actors, which increases both latency and cost. For a question about economic policy effects on labor markets, `["economics", "policy", "labor"]` is sufficient and produces a cleaner causal graph.
- **Use `algorithm: "auto"` unless you have a prior reason not to.** The meta-algorithm compares PC, GES, and NOTEARS by BIC and picks the best fit for your data's density. Only override to a specific algorithm if you need reproducibility across runs or have domain knowledge about the expected graph structure.
- **Run `detect_confounders` before `estimate_interventional_effect`.** Knowing which confounders exist and whether they are adjustable tells you whether the back-door criterion is applicable before running the full treatment effect estimation. If a confounder is non-adjustable and no instrument is available, the identification may fall back to front-door or be unreliable.
- **Use `validate_causal_model` to assess statistical confidence.** A `faithfulnessHolds: false` result does not invalidate a graph; it means the data contains extra conditional independencies beyond what the graph implies, which is common with small samples. Examine the individual `testResults` p-values rather than relying solely on `overallValid`.
- **For counterfactual analysis, set `interventionValue` to a realistic hypothetical.** Values far outside the observed data range will produce extrapolated noise estimates with low probability scores. Counterfactuals within the 10th-90th percentile of the treatment variable's observed distribution are most reliable.
- **Combine `optimize_causal_experiment` with `discover_cross_domain_causes` for active causal learning.** First run discovery to get the graph structure, then pass the target variable to the experiment optimizer to find which variable to measure or intervene on in the next data collection round.
- **Increase `maxResults` for higher-quality graphs.** Each upstream actor returns up to `maxResults` items, and more observations per node improve the statistical power of the partial correlation tests in PC and the BIC scoring in GES. The default of 15-20 is conservative; set it to 50 for research-grade results.
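A focused tool call combining several of these tips might look like the following; the field names are illustrative assumptions, so check the actor's input schema for the exact parameter names:

```json
{
  "tool": "discover_cross_domain_causes",
  "arguments": {
    "query": "interest rate effects on employment",
    "domains": ["economics", "policy", "labor"],
    "algorithm": "auto",
    "maxResults": 50
  }
}
```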
Combine with other Apify actors
| Actor | How to combine |
|---|---|
| Company Deep Research | Feed company intelligence reports as domain data input; use estimate_interventional_effect to identify which business factors causally drive financial outcomes |
| Website Tech Stack Detector | Build a technology adoption dataset across companies, then use discover_cross_domain_causes to find causal links between tech choices and business performance signals |
| Trustpilot Review Analyzer | Extract sentiment scores over time and use compute_counterfactual to estimate what review scores would have been absent a specific product change |
| B2B Lead Qualifier | Score leads with 30+ signals, then pipe signals into detect_confounders to identify which lead-scoring features are genuinely predictive versus confounded by company size |
| SERP Rank Tracker | Track ranking changes over time and use transport_causal_effect to test whether SEO interventions that worked in one market transport to another |
| Regulatory Change Tracker | Use new regulatory data as the policy domain input for estimate_interventional_effect to estimate ATE of specific regulatory changes on industry metrics |
Limitations
- **Causal discovery from observational data has fundamental identifiability limits.** When two candidate structures are Markov equivalent, conditional independence tests cannot determine the direction of an edge without external assumptions or experimental data. The DAG returned represents one member of a Markov equivalence class.
- **Faithfulness and Markov assumptions may not hold.** Real-world data, especially from heterogeneous public APIs, may violate the faithfulness assumption (extra independencies due to canceling paths) or have non-stationary distributions. Validate results with `validate_causal_model` before acting on them.
- **Numeric feature extraction from JSON is heuristic.** Upstream actors return structured JSON with many string and categorical fields. The feature extraction converts numeric fields only. Domains with predominantly textual data (such as academic abstracts from PubMed) will produce sparser causal graphs.
- **Transfer entropy measures information flow, not necessarily causation.** High transfer entropy between two domains indicates temporal information transmission but does not establish the direction of structural causation. It is a complementary signal, not a substitute for structural graph-based identification.
- **Upstream actor failures return empty arrays, not errors.** If a subset of the 18 source actors fail or time out, the causal graph is built on partial data without warning. Check `graphNodes` in the response; very low counts indicate data collection issues for that domain.
- **Variable name matching uses substring search.** The `findClosestNode` function matches by substring if no exact match is found. Treatment and outcome variable names that are too generic (e.g., "rate") may match unexpected nodes. Use domain-prefixed or more specific names (e.g., "employment_rate", "cpi_inflation") for reliable matching.
- **BIC-based algorithm selection is not guaranteed optimal.** BIC penalizes model complexity, but the best-BIC algorithm is not always the causally correct one. For small samples (fewer than 30 observations per node), consider specifying `algorithm: "pc"` explicitly, as GES and NOTEARS can overfit.
- **Bayesian persuasion simulation uses abstract payoff matrices.** The `simulate_causal_agents` tool derives prior beliefs from data density across domains, but the sender/receiver payoffs are modeled abstractly. Results are most useful for directional qualitative analysis of expert disagreement, not precise payoff quantification.
Integrations
- **Zapier** — trigger a causal discovery run when new data is added to a spreadsheet or CRM and receive structured results in downstream workflow steps
- **Make** — build automated research pipelines that call `estimate_interventional_effect` on a schedule and pipe ATE results to Slack or a Google Sheet
- **Google Sheets** — export counterfactual and treatment effect results directly into spreadsheets for stakeholder review without touching the API
- **Apify API** — call the MCP server programmatically from any Python, JavaScript, or HTTP client; results are returned as structured JSON in the MCP protocol envelope
- **Webhooks** — receive a POST notification when a long-running causal discovery job completes, with the full result payload
- **LangChain / LlamaIndex** — expose the MCP tools directly to LLM orchestration frameworks; agents can call `discover_cross_domain_causes` and then reason over the returned causal graph structure in a multi-step research loop
Troubleshooting
**Graph has very few nodes despite a broad query.** Most upstream actors returned empty arrays, either due to query terms that did not match their search APIs or temporary rate limiting. Try a broader query term and specify fewer domains. Check that each domain's typical data format contains numeric fields; text-heavy domains like academia produce sparse graphs when PubMed abstracts are the primary data source.
**`averageTreatmentEffect` is unexpectedly large or small.** The ATE is estimated from the linear structural equations fitted to the extracted numeric features. If the upstream data has a very narrow numeric range or high variance, the regression coefficients will reflect that scale. Interpret the ATE as a relative directional effect within the data distribution, not an absolute real-world unit effect.
**`transportable: false` even when domains seem similar.** Transportability fails when the Markov boundary of the treatment node in the source graph contains variables that are selection nodes in the target domain, meaning the populations differ on precisely those variables. Try running `detect_confounders` on the target domain separately and compare the confounder list to understand why s-admissibility is not satisfied.
**`convergence: false` in `simulate_causal_agents`.** The Bayesian persuasion game did not reach a stable equilibrium within the iteration limit. This can happen when the prior beliefs are nearly uniform and the payoff matrices have low contrast. Try setting `numStates` to 2 for a binary causal claim, which produces a simpler concavification problem that converges more reliably.
**Tool returns a spending limit error.** You have reached the per-run spending limit configured in your Apify account or run settings. Increase the `maxTotalChargeUsd` parameter when starting the actor, or set a higher limit in the Apify console under your account's billing settings.
Responsible use
- This server queries publicly available data sources: US federal databases, WHO, NOAA, OpenAQ, NVD, FEMA, FDA, and academic APIs. It does not scrape private or restricted sources.
- Causal claims derived from observational data should be validated against domain expertise before informing policy or clinical decisions.
- Users are responsible for ensuring that downstream use of causal findings complies with applicable regulations in their jurisdiction.
- Do not use causal inference results as the sole basis for high-stakes decisions affecting individuals without experimental validation.
- For guidance on responsible AI-assisted research, see the Apify platform guidelines.
FAQ
What is cross-domain causal discovery and why does it require 18 data sources?
Causal discovery within a single domain misses relationships that span institutional boundaries. Economic shocks cause health outcomes; climate events cause labor disruptions; cybersecurity incidents cause regulatory changes. Identifying these cross-domain causal pathways requires simultaneous data from economics, health, environment, and other domains. The 18 source actors cover 9 domain groups, giving the meta-algorithm enough heterogeneous data to build a credible cross-domain causal atlas.
Which causal discovery algorithm should I use — PC, GES, or NOTEARS?
Use algorithm: "auto" (the default) to let the server select by BIC score. If you need guidance: PC is best for sparse graphs with strong domain knowledge about which variables can be causally connected; GES handles moderate-density graphs and is computationally faster than NOTEARS for larger node sets; NOTEARS is best when you expect a dense continuous structure and want gradient-based optimization rather than combinatorial search.
Can this server prove causation?
No causal discovery algorithm running on observational data can prove causation. The tools identify candidate causal structures that are statistically consistent with the data under the faithfulness and Markov assumptions. True causal confirmation requires experimental intervention. Use validate_causal_model to check statistical support and optimize_causal_experiment to design a follow-up experiment that would confirm or refute the discovered structure.
How is estimate_interventional_effect different from a regression coefficient?
A regression coefficient measures association conditional on included variables. estimate_interventional_effect uses do-calculus to identify P(Y|do(X)) — the distribution of Y under a hypothetical intervention that sets X to a value, rather than merely observing X at that value. The critical difference is that do-calculus adjusts for confounders using the graph structure, not just the variables included in the regression, and can identify effects even when back-door blocking is impossible (via front-door or IV criteria).
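The difference can be demonstrated on synthetic confounded data: a naive regression slope absorbs the confounder's influence, while back-door adjustment (sketched here via residualization on the confounder) recovers the true null effect. This is a self-contained illustration with invented data, not the server's implementation:

```python
import random

random.seed(0)
# Confounded system: Z causes both X and Y; X has NO direct effect on Y.
data = []
for _ in range(20000):
    z = random.gauss(0, 1)
    x = z + random.gauss(0, 1)
    y = 2 * z + random.gauss(0, 1)   # true causal effect of X on Y is 0
    data.append((z, x, y))

def ols_slope(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var = sum((a - mx) ** 2 for a in xs)
    return cov / var

# Naive association: regress Y on X, ignoring Z.
naive = ols_slope([x for _, x, _ in data], [y for _, _, y in data])
# Back-door adjustment: residualize X and Y on Z, then regress the residuals.
gzx = ols_slope([z for z, _, _ in data], [x for _, x, _ in data])
gzy = ols_slope([z for z, _, _ in data], [y for _, _, y in data])
rx = [x - gzx * z for z, x, _ in data]
ry = [y - gzy * z for z, _, y in data]
adjusted = ols_slope(rx, ry)
print(round(naive, 2), round(adjusted, 2))  # naive ~ 1.0 (spurious), adjusted ~ 0.0
```

The naive slope of about 1.0 is pure confounding; adjusting for Z via the back-door set {Z} recovers the true zero effect.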
How accurate are the counterfactual estimates?
Accuracy depends on the quality of numeric data extracted from upstream actors, on how well the linear structural equations approximate the true mechanisms, and on whether the faithfulness assumption holds. The probability field in compute_counterfactual reflects the plausibility of the inferred exogenous noise scenario. Values above 0.7 indicate the counterfactual is consistent with the observed data; below 0.5 suggests the hypothetical intervention is far from the training distribution.
How long does a typical tool call take?
A full all-domain call (discover_cross_domain_causes with all 9 domains) typically takes 2-4 minutes because it fires up to 18 upstream actors in parallel, each with a 180-second timeout. Single-domain calls or calls with 2-3 domains typically complete in 30-90 seconds. The MCP server remains in standby mode between calls, so there is no cold-start overhead for the server itself.
Can I schedule causal discovery to run periodically to detect structural shifts?
Yes. Use Apify's built-in scheduler to run the MCP server actor on a daily, weekly, or custom interval. Compare the crossDomainEdges from successive runs to detect when causal relationships between domains change — for example, detecting when the economics-health causal link strengthens during a recession.
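A minimal sketch of a run-over-run comparison; the `(source, target, strength)` edge shape and field values are assumptions, so adapt them to the actual crossDomainEdges schema in your responses:

```python
def edge_diff(prev_edges, curr_edges, tolerance=0.2):
    """Compare crossDomainEdges from two scheduled runs: report edges
    that appeared, disappeared, or shifted in strength beyond `tolerance`."""
    prev = {(s, t): w for s, t, w in prev_edges}
    curr = {(s, t): w for s, t, w in curr_edges}
    added = sorted(set(curr) - set(prev))
    removed = sorted(set(prev) - set(curr))
    shifted = {k: (prev[k], curr[k]) for k in prev.keys() & curr.keys()
               if abs(prev[k] - curr[k]) > tolerance}
    return added, removed, shifted

# Hypothetical edges from two weekly runs.
last_week = [("economics", "health", 0.31), ("climate", "labor", 0.55)]
this_week = [("economics", "health", 0.72), ("security", "policy", 0.44)]
print(edge_diff(last_week, this_week))
```

Persisting each run's edges (e.g. to a named Apify dataset) and diffing on schedule turns the discovery tool into a structural-shift monitor.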
Is it legal to collect and analyze data from these 18 sources?
All 18 upstream data sources are publicly available APIs from US federal agencies (FRED, BLS, Congress, SEC, FEMA, FDA, CPSC, NVD), international organizations (WHO, IMF), research databases (PubMed, OpenAlex), and environmental monitoring networks (NOAA, OpenAQ). Public API usage complies with each source's terms of service. For web scraping legality generally, see Apify's guide.
How is this different from existing causal inference libraries like DoWhy or CausalML?
DoWhy and CausalML are Python libraries that require you to supply your own data. Causal Panopticon handles data collection, feature extraction, and graph construction end-to-end, and exposes the entire pipeline as an MCP tool that any AI agent can call. It is designed for agentic workflows where the AI needs to autonomously collect evidence, build a causal model, and return actionable inference results, not for researchers who already have a clean dataset and want fine-grained programmatic control.
Can I use this with Claude, GPT-4, or other AI models?
Yes. Any AI model with MCP client support can connect to this server at https://causal-panopticon-mcp.apify.actor/mcp. Claude Desktop and Cursor have native MCP support. For other models, use an MCP-compatible client library or call the HTTP endpoint directly from your agent's tool-use framework.
What happens if some upstream actors fail during a run?
The runActor function catches all errors and returns an empty array, so the run continues with partial data. The affected domain will contribute no nodes to the causal graph. Monitor graphNodes in the response — if it is much lower than expected given the number of domains specified, one or more upstream actors likely failed. The Apify run log will show which actors returned errors.
Does the server retain my data between calls?
No. The MCP server processes each tool call independently and does not persist causal graphs or data between calls. The Apify platform stores run logs and output datasets per-run in your account, which you can access via the Apify console or API for debugging purposes.
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom causal inference workflows, multi-domain research pipelines, or enterprise integrations, reach out through the Apify platform.
How it works
Configure
Set your parameters in the Apify Console or pass them via API.
Run
Click Start, trigger via API, webhook, or set up a schedule.
Get results
Download as JSON, CSV, or Excel. Integrate with 1,000+ apps.