
Causal Panopticon MCP Server

Causal Panopticon is a cross-domain causal discovery and inference engine for AI agents, exposed via the Model Context Protocol. It orchestrates **18 heterogeneous data sources** across economics, health, environment, security, policy, finance, academia, and labor — applying 8 peer-reviewed causal algorithms to discover what actually causes what, not just what correlates.


Pricing

Pay Per Event model. You only pay for what you use.

| Event | Description | Price |
|---|---|---|
| discover-cross-domain-causes | PC/GES/NOTEARS meta-algorithm causal discovery | $0.12 |
| estimate-interventional-effect | Do-calculus back-door/front-door adjustment | $0.10 |
| compute-counterfactual | SCM abduction-action-prediction | $0.10 |
| transport-causal-effect | Selection diagram transportability | $0.10 |
| detect-confounders | Conditional independence confounder detection | $0.08 |
| simulate-causal-agents | Bayesian persuasion concavification game | $0.10 |
| optimize-causal-experiment | Causal Bayesian optimization EI acquisition | $0.10 |
| validate-causal-model | DAG Markov faithfulness BIC validation | $0.08 |

Example: 100 events = $12.00 · 1,000 events = $120.00

Connect to your AI agent

Add this MCP server to Claude Desktop, Cursor, Windsurf, or any MCP-compatible client.

MCP Endpoint
https://ryanclinton--causal-panopticon-mcp.apify.actor/mcp
Claude Desktop Config
{
  "mcpServers": {
    "causal-panopticon-mcp": {
      "url": "https://ryanclinton--causal-panopticon-mcp.apify.actor/mcp"
    }
  }
}

Documentation

Built for researchers, policy analysts, and AI agents that need to move beyond correlation into causal structure. Each tool call fires up to 18 data actors in parallel, assembles a causal graph, and returns structured inference results — average treatment effects, counterfactual values, transportability verdicts, and validated DAG structures — over a single MCP connection.

What data sources can you access?

| Data point | Source | Coverage |
|---|---|---|
| 📊 Economic time series | FRED Economic Data | 800K+ US series |
| 👷 Labor statistics | BLS Economic Data | US employment, wages, prices |
| 🌐 Global macroeconomic indicators | IMF Data | 190 countries |
| 🏥 Health statistics | WHO GHO Data | Global health metrics |
| 🧪 Clinical research | ClinicalTrials.gov | Registered trials worldwide |
| 🔬 Biomedical literature | PubMed Search | 35M+ articles |
| 🌦️ Weather and climate data | NOAA Weather | US and global observations |
| 💨 Air quality measurements | OpenAQ | Global monitoring stations |
| 🔐 Cyber vulnerabilities | NVD CVE Search | All published CVEs |
| 📜 US legislation | Congress Bill Search | Active and historical bills |
| 📋 US regulatory filings | Federal Register Search | Federal rules and notices |
| 🏢 Corporate SEC filings | SEC EDGAR Search | All public company filings |
| 📈 Market and equity data | Finnhub Financial Data | Global equities |
| 🎓 Academic publications | OpenAlex | 250M+ scholarly works |
| 🌪️ Disaster declarations | FEMA Disaster Search | US disaster records |
| ⚠️ Consumer product safety | CPSC Recall Search | Product recall alerts |
| 💊 Drug adverse events | FDA Drug Event Search | FAERS database |
| 💼 Job market trends | Job Market Intelligence | Employment signals |

Why use Causal Panopticon MCP Server?

Correlation-based analysis is fast but dangerous for decisions. A policy team that sees a correlation between air quality and hospital admissions cannot determine whether pollution causes illness, whether illness causes people to move to polluted areas, or whether poverty confounds both. Getting that distinction wrong means wasted interventions and missed opportunities.

Manual causal analysis — collecting data from 18 separate sources, applying conditional independence tests, building DAGs, checking back-door criteria — takes weeks for a trained economist. This MCP server does it in a single AI agent tool call.

Connecting via the MCP protocol means your AI agent in Claude Desktop, Cursor, or any MCP-compatible client gains causal reasoning capabilities without writing a line of data collection code.

  • Scheduling — run causal discovery on a recurring schedule to detect structural shifts in real-world causal relationships over time
  • API access — trigger runs from Python, JavaScript, or any HTTP client via the Apify platform API
  • Proxy rotation — data collection across 18 sources uses Apify's built-in proxy infrastructure to avoid rate limits
  • Monitoring — get Slack or email alerts when tool calls fail or return unexpected results
  • Integrations — connect to Zapier, Make, Google Sheets, or any webhook-compatible workflow

MCP tools

| Tool | Algorithm | Best for | Cost per call |
|---|---|---|---|
| discover_cross_domain_causes | PC / GES / NOTEARS meta-algorithm + transfer entropy | Finding causal structure across domains | $0.04 |
| estimate_interventional_effect | Do-calculus (three rules), back-door, front-door, IV | Treatment effect estimation, policy evaluation | $0.04 |
| compute_counterfactual | Pearl's SCM: abduction → action → prediction | Retrospective attribution, what-if analysis | $0.04 |
| transport_causal_effect | Bareinboim-Pearl transportability, s-admissibility | Generalizing findings across populations | $0.04 |
| detect_confounders | Graph structure + partial correlation analysis | Identifying threats to causal inference | $0.04 |
| simulate_causal_agents | Bayesian persuasion, Kamenica-Gentzkow concavification | Expert disagreement modeling, policy persuasion | $0.04 |
| optimize_causal_experiment | Causal Bayesian optimization, Expected Improvement | Experiment design, active learning | $0.04 |
| validate_causal_model | DAG acyclicity, Markov compatibility, faithfulness, BIC | Model checking, algorithm comparison | $0.04 |

Features

  • Meta-algorithm selection — auto mode runs all three structural learning algorithms (PC, GES, NOTEARS) and selects the result with the best BIC score, so you get the best-fitting DAG for your data without manual tuning
  • PC algorithm — constraint-based causal discovery using Pearson partial correlation tests with Fisher's z-test (Abramowitz-Stegun approximation), controllable significance level via the alpha parameter
  • GES algorithm — greedy equivalence search with forward and backward BIC scoring phases; well-suited for denser causal graphs where the PC skeleton may over-prune
  • NOTEARS algorithm — continuous optimization with trace-exponential acyclicity constraint (Taylor series approximation to order 8); converts the combinatorial DAG search into a differentiable problem
  • Do-calculus identification — implements all three rules of Pearl's do-calculus; applies back-door criterion first, front-door criterion second, and instrumental variable identification third, returning which criterion identified the effect
  • SCM counterfactuals — Pearl's three-step procedure: abduction (infer exogenous noise U via residualization), action (modify structural equations for the hypothetical), prediction (propagate in topological order through the modified SCM)
  • Bareinboim-Pearl transportability — evaluates s-admissibility using selection diagrams; reports whether a causal effect identified in a source domain can be validly transported to a target domain and provides the adjusted estimate
  • Transfer entropy matrix — kernel density estimation of time-lagged mutual information between all domain pairs; captures information flow that directed graphs may miss
  • Colimit graph construction — merges per-domain causal graphs into a unified causal atlas using the category-theoretic colimit over directed graphs, preserving domain-specific edges as cross-domain causal claims
  • Bayesian persuasion simulation — models 3,000 domain expert agents in a Kamenica-Gentzkow sender-receiver game; concavifies the sender's value function over the simplex of posterior beliefs to find the optimal information disclosure policy
  • Confounder detection — reports each confounder's variable name, the pair it confounds, confounding strength (partial correlation magnitude), whether it is adjustable in the graph, and whether an instrumental variable exists for non-adjustable confounders
  • Experiment optimization — causal Bayesian optimization selects the intervention variable that maximizes expected information gain via variance reduction; applies Expected Improvement acquisition function and outputs suggested sample size via power analysis
  • Model validation — four independent validation criteria: DAG acyclicity (Kahn topological sort), Markov compatibility (implied independencies hold in data), faithfulness (no extra independencies), and BIC goodness of fit with individual p-values per test
  • Parallel data collection — all upstream actors fire simultaneously via runActorsParallel, with 180-second per-actor timeout and graceful empty-array fallback on failure
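
The trace-exponential acyclicity constraint described in the NOTEARS feature above can be sketched in a few lines of pure Python. This is an illustrative re-implementation under stated assumptions (small dense matrices, 8th-order Taylor truncation); the server's actual code may differ in detail:

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def notears_acyclicity(w, order=8):
    """h(W) = tr(exp(W ∘ W)) - d, with exp() truncated to an
    8th-order Taylor series. h(W) == 0 exactly when the weighted
    adjacency matrix W encodes a DAG (illustrative sketch)."""
    d = len(w)
    m = [[w[i][j] ** 2 for j in range(d)] for i in range(d)]  # Hadamard square W ∘ W
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # M^0 = I
    trace = float(d)  # tr(I)
    for k in range(1, order + 1):
        power = mat_mul(power, m)
        trace += sum(power[i][i] for i in range(d)) / math.factorial(k)
    return trace - d

chain = [[0.0, 1.0], [0.0, 0.0]]  # X -> Y: a DAG, so h(W) is 0
cycle = [[0.0, 1.0], [1.0, 0.0]]  # X <-> Y: cyclic, so h(W) > 0
```

Because the constraint is smooth in W, DAG search becomes a continuous optimization problem rather than a combinatorial one, which is the point of the NOTEARS formulation.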

Use cases for cross-domain causal discovery

Policy impact research

Policy analysts and think tanks evaluating legislative proposals need to know whether a law caused an outcome, not merely whether the two are correlated. Use estimate_interventional_effect with sourceDomain: "policy" and targetDomain: "economics" to estimate the average treatment effect of a legislative change on employment or GDP — with identification criterion reported so you know whether back-door, front-door, or IV was used.

Cross-domain causal hypothesis generation

Research teams and AI agents building knowledge graphs often work within single data silos. discover_cross_domain_causes pulls from all 9 domain groups simultaneously, builds per-domain DAGs via the meta-algorithm, and merges them into a colimit graph. The result surfaces cross-domain causal edges — for example, a NOAA climate variable influencing a FRED unemployment series — that no single-domain analysis would find.

Clinical and public health attribution

Epidemiologists and public health researchers need counterfactual estimates to answer attribution questions. compute_counterfactual implements Pearl's abduction-action-prediction procedure using PubMed and clinical trials data: "What would the mortality rate have been if the intervention had not been introduced?" The exogenous noise terms are inferred from observed residuals, giving a statistically grounded retrospective estimate.

Generalizing findings across populations

Development economists and international health researchers routinely need to apply findings from one country's data to another. transport_causal_effect evaluates Bareinboim-Pearl s-admissibility using selection diagrams derived from the source and target domain graphs. If the effect is transportable, it returns the reweighted adjusted estimate; if not, it reports why transportability fails.

AI agent experiment planning

AI agents tasked with causal learning pipelines — for example, an agent iterating on a research question over multiple runs — can use optimize_causal_experiment to decide which variable to intervene on next. The Expected Improvement acquisition function selects the variable whose intervention maximally reduces posterior variance over the target, with power analysis output for sample size planning.

Causal model validation before downstream use

Before using a discovered DAG to make decisions or pass to a downstream analysis, use validate_causal_model to verify it satisfies DAG acyclicity, Markov compatibility, and faithfulness. Individual test p-values are returned per criterion, so you can report the statistical basis for accepting or rejecting the structure.

How to connect this MCP server

Claude Desktop

Add the following to your claude_desktop_config.json:

{
  "mcpServers": {
    "causal-panopticon": {
      "url": "https://causal-panopticon-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}

Cursor

Add the following to your Cursor MCP settings:

{
  "mcpServers": {
    "causal-panopticon": {
      "url": "https://causal-panopticon-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}

Any MCP-compatible client

The server endpoint is: https://causal-panopticon-mcp.apify.actor/mcp

The server uses the Streamable HTTP transport and supports the standard MCP operations tools/list and tools/call.

Input parameters

This MCP server has no actor-level input. All parameters are passed directly to each tool when called by the AI agent.

Per-tool parameters

| Tool | Parameter | Type | Required | Default | Description |
|---|---|---|---|---|---|
| discover_cross_domain_causes | query | string | Yes | | Search query sent to all domain actors (e.g. "inflation healthcare employment") |
| discover_cross_domain_causes | domains | string[] | No | all 9 | Subset of: economics, health, environment, security, policy, finance, academia, disaster, labor |
| discover_cross_domain_causes | algorithm | enum | No | auto | auto, pc, ges, or notears |
| discover_cross_domain_causes | alpha | number | No | 0.05 | Significance level for PC conditional independence tests |
| discover_cross_domain_causes | maxResults | number | No | 20 | Max results fetched per upstream actor |
| estimate_interventional_effect | query | string | Yes | | Data collection query |
| estimate_interventional_effect | treatment | string | Yes | | Treatment/intervention variable name |
| estimate_interventional_effect | outcome | string | Yes | | Outcome variable name |
| estimate_interventional_effect | domains | string[] | No | all 9 | Domains for data collection |
| estimate_interventional_effect | maxResults | number | No | 15 | Max results per actor |
| compute_counterfactual | query | string | Yes | | Data collection query |
| compute_counterfactual | interventionVariable | string | Yes | | Variable to hypothetically change |
| compute_counterfactual | interventionValue | number | Yes | | Hypothetical value for the intervention |
| compute_counterfactual | queryVariable | string | Yes | | Variable whose counterfactual value to compute |
| compute_counterfactual | domains | string[] | No | all 9 | Domains for data collection |
| compute_counterfactual | maxResults | number | No | 15 | Max results per actor |
| transport_causal_effect | query | string | Yes | | Query common to both domains |
| transport_causal_effect | sourceDomain | string | Yes | | Domain where the effect was identified |
| transport_causal_effect | targetDomain | string | Yes | | Domain to transport the effect to |
| transport_causal_effect | treatment | string | Yes | | Treatment variable |
| transport_causal_effect | outcome | string | Yes | | Outcome variable |
| transport_causal_effect | selectionVariables | string[] | No | | Variables that differ between domains |
| transport_causal_effect | maxResults | number | No | 15 | Max results per actor |
| detect_confounders | query | string | Yes | | Data collection query |
| detect_confounders | domains | string[] | No | all 9 | Domains for data collection |
| detect_confounders | maxResults | number | No | 15 | Max results per actor |
| simulate_causal_agents | query | string | Yes | | Causal claim or hypothesis to evaluate |
| simulate_causal_agents | numStates | number | No | 3 | Number of possible causal states (2–10) |
| simulate_causal_agents | numAgents | number | No | 3000 | Number of domain expert agents to simulate |
| simulate_causal_agents | domains | string[] | No | all 9 | Domains for data collection |
| simulate_causal_agents | maxResults | number | No | 10 | Max results per actor |
| optimize_causal_experiment | query | string | Yes | | Research question or target variable query |
| optimize_causal_experiment | targetVariable | string | Yes | | Variable to learn about |
| optimize_causal_experiment | budget | number | No | 100 | Experiment budget (abstract units for cost estimation) |
| optimize_causal_experiment | domains | string[] | No | all 9 | Domains for data collection |
| optimize_causal_experiment | maxResults | number | No | 15 | Max results per actor |
| validate_causal_model | query | string | Yes | | Data collection query |
| validate_causal_model | domains | string[] | No | all 9 | Domains for data collection |
| validate_causal_model | algorithm | enum | No | auto | Algorithm to validate: auto, pc, ges, notears |
| validate_causal_model | maxResults | number | No | 15 | Max results per actor |

Usage examples

Discover causal structure across economics and health:

{
  "query": "inflation unemployment healthcare spending",
  "domains": ["economics", "health", "labor"],
  "algorithm": "auto",
  "alpha": 0.05,
  "maxResults": 20
}

Estimate policy treatment effect:

{
  "query": "minimum wage employment",
  "treatment": "minimum_wage",
  "outcome": "employment_rate",
  "domains": ["economics", "policy", "labor"],
  "maxResults": 15
}

Compute counterfactual for health intervention:

{
  "query": "air quality respiratory hospital admissions",
  "interventionVariable": "pm25_concentration",
  "interventionValue": 12.0,
  "queryVariable": "hospital_admissions",
  "domains": ["environment", "health"],
  "maxResults": 15
}

Input tips

  • Narrow your domains — specify 2-3 focused domains rather than using all 9 to reduce cost and improve signal-to-noise ratio in the causal graph
  • Use descriptive queries — queries like "air quality respiratory hospital admissions" retrieve more relevant data than "health" from each upstream actor, improving the quality of extracted numeric features
  • Match variable names to your data — the findClosestNode function uses substring matching, so treatment/outcome variable names like "employment" or "pm25" will match nodes even without exact ID matches
  • Increase alpha for exploratory work — setting alpha: 0.10 makes the PC algorithm less aggressive in pruning edges, giving a denser skeleton for hypothesis generation
  • Use validate_causal_model after discover_cross_domain_causes — run discovery first, then validation with the same query and domain parameters to get p-values for each model criterion before drawing conclusions
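
The substring matching mentioned in the third tip can be sketched as follows. This is a hypothetical re-implementation for illustration; the server's actual findClosestNode is internal and may use different tie-breaking:

```python
def find_closest_node(name, node_ids):
    """Return the first node ID containing the query as a substring
    (case-insensitive), or None if nothing matches. Hypothetical
    sketch of the substring matching the tips describe."""
    needle = name.lower()
    for node_id in node_ids:
        if needle in node_id.lower():
            return node_id
    return None

# Invented, domain-prefixed node IDs of the kind the outputs show
nodes = ["economics_minimum_wage", "labor_employment_rate", "environment_pm25"]
```

So passing treatment: "employment" would resolve to "labor_employment_rate" even though the exact node ID is never specified in the tool call.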

Output examples

discover_cross_domain_causes output

{
  "algorithm": "NOTEARS",
  "totalNodes": 24,
  "totalEdges": 31,
  "domains": ["economics", "health", "environment", "labor"],
  "perDomainGraphs": [
    {
      "algorithm": "GES",
      "nodes": 8,
      "edges": 11,
      "bicScore": -142.7,
      "dagValid": true
    },
    {
      "algorithm": "PC",
      "nodes": 6,
      "edges": 7,
      "bicScore": -98.4,
      "dagValid": true
    }
  ],
  "crossDomainEdges": [
    {
      "from": "economics_cpi_inflation",
      "to": "health_hospital_admissions",
      "weight": 0.64,
      "mechanism": "indirect",
      "confidence": 0.81
    },
    {
      "from": "environment_pm25",
      "to": "labor_absenteeism",
      "weight": 0.52,
      "mechanism": "direct",
      "confidence": 0.74
    }
  ],
  "colimitGraph": {
    "nodes": 24,
    "edges": 31,
    "dagValid": true
  },
  "transferEntropyTopPairs": [
    { "from": "economics", "to": "labor", "entropy": 0.43 },
    { "from": "environment", "to": "health", "entropy": 0.38 }
  ]
}

estimate_interventional_effect output

{
  "treatment": "economics_minimum_wage",
  "outcome": "labor_employment_rate",
  "averageTreatmentEffect": -0.024,
  "confidence": 0.78,
  "identificationCriterion": "backdoor",
  "adjustmentSet": ["economics_gdp_growth", "policy_labor_regulation"],
  "confidenceInterval": [-0.041, -0.007],
  "graphAlgorithm": "GES",
  "graphNodes": 18,
  "graphEdges": 22
}

compute_counterfactual output

{
  "query": "What would hospital admissions be if PM2.5 had been 12?",
  "factualValue": 0.47,
  "counterfactualValue": 0.31,
  "causalEffect": -0.16,
  "probability": 0.83,
  "mechanism": "direct structural pathway via environment_pm25 -> health_respiratory",
  "exogenousNoiseTerms": 6,
  "graphAlgorithm": "NOTEARS",
  "graphNodes": 14
}

transport_causal_effect output

{
  "sourceDomain": "health",
  "targetDomain": "economics",
  "sourceEffect": 0.38,
  "transportable": true,
  "sAdmissible": true,
  "adjustedEffect": 0.29,
  "confidence": 0.71,
  "selectionNodes": ["gdp_per_capita", "healthcare_expenditure"],
  "sourceGraphNodes": 9,
  "targetGraphNodes": 11
}

validate_causal_model output

{
  "overallValid": true,
  "dagConstraintSatisfied": true,
  "markovCompatible": true,
  "faithfulnessHolds": false,
  "bicScore": -214.3,
  "testResults": [
    { "test": "dag_acyclicity", "pValue": 1.0, "passed": true },
    { "test": "markov_compatibility", "pValue": 0.12, "passed": true },
    { "test": "faithfulness", "pValue": 0.03, "passed": false },
    { "test": "bic_goodness_of_fit", "pValue": 0.24, "passed": true }
  ],
  "graphAlgorithm": "PC",
  "graphNodes": 16,
  "graphEdges": 19
}

Output fields

Common fields (all tools)

| Field | Type | Description |
|---|---|---|
| graphAlgorithm | string | Algorithm selected: PC, GES, or NOTEARS |
| graphNodes | number | Number of nodes in the merged causal graph |
| graphEdges | number | Number of directed edges in the merged causal graph |

discover_cross_domain_causes

| Field | Type | Description |
|---|---|---|
| algorithm | string | Algorithm selected by meta-algorithm (best BIC) |
| totalNodes | number | Total nodes in the colimit (merged) graph |
| totalEdges | number | Total edges in the colimit graph |
| domains | string[] | Domains included in this run |
| perDomainGraphs[] | object[] | Per-domain graph summary with algorithm, node/edge counts, BIC score, and DAG validity flag |
| crossDomainEdges[] | object[] | Edges whose endpoints span different domain graphs |
| crossDomainEdges[].from | string | Source node ID (domain-prefixed) |
| crossDomainEdges[].to | string | Target node ID (domain-prefixed) |
| crossDomainEdges[].weight | number | Edge weight (partial correlation magnitude) |
| crossDomainEdges[].mechanism | string | direct or indirect |
| crossDomainEdges[].confidence | number | Confidence in direction [0, 1] |
| colimitGraph.nodes | number | Total nodes in the merged colimit graph |
| colimitGraph.edges | number | Total edges in the merged colimit graph |
| colimitGraph.dagValid | boolean | Whether the colimit graph satisfies acyclicity |
| transferEntropyTopPairs[] | object[] | Top domain pairs by time-lagged transfer entropy |

estimate_interventional_effect

| Field | Type | Description |
|---|---|---|
| treatment | string | Matched treatment node ID in the graph |
| outcome | string | Matched outcome node ID in the graph |
| averageTreatmentEffect | number | ATE: E[Y \| do(X=x)] - E[Y \| do(X=x')] |
| confidence | number | Confidence in the ATE estimate [0, 1] |
| identificationCriterion | string | backdoor, frontdoor, instrumental, or direct |
| adjustmentSet | string[] | Variables used in the adjustment formula |
| confidenceInterval | [number, number] | 95% confidence interval bounds for the ATE |

compute_counterfactual

| Field | Type | Description |
|---|---|---|
| query | string | Human-readable counterfactual question |
| factualValue | number | Observed value of the query variable |
| counterfactualValue | number | Estimated value under the hypothetical intervention |
| causalEffect | number | Counterfactual effect: counterfactual - factual |
| probability | number | Probability of the counterfactual scenario [0, 1] |
| mechanism | string | Description of the causal pathway |
| exogenousNoiseTerms | number | Number of exogenous noise variables inferred (abduction step) |

transport_causal_effect

| Field | Type | Description |
|---|---|---|
| sourceDomain | string | Domain where the effect was identified |
| targetDomain | string | Target domain for transport |
| sourceEffect | number | Original causal effect in the source domain |
| transportable | boolean | Whether the effect can be validly transported |
| sAdmissible | boolean | Whether the selection diagram is s-admissible |
| adjustedEffect | number | Reweighted effect estimate for the target domain |
| selectionNodes | string[] | Selection variables in the transportability diagram |

detect_confounders

| Field | Type | Description |
|---|---|---|
| totalConfounders | number | Total confounders found in the graph |
| confounders[].variable | string | Confounding variable ID |
| confounders[].confounds | [string, string] | The pair of variables being confounded |
| confounders[].strength | number | Confounding strength (partial correlation magnitude) |
| confounders[].adjustable | boolean | Whether the confounder is observed and adjustable |
| confounders[].instrumentAvailable | boolean | Whether an instrumental variable exists |
| confounders[].instrument | string \| null | IV variable name, if available |

simulate_causal_agents

| Field | Type | Description |
|---|---|---|
| agentsSimulated | number | Number of expert agents in the simulation |
| states | number | Number of possible causal states |
| convergence | boolean | Whether the persuasion game reached equilibrium |
| equilibriumBeliefs | object | Final posterior belief distribution over states |
| senderPayoff | number | Optimal sender payoff under the equilibrium signal |
| receiverPayoff | number | Receiver payoff at equilibrium |
| informationRent | number | Information rent extracted by the sender |
| optimalSignals | number | Number of signals in the optimal disclosure policy |
| posteriorCount | number | Number of posteriors in the concavified value function |

optimize_causal_experiment

| Field | Type | Description |
|---|---|---|
| targetVariable | string | Matched target variable ID in the graph |
| optimalIntervention | string | Variable to intervene on for maximum information gain |
| expectedInfoGain | number | Expected reduction in posterior variance |
| acquisitionFunction | number | Expected Improvement acquisition function value |
| suggestedSampleSize | number | Recommended sample size from power analysis |
| estimatedCost | number | Estimated experiment cost in budget units |
| candidateVariables | string[] | All candidate intervention variables considered |

validate_causal_model

| Field | Type | Description |
|---|---|---|
| overallValid | boolean | True if all four criteria pass |
| dagConstraintSatisfied | boolean | Kahn topological sort succeeds (no cycles) |
| markovCompatible | boolean | Implied conditional independencies hold in data |
| faithfulnessHolds | boolean | No extra conditional independencies in data |
| bicScore | number | Bayesian Information Criterion (lower is better) |
| testResults[] | object[] | Individual test results with p-value and pass/fail |
| testResults[].test | string | Test name |
| testResults[].pValue | number | p-value for the test |
| testResults[].passed | boolean | Whether the test passed |

How much does it cost to use Causal Panopticon?

Causal Panopticon uses pay-per-event pricing — you pay $0.04 per tool call. Upstream actor execution costs are billed separately at Apify's standard rates (typically $0.01-0.02 per upstream call). Platform compute is included.

| Scenario | Tool calls | Cost per call | Estimated total |
|---|---|---|---|
| Quick test | 1 | $0.04 | $0.04 |
| Small research session | 10 | $0.04 | $0.40 |
| Full analysis pipeline | 50 | $0.04 | $2.00 |
| Daily automated runs | 200 | $0.04 | $8.00 |
| Enterprise research workflow | 1,000 | $0.04 | $40.00 |

You can set a maximum spending limit per run to control costs. The actor stops when your budget is reached.

The Apify free plan includes $5 of monthly platform credits — enough for approximately 125 tool calls at no cost. Compare this to commercial causal inference platforms that charge $200-2,000/month for access to a single data domain. Causal Panopticon covers 9 domains with no subscription commitment.

How to call this server using the API

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the MCP server actor in standby mode and call a tool via HTTP
import urllib.request
import json

url = "https://causal-panopticon-mcp.apify.actor/mcp"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_APIFY_TOKEN"
}

payload = {
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
        "name": "discover_cross_domain_causes",
        "arguments": {
            "query": "inflation unemployment health outcomes",
            "domains": ["economics", "health", "labor"],
            "algorithm": "auto",
            "maxResults": 20
        }
    },
    "id": 1
}

data = json.dumps(payload).encode("utf-8")
req = urllib.request.Request(url, data=data, headers=headers, method="POST")

with urllib.request.urlopen(req) as response:
    result = json.loads(response.read())
    content = result["result"]["content"][0]["text"]
    atlas = json.loads(content)
    print(f"Algorithm: {atlas['algorithm']}")
    print(f"Total nodes: {atlas['totalNodes']}, edges: {atlas['totalEdges']}")
    for edge in atlas.get("crossDomainEdges", []):
        print(f"  {edge['from']} -> {edge['to']} (confidence: {edge['confidence']:.2f})")

JavaScript

const url = "https://causal-panopticon-mcp.apify.actor/mcp";

const response = await fetch(url, {
    method: "POST",
    headers: {
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
    },
    body: JSON.stringify({
        jsonrpc: "2.0",
        method: "tools/call",
        params: {
            name: "estimate_interventional_effect",
            arguments: {
                query: "minimum wage employment labor market",
                treatment: "minimum_wage",
                outcome: "employment_rate",
                domains: ["economics", "policy", "labor"],
                maxResults: 15
            }
        },
        id: 1
    })
});

const result = await response.json();
const content = JSON.parse(result.result.content[0].text);

console.log(`ATE: ${content.averageTreatmentEffect.toFixed(4)}`);
console.log(`Identification: ${content.identificationCriterion}`);
console.log(`Adjustment set: ${content.adjustmentSet.join(", ")}`);
console.log(`95% CI: [${content.confidenceInterval[0].toFixed(4)}, ${content.confidenceInterval[1].toFixed(4)}]`);

cURL

# Call discover_cross_domain_causes
curl -X POST "https://causal-panopticon-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "discover_cross_domain_causes",
      "arguments": {
        "query": "air quality respiratory disease employment",
        "domains": ["environment", "health", "labor"],
        "algorithm": "auto",
        "maxResults": 20
      }
    },
    "id": 1
  }'

# List all available tools
curl -X POST "https://causal-panopticon-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{"jsonrpc": "2.0", "method": "tools/list", "params": {}, "id": 1}'

How Causal Panopticon works

Phase 1: Parallel data collection across 18 sources

When a tool is called, the server identifies which domains are active and maps each domain to its upstream actor IDs using the DOMAIN_MAP constant. The economics domain maps to fred, bls, and imf actors; health maps to who, clinicalTrials, and pubmed; and so on. All actor calls fire simultaneously via runActorsParallel, which wraps Promise.all with per-actor 180-second timeouts and empty-array fallbacks. Results are grouped by domain into a domainData record before graph construction.
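
The timeout-with-fallback pattern above can be sketched in Python. The server itself is described in terms of Promise.all; this is only an analogue using concurrent.futures, with invented stub actors:

```python
from concurrent.futures import ThreadPoolExecutor

def run_actors_parallel(actors, timeout_s=180):
    """Fire all actor calls at once; any actor that raises or times out
    contributes an empty list instead of failing the whole collection.
    Python analogue of the runActorsParallel behavior described above."""
    with ThreadPoolExecutor(max_workers=max(len(actors), 1)) as pool:
        futures = {name: pool.submit(fn) for name, fn in actors.items()}
        results = {}
        for name, fut in futures.items():
            try:
                results[name] = fut.result(timeout=timeout_s)
            except Exception:
                results[name] = []  # graceful empty-array fallback
        return results

def failing_actor():
    raise RuntimeError("upstream actor error")

# Stub actors standing in for upstream data sources (invented data)
results = run_actors_parallel({
    "fred": lambda: [{"series": "UNRATE", "value": 3.9}],
    "who": failing_actor,
})
```

The key design point is that one slow or broken upstream source degrades the analysis (fewer observations for that domain) rather than aborting the tool call.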

Phase 2: Causal graph construction via meta-algorithm

buildCrossDomainAtlas extracts numeric features from each domain's data items — converting any JSON field resolvable to a float into a CausalNode.values array. For each domain with at least 2 nodes, the function runs all three structural learning algorithms and selects the result with the lowest BIC score. PC algorithm starts with a complete undirected graph and iteratively removes edges where partial correlation is not significant at alpha (Fisher's z-test, Abramowitz-Stegun CDF approximation). GES applies forward greedy edge addition followed by backward deletion, scoring each candidate DAG by BIC = n*ln(RSS/n) + k*ln(n). NOTEARS builds the adjacency using trace(exp(W ∘ W)) - d = 0 as the continuous acyclicity constraint, approximated via 8th-order Taylor series. Topological sort (Kahn's algorithm) validates DAG acyclicity before returning each graph.
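
The PC edge-removal test described above can be sketched as follows. This is a minimal version for a given correlation coefficient, and it uses the stdlib normal CDF via math.erf rather than the Abramowitz-Stegun approximation the server is described as using:

```python
import math

def fisher_z_pvalue(r, n, cond_set_size=0):
    """Two-sided p-value for H0: the (partial) correlation is zero,
    via Fisher's z-transform. Sketch of the PC independence test;
    the normal CDF here uses math.erf, not Abramowitz-Stegun."""
    r = max(min(r, 0.999999), -0.999999)  # guard the log
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - cond_set_size - 3) * abs(z)
    return 2 * (1 - 0.5 * (1 + math.erf(stat / math.sqrt(2))))

def keep_edge(r, n, alpha=0.05):
    """PC keeps an edge only while independence is rejected at alpha."""
    return fisher_z_pvalue(r, n) < alpha
```

With 50 observations, a correlation of 0.8 survives at alpha = 0.05 while a correlation of 0.05 is pruned, which is why the tips recommend larger maxResults (more observations, more statistical power) for denser graphs.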

Phase 3: Cross-domain graph merging and transfer entropy

Per-domain graphs are merged into a colimit graph by unioning all nodes and edges. Cross-domain edges are inferred by computing Pearson correlations between nodes from different domain graphs using their raw value arrays — pairs above a threshold receive a directed edge from the domain with higher data density to the lower. The transfer entropy matrix is computed for all domain pairs using kernel density estimation of time-lagged mutual information, giving a separate signal for directional information flow that does not depend on the structural equations.
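
The merging step amounts to unioning node and edge sets. A simplified sketch (the colimit construction above also adds correlation-inferred cross-domain edges, omitted here; node IDs are invented):

```python
def merge_graphs(domain_graphs):
    """Union the nodes and edges of per-domain graphs into a single
    merged graph, keeping domain-prefixed IDs distinct. Simplified
    sketch of the colimit merge described above."""
    nodes, edges = set(), set()
    for graph in domain_graphs:
        nodes.update(graph["nodes"])
        edges.update(graph["edges"])
    return {"nodes": nodes, "edges": edges}

# Toy per-domain graphs with domain-prefixed node IDs
econ = {"nodes": {"economics_cpi", "economics_gdp"},
        "edges": {("economics_cpi", "economics_gdp")}}
health = {"nodes": {"health_admissions"}, "edges": set()}
atlas = merge_graphs([econ, health])
```

Because node IDs carry their domain prefix, the union never collides, and any edge later added between differently prefixed nodes is by construction a cross-domain causal claim.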

Phase 4: Causal inference computation

Depending on the tool called, the merged graph feeds into one of five inference procedures. doCalculus searches the graph for a back-door adjustment set (a set of nodes that blocks all back-door paths from treatment to outcome) and computes the adjustment formula; if none exists, it tries front-door paths, then instrumental variables. scmCounterfactual assigns each node a structural equation of the form X_j = sum(beta_ij * X_i) + U_j, infers the noise terms U by regression residualization (abduction), sets the intervention variable to its hypothetical value (action), and propagates in topological order (prediction). transportCausalEffect compares the Markov boundaries of treatment and outcome nodes across source and target graphs to evaluate s-admissibility. bayesianPersuasion applies the Kamenica-Gentzkow concavification by iteratively finding the highest concave lower envelope over the sender's value function on the probability simplex.
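
The abduction-action-prediction steps of scmCounterfactual can be illustrated with a two-variable linear SCM. All numbers here are invented; the server is described as propagating a full graph in topological order rather than a single equation:

```python
def counterfactual(beta, x_obs, y_obs, x_new):
    """Pearl's three-step counterfactual for the linear SCM
    Y = beta * X + U (two-variable illustration only)."""
    u = y_obs - beta * x_obs   # 1. abduction: infer the exogenous noise U
    # 2. action: replace X's structural equation with X := x_new
    # 3. prediction: propagate through the modified model
    return beta * x_new + u

# Invented numbers: observed PM2.5 = 35, admissions index = 0.47, beta = 0.02
y_cf = counterfactual(beta=0.02, x_obs=35.0, y_obs=0.47, x_new=12.0)
```

The noise term U is held fixed between the factual and hypothetical worlds; that is what distinguishes a counterfactual from a fresh interventional prediction.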

Tips for best results

  1. Narrow the domain list for focused queries. All-domain runs query all 18 actors, which increases both latency and cost. For a question about economic policy effects on labor markets, ["economics", "policy", "labor"] is sufficient and produces a cleaner causal graph.

  2. Use algorithm: "auto" unless you have a prior reason not to. The meta-algorithm compares PC, GES, and NOTEARS by BIC and picks the best fit for your data's density. Only override to a specific algorithm if you need reproducibility across runs or have domain knowledge about the expected graph structure.

  3. Run detect_confounders before estimate_interventional_effect. Knowing which confounders exist and whether they are adjustable tells you whether the back-door criterion is applicable before running the full treatment effect estimation. If a confounder is non-adjustable and no instrument is available, the identification may fall back to front-door or be unreliable.

  4. Use validate_causal_model to assess statistical confidence. The faithfulnessHolds: false result does not invalidate a graph — it means the data contains extra conditional independencies beyond what the graph implies, which is common with small samples. Examine the individual testResults p-values rather than relying solely on overallValid.

  5. For counterfactual analysis, set interventionValue to a realistic hypothetical. Values far outside the observed data range will produce extrapolated noise estimates with low probability scores. Counterfactuals within the 10th-90th percentile of the treatment variable's observed distribution are most reliable.

  6. Combine optimize_causal_experiment with discover_cross_domain_causes for active causal learning. First run discovery to get the graph structure, then pass the target variable to the experiment optimizer to find which variable to measure or intervene on in the next data collection round.

  7. Increase maxResults for higher-quality graphs. Each upstream actor returns up to maxResults items, and more observations per node improve the statistical power of the partial correlation tests in PC and the BIC scoring in GES. The default of 15-20 is conservative; set to 50 for research-grade results.
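Putting tips 1, 2, and 7 together, a focused discovery call might carry arguments like the following. The field names mirror the parameters discussed above but are illustrative; check the actor's input schema for the exact spelling:

```python
# Hypothetical arguments for a discover_cross_domain_causes tool call.
arguments = {
    "query": "minimum wage effects on employment",
    "domains": ["economics", "policy", "labor"],  # tip 1: narrow the domain list
    "algorithm": "auto",                          # tip 2: BIC-based meta-selection
    "maxResults": 50,                             # tip 7: research-grade sample size
}
```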

Combine with other Apify actors

  • Company Deep Research: Feed company intelligence reports as domain data input; use estimate_interventional_effect to identify which business factors causally drive financial outcomes.
  • Website Tech Stack Detector: Build a technology adoption dataset across companies, then use discover_cross_domain_causes to find causal links between tech choices and business performance signals.
  • Trustpilot Review Analyzer: Extract sentiment scores over time and use compute_counterfactual to estimate what review scores would have been absent a specific product change.
  • B2B Lead Qualifier: Score leads with 30+ signals, then pipe signals into detect_confounders to identify which lead-scoring features are genuinely predictive versus confounded by company size.
  • SERP Rank Tracker: Track ranking changes over time and use transport_causal_effect to test whether SEO interventions that worked in one market transport to another.
  • Regulatory Change Tracker: Use new regulatory data as the policy domain input for estimate_interventional_effect to estimate the ATE of specific regulatory changes on industry metrics.

Limitations

  • Causal discovery from observational data has fundamental identifiability limits. When both orientations of an edge imply the same set of conditional independencies, independence tests alone cannot determine the edge's direction without external assumptions or experimental data. The DAG returned represents one member of a Markov equivalence class.
  • Faithfulness and Markov assumptions may not hold. Real-world data, especially from heterogeneous public APIs, may violate the faithfulness assumption (extra independencies due to canceling paths) or have non-stationary distributions. Validate results with validate_causal_model before acting on them.
  • Numeric feature extraction from JSON is heuristic. Upstream actors return structured JSON with many string and categorical fields. The feature extraction converts numeric fields only. Domains with predominantly textual data (such as academic abstracts from PubMed) will produce sparser causal graphs.
  • Transfer entropy measures information flow, not necessarily causation. High transfer entropy between two domains indicates temporal information transmission but does not establish the direction of structural causation. It is a complementary signal, not a substitute for structural graph-based identification.
  • Upstream actor failures return empty arrays, not errors. If a subset of the 18 source actors fail or time out, the causal graph is built on partial data without warning. Check graphNodes in the response — very low counts indicate data collection issues for that domain.
  • Variable name matching uses substring search. The findClosestNode function matches by substring if no exact match is found. Treatment and outcome variable names that are too generic (e.g., "rate") may match unexpected nodes. Use domain-prefixed names or more specific terms (e.g., "employment_rate", "cpi_inflation") for reliable matching.
  • BIC-based algorithm selection is not guaranteed optimal. BIC penalizes model complexity, but the best-BIC algorithm is not always the causally correct one. For small samples (fewer than 30 observations per node), consider specifying algorithm: "pc" explicitly, as GES and NOTEARS can overfit.
  • Bayesian persuasion simulation uses abstract payoff matrices. The simulate_causal_agents tool derives prior beliefs from data density across domains, but the sender/receiver payoffs are modeled abstractly. Results are most useful for directional qualitative analysis of expert disagreement, not precise payoff quantification.
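The substring-matching pitfall described above can be reproduced in a few lines. This is a hypothetical reconstruction of findClosestNode, not the actual implementation:

```python
def find_closest_node(name, nodes):
    """Exact match first, then case-insensitive substring in either
    direction, returning the first hit in iteration order."""
    if name in nodes:
        return name
    low = name.lower()
    for n in nodes:
        if low in n.lower() or n.lower() in low:
            return n
    return None

nodes = ["economics_interest_rate", "labor_employment_rate"]
```

A generic query like "rate" silently matches economics_interest_rate because it appears first, while the more specific "employment_rate" lands on the intended node.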

Integrations

  • Zapier — trigger a causal discovery run when new data is added to a spreadsheet or CRM and receive structured results in downstream workflow steps
  • Make — build automated research pipelines that call estimate_interventional_effect on a schedule and pipe ATE results to Slack or a Google Sheet
  • Google Sheets — export counterfactual and treatment effect results directly into spreadsheets for stakeholder review without touching the API
  • Apify API — call the MCP server programmatically from any Python, JavaScript, or HTTP client; results are returned as structured JSON in the MCP protocol envelope
  • Webhooks — receive a POST notification when a long-running causal discovery job completes, with the full result payload
  • LangChain / LlamaIndex — expose the MCP tools directly to LLM orchestration frameworks; agents can call discover_cross_domain_causes and then reason over the returned causal graph structure in a multi-step research loop
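For the programmatic route, a bare-bones JSON-RPC tools/call request to the endpoint can be sketched as below. The tool and argument names are illustrative, and the MCP session handshake (initialize, plus any auth headers for your Apify token) is elided:

```python
import json
import urllib.request

payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "discover_cross_domain_causes",
        "arguments": {"query": "inflation and unemployment",
                      "domains": ["economics", "labor"]},
    },
}
req = urllib.request.Request(
    "https://ryanclinton--causal-panopticon-mcp.apify.actor/mcp",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json",
             "Accept": "application/json, text/event-stream"},
)
# with urllib.request.urlopen(req) as resp:   # network call omitted here
#     result = json.load(resp)
```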

Troubleshooting

Graph has very few nodes despite a broad query. Most upstream actors returned empty arrays, either due to query terms that did not match their search APIs or temporary rate limiting. Try a broader query term and specify fewer domains. Check that each domain's typical data format contains numeric fields — text-heavy domains like academia produce sparse graphs when PubMed abstracts are the primary data source.

averageTreatmentEffect is unexpectedly large or small. The ATE is estimated from the linear structural equations fitted to the extracted numeric features. If the upstream data has a very narrow numeric range or high variance, the regression coefficients will reflect that scale. Interpret the ATE as a relative directional effect within the data distribution, not an absolute real-world unit effect.

transportable: false even when domains seem similar. Transportability fails when the Markov boundary of the treatment node in the source graph contains variables that are selection nodes in the target domain — meaning the populations differ on precisely those variables. Try running detect_confounders on the target domain separately and compare the confounder list to understand why s-admissibility is not satisfied.

convergence: false in simulate_causal_agents. The Bayesian persuasion game did not reach a stable equilibrium within the iteration limit. This can happen when the prior beliefs are nearly uniform and the payoff matrices have low contrast. Try setting numStates to 2 for a binary causal claim, which produces a simpler concavification problem that converges more reliably.

Tool returns a spending limit error. You have reached the per-run spending limit configured in your Apify account or run settings. Increase the maxTotalChargeUsd parameter when starting the actor, or set a higher limit in the Apify console under your account's billing settings.

Responsible use

  • This server queries publicly available data sources: US federal databases, WHO, NOAA, OpenAQ, NVD, FEMA, FDA, and academic APIs. It does not scrape private or restricted sources.
  • Causal claims derived from observational data should be validated against domain expertise before informing policy or clinical decisions.
  • Users are responsible for ensuring that downstream use of causal findings complies with applicable regulations in their jurisdiction.
  • Do not use causal inference results as the sole basis for high-stakes decisions affecting individuals without experimental validation.
  • For guidance on responsible AI-assisted research, see the Apify platform guidelines.

FAQ

What is cross-domain causal discovery and why does it require 18 data sources? Causal discovery within a single domain misses relationships that span institutional boundaries. Economic shocks cause health outcomes; climate events cause labor disruptions; cybersecurity incidents cause regulatory changes. Identifying these cross-domain causal pathways requires simultaneous data from economics, health, environment, and other domains. The 18 source actors cover 9 domain groups, giving the meta-algorithm enough heterogeneous data to build a credible cross-domain causal atlas.

Which causal discovery algorithm should I use — PC, GES, or NOTEARS? Use algorithm: "auto" (the default) to let the server select by BIC score. If you need guidance: PC is best for sparse graphs with strong domain knowledge about which variables can be causally connected; GES handles moderate-density graphs and is computationally faster than NOTEARS for larger node sets; NOTEARS is best when you expect a dense continuous structure and want gradient-based optimization rather than combinatorial search.

Can this server prove causation? No causal discovery algorithm running on observational data can prove causation. The tools identify candidate causal structures that are statistically consistent with the data under the faithfulness and Markov assumptions. True causal confirmation requires experimental intervention. Use validate_causal_model to check statistical support and optimize_causal_experiment to design a follow-up experiment that would confirm or refute the discovered structure.

How is estimate_interventional_effect different from a regression coefficient? A regression coefficient measures association conditional on included variables. estimate_interventional_effect uses do-calculus to identify P(Y|do(X)) — the distribution of Y under a hypothetical intervention that sets X to a value, rather than merely observing X at that value. The critical difference is that do-calculus adjusts for confounders using the graph structure, not just the variables included in the regression, and can identify effects even when back-door blocking is impossible (via front-door or IV criteria).
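The difference is easy to demonstrate with a toy simulation: when a confounder Z drives both X and Y, the naive regression slope is biased, while conditioning on Z (the back-door adjustment for this graph) recovers the true effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
Z = rng.normal(size=n)                        # confounder: Z -> X and Z -> Y
X = 2.0 * Z + rng.normal(size=n)
Y = 1.0 * X + 3.0 * Z + rng.normal(size=n)    # true causal effect of X is 1.0

# Naive regression of Y on X alone absorbs the back-door path X <- Z -> Y
# (analytically ~2.2 for these coefficients), overstating the effect:
naive = np.polyfit(X, Y, 1)[0]

# Adjusting for Z blocks the back-door path and recovers ~1.0:
A = np.column_stack([X, Z, np.ones(n)])
adjusted = np.linalg.lstsq(A, Y, rcond=None)[0][0]
```

The graph tells you *which* variables to adjust for; a regression only adjusts for whatever you happened to include.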

How accurate are the counterfactual estimates? Accuracy depends on: the quality of numeric data extracted from upstream actors, how well the linear structural equations approximate the true mechanisms, and whether the faithfulness assumption holds. The probability field in compute_counterfactual reflects the plausibility of the inferred exogenous noise scenario. Values above 0.7 indicate the counterfactual is consistent with the observed data; below 0.5 suggests the hypothetical intervention is far from the training distribution.

How long does a typical tool call take? A full all-domain call (discover_cross_domain_causes with all 9 domains) typically takes 2-4 minutes because it fires up to 18 upstream actors in parallel, each with a 180-second timeout. Single-domain calls or calls with 2-3 domains typically complete in 30-90 seconds. The MCP server remains in standby mode between calls, so there is no cold-start overhead for the server itself.

Can I schedule causal discovery to run periodically to detect structural shifts? Yes. Use Apify's built-in scheduler to run the MCP server actor on a daily, weekly, or custom interval. Compare the crossDomainEdges from successive runs to detect when causal relationships between domains change — for example, detecting when the economics-health causal link strengthens during a recession.
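A structural-shift check between two scheduled runs reduces to a set difference over the edge lists (crossDomainEdges is the response field named above; the tuple representation here is illustrative):

```python
def edge_diff(prev_edges, curr_edges):
    """Compare crossDomainEdges from two runs, with each edge
    reduced to a hashable (source, target) tuple."""
    prev, curr = set(prev_edges), set(curr_edges)
    return {"appeared": sorted(curr - prev),
            "vanished": sorted(prev - curr)}
```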

Is it legal to collect and analyze data from these 18 sources? All 18 upstream data sources are publicly available APIs from US federal agencies (FRED, BLS, Congress, SEC, FEMA, FDA, CPSC, NVD), international organizations (WHO, IMF), research databases (PubMed, OpenAlex), and environmental monitoring networks (NOAA, OpenAQ). Public API usage complies with each source's terms of service. For web scraping legality generally, see Apify's guide.

How is this different from existing causal inference libraries like DoWhy or CausalML? DoWhy and CausalML are Python libraries that require you to supply your own data. Causal Panopticon handles data collection, feature extraction, and graph construction end-to-end, and exposes the entire pipeline as an MCP tool that any AI agent can call. It is designed for agentic workflows where the AI needs to autonomously collect evidence, build a causal model, and return actionable inference results — not for researchers who already have a clean dataset and want fine-grained programmatic control.

Can I use this with Claude, GPT-4, or other AI models? Yes. Any AI model with MCP client support can connect to this server at https://ryanclinton--causal-panopticon-mcp.apify.actor/mcp. Claude Desktop and Cursor have native MCP support. For other models, use an MCP-compatible client library or call the HTTP endpoint directly from your agent's tool-use framework.

What happens if some upstream actors fail during a run? The runActor function catches all errors and returns an empty array, so the run continues with partial data. The affected domain will contribute no nodes to the causal graph. Monitor graphNodes in the response — if it is much lower than expected given the number of domains specified, one or more upstream actors likely failed. The Apify run log will show which actors returned errors.

Does the server retain my data between calls? No. The MCP server processes each tool call independently and does not persist causal graphs or data between calls. The Apify platform stores run logs and output datasets per-run in your account, which you can access via the Apify console or API for debugging purposes.

Help us improve

If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:

  1. Go to Account Settings > Privacy
  2. Enable Share runs with public Actor creators

This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.

Support

Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom causal inference workflows, multi-domain research pipelines, or enterprise integrations, reach out through the Apify platform.

How it works

  1. Configure. Set your parameters in the Apify Console or pass them via API.
  2. Run. Click Start, trigger via API or webhook, or set up a schedule.
  3. Get results. Download as JSON, CSV, or Excel, or integrate with 1,000+ apps.

Use cases


  • Data Teams: Automate data collection pipelines with scheduled runs.
  • Developers: Integrate via REST API or use as an MCP tool in AI workflows.

Ready to try Causal Panopticon MCP Server?

Start for free on Apify. No credit card required.

Open on Apify Store