
Knowledge Graph Causal Discovery MCP

Knowledge graph causal discovery over multi-domain research data, delivered through a single Model Context Protocol interface. This MCP server is built for researchers, data scientists, and AI agents that need to go beyond correlation — discovering directed causal structure, estimating treatment effects, and reasoning about counterfactuals from the published literature and public datasets.


Pricing

Pay Per Event model. You only pay for what you use.

| Event | Description | Price |
|-------|-------------|-------|
| discover-causal-structure | FCI constraint-based causal skeleton discovery | $0.08 |
| compute-interventional-effects | Do-calculus with ID algorithm | $0.10 |
| simulate-counterfactuals | Twin network structural method | $0.08 |
| extract-causal-claims-literature | NLP causal extraction from literature | $0.06 |
| embed-causal-knowledge-graph | RotatE complex-space embedding | $0.06 |
| estimate-causal-effect-tmle | Targeted maximum likelihood estimation | $0.08 |
| check-graph-consistency | Sheaf cohomology H1 obstruction detection | $0.08 |
| attribute-source-contribution | Shapley cooperative game attribution | $0.06 |

Example: 100 events = $8.00 · 1,000 events = $80.00

Connect to your AI agent

Add this MCP server to Claude Desktop, Cursor, Windsurf, or any MCP-compatible client.

MCP Endpoint
https://ryanclinton--knowledge-graph-causal-discovery-mcp.apify.actor/mcp
Claude Desktop Config
{
  "mcpServers": {
    "knowledge-graph-causal-discovery-mcp": {
      "url": "https://ryanclinton--knowledge-graph-causal-discovery-mcp.apify.actor/mcp"
    }
  }
}

Documentation


The server orchestrates 17 Apify actors in parallel across five source domains — academic, biomedical, regulatory, economic, and safety — assembling the results into a unified causal knowledge graph. Eight specialized tools then apply rigorous causal inference algorithms: FCI skeleton learning, GES with BIC scoring, Pearl's do-calculus with the ID algorithm, twin network counterfactuals, TMLE estimation, RotatE knowledge graph embeddings, sheaf cohomology consistency checking, and Shapley source attribution. Every tool call returns structured JSON with mathematical scores and supporting evidence.

⬇️ What data can you access?

| Data Point | Source | Coverage |
|------------|--------|----------|
| 📄 Academic papers and citations | OpenAlex, Semantic Scholar, Crossref | 250M+ scholarly works with citation graphs |
| 📑 Preprints and open access | arXiv, CORE | Physics, CS, quantitative biology, math |
| 🧬 Biomedical literature | PubMed | 36M+ citations with MeSH indexing |
| 🏥 Clinical trials | ClinicalTrials.gov | 450K+ registered studies with protocol data |
| 💊 Drug adverse event reports | OpenFDA | FDA FAERS pharmacovigilance database |
| 🔬 NIH research grants | NIH Reporter | Active and historical funded projects |
| 📜 Federal regulations | Federal Register | US regulatory actions and proposed rules |
| 🏛️ Congressional legislation | Congress.gov | Bills, resolutions, and amendments |
| 🗂️ Government datasets | Data.gov | 300K+ federal open data assets |
| 📈 Economic time series | FRED | Federal Reserve GDP, inflation, employment |
| 🌍 World development indicators | World Bank | 200+ country development metrics |
| ⚠️ Product recall notices | CPSC | Consumer product safety recall database |
| 💬 Consumer complaints | CFPB | Financial protection complaint records |
| 📖 Encyclopedia context | Wikipedia | Background knowledge and concept disambiguation |

Why use Knowledge Graph Causal Discovery MCP?

Assembling a causal inference pipeline from scratch requires integrating a dozen data sources, implementing graph construction logic, and coding algorithms that span three decades of academic literature. A typical research team spending a week on this still ends up with a pipeline that covers two or three data domains at best.

This MCP server covers 17 data sources, applies 10 peer-reviewed causal algorithms, and returns structured results in seconds — directly inside Claude, Cursor, Windsurf, or any MCP-compatible AI client.

  • Always-live data — every tool call fetches fresh results from source APIs; no stale snapshots or cached indexes
  • Parallel execution — up to 17 actors run simultaneously per query, not sequentially, so response time scales with the slowest source rather than the sum
  • Standby mode — the server stays warm between calls, eliminating cold-start latency for interactive research sessions
  • Pay-per-call — no monthly subscription; each tool costs between $0.035 and $0.050, so a full 8-tool pipeline costs under $0.35
  • MCP-native — works in Claude Desktop, Cursor, Windsurf, Cline, and any client that speaks the Model Context Protocol

⬆️ MCP tools

| Tool | Price | Algorithm | Best for |
|------|-------|-----------|----------|
| discover_causal_structure | $0.045 | FCI + GES + additive noise model | Initial causal graph structure from observational data |
| compute_interventional_effects | $0.050 | Pearl's do-calculus + ID algorithm + Balke-Pearl LP | Policy evaluation, treatment planning, intervention design |
| simulate_counterfactuals | $0.045 | Twin network method + Tian-Pearl bounds | "What if" analysis, legal causation, necessity/sufficiency |
| extract_causal_claims_literature | $0.035 | NLP pattern matching + evidence classification | Systematic reviews, evidence synthesis, claim auditing |
| embed_causal_knowledge_graph | $0.040 | RotatE complex-valued embeddings | Link prediction, entity similarity, pathway discovery |
| estimate_causal_effect_tmle | $0.050 | TMLE + Super Learner ensemble + influence function CI | Semiparametric ATE estimation with doubly-robust CI |
| check_graph_consistency | $0.035 | Sheaf cohomology H¹(G,F) | Validating causal assumptions, identifiability checks |
| attribute_source_contribution | $0.040 | Shapley values + nucleolus + core stability | Data source prioritization, budget allocation |

Use cases for knowledge graph causal discovery

Drug safety signal detection

Pharmacovigilance teams combine PubMed biomedical literature, ClinicalTrials.gov outcome data, and FDA adverse event reports into a single causal graph. The discover_causal_structure tool identifies directed edges between compounds and adverse outcomes. The compute_interventional_effects tool estimates P(adverse event | do(prescribe drug)) using back-door adjustment on confounders sourced from NIH grant data and OpenAlex citations.

Policy impact assessment

Policy analysts estimate causal effects of regulatory interventions on economic outcomes by combining Federal Register rules, FRED economic time series, and World Bank development indicators. The estimate_causal_effect_tmle tool applies TMLE with Super Learner to produce doubly-robust average treatment effect estimates with 95% confidence intervals from the influence function — going beyond naive before/after comparison.

Systematic review and evidence synthesis

Literature reviewers use extract_causal_claims_literature to scan thousands of academic papers across OpenAlex, Semantic Scholar, Crossref, arXiv, and CORE simultaneously. Claims are classified by strength (strong/moderate/weak/correlational) and evidence level (RCT/observational/case study/review). Conflicting claims across sources are flagged automatically, replacing weeks of manual screening.

Counterfactual reasoning for legal and regulatory causation

Legal teams and regulators assessing causation in product liability or pharmaceutical harm cases use simulate_counterfactuals to compute the Probability of Necessity (PN = P(Y_x'=0 | X=x, Y=y)) and Probability of Sufficiency (PS = P(Y_x=1 | X=x', Y=0)) via the twin network method. Tian-Pearl monotonicity bounds are validated to constrain the counterfactual probabilities.

Knowledge graph completion in biomedical AI

AI research teams use embed_causal_knowledge_graph to generate RotatE complex-valued entity embeddings where relations are unit-modulus rotations in complex space (t = h · r, |r_i| = 1). MRR and Hits@10 link prediction metrics identify missing drug-disease or gene-pathway edges. Self-adversarial negative sampling with margin gamma ensures high-quality embeddings even in sparse graph regions.

Data acquisition prioritization

Research operations teams with limited budgets use attribute_source_contribution to calculate Shapley values for each data domain (academic, biomedical, regulatory, economic, safety). The Shapley allocation phi_i quantifies each source's marginal contribution to causal graph quality across all subsets. Nucleolus computation and core non-emptiness check confirm allocation stability before committing to data subscriptions.

How to connect this MCP server

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "knowledge-graph-causal-discovery": {
      "url": "https://knowledge-graph-causal-discovery-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}

Cursor / Windsurf / Cline

Add the MCP endpoint in your editor's MCP settings panel:

  • Endpoint URL: https://knowledge-graph-causal-discovery-mcp.apify.actor/mcp
  • Authentication: Bearer token with your Apify API token

Python (MCP client)

import anthropic

client = anthropic.Anthropic()

# The MCP server exposes 8 tools — Claude discovers and calls them
# through the MCP connector (requires the mcp-client beta flag)
response = client.beta.messages.create(
    model="claude-opus-4-5",
    max_tokens=4096,
    betas=["mcp-client-2025-04-04"],
    mcp_servers=[{
        "type": "url",
        "url": "https://knowledge-graph-causal-discovery-mcp.apify.actor/mcp",
        "name": "knowledge-graph-causal-discovery",
        "authorization_token": "YOUR_APIFY_TOKEN"
    }],
    messages=[{
        "role": "user",
        "content": "Discover the causal structure linking smoking exposure to lung cancer outcomes using academic and biomedical sources."
    }]
)
print(response.content)

Direct cURL

# Discover causal structure
curl -X POST "https://knowledge-graph-causal-discovery-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "discover_causal_structure",
      "arguments": {
        "query": "smoking lung cancer mortality",
        "sources": ["academic", "biomedical"]
      }
    },
    "id": 1
  }'

# Estimate treatment effect via TMLE
curl -X POST "https://knowledge-graph-causal-discovery-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "estimate_causal_effect_tmle",
      "arguments": {
        "query": "statin therapy cardiovascular mortality reduction",
        "sources": ["academic", "biomedical", "regulatory"]
      }
    },
    "id": 2
  }'

Tool reference

discover_causal_structure

Discovers causal graph structure from observational data using three combined algorithms:

  1. FCI (Fast Causal Inference) — constraint-based skeleton discovery via Kernel Conditional Independence (KCI) tests, tolerant of latent confounders. Builds a PAG (partial ancestral graph), including bidirected edges for hidden common causes.
  2. GES (Greedy Equivalence Search) — score-based refinement using BIC (Bayesian Information Criterion) to navigate Markov equivalence classes. BIC = log(likelihood) − (k/2) · log(N) where k is the number of free parameters.
  3. Additive noise model — edge orientation via HSIC (Hilbert-Schmidt Independence Criterion) between residuals and cause. If HSIC(e, X) < HSIC(e, Y), the model orients X → Y.

Returns: directed and bidirectional edges, Markov equivalence class size, BIC score, p-values per edge.

Price: $0.045 per call. Calls up to 10 actors for academic + biomedical sources.


compute_interventional_effects

Computes P(Y | do(X)) — the distribution of Y under intervention on X — via Pearl's do-calculus:

  • Rule 1 — insertion/deletion of observations
  • Rule 2 — action/observation exchange
  • Rule 3 — insertion/deletion of actions
  • ID algorithm — systematic identifiability test for interventional queries in semi-Markovian models
  • Back-door criterion — adjustment for observed confounders
  • Front-door criterion — adjustment via mediating variables when confounders are unobserved
  • Balke-Pearl LP bounds — linear programming bounds for effects not identifiable by do-calculus, constraining via observable distributions

Returns: do-effects with adjustment sets, identifiability flags, LP bound intervals.

Price: $0.050 per call.
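
The back-door adjustment at the heart of this tool can be sketched on a toy discrete distribution (all numbers hypothetical):

```python
from collections import defaultdict

def backdoor_effect(joint, x, y):
    # P(y | do(x)) = sum_z P(y | x, z) * P(z), with Z a valid back-door set
    pz, pxz, pxzy = defaultdict(float), defaultdict(float), defaultdict(float)
    for (xi, zi, yi), p in joint.items():
        pz[zi] += p
        pxz[(xi, zi)] += p
        if yi == y:
            pxzy[(xi, zi)] += p
    return sum(pz[zi] * pxzy[(x, zi)] / pxz[(x, zi)]
               for zi in pz if pxz[(x, zi)] > 0)

# hypothetical confounded joint over (X, Z, Y): Z -> X, Z -> Y, X -> Y
joint = {}
for z in (0, 1):
    p_x1 = 0.8 if z == 1 else 0.2            # confounding: Z drives treatment
    for x_ in (0, 1):
        for y_ in (0, 1):
            p_y1 = 0.3 + 0.4 * x_ + 0.2 * z  # Y depends on both X and Z
            joint[(x_, z, y_)] = (0.5 * (p_x1 if x_ else 1 - p_x1)
                                      * (p_y1 if y_ else 1 - p_y1))
```

On this toy joint the naive conditional P(Y=1 | X=1) is 0.86, while the adjusted do-effect P(Y=1 | do(X=1)) is 0.80; the gap is exactly the confounding flowing through Z.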


simulate_counterfactuals

Simulates counterfactual outcomes via the structural twin network method:

  • Constructs a factual world (X=x, Y=y observed) and a counterfactual world (X=x' intervened)
  • Both worlds share the same exogenous variables U (the twin network's key property)
  • Computes Probability of Necessity (PN): P(Y_{x'}=0 | X=x, Y=y)
  • Computes Probability of Sufficiency (PS): P(Y_x=1 | X=x', Y=0)
  • Validates Tian-Pearl monotonicity bounds: PN ≤ P(Y=y|X=x), PS ≤ P(Y=0|X=x')

Returns: PN and PS per outcome pair, twin network size, monotonicity check result.

Price: $0.045 per call.
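
A minimal sketch of the abduction-action-prediction loop behind PN, using a hypothetical two-variable exogenous model. The server's twin networks are far larger, but the shared-U principle is the same:

```python
import itertools

# hypothetical toy SCM: U2 blocks the treatment effect, U3 is an
# alternative cause; factual and counterfactual worlds share the same U
pU = {"U2": 0.2, "U3": 0.1}          # P(U = 1) for each exogenous variable

def f_y(x, u):
    # structural equation: Y := (X and not U2) or U3
    return int((x and not u["U2"]) or u["U3"])

def prob_necessity(x_obs=1, y_obs=1, x_cf=0):
    # PN = P(Y_{x'} = 0 | X = x, Y = y), by enumerating exogenous worlds
    num = den = 0.0
    for bits in itertools.product((0, 1), repeat=len(pU)):
        u = dict(zip(pU, bits))
        w = 1.0
        for name, b in u.items():
            w *= pU[name] if b else 1 - pU[name]
        if f_y(x_obs, u) != y_obs:   # abduction: keep U consistent with evidence
            continue
        den += w
        if f_y(x_cf, u) == 0:        # action + prediction in the twin copy
            num += w
    return num / den
```

Here PN comes out to 0.72/0.82 ≈ 0.878: given that treatment occurred and the outcome followed, the outcome would have been absent without treatment in about 88% of exogenous worlds consistent with the evidence.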


extract_causal_claims_literature

Extracts and classifies causal claims from academic literature via NLP pattern matching:

  • Claim strength classification: strong / moderate / weak / correlational based on verb and hedge patterns
  • Evidence level classification: RCT / observational / case_study / review based on study design signals in titles and abstracts
  • Conflict detection: flags pairs of sources making opposing claims about the same cause-effect pair

Draws from OpenAlex, Semantic Scholar, Crossref, arXiv, CORE (academic), and PubMed, ClinicalTrials.gov, NIH Grants, OpenFDA (biomedical) depending on selected sources.

Returns: classified claim list with citations, counts by strength and evidence level, conflicting claim pairs.

Price: $0.035 per call.
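
A toy version of the strength classifier; the patterns below are illustrative only, not the server's actual rule set:

```python
import re

# illustrative verb and hedge patterns (hypothetical, heavily simplified)
CAUSAL = re.compile(r"\b(causes?|caused|leads? to|results? in|induces?)\b", re.I)
ASSOC  = re.compile(r"\b(associated with|correlated with|linked to)\b", re.I)
HEDGE  = re.compile(r"\b(may|might|could|suggests?|appears?)\b", re.I)

def classify_claim(sentence):
    causal, hedged = CAUSAL.search(sentence), HEDGE.search(sentence)
    if causal:
        return "moderate" if hedged else "strong"   # hedged causal verb downgrades
    if ASSOC.search(sentence):
        return "weak" if hedged else "correlational"
    return "correlational"
```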


embed_causal_knowledge_graph

Embeds the causal knowledge graph using RotatE, a complex-valued knowledge graph embedding model:

  • Relations are rotations in complex space: t = h · r where each component satisfies |r_i| = 1 (unit modulus constraint)
  • Scoring function: f(h, r, t) = −||h · r − t|| (L1 norm of the complex residual)
  • Self-adversarial negative sampling — samples negative triples with probability proportional to their current score, weighted by softmax temperature
  • Margin-based loss with margin gamma separating positive and negative triple scores
  • Cluster assignment via k-means over entity embedding norms

Returns: entity embeddings with norms and nearest neighbours, MRR (Mean Reciprocal Rank), Hits@10, cluster labels, phase range.

Price: $0.040 per call.
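
The RotatE scoring function is compact enough to sketch directly (scoring only; training with self-adversarial sampling is omitted):

```python
import numpy as np

def rotate_score(h, r_phase, t):
    # relation as an elementwise rotation: each r_i = exp(i * theta_i), |r_i| = 1
    r = np.exp(1j * r_phase)
    # score is minus the L1 norm of the complex residual h * r - t;
    # a true triple scores near 0, an implausible one strongly negative
    return -np.sum(np.abs(h * r - t))
```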


estimate_causal_effect_tmle

Estimates average treatment effects via TMLE (Targeted Maximum Likelihood Estimation) following the semiparametric efficiency pipeline:

  1. Initial estimate Q⁰(A, W) via Super Learner ensemble (weighted cross-validated learner combination)
  2. Propensity score g(A | W) with positivity truncation at [0.01, 0.99] to prevent near-deterministic treatment
  3. Clever covariate H(A, W) = A/g(1|W) − (1−A)/g(0|W)
  4. Targeting step — fit epsilon via MLE of logistic model indexed by H, updating Q⁰
  5. Updated estimate Q*(A, W) = expit(logit(Q⁰) + epsilon · H)
  6. ATE = E[Q*(1, W)] − E[Q*(0, W)] (plug-in estimator from targeted fit)
  7. Influence function IC(O) for 95% Wald confidence interval: ATE ± 1.96 · SE(IC)

Returns: ATE per treatment-outcome pair, standard error, 95% CI, influence function norm, cross-validated risk, Super Learner weights.

Price: $0.050 per call.
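
Steps 3 to 7 can be sketched as follows, assuming the initial fit Q⁰ and propensity g are already in hand; a grid-search MLE for epsilon stands in for the offset logistic fit:

```python
import numpy as np

def expit(z): return 1.0 / (1.0 + np.exp(-z))
def logit(p): return np.log(p / (1 - p))

def tmle_targeting(Q0_A, Q0_1, Q0_0, g1, A, Y):
    # clever covariate H(A, W) = A/g(1|W) - (1-A)/g(0|W)
    H = A / g1 - (1 - A) / (1 - g1)
    # fit epsilon by maximum likelihood over a grid (offset-logistic MLE)
    grid = np.linspace(-1, 1, 401)
    offset = logit(Q0_A)
    ll = [np.sum(Y * np.log(expit(offset + e * H)) +
                 (1 - Y) * np.log(1 - expit(offset + e * H))) for e in grid]
    eps = grid[int(np.argmax(ll))]
    # targeted updates Q*(a, W) = expit(logit(Q0) + eps * H(a, W))
    Qstar1 = expit(logit(Q0_1) + eps / g1)
    Qstar0 = expit(logit(Q0_0) - eps / (1 - g1))
    return np.mean(Qstar1) - np.mean(Qstar0), eps    # plug-in ATE
```

With a correctly specified Q⁰, epsilon stays near zero and the plug-in ATE recovers the true effect; the targeting step matters most when the initial fit is biased.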


check_graph_consistency

Checks causal graph consistency using sheaf cohomology over the graph structure:

  • Sheaf F on graph G assigns vector spaces F(v) to vertices and linear maps F(e) to edges
  • Coboundary operator: (δ₀s)(e) = F(e)(s(v)) − s(w) measures local section disagreement
  • H¹(G, F) = ker(δ₁) / im(δ₀) — first cohomology group dimension measures obstructions to global consistency
  • Separate checks: acyclicity (no directed cycles), faithfulness (no spurious independencies), causal sufficiency (no hidden common causes), instrument validity (exclusion restriction), positivity (treatment overlap), Markov compatibility (observed independencies match graph)

Returns: pass/fail per check with violation counts, H¹ cohomology dimension, global section existence flag.

Price: $0.035 per call.
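
For the constant sheaf on a graph with no higher cells, the H¹ computation reduces to linear algebra on the incidence matrix. A minimal sketch (the server's sheaves carry richer stalks, but the obstruction count works the same way):

```python
import numpy as np

def h1_dimension(n_vertices, edges):
    # constant sheaf R: delta0 maps a vertex section s to s(head) - s(tail)
    d0 = np.zeros((len(edges), n_vertices))
    for i, (u, v) in enumerate(edges):
        d0[i, u], d0[i, v] = -1.0, 1.0
    # with no 2-cells delta1 = 0, so ker(delta1) = R^E and
    # dim H1 = |E| - rank(delta0): the graph's cycle rank
    return len(edges) - np.linalg.matrix_rank(d0)
```

A nonzero H¹ means local sections cannot be glued into a global one: some cycle carries an irreconcilable constraint.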


attribute_source_contribution

Attributes each data source's contribution to causal graph quality via cooperative game theory:

  • Each data domain (academic, biomedical, regulatory, economic, safety) is a player in the coalition game
  • Value function v(S) = quality of causal graph (node count · edge density · mean edge weight) using only sources in coalition S
  • Shapley value phi_i = Σ_S [|S|!(n−|S|−1)!/n!] · [v(S ∪ {i}) − v(S)] — fair marginal contribution
  • Nucleolus — lexicographically minimises the maximum excess, finding the most stable payoff allocation
  • Core non-emptiness check — tests whether the Shapley allocation is stable against all coalitional deviations

Best used with all five source categories to get meaningful attribution across the full coalition space.

Returns: Shapley values per source, marginal contributions, nucleolus allocation, core stability flag.

Price: $0.040 per call.
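
The Shapley formula above can be implemented by direct enumeration over coalitions, which is tractable here because n = 5. A sketch with a hypothetical toy value function in place of the server's graph-quality measure:

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    # phi_i = sum over S not containing i of |S|!(n-|S|-1)!/n! * marginal(i, S)
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        phi[i] = sum(
            factorial(k) * factorial(n - k - 1) / factorial(n)
            * (v(set(S) | {i}) - v(set(S)))
            for k in range(n) for S in combinations(others, k))
    return phi

# hypothetical per-domain quality, plus an academic+biomedical synergy bonus
base = {"academic": 3.0, "biomedical": 2.0, "regulatory": 1.0}

def quality(S):
    bonus = 1.0 if {"academic", "biomedical"} <= S else 0.0
    return sum(base[s] for s in S) + bonus
```

In this toy game the synergy bonus splits evenly between the two synergistic players, and the allocations sum to the grand-coalition value (the efficiency property).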

Output examples

discover_causal_structure — smoking and lung cancer

{
  "nodeCount": 94,
  "edgeCount": 187,
  "relations": [
    {
      "cause": "Cigarette smoking and lung adenocarcinoma risk: a pooled analysis",
      "effect": "Lung cancer incidence in never-smokers vs. ever-smokers cohort",
      "edgeType": "causes",
      "strength": 0.74,
      "pValue": 0.003,
      "method": "FCI-KCI"
    },
    {
      "cause": "KRAS mutation frequency in tobacco-exposed lung tissue",
      "effect": "Non-small-cell lung carcinoma progression",
      "edgeType": "causes",
      "strength": 0.61,
      "pValue": 0.011,
      "method": "GES-BIC"
    },
    {
      "cause": "Secondhand smoke exposure biomarker cotinine",
      "effect": "Lung cancer incidence in never-smokers vs. ever-smokers cohort",
      "edgeType": "bidirectional",
      "strength": 0.43,
      "pValue": 0.048,
      "method": "additive-noise-HSIC"
    }
  ],
  "totalEdges": 187,
  "directedEdges": 141,
  "bidirectionalEdges": 46,
  "markovEquivalenceSize": 12,
  "bicScore": -4823.7
}

estimate_causal_effect_tmle — statin therapy and cardiovascular mortality

{
  "nodeCount": 112,
  "estimates": [
    {
      "treatment": "High-intensity statin therapy (atorvastatin 40-80mg)",
      "outcome": "Major adverse cardiovascular events at 5 years",
      "ate": -0.082,
      "standardError": 0.019,
      "confidenceInterval": [-0.119, -0.045],
      "influenceFunctionNorm": 0.041
    },
    {
      "treatment": "High-intensity statin therapy (atorvastatin 40-80mg)",
      "outcome": "All-cause mortality",
      "ate": -0.031,
      "standardError": 0.014,
      "confidenceInterval": [-0.058, -0.004],
      "influenceFunctionNorm": 0.028
    }
  ],
  "significantCount": 2,
  "averageATE": -0.056,
  "crossValidatedRisk": 0.113,
  "superLearnerWeights": {
    "logistic": 0.34,
    "randomForest": 0.41,
    "xgboost": 0.25
  }
}

simulate_counterfactuals — treatment necessity and sufficiency

{
  "nodeCount": 87,
  "outcomes": [
    {
      "factual": "Patient received antihypertensive therapy (X=1), experienced stroke (Y=1)",
      "counterfactual": "Patient did not receive antihypertensive therapy (X=0)",
      "factualValue": 1.0,
      "counterfactualValue": 0.0,
      "probabilityOfNecessity": 0.71,
      "probabilityOfSufficiency": 0.38
    }
  ],
  "twinNetworkSize": 174,
  "averagePN": 0.71,
  "averagePS": 0.38,
  "monotonicityHolds": true
}

check_graph_consistency — causal assumption validation

{
  "nodeCount": 94,
  "edgeCount": 187,
  "checks": [
    { "check": "acyclicity", "passed": true, "violationCount": 0, "details": "No directed cycles detected" },
    { "check": "faithfulness", "passed": true, "violationCount": 2, "details": "2 near-cancelling paths detected" },
    { "check": "causal_sufficiency", "passed": false, "violationCount": 7, "details": "7 bidirectional edges suggest latent confounders" },
    { "check": "instrument_validity", "passed": true, "violationCount": 0, "details": "NIH grant instruments satisfy exclusion restriction" },
    { "check": "positivity", "passed": true, "violationCount": 0, "details": "Propensity scores in [0.04, 0.96]" },
    { "check": "markov_compatibility", "passed": true, "violationCount": 1, "details": "1 d-separation violation" }
  ],
  "totalChecks": 6,
  "passedChecks": 5,
  "sheafCohomologyDim": 3,
  "globalSectionExists": false
}

How much does it cost to use the Knowledge Graph Causal Discovery MCP?

This MCP uses pay-per-event pricing — you are charged only when a tool is called. Platform compute costs are included.

| Tool | Price per call | 10 calls | 50 calls |
|------|----------------|----------|----------|
| discover_causal_structure | $0.045 | $0.45 | $2.25 |
| compute_interventional_effects | $0.050 | $0.50 | $2.50 |
| simulate_counterfactuals | $0.045 | $0.45 | $2.25 |
| extract_causal_claims_literature | $0.035 | $0.35 | $1.75 |
| embed_causal_knowledge_graph | $0.040 | $0.40 | $2.00 |
| estimate_causal_effect_tmle | $0.050 | $0.50 | $2.50 |
| check_graph_consistency | $0.035 | $0.35 | $1.75 |
| attribute_source_contribution | $0.040 | $0.40 | $2.00 |

Full 8-tool pipeline per query: $0.34. Running the complete causal discovery pipeline daily for a month costs approximately $10.

Apify's free plan includes $5 of monthly platform credits, which covers roughly 14 full-pipeline runs at no cost.

You can set a maximum spending limit per session in your Apify account to prevent unexpected charges. The MCP server stops charging and returns an error message if your event limit is reached.

How the Knowledge Graph Causal Discovery MCP works

Phase 1 — parallel data ingestion

When a tool is called, the server identifies which source categories are requested (academic, biomedical, regulatory, economic, safety) and constructs a call list of up to 17 actor invocations:

  • Academic (6 actors): OpenAlex (30 results), Semantic Scholar (30), Crossref (20), arXiv (20), CORE (20), Wikipedia (15)
  • Biomedical (4 actors): PubMed (30), ClinicalTrials.gov (20), NIH Grants (15), OpenFDA (20)
  • Regulatory (3 actors): Federal Register (20), Congress Bills (15), Data.gov (15)
  • Economic (2 actors): FRED (20), World Bank (15)
  • Safety (2 actors): CPSC Recalls (15), CFPB Complaints (15)

All actors run via Promise.all — parallel, not sequential. Each actor has a 180-second timeout. A failed actor returns an empty array rather than failing the entire request, ensuring partial results are always returned.
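
In Python terms, the pattern is equivalent to asyncio.gather with a per-task timeout wrapper; a behavioural sketch, not the server's actual JavaScript code, with a hypothetical call_actor standing in for an Apify actor invocation:

```python
import asyncio

async def call_actor(name, payload):
    # stand-in for one upstream actor call (hypothetical)
    if payload.get("fail"):
        raise RuntimeError(f"{name} unavailable")
    await asyncio.sleep(payload.get("delay", 0))
    return [f"{name}:result"]

async def safe_call(name, payload):
    # mirrors the server's behaviour: per-actor timeout, any failure -> []
    try:
        return await asyncio.wait_for(call_actor(name, payload), timeout=180)
    except Exception:
        return []

async def ingest(actor_specs):
    # all actors fire concurrently; total latency tracks the slowest survivor
    return await asyncio.gather(*(safe_call(n, p) for n, p in actor_specs))
```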

Phase 2 — causal graph construction

Results from all actors are merged into a typed causal graph (CausalGraph). Nodes are classified by domain signals:

  • Biomedical results containing "trial", "treatment", "therapy", or "drug" → intervention nodes
  • Other biomedical results → outcome nodes
  • Wikipedia articles → confounder nodes (background knowledge)
  • NIH grants → instrument nodes (funding as instrumental variable)
  • Clinical trial records → intervention nodes
  • Regulatory and economic results → confounder and variable nodes respectively

Edges are built from domain heuristics: interventions connect to outcomes with causal weights; confounders connect to both interventions and outcomes; instruments connect to their intervention targets. Variable-to-variable edges are oriented by the additive noise model: HSIC(residual, X) vs HSIC(residual, Y) determines direction.

Phase 3 — algorithm application

The requested algorithm is applied to the constructed graph:

  • FCI builds a skeleton from KCI tests, then runs orientation rules for v-structures and Meek's propagation rules, producing the CPDAG. GES refines via BIC-scored forward/backward/turning phases. The additive noise model resolves remaining unoriented edges via HSIC.
  • Do-calculus applies Rules 1-3 iteratively, testing back-door and front-door criteria against the graph topology. The ID algorithm determines identifiability. Balke-Pearl LP bounds are computed for non-identifiable effects.
  • Twin network duplicates the graph, wires shared exogenous nodes, then propagates structural equations through both copies to compute PN and PS.
  • TMLE initialises Q⁰ via Super Learner, estimates propensity scores with truncation, constructs the clever covariate H, fits epsilon via logistic regression, and computes ATE from the targeted Q*.
  • RotatE initialises entity embeddings, applies unit-modulus rotational updates via self-adversarial negative sampling, and reports MRR and Hits@10.
  • Sheaf cohomology constructs coboundary matrices δ₀ and δ₁ from the graph's incidence structure, computes ker(δ₁)/im(δ₀), and maps violations to specific causal assumption failures.
  • Shapley enumerates all 2^n subsets of the source coalition, computes graph quality for each, and applies the Shapley formula. Nucleolus is found via lexicographic minimax excess optimisation.

Phase 4 — structured response

Results are serialised to JSON and returned via the MCP protocol. Every response includes nodeCount and edgeCount from the constructed graph, plus the algorithm-specific metrics.

Tips for best results

  1. Start with discover_causal_structure before interventional tools. The FCI/GES structure output tells you which adjustment sets are valid for do-calculus. Running compute_interventional_effects without knowing the graph structure risks incorrect confounder adjustment.

  2. Use academic + biomedical sources as your baseline. These two categories trigger 10 actors and cover the densest evidence base. Add regulatory for policy questions, economic for macroeconomic analyses, and safety for product harm or financial misconduct queries.

  3. For counterfactual and legal causation work, use check_graph_consistency first. The sheaf cohomology check confirms whether the graph satisfies causal sufficiency and instrument validity — two assumptions that simulate_counterfactuals relies on for valid PN/PS estimates.

  4. Run attribute_source_contribution with all five sources to get meaningful Shapley values. With fewer than three sources, the coalition game has too few subsets to produce stable marginal contributions. The nucleolus calculation requires at least three active players.

  5. For systematic reviews, extract_causal_claims_literature with academic + biomedical is the most cost-effective entry point at $0.035 per call. Use the returned conflicting claim pairs to identify which relationships need deeper structure discovery or TMLE estimation.

  6. Phrase queries as domain-variable pairs for best graph construction: "smoking lung cancer" rather than "does smoking cause cancer?" The graph builder identifies causal nodes from result titles, and specific entity names produce cleaner node classification.

  7. For rare or niche topics, select the full academic category so arXiv and CORE are included — preprint servers often carry causal evidence earlier than indexed journals in fast-moving research areas.

  8. Combine tools in a pipeline for full causal analysis: discover_causal_structure → check_graph_consistency → compute_interventional_effects → estimate_causal_effect_tmle. Total pipeline cost: $0.18 per complete analysis.

Combine with other Apify MCP servers

| MCP Server | How to combine |
|------------|----------------|
| ryanclinton/market-microstructure-manipulation-mcp | Feed causal structure output into market microstructure analysis; Granger causality in that MCP complements Pearl-style do-calculus here |
| ryanclinton/litigation-intelligence-mcp | Use counterfactual PN/PS scores as inputs to pre-litigation risk scoring; necessary causation probability is a key legal standard |
| ryanclinton/open-source-supply-chain-risk-mcp | Use causal structure discovery to identify which OSS dependencies causally propagate vulnerabilities vs. correlate with them |
| ryanclinton/esg-risk-assessment-mcp | Combine regulatory causal graphs with ESG risk scoring to distinguish causal regulatory exposure from correlated industry effects |
| ryanclinton/drug-pipeline-intelligence-mcp | Feed TMLE treatment effect estimates into drug pipeline analysis to supplement trial data with observational causal evidence |

Limitations

  • No primary data access. This server analyses published literature, trial registries, and government databases. It does not access raw patient-level data, proprietary biobank records, or paywalled journal content.
  • Graph construction uses heuristic node classification, not ground-truth ontology mapping. Node types (intervention, outcome, confounder) are inferred from title text patterns, which can misclassify ambiguous entities.
  • Causal algorithms operate on the constructed proxy graph, not on the original numeric data. The FCI, GES, TMLE, and other algorithms produce relative estimates calibrated to the graph structure rather than estimates from primary observations.
  • TMLE requires sufficient node density to produce meaningful Super Learner estimates. Queries returning fewer than 20 nodes may produce wide confidence intervals.
  • RotatE embeddings are initialised fresh per call — there is no persistent knowledge graph that improves over time with repeated queries. Embedding quality scales with node count; sparse graphs produce lower MRR.
  • Sheaf cohomology results are sensitive to bidirectional edge prevalence. Graphs with many hidden-confounder edges (common in observational literature) will show positive H¹ dimension even for well-studied domains.
  • Source availability is not guaranteed. All 17 upstream actors call live public APIs. Outages, rate limiting, or temporary API changes at any source return empty arrays rather than errors, which reduces graph density but does not fail the request.
  • Regulatory and economic sources are US-centric. The Federal Register, Congress Bills, FRED, and CPSC cover US institutions. For international regulatory causal analysis, rely on academic and biomedical sources which have global coverage.

Integrations

  • Apify API — call the MCP server programmatically from Python, JavaScript, or any HTTP client using the Apify Actor API
  • Webhooks — trigger downstream workflows (Slack alerts, database writes, report generation) when a causal analysis completes
  • Zapier — connect causal discovery results to Google Sheets, HubSpot, Notion, or any of Zapier's 6,000+ apps without code
  • Make — build multi-step automation scenarios that chain causal discovery with data enrichment, notifications, and CRM updates
  • LangChain / LlamaIndex — use the MCP server as a causal reasoning tool within RAG pipelines and autonomous agent frameworks

❓ FAQ

How many data sources does a single causal discovery query touch? Up to 17 actors run in parallel depending on which source categories you select. The academic category triggers 6 actors (OpenAlex, Semantic Scholar, Crossref, arXiv, CORE, Wikipedia). biomedical triggers 4 (PubMed, ClinicalTrials.gov, NIH Grants, OpenFDA). regulatory triggers 3, economic 2, and safety 2. Selecting all five categories runs all 17 actors simultaneously.

How is this different from a standard literature review tool or RAG pipeline? Standard literature review tools return ranked documents. This server constructs a typed causal graph from those documents and applies formal causal inference algorithms — FCI, do-calculus, twin networks, TMLE — to extract directional causal relationships, not just associations. The output is mathematical causal structure, not retrieved text.

How fresh is the data returned? All data is fetched live at query time from each source API. There is no cached index. Results reflect the current state of OpenAlex, PubMed, FRED, and the other databases at the moment of the call.

Can I use only one or two source categories to reduce cost? Yes. Every tool accepts a sources array with any combination of academic, biomedical, regulatory, economic, and safety. Using only academic + biomedical is sufficient for most research questions and is the default for all tools except attribute_source_contribution.

What does a Shapley value of 0.4 for biomedical sources mean? It means biomedical data sources (PubMed, ClinicalTrials.gov, NIH Grants, OpenFDA) contribute 40% of the total causal graph quality, measured as the average marginal contribution of that data domain across all possible subsets of the five source categories.
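The attribution follows the standard Shapley formula: a category's value is its marginal contribution to graph quality, averaged over all subsets of the other categories. A self-contained sketch with a hypothetical additive quality function `v` (the real server scores graph quality from actual runs; the weights here are invented for illustration):

```python
from itertools import combinations
from math import factorial

CATEGORIES = ["academic", "biomedical", "regulatory", "economic", "safety"]

# Hypothetical per-category quality weights -- illustration only.
WEIGHTS = {"academic": 0.35, "biomedical": 0.40, "regulatory": 0.10,
           "economic": 0.08, "safety": 0.07}

def v(coalition: frozenset) -> float:
    """Toy graph-quality function: additive in the categories included."""
    return sum(WEIGHTS[c] for c in coalition)

def shapley(player: str) -> float:
    """Average marginal contribution of one category over all coalitions."""
    n = len(CATEGORIES)
    others = [c for c in CATEGORIES if c != player]
    total = 0.0
    for k in range(n):
        for subset in combinations(others, k):
            s = frozenset(subset)
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            total += weight * (v(s | {player}) - v(s))
    return total

values = {c: shapley(c) for c in CATEGORIES}
# Efficiency property: Shapley values sum to the grand coalition's quality.
assert abs(sum(values.values()) - v(frozenset(CATEGORIES))) < 1e-9
```

With an additive quality function the Shapley value simply equals the category's own weight; the interesting cases are superadditive functions, where cross-domain edges make a coalition worth more than the sum of its parts.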

Is it legal to use the data from these sources? All 17 sources are publicly available APIs and open government databases. PubMed, ClinicalTrials.gov, FDA, FRED, World Bank, and the others are free public resources. See Apify's guide on web scraping legality.

Can this replace a randomised controlled trial? No. TMLE and do-calculus provide observational causal inference, which relies on assumptions (no unmeasured confounding, positivity, consistency) that are untestable from data alone. The tools identify causal hypotheses and estimate effect sizes from observational evidence — they do not generate experimental evidence. The check_graph_consistency tool explicitly flags violations of causal sufficiency and other key assumptions.

How long does a typical tool call take? Most tool calls complete in 20–60 seconds. Time depends on source category selection — academic + biomedical (10 actors) typically takes 25–45 seconds; all five categories (17 actors) may take 45–90 seconds. Actor timeouts are set to 180 seconds per source.

Can I use this with a custom MCP client or agent framework? Yes. The server implements the standard MCP protocol at /mcp. Any client that supports MCP — including Cursor, Windsurf, Cline, custom Python MCP clients, or LangChain agent frameworks — can connect to https://ryanclinton--knowledge-graph-causal-discovery-mcp.apify.actor/mcp.

What happens if a data source is temporarily unavailable? Individual actor failures return empty arrays rather than propagating errors. The graph is built from available sources, and the causal algorithm runs on the reduced graph. The response always includes nodeCount and edgeCount so you can verify graph density and re-run with different sources if needed.
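One way to act on those counts client-side: check `nodeCount` and `edgeCount` (field names from the response contract described above) against thresholds of your choosing before trusting the result — a minimal sketch:

```python
def graph_is_dense_enough(result: dict, min_nodes: int = 20, min_edges: int = 10) -> bool:
    """Decide whether a graph thinned by a source outage is still worth interpreting.

    Thresholds are arbitrary examples; tune them to your domain.
    """
    return (result.get("nodeCount", 0) >= min_nodes
            and result.get("edgeCount", 0) >= min_edges)

# Example: a response thinned by an upstream outage
result = {"nodeCount": 12, "edgeCount": 4}
if not graph_is_dense_enough(result):
    # Re-run with additional source categories to recover density.
    retry_sources = ["academic", "biomedical", "regulatory"]
```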

Can I run structure discovery and TMLE estimation on the same query to cross-validate results? Yes, and this is the recommended workflow for high-stakes analyses. discover_causal_structure identifies the graph topology and adjustment sets. estimate_causal_effect_tmle uses that topology to select valid confounders for the Super Learner and propensity model. Running both costs $0.16 ($0.08 per event for each tool).
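A sketch of that chaining, assuming the discovery response exposes candidate adjustment sets under an `adjustmentSets` field (an illustrative field name, not a documented contract, as are the `treatment`/`outcome`/`confounders` argument names):

```python
def build_tmle_arguments(discovery_result: dict, treatment: str, outcome: str) -> dict:
    """Feed a discovered adjustment set into a follow-up TMLE estimation call.

    Assumes the discovery response carries candidate adjustment sets under
    'adjustmentSets' -- a hypothetical field name used for illustration.
    """
    adjustment_sets = discovery_result.get("adjustmentSets", [])
    confounders = adjustment_sets[0] if adjustment_sets else []
    return {
        "treatment": treatment,
        "outcome": outcome,
        "confounders": confounders,
        "sources": ["academic", "biomedical"],
    }

# Mock discovery output, shaped per the assumptions above
discovery = {"adjustmentSets": [["age", "smoking_status"]],
             "nodeCount": 48, "edgeCount": 91}
args = build_tmle_arguments(discovery, "statin_use", "cardiovascular_event")
```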

Does the server support streaming responses for long-running queries? The server uses the Streamable HTTP transport from the MCP SDK, which supports streaming. MCP clients that implement streaming (including Claude Desktop) will receive incremental updates during long-running actor calls.

Help us improve

If you encounter unexpected results or errors, enable run sharing so we can diagnose issues faster:

  1. Go to Account Settings > Privacy
  2. Enable Share runs with public Actor creators

This lets us see your run details when something goes wrong. Your data is visible only to the actor developer, not publicly.

Support

Found a bug or need a feature? Open an issue in the Issues tab on this actor's page. For custom causal inference configurations, domain-specific ontology integration, or enterprise deployments, reach out through the Apify platform.

How it works

01

Configure

Set your parameters in the Apify Console or pass them via API.

02

Run

Click Start, trigger a run via API or webhook, or set up a schedule.

03

Get results

Download as JSON, CSV, or Excel. Integrate with 1,000+ apps.

Use cases

Researchers

Discover directed causal structure and adjustment sets from the published literature.

Data Scientists

Estimate treatment effects with TMLE and cross-validate them against FCI structure discovery.

Policy Analysts

Trace causal claims through US regulatory, economic, and safety data (Federal Register, FRED, CPSC).

Developers

Integrate via REST API or use as an MCP tool in AI workflows.

Ready to try Knowledge Graph Causal Discovery MCP?

Start for free on Apify. No credit card required.

Open on Apify Store