
Market Microstructure & Manipulation MCP

Market microstructure analysis and manipulation detection via an MCP server that orchestrates 14 financial data actors in parallel. Built for quantitative researchers, compliance teams, and algorithmic traders who need production-grade econometric methods — Hawkes processes, BOCPD, spectral transfer entropy, Granger causality, and game-theoretic surveillance optimization — without managing infrastructure.

Pricing

Pay-per-event model: you only pay for what you use.

| Event | Description | Price |
|---|---|---|
| simulate-order-book-dynamics | Queue-reactive Hawkes process simulation | $0.10 |
| detect-spoofing-manipulation | BOCPD with Dirichlet-multinomial prior | $0.10 |
| measure-cross-asset-information | Spectral transfer entropy analysis | $0.08 |
| decompose-spread-components | MRR adverse selection decomposition | $0.06 |
| identify-insider-abnormal-flow | Hasbrouck information share via Johansen VECM | $0.08 |
| discover-manipulation-causality | LASSO Granger with debiased inference | $0.08 |
| classify-market-regimes | Student-t HMM with Viterbi decoding | $0.06 |
| optimize-surveillance-strategy | Game-theoretic optimal surveillance design | $0.10 |

Example at $0.10 per event: 100 events = $10.00 · 1,000 events = $100.00

Connect to your AI agent

Add this MCP server to Claude Desktop, Cursor, Windsurf, or any MCP-compatible client.

MCP Endpoint
https://ryanclinton--market-microstructure-manipulation-mcp.apify.actor/mcp
Claude Desktop Config
{
  "mcpServers": {
    "market-microstructure-manipulation-mcp": {
      "url": "https://ryanclinton--market-microstructure-manipulation-mcp.apify.actor/mcp"
    }
  }
}

Documentation


Connect any MCP-compatible AI client to analyze order book dynamics, detect spoofing and layering, measure cross-asset information flow, decompose bid-ask spreads, identify abnormal insider flows, discover causal manipulation networks, classify market regimes, and optimize surveillance resource allocation. All eight tools run against live data assembled from Finnhub, SEC EDGAR, SEC insider filings, congressional stock disclosures, CoinGecko, ECB rates, FRED, BLS, and more.

What data can you access?

| Data Point | Source | Coverage |
|---|---|---|
| 📈 Stock prices and financials | Finnhub | US equities, quotes, earnings, company metrics |
| 📋 SEC regulatory filings | EDGAR | 10-K, 10-Q, 8-K, enforcement actions |
| 👤 Insider transactions | SEC Form 4 | Officer and director buys, sells, option exercises |
| 🏛️ Congressional stock trades | STOCK Act tracker | Senate and House member trade disclosures |
| 🪙 Cryptocurrency markets | CoinGecko | 10,000+ coins, prices, volumes, market caps |
| 💱 Currency exchange rates | Exchange Rate Tracker | 150+ live currency pairs |
| 🇪🇺 ECB reference rates | ECB Exchange Rates | Daily EUR reference rates for FX microstructure |
| 📉 Federal Reserve indicators | FRED | FEDFUNDS, VIX, GDP, macroeconomic series |
| 📊 Labor market statistics | BLS | CPI, unemployment, PPI for regime context |
| 📰 Federal regulations | Federal Register | SEC and CFTC rulemakings and enforcement orders |
| 💬 Tech community signals | Hacker News | Market-relevant technology and finance discussion |
| 🔍 Website change monitoring | Website Change Monitor | Regulatory and corporate page updates |
| 📩 Consumer complaints | CFPB | Financial product complaint trends |
| 📅 Historical exchange rates | Exchange Rate History | Long-term FX time series for trend analysis |

Why use Market Microstructure & Manipulation MCP?

Building market surveillance infrastructure from scratch means integrating 10+ financial APIs and implementing BOCPD, Hawkes processes, VECM cointegration, and game-theoretic LP solvers — then maintaining all of it as the APIs change. Most academic implementations run locally, require Python environments, and break in production.

This MCP server handles all of that. Send a natural-language query from Claude, Cursor, or any MCP client and receive structured analytical output backed by live financial data and peer-reviewed econometric methods.

  • Scheduling — run nightly manipulation scans and regime checks on a recurring schedule to detect emerging patterns
  • API access — trigger runs programmatically from Python, JavaScript, or any HTTP client via the Apify API
  • Parallel data fetching — up to 8 actors execute simultaneously per tool call, assembling market data in seconds rather than minutes
  • Monitoring — receive Slack or email alerts when spoofing is detected or regime transitions occur via webhooks
  • Integrations — connect to Zapier, Make, or any webhook-compatible service for automated surveillance notifications

Features

  • Queue-reactive Hawkes process estimation — multi-dimensional intensity lambda_d(t) = mu_d + sum over prior events s < t of alpha_{dd'} * exp(-beta * (t - s)), estimated via an EM algorithm; branching ratio computed as the spectral radius of the alpha/beta matrix via power iteration (100 iterations)
  • Bayesian Online Changepoint Detection (BOCPD) — maintains run length posterior P(r_t | x_{1:t}) with normal-inverse-gamma conjugate prior and constant hazard function; classifies detected changepoints into 4 manipulation patterns: layering, spoofing, wash trading, momentum ignition
  • Spectral transfer entropy — directed information flow TE(f) = 1/(4pi) ln(S_Y(f)/S_{Y|X}(f)) computed from AR spectral estimates across asset pairs; identifies which markets lead and which follow
  • Johansen VECM with Hasbrouck information share — price discovery attribution via Cholesky decomposition of the innovation covariance matrix from a vector error correction model
  • MRR spread decomposition — Madhavan-Richardson-Roomans model separates bid-ask spread into adverse selection, inventory cost, and order processing components; Kyle lambda computed via OLS regression on signed order flow
  • Roll measure and Amihud illiquidity — Roll measure derived from return autocovariance; Amihud ratio measures price impact per unit of trading volume
  • Event study CAR methodology — Cumulative Abnormal Return computed in [-5, +30] event window around insider and congressional transactions; t-statistics for significance; HHI-based information share concentration
  • LASSO-penalized Granger causality — coordinate descent with soft-thresholding selects relevant VAR lags; debiased coefficients with confidence intervals; F-statistics for Granger causality test
  • Student-t Hidden Markov Model — 4-state HMM (calm, volatile, crisis, recovery) with fat-tail Student-t emissions; EM forward-backward algorithm for parameter estimation; Viterbi algorithm for MAP path decoding
  • Extensive-form game theory — regulator vs. manipulator modeled as zero-sum game; Nash equilibrium computed via fictitious play (200 iterations); outputs optimal budget allocation, game value, detection probability, and deterrence effect
  • Gaussian elimination linear solver — partial pivoting for numerically stable solutions in OLS and game-theory computations
  • 14 parallel data sources — Finnhub, CoinGecko, SEC EDGAR, SEC insider, congressional tracker, exchange rates, ECB, FRED, BLS, Federal Register, Hacker News, website monitor, CFPB, exchange rate history
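To make the branching-ratio computation concrete, here is a minimal pure-Python sketch of power iteration applied to a toy 2x2 alpha/beta ratio matrix. The matrix values are illustrative, not taken from the server:

```python
def spectral_radius(matrix, iters=100):
    """Estimate the spectral radius (largest |eigenvalue|) of a square
    matrix via power iteration, as used for the Hawkes branching ratio."""
    n = len(matrix)
    v = [1.0] * n
    radius = 0.0
    for _ in range(iters):
        # One matrix-vector product, then renormalize by the largest entry
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        radius = max(abs(x) for x in w)
        if radius == 0.0:
            return 0.0
        v = [x / radius for x in w]
    return radius

# Toy alpha/beta ratio matrix for a 2-dimensional Hawkes process
ratio = [[0.4, 0.2],
         [0.1, 0.3]]
print(spectral_radius(ratio))  # -> 0.5 (largest eigenvalue; < 1.0 means subcritical)
```

A branching ratio below 1.0 means each event triggers fewer than one follow-on event on average, so the process stays stable; the criticality index measures how close this value is to 1.0.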

Use cases for market microstructure analysis

Quantitative research and HFT strategy evaluation

Quantitative researchers testing high-frequency strategies need to understand order book dynamics before going live. The simulate_order_book_dynamics tool estimates Hawkes process parameters from real market data, returning the branching ratio and criticality index. A branching ratio approaching 1.0 signals a market near self-excitation criticality — a regime where HFT strategies face reflexivity risk. This replaces weeks of custom data pipeline work with a single tool call.

Compliance and market surveillance

Compliance teams at broker-dealers and exchanges need continuous spoofing and layering surveillance. The detect_spoofing_manipulation tool applies BOCPD to detect changepoints in order flow, classifies them by manipulation type, and returns confidence-scored alerts. Feeding these outputs into a daily webhook yields an automated surveillance pipeline without the cost of a proprietary surveillance system.

Regulatory economics and enforcement research

Regulators and enforcement economists investigating manipulation cases need to trace causal relationships between market variables. The discover_manipulation_causality tool applies LASSO-penalized Granger causality with F-statistics and debiased coefficients, returning a directed causal network with p-values. Combine this with the identify_insider_abnormal_flow tool to build an evidence package tracing information leakage from insider trades through to price movements.

Portfolio risk management and regime allocation

Portfolio managers using factor models need to know the current market regime before sizing positions. The classify_market_regimes tool uses a Student-t HMM to classify current conditions as calm, volatile, crisis, or recovery, and returns transition probabilities and expected regime durations. This feeds directly into volatility-targeted allocation frameworks.

Cross-asset price discovery research

Academic researchers studying information transmission across asset classes can use measure_cross_asset_information to quantify directed transfer entropy and Hasbrouck information shares across equities, crypto, and FX simultaneously. The spectral decomposition identifies which frequencies carry the most information flow.

Surveillance budget optimization

Regulatory agencies with finite enforcement budgets need to allocate resources for maximum deterrence. The optimize_surveillance_strategy tool models the regulator-manipulator interaction as an extensive-form zero-sum game, computing Nash equilibrium action probabilities and returning the optimal budget allocation with detection probability and deterrence effect estimates.
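The fictitious-play approach used for the Nash equilibrium can be sketched in a few lines. The payoff matrix below is a toy zero-sum game (matching pennies), not the server's actual surveillance game:

```python
def fictitious_play(payoff, iters=200):
    """Fictitious play for a zero-sum matrix game: the row player maximizes,
    the column player minimizes. Returns approximate Nash mixed strategies.
    Toy sketch for illustration, not the server's implementation."""
    rows, cols = len(payoff), len(payoff[0])
    row_counts = [0] * rows
    col_counts = [0] * cols
    row_counts[0] += 1
    col_counts[0] += 1
    for _ in range(iters):
        # Row best-responds to the column player's empirical mixture
        row_vals = [sum(payoff[i][j] * col_counts[j] for j in range(cols))
                    for i in range(rows)]
        row_counts[row_vals.index(max(row_vals))] += 1
        # Column best-responds to the row player's empirical mixture
        col_vals = [sum(payoff[i][j] * row_counts[i] for i in range(rows))
                    for j in range(cols)]
        col_counts[col_vals.index(min(col_vals))] += 1
    n_r, n_c = sum(row_counts), sum(col_counts)
    return [c / n_r for c in row_counts], [c / n_c for c in col_counts]

# Matching pennies: the unique equilibrium mixes both actions at 0.5
strat_r, strat_c = fictitious_play([[1, -1], [-1, 1]])
print(strat_r, strat_c)  # both near [0.5, 0.5]
```

In the server's game the empirical mixtures are read off as surveillance action probabilities, and the budget allocation follows from weighting action costs by those probabilities.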

How to use the Market Microstructure & Manipulation MCP

  1. Get your Apify API token — sign up at apify.com and copy your token from Account Settings. The Free plan includes $5 of monthly credits.
  2. Add the MCP server to your client — paste the server URL into your MCP client configuration (see connection examples below). No software to install.
  3. Send a market query — ask your AI client something like "Analyze AAPL for spoofing patterns" or "Classify the current crypto market regime." The server routes your query to the appropriate tool.
  4. Receive structured analysis — results return as structured JSON with algorithm outputs, confidence metrics, alert classifications, and supporting market data from live sources.

MCP tools

| Tool | Price | Actors | Description |
|---|---|---|---|
| simulate_order_book_dynamics | $0.045 | 4 | Hawkes process estimation with branching ratio, criticality index, and queue imbalance |
| detect_spoofing_manipulation | $0.050 | 6 | BOCPD spoofing detection with manipulation pattern classification and confidence scores |
| measure_cross_asset_information | $0.040 | 5 | Spectral transfer entropy and Hasbrouck information share from Johansen VECM |
| decompose_spread_components | $0.035 | 3 | MRR spread decomposition: adverse selection, inventory, order processing, Kyle lambda |
| identify_insider_abnormal_flow | $0.045 | 4 | CAR event study [-5,+30] for insider and congressional trades with t-statistics |
| discover_manipulation_causality | $0.040 | 5 | LASSO Granger causality network with F-statistics, p-values, and debiased coefficients |
| classify_market_regimes | $0.035 | 4 | Student-t HMM with Viterbi path: calm, volatile, crisis, recovery classification |
| optimize_surveillance_strategy | $0.040 | 8 | Extensive-form game theory: Nash equilibrium surveillance budget allocation |

Tool parameters

Each tool accepts the same two parameters, passed at call time from your MCP client:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | Yes | (none) | Market query: ticker, asset class, sector, or company name (e.g., "AAPL", "crypto", "energy sector", "Nancy Pelosi") |
| max_results | number | No | 40–50 | Maximum results to fetch per actor. Available on simulate_order_book_dynamics and measure_cross_asset_information. Reduce for faster, lower-cost queries. |
| hazard_lambda | number | No | 100 | Expected run length between changepoints for BOCPD. Available on detect_spoofing_manipulation. Higher values produce fewer, higher-confidence changepoints. |

Connection tips

  • Start with classify_market_regimes — it is the cheapest tool ($0.035) and gives immediate context on current market conditions before running more expensive analyses.
  • Use detect_spoofing_manipulation before optimize_surveillance_strategy — the surveillance optimizer runs BOCPD and HMM internally as prerequisites, so they share the data fetch cost.
  • Tune hazard_lambda for your use case — hazard_lambda: 200 for weekly regime scans reduces false positives; hazard_lambda: 50 for intraday monitoring increases sensitivity.
  • Batch queries by asset class — querying "US tech sector" returns broader data across multiple tickers than querying "MSFT" alone, making the cross-asset tools more informative.

How to connect this MCP server

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "market-microstructure-manipulation": {
      "url": "https://market-microstructure-manipulation-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}

Cursor, Windsurf, or Cline

Add the server URL to your MCP settings panel:

https://market-microstructure-manipulation-mcp.apify.actor/mcp

Set the Authorization header to Bearer YOUR_APIFY_TOKEN.

Python

import json
import urllib.request

# The MCP server is exposed via the Apify Standby URL. Below is a direct
# HTTP (JSON-RPC 2.0) call; the same endpoint works from Claude or any
# MCP-compatible framework.

payload = json.dumps({
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
        "name": "detect_spoofing_manipulation",
        "arguments": {
            "query": "AAPL",
            "hazard_lambda": 100
        }
    },
    "id": 1
}).encode()

req = urllib.request.Request(
    "https://market-microstructure-manipulation-mcp.apify.actor/mcp",
    data=payload,
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
    }
)

with urllib.request.urlopen(req) as response:
    result = json.loads(response.read())
    tool_result = json.loads(result["result"]["content"][0]["text"])
    print(f"Detected: {tool_result['totalDetected']} manipulation events")
    print(f"Changepoints: {tool_result['changepointCount']}")
    for alert in tool_result.get("alerts", []):
        print(f"  [{alert['pattern']}] {alert['asset']} confidence={alert['confidence']:.2f}")

JavaScript

const payload = {
    jsonrpc: "2.0",
    method: "tools/call",
    params: {
        name: "classify_market_regimes",
        arguments: { query: "SPY crypto macro" }
    },
    id: 1
};

const response = await fetch(
    "https://market-microstructure-manipulation-mcp.apify.actor/mcp",
    {
        method: "POST",
        headers: {
            "Content-Type": "application/json",
            "Authorization": "Bearer YOUR_APIFY_TOKEN"
        },
        body: JSON.stringify(payload)
    }
);

const data = await response.json();
const result = JSON.parse(data.result.content[0].text);

console.log(`Current regime: ${result.currentRegime}`);
console.log(`Stationary distribution:`, result.stationaryDistribution);
console.log(`Expected durations:`, result.expectedDuration);
for (const regime of result.regimes.slice(0, 3)) {
    console.log(`  ${regime.regime}: prob=${regime.probability.toFixed(3)}, vol=${regime.volatility.toFixed(4)}`);
}

cURL

# Detect spoofing on a specific ticker
curl -X POST "https://market-microstructure-manipulation-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "detect_spoofing_manipulation",
      "arguments": {
        "query": "AAPL",
        "hazard_lambda": 100
      }
    },
    "id": 1
  }'

# Decompose bid-ask spread
curl -X POST "https://market-microstructure-manipulation-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "decompose_spread_components",
      "arguments": { "query": "SPY" }
    },
    "id": 2
  }'

# List all available tools
curl -X POST "https://market-microstructure-manipulation-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{"jsonrpc":"2.0","method":"tools/list","params":{},"id":3}'

Output examples

detect_spoofing_manipulation output

{
  "totalDetected": 3,
  "changepointCount": 7,
  "maxRunLength": 42,
  "falsePositiveRate": 0.082,
  "alerts": [
    {
      "timestamp": 1711027200000,
      "asset": "AAPL",
      "pattern": "layering",
      "confidence": 0.87,
      "runLength": 12,
      "hazardRate": 0.01,
      "posteriorChangeProb": 0.923
    },
    {
      "timestamp": 1711040400000,
      "asset": "AAPL",
      "pattern": "momentum_ignition",
      "confidence": 0.74,
      "runLength": 5,
      "hazardRate": 0.01,
      "posteriorChangeProb": 0.761
    }
  ]
}
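A client-side sketch of consuming this output, filtering for high-confidence alerts before acting on them. The 0.8 threshold is an arbitrary choice, and the sample JSON is abridged from the example above:

```python
import json

# Sample detect_spoofing_manipulation output (abridged from above)
sample = """
{
  "totalDetected": 3,
  "changepointCount": 7,
  "alerts": [
    {"asset": "AAPL", "pattern": "layering", "confidence": 0.87},
    {"asset": "AAPL", "pattern": "momentum_ignition", "confidence": 0.74}
  ]
}
"""
result = json.loads(sample)

# Keep only alerts above a confidence threshold before raising notifications
high_confidence = [a for a in result["alerts"] if a["confidence"] >= 0.8]
for alert in high_confidence:
    print(f"[{alert['pattern']}] {alert['asset']} confidence={alert['confidence']:.2f}")
# -> [layering] AAPL confidence=0.87
```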

classify_market_regimes output

{
  "currentRegime": "volatile",
  "transitionMatrix": [
    [0.92, 0.06, 0.01, 0.01],
    [0.08, 0.81, 0.09, 0.02],
    [0.02, 0.11, 0.84, 0.03],
    [0.04, 0.08, 0.05, 0.83]
  ],
  "stationaryDistribution": [0.38, 0.31, 0.18, 0.13],
  "expectedDuration": {
    "calm": 12.5,
    "volatile": 5.3,
    "crisis": 6.3,
    "recovery": 5.9
  },
  "regimeCount": 4,
  "viterbiPath": ["calm", "calm", "volatile", "volatile", "crisis", "recovery"],
  "regimes": [
    {
      "period": "2024-03-21",
      "regime": "volatile",
      "probability": 0.81,
      "mean": -0.0012,
      "volatility": 0.0184,
      "degreesOfFreedom": 4.2
    }
  ]
}
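The expectedDuration values above can be reproduced from the transition matrix: for a Markov chain, the expected sojourn time in state i before leaving is 1 / (1 - P[i][i]):

```python
# Transition matrix from the classify_market_regimes example above
transition = [
    [0.92, 0.06, 0.01, 0.01],
    [0.08, 0.81, 0.09, 0.02],
    [0.02, 0.11, 0.84, 0.03],
    [0.04, 0.08, 0.05, 0.83],
]
states = ["calm", "volatile", "crisis", "recovery"]

# Expected periods spent in state i before a transition out of it
durations = {s: 1 / (1 - transition[i][i]) for i, s in enumerate(states)}
print(durations)  # consistent with the expectedDuration field above
```

This gives calm = 12.5, volatile ≈ 5.26, crisis ≈ 6.25, recovery ≈ 5.88, matching the expectedDuration field once rounded.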

decompose_spread_components output

{
  "averageSpread": 0.00312,
  "averageLambda": 0.00847,
  "rollMeasure": 0.00154,
  "amihudIlliquidity": 0.000213,
  "components": [
    {
      "asset": "AAPL",
      "adverseSelection": 0.00142,
      "inventoryCost": 0.00089,
      "orderProcessing": 0.00081,
      "totalSpread": 0.00312,
      "kyleLambda": 0.00847,
      "informedFraction": 0.454
    }
  ]
}
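One useful sanity check on this output is that the three MRR components sum to the total spread, which holds for the sample above:

```python
import json

# Components from the decompose_spread_components example above
sample = """
{
  "components": [
    {"asset": "AAPL",
     "adverseSelection": 0.00142,
     "inventoryCost": 0.00089,
     "orderProcessing": 0.00081,
     "totalSpread": 0.00312}
  ]
}
"""
c = json.loads(sample)["components"][0]

# The three MRR components should account for the full quoted spread
parts = c["adverseSelection"] + c["inventoryCost"] + c["orderProcessing"]
assert abs(parts - c["totalSpread"]) < 1e-9
print("components sum to totalSpread:", round(parts, 5))
```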

identify_insider_abnormal_flow output

{
  "significantCount": 4,
  "averageCAR": 0.0312,
  "informationShareConcentration": 0.641,
  "totalFlows": 12,
  "flows": [
    {
      "insider": "Timothy D. Cook",
      "asset": "AAPL",
      "tradeDate": "2024-03-15",
      "cumulativeAbnormalReturn": 0.0847,
      "tStatistic": 2.94,
      "eventWindow": [-5, 30],
      "informationShare": 0.182
    }
  ]
}

Output fields

simulate_order_book_dynamics

| Field | Type | Description |
|---|---|---|
| branchingRatio | number | Spectral radius of alpha/beta matrix. Values >0.9 indicate near-criticality |
| criticalityIndex | number | Distance from criticality boundary (0=stable, 1=critical) |
| queueImbalance | number | Bid-minus-ask depth normalized by total depth |
| meanIntensity | number[] | Mean event intensity per dimension |
| kernelNorms | number[][] | DxD matrix of Hawkes kernel L1 norms |
| hawkesParams.mu | number[] | Baseline intensities per dimension |
| hawkesParams.branchingRatio | number | Branching ratio from EM-estimated parameters |
| hawkesParams.logLikelihood | number | Log-likelihood of estimated Hawkes model |
| snapshotCount | number | Number of order book snapshots analyzed |
| snapshots | OrderBookState[] | Up to 10 order book states with bid/ask depths, mid price, spread, imbalance |

detect_spoofing_manipulation

| Field | Type | Description |
|---|---|---|
| totalDetected | number | Count of manipulation events classified |
| changepointCount | number | Total BOCPD changepoints detected |
| maxRunLength | number | Longest stable run between changepoints |
| falsePositiveRate | number | Estimated false positive rate |
| alerts[].timestamp | number | Unix milliseconds of detected event |
| alerts[].asset | string | Asset symbol associated with the alert |
| alerts[].pattern | string | One of: layering, spoofing, wash_trading, momentum_ignition |
| alerts[].confidence | number | Classification confidence [0,1] |
| alerts[].runLength | number | Run length at changepoint |
| alerts[].posteriorChangeProb | number | Posterior probability of regime change at this point |

measure_cross_asset_information

| Field | Type | Description |
|---|---|---|
| totalInformationFlow | number | Sum of directed transfer entropy across all asset pairs |
| dominantFrequency | number | Frequency carrying the most information flow |
| hasbrouckShare | Record<string, number> | Price discovery attribution share per asset |
| channels[].source | string | Source asset in the information flow pair |
| channels[].target | string | Target asset receiving information |
| channels[].spectralTE | number | Spectral transfer entropy value |
| channels[].peakFrequency | number | Frequency at which TE peaks |
| channels[].significance | number | Statistical significance of the channel |
| channels[].lag | number | Estimated lead-lag in periods |

decompose_spread_components

| Field | Type | Description |
|---|---|---|
| averageSpread | number | Mean bid-ask spread across analyzed assets |
| averageLambda | number | Mean Kyle lambda (price impact coefficient) |
| rollMeasure | number | Roll measure from return autocovariance |
| amihudIlliquidity | number | Amihud illiquidity ratio |
| components[].adverseSelection | number | Adverse selection component of spread |
| components[].inventoryCost | number | Inventory holding cost component |
| components[].orderProcessing | number | Order processing cost component |
| components[].kyleLambda | number | Price impact per unit signed order flow |
| components[].informedFraction | number | Estimated fraction of informed traders |

identify_insider_abnormal_flow

| Field | Type | Description |
|---|---|---|
| significantCount | number | Flows with t-statistic above significance threshold |
| averageCAR | number | Mean cumulative abnormal return across all flows |
| informationShareConcentration | number | HHI of information share (0=dispersed, 1=concentrated) |
| flows[].insider | string | Name of insider or member of Congress |
| flows[].cumulativeAbnormalReturn | number | CAR over [-5, +30] event window |
| flows[].tStatistic | number | t-statistic for CAR significance |
| flows[].eventWindow | [number, number] | Event window bounds in days relative to trade date |
| flows[].informationShare | number | Fraction of price discovery attributed to this flow |

discover_manipulation_causality

| Field | Type | Description |
|---|---|---|
| networkDensity | number | Fraction of possible causal links that are significant |
| lassoSelectedCount | number | Number of links surviving LASSO soft-thresholding |
| strongestLink | CausalLink or null | Link with highest Granger F-statistic |
| links[].cause | string | Variable driving causation |
| links[].effect | string | Variable receiving causal influence |
| links[].grangerFStat | number | F-statistic from Granger causality test |
| links[].pValue | number | p-value for Granger causality |
| links[].debiasedCoeff | number | Debiased VAR coefficient |
| links[].confidenceInterval | [number, number] | 95% confidence interval for coefficient |
| links[].lag | number | Lag at which causality operates |

classify_market_regimes

| Field | Type | Description |
|---|---|---|
| currentRegime | string | Current state: calm, volatile, crisis, or recovery |
| transitionMatrix | number[][] | 4x4 Markov transition probability matrix |
| stationaryDistribution | number[] | Long-run probability of each regime |
| expectedDuration | Record<string, number> | Expected periods in each regime before transition |
| viterbiPath | string[] | MAP sequence of regime states |
| regimes[].volatility | number | Regime-specific volatility from Student-t emission |
| regimes[].degreesOfFreedom | number | Student-t degrees of freedom (lower = fatter tails) |

optimize_surveillance_strategy

| Field | Type | Description |
|---|---|---|
| gameValue | number | Nash equilibrium value of the surveillance game |
| detectionProbability | number | Probability of detecting manipulation under optimal strategy |
| deterrenceEffect | number | Reduction in manipulation probability from optimal surveillance |
| optimalBudgetAllocation | Record<string, number> | Budget fraction per surveillance action |
| actions[].action | string | Surveillance action label |
| actions[].nashEquilibriumProb | number | Nash equilibrium probability of selecting this action |
| actions[].expectedDetection | number | Expected detection rate from this action |
| actions[].resourceCost | number | Resource cost of this action |
| spoofingAlertCount | number | Alert count from prerequisite BOCPD run |
| currentRegime | string | Current market regime from prerequisite HMM run |

How much does it cost to run market microstructure analysis?

This MCP server uses pay-per-event pricing — you pay per tool call. Platform compute costs are included.

| Tool | Price per call | 10 calls | 50 calls |
|---|---|---|---|
| decompose_spread_components | $0.035 | $0.35 | $1.75 |
| classify_market_regimes | $0.035 | $0.35 | $1.75 |
| measure_cross_asset_information | $0.040 | $0.40 | $2.00 |
| discover_manipulation_causality | $0.040 | $0.40 | $2.00 |
| optimize_surveillance_strategy | $0.040 | $0.40 | $2.00 |
| simulate_order_book_dynamics | $0.045 | $0.45 | $2.25 |
| identify_insider_abnormal_flow | $0.045 | $0.45 | $2.25 |
| detect_spoofing_manipulation | $0.050 | $0.50 | $2.50 |

The Apify Free plan includes $5 of monthly platform credits — enough for approximately 100–140 tool calls per month using the standard mix of tools. You can set a maximum spending limit per run to control costs. The server stops when your budget is reached.
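The 100–140 figure follows directly from the per-call prices; a quick sketch of the arithmetic:

```python
# Per-call prices from the table above
PRICES = {
    "decompose_spread_components": 0.035,
    "classify_market_regimes": 0.035,
    "measure_cross_asset_information": 0.040,
    "discover_manipulation_causality": 0.040,
    "optimize_surveillance_strategy": 0.040,
    "simulate_order_book_dynamics": 0.045,
    "identify_insider_abnormal_flow": 0.045,
    "detect_spoofing_manipulation": 0.050,
}

def calls_within_budget(price, budget=5.00):
    """How many calls at a given price fit in the monthly credit budget."""
    return int(budget / price + 1e-9)  # epsilon guards float rounding

print(calls_within_budget(max(PRICES.values())))  # -> 100 (most expensive tool)
print(calls_within_budget(min(PRICES.values())))  # -> 142 (cheapest tool)
```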

Compared to Bloomberg Terminal API ($24,000/year) or Refinitiv Eikon ($22,000/year) for similar financial data access and analytical capabilities, this approach costs a fraction of a cent per analysis and requires no long-term commitment.

How Market Microstructure & Manipulation MCP works

Phase 1: Parallel data assembly

When a tool call arrives, the server dispatches between 3 and 8 actor calls simultaneously using Promise.all. Each actor runs in a separate Apify container with 256 MB memory and a 120-second timeout. Data from Finnhub (stock prices, fundamentals), CoinGecko (crypto markets), SEC EDGAR (filings), SEC insider Form 4 data, congressional stock disclosures, exchange rates, ECB rates, FRED macro indicators, and BLS labor data are assembled into a common time series format using extractTimeSeries — a multi-key extractor that resolves price, value, rate, close, amount, volume, and count fields from heterogeneous actor outputs.
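The server itself is Node-based (hence Promise.all), but the fan-out pattern is easy to sketch. Here is an analogous Python version; fetch_actor is a hypothetical stand-in for one Apify actor call, used only for illustration:

```python
import asyncio

async def fetch_actor(name: str, query: str) -> dict:
    """Stand-in for one Apify actor call (the real call is an HTTP request)."""
    await asyncio.sleep(0)  # placeholder for the network round-trip
    return {"actor": name, "series": [101.2, 101.4, 101.1]}

async def assemble(query: str, actors: list[str]) -> dict:
    # Dispatch every actor call at once and wait for all of them,
    # mirroring the server's Promise.all fan-out
    results = await asyncio.gather(*(fetch_actor(a, query) for a in actors))
    return {r["actor"]: r["series"] for r in results}

data = asyncio.run(assemble("AAPL", ["finnhub", "sec-edgar", "fred"]))
print(sorted(data))  # -> ['finnhub', 'fred', 'sec-edgar']
```

Because the calls run concurrently, total latency is bounded by the slowest actor rather than the sum of all actors.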

Phase 2: Algorithm application

Each tool applies a specific peer-reviewed econometric method:

  • simulate_order_book_dynamics — Encodes price changes as event arrivals in D dimensions (bounded to 3–12 based on data volume). Runs EM estimation: E-step computes parentage probabilities proportional to alpha * exp(-beta * delta_t); M-step updates baseline intensity mu, excitation alpha, and decay beta. Branching ratio is the spectral radius of the alpha/beta matrix computed via 100-iteration power iteration. Criticality index is the clipped distance from the stability boundary.

  • detect_spoofing_manipulation — Implements BOCPD with normal-inverse-gamma conjugate prior. At each timestep, the algorithm updates the run length posterior using the predictive probability of the observation under the conjugate model and the constant hazard function. Changepoints are declared where the posterior assigns the highest probability to run length zero. The return characteristics around each changepoint — mean, variance, autocorrelation — are used to classify the manipulation pattern.

  • measure_cross_asset_information — Computes spectral transfer entropy between all asset pairs using AR model-based spectral estimates. Hasbrouck information shares are computed from a Johansen VECM by taking the Cholesky decomposition of the innovation covariance matrix and computing the squared elements of the lower-triangular factor. The dominant frequency is identified at the spectral peak.

  • optimize_surveillance_strategy — Runs detectSpoofing and classifyRegimes as prerequisites, then builds the surveillance game matrix from the resulting alert counts and regime classification. Fictitious play over 200 iterations converges to the Nash equilibrium mixed strategy. The deterrence effect is computed as the reduction in manipulation probability when the regulator commits to the Nash equilibrium strategy.
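A minimal sketch of the BOCPD recursion used in the spoofing detector, simplified to a known-variance Gaussian conjugate model rather than the full normal-inverse-gamma prior; the data and parameters are illustrative:

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def bocpd_map_run_lengths(data, hazard=0.01, mu0=0.0, kappa0=1.0, sigma2=1.0):
    """Constant-hazard BOCPD with a known-variance Gaussian conjugate model
    (a simplification of the normal-inverse-gamma prior described above).
    Returns the MAP run length after each observation."""
    probs, counts, sums = [1.0], [0], [0.0]  # state per run-length hypothesis
    map_path = []
    for x in data:
        # Predictive density of x under each run length's posterior
        preds = []
        for n, s in zip(counts, sums):
            mu_n = (kappa0 * mu0 + s) / (kappa0 + n)
            var_n = sigma2 * (1.0 + 1.0 / (kappa0 + n))
            preds.append(normal_pdf(x, mu_n, var_n))
        # Growth: the run continues; change: a new run starts at length 0
        growth = [p * q * (1 - hazard) for p, q in zip(probs, preds)]
        change = sum(p * q * hazard for p, q in zip(probs, preds))
        probs = [change] + growth
        z = sum(probs)
        probs = [p / z for p in probs]
        counts = [0] + [n + 1 for n in counts]
        sums = [0.0] + [s + x for s in sums]
        map_path.append(probs.index(max(probs)))
    return map_path

# The mean shifts from ~0 to ~8 halfway through; the MAP run length should
# grow steadily, then collapse at the changepoint
series = [0.1, -0.2, 0.3, 0.0, 8.1, 7.9, 8.2, 8.0]
print(bocpd_map_run_lengths(series))
```

In the server, a collapse of the MAP run length to zero marks a candidate changepoint, and the surrounding return statistics are then used to classify the manipulation pattern.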

Phase 3: Structured response

Results are serialized as JSON and returned via the MCP protocol's CallToolResult format, with a single text content item containing the full structured output. The Express server handles the MCP transport layer, with CORS enabled for cross-origin clients and the Apify container readiness probe on GET /.

Tips for best results

  1. Use regime classification as a preamble. Call classify_market_regimes first with your target asset or sector. Current regime context improves interpretation of all downstream results — spoofing patterns behave differently in crisis vs. calm regimes.

  2. Tune hazard_lambda to your time horizon. The default hazard_lambda: 100 assumes roughly one regime change per 100 observations. For daily data spanning a year, this is appropriate. For intraday data, use hazard_lambda: 500 to avoid over-segmentation.

  3. Combine insider flow with causality discovery. Run identify_insider_abnormal_flow first, then feed the same query to discover_manipulation_causality. The causal network reveals which market variables are downstream of the insider trading detected in the event study.

  4. Use sector queries for cross-asset analysis. measure_cross_asset_information is most informative with broad queries like "US financials" or "crypto DeFi" that return diverse assets across data sources. Single-ticker queries return narrower cross-asset networks.

  5. The surveillance optimizer is most useful after a spoofing alert. When detect_spoofing_manipulation returns alerts, run optimize_surveillance_strategy with the same query. It runs BOCPD and HMM internally and uses those results to calibrate the game-theoretic model — so the surveillance allocation directly reflects current alert conditions.

  6. Lower max_results for faster exploratory queries. Setting max_results: 20 on simulate_order_book_dynamics or measure_cross_asset_information reduces data fetch time and cost. Use higher values (50–100) for production surveillance runs requiring higher statistical power.

  7. Persist results for longitudinal analysis. Store tool outputs in a database and run identical queries weekly. Changes in branching ratio, regime distribution, or causal network density over time reveal structural market shifts that single-point analyses miss.

Combine with other Apify actors

| Actor | How to combine |
|---|---|
| SEC EDGAR Filing Search | Run a targeted EDGAR search for enforcement actions after detect_spoofing_manipulation alerts to cross-reference historical prosecution patterns for the same asset |
| Congressional Stock Tracker | Feed congressional trade data directly to identify_insider_abnormal_flow for per-member CAR analysis; identify which members' trades precede the largest price movements |
| Finnhub Stock Data | Use as a standalone data source for pre-analysis data validation before sending to the MCP tools |
| FRED Economic Data | Pull macro time series (VIX, FEDFUNDS, yield curve) and overlay regime classification outputs to validate HMM state assignments against macro turning points |
| Website Change Monitor | Monitor SEC enforcement pages and company IR sites; feed change events as query context to detect_spoofing_manipulation for pre-announcement surveillance |
| Hacker News Search | Cross-reference Hacker News discussion sentiment with classify_market_regimes output to assess whether retail attention cycles align with HMM-identified regime transitions |
| CoinGecko Crypto Data | Use for standalone crypto market data pulls to validate simulate_order_book_dynamics branching ratio estimates against raw price series |

Limitations

  • No live Level 2 order book data — The Hawkes process and spread decomposition tools operate on price and volume data from Finnhub and CoinGecko, not tick-level bid and ask depth feeds. Results are approximations of microstructure dynamics rather than exact order book measurements.
  • No real-time streaming — This is a request-response MCP server, not a streaming feed. Each tool call fetches fresh data at call time; there is no continuous monitoring built into the server itself. Use Apify scheduling for recurring runs.
  • Actor timeout at 120 seconds — If an upstream data source is slow or returns errors, the relevant actor returns an empty array. The algorithms degrade gracefully with partial data, but outputs should be interpreted with appropriate caution when source counts are low.
  • Finnhub free tier limitations — Finnhub data depth depends on your Finnhub subscription. The actor uses the free tier by default, which may have rate limits and delayed data for some endpoints.
  • Hawkes EM convergence — EM estimation is initialized from the data directly. With very short time series (fewer than 20 observations per dimension), the branching ratio estimate may be unreliable. Use max_results: 50 or higher for stable estimates.
  • BOCPD hazard assumption — The constant hazard function assumes changepoints arrive at a fixed rate. Markets with clustered volatility (e.g., earnings seasons) will generate more changepoints than the model implies. Interpret absolute changepoint counts in context.
  • Congressional trade data latency — STOCK Act disclosures are filed within 30–45 days of the trade. Recent trades may not yet appear in the congressional tracker data, so identify_insider_abnormal_flow results for recent periods should be treated as incomplete.
  • No cross-run state — Each tool call is stateless. The server does not maintain a persistent order book or rolling window across calls. For longitudinal analysis, store results externally and compute changes in application code.
  • Game-theoretic model is simplified — The surveillance game uses a finite set of 4 actions with linear payoffs. Real regulatory games involve more complex information structures. Results are directionally informative rather than prescriptive.

Integrations

  • Apify API — Call the MCP endpoint programmatically from any language; use the Apify run API to store outputs in datasets for downstream processing
  • Webhooks — Configure webhooks to fire when a scheduled scan detects spoofing alerts or a regime transition occurs
  • Zapier — Route spoofing alerts from scheduled MCP runs into Slack, email, or Google Sheets workflows without writing code
  • Make — Build multi-step scenarios that trigger surveillance analysis on SEC EDGAR filing events and route results to compliance ticketing systems
  • LangChain / LlamaIndex — Integrate this MCP server into RAG pipelines where the AI agent needs live market microstructure context to answer quantitative finance questions
  • Claude Desktop — Direct MCP integration; ask Claude to detect manipulation patterns, classify regimes, or optimize surveillance budgets in natural language with no code required
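For programmatic integration, an MCP tool call is a JSON-RPC 2.0 request with method `tools/call`, per the MCP specification. The sketch below only builds the request payload; the argument names (`query`, `max_results`) are illustrative, so check the tool schemas before use. POST the payload to the endpoint with your Apify token in the `Authorization` header.

```python
import json

# Endpoint from the "Connect to your AI agent" section above.
MCP_URL = "https://ryanclinton--market-microstructure-manipulation-mcp.apify.actor/mcp"

def build_tool_call(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

payload = build_tool_call(
    "detect_spoofing_manipulation",
    {"query": "TSLA order flow", "max_results": 50},  # illustrative arguments
)
# POST json.dumps(payload) to MCP_URL with headers:
#   Authorization: Bearer <APIFY_TOKEN>
#   Content-Type: application/json
print(json.dumps(payload, indent=2))
```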

Troubleshooting

Empty alerts from detect_spoofing_manipulation despite expecting manipulation. The BOCPD algorithm requires sufficient variance in the price series to detect changepoints. If the queried asset has very low volatility in the current data window, the run length posterior will favor long stable runs. Try lowering hazard_lambda to 50 to increase sensitivity, or broaden the query to include more assets.

classify_market_regimes returning the same regime repeatedly. This occurs when the data from all 4 actors returns a homogeneous time series (e.g., all values near zero or all identical). Check that the query returns meaningful data by testing with a broad query like "SPY VIX bonds" that spans multiple asset classes and ensures diverse numeric series.

Tool calls timing out. If upstream actors (especially Finnhub or CoinGecko during high-traffic periods) are slow, the 120-second actor timeout may be reached. The server will return results with empty arrays from the timed-out sources. For time-sensitive use cases, use decompose_spread_components or classify_market_regimes, which use fewer actors and are less likely to time out.

optimize_surveillance_strategy returning uniform budget allocation. The game-theoretic optimizer converges to Nash equilibrium via fictitious play. If the prerequisite BOCPD and HMM runs return no alerts and a stable regime, the game matrix will be near-uniform and so will the allocation. This is a valid result indicating that no single surveillance action dominates when no manipulation signals are present.

Hacker News returning irrelevant results. The Hacker News actor searches by keyword. Generic ticker symbols like "T" or "F" may match unrelated discussion threads. Use company names or more specific query terms (e.g., "Tesla stock" instead of "TSLA") for the tools that include Hacker News as a source.

Responsible use

  • This server accesses only publicly available financial data from licensed or open-access sources.
  • Congressional stock disclosures are public records under the STOCK Act (2012) and are legal to access and analyze.
  • SEC EDGAR filings are public records provided by the US Securities and Exchange Commission.
  • Do not use outputs from this server as the sole basis for investment decisions. Market microstructure analysis is one input among many.
  • Comply with applicable securities laws when using manipulation detection outputs. Alert thresholds should be calibrated by qualified compliance professionals before use in formal surveillance programs.
  • For guidance on web scraping legality, see Apify's guide.

FAQ

How does market microstructure manipulation detection work technically? The detect_spoofing_manipulation tool implements Bayesian Online Changepoint Detection (BOCPD) with a normal-inverse-gamma conjugate prior. At each time step it updates the run length posterior P(r_t | x_{1:t}) using the predictive probability under the conjugate model and a constant hazard function. When the posterior assigns high probability to run length zero, a changepoint is declared. The characteristics of the return series around the changepoint — mean, variance, skewness, autocorrelation — determine whether the pattern is classified as layering, spoofing, wash trading, or momentum ignition.
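The recursion described above can be sketched in a few dozen lines. This is a toy illustration of the BOCPD technique (in the style of Adams & MacKay 2007) with a normal-inverse-gamma prior, Student-t predictive, and constant hazard — not the server's implementation, and the hyperparameters and detection threshold are arbitrary choices for the demo.

```python
import math
import numpy as np

# Toy BOCPD: run-length posterior recursion with a NIG conjugate prior.
def student_t_pdf(x, df, loc, scale):
    z = (x - loc) / scale
    c = math.gamma((df + 1) / 2) / (
        math.gamma(df / 2) * math.sqrt(df * math.pi) * scale
    )
    return c * (1 + z * z / df) ** (-(df + 1) / 2)

def bocpd(data, hazard=0.01, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    T = len(data)
    R = np.zeros((T + 1, T + 1))  # run-length posterior P(r_t | x_{1:t})
    R[0, 0] = 1.0
    mu, kappa, alpha, beta = (np.array([v]) for v in (mu0, kappa0, alpha0, beta0))
    changepoints, prev_rl = [], 0
    for t, x in enumerate(data):
        # Predictive probability of x for each run length (Student-t).
        scale = np.sqrt(beta * (kappa + 1) / (alpha * kappa))
        pred = np.array(
            [student_t_pdf(x, 2 * a, m, s) for a, m, s in zip(alpha, mu, scale)]
        )
        R[t + 1, 1 : t + 2] = R[t, : t + 1] * pred * (1 - hazard)  # growth
        R[t + 1, 0] = (R[t, : t + 1] * pred).sum() * hazard        # reset
        R[t + 1] /= R[t + 1].sum()
        # Declare a changepoint when the MAP run length collapses.
        rl = int(np.argmax(R[t + 1, : t + 2]))
        if prev_rl - rl > 10:
            changepoints.append(t)
        prev_rl = rl
        # Conjugate NIG updates, prepended with the fresh prior for r=0.
        mu_new = (kappa * mu + x) / (kappa + 1)
        beta_new = beta + kappa * (x - mu) ** 2 / (2 * (kappa + 1))
        mu = np.concatenate(([mu0], mu_new))
        kappa = np.concatenate(([kappa0], kappa + 1))
        alpha = np.concatenate(([alpha0], alpha + 0.5))
        beta = np.concatenate(([beta0], beta_new))
    return changepoints

# A mean shift at index 50 should be flagged within a few observations.
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0, 1, 50), rng.normal(5, 1, 50)])
print(bocpd(series))
```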

What manipulation patterns can the MCP detect? The BOCPD detector classifies four patterns: layering (repeated large orders placed and canceled near the top of book), spoofing (large deceptive orders intended to move prices before cancellation), wash trading (self-dealing transactions that create artificial volume), and momentum ignition (sequences designed to trigger stop orders and create cascading price movement).

How accurate is the Hawkes process branching ratio estimate? Accuracy depends on the number of observations. With 50+ price change events, EM typically converges to stable estimates within 20 iterations. The spectral radius power iteration converges reliably for the matrix sizes encountered (3–12 dimensions). Treat branching ratios above 0.9 as indicative of elevated microstructure stress rather than as exact measurements.
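The spectral radius computation mentioned above can be sketched with plain power iteration. The branching matrix values below are made up for illustration; for a nonnegative Hawkes kernel matrix, a spectral radius below 1 indicates a subcritical (stable) process.

```python
import numpy as np

def spectral_radius(K, iters=200, tol=1e-10):
    """Power iteration for the spectral radius of a nonnegative matrix."""
    v = np.ones(K.shape[0]) / np.sqrt(K.shape[0])
    rho = 0.0
    for _ in range(iters):
        w = K @ v
        rho_new = np.linalg.norm(w)
        if rho_new == 0:
            return 0.0
        v = w / rho_new
        if abs(rho_new - rho) < tol:
            break
        rho = rho_new
    return rho

# Illustrative 3-dimensional branching matrix (e.g. trades, quotes, cancels).
K = np.array([
    [0.30, 0.10, 0.05],
    [0.15, 0.25, 0.10],
    [0.05, 0.20, 0.20],
])
print(round(spectral_radius(K), 4))
```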

Is it legal to analyze congressional stock trades? Yes. Congressional stock disclosures are mandatory public records under the STOCK Act of 2012. The law requires senators and representatives to disclose stock trades within 30–45 days. The data is publicly available through official congressional disclosure databases and is legal to access, analyze, and publish.

How is this different from Bloomberg or Refinitiv market surveillance tools? Bloomberg BTCA and Refinitiv surveillance products cost $22,000–$24,000 per year, require dedicated terminals, and are designed for institutional compliance departments. This MCP server exposes the same analytical methods (Hawkes processes, event studies, Granger causality) for $0.06–$0.10 per analysis call with no subscription, no terminal, and direct integration into any MCP-compatible AI client.

Can I schedule this MCP server to run daily manipulation scans? Yes. Use Apify's scheduling feature to trigger a recurring actor run that calls the MCP server on a cron schedule. Store results to an Apify dataset and configure a webhook to fire when totalDetected > 0 in the spoofing tool output. This creates a low-cost automated surveillance pipeline.

How many tool calls can I make with the Apify free plan? The Apify Free plan includes $5 of monthly credits. At an average cost of about $0.08 per call across the tools listed above, this provides roughly 60 tool calls per month. For heavier usage, the Apify Starter plan at $49/month provides roughly 590 tool calls per month.

What happens if one of the 14 data actors fails or times out? Each actor call is wrapped in a try-catch with a 120-second timeout. If an actor fails, it returns an empty array. The algorithms proceed with the remaining data, degrading gracefully. You can identify partial results by checking whether the snapshotCount or channelCount fields in the output are lower than expected.

How does the Student-t HMM differ from a standard Gaussian HMM? The Student-t emission distribution has heavier tails than the Gaussian, making it more robust to outliers and financial crises where return distributions exhibit extreme kurtosis. The degrees-of-freedom parameter nu is estimated jointly with the regime means and variances. Lower nu values (2–4) indicate fat-tailed regimes typical of crisis periods; higher values (10+) are close to Gaussian and typical of calm regimes.

Can I use this MCP server with AI coding assistants like Cursor or Windsurf? Yes. Any MCP-compatible client works with this server. Add the server URL https://ryanclinton--market-microstructure-manipulation-mcp.apify.actor/mcp to your client's MCP configuration with your Apify token as the Authorization header. The server exposes 8 tools that your AI assistant can call directly from its context window.

Does the game-theoretic surveillance optimizer produce prescriptive recommendations? The output is directionally informative, not prescriptive. The Nash equilibrium budget allocation tells you which surveillance actions have the highest expected detection value under the worst-case manipulator strategy. Real regulatory surveillance programs should use this as one quantitative input alongside legal constraints, data availability, and institutional judgment.

How long does a typical tool call take? Tool calls that fetch from 3–4 actors (like decompose_spread_components and classify_market_regimes) typically complete in 15–40 seconds. Tools that fetch from 6–8 actors (like detect_spoofing_manipulation and optimize_surveillance_strategy) typically take 30–70 seconds. All actor calls run in parallel, so latency is bounded by the slowest single actor rather than the sum.
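The fan-out described above can be sketched with asyncio: when fetches run concurrently, wall-clock time tracks the slowest source rather than the sum. The sleeps stand in for actor calls with made-up latencies.

```python
import asyncio
import time

async def fetch_actor(name, seconds):
    # Stand-in for an upstream actor call with a fixed latency.
    await asyncio.sleep(seconds)
    return name

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(
        fetch_actor("finnhub", 0.05),
        fetch_actor("coingecko", 0.10),
        fetch_actor("fred", 0.02),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f}s")  # elapsed tracks the slowest fetch (0.10s), not the 0.17s sum
```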

Help us improve

If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:

  1. Go to Account Settings > Privacy
  2. Enable Share runs with public Actor creators

This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.

Support

Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.

How it works

  1. Configure — Set your parameters in the Apify Console or pass them via API.

  2. Run — Click Start, trigger via API or webhook, or set up a schedule.

  3. Get results — Download as JSON, CSV, or Excel. Integrate with 1,000+ apps.

Use cases

Compliance Teams

Run scheduled manipulation scans and route spoofing alerts into surveillance workflows.

Quantitative Researchers

Test microstructure hypotheses with production-grade econometric methods, no infrastructure required.

Data Teams

Automate data collection pipelines with scheduled runs.

Developers

Integrate via REST API or use as an MCP tool in AI workflows.

Ready to try Market Microstructure & Manipulation MCP?

Start for free on Apify. No credit card required.

Open on Apify Store