Adversarial Corporate Opacity MCP
**Beneficial ownership detection and corporate opacity analysis** via the Model Context Protocol, built for AI agents that investigate entities across 6 international registries and 4 sanctions watchlists. This MCP server implements six distinct anti-concealment algorithms — from BFS ownership graph traversal to Bayesian belief propagation — and delivers a structured opacity score plus formal Enhanced Due Diligence reports that hold up in compliance workflows.
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| unfold-ownership | BFS ownership traversal with nominee detection | $0.08 |
| screen-transliteration | 5-stage cross-lingual name matching | $0.06 |
| detect-bursts | Kleinberg burst detection on incorporation dates | $0.06 |
| cluster-addresses | DBSCAN clustering on geocoded addresses | $0.06 |
| correlate-infra | WL graph kernel on DNS/SSL subgraphs | $0.08 |
| infer-bo | Loopy belief propagation ownership inference | $0.10 |
| opacity-score | Comprehensive entity opacity assessment | $0.12 |
| edd-report | Full enhanced due diligence report | $0.15 |
Example: 100 events = $8.00 · 1,000 events = $80.00
Connect to your AI agent
Add this MCP server to Claude Desktop, Cursor, Windsurf, or any MCP-compatible client.
https://ryanclinton--adversarial-corporate-opacity-mcp.apify.actor/mcp
{
  "mcpServers": {
    "adversarial-corporate-opacity-mcp": {
      "url": "https://ryanclinton--adversarial-corporate-opacity-mcp.apify.actor/mcp"
    }
  }
}
Documentation
When a corporate structure is deliberately obscured through nominee directors, secrecy jurisdictions, shared registered addresses, or adversarial name variations, standard screening misses it. This server targets exactly those evasion patterns. It orchestrates up to 15 Apify actors in parallel per tool call, runs cross-lingual transliteration matching against OFAC and Interpol, clusters shell company address farms with DBSCAN, and infers beneficial ownership through loopy belief propagation on a multi-evidence factor graph.
What data can you access?
| Data Point | Source | Example |
|---|---|---|
| 📁 Global corporate registry records | OpenCorporates | 140+ jurisdictions, company names, officers, filing status |
| 📁 UK company filings and PSC persons | UK Companies House | Officers, persons with significant control, filing history |
| 📁 Canadian federal corporations | Canada Corporation Search | Directors, incorporation date, federal status |
| 📁 Australian business numbers | Australia ABN Lookup | Entity type, GST registration, ABN status |
| 📁 New Zealand company registrations | NZ Companies Office | NZBN, directors, registered address |
| 🔗 Legal entity identifiers | GLEIF LEI | Global parent/child corporate relationships |
| ⚠️ US Treasury SDN sanctions | OFAC Sanctions Search | Entity names, aliases, identification numbers |
| ⚠️ Global sanctions and PEPs | OpenSanctions | 100+ programs, politically exposed persons |
| ⚠️ International wanted persons | Interpol Red Notices | Subject profiles, charges, issuing country |
| ⚠️ US federal wanted persons | FBI Most Wanted | Charges, descriptions, known aliases |
| 🌐 Domain registration records | WHOIS Lookup | Registrant, registrar, creation date, nameservers |
| 🌐 DNS configuration records | DNS Record Lookup | A, MX, NS, TXT records revealing shared hosting |
| 🌐 IP geolocation and ASN data | IP Geolocation | ISP, ASN, country, hosting provider |
| 🔒 TLS certificate transparency logs | crt.sh Search | Certificate issuers and shared SSL assets |
| 📍 Geographic coordinates | Nominatim Geocoder | Lat/lon for address clustering analysis |
MCP tools for corporate opacity analysis
| Tool | Price | Algorithm | Best for |
|---|---|---|---|
| unfold_ownership_graph | $0.045 | BFS with jurisdictional hop penalties | Multi-layered shell structures, nominee director detection |
| screen_with_transliteration | $0.040 | 5-stage phonetic pipeline | Sanctions evasion via name variations, Cyrillic lookalikes |
| detect_registration_bursts | $0.040 | Kleinberg infinite-state automaton | Coordinated shell company creation campaigns |
| cluster_shell_addresses | $0.045 | DBSCAN spatial clustering | Registered agent address farms, co-location detection |
| correlate_infrastructure | $0.040 | Weisfeiler-Lehman graph kernel | Hidden entity relationships via shared domains, IPs, TLS |
| infer_beneficial_owner | $0.050 | Loopy belief propagation | UBO identification from multi-source evidence |
| compute_entity_opacity_score | $0.045 | Weighted composite scoring | Single opacity grade for compliance decisions |
| generate_edd_report | $0.050 | Full 6-algorithm pipeline | Formal EDD/KYC documentation, regulatory filings |
Why use this MCP server for beneficial ownership analysis?
Manual beneficial ownership investigation requires searching each corporate registry individually, cross-referencing sanctions lists by hand, and trying to correlate infrastructure data with ownership records. For a Cayman-registered entity with UK and Canadian subsidiaries, that is 4-6 hours of research before any analysis begins. Standard compliance tools use exact-match or basic fuzzy screening that misses intentional transliteration evasion.
This server automates the entire investigation pipeline in a single tool call:
- Parallel data collection: 3-15 actors run simultaneously per tool call, collapsing hours of research into 30-120 seconds
- Adversarial evasion detection: the 5-stage transliteration pipeline catches Cyrillic lookalikes, diacritic stripping, and name reordering that standard matching misses
- Structured opacity scoring: every entity gets a numeric score with a grade (LOW/MODERATE/ELEVATED/HIGH/CRITICAL) and a weighted factor breakdown for audit documentation
- AI-native interface: integrates directly with Claude, Cursor, Windsurf, and any MCP-compatible agent
- Pay-per-use pricing: no monthly subscription; a complete investigation that calls all 8 tools once costs $0.355
Features
- BFS ownership graph traversal across 6 registries with per-hop opacity penalties: 0.1 for same-jurisdiction hops, 0.3 for cross-jurisdiction, 0.5 for hops through 22 identified secrecy jurisdictions including Cayman Islands (KY), British Virgin Islands (VG), Panama (PA), Jersey (JE), Liechtenstein (LI), and 17 others
- Nominee director detection using 15 formation agent name patterns including Trident Trust, Mossack Fonseca pattern names, Portcullis, Asiaciti, and generic terms like "corporate services", "registered agent", "company formation"
- Circular ownership detection — flags and counts circular structures where entity A owns entity B which owns entity A
- 5-stage transliteration screening — Unicode NFKD normalization with diacritic stripping, Double Metaphone phonetic encoding, Caverphone encoding, Jaro-Winkler distance with prefix bonus, and token-set ratio with phonetic bonus when metaphone codes match
- Kleinberg burst detection — infinite-state automaton using Viterbi-style dynamic programming to find optimal state sequences where high-rate states represent suspicious incorporation bursts; autocorrelation analysis detects periodic registration patterns
- DBSCAN address clustering with epsilon=50m (0.00045 degrees) and minPts=3; co-location suspicion score = entities_in_cluster × (1 − diversity_index) × jurisdiction_risk_weight, where diversity_index is Shannon entropy of entity types divided by log(n)
- Weisfeiler-Lehman graph kernel on DNS/SSL subgraphs: builds infrastructure graphs (domains → IPs → nameservers → SSL issuers), iteratively relabels nodes by hashing neighbor labels over 3 iterations, computes normalized dot product of label histograms; kernel value above 0.7 indicates likely shared control
- Loopy belief propagation on a factor graph with 6 evidence variables: ownership registration, officer overlap, address co-location, infrastructure sharing, sanctions co-occurrence, and temporal co-registration; iterates with damping factor 0.5 up to 50 iterations until convergence
- Weighted composite opacity scoring: ownership depth (25%), transliteration risk (15%), burst anomaly (10%), co-location (15%), infrastructure concealment (15%), beneficial owner uncertainty (20%)
- Formal EDD report generation with severity-graded findings (low/medium/high/critical) and actionable recommendations for compliance file documentation
- Standby mode deployment — MCP server stays warm on Apify for sub-second response initiation
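The co-location suspicion formula from the DBSCAN bullet above can be sketched as follows. This is a minimal illustration, not the server's code: the function names are hypothetical, and it assumes that n in log(n) is the number of distinct entity types in the cluster, which the text does not state.

```typescript
// Sketch of the co-location suspicion formula; helper names are illustrative.

// Shannon entropy of the entity-type distribution, normalized to the range 0..1.
// Assumption: n in log(n) is the number of distinct entity types.
function diversityIndex(entityTypes: string[]): number {
  const counts = new Map<string, number>();
  entityTypes.forEach((t) => counts.set(t, (counts.get(t) ?? 0) + 1));
  if (counts.size < 2) return 0; // a single type has no diversity, and log(1) = 0
  let entropy = 0;
  counts.forEach((count) => {
    const p = count / entityTypes.length;
    entropy -= p * Math.log(p);
  });
  return entropy / Math.log(counts.size);
}

// entities_in_cluster x (1 - diversity_index) x jurisdiction_risk_weight
function coLocationSuspicion(entityTypes: string[], jurisdictionRiskWeight: number): number {
  return entityTypes.length * (1 - diversityIndex(entityTypes)) * jurisdictionRiskWeight;
}
```

Under these assumptions, a cluster of four entities of the same type with a risk weight of 1.5 scores 4 × 1 × 1.5 = 6, while a cluster whose types are evenly mixed scores close to zero.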
Use cases for beneficial ownership investigation
AML and KYC compliance workflows
Compliance officers at banks, payment processors, and fund administrators need to identify ultimate beneficial owners during customer onboarding. Multi-layered structures through offshore jurisdictions can obscure true ownership through 4-8 corporate layers. unfold_ownership_graph traverses up to depth 10 across all registries, computing opacity penalties at each hop and flagging nominee directors — producing structured evidence that feeds directly into compliance files.
Sanctions evasion detection
Financial institutions screening counterparties face a recurring problem: sanctioned entities deliberately vary their names to evade exact-match list screening. A sanctioned party named Ivanov might appear as "Ivanов" (Cyrillic lookalike letters), "Ivánov" (an added diacritic), or "Vanovi" (rearranged letters). screen_with_transliteration applies a 5-stage phonetic pipeline across the OFAC, OpenSanctions, Interpol, and FBI lists to catch these variants in a single call, returning a CLEAR/MODERATE/HIGH/CRITICAL severity rating with evidence per match.
Shell company farm identification
Registered agents in Delaware, Wyoming, Cayman, and BVI sometimes host thousands of entities at a single address. cluster_shell_addresses geocodes all registered addresses associated with an entity, runs DBSCAN spatial clustering, and computes a co-location suspicion score per cluster. This surfaces registered agent farms that represent fabricated corporate diversity rather than genuine separate businesses.
Investigative journalism and corporate research
Journalists and researchers investigating offshore financial structures need to connect apparently unrelated entities that share beneficial ownership. correlate_infrastructure maps shared domains, IP addresses, and TLS certificates between entities using Weisfeiler-Lehman graph kernels — finding connections that no corporate registry records. infer_beneficial_owner then combines all available evidence through belief propagation to assign posterior ownership probabilities.
Coordinated incorporation campaign detection
Private equity analysts, regulators, and intelligence teams investigating corporate fraud benefit from detecting when entities were incorporated in coordinated batches — a pattern associated with shell company creation campaigns. detect_registration_bursts applies the Kleinberg automaton to incorporation date sequences and identifies temporal clusters with statistical significance, including autocorrelation analysis to detect periodic (non-random) patterns.
Enhanced Due Diligence documentation
For formal regulatory filings, correspondent banking relationships, or high-value transaction approvals, generate_edd_report runs all six algorithms in sequence and produces a complete report with ownership graph, sanctions findings, burst analysis, address clusters, infrastructure correlations, beneficial ownership inferences, and a composite opacity score with severity-graded findings and recommendations. The JSON output is structured for direct inclusion in compliance audit trails.
How to connect this MCP server for beneficial ownership detection
- Get your Apify API token: sign up at apify.com, go to Settings > Integrations, and copy your API token.
- Add the server to your MCP client: paste the configuration below into your client's MCP settings file. Replace YOUR_APIFY_TOKEN with your actual token.
- Start your agent session: the 8 tools appear automatically in your agent's tool list. No further setup required.
- Make your first call: ask your agent to investigate an entity: "Use unfold_ownership_graph to map the ownership structure of Meridian Holdings Ltd, jurisdiction KY."
MCP client configuration
Claude Desktop
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "adversarial-corporate-opacity": {
      "url": "https://ryanclinton--adversarial-corporate-opacity-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}
Cursor / Windsurf / Cline
Add to your MCP settings:
{
  "mcpServers": {
    "adversarial-corporate-opacity": {
      "url": "https://ryanclinton--adversarial-corporate-opacity-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}
Direct HTTP (cURL)
curl -X POST "https://ryanclinton--adversarial-corporate-opacity-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "unfold_ownership_graph",
      "arguments": {
        "entity_name": "Meridian Holdings Ltd",
        "jurisdiction": "KY"
      }
    },
    "id": 1
  }'
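The same request can be built from TypeScript. The sketch below mirrors the JSON-RPC body of the cURL call; buildToolCall and callTool are hypothetical helper names, the endpoint and token are passed in by the caller, and it assumes a runtime with a global fetch (for example Node 18 or later) and a server that answers with a plain JSON body.

```typescript
// Shape of the JSON-RPC 2.0 body used by the cURL example.
interface ToolCallRequest {
  jsonrpc: "2.0";
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
  id: number;
}

// Hypothetical helper: builds the request body for one tool invocation.
function buildToolCall(name: string, args: Record<string, unknown>, id = 1): ToolCallRequest {
  return { jsonrpc: "2.0", method: "tools/call", params: { name, arguments: args }, id };
}

// Hypothetical helper: posts the body with the bearer token and returns the parsed reply.
// Assumption: the reply is plain JSON rather than an event stream.
async function callTool(endpoint: string, token: string, request: ToolCallRequest): Promise<unknown> {
  const response = await fetch(endpoint, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify(request),
  });
  if (!response.ok) throw new Error(`MCP endpoint returned HTTP ${response.status}`);
  return response.json();
}
```

A call would look like callTool(endpoint, token, buildToolCall("unfold_ownership_graph", { entity_name: "Meridian Holdings Ltd", jurisdiction: "KY" })).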
MCP tool reference
unfold_ownership_graph
BFS traversal of the corporate ownership graph across 6 international registries. At each node, it computes an opacity score from jurisdictional hop penalties, nominee detection (15 formation agent patterns), and circular ownership identification. Traversal is pruned at max_depth (default 10) or when cumulative opacity exceeds a threshold.
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| entity_name | string | Yes | none | Target entity name to trace ownership from |
| jurisdiction | string | No | none | Primary jurisdiction ISO code (e.g. "GB", "KY", "PA") to prioritize a specific registry |
| company_number | string | No | none | Company registration number if known, passed to the relevant registry |
| max_depth | number | No | 10 | Maximum BFS traversal depth (1–10) |
Example call:
{
  "entity_name": "Meridian Offshore Holdings Ltd",
  "jurisdiction": "KY",
  "max_depth": 8
}
screen_with_transliteration
5-stage cross-lingual name matching against OFAC, OpenSanctions, Interpol, and FBI watchlists. Catches adversarial transliterations including Cyrillic lookalikes, diacritic evasion, and name token reordering.
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| names | string[] | Yes | none | Entity or person names to screen (1–50 names per call) |
| include_interpol | boolean | No | true | Include Interpol Red Notices in screening |
| include_fbi | boolean | No | true | Include FBI Most Wanted in screening |
Example call:
{
  "names": ["Viktor Petrenko", "Viktоr Petrеnko", "V. Petrenkov"],
  "include_interpol": true,
  "include_fbi": false
}
detect_registration_bursts
Applies the Kleinberg infinite-state automaton to incorporation date sequences. Uses Viterbi-style dynamic programming to identify high-rate burst states, then runs autocorrelation to detect periodic (non-random) registration patterns.
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| entity_name | string | Yes | none | Entity or person name whose corporate registrations to analyze |
| jurisdiction | string | No | none | Focus on a specific jurisdiction ISO code |
cluster_shell_addresses
DBSCAN spatial clustering (epsilon=50m, minPts=3) on geocoded registered addresses. Computes co-location suspicion score per cluster using Shannon entropy diversity index and jurisdiction risk weighting. Up to 20 addresses are geocoded per call via Nominatim.
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| entity_name | string | Yes | none | Entity or person name to find associated addresses |
| jurisdiction | string | No | none | Focus on a specific jurisdiction |
correlate_infrastructure
Builds entity infrastructure graphs (domains → IPs → nameservers → SSL issuers), applies 3-iteration Weisfeiler-Lehman relabeling, and computes normalized dot product of label histograms. Kernel value above 0.7 indicates high probability of shared control.
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| entities | string[] | Yes | none | Entity names to compare infrastructure fingerprints (1–20) |
| domains | string[] | No | none | Known domains in "EntityName:domain.com" format |
Example call:
{
  "entities": ["Meridian Holdings Ltd", "Atlas Capital Partners"],
  "domains": ["Meridian Holdings Ltd:meridian-hld.com", "Atlas Capital Partners:atlas-cap.io"]
}
infer_beneficial_owner
Bayesian beneficial ownership inference using loopy belief propagation on a factor graph. Combines 6 evidence types per person-entity pair: ownership registration, officer overlap, address co-location, infrastructure sharing, sanctions co-occurrence, and temporal co-registration. Runs with damping factor 0.5, max 50 iterations.
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| entity_name | string | Yes | none | Target entity to identify beneficial owners of |
| known_persons | string[] | No | none | Known associated persons (directors, shareholders, nominees) |
| jurisdiction | string | No | none | Primary jurisdiction ISO code |
compute_entity_opacity_score
Runs all six algorithms and returns a weighted composite opacity score. Component weights: ownership depth 25%, transliteration risk 15%, burst anomaly 10%, co-location 15%, infrastructure concealment 15%, beneficial owner uncertainty 20%. Grades: LOW (<0.15), MODERATE (0.15–0.35), ELEVATED (0.35–0.55), HIGH (0.55–0.75), CRITICAL (>0.75).
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| entity_name | string | Yes | none | Entity to compute opacity score for |
| jurisdiction | string | No | none | Primary jurisdiction ISO code |
| domains | string[] | No | none | Known associated domains for infrastructure analysis |
| known_persons | string[] | No | none | Known associated persons for beneficial owner inference |
generate_edd_report
Full Enhanced Due Diligence report combining all six algorithms. Runs all 15 actors across registries, watchlists, and infrastructure sources. Returns structured findings with severity grades (low/medium/high/critical), an overall risk score, and actionable recommendations.
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| entity_name | string | Yes | none | Target entity for Enhanced Due Diligence |
| jurisdiction | string | No | none | Primary jurisdiction ISO code |
| domains | string[] | No | none | Known domains for infrastructure analysis |
| key_persons | string[] | No | none | Key persons to investigate (directors, UBOs) |
Output example
compute_entity_opacity_score response for a Cayman-registered entity:
{
  "entity": "Meridian Offshore Holdings Ltd",
  "opacityScore": {
    "entity": "Meridian Offshore Holdings Ltd",
    "overallOpacity": 0.71,
    "ownershipDepthScore": 0.85,
    "transliterationRisk": 0.20,
    "burstAnomalyScore": 0.60,
    "coLocationScore": 0.75,
    "infraConcealmentScore": 0.55,
    "beneficialOwnerUncertainty": 0.80,
    "grade": "HIGH"
  },
  "dataSources": {
    "corporateRecords": 47,
    "leiRecords": 3,
    "watchlistEntries": 0,
    "geocodedAddresses": 8
  }
}
unfold_ownership_graph summary for the same entity:
{
  "entity": "Meridian Offshore Holdings Ltd",
  "summary": {
    "totalNodes": 14,
    "totalEdges": 13,
    "maxDepthReached": 6,
    "circularOwnership": true,
    "nominees": 3,
    "formationAgents": 2,
    "secrecyHops": 4,
    "totalOpacity": 3.4,
    "riskIndicator": "HIGH"
  }
}
screen_with_transliteration severity result:
{
  "severity": "HIGH",
  "result": {
    "matches": [
      {
        "entityName": "Viktor Petrenko",
        "watchlistName": "Viktor Petrenkо",
        "stage": "double_metaphone",
        "similarity": 0.94,
        "phoneticBonus": 0.08,
        "finalScore": 0.91,
        "source": "OFAC-SDN"
      }
    ],
    "totalScreened": 3,
    "totalMatches": 1,
    "pipelineStats": [
      { "stage": "unicode_normalization", "matchesFound": 0 },
      { "stage": "double_metaphone", "matchesFound": 1 },
      { "stage": "caverphone", "matchesFound": 0 },
      { "stage": "jaro_winkler", "matchesFound": 0 },
      { "stage": "token_set_ratio", "matchesFound": 0 }
    ]
  },
  "watchlistSources": {
    "ofac": true,
    "opensanctions": true,
    "interpol": true,
    "fbi": false
  }
}
Output fields
| Field | Type | Description |
|---|---|---|
| entity | string | The entity name investigated |
| opacityScore.overallOpacity | number | Weighted composite opacity score (0.0–1.0) |
| opacityScore.grade | string | LOW / MODERATE / ELEVATED / HIGH / CRITICAL |
| opacityScore.ownershipDepthScore | number | BFS traversal opacity component (weight: 25%) |
| opacityScore.transliterationRisk | number | Sanctions phonetic match risk component (weight: 15%) |
| opacityScore.burstAnomalyScore | number | Kleinberg burst detection component (weight: 10%) |
| opacityScore.coLocationScore | number | DBSCAN address co-location component (weight: 15%) |
| opacityScore.infraConcealmentScore | number | WL graph kernel concealment component (weight: 15%) |
| opacityScore.beneficialOwnerUncertainty | number | Belief propagation uncertainty component (weight: 20%) |
| summary.totalNodes | number | Entity nodes found in ownership graph |
| summary.totalEdges | number | Ownership relationships discovered |
| summary.circularOwnership | boolean | Whether circular ownership structures exist |
| summary.nominees | number | Nominee directors/officers detected |
| summary.formationAgents | number | Formation agent patterns detected |
| summary.secrecyHops | number | Hops through secrecy jurisdictions |
| summary.riskIndicator | string | STANDARD / ELEVATED / HIGH |
| result.matches[].finalScore | number | Composite transliteration match score (0.0–1.0) |
| result.matches[].stage | string | Pipeline stage that produced the match |
| result.matches[].source | string | Watchlist source (OFAC-SDN, OpenSanctions, Interpol, FBI) |
| severity | string | CLEAR / MODERATE / HIGH / CRITICAL |
| bursts[].burstLevel | number | Kleinberg state level (higher = more anomalous) |
| bursts[].periodicity | number or null | Detected registration periodicity in days |
| clusters[].suspicionScore | number | Co-location suspicion score per address cluster |
| clusters[].diversityIndex | number | Shannon entropy of entity types in cluster |
| inferences[].posteriorProbability | number | Bayesian posterior probability of beneficial ownership |
| inferences[].converged | boolean | Whether belief propagation converged for this inference |
How much does it cost to run beneficial ownership investigations?
This MCP server uses pay-per-event pricing — you pay per tool call. Platform compute costs are included. Each tool has a fixed price regardless of how many registries or watchlists it queries internally.
| Scenario | Tool | Price | Notes |
|---|---|---|---|
| Quick sanctions screen (10 names) | screen_with_transliteration | $0.040 | OFAC + OpenSanctions + Interpol + FBI |
| Ownership graph traversal | unfold_ownership_graph | $0.045 | Up to 6 registries in parallel |
| Shell address cluster analysis | cluster_shell_addresses | $0.045 | Includes geocoding via Nominatim |
| Beneficial owner inference | infer_beneficial_owner | $0.050 | Full 15-actor evidence collection |
| Full EDD report | generate_edd_report | $0.050 | All 6 algorithms, all 15 actors |
| Complete 8-tool investigation | All tools once | $0.355 | Full anti-concealment analysis |
| Monthly compliance workflow (50 entities) | Mixed tools | ~$10–18 | Varies by tool mix |
Set a maximum spending limit per run in your Apify account to prevent unexpected costs. The server respects the limit and returns a structured error rather than continuing.
Compare this to dedicated KYC/AML platforms at $500–2,000/month with per-query fees on top. At $0.04–0.05 per tool call, most compliance teams spend under $20/month for investigative queries.
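For budgeting, the per-tool prices listed in the MCP tools table can be summed for any planned mix of calls. The sketch below is illustrative only: estimateCostUsd is a hypothetical helper, and the prices are copied from that table as thousandths of a dollar so that totals are added as integers.

```typescript
// Per-call prices in thousandths of a dollar, copied from the tools table.
const PRICE_MILLI_USD: Record<string, number> = {
  unfold_ownership_graph: 45,
  screen_with_transliteration: 40,
  detect_registration_bursts: 40,
  cluster_shell_addresses: 45,
  correlate_infrastructure: 40,
  infer_beneficial_owner: 50,
  compute_entity_opacity_score: 45,
  generate_edd_report: 50,
};

// calls maps a tool name to the number of planned invocations.
function estimateCostUsd(calls: Record<string, number>): number {
  let milli = 0;
  Object.keys(calls).forEach((tool) => {
    const price = PRICE_MILLI_USD[tool];
    if (price === undefined) throw new Error(`Unknown tool: ${tool}`);
    milli += price * calls[tool];
  });
  return milli / 1000;
}

// Convenience: every tool invoked the same number of times.
function everyTool(times: number): Record<string, number> {
  const calls: Record<string, number> = {};
  Object.keys(PRICE_MILLI_USD).forEach((tool) => { calls[tool] = times; });
  return calls;
}
```

Calling every tool once comes to $0.355, and doing that for 50 entities comes to $17.75, which sits at the top of the monthly range quoted in the table.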
How this MCP server works
Phase 1: Parallel data collection
Each tool call dispatches between 3 and 15 Apify actors in parallel using Promise.all. Registry selection is jurisdiction-aware: a GB entity queries UK Companies House plus OpenCorporates plus GLEIF; a KY entity, which has no dedicated registry actor, queries OpenCorporates plus GLEIF; omitting the jurisdiction runs all registries in full-scan mode. Actor calls have a 120-second timeout (180 seconds for the full EDD pipeline) and fall back to empty arrays on failure, so a tool call still returns a result built from whatever data arrived.
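The dispatch pattern described here (parallel calls, a per-call timeout, and an empty-array fallback) can be sketched as below. This is not the server's code: withTimeout and collectAll are hypothetical names, and each source is modeled as a function returning a promise of rows rather than a real Apify actor call.

```typescript
// Resolves with the rows from `work`, or with [] if it fails or exceeds `ms`.
function withTimeout<T>(work: Promise<T[]>, ms: number): Promise<T[]> {
  return new Promise<T[]>((resolve) => {
    const timer = setTimeout(() => resolve([]), ms);
    work.then(
      (rows) => { clearTimeout(timer); resolve(rows); },
      () => { clearTimeout(timer); resolve([]); }, // failure degrades to no data
    );
  });
}

// Runs every source in parallel; one slow or failing source never blocks the rest.
async function collectAll(
  sources: Record<string, () => Promise<unknown[]>>,
  timeoutMs = 120000, // 120-second default taken from the text
): Promise<Record<string, unknown[]>> {
  const names = Object.keys(sources);
  const results = await Promise.all(names.map((name) => withTimeout(sources[name](), timeoutMs)));
  const byName: Record<string, unknown[]> = {};
  names.forEach((name, index) => { byName[name] = results[index]; });
  return byName;
}
```

Because every wrapped promise resolves rather than rejects, Promise.all cannot short-circuit on the first failure, which is what lets partial data flow through to the scoring stage.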
Phase 2: Algorithm execution
Raw registry records, sanctions entries, and infrastructure data feed into six purpose-built algorithms implemented in scoring.ts:
Ownership graph (BFS): Entities become graph nodes. OpenCorporates officer arrays and GLEIF parent-child relationships become edges. Hop penalties are assigned per transition: 0.1 (same jurisdiction), 0.3 (cross-jurisdiction), 0.5 (secrecy jurisdiction from a hardcoded set of 22). Officer names are matched against 15 formation agent patterns using substring search to identify nominees. Circular ownership is detected by tracking visited node IDs during BFS traversal.
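A minimal sketch of this traversal follows, under several labeled assumptions: the type and function names are hypothetical, only the five secrecy jurisdictions named in the Features list are included, and the penalty is decided by the jurisdiction being entered. Cycle detection is done here in a separate depth-first pass, which is a substitution for the visited-ID tracking described above, because a visited set alone would also fire on diamond-shaped structures where two subsidiaries share one parent.

```typescript
interface Entity { id: string; jurisdiction: string }
type OwnerMap = Record<string, string[]>; // entity id -> ids of its direct owners

// Subset only: the text names 5 of the 22 secrecy jurisdictions.
const SECRECY_JURISDICTIONS = ["KY", "VG", "PA", "JE", "LI"];
const isSecrecy = (e: Entity) => SECRECY_JURISDICTIONS.indexOf(e.jurisdiction) !== -1;

// Assumption: the penalty depends on the jurisdiction being entered.
function hopPenalty(from: Entity, to: Entity): number {
  if (isSecrecy(to)) return 0.5;
  return from.jurisdiction === to.jurisdiction ? 0.1 : 0.3;
}

// Depth-first check for a true ownership cycle reachable from the root.
function hasCycle(owners: OwnerMap, rootId: string): boolean {
  const state = new Map<string, "open" | "done">();
  const visit = (id: string): boolean => {
    if (state.get(id) === "open") return true; // id is its own ancestor
    if (state.get(id) === "done") return false;
    state.set(id, "open");
    for (const next of owners[id] ?? []) if (visit(next)) return true;
    state.set(id, "done");
    return false;
  };
  return visit(rootId);
}

function unfoldOwnership(entities: Record<string, Entity>, owners: OwnerMap, rootId: string, maxDepth = 10) {
  const seen: Record<string, boolean> = { [rootId]: true };
  const queue = [{ id: rootId, depth: 0 }];
  let totalOpacity = 0;
  let secrecyHops = 0;
  let maxDepthReached = 0;
  while (queue.length > 0) {
    const { id, depth } = queue.shift()!;
    if (depth >= maxDepth) continue; // depth pruning
    for (const ownerId of owners[id] ?? []) {
      if (seen[ownerId] || !entities[ownerId]) continue;
      seen[ownerId] = true;
      totalOpacity += hopPenalty(entities[id], entities[ownerId]);
      if (isSecrecy(entities[ownerId])) secrecyHops++;
      maxDepthReached = Math.max(maxDepthReached, depth + 1);
      queue.push({ id: ownerId, depth: depth + 1 });
    }
  }
  return {
    totalNodes: Object.keys(seen).length,
    totalOpacity,
    secrecyHops,
    maxDepthReached,
    circularOwnership: hasCycle(owners, rootId),
  };
}
```

For a GB company owned by a GB holding, which is owned by a KY entity, which is in turn owned by a VG entity that the KY entity also owns, this sketch reports four nodes, two secrecy hops, a total opacity of 0.1 + 0.5 + 0.5, and circular ownership.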
Transliteration pipeline: Each input name is processed through 5 stages in sequence. Unicode NFKD normalization strips diacritics and normalizes Cyrillic lookalikes. Double Metaphone and Caverphone produce phonetic encodings compared to watchlist entries. Jaro-Winkler computes character-level similarity with a prefix bonus for names sharing an initial sequence. Token-set ratio computes bag-of-words overlap at the token level, with a phonetic bonus added when metaphone codes align.
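Two of the five stages, normalization and Jaro-Winkler, can be sketched as follows. Function names are hypothetical. One caveat worth stating: NFKD normalization on its own strips diacritics but does not map Cyrillic letters to Latin ones, so this sketch assumes a small hand-written homoglyph table for that step; the table used by the server is not documented here. The phonetic stages (Double Metaphone, Caverphone) and token-set ratio are omitted.

```typescript
// Assumed homoglyph table (Cyrillic -> Latin); a real table would be larger.
const HOMOGLYPHS: Record<string, string> = {
  "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p",
  "\u0441": "c", "\u0445": "x", "\u0443": "y",
};

// Stage 1: NFKD decomposition, combining-mark removal, lowercase, homoglyph folding.
function normalizeName(raw: string): string {
  const folded = raw.normalize("NFKD").replace(/[\u0300-\u036f]/g, "").toLowerCase();
  let out = "";
  for (let i = 0; i < folded.length; i++) {
    const ch = folded.charAt(i);
    out += HOMOGLYPHS[ch] ?? ch;
  }
  return out.replace(/\s+/g, " ").trim();
}

// Jaro similarity: matches within a sliding window, discounted by transpositions.
function jaro(a: string, b: string): number {
  if (a === b) return 1;
  if (a.length === 0 || b.length === 0) return 0;
  const windowSize = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1);
  const aMatched: boolean[] = a.split("").map(() => false);
  const bMatched: boolean[] = b.split("").map(() => false);
  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const lo = Math.max(0, i - windowSize);
    const hi = Math.min(b.length - 1, i + windowSize);
    for (let j = lo; j <= hi; j++) {
      if (!bMatched[j] && a[i] === b[j]) {
        aMatched[i] = true;
        bMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) return 0;
  let k = 0;
  let outOfOrder = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aMatched[i]) continue;
    while (!bMatched[k]) k++;
    if (a[i] !== b[k]) outOfOrder++;
    k++;
  }
  const transpositions = outOfOrder / 2;
  return (matches / a.length + matches / b.length + (matches - transpositions) / matches) / 3;
}

// Jaro-Winkler: boosts the Jaro score for a shared prefix of up to 4 characters.
function jaroWinkler(a: string, b: string, prefixScale = 0.1): number {
  const base = jaro(a, b);
  let prefix = 0;
  while (prefix < 4 && prefix < a.length && prefix < b.length && a[prefix] === b[prefix]) prefix++;
  return base + prefix * prefixScale * (1 - base);
}
```

After normalization, a name written with Cyrillic lookalikes compares as identical to its Latin spelling, so later stages only have to deal with genuine spelling differences.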
Kleinberg burst detection: Incorporation dates from all records are extracted and sorted. The Kleinberg automaton models state transitions where higher states represent higher registration rates. Viterbi dynamic programming finds the optimal state sequence. Autocorrelation is computed at lags 1–30 to detect periodic patterns.
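The state-sequence search can be illustrated with a small Viterbi pass over the gaps between consecutive incorporation dates. This is a sketch under stated assumptions, not the scoring.ts implementation: kleinbergStates is a hypothetical name, the automaton is capped at three levels rather than being infinite-state, and the rate scaling s = 2 and transition cost gamma = 1 are assumed defaults. The autocorrelation step is omitted.

```typescript
// gaps: days between consecutive incorporations (total must be positive).
// Returns the lowest-cost burst level (0 = baseline) for each gap.
function kleinbergStates(gaps: number[], levels = 3, s = 2, gamma = 1): number[] {
  const n = gaps.length;
  if (n === 0) return [];
  const total = gaps.reduce((sum, g) => sum + g, 0);
  const baseRate = n / total;
  const rates: number[] = [];
  for (let i = 0; i < levels; i++) rates.push(baseRate * Math.pow(s, i));
  const upCost = gamma * Math.log(n); // price of climbing one level; descending is free

  // Viterbi: prev[j] is the cheapest cost of any state sequence ending in level j.
  let prev: number[] = rates.map((_, j) => (j === 0 ? 0 : Infinity)); // start at baseline
  const back: number[][] = [];
  for (let t = 0; t < n; t++) {
    const cur: number[] = [];
    const pointers: number[] = [];
    for (let j = 0; j < levels; j++) {
      let best = Infinity;
      let arg = 0;
      for (let i = 0; i < levels; i++) {
        const cost = prev[i] + (j > i ? (j - i) * upCost : 0);
        if (cost < best) { best = cost; arg = i; }
      }
      // Negative log-likelihood of the gap under an exponential with rate rates[j].
      cur.push(best + rates[j] * gaps[t] - Math.log(rates[j]));
      pointers.push(arg);
    }
    back.push(pointers);
    prev = cur;
  }

  // Backtrack from the cheapest final level.
  let state = 0;
  for (let j = 1; j < levels; j++) if (prev[j] < prev[state]) state = j;
  const states = new Array<number>(n);
  for (let t = n - 1; t >= 0; t--) {
    states[t] = state;
    state = back[t][state];
  }
  return states;
}
```

Given monthly registrations interrupted by ten registrations on consecutive days, the sketch keeps the monthly gaps at level 0 and lifts the daily run to a higher level, which is the kind of span a burstLevel field would describe.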
DBSCAN clustering: Address strings are compared using a string similarity function (EPS=0.7 similarity threshold, minPts=2). Clusters are expanded iteratively. Shell score per cluster is computed as entities_per_address × (1 − diversity_fraction). In the full pipeline, actual geocoordinates from Nominatim are used with 50-meter epsilon.
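The geocoded variant can be sketched with a plain DBSCAN over haversine distances, using the 50-meter radius mentioned above and minPts = 3 from the tool reference. Names are hypothetical, and the string-similarity variant is not shown.

```typescript
interface GeoPoint { lat: number; lon: number }

// Great-circle distance in meters between two coordinates.
function haversineMeters(a: GeoPoint, b: GeoPoint): number {
  const earthRadius = 6371000;
  const toRad = (deg: number) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.pow(Math.sin(dLat / 2), 2) +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.pow(Math.sin(dLon / 2), 2);
  return 2 * earthRadius * Math.asin(Math.sqrt(h));
}

// Returns one label per point: a cluster index (0, 1, ...) or -1 for noise.
function dbscan(points: GeoPoint[], epsMeters = 50, minPts = 3): number[] {
  const UNVISITED = -2;
  const NOISE = -1;
  const labels: number[] = points.map(() => UNVISITED);
  // Neighborhood includes the point itself, as in the usual DBSCAN definition.
  const neighborsOf = (i: number): number[] =>
    points.map((_, j) => j).filter((j) => haversineMeters(points[i], points[j]) <= epsMeters);

  let cluster = 0;
  for (let i = 0; i < points.length; i++) {
    if (labels[i] !== UNVISITED) continue;
    const seeds = neighborsOf(i);
    if (seeds.length < minPts) { labels[i] = NOISE; continue; }
    labels[i] = cluster;
    const queue = seeds.filter((j) => j !== i);
    while (queue.length > 0) {
      const j = queue.shift()!;
      if (labels[j] === NOISE) labels[j] = cluster; // border point joins the cluster
      if (labels[j] !== UNVISITED) continue;
      labels[j] = cluster;
      const more = neighborsOf(j);
      if (more.length >= minPts) {
        more.forEach((k) => { if (labels[k] === UNVISITED || labels[k] === NOISE) queue.push(k); });
      }
    }
    cluster++;
  }
  return labels;
}
```

Four addresses geocoded within a few meters of each other end up in one cluster, while an address a kilometer away is labeled as noise and does not contribute to any co-location score.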
Weisfeiler-Lehman kernel: Each entity's domain portfolio is converted to a subgraph. Nodes are iteratively relabeled by hashing their current label with sorted neighbor labels over 3 iterations. Label histogram vectors are computed per entity and compared via normalized dot product. High kernel values identify entities with structurally similar digital infrastructure graphs.
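The relabeling and comparison steps can be sketched as below. The names are hypothetical, and two simplifications are assumed: labels are kept as concatenated strings instead of being hashed, and shared assets (IPs, nameservers, issuers) are labeled by their value while domains all carry the generic label "domain". Every edge endpoint must have an entry in labels.

```typescript
interface LabeledGraph {
  labels: Record<string, string>;   // node id -> initial label
  edges: Array<[string, string]>;   // undirected edges between node ids
}

// Counts every label seen at iteration 0 and after each relabeling round.
function wlHistogram(graph: LabeledGraph, iterations = 3): Map<string, number> {
  const neighbors: Record<string, string[]> = {};
  Object.keys(graph.labels).forEach((id) => { neighbors[id] = []; });
  for (const [a, b] of graph.edges) {
    neighbors[a].push(b);
    neighbors[b].push(a);
  }
  const histogram = new Map<string, number>();
  const count = (label: string) => histogram.set(label, (histogram.get(label) ?? 0) + 1);

  let current: Record<string, string> = { ...graph.labels };
  Object.keys(current).forEach((id) => count(current[id]));
  for (let round = 0; round < iterations; round++) {
    const next: Record<string, string> = {};
    Object.keys(current).forEach((id) => {
      const around = neighbors[id].map((n) => current[n]).sort();
      next[id] = `${current[id]}(${around.join(",")})`; // own label plus sorted neighbor labels
    });
    current = next;
    Object.keys(current).forEach((id) => count(current[id]));
  }
  return histogram;
}

// Normalized dot product (cosine) of the two label histograms, in the range 0..1.
function wlSimilarity(a: LabeledGraph, b: LabeledGraph, iterations = 3): number {
  const ha = wlHistogram(a, iterations);
  const hb = wlHistogram(b, iterations);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  ha.forEach((value, label) => {
    normA += value * value;
    dot += value * (hb.get(label) ?? 0);
  });
  hb.forEach((value) => { normB += value * value; });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}
```

Two entities whose domains resolve to the same IP, nameserver, and certificate issuer produce identical histograms and a kernel value of 1; entities that share nothing but the generic "domain" label score far below the 0.7 threshold.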
Loopy belief propagation: Variables represent is_beneficial_owner(person, entity) Boolean states. Factor nodes combine 6 evidence potentials. Messages are passed between variable and factor nodes iteratively with a 0.5 damping factor. Convergence is checked at each iteration; the algorithm stops at convergence or 50 iterations. Posterior probabilities above 0.7 are flagged HIGH.
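The message-passing loop with damping can be illustrated on a simplified model. This sketch is a pairwise formulation rather than the six-factor graph described above: each Boolean variable gets one unary potential (standing in for the combined evidence for that person-entity pair), and linked variables share a single hypothetical coupling strength. The function name, the coupling parameter, and the convergence tolerance are all assumptions.

```typescript
type Potential = [number, number]; // [weight for "not owner", weight for "owner"]

interface BpResult { beliefs: number[]; iterations: number; converged: boolean }

function loopyBeliefPropagation(
  unary: Potential[],
  edges: Array<[number, number]>,
  coupling: number,      // > 1 favors linked variables agreeing (assumed form)
  damping = 0.5,
  maxIterations = 50,
  tolerance = 1e-4,      // assumed; the text does not give a tolerance
): BpResult {
  const neighbors: number[][] = unary.map(() => []);
  const messages = new Map<string, Potential>(); // key "i>j": message from i to j
  for (const [i, j] of edges) {
    neighbors[i].push(j);
    neighbors[j].push(i);
    messages.set(`${i}>${j}`, [0.5, 0.5]);
    messages.set(`${j}>${i}`, [0.5, 0.5]);
  }
  const psi = (xi: number, xj: number) => (xi === xj ? coupling : 1);

  let iterations = 0;
  let converged = false;
  while (iterations < maxIterations && !converged) {
    iterations++;
    let maxDelta = 0;
    const next = new Map<string, Potential>();
    messages.forEach((old, key) => {
      const [i, j] = key.split(">").map(Number);
      const out: Potential = [0, 0];
      for (let xj = 0; xj < 2; xj++) {
        for (let xi = 0; xi < 2; xi++) {
          let product = unary[i][xi] * psi(xi, xj);
          for (const k of neighbors[i]) if (k !== j) product *= messages.get(`${k}>${i}`)![xi];
          out[xj] += product;
        }
      }
      const z = out[0] + out[1];
      // Damped update: blend the previous message with the newly computed one.
      const damped: Potential = [
        damping * old[0] + ((1 - damping) * out[0]) / z,
        damping * old[1] + ((1 - damping) * out[1]) / z,
      ];
      maxDelta = Math.max(maxDelta, Math.abs(damped[1] - old[1]));
      next.set(key, damped);
    });
    next.forEach((value, key) => messages.set(key, value));
    converged = maxDelta < tolerance;
  }

  // Belief = unary potential times all incoming messages, normalized.
  const beliefs = unary.map((phi, i) => {
    let b0 = phi[0];
    let b1 = phi[1];
    for (const k of neighbors[i]) {
      const m = messages.get(`${k}>${i}`)!;
      b0 *= m[0];
      b1 *= m[1];
    }
    return b1 / (b0 + b1);
  });
  return { beliefs, iterations, converged };
}
```

With a coupling of 1 the variables are independent and each belief equals its normalized evidence; raising the coupling lets strong evidence on one variable pull the beliefs of its neighbors upward, which is the effect the posterior probabilities are meant to capture.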
Phase 3: Composite scoring and response assembly
computeOpacityScore takes the six algorithm outputs and applies the weighted formula to produce a single opacity score with a categorical grade. The generate_edd_report tool additionally assembles all six outputs into a unified EDDReport structure with severity-graded findings and natural-language recommendations.
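The weighted formula and grade bands can be sketched as below, using the weights and thresholds listed in this document. The names weightedOpacity and gradeFor are hypothetical, and the boundary handling (lower bounds inclusive, 0.75 itself still HIGH) is an assumption. Note that a plain weighted sum of the component values in the sample response gives about 0.66 rather than the 0.71 shown there, so the real computeOpacityScore may apply adjustments that are not described here; both values fall in the HIGH band.

```typescript
interface OpacityComponents {
  ownershipDepthScore: number;
  transliterationRisk: number;
  burstAnomalyScore: number;
  coLocationScore: number;
  infraConcealmentScore: number;
  beneficialOwnerUncertainty: number;
}

// Weights as documented: 25 / 15 / 10 / 15 / 15 / 20 percent.
const OPACITY_WEIGHTS: OpacityComponents = {
  ownershipDepthScore: 0.25,
  transliterationRisk: 0.15,
  burstAnomalyScore: 0.1,
  coLocationScore: 0.15,
  infraConcealmentScore: 0.15,
  beneficialOwnerUncertainty: 0.2,
};

// Plain weighted sum of the six components (each expected in 0..1).
function weightedOpacity(c: OpacityComponents): number {
  return (
    OPACITY_WEIGHTS.ownershipDepthScore * c.ownershipDepthScore +
    OPACITY_WEIGHTS.transliterationRisk * c.transliterationRisk +
    OPACITY_WEIGHTS.burstAnomalyScore * c.burstAnomalyScore +
    OPACITY_WEIGHTS.coLocationScore * c.coLocationScore +
    OPACITY_WEIGHTS.infraConcealmentScore * c.infraConcealmentScore +
    OPACITY_WEIGHTS.beneficialOwnerUncertainty * c.beneficialOwnerUncertainty
  );
}

type OpacityGrade = "LOW" | "MODERATE" | "ELEVATED" | "HIGH" | "CRITICAL";

// Assumption: lower bounds are inclusive and CRITICAL starts strictly above 0.75.
function gradeFor(score: number): OpacityGrade {
  if (score < 0.15) return "LOW";
  if (score < 0.35) return "MODERATE";
  if (score < 0.55) return "ELEVATED";
  if (score <= 0.75) return "HIGH";
  return "CRITICAL";
}
```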
Tips for best results
- Provide the jurisdiction code when known. Specifying "KY" routes queries to OpenCorporates + GLEIF only (faster, lower internal cost); omitting it triggers a full scan of all 6 registries (more thorough but 2× slower).
- Supply known domains to correlate_infrastructure and compute_entity_opacity_score. Without domains, infrastructure analysis returns minimal results. Format as "EntityName:domain.com"; multiple domains per entity are supported.
- Use screen_with_transliteration before infer_beneficial_owner. Sanctions hits update the sanctions_co_occurrence evidence factor in belief propagation. Running screening first and passing those names to the inference tool gives higher-quality posterior probabilities.
- For large-scale batch investigations, call tools in parallel. The Apify API supports concurrent runs. Screening 50 entities one at a time takes 50 × 60 seconds; running 10 in parallel reduces wall-clock time to 300 seconds.
- Start with compute_entity_opacity_score for triage. It runs all algorithms and returns a single grade. Route only HIGH and CRITICAL entities to the more expensive generate_edd_report for full documentation.
- Set a spending limit on your Apify account. Tools like generate_edd_report trigger up to 15 sub-actor runs internally. A spending cap ensures that unexpected entity complexity (many addresses, many officers) does not result in runaway costs.
- Use detect_registration_bursts on known persons, not just entities. A nominee director who appears as an officer on 40 companies registered in the same 3-month window is a strong shell company farm signal regardless of entity names.
Combine with other Apify actors
| Actor | How to combine |
|---|---|
| Counterparty Due Diligence MCP | Run broad KYB screening first, then escalate HIGH-risk entities to this server for deep anti-concealment analysis |
| OFAC Sanctions Search | Direct OFAC lookups when you need raw SDN records without the transliteration pipeline overhead |
| OpenSanctions Search | Query the full OpenSanctions dataset directly for custom PEP or watchlist workflows |
| UK Companies House | Pull UK PSC (persons with significant control) records directly for UK entity investigations |
| OpenCorporates Search | Raw registry data for custom analysis pipelines outside the MCP interface |
| GLEIF LEI Lookup | Look up parent/child corporate relationships via Legal Entity Identifiers directly |
| WHOIS Domain Lookup | Domain registration data for custom infrastructure investigations |
| Company Deep Research | Comprehensive company intelligence reports to supplement EDD findings |
Opacity scoring reference
| Grade | Score Range | Meaning | Recommended Action |
|---|---|---|---|
| LOW | 0.00–0.15 | Transparent structure, minimal concealment indicators | Standard due diligence |
| MODERATE | 0.15–0.35 | Some complexity, limited opacity signals | Enhanced documentation recommended |
| ELEVATED | 0.35–0.55 | Multiple opacity factors present | Enhanced due diligence required |
| HIGH | 0.55–0.75 | Significant concealment indicators detected | EDD report, senior compliance review |
| CRITICAL | 0.75+ | Multiple active concealment techniques detected | Escalate, consider relationship rejection |
Limitations
- Public registries only. The server accesses only publicly available corporate registry data. Private company ownership in jurisdictions with no public beneficial owner register (Delaware LLCs, Wyoming, BVI pre-2023) may not be fully traceable.
- No real-time sanctions updates. Watchlist data reflects the underlying actor caches. OFAC and OpenSanctions actors refresh periodically; there may be a delay of hours to days between a new designation and its appearance in results.
- Bearer share structures. Entities using bearer shares (most jurisdictions have abolished these but historical structures exist) cannot be traced through public registry data.
- Trust structures. Discretionary trusts typically have no public beneficial owner record. The system will identify trust structures as opacity factors but cannot penetrate them.
- Name disambiguation. Common names like "John Smith" or "Zhang Wei" will match many unrelated individuals in corporate records. The transliteration pipeline scores are most reliable for distinctive names.
- Rate limits on sub-actors. Each tool call internally calls multiple Apify actors. In high-volume scenarios, underlying actors may approach rate limits. The server handles this gracefully with empty-array fallbacks, but results may be incomplete for very large entity name queries (50 names in a single screening call).
- English-language name matching. The phonetic pipeline (Double Metaphone, Caverphone) is optimized for Latin-script names with English-like phonology. Names in Chinese, Arabic, and other non-Latin scripts are handled through Unicode normalization, but phonetic matching accuracy is lower for them.
- Infrastructure analysis requires known domains. Without domain names as input, the correlate_infrastructure tool and the infrastructure component of compute_entity_opacity_score return minimal results, reducing overall scoring accuracy.
Integrations
- Apify API — Call tools programmatically from compliance systems, trigger via HTTP POST to the /mcp endpoint
- Webhooks — Configure alerts when runs complete or when opacity scores exceed thresholds
- Apify Schedules — Set up recurring monitoring of high-risk entities on daily or weekly intervals
- Dataset export — Download EDD reports as JSON for audit trail storage and compliance file documentation
- LangChain / LlamaIndex — Integrate into RAG pipelines for automated compliance research agents
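For the HTTP POST integration listed above, an MCP tool invocation is a JSON-RPC 2.0 request with the method tools/call. The sketch below only builds the request body. The helper name `build_tool_call` is hypothetical, the `names` argument is the one this document mentions for screen_with_transliteration, and authentication headers and MCP session initialization are left out because this section does not specify them.

```python
import json


def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build the JSON-RPC 2.0 body for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    body = build_tool_call(
        "screen_with_transliteration",
        {"names": ["Ivan Petrov", "Ivan Petroff"]},  # example input only
    )
    # POST this JSON to the server's /mcp endpoint with your HTTP client of choice.
    print(json.dumps(body, indent=2))
```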
Responsible use
- This server accesses only publicly available government registries, open sanctions databases, and publicly disclosed certificate transparency logs.
- Use results to inform compliance decisions — do not make adverse determinations based solely on opacity scores without review by qualified compliance professionals.
- Comply with GDPR and applicable data protection laws when processing personal data returned in corporate officer records.
- Transliteration screening results are probabilistic indicators, not confirmed matches. Always verify high-scoring matches against primary source documents.
- Do not use this server to conduct surveillance or harassment of private individuals.
- For guidance on web scraping legality, see Apify's guide.
FAQ
How many jurisdictions does beneficial ownership detection cover? Direct registry access covers UK, Australia, Canada, and New Zealand. OpenCorporates extends coverage to 140+ jurisdictions with varying data depth. GLEIF provides global LEI parent-child relationships independent of national registries. For entities in jurisdictions with no public registry, the server relies on cross-border officer and address correlations as proxy evidence.
How does transliteration screening differ from standard sanctions matching? Standard screening uses exact string matching or simple Levenshtein distance, which misses intentional evasion. This server's 5-stage pipeline applies Unicode normalization (catching Cyrillic homoglyphs like "o" vs "о"), Double Metaphone phonetic encoding (catching phonetic equivalents like "Petrov"/"Petroff"), Caverphone encoding, Jaro-Winkler character-level similarity with a prefix bonus, and token-set ratio to catch name reordering. The combination catches evasion patterns that would require a human expert to identify manually.
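To make three of those stages concrete, here is a simplified sketch of homoglyph-aware normalization, Jaro-Winkler similarity, and a token-order-insensitive comparison. It is an illustration under stated assumptions, not the server's implementation: the homoglyph table is a tiny hand-picked subset, the token comparison is a sorted-token stand-in for a full token-set ratio, the two phonetic stages (Double Metaphone, Caverphone) are omitted, and combining the scores with `max` is an arbitrary choice.

```python
import unicodedata
from difflib import SequenceMatcher

# Illustrative subset of Cyrillic look-alikes (assumed; not the server's table).
HOMOGLYPHS = str.maketrans({"\u0430": "a", "\u0435": "e", "\u043e": "o",
                            "\u0440": "p", "\u0441": "c", "\u0445": "x"})


def normalize(name: str) -> str:
    """NFKC-normalize, case-fold, then map look-alike Cyrillic letters to Latin."""
    return unicodedata.normalize("NFKC", name).casefold().translate(HOMOGLYPHS)


def jaro(a: str, b: str) -> float:
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit, b_hit = [False] * len(a), [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ch:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [c for c, hit in zip(a, a_hit) if hit]
    b_seq = [c for c, hit in zip(b, b_hit) if hit]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(a: str, b: str, p: float = 0.1) -> float:
    """Jaro similarity plus a bonus for a shared prefix of up to 4 characters."""
    base = jaro(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return base + prefix * p * (1 - base)


def token_sort_ratio(a: str, b: str) -> float:
    """Compare names after sorting their tokens, so word order is ignored."""
    left, right = " ".join(sorted(a.split())), " ".join(sorted(b.split()))
    return SequenceMatcher(None, left, right).ratio()


def match_score(a: str, b: str) -> float:
    a, b = normalize(a), normalize(b)
    return max(jaro_winkler(a, b), token_sort_ratio(a, b))


if __name__ == "__main__":
    print(match_score("Petr\u043ev", "PETROV"))       # Cyrillic "o" in the first name
    print(match_score("Ivan Petrov", "Petrov Ivan"))  # reordered tokens
    print(match_score("Petrov", "Petroff"))           # spelling variant
```

Exact string matching fails on all three example pairs, which is the gap the multi-stage approach is meant to close.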
Can the server detect all beneficial owners? No. It infers likely beneficial owners from available public data and returns posterior probabilities with evidence breakdowns. Ownership chains using bearer shares, discretionary trusts, or jurisdictions with no public registry cannot be fully penetrated. The opacity score quantifies how much of the structure is obscured, so a CRITICAL grade signals that definitive UBO identification requires primary source documents, not just public data.
How accurate is the Kleinberg burst detection? The automaton identifies statistically anomalous registration clustering relative to the overall registration rate of the dataset. It will flag bursts reliably when a significant number of registrations occur in a compressed time window. Accuracy depends on data completeness — sparse registry results from jurisdictions with limited public data may under-report bursts.
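As a rough illustration of the idea, the sketch below implements the two-state variant of Kleinberg's automaton over the gaps (for example, in days) between consecutive incorporation dates. It is written under assumptions rather than taken from the server: the function name and the `s` (burst rate multiplier) and `gamma` (cost of entering the burst state) defaults are illustrative, and the server's actual state count and parameters are not documented here.

```python
import math


def kleinberg_two_state(gaps, s=2.0, gamma=1.0):
    """Label each inter-arrival gap 0 (baseline) or 1 (burst).

    Finds the cheapest state sequence with a Viterbi pass. Assumes a
    non-empty list of gaps with a positive total.
    """
    n = len(gaps)
    total = sum(gaps)
    rates = [n / total, s * n / total]  # exponential rates: baseline, burst
    up_cost = gamma * math.log(n)       # entering the burst state costs this; leaving is free

    def emit(state, gap):
        # Negative log-density of an exponential gap under the state's rate.
        return -math.log(rates[state]) + rates[state] * gap

    def trans(prev, cur):
        return up_cost if cur > prev else 0.0

    # cost[j] = cheapest cost of any sequence ending in state j (start in state 0).
    cost = [trans(0, j) + emit(j, gaps[0]) for j in (0, 1)]
    back = []
    for gap in gaps[1:]:
        new_cost, pointers = [], []
        for j in (0, 1):
            best_prev = min((0, 1), key=lambda i: cost[i] + trans(i, j))
            pointers.append(best_prev)
            new_cost.append(cost[best_prev] + trans(best_prev, j) + emit(j, gap))
        back.append(pointers)
        cost = new_cost

    # Trace the cheapest path backwards.
    state = min((0, 1), key=lambda j: cost[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    # Sparse registrations, then eight registrations half a day apart, then sparse again.
    gaps = [10.0] * 5 + [0.5] * 8 + [10.0] * 5
    print(kleinberg_two_state(gaps))
```

Because the baseline rate is estimated from the whole dataset, a sparse or incomplete registry sample shifts that baseline, which is one way the under-reporting described above can arise.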
How long does each tool call take? screen_with_transliteration and detect_registration_bursts typically complete in 30–60 seconds. unfold_ownership_graph and cluster_shell_addresses take 60–120 seconds due to parallel registry queries. infer_beneficial_owner and generate_edd_report take 90–180 seconds as they run the full 15-actor pipeline. The server uses standby mode, so there is no cold-start delay.
Is it legal to access this data? All data sources are publicly available government registries, open sanctions databases, and certificate transparency logs. See Apify's guide on web scraping legality. Sanctions list access is not restricted. Corporate registry data is public record. Use of the data for compliance purposes is consistent with AML/KYC regulatory obligations in most jurisdictions.
How does this differ from dedicated KYC compliance software like ComplyAdvantage or Dow Jones Risk & Compliance? Enterprise KYC platforms charge $500–2,000/month with per-query fees and long procurement cycles. This server costs $0.06–0.15 per tool call with no subscription, no minimum spend, and no integration project. The trade-off is that enterprise platforms maintain curated, continuously updated datasets with human review; this server uses open public data. For investigative research and supplementary screening, it provides comparable algorithmic depth at a fraction of the cost.
Can I run batch investigations across many entities simultaneously? Each tool call handles one entity (or up to 50 names for screen_with_transliteration). For batch investigations, use the Apify API to trigger multiple concurrent runs. The Apify platform supports parallel actor execution, so you can process 10–20 entities simultaneously with separate API calls.
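One way to structure such a batch is to cap concurrency on the client side. The sketch below is an assumption-laden outline: `run_batch` and the `call_tool` coroutine are hypothetical placeholders for whatever client code you use to invoke a tool, and the limit of 10 simply reflects the lower end of the range mentioned above.

```python
import asyncio


async def run_batch(entities, call_tool, max_concurrent=10):
    """Run call_tool(entity) for every entity, at most max_concurrent at a time.

    call_tool is a caller-supplied coroutine function (hypothetical here).
    A failed call is recorded as the exception object instead of aborting the batch.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(entity):
        async with semaphore:
            try:
                return entity, await call_tool(entity)
            except Exception as exc:  # keep going; report the failure per entity
                return entity, exc

    pairs = await asyncio.gather(*(run_one(e) for e in entities))
    return dict(pairs)


if __name__ == "__main__":
    async def fake_call(entity):
        # Stand-in for a real tool invocation.
        await asyncio.sleep(0.01)
        return {"entity": entity, "status": "done"}

    results = asyncio.run(run_batch(["Alpha Ltd", "Beta GmbH", "Gamma SA"], fake_call))
    for name, result in results.items():
        print(name, result)
```

Recording failures per entity rather than raising keeps one slow or failing registry lookup from discarding the rest of the batch.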
What does the Weisfeiler-Lehman kernel value mean in practice? The kernel value is the normalized dot product of infrastructure label histograms between two entities. A value above 0.7 indicates that both entities share a structurally similar digital infrastructure graph — same IP ranges, nameservers, or SSL certificate issuers — suggesting common control. A value of 1.0 indicates identical infrastructure fingerprints. Values below 0.4 indicate no meaningful infrastructure overlap.
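The normalized dot product described above can be sketched as a cosine similarity between two label histograms. This is an illustration only: the Weisfeiler-Lehman relabeling that produces the labels is omitted, the label strings are made up, and the function name `normalized_kernel` is hypothetical.

```python
import math
from collections import Counter


def normalized_kernel(labels_a, labels_b):
    """Cosine similarity between two label histograms, in the range [0, 1]."""
    hist_a, hist_b = Counter(labels_a), Counter(labels_b)
    dot = sum(count * hist_b[label] for label, count in hist_a.items())
    norm_a = math.sqrt(sum(c * c for c in hist_a.values()))
    norm_b = math.sqrt(sum(c * c for c in hist_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty fingerprint overlaps with nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Made-up infrastructure labels for two entities.
    entity_a = ["ns:ns1.example.net", "ip:203.0.113.0/24", "issuer:ExampleCA"]
    entity_b = ["ns:ns1.example.net", "ip:203.0.113.0/24", "issuer:OtherCA"]
    print(round(normalized_kernel(entity_a, entity_b), 3))
```

Normalizing by the histogram norms keeps the value comparable across entities with very different numbers of infrastructure records.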
What happens when a registry is unavailable or returns no data? All sub-actor calls include graceful error handling that returns empty arrays on failure. The algorithms process whatever data is available and note data gaps in the response. An empty ownership graph does not produce a HIGH opacity score — insufficient data is treated as uninformative rather than suspicious. The dataSources field in responses shows how many records each source contributed.
Can I schedule recurring opacity monitoring for a watchlist of entities? Yes. Use Apify Schedules to trigger tool calls via the HTTP API on a daily or weekly schedule. Store results in Apify Datasets and configure webhooks to alert your team when opacity scores exceed a threshold or when new sanctions matches appear.
Does this server replace a compliance officer? No. It provides investigative intelligence and structured evidence to support compliance workflows. Regulatory decisions under AML/KYC obligations require qualified compliance professionals to interpret findings, apply judgment, and document conclusions. Treat outputs as research inputs, not compliance determinations.
Troubleshooting
Opacity score seems low despite a complex-looking structure. The composite score weights are calibrated to the evidence available from public data. A multi-layered structure in jurisdictions with strong public registries (UK, Australia) may score lower than expected because the transparency of those registries reduces actual opacity, even if the structure is complex. Try generate_edd_report for the full narrative finding set, which distinguishes structural complexity from deliberate concealment.
unfold_ownership_graph returns only 1–2 nodes. This typically means the entity name has no strong matches in OpenCorporates or GLEIF. Try the exact registered name from the official registry, include the company number if known, or check the jurisdiction code — a GB company searched without "Ltd" or "Limited" may not match. Some jurisdictions have limited OpenCorporates coverage.
screen_with_transliteration returns CLEAR for a name with known aliases. The phonetic pipeline works best for Latin-script names. Non-Latin script names (Arabic, Chinese) rely primarily on token-set ratio. For names with known aliases, pass all known alias variants in the names array rather than relying on the pipeline to generate them.
infer_beneficial_owner returns an "Insufficient data" message. This occurs when corporate records contain no officer or director names, which is common for entities registered in jurisdictions with no public officer disclosure. Supply known persons via the known_persons parameter to seed the inference graph.
Tool call times out after 180 seconds. The full EDD pipeline can exceed the timeout for entities with very large corporate graphs (100+ related entities). Use compute_entity_opacity_score first to assess complexity, then call generate_edd_report with a specific jurisdiction parameter to constrain registry queries.
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom compliance integrations or enterprise deployments, reach out through the Apify platform.