Startup Ecosystem Intelligence MCP Server
Startup ecosystem intelligence for VC deal sourcing gives your AI assistant instant access to 8 public data sources — patents, GitHub activity, job postings, ArXiv research, tech stacks, corporate registries, and SaaS competitive data — all fused into a single structured deal memo. Built for venture capitalists, corporate development teams, and accelerator managers who need quantified, behavior-based signals rather than self-reported pitch deck data.
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| discover_startups | OpenCorporates + GitHub + SaaS intel sector search. | $0.10 |
| assess_innovation_velocity | Patents (USPTO+EPO) + GitHub + ArXiv publication velocity. | $0.12 |
| decode_hiring_signals | Job market role mix analysis, strategy inference. | $0.06 |
| analyze_competitive_moat | Tech stack + patents + competitor density + network effects. | $0.15 |
| verify_corporate_structure | OpenCorporates entity status, jurisdictions, history. | $0.05 |
| track_technology_trends | ArXiv + GitHub + patent trends for a technology area. | $0.10 |
| benchmark_against_cohort | SaaS competitive landscape + tech stack + hiring comparison. | $0.10 |
| generate_deal_memo | All 8 data sources, 4 scoring models, deal rating, thesis, red flags. | $0.40 |
Example at $0.10 per event: 100 events = $10.00 · 1,000 events = $100.00
Connect to your AI agent
Add this MCP server to Claude Desktop, Cursor, Windsurf, or any MCP-compatible client.
https://ryanclinton--startup-ecosystem-intelligence-mcp.apify.actor/mcp
{
  "mcpServers": {
    "startup-ecosystem-intelligence-mcp": {
      "url": "https://ryanclinton--startup-ecosystem-intelligence-mcp.apify.actor/mcp"
    }
  }
}
Documentation
This MCP server runs as a persistent Apify Standby actor and exposes 8 tools over the Model Context Protocol. Every tool call triggers parallel data collection across multiple APIs, applies one or more scoring algorithms (Innovation Velocity, Hiring Signal Decoder, Competitive Moat Analyzer, Corporate Health), and returns structured JSON with scores, signals, investment thesis points, and red flags. No manual research. No Crunchbase subscriptions. No waiting for founders to reply.
What data can you extract?
| Data Point | Source | Example |
|---|---|---|
| 📋 Corporate registration, entity status, officers | OpenCorporates (140+ jurisdictions) | "Acme Corp — Active, Delaware, 2 entities" |
| 🔬 US patent filings, assignees, claims | USPTO Patent Search | "7 patents filed, 3 in AI/ML classification" |
| 🌍 European patent applications | EPO Patent Search | "4 EP applications, 2 granted" |
| ⭐ GitHub repos, stars, forks, commit velocity | GitHub Repo Search | "23 repos, 4,812 stars, updated 3 days ago" |
| 🛠 Website tech stack components | Website Tech Stack Detector | "React, GraphQL, Kubernetes, Snowflake — 12 components" |
| 💼 Job postings by role, seniority, function | Job Market Intelligence | "34 open roles — 62% engineering, 18% sales" |
| 📄 Pre-publication ArXiv research papers | ArXiv Preprint Search | "9 papers, 5 published last 6 months" |
| 🏆 SaaS competitors, features, positioning | SaaS Competitive Intelligence | "8 direct competitors identified" |
| 🎯 Innovation Velocity Score (0-100) | Composite model | "Score: 74 — FAST velocity level" |
| 📊 Composite deal rating | All 8 sources | "DILIGENCE — compositeScore: 61" |
Why use Startup Ecosystem Intelligence MCP?
Manual startup research is slow, inconsistent, and biased toward companies with strong networks. A junior analyst takes 8-12 hours to gather patent data, check corporate registries, scan job boards, and benchmark GitHub activity for a single company. Multiplied across a pipeline of 50 deals per month, that is 400-600 analyst hours just to screen at the top of the funnel.
This MCP automates the entire observable-signal layer of startup due diligence. Connect it to Claude, Cursor, or any MCP-compatible client and run a full deal memo in under two minutes.
- Standby mode — the server stays warm and responds instantly without cold-start delays between queries
- API access — call any of the 8 tools programmatically from Python, JavaScript, or curl
- Parallel data collection — up to 8 Apify actors fire simultaneously, so a deal memo takes 60-90 seconds rather than running sequentially
- Spending controls — set a per-session budget; the server returns a clean error when the limit is reached
- Integrations — connect to Zapier, Make, or webhooks to trigger deal memo generation from a CRM deal stage change
Features
- Innovation Velocity Score (0-100): weighted composite of USPTO and EPO patents combined (up to 25 pts), GitHub repo count (up to 15 pts), total GitHub stars on a log2 scale (up to 15 pts), ArXiv paper count (up to 25 pts), and a recency bonus for activity in the past 6 months (up to 20 pts)
- Velocity level classification — five tiers: DORMANT, SLOW, MODERATE, FAST, HYPERGROWTH based on composite score thresholds (0/20/40/60/80)
- Hiring Signal Decoder — classifies 34+ job title keywords into 6 role categories (engineering, sales, marketing, executive, operations, other) and infers strategic direction: BUILDING (≥50% engineering), SCALING (≥40% sales), PIVOTING (≥30% executive), or MAINTAINING
- Competitive Moat Analyzer — scores tech stack depth (up to 25 pts), patent protection (up to 30 pts), competitor density using an inverse scoring model (fewer competitors = higher score), and GitHub star community proxy as network-effect signal (up to 20 pts via log2 scaling)
- Moat type classification — five levels: NONE, WEAK, MODERATE, STRONG, FORTRESS
- Corporate Health Check — active entity ratio scoring, existence verification, structural complexity penalty (entities beyond 3 incur penalty), and jurisdiction scoring that rewards 1-3 jurisdictions over tax-haven complexity
- Deal rating engine — composite score weighted as Innovation 30% + Moat 25% + Hiring 25% + Corporate 20%; maps to PASS / WATCH / DILIGENCE / STRONG_BUY at 25/50/75 thresholds
- Red flag detection — auto-surfaces concerning signals: dormant innovation, weak moat in a dense market, ≥3 dissolved entities, poor corporate health
- Investment thesis generation — auto-generates 1-4 thesis bullet points when positive signals cross thresholds
- 8 specialized MCP tools — each independently callable for targeted analysis without running the full deal memo
- Parallel actor orchestration: runActorsParallel() uses Promise.allSettled() so a single failing data source does not block the entire analysis
- Jurisdiction filtering: discover_startups and verify_corporate_structure accept an optional jurisdiction code for country-specific corporate registry queries
Use cases for startup ecosystem intelligence
VC deal sourcing with alternative data
Venture associates processing 40-80 inbound decks per week need a fast triage layer before partner time. Use discover_startups to surface active companies in a sector, then generate_deal_memo to score the top candidates on observable behavior — patent velocity, GitHub credibility, hiring direction — before scheduling calls. Replace the first week of due diligence with a 90-second automated screen.
Corporate development and acquisition scouting
Corporate development teams at technology companies need to map IP landscapes before approaching acquisition targets. analyze_competitive_moat reveals patent portfolio depth and tech stack complexity for any named company. assess_innovation_velocity surfaces which players are accelerating R&D output before they become expensive. Run this across a watchlist of 20 companies weekly with scheduled MCP calls.
Accelerator and portfolio benchmarking
Accelerator operators managing cohorts of 20-40 companies need a consistent evaluation framework. benchmark_against_cohort compares a portfolio company's tech stack maturity, open hiring positions, and competitive density against the sector. decode_hiring_signals monitors whether a portfolio company has shifted from building to scaling mode — an early signal of product-market fit.
Technology trend scouting and thesis development
LP-facing investment teams building thesis documents need to quantify technology momentum before capital allocation. track_technology_trends queries ArXiv paper counts, GitHub star velocity, and USPTO patent filings for a technology area (e.g., "diffusion models", "vector databases", "solid-state batteries") and returns a structured signal picture from research to commercialization.
Pre-investment due diligence automation
Legal and finance teams conducting formal due diligence can use verify_corporate_structure to check entity status, jurisdiction count, and inactive entity history across 140+ corporate registries via OpenCorporates before paying for a law firm to run the same check. Flags like dissolved subsidiaries or unusual jurisdiction stacking surface immediately.
Competitive landscape mapping for portfolio companies
Portfolio company founders preparing competitive analysis for board presentations can use benchmark_against_cohort to enumerate direct competitors, compare tech stack sophistication, and quantify open hiring volume as a proxy for competitor growth rate.
How to use startup ecosystem intelligence
- Connect the MCP server: Add the endpoint https://startup-ecosystem-intelligence-mcp.apify.actor/mcp to your MCP client (Claude Desktop, Cursor, Windsurf, or Cline) with your Apify API token as the Bearer token.
- Choose your tool: Ask your AI assistant to run a deal memo ("Generate a deal memo for Cohere") or a targeted analysis ("What is Databricks' hiring strategy right now?").
- Wait 60-90 seconds — The server queries up to 8 data sources in parallel. Full deal memos take longer than single-dimension tools.
- Review the structured output — Scores, ratings, signals, investment thesis points, and red flags are returned as structured JSON that your AI assistant can reason over and summarize.
Input parameters
This is an MCP server. There are no actor-level input parameters — connection is handled by your MCP client. Each tool accepts its own arguments as described below.
Tool parameters
| Tool | Parameter | Type | Required | Description |
|---|---|---|---|---|
| discover_startups | query | string | Yes | Technology, market sector, or keyword to search |
| discover_startups | jurisdiction | string | No | Country/jurisdiction code filter (e.g., "us_de", "gb") |
| assess_innovation_velocity | company | string | Yes | Company or organization name |
| decode_hiring_signals | company | string | Yes | Company name |
| decode_hiring_signals | location | string | No | Location filter for job postings |
| analyze_competitive_moat | company | string | Yes | Company name |
| analyze_competitive_moat | website | string | No | Company website URL; enables tech stack detection by URL rather than by name |
| verify_corporate_structure | company | string | Yes | Company name |
| verify_corporate_structure | jurisdiction | string | No | Jurisdiction code to filter corporate registry results |
| track_technology_trends | technology | string | Yes | Technology, framework, or research topic |
| benchmark_against_cohort | company | string | Yes | Company name to benchmark |
| benchmark_against_cohort | website | string | No | Company website URL |
| generate_deal_memo | company | string | Yes | Startup company name |
| generate_deal_memo | website | string | No | Company website URL; improves tech stack detection accuracy |
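As a rough illustration of how the table maps onto a request, the sketch below builds a JSON-RPC `tools/call` payload and checks argument names on the client side before anything is sent. The `TOOL_PARAMS` mapping and the `build_tool_call` helper are hypothetical names introduced here; only the tool and parameter names come from the table above.

```python
import json

# (required, optional) argument names per tool, transcribed from the table above.
TOOL_PARAMS = {
    "discover_startups": ({"query"}, {"jurisdiction"}),
    "assess_innovation_velocity": ({"company"}, set()),
    "decode_hiring_signals": ({"company"}, {"location"}),
    "analyze_competitive_moat": ({"company"}, {"website"}),
    "verify_corporate_structure": ({"company"}, {"jurisdiction"}),
    "track_technology_trends": ({"technology"}, set()),
    "benchmark_against_cohort": ({"company"}, {"website"}),
    "generate_deal_memo": ({"company"}, {"website"}),
}


def build_tool_call(tool: str, arguments: dict, request_id: int = 1) -> dict:
    """Return a JSON-RPC 2.0 tools/call payload after validating argument names."""
    if tool not in TOOL_PARAMS:
        raise ValueError(f"unknown tool: {tool}")
    required, optional = TOOL_PARAMS[tool]
    supplied = set(arguments)
    missing = required - supplied
    unexpected = supplied - required - optional
    if missing:
        raise ValueError(f"{tool} is missing required arguments: {sorted(missing)}")
    if unexpected:
        raise ValueError(f"{tool} does not accept: {sorted(unexpected)}")
    return {
        "jsonrpc": "2.0",
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
        "id": request_id,
    }


if __name__ == "__main__":
    payload = build_tool_call(
        "verify_corporate_structure",
        {"company": "Acme Corp", "jurisdiction": "us_de"},
    )
    print(json.dumps(payload, indent=2))
```

Because every tool call is a billable event, rejecting a malformed call locally avoids paying for a request that was always going to return an error.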
Connection examples
Claude Desktop — add to claude_desktop_config.json:
{
  "mcpServers": {
    "startup-ecosystem": {
      "url": "https://startup-ecosystem-intelligence-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}
Cursor or Windsurf — MCP settings:
{
  "startup-ecosystem-intelligence": {
    "url": "https://startup-ecosystem-intelligence-mcp.apify.actor/mcp",
    "headers": { "Authorization": "Bearer YOUR_APIFY_TOKEN" }
  }
}
Direct HTTP call:
curl -X POST "https://startup-ecosystem-intelligence-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{"jsonrpc":"2.0","method":"tools/call","params":{"name":"assess_innovation_velocity","arguments":{"company":"Mistral AI"}},"id":1}'
Input tips
- Provide the website URL when available: analyze_competitive_moat and generate_deal_memo use it for tech stack detection by URL, which is more accurate than name-based lookup.
- Use jurisdiction codes for corporate verification: Pass "us_de" for Delaware, "gb" for UK, "de" for Germany. Without it, results include all matching jurisdictions.
- Run assess_innovation_velocity first for pre-revenue companies: patents and ArXiv papers are available even when revenue data is not.
- Use track_technology_trends for thesis building: Query a technology area ("agentic RAG", "model distillation") rather than a company name to map the research-to-commercialization pipeline.
Output example
Full output from generate_deal_memo for a hypothetical company:
{
"company": "NovaSynth AI",
"compositeScore": 68,
"dealRating": "DILIGENCE",
"innovationVelocity": {
"score": 74,
"patentCount": 5,
"epoPatentCount": 3,
"githubRepos": 18,
"githubStars": 6240,
"arxivPapers": 7,
"velocityLevel": "FAST",
"signals": [
"8 patents filed (5 USPTO, 3 EPO)",
"6240 GitHub stars across 18 repos — strong developer community",
"18 public repositories — active open source presence",
"7 ArXiv publications — research-driven innovation"
]
},
"hiringSignals": {
"score": 61,
"totalJobs": 28,
"engineeringJobs": 16,
"salesJobs": 5,
"executiveJobs": 2,
"strategyInference": "BUILDING",
"roleDistribution": {
"engineering": 16,
"sales": 5,
"marketing": 3,
"executive": 2,
"operations": 1,
"other": 1
},
"signals": [
"57% engineering hires — product building phase",
"28 open positions — significant growth"
]
},
"competitiveMoat": {
"score": 58,
"techStackDepth": 14,
"competitorCount": 6,
"patentProtection": 8,
"moatType": "MODERATE",
"moatFactors": [
"Technical complexity",
"Patent portfolio"
],
"signals": [
"14 technology components detected — complex tech stack",
"8 patents providing IP protection"
]
},
"corporateHealth": {
"score": 82,
"entityCount": 2,
"activeEntities": 2,
"inactiveEntities": 0,
"jurisdictions": ["us_de", "gb"],
"healthLevel": "STRONG",
"signals": [
"2 corporate registration(s) found"
]
},
"allSignals": [
"8 patents filed (5 USPTO, 3 EPO)",
"6240 GitHub stars across 18 repos — strong developer community",
"7 ArXiv publications — research-driven innovation",
"57% engineering hires — product building phase",
"28 open positions — significant growth",
"14 technology components detected — complex tech stack",
"2 corporate registration(s) found"
],
"investmentThesis": [
"High innovation velocity (74/100) — strong R&D output",
"Engineering-heavy hiring — product building phase (ideal for early-stage)"
],
"redFlags": []
}
Output fields
| Field | Type | Description |
|---|---|---|
| company | string | Company name as provided |
| compositeScore | number (0-100) | Weighted composite: Innovation 30% + Moat 25% + Hiring 25% + Corporate 20% |
| dealRating | string | PASS / WATCH / DILIGENCE / STRONG_BUY at thresholds 25/50/75 |
| innovationVelocity.score | number (0-100) | Innovation Velocity Score |
| innovationVelocity.patentCount | number | USPTO patents found |
| innovationVelocity.epoPatentCount | number | EPO patents found |
| innovationVelocity.githubRepos | number | GitHub repositories found |
| innovationVelocity.githubStars | number | Total stars across all repos |
| innovationVelocity.arxivPapers | number | ArXiv preprints found |
| innovationVelocity.velocityLevel | string | DORMANT / SLOW / MODERATE / FAST / HYPERGROWTH |
| innovationVelocity.signals | string[] | Human-readable evidence statements |
| hiringSignals.score | number (0-100) | Hiring Signal Score |
| hiringSignals.totalJobs | number | Total open positions found |
| hiringSignals.engineeringJobs | number | Engineering/developer roles |
| hiringSignals.salesJobs | number | Sales/BD/revenue roles |
| hiringSignals.executiveJobs | number | VP/Director/C-suite roles |
| hiringSignals.strategyInference | string | BUILDING / SCALING / PIVOTING / MAINTAINING / UNKNOWN |
| hiringSignals.roleDistribution | object | Count per role category (engineering, sales, marketing, executive, operations, other) |
| competitiveMoat.score | number (0-100) | Competitive Moat Score |
| competitiveMoat.techStackDepth | number | Technology components detected |
| competitiveMoat.competitorCount | number | Direct competitors found |
| competitiveMoat.patentProtection | number | Total patents (USPTO + EPO) |
| competitiveMoat.moatType | string | NONE / WEAK / MODERATE / STRONG / FORTRESS |
| competitiveMoat.moatFactors | string[] | Named moat sources (e.g., "Patent portfolio", "Community/network effects") |
| corporateHealth.score | number (0-100) | Corporate Health Score |
| corporateHealth.entityCount | number | Total corporate entities found |
| corporateHealth.activeEntities | number | Active/good standing entities |
| corporateHealth.inactiveEntities | number | Dissolved/revoked entities |
| corporateHealth.jurisdictions | string[] | Unique jurisdiction codes found |
| corporateHealth.healthLevel | string | POOR / CONCERNING / ACCEPTABLE / GOOD / STRONG |
| allSignals | string[] | All signal statements from all four models combined |
| investmentThesis | string[] | Auto-generated positive thesis points |
| redFlags | string[] | Auto-generated concern statements |
How much does it cost to run startup due diligence?
This MCP uses pay-per-event pricing — you pay $0.045 per tool call. Platform compute costs are included. The generate_deal_memo tool runs 8 actors in parallel but still costs a flat $0.045.
| Scenario | Tool calls | Cost per call | Total cost |
|---|---|---|---|
| Quick test — single innovation check | 1 | $0.045 | $0.045 |
| Targeted analysis — 3 dimensions | 3 | $0.045 | $0.14 |
| Full deal memo — one company | 1 | $0.045 | $0.045 |
| Screen 10 companies with deal memos | 10 | $0.045 | $0.45 |
| Full pipeline — 50 deals screened | 50 | $0.045 | $2.25 |
You can set a maximum spending limit per session in your Apify account settings to control costs. The server returns a clean error message when the budget is reached so your AI client can report it gracefully.
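For budgeting batch runs, the arithmetic in the table can be reproduced with a few lines. This is a minimal sketch that assumes the flat $0.045 per call quoted in this section; `total_cost` and `max_calls_within` are hypothetical helper names, and `Decimal` is used so the cents do not drift the way binary floats can.

```python
from decimal import Decimal

# Flat per-call price quoted in this section (pass a different price if yours differs).
PRICE_PER_CALL = Decimal("0.045")


def total_cost(tool_calls: int, price: Decimal = PRICE_PER_CALL) -> Decimal:
    """Cost in USD of a given number of tool calls."""
    return price * tool_calls


def max_calls_within(budget: Decimal, price: Decimal = PRICE_PER_CALL) -> int:
    """Largest whole number of tool calls that fits inside a spending limit."""
    return int(budget // price)


if __name__ == "__main__":
    for calls in (1, 3, 10, 50):
        print(f"{calls:>3} calls -> ${total_cost(calls)}")
    print("Calls within a $20 limit:", max_calls_within(Decimal("20")))
```

The three-call scenario comes out at $0.135, which the table rounds to $0.14.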
Compared to PitchBook at $20,000+/year or CB Insights at $6,000+/year, most VC teams using this MCP for deal screening spend under $20/month with no subscription commitment and no per-seat pricing.
Using startup ecosystem intelligence via the API
Python
import json

import requests

response = requests.post(
    "https://startup-ecosystem-intelligence-mcp.apify.actor/mcp",
    headers={
        "Content-Type": "application/json",
        # The MCP Streamable HTTP transport expects clients to accept both types.
        "Accept": "application/json, text/event-stream",
        "Authorization": "Bearer YOUR_APIFY_TOKEN",
    },
    json={
        "jsonrpc": "2.0",
        "method": "tools/call",
        "params": {
            "name": "generate_deal_memo",
            "arguments": {
                "company": "Cohere",
                "website": "https://cohere.com",
            },
        },
        "id": 1,
    },
    timeout=180,  # full deal memos typically take 60-90 seconds
)
response.raise_for_status()

# Assumes a plain JSON reply; the tool result is a JSON string in the first content item.
result = response.json()
memo = json.loads(result["result"]["content"][0]["text"])

print(f"Company: {memo['company']}")
print(f"Composite Score: {memo['compositeScore']}/100")
print(f"Deal Rating: {memo['dealRating']}")
print(f"Innovation Velocity: {memo['innovationVelocity']['score']}/100 ({memo['innovationVelocity']['velocityLevel']})")
print(f"Hiring Strategy: {memo['hiringSignals']['strategyInference']} ({memo['hiringSignals']['totalJobs']} open roles)")
print(f"Moat: {memo['competitiveMoat']['moatType']}")
print(f"Red Flags: {memo['redFlags']}")
JavaScript
const response = await fetch(
  "https://startup-ecosystem-intelligence-mcp.apify.actor/mcp",
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // The MCP Streamable HTTP transport expects clients to accept both types.
      "Accept": "application/json, text/event-stream",
      "Authorization": "Bearer YOUR_APIFY_TOKEN",
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      method: "tools/call",
      params: {
        name: "assess_innovation_velocity",
        arguments: { company: "Mistral AI" },
      },
      id: 1,
    }),
  }
);

// Assumes a plain JSON reply; the tool result is a JSON string in the first content item.
const result = await response.json();
const velocity = JSON.parse(result.result.content[0].text);

console.log(`Innovation Velocity Score: ${velocity.innovationVelocity.score}/100`);
console.log(`Level: ${velocity.innovationVelocity.velocityLevel}`);
console.log(`Patents: ${velocity.innovationVelocity.patentCount} USPTO + ${velocity.innovationVelocity.epoPatentCount} EPO`);
console.log(`GitHub: ${velocity.innovationVelocity.githubRepos} repos, ${velocity.innovationVelocity.githubStars} stars`);
console.log(`ArXiv: ${velocity.innovationVelocity.arxivPapers} publications`);
cURL
# Run a full deal memo
curl -X POST "https://startup-ecosystem-intelligence-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "generate_deal_memo",
      "arguments": {
        "company": "Hugging Face",
        "website": "https://huggingface.co"
      }
    },
    "id": 1
  }'

# Decode hiring signals only
curl -X POST "https://startup-ecosystem-intelligence-mcp.apify.actor/mcp" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "decode_hiring_signals",
      "arguments": { "company": "Databricks", "location": "San Francisco" }
    },
    "id": 2
  }'
How Startup Ecosystem Intelligence MCP works
MCP server architecture and standby mode
The server runs as an Apify Standby actor using Express.js with a POST /mcp endpoint. Incoming JSON-RPC 2.0 requests are handled by the @modelcontextprotocol/sdk McpServer class with a StreamableHTTPServerTransport. Each request instantiates a fresh server instance, connects it to the transport, and handles the request without persistent session state. This stateless-per-request design avoids memory leaks across high-volume pipelines.
In Standby mode (APIFY_META_ORIGIN === 'STANDBY'), the Express server remains live between requests. In non-standby mode (direct actor run), the server starts, logs a health-check message, and exits after 1 second — allowing the Apify platform to confirm the build is valid without billing for idle time.
Parallel data collection via runActorsParallel
Each tool call triggers runActorsParallel(), which fires up to 8 downstream Apify actors simultaneously using Promise.allSettled(). Actors are identified by hardcoded IDs (not slugs) to ensure version stability across actor renames. Each downstream actor runs at 256MB memory with a 120-second timeout. Failed actors return an empty array rather than failing the entire request — so a temporarily unavailable patent API does not block hiring signal analysis.
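The settle-everything pattern described above can be sketched in Python, where `asyncio.gather(..., return_exceptions=True)` plays the role of `Promise.allSettled()`. This is an illustrative analogue, not the server's code: `run_actors_parallel` and `fake_actor` are hypothetical names, and the fake actor stands in for a real downstream call so the example runs without network access.

```python
import asyncio
from typing import Any, Awaitable


async def run_actors_parallel(
    calls: dict[str, Awaitable[list[Any]]], timeout_s: float = 120.0
) -> dict[str, list[Any]]:
    """Await all calls concurrently; a failed or timed-out call yields []."""
    names = list(calls)
    outcomes = await asyncio.gather(
        *(asyncio.wait_for(calls[name], timeout_s) for name in names),
        return_exceptions=True,  # collect exceptions instead of aborting the batch
    )
    return {
        name: [] if isinstance(outcome, BaseException) else outcome
        for name, outcome in zip(names, outcomes)
    }


async def fake_actor(items: list[Any], fail: bool = False) -> list[Any]:
    # Stand-in for a downstream actor run.
    await asyncio.sleep(0)
    if fail:
        raise RuntimeError("source unavailable")
    return items


async def main() -> dict[str, list[Any]]:
    return await run_actors_parallel({
        "patents": fake_actor([], fail=True),
        "jobs": fake_actor([{"title": "ML Engineer"}]),
    })


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The failing patent source collapses to an empty list while the job results come through intact, which is the behavior the paragraph above describes.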
The 8 downstream actors are: opencorporates-search, patent-search (USPTO), epo-patent-search, github-repo-search, website-tech-stack-detector, job-market-intelligence, arxiv-paper-search, and saas-competitive-intel.
Four scoring models and the composite formula
scoreInnovationVelocity assigns up to 25 points for patents (USPTO + EPO at 3 pts each, capped at 25), up to 30 points for GitHub (15 pts for repo count at 2 pts each + 15 pts for log2(stars) * 2), up to 25 points for ArXiv (5 pts per paper), and up to 20 points for recency (items updated in the last 6 months at 3 pts each).
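The point allocation above can be written out as a short function. This is a sketch of the stated caps and multipliers only. Rounding to the nearest integer, scoring zero stars as zero points, and treating each tier floor as inclusive are assumptions, and the function names are hypothetical Python renderings rather than the server's source.

```python
import math

# Tier floors from the feature list; inclusive boundaries are an assumption.
LEVELS = [(80, "HYPERGROWTH"), (60, "FAST"), (40, "MODERATE"), (20, "SLOW"), (0, "DORMANT")]


def score_innovation_velocity(uspto: int, epo: int, repos: int, stars: int,
                              arxiv_papers: int, recent_items: int) -> int:
    """Sum the five capped components described in the text (maximum 100)."""
    patent_pts = min((uspto + epo) * 3, 25)      # 3 pts per patent, capped at 25
    repo_pts = min(repos * 2, 15)                # 2 pts per repo, capped at 15
    # log2(0) is undefined, so zero stars contribute nothing (assumption).
    star_pts = min(math.log2(stars) * 2, 15) if stars > 0 else 0
    arxiv_pts = min(arxiv_papers * 5, 25)        # 5 pts per paper, capped at 25
    recency_pts = min(recent_items * 3, 20)      # 3 pts per recent item, capped at 20
    return round(patent_pts + repo_pts + star_pts + arxiv_pts + recency_pts)


def velocity_level(score: int) -> str:
    """Map a 0-100 score to its velocity tier."""
    return next(label for floor, label in LEVELS if score >= floor)


if __name__ == "__main__":
    score = score_innovation_velocity(uspto=2, epo=1, repos=3, stars=1024,
                                      arxiv_papers=2, recent_items=2)
    print(score, velocity_level(score))
```

Note how quickly the caps bind: eight repositories or about 180 stars already saturate their components, so most of the spread between companies comes from patents, papers, and recency.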
decodeHiringSignals classifies job titles against keyword lists for 6 role categories. Engineering keywords include "ml", "ai", "sre", "platform", "devops". Executive keywords include "vp", "cto", "head of", "vice president". The role mix ratios drive the four-way strategy inference.
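A simplified version of this keyword classification might look like the following. Only the engineering and executive keywords quoted above come from the source. The extra `engineer` and sales keywords, the order in which categories and thresholds are checked, and the use of UNKNOWN for an empty job list are assumptions made for illustration.

```python
import re
from collections import Counter

# Checked in order, so an executive match wins over an engineering match (assumption).
ROLE_KEYWORDS = {
    "executive": ["vp", "cto", "head of", "vice president"],             # quoted in the text
    "engineering": ["ml", "ai", "sre", "platform", "devops", "engineer"],  # "engineer" is illustrative
    "sales": ["sales", "account executive"],                             # illustrative placeholders
}


def classify_title(title: str) -> str:
    """Return the first role category whose keyword appears as a whole word."""
    lowered = title.lower()
    for role, keywords in ROLE_KEYWORDS.items():
        if any(re.search(rf"\b{re.escape(kw)}\b", lowered) for kw in keywords):
            return role
    return "other"


def infer_strategy(titles: list[str]) -> str:
    """Apply the role-mix thresholds from the feature list."""
    if not titles:
        return "UNKNOWN"
    counts = Counter(classify_title(t) for t in titles)
    total = len(titles)
    if counts["engineering"] / total >= 0.5:
        return "BUILDING"
    if counts["sales"] / total >= 0.4:
        return "SCALING"
    if counts["executive"] / total >= 0.3:
        return "PIVOTING"
    return "MAINTAINING"


if __name__ == "__main__":
    jobs = ["ML Engineer", "Site Reliability Engineer (SRE)",
            "Account Executive", "Office Manager"]
    print(infer_strategy(jobs))
```

Whole-word matching matters for short keywords such as "ai" and "ml", which would otherwise match inside unrelated words like "maintenance".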
analyzeCompetitiveMoat uses an inverted competitor score: fewer competitors increase the score. Network effects are proxied by GitHub stars using the same log2 formula as innovation velocity, scaled by 3x.
checkCorporateHealth applies a complexity penalty of 4 pts per entity beyond 3 (i.e., simple structures score higher). Jurisdiction count scoring rewards 1-3 jurisdictions and penalizes each additional jurisdiction by 5 pts beyond 3.
The composite deal memo score weights these as Innovation 30% + Moat 25% + Hiring 25% + Corporate 20%.
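The weighting and the 25/50/75 rating thresholds can be checked with a few lines. The sketch below assumes the composite is rounded to the nearest integer and that each threshold is inclusive, neither of which the text states; `composite_score` and `deal_rating` are hypothetical names.

```python
# Weights from the composite formula above.
WEIGHTS = {"innovation": 0.30, "moat": 0.25, "hiring": 0.25, "corporate": 0.20}


def composite_score(innovation: float, moat: float, hiring: float, corporate: float) -> int:
    """Weighted sum of the four model scores, rounded to an integer (assumption)."""
    scores = {"innovation": innovation, "moat": moat, "hiring": hiring, "corporate": corporate}
    return round(sum(WEIGHTS[name] * scores[name] for name in WEIGHTS))


def deal_rating(score: float) -> str:
    """Map a composite score to a rating; inclusive thresholds are an assumption."""
    if score >= 75:
        return "STRONG_BUY"
    if score >= 50:
        return "DILIGENCE"
    if score >= 25:
        return "WATCH"
    return "PASS"


if __name__ == "__main__":
    # Sub-scores taken from the output example: 74, 58, 61, 82.
    score = composite_score(innovation=74, moat=58, hiring=61, corporate=82)
    print(score, deal_rating(score))
```

With the sub-scores from the output example, the weighted sum is 22.2 + 14.5 + 15.25 + 16.4 = 68.35, which rounds to the 68 and DILIGENCE shown there.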
Pay-per-event billing with circuit-breaker
Every tool handler calls Actor.charge({ eventName: '...' }) before executing. If chargeResult.eventChargeLimitReached is true, the tool immediately returns a structured error object rather than running downstream actors. This ensures spending limits are respected at the tool level, not just at the platform level.
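The charge-then-run ordering can be sketched as below. This does not use the Apify SDK: `ChargeResult`, `guarded_tool_call`, and the injected `charge` and `run_tool` callables are hypothetical stand-ins that mirror the `eventChargeLimitReached` flag described above, and the shape of the error object is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ChargeResult:
    # Mirrors the eventChargeLimitReached flag described above (hypothetical shape).
    event_charge_limit_reached: bool


def guarded_tool_call(
    event_name: str,
    charge: Callable[[str], ChargeResult],
    run_tool: Callable[[], dict[str, Any]],
) -> dict[str, Any]:
    """Charge first; skip all downstream work once the spending limit is hit."""
    result = charge(event_name)
    if result.event_charge_limit_reached:
        return {"error": True, "message": f"Spending limit reached for {event_name}"}
    return run_tool()


if __name__ == "__main__":
    over_budget = guarded_tool_call(
        "generate_deal_memo",
        charge=lambda name: ChargeResult(event_charge_limit_reached=True),
        run_tool=lambda: {"company": "NovaSynth AI"},
    )
    print(over_budget)
```

Checking the flag before any downstream call is what makes this a circuit breaker: once the limit is reached, no further actor runs are started.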
Tips for best results
- Always provide the website URL for moat and deal memo analysis. analyze_competitive_moat and generate_deal_memo fall back to name-based tech stack detection when no URL is given, which is less accurate. Passing "website": "https://company.com" switches the tech stack detector to URL-based analysis.
- Use assess_innovation_velocity as your first filter for deep-tech companies. Patent and ArXiv signals are available even for pre-revenue companies. If the velocity score is below 20 (DORMANT) for a company claiming to be research-driven, treat that as a red flag before spending time on a full deal memo.
- Run track_technology_trends quarterly to update investment theses. Query your thesis keywords (e.g., "reasoning models", "AI agents", "synthetic biology CRISPR") to get a snapshot of research paper velocity, GitHub project momentum, and patent filing rates. The trend data helps calibrate whether a theme is early-stage research or approaching commercialization.
- Combine decode_hiring_signals with calendar monitoring. A company that was BUILDING six months ago and is now SCALING has likely achieved product-market fit. Run hiring signal checks on your watchlist monthly to detect inflection points before they show up in public announcements.
- Verify corporate structure before term sheet conversations. verify_corporate_structure can surface dissolved entities, unusual jurisdiction stacking, or a high entity count in under two minutes. The same check takes 1-2 weeks and costs $3,000-5,000 when done through a law firm at the formal due diligence stage.
- Use discover_startups with jurisdiction codes to find regional plays. Pass "jurisdiction": "gb" or "jurisdiction": "de" to limit corporate registry results to UK or German entities respectively. Useful for Europe-focused fund mandates.
- Set a spending limit for batch pipeline runs. When screening a large pipeline, set an Apify spending limit at the account level so a runaway loop does not incur unexpected charges. At $0.045 per call, 1,000 calls cost $45, so budget accordingly.
Combine with other Apify MCP servers
| MCP Server | How to combine |
|---|---|
| M&A Target Intelligence MCP | Run deal memos first, then pass DILIGENCE-rated companies into M&A target screening for acquisition valuation and integration risk analysis |
| Workforce Competitive Intelligence MCP | Supplement decode_hiring_signals with talent flow data — who is leaving competitors to join a target startup is a strong conviction signal |
| Tech Ecosystem Analysis MCP | After track_technology_trends identifies a hot technology area, use tech ecosystem analysis to map the full dependency and integration landscape |
| Academic Commercialization Pipeline MCP | ArXiv papers surfaced by assess_innovation_velocity can be fed into academic commercialization screening to identify university spin-out opportunities |
| Website Tech Stack Detector | Run standalone tech stack detection across a list of competitors before calling benchmark_against_cohort to pre-populate comparison data |
| Company Deep Research | After generating a STRONG_BUY deal memo, use company deep research for a comprehensive narrative intelligence report to support IC memos |
Limitations
- No financial data — Revenue, ARR, burn rate, and fundraising history are not available from public signals. This tool provides behavioral proxies, not financial metrics.
- GitHub presence required for maximum innovation scores — Companies without public GitHub activity score lower on the innovation velocity model regardless of their actual engineering output. Stealth-mode or highly proprietary companies may be systematically underrated.
- Patent lag — USPTO and EPO patent data reflects filings, not grants, and may lag 3-18 months behind actual filing dates depending on publication delays. Very recent patent activity may not appear.
- Job posting coverage varies by market — Job market intelligence is stronger for English-language markets and major job boards. Companies in non-English markets or those hiring primarily through LinkedIn may show lower job counts than their actual open roles.
- OpenCorporates covers 140+ jurisdictions, not all — Some jurisdictions (particularly emerging markets) have limited or no coverage. Corporate verification is strongest for US, UK, EU, and Commonwealth jurisdictions.
- Tech stack detection requires a live website — The tech stack detector cannot analyze companies whose websites are behind authentication, use aggressive bot protection, or are entirely server-side rendered without JavaScript signals. Name-based fallback is available but less accurate.
- Competitor density is SaaS-biased: The saas-competitive-intel actor is optimized for software businesses. Hardware, deep tech, and biotech companies may show artificially low competitor counts.
- Composite score reflects public signals only: A company with a strong competitive moat built on proprietary hardware or closed-source IP may score lower than an equivalent open-source competitor.
- Not a replacement for full diligence — Deal memos produced here are a first-pass screening tool. Legal, financial, technical, and reference diligence must be conducted independently before investment decisions.
Integrations
- Apify API — Call the MCP endpoint programmatically from any language; useful for building batch screening pipelines that run nightly across a deal watchlist
- Webhooks — Trigger a deal memo run when a new company is added to your CRM pipeline via webhook; push the result to a Slack channel or Notion database
- Zapier — Connect to Zapier to run deal memos automatically when a deal stage changes in Salesforce, HubSpot, or Affinity CRM
- Make — Build multi-step automation scenarios: inbound pitch deck email → extract company name → run deal memo → append to Airtable deal tracker
- LangChain / LlamaIndex — Use deal memo JSON as a retrieval-augmented context source for AI investment research agents that synthesize multiple data points into LP-ready narratives
Troubleshooting
Deal memo returns empty or zero scores for a well-known company. Company name matching is case-sensitive and exact for some data sources. Try the exact legal entity name (e.g., "OpenAI, Inc." instead of "OpenAI"). For corporate registry lookups, the registered name may differ from the trade name.
assess_innovation_velocity shows no patents for an active IP company. Patent data is returned based on the assignee name in the USPTO/EPO registry. Subsidiaries or acquired companies may file patents under a different legal name. Try the parent company name or the operating entity name from the OpenCorporates output.
Tool call times out or returns an error. Each downstream actor has a 120-second timeout. During peak Apify platform load, some actors may exceed this. The runActorsParallel function uses Promise.allSettled(), so partial results are still returned for the actors that completed. Retry the specific failing tool independently.
Spending limit reached message returned instead of results. You have hit the per-session or per-account spending cap. Increase the limit in Account Settings or reduce tool calls per session.
Tech stack shows very few components for a technically sophisticated company. The tech stack detector needs a publicly accessible URL. If the website is behind aggressive bot protection (e.g., Cloudflare Under Attack Mode), provide the marketing subdomain (e.g., "https://www.company.com") rather than the app subdomain.
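For the timeout case above, where the advice is to retry the failing tool on its own, a small retry wrapper with exponential backoff is one option. This is a generic sketch rather than anything the server provides: `call_with_retry` is a hypothetical helper, and the attempt count and delays are arbitrary defaults.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on failure wait base_delay_s * 2**attempt and try again."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay_s * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


if __name__ == "__main__":
    state = {"tries": 0}

    def flaky_tool_call() -> dict:
        # Stand-in for a tool call that times out twice before succeeding.
        state["tries"] += 1
        if state["tries"] < 3:
            raise TimeoutError("downstream actor timed out")
        return {"strategyInference": "BUILDING"}

    print(call_with_retry(flaky_tool_call, sleep=lambda seconds: None))
```

Each retry is a separate tool call under pay-per-event billing, so a small attempt count keeps a persistent outage from multiplying costs.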
Responsible use
- This MCP only accesses publicly available data from corporate registries, patent databases, GitHub, ArXiv, and job posting aggregators.
- Respect the terms of service of each underlying data source and the Apify platform terms.
- Do not use corporate structure or officer data to target individuals for unsolicited outreach.
- Investment decisions based on this tool should be supplemented with proper legal, financial, and technical due diligence.
- For guidance on web scraping legality, see Apify's guide.
FAQ
How does startup ecosystem intelligence compare to Crunchbase or PitchBook for deal sourcing? Crunchbase and PitchBook aggregate self-reported data from founders and investors — round sizes, valuations, and team bios that companies control. This MCP measures observable behavior: patent filings, GitHub commits, hiring patterns, and academic research output. The two approaches are complementary. Use this MCP for behavioral due diligence and use Crunchbase/PitchBook for fundraising history.
How accurate is the Innovation Velocity Score for early-stage startups? Accuracy depends on public footprint. A seed-stage company with 3 GitHub repos, 1 ArXiv paper, and no patents will score low — not because it lacks innovation but because it lacks public evidence. The score is best interpreted as "observable innovation output" rather than "total innovation." It is most reliable for Series A+ companies with 12+ months of public activity.
Can startup ecosystem intelligence assess pre-revenue companies? Yes. Patent filings, ArXiv publications, GitHub activity, and job postings are available well before revenue. The scoring models do not require revenue data. A pre-revenue deep-tech company filing 5+ patents and publishing ArXiv papers will score high on innovation velocity.
How long does a full deal memo take to run? generate_deal_memo fires all 8 downstream actors in parallel. Typical wall-clock time is 60-90 seconds. Individual tools (decode_hiring_signals, assess_innovation_velocity) run 1-4 actors and typically complete in 20-40 seconds.
How many companies can I screen in one session? There is no hard limit per session. At $0.40 per generate_deal_memo call (see the pricing table), screening 100 companies costs $40.00. Set a spending limit in your Apify account to cap costs. For large batch pipelines, use the HTTP API rather than an interactive MCP client.
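For budgeting a screening run, the arithmetic can be sketched as below. The prices are copied from the pricing table on this page; `estimate_cost` is a hypothetical helper, not part of the MCP server.

```python
from decimal import Decimal

# Per-event prices in USD, copied from the pricing table.
PRICES = {
    "discover_startups": Decimal("0.10"),
    "assess_innovation_velocity": Decimal("0.12"),
    "decode_hiring_signals": Decimal("0.06"),
    "analyze_competitive_moat": Decimal("0.15"),
    "verify_corporate_structure": Decimal("0.05"),
    "track_technology_trends": Decimal("0.10"),
    "benchmark_against_cohort": Decimal("0.10"),
    "generate_deal_memo": Decimal("0.40"),
}


def estimate_cost(calls):
    """Total USD cost for a mapping of tool name -> number of calls."""
    return sum(
        (PRICES[tool] * count for tool, count in calls.items()), Decimal("0")
    )


# Full deal memos for 100 companies.
screening = estimate_cost({"generate_deal_memo": 100})

# One month of hiring + innovation checks across a 25-company portfolio.
monitoring = estimate_cost(
    {"decode_hiring_signals": 25, "assess_innovation_velocity": 25}
)
```

Using `Decimal` rather than floats keeps the totals exact to the cent, which matters when the result is compared against a spending limit.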
Can I use startup ecosystem intelligence to monitor a portfolio company over time? Yes. Run decode_hiring_signals and assess_innovation_velocity monthly on each portfolio company to track strategy shifts. A BUILDING company that moves to SCALING suggests product-market fit. A declining innovation velocity score alongside executive hiring may indicate a pivot or leadership change.
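A month-over-month comparison along these lines might look like the sketch below. The `Snapshot` fields (`hiring_phase`, `innovation_velocity`, `executive_openings`) are assumed names for values you would extract from the two tools' output; the real response schema may differ, and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One monthly reading for a portfolio company (hypothetical fields)."""
    month: str
    hiring_phase: str           # e.g. "BUILDING" or "SCALING"
    innovation_velocity: float  # score from assess_innovation_velocity
    executive_openings: int     # open executive roles seen in job postings


def flag_changes(previous, current):
    """Return human-readable flags for the two shifts described above."""
    flags = []
    if previous.hiring_phase == "BUILDING" and current.hiring_phase == "SCALING":
        flags.append("BUILDING -> SCALING: possible product-market fit")
    if (
        current.innovation_velocity < previous.innovation_velocity
        and current.executive_openings > 0
    ):
        flags.append(
            "velocity down with executive hiring: possible pivot or leadership change"
        )
    return flags


# Made-up readings for illustration only.
january = Snapshot("2024-01", "BUILDING", 62.0, 0)
february = Snapshot("2024-02", "SCALING", 55.0, 2)
flags = flag_changes(january, february)
```

The flags are prompts for a closer look, not conclusions; a single month's drop in velocity can simply reflect a quiet publishing period.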
Is it legal to use public patent, GitHub, and corporate registry data for investment research? Yes. Patent databases (USPTO, EPO), corporate registries (OpenCorporates), GitHub (public repositories), and ArXiv are all public-access databases explicitly designed for public use. See Apify's guide on web scraping legality for broader context.
How is the competitive moat score affected if a company has no GitHub presence? The network effects component (up to 20 pts) will score zero. The tech stack and patent components are unaffected. A company with 15+ patents and a complex proprietary tech stack can still achieve a STRONG moat rating without GitHub presence — but the maximum achievable score is capped at 80 rather than 100.
What happens if one of the 8 data sources is unavailable during a deal memo run? runActorsParallel uses Promise.allSettled(), so a single actor failure returns an empty array for that dimension rather than failing the entire request. The scoring models treat missing data as zero, which may lower scores. The response will still include all four scoring dimensions and the composite deal rating based on available data.
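Because missing data is scored as zero, a low score can mean either weak signals or an unavailable source. One way to tell them apart is to record which sources came back empty. This is a sketch that assumes you have per-source result lists on hand; `coverage` and the source names are hypothetical.

```python
def coverage(results):
    """Summarize which sources returned data, given source name -> list."""
    missing = sorted(name for name, items in results.items() if not items)
    return {
        "covered": len(results) - len(missing),
        "total": len(results),
        "missing": missing,
    }


# Hypothetical per-source results from one run; two sources returned nothing.
report = coverage(
    {
        "patents": [{"title": "Example patent"}],
        "github": [],
        "arxiv": [{"id": "example"}],
        "jobs": [],
    }
)
```

If `missing` is non-empty, rerunning the corresponding individual tool before trusting the composite rating is a reasonable precaution.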
Can I integrate this with my existing deal flow CRM? Yes. Use the HTTP API (Python, JavaScript, or cURL examples above) to call the MCP endpoint from any integration layer. Zapier and Make connectors are available on the Apify platform for no-code CRM integration. Common patterns include triggering a deal memo on new deal entry in Affinity, HubSpot, or Salesforce and writing the composite score back to a custom field.
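For a custom integration layer, the request body for a tool call can be sketched as follows. MCP tool invocations are JSON-RPC 2.0 messages with the `tools/call` method; the argument key `company` and the company name are assumptions for illustration, and the sketch only builds the body without sending it. Authentication headers and any session setup should follow the HTTP API examples earlier in this document.

```python
import json

# Endpoint from the "Connect to your AI agent" section.
MCP_URL = "https://ryanclinton--startup-ecosystem-intelligence-mcp.apify.actor/mcp"


def build_tool_call(tool, arguments, request_id=1):
    """Build a JSON-RPC 2.0 'tools/call' message for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# "company" is a hypothetical argument name; check the tool's input schema.
payload = build_tool_call("generate_deal_memo", {"company": "Example Robotics, Inc."})
body = json.dumps(payload)  # string to send as the POST body to MCP_URL
```

A CRM automation would send `body` when a new deal is created, then write the composite score from the response back to a custom field.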
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom integrations or enterprise deal sourcing pipelines, reach out through the Apify platform.