Website Tech Stack Detector
Detect 100+ web technologies on any website. Identifies CMS, frameworks, analytics, marketing tools, chat widgets, CDNs, payment systems, hosting, and more. Batch-analyze multiple sites with version detection and confidence scoring.
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| website-analyzed | Charged per website analyzed. Includes multi-page technology detection, category grouping, and confidence scoring. | $0.10 |
Example: 100 events = $10.00 · 1,000 events = $100.00
Documentation
Identify the technologies, frameworks, and services running on any website. Website Tech Stack Detector crawls one or more URLs, inspects HTTP headers, HTML meta tags, script sources, and body content, then matches them against a fingerprint database of 106 web technologies across 17 categories. Results include technology names, categories, version numbers, confidence levels, and a grouped category summary.
Point it at a list of competitor websites, prospect domains, or your own properties and get back a clean, structured breakdown of every detectable technology -- CMS platforms, frontend frameworks, analytics trackers, marketing automation tools, payment processors, CDNs, chat widgets, and more.
Why Use Website Tech Stack Detector?
Manually checking a website's technology stack means viewing page source, inspecting network requests, and using browser extensions one site at a time. This actor automates the process at scale:
- Batch analysis. Analyze dozens or hundreds of websites in a single run, no browser tabs or copy-pasting required.
- Multi-page detection. Technologies like payment widgets, chat tools, and blog frameworks often only appear on subpages. The actor crawls inner pages with smart prioritization to catch them.
- Implies chain resolution. Detecting WooCommerce automatically implies WordPress. Detecting Next.js implies React. The actor resolves these dependency chains so you get the complete picture.
- Scheduled monitoring. Set up recurring runs to track when competitors adopt new technologies, switch platforms, or add new services.
- Cost-effective. Runs on 256 MB memory using Cheerio HTML parsing (no browser rendering), analyzing 10-20 websites per minute.
Key Features
- 106 technologies detected across 17 categories including CMS, Frontend Frameworks, JavaScript Libraries, CSS Frameworks, Analytics, Marketing Automation, Chat & Support, E-Commerce & Payment, CDN & Performance, Hosting & Infrastructure, Email Services, Fonts & Icons, Security, Widgets, Developer Tools, A/B Testing, and Privacy.
- Multi-page analysis -- Crawl up to 10 pages per domain to catch technologies on subpages.
- Smart page prioritization -- Inner page discovery prioritizes /blog, /pricing, /shop, /docs, and /app over generic links.
- Version detection -- Extracts version numbers for WordPress, jQuery, Angular, Bootstrap, D3.js, PHP, Nginx, Apache, Drupal, Joomla, ASP.NET, and IIS.
- Confidence scoring -- Each detection is rated as "high" (headers, meta tags, script sources) or "medium" (HTML body patterns, implied dependencies).
- Implies chain resolution -- Automatically infers related technologies (WooCommerce → WordPress, Next.js → React, Nuxt.js → Vue.js, Gatsby → React).
- Category grouping -- Output includes a `categories` object grouping technology names by category for quick scanning.
- Bare domain support -- Enter `stripe.com` without the protocol; the actor normalizes it automatically.
- Domain deduplication -- The same domain entered multiple times is analyzed only once.
How to Use
- Open the actor in the Apify Console.
- Enter one or more website URLs or bare domains in the Website URLs or Domains field.
- Optionally adjust Max pages per domain (default is 3; set higher to discover more technologies on inner pages).
- Click Start to run the actor.
- When the run completes, open the Dataset tab to view or export results.
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `urls` | String[] | Yes | -- | Website URLs or bare domains to analyze (e.g., `stripe.com`, `https://shopify.com`) |
| `maxPagesPerDomain` | Integer | No | 3 | Max pages to crawl per domain (1-10). Higher values find more technologies |
| `proxyConfiguration` | Object | No | Apify Proxy | Proxy settings. Apify Proxy is enabled by default |
Input Examples
Competitor tech stack comparison:
{
"urls": ["shopify.com", "bigcommerce.com", "woocommerce.com", "squarespace.com"],
"maxPagesPerDomain": 5
}
Prospect technology profiling for sales:
{
"urls": ["https://acme.co", "https://initech.com", "https://globex.io"],
"maxPagesPerDomain": 3
}
Deep single-site analysis:
{
"urls": ["https://example.com"],
"maxPagesPerDomain": 10
}
Input Tips
- Start with the default 3 pages per domain. Only increase to 5-10 if you suspect technologies are hidden on inner pages (e.g., payment widgets on checkout, blog CMS on /blog).
- Use bare domains for convenience -- `stripe.com` works just as well as `https://www.stripe.com/`.
- Enable proxy when analyzing many sites to avoid rate limiting.
- For large lists (100+ domains), consider splitting across multiple runs for faster completion.
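One way to split a large domain list across runs is to chunk the input and start one run per chunk with the Apify Python client. A minimal sketch, assuming the batch size of 50 is an arbitrary choice and not a recommendation from the actor:

```python
def chunk(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def run_batches(domains, token, batch_size=50):
    """Start one detector run per batch; the runs then execute concurrently on Apify."""
    from apify_client import ApifyClient  # lazy import: chunk() stays usable without the client
    client = ApifyClient(token)
    for batch in chunk(domains, batch_size):
        # start() returns immediately instead of waiting for the run to finish
        client.actor("ryanclinton/website-tech-stack-detector").start(run_input={
            "urls": batch,
            "maxPagesPerDomain": 3,
        })

# Usage (not executed here):
# run_batches(my_domains, "YOUR_API_TOKEN")
```

Each batch becomes its own run, so results land in separate datasets that you can merge afterwards.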
Output Example
Each website produces one dataset item:
{
"url": "https://shopify.com",
"domain": "shopify.com",
"technologies": [
{
"name": "Shopify",
"category": "CMS",
"version": null,
"website": "https://www.shopify.com",
"confidence": "high"
},
{
"name": "React",
"category": "Frontend Frameworks",
"version": null,
"website": "https://react.dev",
"confidence": "high"
},
{
"name": "Cloudflare",
"category": "CDN & Performance",
"version": null,
"website": "https://www.cloudflare.com",
"confidence": "high"
},
{
"name": "Google Analytics",
"category": "Analytics",
"version": null,
"website": "https://analytics.google.com",
"confidence": "high"
},
{
"name": "Google Tag Manager",
"category": "Analytics",
"version": null,
"website": "https://tagmanager.google.com",
"confidence": "high"
},
{
"name": "Google Fonts",
"category": "Fonts & Icons",
"version": null,
"website": "https://fonts.google.com",
"confidence": "medium"
}
],
"techCount": 6,
"categories": {
"CMS": ["Shopify"],
"Frontend Frameworks": ["React"],
"CDN & Performance": ["Cloudflare"],
"Analytics": ["Google Analytics", "Google Tag Manager"],
"Fonts & Icons": ["Google Fonts"]
},
"pagesAnalyzed": 3,
"analyzedAt": "2025-01-15T10:30:00.000Z"
}
Output Fields
| Field | Type | Description |
|---|---|---|
| `url` | String | The analyzed website URL |
| `domain` | String | Normalized domain (`www` prefix stripped) |
| `technologies` | Object[] | Array of detected technologies |
| `technologies[].name` | String | Technology name |
| `technologies[].category` | String | Technology category (see Detection Reference) |
| `technologies[].version` | String or null | Version number if detectable |
| `technologies[].website` | String | Official website of the technology |
| `technologies[].confidence` | String | `"high"` (headers, meta, scripts) or `"medium"` (HTML patterns, implied) |
| `techCount` | Number | Total number of technologies detected |
| `categories` | Object | Technology names grouped by category |
| `pagesAnalyzed` | Number | Number of pages crawled for this domain |
| `analyzedAt` | String | ISO 8601 timestamp |
Programmatic Access (API)
Python
from apify_client import ApifyClient
client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("ryanclinton/website-tech-stack-detector").call(run_input={
"urls": ["shopify.com", "stripe.com", "vercel.com"],
"maxPagesPerDomain": 5,
})
for site in client.dataset(run["defaultDatasetId"]).iterate_items():
techs = ", ".join(t["name"] for t in site["technologies"])
print(f'{site["domain"]} ({site["techCount"]} techs): {techs}')
JavaScript
import { ApifyClient } from "apify-client";
const client = new ApifyClient({ token: "YOUR_API_TOKEN" });
const run = await client.actor("ryanclinton/website-tech-stack-detector").call({
urls: ["shopify.com", "stripe.com", "vercel.com"],
maxPagesPerDomain: 5,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const site of items) {
const techs = site.technologies.map(t => t.name).join(", ");
console.log(`${site.domain} (${site.techCount} techs): ${techs}`);
}
cURL
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~website-tech-stack-detector/runs?token=YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"urls": ["shopify.com", "stripe.com", "vercel.com"],
"maxPagesPerDomain": 5
}'
# Fetch results
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
How It Works
Detection Pipeline
For each page crawled, the detector runs three detection passes in order of cost (cheapest first):
1. HTTP Headers -- Checks response headers like `Server`, `X-Powered-By`, `X-Drupal-Cache`, and `X-Shopify-Stage` for technology signatures. Detections from headers are rated high confidence.
2. HTML Meta + Scripts -- Inspects `<meta name="generator">` tags and `<script src>` attributes for matching patterns. Script-based detection catches CDN URLs (e.g., `cdn.shopify.com`), analytics snippets, and framework-specific files. Detections are rated high confidence.
3. HTML Body Patterns -- Scans the full HTML body for technology-specific patterns (e.g., `wp-content`, `gatsby-image`, `data-wf-site`). This is the most expensive check and runs last. Detections are rated medium confidence.
After all three passes, the detector resolves implies chains: if WooCommerce is detected, WordPress is automatically added. If Next.js is detected, React is added. This loop runs until no new implied technologies are found.
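The implies-resolution loop described above can be sketched as a fixed-point iteration over a dependency map. The actor's actual code is not published here, and the `IMPLIES` table below is an illustrative subset of the relationships, not the full fingerprint database:

```python
# Illustrative subset of the implies relationships described above.
IMPLIES = {
    "WooCommerce": ["WordPress"],
    "Next.js": ["React"],
    "Nuxt.js": ["Vue.js"],
    "Gatsby": ["React"],
}

def resolve_implies(detected):
    """Repeatedly add implied technologies until no new ones appear."""
    resolved = set(detected)
    changed = True
    while changed:
        changed = False
        for tech in list(resolved):
            for implied in IMPLIES.get(tech, []):
                if implied not in resolved:
                    resolved.add(implied)
                    changed = True
    return resolved

# Detecting WooCommerce pulls in WordPress automatically.
print(sorted(resolve_implies({"WooCommerce", "Gatsby"})))
# → ['Gatsby', 'React', 'WooCommerce', 'WordPress']
```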
Page Prioritization
When crawling inner pages (beyond the homepage), the actor prioritizes pages likely to reveal additional technologies:
| Priority | Page Paths | Why |
|---|---|---|
| 1 (highest) | /blog, /pricing, /product, /shop, /store, /app, /dashboard | Different CMS, payment widgets, app frameworks |
| 2 | /about, /contact, /team | Form widgets, map embeds, chat tools |
| 3 | /docs, /documentation, /help, /support | Documentation frameworks, search tools |
| 5 (default) | All other internal links | General coverage |
Asset files (images, CSS, JS, PDF) and anchor-only links are automatically skipped.
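The prioritization logic above can be pictured as a small scoring function. This is a sketch based on the table, not the actor's exact code; the path lists and skip rules are assumptions drawn from the documentation:

```python
from urllib.parse import urlparse

# Priority tiers from the table above (lower score = crawled sooner).
PRIORITY_PATHS = [
    (1, ("/blog", "/pricing", "/product", "/shop", "/store", "/app", "/dashboard")),
    (2, ("/about", "/contact", "/team")),
    (3, ("/docs", "/documentation", "/help", "/support")),
]
SKIP_SUFFIXES = (".png", ".jpg", ".css", ".js", ".pdf")

def link_priority(url):
    """Return a crawl priority for an internal link, or None to skip it."""
    parsed = urlparse(url)
    path = parsed.path.lower()
    if path.endswith(SKIP_SUFFIXES):
        return None  # asset files are skipped
    if parsed.fragment and path in ("", "/"):
        return None  # anchor-only links are skipped
    for score, prefixes in PRIORITY_PATHS:
        if any(path.startswith(p) for p in prefixes):
            return score
    return 5  # default priority for all other internal links
```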
Version Detection
Version numbers are extracted for technologies that expose them:
| Source | Technologies |
|---|---|
| HTTP headers (`Server`, `X-Powered-By`, `X-AspNet-Version`) | PHP, Nginx, Apache, ASP.NET, IIS |
| Meta tags (`<meta name="generator">`) | WordPress, Drupal, Joomla |
| Script filenames (version in URL path) | jQuery, Angular, Bootstrap, D3.js |
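Script-filename extraction typically works by matching a version-like segment in the URL. A sketch of the approach, assuming a simple semver-style pattern rather than the actor's exact regex:

```python
import re

# Matches a semver-like segment in a script URL, e.g.
# ".../jquery-3.7.1.min.js" or ".../bootstrap/5.3.2/js/bootstrap.js".
VERSION_RE = re.compile(r"[/-](\d+\.\d+(?:\.\d+)?)(?:[/.]|$)")

def version_from_script_url(url):
    """Return the first version-like segment found in a script URL, or None."""
    match = VERSION_RE.search(url)
    return match.group(1) if match else None

print(version_from_script_url("https://code.jquery.com/jquery-3.7.1.min.js"))
# → 3.7.1
```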
Detection Reference — 17 Categories
| Category | Example Technologies |
|---|---|
| CMS | WordPress, Shopify, Drupal, Joomla, Squarespace, Wix, Webflow, Ghost, Hugo, Gatsby, BigCommerce, Magento, PrestaShop, WooCommerce |
| Frontend Frameworks | React, Vue.js, Angular, Next.js, Nuxt.js, Svelte, Ember.js, Backbone.js |
| JavaScript Libraries | jQuery, Lodash, Moment.js, D3.js, Three.js, GSAP, Axios |
| CSS Frameworks | Bootstrap, Tailwind CSS, Bulma, Foundation, Materialize |
| Analytics | Google Analytics, Google Tag Manager, Hotjar, Mixpanel, Segment, Plausible, Matomo, Amplitude, Heap, Pendo |
| Marketing Automation | HubSpot, Marketo, Pardot, Mailchimp, Klaviyo, ActiveCampaign, Drip, ConvertKit |
| Chat & Support | Intercom, Drift, Zendesk, LiveChat, Crisp, Tidio, Olark, Freshdesk |
| E-Commerce & Payment | Stripe, PayPal, Square, Braintree |
| CDN & Performance | Cloudflare, Fastly, Akamai, CloudFront, jsDelivr, Unpkg |
| Hosting & Infrastructure | Nginx, Apache, IIS, PHP, ASP.NET, Vercel, Netlify, Heroku, AWS |
| Email Services | Mailgun, SendGrid, Postmark |
| Fonts & Icons | Google Fonts, Font Awesome, Adobe Fonts |
| Security | reCAPTCHA, hCaptcha, Cloudflare Bot Management |
| Widgets | Google Maps, YouTube Embed |
| Developer Tools | Webpack, Vite, Babel |
| A/B Testing | Optimizely, Google Optimize, VWO |
| Privacy | OneTrust, CookieBot, TrustArc |
How Much Does It Cost?
Website Tech Stack Detector uses Cheerio-based HTML parsing (no browser rendering), making it very cost-effective:
| Scenario | Time | Cost |
|---|---|---|
| 10 websites, 3 pages each | ~30 seconds | Less than $0.01 |
| 50 websites, 3 pages each | ~2-3 minutes | ~$0.03 |
| 100 websites, 3 pages each | ~5-8 minutes | ~$0.05-$0.10 |
| 100 websites, 1 page each | ~2-3 minutes | ~$0.02-$0.04 |
Runs on 256 MB memory (minimum Apify allocation). The primary cost driver is the number of pages crawled -- setting maxPagesPerDomain to 1 is roughly 3x faster and cheaper than the default of 3.
Tips
- Start with the default 3 pages per domain. This catches most technologies without excessive crawling. Only increase to 5-10 if you suspect technologies are hidden on inner pages.
- Use bare domains for convenience. `stripe.com` works just as well as `https://www.stripe.com/`.
- Combine with other actors for enrichment. Pair with Website Contact Scraper or B2B Lead Qualifier to build enriched lead lists that include technology data.
- Schedule weekly runs to monitor competitor technology changes.
- Check the `categories` object for a quick summary instead of scrolling through the full `technologies` array.
- Use `techCount` for filtering. Export to Google Sheets and sort by `techCount` to identify the most technically sophisticated websites in your prospect list.
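Filtering and ranking by `techCount` can also be done in a few lines of Python on the dataset items. A sketch; the threshold of 10 is arbitrary, and the sample `items` stand in for real dataset records:

```python
def top_sites(items, min_tech_count=10):
    """Return (domain, techCount) pairs at or above the threshold,
    sorted from most to fewest detected technologies."""
    qualifying = [
        (site["domain"], site["techCount"])
        for site in items
        if site["techCount"] >= min_tech_count
    ]
    return sorted(qualifying, key=lambda pair: pair[1], reverse=True)

# `items` would normally come from the run's dataset, e.g.
# client.dataset(run["defaultDatasetId"]).iterate_items()
items = [
    {"domain": "a.com", "techCount": 14},
    {"domain": "b.com", "techCount": 6},
    {"domain": "c.com", "techCount": 21},
]
print(top_sites(items))
# → [('c.com', 21), ('a.com', 14)]
```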
Limitations
- HTML-only detection. Uses CheerioCrawler without a browser, so technologies that are loaded entirely through JavaScript at runtime (client-side-only rendering with no server-side hints) may not be detected.
- Fingerprint database scope. The 106-technology database covers the most common web technologies but may miss niche or very new tools. Technologies without distinctive HTML/header signatures are harder to detect.
- No login-protected pages. The actor cannot access pages behind authentication. Technologies used only in dashboard or admin areas will not be detected.
- Version detection is limited. Only technologies that expose version numbers in HTTP headers, meta tags, or script filenames can have versions extracted. Most technologies return `null` for version.
- Confidence levels are binary. Detection confidence is either "high" or "medium" -- there is no numeric score. "Medium" can mean either an HTML body pattern match or an implied dependency.
- Rate limiting. The actor uses 10 concurrent requests at 120 requests/minute. Very large lists (1000+ sites) should be split across runs.
Responsible Use
This actor accesses publicly available website content. Follow these guidelines:
- Respect robots.txt. The CheerioCrawler respects robots.txt directives by default. Websites that disallow crawling will be skipped.
- Use reasonable crawl depths. The default of 3 pages per domain is conservative. Avoid setting `maxPagesPerDomain` to 10 on hundreds of sites simultaneously.
- Do not use for competitive harm. Technology information should be used for legitimate business purposes like market research, sales prospecting, and security assessments, not for exploiting detected vulnerabilities.
FAQ
How many technologies can this actor detect? The fingerprint database covers 106 technologies across 17 categories, including major CMS platforms, frontend frameworks, analytics services, marketing tools, payment processors, CDNs, and more.
Does it detect version numbers? Yes, for technologies that expose version info in headers, meta tags, or script filenames. Version detection works for WordPress, Drupal, Joomla, jQuery, Angular, Bootstrap, D3.js, PHP, Nginx, Apache, ASP.NET, and IIS.
Why should I crawl more than just the homepage? Many technologies only appear on specific pages. Payment widgets load on checkout pages. Blog CMS themes appear on /blog. Chat widgets may initialize on certain routes. Crawling 3-5 pages significantly improves detection coverage.
Can I scan hundreds of websites in one run? Yes. The actor handles concurrent requests with 10 concurrency and 120 requests/minute rate limiting. For very large lists (1000+), consider splitting across runs.
What does the confidence field mean? "High" means the technology was identified through a definitive signal (HTTP header, meta tag, script source URL). "Medium" means it was detected through an HTML body pattern match or inferred via an implies relationship.
What are implies chains? Some technologies imply the presence of others. Detecting WooCommerce implies WordPress is present. Detecting Next.js implies React. Detecting Gatsby implies React. The actor automatically resolves these chains.
Integrations
- Apify API -- Trigger runs programmatically and retrieve results via REST API.
- Google Sheets -- Export results directly to a spreadsheet for team sharing and analysis.
- Zapier / Make -- Connect to 5,000+ apps through Apify integrations.
- Webhooks -- Get notified when runs complete and pipe results to your own endpoint.
- Scheduled Runs -- Track technology changes over time with recurring runs.
Related Actors
| Actor | What It Does | How It Complements This Actor |
|---|---|---|
| Website Contact Scraper | Extracts emails, phones, contacts from websites | Combine tech stack data with contact information for sales outreach |
| B2B Lead Qualifier | Scores and grades leads based on website signals | The qualifier detects CMS and tech as part of its scoring -- use this actor for a deeper standalone analysis |
| B2B Lead Gen Suite | Full enrichment pipeline from URLs | Build complete lead profiles including contacts, patterns, scores, and tech stacks |
| Company Deep Research Agent | Comprehensive company intelligence reports | Combine tech stack analysis with broader company research |
| WHOIS Domain Lookup | Domain registration and ownership data | Pair domain age and registrar data with technology analysis |
| SaaS Competitive Intelligence | Competitive analysis for SaaS companies | Supplement competitive intelligence with technology stack comparisons |