Website Tech Stack Detector

Detect 100+ web technologies on any website. Identifies CMS, frameworks, analytics, marketing tools, chat widgets, CDNs, payment systems, hosting, and more. Batch-analyze multiple sites with version detection and confidence scoring.

Pricing

Pay Per Event model. You only pay for what you use.

| Event | Description | Price |
| --- | --- | --- |
| website-analyzed | Charged per website analyzed. Includes multi-page technology detection, category grouping, and confidence scoring. | $0.10 |

Example: 100 events = $10.00 · 1,000 events = $100.00
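
The per-event arithmetic above can be sketched in a few lines. This is only an illustration of the published $0.10 price; the function name `estimate_event_cost` is hypothetical and not part of the actor or the Apify client.

```python
from decimal import Decimal

# Price per "website-analyzed" event, taken from the pricing table above.
PRICE_PER_WEBSITE = Decimal("0.10")

def estimate_event_cost(website_count: int) -> Decimal:
    """Return the pay-per-event charge for analyzing `website_count` sites."""
    if website_count < 0:
        raise ValueError("website_count must be non-negative")
    return PRICE_PER_WEBSITE * website_count

print(estimate_event_cost(100))   # 10.00
print(estimate_event_cost(1000))  # 100.00
```

Decimal is used instead of float so that currency values do not pick up binary rounding noise.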

Documentation

Identify the technologies, frameworks, and services running on any website. Website Tech Stack Detector crawls one or more URLs, inspects HTTP headers, HTML meta tags, script sources, and body content, then matches them against a fingerprint database of 106 web technologies across 17 categories. Results include technology names, categories, version numbers, confidence levels, and a grouped category summary.

Point it at a list of competitor websites, prospect domains, or your own properties and get back a clean, structured breakdown of every detectable technology -- CMS platforms, frontend frameworks, analytics trackers, marketing automation tools, payment processors, CDNs, chat widgets, and more.

Why Use Website Tech Stack Detector?

Manually checking a website's technology stack means viewing page source, inspecting network requests, and using browser extensions one site at a time. This actor automates the process at scale:

  • Batch analysis. Analyze dozens or hundreds of websites in a single run, no browser tabs or copy-pasting required.
  • Multi-page detection. Technologies like payment widgets, chat tools, and blog frameworks often only appear on subpages. The actor crawls inner pages with smart prioritization to catch them.
  • Implies chain resolution. Detecting WooCommerce automatically implies WordPress. Detecting Next.js implies React. The actor resolves these dependency chains so you get the complete picture.
  • Scheduled monitoring. Set up recurring runs to track when competitors adopt new technologies, switch platforms, or add new services.
  • Cost-effective. Runs on 256 MB memory using Cheerio HTML parsing (no browser rendering), analyzing 10-20 websites per minute.

Key Features

  • 106 technologies detected across 17 categories including CMS, Frontend Frameworks, JavaScript Libraries, CSS Frameworks, Analytics, Marketing Automation, Chat & Support, E-Commerce & Payment, CDN & Performance, Hosting & Infrastructure, Email Services, Fonts & Icons, Security, Widgets, Developer Tools, A/B Testing, and Privacy.
  • Multi-page analysis -- Crawl up to 10 pages per domain to catch technologies on subpages.
  • Smart page prioritization -- Inner page discovery prioritizes /blog, /pricing, /shop, /docs, and /app over generic links.
  • Version detection -- Extracts version numbers for WordPress, jQuery, Angular, Bootstrap, D3.js, PHP, Nginx, Apache, Drupal, Joomla, ASP.NET, and IIS.
  • Confidence scoring -- Each detection is rated as "high" (headers, meta tags, script sources) or "medium" (HTML body patterns, implied dependencies).
  • Implies chain resolution -- Automatically infers related technologies (WooCommerce → WordPress, Next.js → React, Nuxt.js → Vue.js, Gatsby → React).
  • Category grouping -- Output includes a categories object grouping technology names by category for quick scanning.
  • Bare domain support -- Enter stripe.com without the protocol; the actor normalizes it automatically.
  • Domain deduplication -- Same domain entered multiple times is analyzed only once.
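
Bare domain support and domain deduplication could work roughly as follows. This is a minimal sketch, not the actor's actual code: the helper names are hypothetical, and defaulting bare domains to HTTPS is an assumption.

```python
from urllib.parse import urlsplit

def normalize_domain(entry: str) -> str:
    """Reduce a URL or bare domain to a lowercase host without 'www.'."""
    entry = entry.strip()
    if "://" not in entry:
        entry = "https://" + entry  # assumption: bare domains default to HTTPS
    host = urlsplit(entry).hostname or ""
    return host[4:] if host.startswith("www.") else host

def dedupe_domains(entries):
    """Keep the first occurrence of each normalized domain, in input order."""
    seen = set()
    unique = []
    for entry in entries:
        domain = normalize_domain(entry)
        if domain and domain not in seen:
            seen.add(domain)
            unique.append(domain)
    return unique

print(dedupe_domains(["stripe.com", "https://www.stripe.com/", "Shopify.com"]))
# ['stripe.com', 'shopify.com']
```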

How to Use

  1. Open the actor in the Apify Console.
  2. Enter one or more website URLs or bare domains in the Website URLs or Domains field.
  3. Optionally adjust Max pages per domain (default is 3; set higher to discover more technologies on inner pages).
  4. Click Start to run the actor.
  5. When the run completes, open the Dataset tab to view or export results.

Input Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| urls | String[] | Yes | -- | Website URLs or bare domains to analyze (e.g., stripe.com, https://shopify.com) |
| maxPagesPerDomain | Integer | No | 3 | Max pages to crawl per domain (1-10). Higher values find more technologies. |
| proxyConfiguration | Object | No | Apify Proxy | Proxy settings. Apify Proxy is enabled by default. |

Input Examples

Competitor tech stack comparison:

{
    "urls": ["shopify.com", "bigcommerce.com", "woocommerce.com", "squarespace.com"],
    "maxPagesPerDomain": 5
}

Prospect technology profiling for sales:

{
    "urls": ["https://acme.co", "https://initech.com", "https://globex.io"],
    "maxPagesPerDomain": 3
}

Deep single-site analysis:

{
    "urls": ["https://example.com"],
    "maxPagesPerDomain": 10
}

Input Tips

  • Start with the default 3 pages per domain. Only increase to 5-10 if you suspect technologies are hidden on inner pages (e.g., payment widgets on checkout, blog CMS on /blog).
  • Use bare domains for convenience -- stripe.com works just as well as https://www.stripe.com/.
  • Enable proxy when analyzing many sites to avoid rate limiting.
  • For large lists (100+ domains), consider splitting across multiple runs for faster completion.
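
Splitting a large list across runs can be done before calling the actor. The sketch below builds one input object per batch using the documented `urls` and `maxPagesPerDomain` fields; the helper name `build_run_inputs` and the batch size of 100 are assumptions chosen for illustration.

```python
def build_run_inputs(domains, batch_size=100, max_pages=3):
    """Split `domains` into batches and return one actor input dict per batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        {"urls": domains[start:start + batch_size], "maxPagesPerDomain": max_pages}
        for start in range(0, len(domains), batch_size)
    ]

inputs = build_run_inputs(["a.com", "b.com", "c.com"], batch_size=2)
print(len(inputs))  # 2
```

Each dict can then be passed as `run_input` to a separate run.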

Output Example

Each website produces one dataset item:

{
    "url": "https://shopify.com",
    "domain": "shopify.com",
    "technologies": [
        {
            "name": "Shopify",
            "category": "CMS",
            "version": null,
            "website": "https://www.shopify.com",
            "confidence": "high"
        },
        {
            "name": "React",
            "category": "Frontend Frameworks",
            "version": null,
            "website": "https://react.dev",
            "confidence": "high"
        },
        {
            "name": "Cloudflare",
            "category": "CDN & Performance",
            "version": null,
            "website": "https://www.cloudflare.com",
            "confidence": "high"
        },
        {
            "name": "Google Analytics",
            "category": "Analytics",
            "version": null,
            "website": "https://analytics.google.com",
            "confidence": "high"
        },
        {
            "name": "Google Tag Manager",
            "category": "Analytics",
            "version": null,
            "website": "https://tagmanager.google.com",
            "confidence": "high"
        },
        {
            "name": "Google Fonts",
            "category": "Fonts & Icons",
            "version": null,
            "website": "https://fonts.google.com",
            "confidence": "medium"
        }
    ],
    "techCount": 6,
    "categories": {
        "CMS": ["Shopify"],
        "Frontend Frameworks": ["React"],
        "CDN & Performance": ["Cloudflare"],
        "Analytics": ["Google Analytics", "Google Tag Manager"],
        "Fonts & Icons": ["Google Fonts"]
    },
    "pagesAnalyzed": 3,
    "analyzedAt": "2025-01-15T10:30:00.000Z"
}

Output Fields

| Field | Type | Description |
| --- | --- | --- |
| url | String | The analyzed website URL |
| domain | String | Normalized domain (www prefix stripped) |
| technologies | Object[] | Array of detected technologies |
| technologies[].name | String | Technology name |
| technologies[].category | String | Technology category (see Detection Reference) |
| technologies[].version | String or null | Version number if detectable |
| technologies[].website | String | Official website of the technology |
| technologies[].confidence | String | "high" (headers, meta, scripts) or "medium" (HTML patterns, implied) |
| techCount | Number | Total number of technologies detected |
| categories | Object | Technology names grouped by category |
| pagesAnalyzed | Number | Number of pages crawled for this domain |
| analyzedAt | String | ISO 8601 timestamp |
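
As an illustration of how these fields fit together, the sketch below filters technologies by confidence and checks that `techCount` and `categories` agree with the `technologies` array. The sample item is abbreviated, the helper names are hypothetical, and the consistency rule is an assumption inferred from the output example rather than a documented guarantee.

```python
# Abbreviated item in the documented shape (values are illustrative).
item = {
    "domain": "shopify.com",
    "technologies": [
        {"name": "Shopify", "category": "CMS", "version": None, "confidence": "high"},
        {"name": "Google Fonts", "category": "Fonts & Icons", "version": None, "confidence": "medium"},
    ],
    "techCount": 2,
    "categories": {"CMS": ["Shopify"], "Fonts & Icons": ["Google Fonts"]},
}

def names_by_confidence(item, confidence):
    """Return technology names whose confidence equals `confidence`."""
    return [t["name"] for t in item["technologies"] if t["confidence"] == confidence]

def is_consistent(item):
    """Check that techCount and categories match the technologies array."""
    names = {t["name"] for t in item["technologies"]}
    grouped = {name for group in item["categories"].values() for name in group}
    return item["techCount"] == len(item["technologies"]) and names == grouped

print(names_by_confidence(item, "high"))  # ['Shopify']
print(is_consistent(item))                # True
```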

Programmatic Access (API)

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("ryanclinton/website-tech-stack-detector").call(run_input={
    "urls": ["shopify.com", "stripe.com", "vercel.com"],
    "maxPagesPerDomain": 5,
})

# Each dataset item describes one analyzed website.
for site in client.dataset(run["defaultDatasetId"]).iterate_items():
    techs = ", ".join(t["name"] for t in site["technologies"])
    print(f'{site["domain"]} ({site["techCount"]} techs): {techs}')

JavaScript

import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

// Start the actor and wait for the run to finish.
const run = await client.actor("ryanclinton/website-tech-stack-detector").call({
    urls: ["shopify.com", "stripe.com", "vercel.com"],
    maxPagesPerDomain: 5,
});

// Each dataset item describes one analyzed website.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const site of items) {
    const techs = site.technologies.map(t => t.name).join(", ");
    console.log(`${site.domain} (${site.techCount} techs): ${techs}`);
}

cURL

curl -X POST "https://api.apify.com/v2/acts/ryanclinton~website-tech-stack-detector/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["shopify.com", "stripe.com", "vercel.com"],
    "maxPagesPerDomain": 5
  }'

# Fetch results once the run has finished.
# DATASET_ID is the defaultDatasetId value from the run response above.
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"

How It Works

Detection Pipeline

For each page crawled, the detector runs three detection passes in order of cost (cheapest first):

  1. HTTP Headers -- Checks response headers like Server, X-Powered-By, X-Drupal-Cache, X-Shopify-Stage for technology signatures. Detections from headers are rated high confidence.

  2. HTML Meta + Scripts -- Inspects <meta name="generator"> tags and <script src> attributes for matching patterns. Script-based detection catches CDN URLs (e.g., cdn.shopify.com), analytics snippets, and framework-specific files. Detections are rated high confidence.

  3. HTML Body Patterns -- Scans the full HTML body for technology-specific patterns (e.g., wp-content, gatsby-image, data-wf-site). This is the most expensive check and runs last. Detections are rated medium confidence.
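
The three passes can be sketched as follows. This is a simplified illustration, not the actor's implementation: the fingerprint table is hypothetical (only the header names and body patterns mentioned above are taken from this document), and plain regular expressions stand in for Cheerio's HTML parsing.

```python
import re

# Hypothetical fingerprints; the actor's real database is not shown here.
FINGERPRINTS = {
    "Shopify": {"headers": {"x-shopify-stage": r"."}, "scripts": [r"cdn\.shopify\.com"]},
    "Drupal": {"headers": {"x-drupal-cache": r"."}, "generator": r"Drupal"},
    "WordPress": {"generator": r"WordPress", "body": [r"wp-content"]},
    "Gatsby": {"body": [r"gatsby-image"]},
    "Webflow": {"body": [r"data-wf-site"]},
}

META_GENERATOR = re.compile(
    r'<meta[^>]+name=["\']generator["\'][^>]+content=["\']([^"\']+)', re.I)
SCRIPT_SRC = re.compile(r'<script[^>]+src=["\']([^"\']+)', re.I)

def detect(headers, html):
    """Return {technology: confidence}, trying the cheapest signal first."""
    headers = {key.lower(): value for key, value in headers.items()}
    generators = META_GENERATOR.findall(html)
    scripts = SCRIPT_SRC.findall(html)
    found = {}
    for name, fp in FINGERPRINTS.items():
        # Pass 1: HTTP headers (high confidence).
        if any(re.search(pattern, headers.get(header, ""), re.I)
               for header, pattern in fp.get("headers", {}).items()):
            found[name] = "high"
        # Pass 2: meta generator tags and script sources (high confidence).
        elif "generator" in fp and any(
                re.search(fp["generator"], g, re.I) for g in generators):
            found[name] = "high"
        elif any(re.search(pattern, src, re.I)
                 for pattern in fp.get("scripts", []) for src in scripts):
            found[name] = "high"
        # Pass 3: full-body patterns, only if nothing cheaper matched (medium).
        elif any(re.search(pattern, html) for pattern in fp.get("body", [])):
            found[name] = "medium"
    return found
```

The sketch orders the checks per technology rather than per pass, which yields the same result for a single page while still skipping the body scan whenever a cheaper signal has already matched.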

After all three passes, the detector resolves implies chains: if WooCommerce is detected, WordPress is automatically added. If Next.js is detected, React is added. This loop runs until no new implied technologies are found.
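
The resolution loop described above is a fixed-point iteration. A minimal sketch follows, using the four implies relations named in this document; the function name and the dictionary layout are hypothetical, and implied technologies receive "medium" confidence as the confidence rules state.

```python
# Implies relations listed in this document.
IMPLIES = {
    "WooCommerce": ["WordPress"],
    "Next.js": ["React"],
    "Nuxt.js": ["Vue.js"],
    "Gatsby": ["React"],
}

def resolve_implies(detected):
    """Add implied technologies until a full pass adds nothing new."""
    resolved = dict(detected)
    changed = True
    while changed:
        changed = False
        for name in list(resolved):
            for implied in IMPLIES.get(name, []):
                if implied not in resolved:
                    resolved[implied] = "medium"  # implied, not directly observed
                    changed = True
    return resolved

print(resolve_implies({"WooCommerce": "high"}))
# {'WooCommerce': 'high', 'WordPress': 'medium'}
```

A technology that was already detected directly keeps its original confidence, since the loop only adds names that are missing.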

Page Prioritization

When crawling inner pages (beyond the homepage), the actor prioritizes pages likely to reveal additional technologies:

| Priority | Page Paths | Why |
| --- | --- | --- |
| 1 (highest) | /blog, /pricing, /product, /shop, /store, /app, /dashboard | Different CMS, payment widgets, app frameworks |
| 2 | /about, /contact, /team | Form widgets, map embeds, chat tools |
| 3 | /docs, /documentation, /help, /support | Documentation frameworks, search tools |
| 5 (default) | All other internal links | General coverage |

Asset files (images, CSS, JS, PDF) and anchor-only links are automatically skipped.
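
One way to express this prioritization is to rank each link by its first path segment. The sketch below assumes the links have already been filtered to internal ones; the helper names and the exact list of asset extensions are assumptions, while the priority values come from the table above.

```python
from urllib.parse import urlsplit

PRIORITY_BY_SEGMENT = {
    1: {"blog", "pricing", "product", "shop", "store", "app", "dashboard"},
    2: {"about", "contact", "team"},
    3: {"docs", "documentation", "help", "support"},
}
DEFAULT_PRIORITY = 5
# Assumed extension list covering images, CSS, JS, and PDF files.
ASSET_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp",
                    ".ico", ".css", ".js", ".pdf")

def link_priority(href):
    """Return the crawl priority for a link, or None if it should be skipped."""
    if href.startswith("#"):
        return None  # anchor-only link
    path = urlsplit(href).path.lower()
    if path.endswith(ASSET_EXTENSIONS):
        return None  # asset file
    first_segment = path.strip("/").split("/")[0]
    for priority, segments in PRIORITY_BY_SEGMENT.items():
        if first_segment in segments:
            return priority
    return DEFAULT_PRIORITY

def pick_inner_pages(hrefs, limit):
    """Return up to `limit` links, lowest priority number first."""
    ranked = [(link_priority(h), index, h) for index, h in enumerate(hrefs)]
    ranked = [entry for entry in ranked if entry[0] is not None]
    ranked.sort()  # ties keep their original order via the index
    return [href for _, _, href in ranked[:limit]]

links = ["/careers", "/logo.png", "#top", "/docs/api", "/pricing", "/about"]
print(pick_inner_pages(links, 3))  # ['/pricing', '/about', '/docs/api']
```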

Version Detection

Version numbers are extracted for technologies that expose them:

| Source | Technologies |
| --- | --- |
| HTTP headers (Server, X-Powered-By, X-AspNet-Version) | PHP, Nginx, Apache, ASP.NET, IIS |
| Meta tags (<meta name="generator">) | WordPress, Drupal, Joomla |
| Script filenames (version in URL path) | jQuery, Angular, Bootstrap, D3.js |
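
To make the three sources concrete, here is a sketch of version extraction with regular expressions. The patterns are hypothetical and cover only a few of the technologies listed; the actor's real patterns are not published in this document.

```python
import re
from typing import Optional

# Hypothetical patterns, one per source type from the table above.
SERVER_RE = re.compile(r"\b(nginx|apache)/(\d+(?:\.\d+)*)", re.I)
GENERATOR_RE = re.compile(r"\b(WordPress|Drupal|Joomla)\s+(\d+(?:\.\d+)*)", re.I)
SCRIPT_RE = re.compile(r"\b(jquery|bootstrap)[-@/.]v?(\d+(?:\.\d+)+)", re.I)

def extract_version(pattern: re.Pattern, text: str) -> Optional[str]:
    """Return the version captured by `pattern`, or None if there is no match."""
    match = pattern.search(text)
    return match.group(2) if match else None

print(extract_version(SERVER_RE, "nginx/1.24.0"))        # 1.24.0
print(extract_version(GENERATOR_RE, "WordPress 6.4.2"))  # 6.4.2
print(extract_version(SCRIPT_RE, "https://code.jquery.com/jquery-3.7.1.min.js"))  # 3.7.1
print(extract_version(SCRIPT_RE, "/js/jquery.min.js"))   # None
```

The last call shows why most detections report a null version: a script URL without a version number in its path gives the pattern nothing to capture.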

Detection Reference — 17 Categories

| Category | Example Technologies |
| --- | --- |
| CMS | WordPress, Shopify, Drupal, Joomla, Squarespace, Wix, Webflow, Ghost, Hugo, Gatsby, BigCommerce, Magento, PrestaShop, WooCommerce |
| Frontend Frameworks | React, Vue.js, Angular, Next.js, Nuxt.js, Svelte, Ember.js, Backbone.js |
| JavaScript Libraries | jQuery, Lodash, Moment.js, D3.js, Three.js, GSAP, Axios |
| CSS Frameworks | Bootstrap, Tailwind CSS, Bulma, Foundation, Materialize |
| Analytics | Google Analytics, Google Tag Manager, Hotjar, Mixpanel, Segment, Plausible, Matomo, Amplitude, Heap, Pendo |
| Marketing Automation | HubSpot, Marketo, Pardot, Mailchimp, Klaviyo, ActiveCampaign, Drip, ConvertKit |
| Chat & Support | Intercom, Drift, Zendesk, LiveChat, Crisp, Tidio, Olark, Freshdesk |
| E-Commerce & Payment | Stripe, PayPal, Square, Braintree |
| CDN & Performance | Cloudflare, Fastly, Akamai, CloudFront, jsDelivr, Unpkg |
| Hosting & Infrastructure | Nginx, Apache, IIS, PHP, ASP.NET, Vercel, Netlify, Heroku, AWS |
| Email Services | Mailgun, SendGrid, Postmark |
| Fonts & Icons | Google Fonts, Font Awesome, Adobe Fonts |
| Security | reCAPTCHA, hCaptcha, Cloudflare Bot Management |
| Widgets | Google Maps, YouTube Embed |
| Developer Tools | Webpack, Vite, Babel |
| A/B Testing | Optimizely, Google Optimize, VWO |
| Privacy | OneTrust, CookieBot, TrustArc |

How Much Does It Cost?

Website Tech Stack Detector uses Cheerio-based HTML parsing (no browser rendering), making it very cost-effective:

| Scenario | Time | Cost |
| --- | --- | --- |
| 10 websites, 3 pages each | ~30 seconds | Less than $0.01 |
| 50 websites, 3 pages each | ~2-3 minutes | ~$0.03 |
| 100 websites, 3 pages each | ~5-8 minutes | ~$0.05-$0.10 |
| 100 websites, 1 page each | ~2-3 minutes | ~$0.02-$0.04 |

Runs on 256 MB memory (minimum Apify allocation). The primary cost driver is the number of pages crawled -- setting maxPagesPerDomain to 1 is roughly 3x faster and cheaper than the default of 3.

Tips

  1. Start with the default 3 pages per domain. This catches most technologies without excessive crawling. Only increase to 5-10 if you suspect technologies are hidden on inner pages.
  2. Use bare domains for convenience. stripe.com works just as well as https://www.stripe.com/.
  3. Combine with other actors for enrichment. Pair with Website Contact Scraper or B2B Lead Qualifier to build enriched lead lists that include technology data.
  4. Schedule weekly runs to monitor competitor technology changes.
  5. Check the categories object for a quick summary instead of scrolling through the full technologies array.
  6. Use techCount for filtering. Export to Google Sheets and sort by techCount to find the websites in your prospect list with the most detected technologies.
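
Tips 5 and 6 can also be applied in code before exporting. The sketch below sorts dataset items by `techCount` and writes a compact CSV that lists each site's category names; the function name and the column layout are assumptions made for illustration.

```python
import csv
import io

def to_summary_csv(items):
    """Return CSV text with one row per site, highest techCount first."""
    rows = sorted(items, key=lambda item: item["techCount"], reverse=True)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["domain", "techCount", "categories"])
    for item in rows:
        categories = "; ".join(sorted(item["categories"]))
        writer.writerow([item["domain"], item["techCount"], categories])
    return buffer.getvalue()

sample = [
    {"domain": "a.com", "techCount": 2, "categories": {"CMS": ["WordPress"], "Analytics": ["Matomo"]}},
    {"domain": "b.com", "techCount": 6, "categories": {"CMS": ["Shopify"]}},
]
print(to_summary_csv(sample))
```

The resulting text can be saved to a file and imported into a spreadsheet.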

Limitations

  • HTML-only detection. Uses CheerioCrawler without a browser, so technologies that are loaded entirely through JavaScript at runtime (client-side-only rendering with no server-side hints) may not be detected.
  • Fingerprint database scope. The 106-technology database covers the most common web technologies but may miss niche or very new tools. Technologies without distinctive HTML/header signatures are harder to detect.
  • No login-protected pages. The actor cannot access pages behind authentication. Technologies used only in dashboard or admin areas will not be detected.
  • Version detection is limited. Only technologies that expose version numbers in HTTP headers, meta tags, or script filenames can have versions extracted. Most technologies return null for version.
  • Confidence levels are binary. Detection confidence is either "high" or "medium" -- there is no numeric score. "Medium" can mean either an HTML body pattern match or an implied dependency.
  • Rate limiting. The actor uses 10 concurrent requests at 120 requests/minute. Very large lists (1000+ sites) should be split across runs.
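
The 120 requests/minute cap sets a floor on run time, which helps when deciding how to split a list. A rough sketch, assuming one request per page and that every site is crawled up to the page limit (the function name is hypothetical):

```python
REQUESTS_PER_MINUTE = 120  # rate limit stated above

def min_runtime_minutes(site_count, pages_per_domain=3):
    """Lower bound on run time in minutes implied by the rate limit alone."""
    return site_count * pages_per_domain / REQUESTS_PER_MINUTE

print(min_runtime_minutes(1000))    # 25.0
print(min_runtime_minutes(100, 3))  # 2.5
```

Actual runs take longer than this floor because of network latency, retries, and slow sites.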

Responsible Use

This actor accesses publicly available website content. Follow these guidelines:

  • Respect robots.txt. The CheerioCrawler respects robots.txt directives by default. Websites that disallow crawling will be skipped.
  • Use reasonable crawl depths. The default of 3 pages per domain is conservative. Avoid setting maxPagesPerDomain to 10 on hundreds of sites simultaneously.
  • Do not use for competitive harm. Technology information should be used for legitimate business purposes like market research, sales prospecting, and security assessments, not for exploiting detected vulnerabilities.

FAQ

How many technologies can this actor detect? The fingerprint database covers 106 technologies across 17 categories, including major CMS platforms, frontend frameworks, analytics services, marketing tools, payment processors, CDNs, and more.

Does it detect version numbers? Yes, for technologies that expose version info in headers, meta tags, or script filenames. Version detection works for WordPress, Drupal, Joomla, jQuery, Angular, Bootstrap, D3.js, PHP, Nginx, Apache, ASP.NET, and IIS.

Why should I crawl more than just the homepage? Many technologies only appear on specific pages. Payment widgets load on checkout pages. Blog CMS themes appear on /blog. Chat widgets may initialize on certain routes. Crawling 3-5 pages significantly improves detection coverage.

Can I scan hundreds of websites in one run? Yes. The actor runs up to 10 concurrent requests, rate-limited to 120 requests per minute. For very large lists (1000+), consider splitting across runs.

What does the confidence field mean? "High" means the technology was identified through a definitive signal (HTTP header, meta tag, script source URL). "Medium" means it was detected through an HTML body pattern match or inferred via an implies relationship.

What are implies chains? Some technologies imply the presence of others. Detecting WooCommerce implies WordPress is present. Detecting Next.js implies React. Detecting Gatsby implies React. The actor automatically resolves these chains.

Integrations

  • Apify API -- Trigger runs programmatically and retrieve results via REST API.
  • Google Sheets -- Export results directly to a spreadsheet for team sharing and analysis.
  • Zapier / Make -- Connect to 5,000+ apps through Apify integrations.
  • Webhooks -- Get notified when runs complete and pipe results to your own endpoint.
  • Scheduled Runs -- Track technology changes over time with recurring runs.
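
When scheduled runs are used for monitoring, the interesting output is the difference between two runs. A minimal sketch, assuming you keep the previous dataset item for each domain (the function name is hypothetical):

```python
def diff_technologies(previous_item, current_item):
    """Compare two dataset items for the same domain by technology name."""
    before = {t["name"] for t in previous_item["technologies"]}
    after = {t["name"] for t in current_item["technologies"]}
    return {"added": sorted(after - before), "removed": sorted(before - after)}

last_week = {"technologies": [{"name": "Drift"}, {"name": "React"}]}
this_week = {"technologies": [{"name": "Intercom"}, {"name": "React"}]}
print(diff_technologies(last_week, this_week))
# {'added': ['Intercom'], 'removed': ['Drift']}
```

A webhook handler could run this comparison each time a scheduled run completes and send a notification only when either list is non-empty.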

Related Actors

| Actor | What It Does | How It Complements This Actor |
| --- | --- | --- |
| Website Contact Scraper | Extracts emails, phones, contacts from websites | Combine tech stack data with contact information for sales outreach |
| B2B Lead Qualifier | Scores and grades leads based on website signals | The qualifier detects CMS and tech as part of its scoring -- use this actor for a deeper standalone analysis |
| B2B Lead Gen Suite | Full enrichment pipeline from URLs | Build complete lead profiles including contacts, patterns, scores, and tech stacks |
| Company Deep Research Agent | Comprehensive company intelligence reports | Combine tech stack analysis with broader company research |
| WHOIS Domain Lookup | Domain registration and ownership data | Pair domain age and registrar data with technology analysis |
| SaaS Competitive Intelligence | Competitive analysis for SaaS companies | Supplement competitive intelligence with technology stack comparisons |
