G2 Company Scraper
G2 Company Scraper extracts software company leads directly from G2 category pages — giving sales teams, agencies, and researchers structured lists of companies with ratings, review counts, employee size, headquarters, and pricing tier. Point it at any G2 category slug and get a clean, deduplicated dataset ready for outreach or CRM import.
Built on CheerioCrawler with session pooling and residential proxy support, the actor pages through G2 category listings automatically, applies quality filters on the fly, and stops the moment your spending limit is reached. No code required: configure categories in the UI, hit Start, and download results as JSON, CSV, or Excel.
What data can you extract?
| Data Point | Source | Example |
|---|---|---|
| 🏢 Company Name | G2 product listing card | Salesforce |
| 🌐 Website URL | G2 listing — productListingWebsite | https://salesforce.com |
| 🔗 Domain | Parsed from website URL | salesforce.com |
| 🔗 G2 Profile URL | G2 listing — productListingLink | https://www.g2.com/products/salesforce-crm |
| ⭐ G2 Rating | productListingRating + ratingValue itemprop | 4.4 |
| 💬 Review Count | productListingReviews + reviewCount itemprop | 23,451 |
| 👥 Employee Count | productListingEmployeeCount + company_size key | 1001+ |
| 📍 Headquarters | productListingHeadquarters + hq_location key | San Francisco, CA |
| 💰 Pricing Tier | productListingPricing + starting_price key | Freemium |
| 🏷️ Categories | productListingCategory tags | ["crm", "sales-force-automation"] |
| 📝 Description | productListingDescription text | AI-powered CRM for enterprise sales teams |
| 📂 Source Category | Category slug used to find this record | crm |
Why use G2 Company Scraper?
Building a list of software vendors in a given category by hand means clicking through page after page on G2, copying names into a spreadsheet, looking up websites separately, and manually recording ratings. For 50 companies that takes a skilled researcher 2-3 hours. For 200 companies across 5 categories, you are looking at a full day of tedious work — and the data goes stale within weeks.
This actor automates the entire process. Supply a list of G2 category slugs, set a minimum review threshold to filter out thin listings, and the actor pages through every result automatically, deduplicates companies that appear in multiple categories, and delivers a structured dataset in minutes.
- Scheduling — run weekly or monthly to keep your G2 lead lists fresh as new vendors enter each category
- API access — trigger runs from Python, JavaScript, or any HTTP client and pipe results directly into your CRM or data warehouse
- Proxy rotation — G2 aggressively fingerprints and blocks datacenter IPs; the actor defaults to Apify's residential proxy pool so requests look like real user traffic
- Monitoring — configure Slack or email alerts when runs fail or return zero results
- Integrations — connect to Zapier, Make, Google Sheets, HubSpot, or webhooks to automate downstream workflows
Features
- CheerioCrawler with session pooling — uses `persistCookiesPerSession: true` so each session maintains state across requests, reducing block rates on G2's fingerprinting layer
- Dual-selector extraction strategy — every field tries the primary `data-testid` selector first, then falls back to class-based and `itemprop` selectors, so the actor keeps working through minor G2 markup changes
- Automatic pagination — after each category page is processed, the actor enqueues the next page (`?page=N`) until the per-category limit is reached or no more product cards are found
- Per-category limits with global deduplication — `maxCompaniesPerCategory` stops collection per category independently; cross-category deduplication by parsed domain prevents the same vendor from appearing twice
- Quality filters on ingestion — `minReviews` and `minRating` filters are applied during extraction, not post-processing, so you only pay for companies that pass your criteria
- Employee size filter with OR logic — supply multiple size ranges (e.g. `["1-50", "51-1000"]`) and the actor includes companies matching any of them
- Pricing tier normalization — raw G2 pricing strings are mapped to clean labels: `Freemium`, `Free`, `Contact Vendor`, or the raw text when no mapping applies
- Domain extraction and normalization — website URLs are parsed to extract the registrable domain (e.g. `salesforce.com`), stripping `www.` and subpaths, ready for deduplication or enrichment lookups
- PPE-safe data ordering — data is pushed to the dataset before the PPE charge event fires, so you never pay for a record that was not saved
- Spending limit enforcement — when `Actor.charge()` returns `eventChargeLimitReached`, all category loops stop immediately and the actor exits cleanly
- Low concurrency by design — `maxConcurrency: 2` prevents G2's rate-limiting heuristics from triggering; each request retries up to 3 times with session rotation
- Run summary record — a `type: "summary"` record is appended to the dataset at the end of every run, showing total companies found per category and overall deduplication count
Use cases for G2 company scraping
Sales prospecting and SDR list building
Sales development reps at software companies need targeted lists of vendors in adjacent or competitive categories. Instead of buying a static list from a data broker, an SDR can scrape the G2 "marketing-automation" or "sales-engagement" categories weekly, filter for companies with at least 50 reviews and a rating above 4.0, and feed results directly into their sales engagement tool. The website field links directly to the company's homepage for contact scraping in the next step.
Marketing agency new business development
Digital agencies looking for new software clients can scrape categories relevant to their service offering — for example, a PPC agency scraping "ppc-management" or "advertising-networks" to build a list of software companies actively investing in paid media. The employee count and pricing tier fields help agencies qualify prospects by company size and budget signal before spending time on outreach.
Competitive intelligence and market mapping
Product managers and strategy teams can pull a full category like "crm" or "project-management" to map every active player, track review velocity over time by scheduling repeat runs, and monitor new entrants as they appear on G2. Comparing two successive dataset snapshots reveals which vendors are gaining or losing reviews — a leading indicator of market momentum.
Recruiting and talent sourcing
Recruiters sourcing candidates from the software industry can use G2 category pages as a company discovery tool. Scraping "human-resources" or "applicant-tracking-systems" returns a list of HR tech companies with their HQ location and employee size — useful for identifying companies in a hiring phase or in the right geography for candidate placement.
Investor deal flow and portfolio monitoring
Venture and growth investors track emerging software categories for deal flow. Scraping a category like "ai-writing-assistant" or "generative-ai" with a low minimum review threshold captures early-stage companies before they appear in traditional databases. Scheduling monthly runs creates a longitudinal view of category growth and validates market size assumptions.
Technology partner and integration discovery
Partnerships teams looking for integration partners can scrape categories adjacent to their product (e.g. a CRM company scraping "electronic-signature" or "contract-management") to identify vendors with high review counts and a complementary pricing tier. The G2 profile URL links directly to each vendor's full listing for deeper research.
How to scrape G2 company listings
1. Enter your G2 category slugs — find the slug in any G2 URL: `g2.com/categories/{slug}`. For example, `crm`, `email-marketing`, `project-management`. Add one or more slugs to the Categories field.
2. Set your quality filters — enter a minimum review count (e.g. 25) to skip thin listings, and a minimum rating (e.g. 3.5) to focus on well-regarded products. Leave both at 0 to get everything.
3. Run the actor — click Start. For 50 companies across 2 categories, expect a typical run to complete in 3-8 minutes depending on G2 response times and proxy latency.
4. Download results — go to the Dataset tab and export as JSON, CSV, or Excel. Every record includes the company name, website, domain, rating, review count, employee size, HQ, pricing tier, categories, and a timestamp.
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `categories` | array | Yes | `["crm"]` | G2 category slugs to scrape (e.g. `crm`, `email-marketing`, `project-management`) |
| `maxCompaniesPerCategory` | integer | No | `50` | Max companies collected per category. Set to 0 for no limit (may run for a long time). Max: 1000 |
| `minReviews` | integer | No | `0` | Minimum G2 review count. Companies below this threshold are skipped and not charged |
| `minRating` | number | No | `0` | Minimum G2 average rating (1.0–5.0). Companies below this are skipped and not charged |
| `employeeSizeFilter` | array | No | `[]` | Employee size ranges to include (e.g. `["1-50", "51-1000"]`). OR logic — matches any. Empty = all sizes |
| `deduplicateByDomain` | boolean | No | `true` | Skip companies whose domain was already seen in a previous category in this run |
| `proxyConfiguration` | object | No | Residential | Proxy settings. G2 blocks datacenter IPs — residential proxies required for reliable results |
Input examples
Standard: scrape two categories, filter for established products:
```json
{
  "categories": ["crm", "email-marketing"],
  "maxCompaniesPerCategory": 50,
  "minReviews": 25,
  "minRating": 3.5,
  "deduplicateByDomain": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```
Batch research: five categories, mid-market company size focus:
```json
{
  "categories": [
    "project-management",
    "accounting",
    "marketing-automation",
    "hr-management-suites",
    "helpdesk"
  ],
  "maxCompaniesPerCategory": 100,
  "minReviews": 10,
  "minRating": 0,
  "employeeSizeFilter": ["51-1000"],
  "deduplicateByDomain": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```
Quick test: one category, top 10 only:
```json
{
  "categories": ["crm"],
  "maxCompaniesPerCategory": 10,
  "minReviews": 0,
  "minRating": 0,
  "deduplicateByDomain": false,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```
Input tips
- Find category slugs from the G2 URL — navigate to any G2 category page and copy the slug from the URL path: `g2.com/categories/crm` → slug is `crm`. The slug must be lowercase and hyphenated.
- Set `minReviews` to filter noise — newly listed software often has 0-5 reviews. Setting `minReviews: 10` or higher removes thin listings and focuses your dataset on established products.
- Use residential proxies — G2 blocks datacenter IP ranges. The default proxy config uses Apify Residential proxies. Without this, most requests will return 403 or empty pages.
- Batch categories in one run — running 5 categories in a single run is faster and cheaper than 5 separate single-category runs, because session warm-up overhead is paid once.
- Set a spending limit before running large batches — configure a maximum spend per run in the Apify console to cap costs automatically. The actor stops cleanly when the limit is reached.
Output example
```json
{
  "companyName": "HubSpot",
  "website": "https://www.hubspot.com",
  "domain": "hubspot.com",
  "g2ProfileUrl": "https://www.g2.com/products/hubspot-crm/reviews",
  "rating": 4.4,
  "reviewCount": 12847,
  "employeeCount": "1001+",
  "headquarters": "Cambridge, MA",
  "pricingTier": "Freemium",
  "categories": ["crm", "marketing-automation", "sales-force-automation"],
  "description": "HubSpot CRM platform gives your sales team everything they need to be more productive, maintain pipeline visibility, and grow revenue.",
  "sourceCategory": "crm",
  "scrapedAt": "2026-03-22T09:14:32.418Z"
}
```
A `type: "summary"` record is appended as the final item in every dataset:
```json
{
  "type": "summary",
  "categoriesScraped": ["crm", "email-marketing"],
  "totalCompaniesFound": 87,
  "totalDeduplicated": 6,
  "companiesByCategory": {
    "crm": 50,
    "email-marketing": 43
  },
  "avgRating": null,
  "scrapedAt": "2026-03-22T09:21:04.113Z"
}
```
Output fields
| Field | Type | Description |
|---|---|---|
| `companyName` | string \| null | Full company name as listed on G2 |
| `website` | string \| null | Official website URL from the G2 listing |
| `domain` | string \| null | Registrable domain parsed from the website URL (e.g. `hubspot.com`). Useful for deduplication and enrichment |
| `g2ProfileUrl` | string \| null | Absolute URL to the company's G2 product listing page |
| `rating` | number \| null | Average G2 star rating, rounded to one decimal (1.0–5.0) |
| `reviewCount` | integer \| null | Total number of G2 reviews |
| `employeeCount` | string \| null | Employee size range as normalized by the actor (e.g. `51-1000`, `1001+`) |
| `headquarters` | string \| null | HQ location as listed on G2 (e.g. San Francisco, CA) |
| `pricingTier` | string \| null | Normalized pricing label: `Free`, `Freemium`, `Contact Vendor`, or raw text |
| `categories` | string[] | G2 category tags on the product; always includes `sourceCategory` |
| `description` | string \| null | Short product description from the G2 listing card |
| `sourceCategory` | string | The G2 category slug used to discover this company |
| `scrapedAt` | string | ISO 8601 timestamp of when the record was extracted |
How much does it cost to scrape G2 companies?
G2 Company Scraper uses pay-per-event pricing — you pay $0.05 per company found that passes your quality filters. Platform compute costs are included. Companies that fail your `minReviews`, `minRating`, or `employeeSizeFilter` checks are never charged.
| Scenario | Companies | Cost per company | Total cost |
|---|---|---|---|
| Quick test | 10 | $0.05 | $0.50 |
| One category | 50 | $0.05 | $2.50 |
| Two categories | 100 | $0.05 | $5.00 |
| Five categories | 400 | $0.05 | $20.00 |
| Full market map | 1,000 | $0.05 | $50.00 |
You can set a maximum spending limit per run in the Apify console. The actor stops cleanly the moment your budget is reached, so you never exceed your target spend.
Compare this to purchasing a G2 Buyer Intent data export or a ZoomInfo list at $300-1,000+ per month. Most users of this actor spend $5-25 per research project with no subscription commitment and no minimum seat requirement.
Scraping G2 companies using the API
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/g2-company-scraper").call(run_input={
    "categories": ["crm", "email-marketing"],
    "maxCompaniesPerCategory": 50,
    "minReviews": 10,
    "minRating": 3.5,
    "deduplicateByDomain": True,
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"]
    }
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("type") == "summary":
        print(f"Summary: {item['totalCompaniesFound']} companies found")
    else:
        print(f"{item['companyName']} — {item['domain']} — {item['rating']} stars ({item['reviewCount']} reviews)")
```
JavaScript
```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/g2-company-scraper").call({
  categories: ["crm", "email-marketing"],
  maxCompaniesPerCategory: 50,
  minReviews: 10,
  minRating: 3.5,
  deduplicateByDomain: true,
  proxyConfiguration: {
    useApifyProxy: true,
    apifyProxyGroups: ["RESIDENTIAL"]
  }
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
  if (item.type === "summary") continue;
  console.log(`${item.companyName} | ${item.domain} | ${item.rating} stars | ${item.employeeCount} employees | ${item.headquarters}`);
}
```
cURL
```bash
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~g2-company-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "categories": ["crm", "email-marketing"],
    "maxCompaniesPerCategory": 50,
    "minReviews": 10,
    "minRating": 3.5,
    "deduplicateByDomain": true,
    "proxyConfiguration": {
      "useApifyProxy": true,
      "apifyProxyGroups": ["RESIDENTIAL"]
    }
  }'

# Fetch results (replace DATASET_ID with the defaultDatasetId value from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
```
How G2 Company Scraper works
Phase 1: URL construction and request queuing
For each category slug in the input (e.g. `crm`), the actor constructs the G2 category listing URL using `https://www.g2.com/categories/{slug}` for page 1 and `https://www.g2.com/categories/{slug}?page=N` for subsequent pages. All category page 1 requests are enqueued simultaneously at startup, so multiple categories are scraped in parallel up to the `maxConcurrency: 2` limit.
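The URL construction above can be sketched as follows. `buildCategoryUrl` and the `userData` shape are illustrative assumptions, not the actor's actual internals:

```javascript
// Hypothetical sketch of Phase 1: build listing URLs and seed the queue.
function buildCategoryUrl(slug, page = 1) {
  const base = `https://www.g2.com/categories/${slug}`;
  // Page 1 uses the bare category URL; later pages append ?page=N
  return page <= 1 ? base : `${base}?page=${page}`;
}

// Enqueue page 1 for every category up front so categories run in parallel
const startUrls = ["crm", "email-marketing"].map((slug) => ({
  url: buildCategoryUrl(slug),
  userData: { slug, page: 1 },
}));

console.log(startUrls[0].url); // https://www.g2.com/categories/crm
```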
Phase 2: HTML parsing with dual-selector fallback
Each category page is fetched by CheerioCrawler using a residential proxy session. The handler targets `[data-testid="productListing"]` containers — one per software product card. Within each card, every field attempts two or three selector patterns: the primary `data-testid` attribute selector, then a class-based fallback, then an `itemprop` attribute where available. This layered approach means the actor continues extracting data correctly through minor G2 layout changes. For example, the rating is extracted from `[data-testid="productListingRating"]` first, then `[itemprop="ratingValue"]`, then `.stars-container[data-rating]`.
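As a rough sketch of that fallback chain — the `firstText` helper and the stub card are hypothetical; the real handler operates on Cheerio elements:

```javascript
// Try each selector in priority order; return the first non-empty text match.
function firstText(card, selectors) {
  for (const sel of selectors) {
    const value = card.find(sel);
    if (value && value.trim().length > 0) return value.trim();
  }
  return null; // every selector missed — the field stays null in the output record
}

// Stub card mimicking a G2 listing where the primary data-testid selector
// is gone after a markup change but the itemprop fallback still matches.
const card = {
  find: (sel) => (sel === '[itemprop="ratingValue"]' ? " 4.4 " : ""),
};

const ratingText = firstText(card, [
  '[data-testid="productListingRating"]', // primary selector
  '[itemprop="ratingValue"]',             // schema.org fallback
  ".stars-container[data-rating]",        // class-based last resort
]);
// ratingText === "4.4"
```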
Phase 3: Transformation and filtering
Raw strings from the HTML are passed through dedicated parsing functions: `parseRating()` extracts the first numeric match from strings like "4.5 out of 5", `parseReviewCount()` strips all non-digit characters from strings like "1,234 reviews", and `cleanEmployeeCount()` normalizes G2 size labels like "Mid-Market (51-1000 emp.)" into the plain range `51-1000`. Pricing tiers are normalized by keyword matching: strings containing both "free" and "paid" become `Freemium`, strings containing "contact" become `Contact Vendor`. Domain extraction uses the WHATWG URL constructor to parse the website field, then strips the `www.` prefix for a clean registrable domain.
After transformation, each record is checked against `minReviews`, `minRating`, and `employeeSizeFilter` before being pushed to the dataset. Records that fail any filter are logged at debug level and skipped without a PPE charge.
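Hedged sketches of what those helpers might look like — the actor's real implementations may differ in edge-case handling:

```javascript
// Illustrative versions of the parsing helpers described above — sketches only.
function parseRating(text) {
  const match = String(text).match(/\d+(?:\.\d+)?/); // first numeric token
  return match ? parseFloat(match[0]) : null;
}

function parseReviewCount(text) {
  const digits = String(text).replace(/\D/g, ""); // drop commas and words
  return digits ? parseInt(digits, 10) : null;
}

function cleanEmployeeCount(text) {
  // "Mid-Market (51-1000 emp.)" -> "51-1000"; "Enterprise (1001+ emp.)" -> "1001+"
  const match = String(text).match(/\d[\d,]*(?:\s*-\s*[\d,]+|\+)/);
  return match ? match[0].replace(/[\s,]/g, "") : null;
}

function extractDomain(website) {
  try {
    // Naive registrable-domain extraction: hostname minus a leading "www."
    // (a real implementation may need a public-suffix list for multi-part TLDs)
    const host = new URL(website).hostname.toLowerCase();
    return host.startsWith("www.") ? host.slice(4) : host;
  } catch {
    return null; // not a parseable URL
  }
}
```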
Phase 4: Pagination and PPE charge management
If the per-category limit has not been reached and at least one product card was found on the current page, the next page URL is enqueued. This continues until either the limit is reached, no cards are found (end of category), or the spending limit fires. The PPE company-found charge event is fired after each successful `Actor.pushData()` call. If `chargeResult.eventChargeLimitReached` is returned, all category loops are flagged and the crawler exits cleanly after processing the current batch.
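The push-then-charge ordering can be sketched like this; `saveAndCharge` and both stubs are hypothetical stand-ins for the real `Actor.pushData()` / `Actor.charge()` calls:

```javascript
// Sketch of Phase 4's PPE-safe ordering: save first, then charge, then
// check whether the spending limit was hit. All names here are illustrative.
async function saveAndCharge(record, dataset, charger) {
  await dataset.push(record); // the record is saved before any charge fires
  const result = await charger.charge({ eventName: "company-found" });
  // When true, the caller flags all category loops to stop enqueuing pages
  return Boolean(result.eventChargeLimitReached);
}

// Stub dataset/charger: the second charge call reports the limit was reached
const saved = [];
const dataset = { push: async (r) => { saved.push(r); } };
let chargeCalls = 0;
const charger = {
  charge: async () => ({ eventChargeLimitReached: ++chargeCalls >= 2 }),
};
```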
Tips for best results
- Start with a test run of 10 companies. Run a single category with `maxCompaniesPerCategory: 10` to verify the output format and proxy performance before committing to a large batch.
- Always use residential proxies. G2 detects and blocks datacenter IP ranges at the CDN layer. Apify Residential proxies are the default and the only reliably effective option for production runs.
- Use `minReviews` to control data quality. Setting `minReviews: 25` focuses your dataset on established, actively used products. Setting it to 0 includes everything — useful for tracking new entrants, but expect more incomplete records.
- Combine categories in one run, not multiple runs. Cross-category deduplication only works within a single run. Running 5 categories as one job also reuses session warm-up, reducing proxy cost and total runtime.
- Pair with Website Contact Scraper for full lead records. G2 Company Scraper gives you the website domain; Website Contact Scraper turns each domain into email addresses and phone numbers. The `domain` field is ready to use as direct input.
- Schedule weekly runs for fast-moving categories. Categories like `generative-ai` or `ai-writing-assistant` gain new listings frequently. A weekly schedule with deduplication ensures your list stays current without re-processing known companies.
- Filter by employee size for ICP precision. If your ideal customer profile is mid-market (51-1000 employees), set `employeeSizeFilter: ["51-1000"]` to exclude SMB tools and enterprise-only platforms from the outset.
- Export to CSV for direct CRM import. The Apify dataset export creates a flat CSV with all fields as columns, ready for import into HubSpot, Salesforce, or any spreadsheet-based workflow.
Combine with other Apify actors
| Actor | How to combine |
|---|---|
| Website Contact Scraper | Feed the domain field from each G2 record into Website Contact Scraper to extract emails, phone numbers, and LinkedIn profiles for every company found |
| Website Contact Scraper Pro | Use the JS-rendering version for SaaS company websites that load contact details dynamically via React or Vue |
| Email Pattern Finder | Submit each domain to detect the company's email naming convention (e.g. [email protected]) before building outreach sequences |
| B2B Lead Qualifier | Score each G2 company 0-100 against 30+ ICP signals to prioritize the highest-value accounts before outreach |
| Waterfall Contact Enrichment | Run each domain through a 10-step enrichment cascade to find verified contact details across multiple data sources |
| Bulk Email Verifier | Verify emails found from downstream enrichment via MX and SMTP checks before importing into your sending tool |
| Website Tech Stack Detector | Detect what technologies each G2 company uses — useful for qualifying based on existing tech stack (e.g. "uses Salesforce + Marketo") |
| HubSpot Lead Pusher | Push the output dataset directly into HubSpot as contacts or companies without any intermediate steps |
Limitations
- Static HTML only. This actor uses CheerioCrawler, which parses server-rendered HTML. G2 category pages deliver initial content server-side, but dynamic features like filtered search, sort-by-score, or personalized rankings may not be reflected accurately.
- G2 markup changes break selectors. G2 periodically redesigns its category listing pages. The actor uses dual-selector fallback patterns but cannot guarantee coverage through major redesigns. Check the Issues tab if you start seeing empty results.
- Residential proxy required. G2 blocks datacenter IPs at the CDN level. Without Apify Residential proxies or equivalent, the majority of requests will return 403 errors or empty product card lists.
- No review content extraction. The actor extracts review counts and ratings but does not extract individual review text. For full review data, visit each company's G2 profile URL from the output.
- No advanced G2 filters. The actor scrapes category pages in default sort order. G2's UI-based filters (by industry, company size, deployment type) are not replicated — use the `employeeSizeFilter` and `minRating` inputs to approximate them.
- Pagination depth limits. G2 category pages typically show 25-30 products per page. Very large categories (e.g. `crm`) may have dozens of pages. Setting `maxCompaniesPerCategory` prevents unbounded runs.
- Employee count and HQ not always present. G2 only displays employee size and headquarters when the vendor has completed their profile. Expect 20-40% null rates on these fields for smaller or newer vendors.
- Rate limited at low concurrency. The crawler is intentionally limited to `maxConcurrency: 2` to avoid triggering G2's rate-limiting heuristics. This means large multi-category runs take longer than maximum-concurrency crawlers. Do not increase concurrency without testing against block rates first.
Integrations
- Zapier — trigger a G2 scrape on a schedule and push new companies automatically to a HubSpot contact list or Google Sheet
- Make — build a multi-step scenario that scrapes G2, enriches contacts via an API, and adds qualified leads to a CRM sequence
- Google Sheets — stream G2 company results into a shared spreadsheet for team review and manual qualification
- Apify API — trigger runs programmatically from your data pipeline or internal tooling and retrieve results in JSON
- Webhooks — receive a POST notification when a run completes, then pull the dataset into your own application
- LangChain / LlamaIndex — use G2 company datasets as structured context for AI agents building market research summaries or competitive analysis reports
Troubleshooting
Empty results despite entering a valid category slug. The most common cause is missing or misconfigured proxies. G2 blocks datacenter IPs at the network edge, returning empty HTML bodies or 403 responses. Confirm your proxyConfiguration includes "apifyProxyGroups": ["RESIDENTIAL"]. If results are still empty, the G2 category slug may be incorrect — verify by visiting g2.com/categories/{slug} directly in a browser.
Partial results: fewer companies than expected. If you see fewer companies than the maxCompaniesPerCategory limit, either the category has fewer products than expected, your minReviews or minRating filters are excluding many records, or your spending limit was reached mid-run. Check the run log for "Spending limit reached" messages and the summary record in the dataset for per-category counts.
employeeCount and headquarters are null for many records. These fields are only populated when vendors have completed their G2 profile. Null rates of 20-40% are normal. For enriched company firmographics, pipe the domain field into Waterfall Contact Enrichment which pulls from multiple data sources.
Run is slower than expected. The actor runs at maxConcurrency: 2 deliberately to avoid block rates on G2. Multi-category runs with 100+ companies per category may take 15-30 minutes. If speed is critical, split categories across separate runs triggered in parallel via the Apify API.
G2 profile URLs appear as relative paths. The normalizeG2Url() function converts relative hrefs (e.g. /products/hubspot-crm) to absolute URLs (https://www.g2.com/products/hubspot-crm). If you see relative paths in output, it means the fallback selector returned a non-standard href — report the category in the Issues tab.
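A plausible sketch of that normalization — the actor's actual normalizeG2Url() may differ in detail:

```javascript
// Resolve relative G2 hrefs against the site origin using the WHATWG URL API.
function normalizeG2Url(href) {
  if (!href) return null;
  // "/products/hubspot-crm" -> "https://www.g2.com/products/hubspot-crm";
  // already-absolute URLs pass through unchanged.
  return new URL(href, "https://www.g2.com").toString();
}
```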
Responsible use
- This actor accesses publicly available software listing data on G2's category pages.
- Respect G2's terms of service. Do not use scraped data to reproduce G2's product database commercially or to compete with G2's own data products.
- Review count and rating data should be attributed to G2 when published in research or reports.
- Comply with GDPR, CAN-SPAM, and applicable data protection regulations when using company data for outreach campaigns.
- Do not use this actor to scrape personal data from G2 reviewer profiles.
- For guidance on the legality of web scraping, see Apify's guide to web scraping law.
FAQ
How do I find a G2 category slug to use as input?
Navigate to any G2 category page in your browser. The slug is the path segment after /categories/ in the URL. For example, g2.com/categories/email-marketing has the slug email-marketing. Slugs are always lowercase and hyphenated. Common examples: crm, project-management, marketing-automation, accounting, helpdesk, video-conferencing, e-commerce-platforms.
How many G2 companies can I scrape per category in one run?
Up to 1,000 per category, controlled by the maxCompaniesPerCategory input. G2 category pages typically show 25-30 products per page, so scraping 1,000 companies requires approximately 33-40 page requests per category. For most categories, the practical limit is 200-500 unique products before pagination returns empty pages.
Does G2 Company Scraper extract individual review text?
No. The actor extracts the aggregate review count and average star rating from each product's listing card. It does not visit individual product profile pages or extract the text of individual reviews. For full review content, use the g2ProfileUrl field in the output to visit each product's G2 page directly.
Is it legal to scrape G2 company listings?
Scraping publicly accessible web pages is generally permitted under US law (see the Ninth Circuit's decision in hiQ Labs v. LinkedIn). G2 category listings are publicly viewable without authentication. However, you must respect G2's terms of service, avoid copying their database for commercial redistribution, and comply with applicable data protection laws. See Apify's web scraping legality guide for detailed guidance.
Why does the actor require residential proxies?
G2 uses CDN-level IP reputation filtering that blocks known datacenter IP ranges including AWS, GCP, and Azure cloud egress IPs. Residential proxies route requests through real consumer IP addresses, which pass G2's block lists. Without residential proxies, most requests return 403 errors or empty HTML with no product cards. The default proxy configuration (useApifyProxy: true, apifyProxyGroups: ["RESIDENTIAL"]) handles this automatically.
How accurate is the rating and review count data?
The actor extracts ratings and review counts directly from the HTML served by G2 at the time of the run. The values match what you see when you visit the same category page in a browser. G2 updates ratings and review counts in near-real-time as new reviews are submitted, so data is accurate to within the G2 cache refresh interval (typically a few minutes to a few hours).
How is G2 Company Scraper different from G2's own Buyer Intent data?
G2 Buyer Intent is a paid product that identifies which companies are actively researching your category, using behavioral signals from G2 users. G2 Company Scraper extracts the public vendor listings in a category — a completely different dataset. This actor tells you which software companies exist in a category; G2 Buyer Intent tells you which buyers are looking at those companies. The two datasets are complementary.
Can I scrape multiple G2 categories at the same time?
Yes. Add multiple slugs to the categories array and the actor scrapes them concurrently (up to the maxConcurrency: 2 limit). Deduplication is applied across all categories in a single run, so if a product like HubSpot appears in both crm and marketing-automation, it is only included once in the output.
Can I schedule G2 Company Scraper to run automatically?
Yes. Use the Apify platform's built-in scheduler to run this actor daily, weekly, or on a custom cron schedule. Each scheduled run produces a fresh dataset. Pair scheduling with the Google Sheets or HubSpot integrations to keep your prospect lists automatically updated.
What happens when G2 changes its page markup?
The actor uses dual-selector fallback patterns: primary data-testid selectors plus class-based and itemprop fallbacks for every field. Minor markup changes are typically absorbed by the fallback layer. Major redesigns that remove data-testid attributes entirely will cause empty results — open an issue in the Issues tab and include the category slug and a link to the affected page so the selectors can be updated.
How does deduplication work when I scrape multiple categories?
When deduplicateByDomain is enabled, the actor maintains an in-memory set of all domains seen during the run. The first time a domain is encountered (in any category), the company is saved and charged. Any subsequent card with the same domain — whether in the same category on a later page or in a different category — is skipped without pushing data or charging a PPE event.
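In code, the check described above amounts to a per-run Set keyed by domain — a sketch under assumed names, not the actor's exact implementation:

```javascript
// One Set per run: the first sighting of a domain wins; later cards are skipped.
const seenDomains = new Set();

function shouldSaveCompany(record) {
  if (!record.domain) return true; // no parseable domain — cannot dedupe, keep it
  if (seenDomains.has(record.domain)) return false; // duplicate: no push, no charge
  seenDomains.add(record.domain);
  return true;
}
```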
Can I use the output with other Apify actors for contact enrichment?
Yes, and this is the recommended workflow. The domain field in every output record is clean (e.g. hubspot.com) and ready to use directly as input to Website Contact Scraper, Email Pattern Finder, or Waterfall Contact Enrichment. Export the dataset as JSON, extract the domain array, and pass it to the next actor in your pipeline.
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom G2 data extractions, category monitoring pipelines, or enterprise integrations, reach out through the Apify platform.