SourceForge & TrustRadius — Software Vendor Leads is an Apify actor on ApifyForge that scrapes SourceForge and TrustRadius for software company leads by category. It returns vendor name, website, rating, review count, pricing tier, and category tags, and can filter results by rating or review count. It costs $0.05 per company-found event. Best for sales teams and marketers who need verified contact data, lead lists, or prospect enrichment at scale. Not ideal for real-time monitoring or historical data analysis. Maintenance pulse: 90/100. Last verified March 27, 2026. Built by Ryan Clinton (ryanclinton on Apify).

LEAD GENERATION · AUTOMATION

SourceForge & TrustRadius — Software Vendor Leads

What to know

  • Results depend on publicly available data; private or gated contacts may not be found.
  • Email verification accuracy varies by domain and provider policies.
  • Requires an Apify account — free tier available with limited monthly usage.

Maintenance Pulse

Score: 90/100
Last Build: Today
Last Version: 1d ago
Builds (30d): 8
Issue Response: N/A

Pricing

Pay Per Event model. You only pay for what you use.

Event | Description | Price
company-found | Charged for each unique software company extracted from SourceForge or TrustRadius that passes all quality filters. | $0.05

Example: 100 events = $5.00 · 1,000 events = $50.00

Documentation

Software directory scraper that extracts software company leads from SourceForge and TrustRadius by category. Point it at any category slug — crm, project-management, email-marketing — and it returns structured records with vendor name, website, star rating, review count, pricing tier, and category tags. Built for sales teams, marketing agencies, and SaaS founders who need targeted lists of software companies without manual browsing.

The actor runs on CheerioCrawler, an HTTP-based crawler, so it is fast and lightweight. It needs no proxies because SourceForge and TrustRadius serve their listing pages to datacenter IPs without blocking. Both sources are scraped in the same run, results are deduplicated by domain, and quality filters let you exclude low-review or low-rated products before they reach your dataset.

What data can you extract?

Data Point | Source | Example
🏢 Company Name | SourceForge / TrustRadius | Pinnacle CRM Technologies
📦 Product Name | SourceForge / TrustRadius | PinnacleCRM Pro
🌐 Website URL | SourceForge / TrustRadius | https://pinnaclecrm.io
🔗 Domain | Extracted from website | pinnaclecrm.io
📋 Directory Profile URL | SourceForge / TrustRadius | https://sourceforge.net/software/product/pinnaclecrm/
Rating | SourceForge / TrustRadius | 4.3 (unified 1–5 scale)
💬 Review Count | SourceForge / TrustRadius | 1,842
💰 Pricing Tier | SourceForge / TrustRadius | $29/month
🏷️ Categories | SourceForge / TrustRadius | ["crm", "Sales Force Automation"]
🏆 Badges | SourceForge only | ["Leader", "Top Performer"]
📝 Description | SourceForge / TrustRadius | Cloud-based CRM for SMB sales teams...
🗂️ Source | Actor metadata | sourceforge

Why use Software Directory Scraper?

Building a list of software companies in a niche by hand means clicking through dozens of directory pages, copying names into a spreadsheet, and Googling websites one by one. For a single category on SourceForge you might spend two hours to collect 50 companies — with no rating data, no pricing context, and no structured output.

This actor automates the entire process. Provide a list of category slugs and it crawls SourceForge pagination (page by page using ?page=N) and TrustRadius product sitemaps (product-reviews-sitemap-1.xml through product-reviews-sitemap-5.xml, giving 2,500+ product URLs per crawl). Every record is cleaned, normalized, and written to the Apify dataset in minutes.

  • Scheduling — run weekly to refresh your list of active software vendors as new products are added to directories
  • API access — trigger runs from Python, JavaScript, or any HTTP client and pipe results directly into your CRM or enrichment pipeline
  • Proxy rotation — proxies are not required for these sources, but the actor accepts an optional proxy configuration for custom setups
  • Monitoring — get Slack or email alerts when runs fail or produce fewer results than expected via Apify's built-in monitoring
  • Integrations — connect to Zapier, Make, Google Sheets, HubSpot, or webhooks without writing a line of code

Features

  • Dual-source scraping — crawls both SourceForge (100k+ products, paginated category listings) and TrustRadius (B2B-focused, Next.js SSR pages) in a single run, configurable per source
  • Automatic pagination on SourceForge — follows ?page=N links until the per-category limit is reached or no more listing cards are found
  • TrustRadius sitemap discovery — parses product-reviews-sitemap-{1..5}.xml files to collect product URLs, then fetches each product page individually for structured data
  • Dual-extraction strategy for TrustRadius — first attempts to parse the __NEXT_DATA__ JSON embedded in the server-rendered page across three property paths (pageProps.product, pageProps.data.product, pageProps.productReviews.product), then falls back to eight named data-testid HTML selectors
  • Unified rating scale — TrustRadius uses a 10-point scoring system; the actor converts all scores to a 1–5 scale using Math.round((val / 2) * 10) / 10 so ratings from both sources are directly comparable
  • Domain deduplication — strips www. prefixes, normalizes to registrable domain, and tracks seen domains in a shared Set<string> across all categories and sources to prevent duplicate vendor rows
  • Per-category per-source limits — the maxCompaniesPerCategory limit applies independently to each {source}:{category} pair, so a limit of 50 means up to 50 from SourceForge CRM and 50 from TrustRadius CRM
  • Quality filters: minReviews and minRating filters are applied after extraction and before charging; products that fail are logged and skipped at no cost
  • Pricing normalization — raw pricing strings are mapped to standard tiers: Free, Freemium, Open Source, Contact Vendor, or a cleaned price string like $29/month
  • Badge extraction from SourceForge — scrapes award badges from .badge-container .badge and [class*="award"] elements, useful for identifying "Leader" and "Top Performer" products
  • Resilient SourceForge selectors — uses four CSS selector strategies ([class*="project-cell"], .sf-project-listing-item, ul.projects-listing > li, .inner-cell) with a filter for elements containing an h3 a title link, ensuring coverage across markup changes
  • Pay-per-event billing — charged $0.05 per company that passes quality filters; the actor stops automatically when your spending limit is reached and data is always pushed before the charge fires
  • Run summary record — every run ends with a type: "summary" record showing totals by category and source, useful for monitoring and pipeline auditing
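
The unified rating scale from the list above can be sketched in Python. This is an illustration rather than the actor's source: the function name to_five_point is hypothetical, and math.floor(x + 0.5) stands in for JavaScript's Math.round, because Python's built-in round() rounds halves to the nearest even digit instead of up.

```python
import math


def to_five_point(score_10: float) -> float:
    """Convert a 10-point score to the 1-5 scale, rounded to one decimal.

    Mirrors Math.round((val / 2) * 10) / 10 from the feature list.
    math.floor(x + 0.5) reproduces Math.round's round-half-up behaviour
    for the positive scores used here.
    """
    return math.floor((score_10 / 2) * 10 + 0.5) / 10


if __name__ == "__main__":
    # Print a few conversions to compare with ratings in your dataset.
    for score in (8.6, 7.5, 10):
        print(score, "->", to_five_point(score))
```

A half-point case such as 7.5 is where the two rounding modes would diverge, so it is worth keeping in any check you run against real output.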

Use cases for software directory scraping

Sales prospecting for SaaS tools

Sales development reps building outbound lists can use this actor to find every CRM, help-desk, or marketing-automation vendor in a category. With website and domain data in the output, results feed directly into Website Contact Scraper to find decision-maker emails, or into Waterfall Contact Enrichment for a full contact cascade. A list of 200 CRM vendors takes under 10 minutes and costs $10.

Marketing agency lead generation

Agencies that serve software companies — design studios, content agencies, SEO firms — can scrape target categories to find prospect companies with their websites pre-extracted. Filter by minReviews: 10 to exclude unestablished products and focus on vendors that are already investing in their market presence. Rating data helps prioritize outreach toward well-reviewed products that likely have marketing budgets.

Competitive intelligence and market mapping

Founders and product managers can scrape their own category to map the competitive landscape. The output includes category tags, pricing tiers, and badge data, giving a structured view of which products lead the category. Combine with Website Tech Stack Detector to identify which technology platforms your competitors are built on.

Data enrichment for existing company lists

If you already have a list of software company domains, run this actor against the categories those companies belong to, then join the output to your list on the domain field to add rating, review count, pricing tier, and category context from SourceForge and TrustRadius. Set deduplicateByDomain: false when you want complete coverage across multiple categories for the same company.

Recruiting and talent sourcing

Recruiters targeting software companies in specific verticals can use category data to find employers. A search for project-management or hr-software returns companies with their websites, which feed into contact extraction to find hiring manager contacts. The badge data (Leader, Top Performer) helps identify fast-growing companies likely to be actively hiring.

B2B lead qualification and scoring

The combination of rating, review count, and pricing tier gives enough signal to score leads before enrichment. A high rating, a high review count, and a paid pricing tier together indicate an established, revenue-generating business. Pipe the output into B2B Lead Qualifier to apply a formal 0–100 score before committing to enrichment cost.

How to scrape software company leads from SourceForge and TrustRadius

  1. Enter your target categories — Type the category slugs you want to scrape. Use lowercase, hyphenated slugs that match the directory URL: crm, project-management, email-marketing, accounting, help-desk. You can enter multiple categories in one run.
  2. Configure quality filters — Set minReviews to 5 or 10 to exclude newly listed products with no track record. Set minRating to 3.5 to focus on well-reviewed vendors. Leave both at 0 to collect everything.
  3. Run the actor — Click "Start" and wait. A single category with the default limit of 50 companies per source typically completes in 3–5 minutes.
  4. Download results — Open the Dataset tab, then export to JSON, CSV, or Excel. The dataset includes one row per company plus a summary record at the end showing totals by category and source.

Input parameters

Parameter | Type | Required | Default | Description
categories | array | Yes | ["crm"] | Category slugs to scrape. Use lowercase hyphenated slugs matching directory URLs (e.g. crm, project-management, email-marketing).
sources | array | No | ["sourceforge", "trustradius"] | Which directories to scrape. Options: sourceforge, trustradius. Omit to scrape both.
maxCompaniesPerCategory | integer | No | 50 | Max companies per category per source. 0 = no limit. Range: 0–1000.
minReviews | integer | No | 0 | Minimum number of reviews a product must have to be included.
minRating | number | No | 0 | Minimum average rating (1.0–5.0 scale) a product must have.
deduplicateByDomain | boolean | No | true | Remove duplicate companies when the same domain appears across categories or sources.
proxyConfiguration | object | No | none | Optional Apify proxy configuration. SourceForge and TrustRadius work without proxies.
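
If you build the input programmatically, a small client-side check can catch mistakes before a run starts. The sketch below encodes the defaults and ranges from the table above; build_run_input and ALLOWED_SOURCES are hypothetical helper names, not part of the actor or the Apify client, and the actor may validate input differently on its side.

```python
from __future__ import annotations

from typing import Any

# Source names taken from the parameter table above.
ALLOWED_SOURCES = {"sourceforge", "trustradius"}


def build_run_input(
    categories: list[str],
    sources: list[str] | None = None,
    max_companies_per_category: int = 50,
    min_reviews: int = 0,
    min_rating: float = 0,
    deduplicate_by_domain: bool = True,
) -> dict[str, Any]:
    """Return an input dict for the actor, raising ValueError on bad values."""
    if not categories:
        raise ValueError("categories is required and must not be empty")
    sources = sources or ["sourceforge", "trustradius"]
    unknown = set(sources) - ALLOWED_SOURCES
    if unknown:
        raise ValueError(f"unknown sources: {sorted(unknown)}")
    if not 0 <= max_companies_per_category <= 1000:
        raise ValueError("maxCompaniesPerCategory must be between 0 and 1000")
    if min_reviews < 0:
        raise ValueError("minReviews must not be negative")
    if not 0 <= min_rating <= 5:
        raise ValueError("minRating must be between 0 and 5")
    return {
        "categories": categories,
        "sources": sources,
        "maxCompaniesPerCategory": max_companies_per_category,
        "minReviews": min_reviews,
        "minRating": min_rating,
        "deduplicateByDomain": deduplicate_by_domain,
    }
```

The returned dict can be passed as run_input to the Apify client call shown in the API section.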

Input examples

Standard scrape — two categories, both sources:

{
  "categories": ["crm", "project-management"],
  "sources": ["sourceforge", "trustradius"],
  "maxCompaniesPerCategory": 50,
  "minReviews": 5,
  "minRating": 3.5,
  "deduplicateByDomain": true
}

Large batch — five categories, SourceForge only, higher limit:

{
  "categories": ["crm", "project-management", "email-marketing", "accounting", "help-desk"],
  "sources": ["sourceforge"],
  "maxCompaniesPerCategory": 200,
  "minReviews": 0,
  "minRating": 0,
  "deduplicateByDomain": true
}

Quick test — one category, minimal filters:

{
  "categories": ["crm"],
  "sources": ["sourceforge"],
  "maxCompaniesPerCategory": 10,
  "deduplicateByDomain": false
}

Input tips

  • Start with a small limit — set maxCompaniesPerCategory: 10 for a first run to verify results match your expectations before scaling up.
  • Use both sources together — SourceForge skews toward SMB and open-source tools; TrustRadius skews toward enterprise B2B. Combined, you get broader coverage of a category.
  • Category slugs must match the SourceForge URL — verify by visiting https://sourceforge.net/software/{your-slug}/ in a browser before running.
  • Batch multiple categories in one run — processing 5 categories in a single run is more efficient than 5 separate runs, because the deduplication set is shared across the entire run.
  • Set a spending limit — use Apify's per-run budget control to cap costs before running against a large category list.

Output example

{
  "companyName": "Pinnacle CRM Technologies",
  "productName": "PinnacleCRM Pro",
  "website": "https://pinnaclecrm.io",
  "domain": "pinnaclecrm.io",
  "profileUrl": "https://sourceforge.net/software/product/pinnaclecrm/",
  "rating": 4.3,
  "reviewCount": 1842,
  "pricingTier": "$29/month",
  "categories": ["crm", "Sales Force Automation", "Contact Management"],
  "badges": ["Leader", "Top Performer Q1 2025"],
  "description": "Cloud-based CRM for SMB sales teams. Includes pipeline management, email sequences, and native Slack integration. Free 14-day trial.",
  "source": "sourceforge",
  "sourceCategory": "crm",
  "scrapedAt": "2026-03-22T09:14:32.451Z"
}

The final record in every dataset is a summary record:

{
  "type": "summary",
  "categoriesScraped": ["crm", "project-management"],
  "sourcesUsed": ["sourceforge", "trustradius"],
  "totalCompaniesFound": 187,
  "totalDeduplicated": 14,
  "companiesByCategory": {
    "crm": 98,
    "project-management": 89
  },
  "companiesBySource": {
    "sourceforge": 94,
    "trustradius": 93
  },
  "scrapedAt": "2026-03-22T09:21:08.772Z"
}

Output fields

Each field is listed as name (type): description.

  • companyName (string | null): Vendor or company name. Falls back to product name when the directory does not list the vendor separately.
  • productName (string | null): Software product name as listed in the directory.
  • website (string | null): Vendor website URL as listed in the directory profile.
  • domain (string | null): Registrable domain extracted from the website URL (e.g. pinnaclecrm.io). Used for deduplication and CRM join keys.
  • profileUrl (string | null): Direct link to the product's SourceForge or TrustRadius profile page.
  • rating (number | null): Average rating on a unified 1.0–5.0 scale. TrustRadius 10-point scores are divided by 2 and rounded to one decimal.
  • reviewCount (number | null): Total number of user reviews or ratings in the directory.
  • pricingTier (string | null): Normalized pricing: Free, Freemium, Open Source, Contact Vendor, or a price string like $29/month.
  • categories (string[]): Category tags from the listing. Always includes the source category slug used to discover the product.
  • badges (string[]): SourceForge award badges (e.g. Leader, Top Performer). Empty array for TrustRadius results.
  • description (string | null): Short product description from the listing page.
  • source (string): Which directory this record came from: sourceforge or trustradius.
  • sourceCategory (string): The category slug used to discover this company (e.g. crm).
  • scrapedAt (string): ISO 8601 timestamp when this record was extracted.

How much does it cost to scrape software company leads?

Software Directory Scraper uses pay-per-event pricing — you pay $0.05 per company extracted. Platform compute costs are included. Companies filtered out by minReviews or minRating are not charged.

Scenario | Companies | Cost per company | Total cost
Quick test | 10 | $0.05 | $0.50
Single category | 50 | $0.05 | $2.50
Two categories, both sources | 200 | $0.05 | $10.00
Five categories | 500 | $0.05 | $25.00
Full market map | 1,000 | $0.05 | $50.00

You can set a maximum spending limit per run to control costs. The actor stops when your budget is reached, so a $5 limit will collect up to 100 companies.
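
The arithmetic behind these figures can be sketched as follows. The function names are hypothetical, and the result of max_run_cost is an upper bound: filtered or deduplicated companies are not charged, so a real run can cost less.

```python
from decimal import Decimal

# Price per company-found event, from the pricing table.
PRICE_PER_COMPANY = Decimal("0.05")


def max_run_cost(num_categories: int, num_sources: int, max_per_category: int) -> Decimal:
    """Upper bound on run cost in USD.

    The per-category limit applies to each source independently, so the
    ceiling is categories x sources x limit x price.
    """
    if max_per_category == 0:
        raise ValueError("0 means no limit, so there is no upper bound to compute")
    return PRICE_PER_COMPANY * num_categories * num_sources * max_per_category


def companies_within_budget(budget_usd: str) -> int:
    """Number of companies a spending limit can pay for."""
    return int(Decimal(budget_usd) // PRICE_PER_COMPANY)


if __name__ == "__main__":
    print(max_run_cost(2, 2, 50))        # two categories, both sources
    print(companies_within_budget("5"))  # companies covered by a $5 limit
```

Decimal is used instead of float so that multiples of $0.05 stay exact.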

Compare this to manually browsing directories at roughly 30 seconds per company — 200 companies would take 100 minutes of manual work. At $10 for the same output, you get clean structured data with no subscription commitment. Tools like ZoomInfo or Apollo charge $100–500/month and still require manual filtering to narrow to a specific software category.

Scrape software company leads using the API

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/g2-company-scraper").call(run_input={
    "categories": ["crm", "project-management"],
    "sources": ["sourceforge", "trustradius"],
    "maxCompaniesPerCategory": 50,
    "minReviews": 5,
    "minRating": 3.5,
    "deduplicateByDomain": True,
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("type") == "summary":
        print(f"Total companies found: {item['totalCompaniesFound']}")
    else:
        print(f"{item['productName']} ({item['companyName']}) — {item['domain']} — {item['rating']} stars, {item['reviewCount']} reviews")

JavaScript

import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/g2-company-scraper").call({
    categories: ["crm", "project-management"],
    sources: ["sourceforge", "trustradius"],
    maxCompaniesPerCategory: 50,
    minReviews: 5,
    minRating: 3.5,
    deduplicateByDomain: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    if (item.type === "summary") {
        console.log(`Total: ${item.totalCompaniesFound} companies`);
    } else {
        console.log(`${item.productName} — ${item.domain} — ${item.rating} stars, ${item.pricingTier}`);
    }
}

cURL

# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~g2-company-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "categories": ["crm", "project-management"],
    "sources": ["sourceforge", "trustradius"],
    "maxCompaniesPerCategory": 50,
    "minReviews": 5,
    "minRating": 3.5,
    "deduplicateByDomain": true
  }'

# Fetch results (replace DATASET_ID from the run response above)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"

How Software Directory Scraper works

Phase 1 — Request generation

On startup the actor reads your categories and sources inputs, normalizes category slugs (lowercase, hyphens, trim), then builds start requests. For SourceForge, it constructs one URL per category at https://sourceforge.net/software/{category}/ (page 1). For TrustRadius, it generates five sitemap URLs per category — https://www.trustradius.com/sitemaps/product-reviews-sitemap-{1..5}.xml — to maximize product URL discovery. All requests are handed to a CheerioCrawler instance running at maxConcurrency: 5 with session pooling, persistent cookies per session, a 30-second navigation timeout, and 3 retries per request.
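
The request generation described above can be sketched in Python. The URL templates and the SF_CATEGORY and TR_SITEMAP labels come from this documentation; the exact normalization rules (for example, turning spaces and underscores into hyphens) are an assumption, and normalize_slug and build_start_requests are hypothetical names.

```python
import re

SF_CATEGORY_URL = "https://sourceforge.net/software/{slug}/"
TR_SITEMAP_URL = "https://www.trustradius.com/sitemaps/product-reviews-sitemap-{n}.xml"


def normalize_slug(raw: str) -> str:
    """Trim, lowercase, and hyphenate a category name (assumed rules)."""
    slug = raw.strip().lower()
    slug = re.sub(r"[\s_]+", "-", slug)  # spaces and underscores become hyphens
    return slug.strip("-")


def build_start_requests(categories, sources):
    """Return (label, category, url) tuples for the crawler queue."""
    requests = []
    for category in map(normalize_slug, categories):
        if "sourceforge" in sources:
            # One listing URL per category; later pages are enqueued while crawling.
            requests.append(("SF_CATEGORY", category, SF_CATEGORY_URL.format(slug=category)))
        if "trustradius" in sources:
            # Five sitemap shards per category, numbered 1 through 5.
            for n in range(1, 6):
                requests.append(("TR_SITEMAP", category, TR_SITEMAP_URL.format(n=n)))
    return requests
```

With both sources enabled, each category yields six start requests: one SourceForge listing page and five TrustRadius sitemap shards.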

Phase 2 — SourceForge category crawling

The SourceForge route handler (SF_CATEGORY) parses category listing pages using four CSS selector strategies for resilience against markup changes. It extracts product name from h3 a title links, vendor name from .project-company or [class*="company"] elements, star rating from [class*="rating-avg"] or [itemprop="ratingValue"] attributes, review count from [class*="rating-count"], pricing from [class*="price"], category tags from [class*="tag"] a, award badges via the extractSFBadges helper, and the vendor website link identified by data-ga-label="website" or rel="nofollow" links pointing outside sourceforge.net. After processing each page it automatically enqueues the next page (?page=N+1) until the per-category limit is reached or no listing cards are found.

Phase 3 — TrustRadius sitemap and product crawling

The TrustRadius route has two stages. The sitemap handler (TR_SITEMAP) parses XML sitemap files using Cheerio's XML support (enabled via additionalMimeTypes: ['application/xml', 'text/xml']), filters for /products/ URLs that are not comparison, pricing, video, or competitor pages, then enqueues up to targetCount * 2 product URLs to account for items that will be filtered. The product handler (TR_PRODUCT) first attempts to parse structured data from the __NEXT_DATA__ JSON block embedded in the server-rendered page, checking three property paths. If the JSON path yields no product name, it falls back to eight named data-testid HTML selectors — product-name, vendor-name, overall-score, reviews-count, product-description, pricing-summary, category, and vendor-website — plus a meta[name="description"] fallback for descriptions.
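
The three-path lookup in the product handler can be illustrated like this. The sketch assumes the paths are relative to the top-level props object that Next.js places in __NEXT_DATA__, and it starts from JSON that has already been parsed into a dict; dig and find_product are hypothetical names, and the real handler additionally checks for a product name before deciding whether to fall back to HTML selectors.

```python
from typing import Any, Optional

# Property paths listed in the documentation, tried in order.
PRODUCT_PATHS = (
    ("pageProps", "product"),
    ("pageProps", "data", "product"),
    ("pageProps", "productReviews", "product"),
)


def dig(obj: Any, path: tuple) -> Any:
    """Follow a key path through nested dicts, returning None if it breaks."""
    for key in path:
        if not isinstance(obj, dict):
            return None
        obj = obj.get(key)
    return obj


def find_product(next_data: dict) -> Optional[dict]:
    """Return the first product object found under any known path."""
    props = next_data.get("props", {})
    for path in PRODUCT_PATHS:
        candidate = dig(props, path)
        if isinstance(candidate, dict):
            return candidate
    return None  # caller would fall back to the data-testid selectors
```

Trying several paths in a fixed order keeps extraction working when a page template nests the product object differently.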

Phase 4 — Normalization, filtering, and pay-per-event charging

Every extracted record passes through transformRawToClean(), which applies domain extraction (stripping www. prefixes via URL parsing), rating scale conversion, pricing tier normalization, and whitespace collapsing via regex. The record is then checked against minReviews and minRating in passesFilters(). If it passes, the actor calls Actor.pushData() first, then Actor.charge({ eventName: 'company-found', count: 1 }) — following Apify's data-before-charge rule. The eventChargeLimitReached flag is checked after each charge; if set, all active route handlers stop and the run completes cleanly with a summary record.
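
A rough Python sketch of this phase is below. It is a stand-in for the actor's own transformRawToClean() and passesFilters(), not a copy of them. Several points are assumptions: missing ratings and review counts are treated as 0, filtering runs before deduplication, and domain extraction only strips the www. prefix (true registrable-domain handling would need a public suffix list). The pushed list and charged counter stand in for Actor.pushData() and Actor.charge().

```python
from urllib.parse import urlparse


def extract_domain(website):
    """Return the hostname without a leading www., or None if unavailable."""
    if not website:
        return None
    url = website if "://" in website else f"https://{website}"
    host = urlparse(url).hostname
    if not host:
        return None
    return host[4:] if host.startswith("www.") else host


def passes_filters(record, min_reviews=0, min_rating=0.0):
    """Apply the minReviews and minRating thresholds (None counts as 0)."""
    reviews = record.get("reviewCount") or 0
    rating = record.get("rating") or 0
    return reviews >= min_reviews and rating >= min_rating


def process(records, min_reviews=0, min_rating=0.0, dedupe=True):
    """Filter, deduplicate, then 'push' and 'charge' in that order."""
    seen_domains = set()
    pushed, charged = [], 0
    for record in records:
        if not passes_filters(record, min_reviews, min_rating):
            continue  # filtered records are skipped at no cost
        domain = extract_domain(record.get("website"))
        if dedupe and domain:
            if domain in seen_domains:
                continue
            seen_domains.add(domain)
        pushed.append({**record, "domain": domain})  # data first
        charged += 1                                 # charge second
    return pushed, charged
```

Appending to pushed before incrementing charged mirrors the data-before-charge ordering described above.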

Tips for best results

  1. Check category slugs against the SourceForge URL before running. Visit https://sourceforge.net/software/your-slug/ in a browser. If the page returns results, the slug is valid. Invalid slugs return empty pages and produce zero results for the SourceForge source.

  2. Use minReviews: 5 as a baseline filter. Products with fewer than 5 reviews are often newly listed or inactive. Filtering them reduces noise without significantly reducing volume in established categories.

  3. Combine categories strategically to avoid redundancy. Categories on SourceForge overlap significantly — crm and sales-force-automation share many products. Running them together with deduplicateByDomain: true catches products in both without doubling your cost.

  4. Run TrustRadius-only for enterprise B2B focus. TrustRadius skews heavily toward enterprise software with large review counts and detailed scoring. If your target market is enterprise buyers, set sources: ["trustradius"] and minReviews: 20 for a focused list.

  5. Pipe directly into contact enrichment. The domain field is a ready-made key for Website Contact Scraper. Extract the domains from your dataset and run them in a batch to get email addresses and phone numbers for each vendor.

  6. Schedule weekly refreshes for fast-moving categories. Categories like ai-tools or marketing-automation add new products frequently. A weekly scheduled run keeps your lead list current as new vendors appear in the directories.

  7. Use the summary record for run monitoring. Every run ends with a type: "summary" record. If totalCompaniesFound drops significantly week-over-week, that signals a markup change or category rename worth investigating before the next run.
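
For tip 7, a week-over-week check on the summary record might look like the sketch below. The helper names and the 30% threshold are hypothetical; pick a threshold that suits how much your categories normally fluctuate.

```python
def find_summary(items):
    """Return the run's summary record, or None if it is missing."""
    return next((item for item in items if item.get("type") == "summary"), None)


def dropped_significantly(previous_total, current_total, threshold=0.3):
    """True when the total fell by at least `threshold` (a fraction) since the last run."""
    if previous_total <= 0:
        return False  # nothing to compare against
    drop = (previous_total - current_total) / previous_total
    return drop >= threshold


if __name__ == "__main__":
    last_week = [{"type": "summary", "totalCompaniesFound": 187}]
    this_week = [{"type": "summary", "totalCompaniesFound": 100}]
    prev = find_summary(last_week)["totalCompaniesFound"]
    curr = find_summary(this_week)["totalCompaniesFound"]
    if dropped_significantly(prev, curr):
        print("Result count dropped; check for markup changes or renamed categories")
```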

Combine with other Apify actors

Actor | How to combine
Website Contact Scraper | Feed the domain or website field from each company record to extract emails, phone numbers, and contact pages from vendor websites
Waterfall Contact Enrichment | Run a 10-step enrichment cascade on the vendor domain to find decision-maker emails and LinkedIn profiles
Email Pattern Finder | Detect the email naming convention used by each vendor before building outbound sequences
B2B Lead Qualifier | Score each company 0–100 using rating, review count, pricing tier, and other signals to prioritize enrichment spend
Website Tech Stack Detector | Detect 100+ web technologies on each vendor's website to qualify leads by tech profile or integration fit
HubSpot Lead Pusher | Push the structured company records directly into HubSpot as contacts or companies after enrichment
Bulk Email Verifier | Verify email addresses found via contact scraping before importing into your sending tool
Lead Enrichment Pipeline | All-in-one Clay alternative: email discovery, verification, company research, and scoring in one run ($0.12/lead)
AI Outreach Personalizer | Generate personalized cold emails using your own OpenAI/Anthropic key, with zero AI markup ($0.01/lead)
Intent Signal Tracker | Track buying signals: hiring, tech changes, funding, content updates. Prioritize outreach by intent score ($0.05/company)
Lead Data Quality Auditor | Audit lead data quality before outreach: email verification, phone validation, domain freshness ($0.005/record)

Limitations

  • TrustRadius category filtering is approximate. The actor discovers TrustRadius products via sitemaps that list all products, not filtered by category. The sourceCategory field reflects the category you searched for, not a TrustRadius taxonomy match. Products from adjacent segments may appear in results.
  • SourceForge badge extraction depends on CSS class naming patterns. Badges are extracted using the selectors .badge-container .badge and [class*="award"]. If SourceForge changes its CSS class naming, badge data may be incomplete while other fields remain accurate.
  • Vendor websites are not always present. Some directory listings do not include a vendor website link. In those cases website and domain will be null, and the record cannot be used for downstream website-based enrichment.
  • No JavaScript rendering. The actor uses CheerioCrawler (HTTP-based), not a browser. Pages that require client-side JavaScript to render their content will return incomplete data. Both SourceForge and TrustRadius use server-rendered HTML so this does not currently affect results, but any future sources added to this actor that require browser execution would need a separate implementation.
  • TrustRadius sitemap coverage is 5 shards out of approximately 25. The actor fetches sitemaps 1–5, covering thousands of product URLs. Products in higher-numbered shards are not discovered in a standard run. For complete TrustRadius coverage across all shards, contact us about a custom configuration.
  • Deduplication is run-scoped. The seenDomains set is created fresh each run. If you run the actor twice against the same categories, the same companies can appear in both datasets. Use the domain field as a unique key in your downstream storage to handle cross-run deduplication.
  • No employee count, funding, or HQ data. Neither SourceForge nor TrustRadius consistently exposes firmographic data in their listing HTML. Use Company Deep Research or a downstream enrichment actor to add firmographic context.
  • Rating scale conversion is a linear approximation. TrustRadius 10-point scores are divided by 2. This does not account for distribution differences between the two rating systems; a TrustRadius 8.6 becomes 4.3, but the populations rated by each platform differ.

Integrations

  • Zapier — trigger a Zap when a run completes to push new software companies into a Google Sheet or CRM automatically
  • Make — build a multi-step scenario that scrapes companies, enriches contacts, and adds leads to your outbound sequence tool
  • Google Sheets — export the dataset directly to a sheet for manual review and prioritization before enrichment
  • Apify API — trigger runs programmatically from your sales or marketing automation platform and receive results via webhook
  • Webhooks — post the completed dataset URL to a Slack channel or internal dashboard when a run finishes
  • LangChain / LlamaIndex — use scraped software company descriptions and category data as a knowledge base for AI-powered market research agents

Troubleshooting

Zero results despite providing a valid category. The most common cause is a category slug that does not match the SourceForge URL structure. Verify by visiting https://sourceforge.net/software/your-slug/ directly. If the page shows no products, try a more general slug (e.g. crm instead of crm-software). For TrustRadius, results depend on sitemap coverage — if the category has few matching products in the first five sitemaps, output will be low.

All results have null website and domain fields. Some SourceForge categories list products without a vendor website link in the listing card. This is more common in open-source or niche categories. The profileUrl still links to the directory listing and can be used as a secondary identifier for manual lookup.

TrustRadius results are empty or very few. TrustRadius products are discovered via sitemap, not via category-filtered listings. Lowering minReviews to 0 and minRating to 0 confirms whether any records can be found. The actor enqueues targetCount * 2 product URLs to account for filtering, but the absolute maximum is bounded by what appears in the first five sitemap shards.

Run completes faster than expected with fewer results than the limit. This means the actor exhausted all available listing pages before reaching your maxCompaniesPerCategory limit. SourceForge categories vary in size — smaller niches may have fewer than 50 products total. Check the summary record's companiesByCategory field to see how many were found per category.

Duplicate companies appearing across multiple runs. Deduplication only operates within a single run. Across multiple runs, the same company can appear again. Use the domain field as a unique key in your downstream storage — a Google Sheets VLOOKUP, CRM deduplication rule, or a database unique constraint on domain will handle this cleanly.
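
As one way to apply the unique-constraint approach, the sketch below stores records in SQLite with domain as the primary key, so re-importing a later run adds only new companies. The table layout and function names are hypothetical. Records with a null domain are skipped here and would need another key, such as profileUrl.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a store with a uniqueness guarantee on domain."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS companies (
               domain TEXT PRIMARY KEY,
               product_name TEXT,
               rating REAL,
               scraped_at TEXT
           )"""
    )
    return conn


def import_run(conn, items):
    """Insert company records from one run; return how many were new."""
    inserted = 0
    for item in items:
        if item.get("type") == "summary" or not item.get("domain"):
            continue  # skip the summary record and rows without a domain
        cursor = conn.execute(
            "INSERT OR IGNORE INTO companies VALUES (?, ?, ?, ?)",
            (item["domain"], item.get("productName"), item.get("rating"), item.get("scrapedAt")),
        )
        inserted += cursor.rowcount  # 1 if inserted, 0 if the domain already existed
    conn.commit()
    return inserted
```

INSERT OR IGNORE keeps the first record seen for each domain. If you would rather refresh ratings on every run, an upsert would be the alternative.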

Responsible use

  • This actor only accesses publicly available software directory listings on SourceForge and TrustRadius.
  • Respect each platform's terms of service and robots.txt directives.
  • Comply with GDPR, CAN-SPAM, and other applicable data protection laws when using scraped company data for outreach.
  • Do not use extracted data to send unsolicited bulk email or for spam campaigns.
  • For guidance on web scraping legality, see Apify's guide.

FAQ

How many software companies can I scrape in one run? There is no hard cap from the actor. The maxCompaniesPerCategory parameter (default 50, max 1000) controls per-category volume, and you can run as many categories as you like in a single run. Your practical limit is your Apify spending budget — at $0.05 per company, a $50 budget yields up to 1,000 companies.

Does Software Directory Scraper work for any software category? It works for any category that has a valid slug on SourceForge (https://sourceforge.net/software/{slug}/). Common slugs include crm, project-management, email-marketing, accounting, help-desk, marketing-automation, hr-software, erp, business-intelligence, and video-conferencing. TrustRadius coverage depends on sitemap inclusion and is not category-filtered.

How accurate is the rating data from this scraper? Ratings are taken directly from the directory listings and reflect each platform's own aggregated scores. TrustRadius 10-point scores are converted to a 5-point scale by dividing by 2. The accuracy of the underlying ratings is determined by each directory's own review processes — the actor extracts them without modification beyond scale normalization.

What is the difference between SourceForge and TrustRadius results? SourceForge has 100k+ products including many open-source and SMB-focused tools, with paginated category listings, explicit pricing, and badge data. TrustRadius focuses on enterprise B2B software with in-depth review scoring. Using both sources together gives broader category coverage across company sizes and market segments.

How is this different from scraping G2, Capterra, or GetApp? G2, Capterra, and GetApp aggressively block HTTP scrapers with Cloudflare's JS challenge — extracting data from them requires a full browser with anti-detection measures, which is slower and more expensive. SourceForge and TrustRadius serve their listing pages to datacenter IPs without blocking, making this actor fast, reliable, and proxy-free.

Can I scrape software company leads from multiple categories at once? Yes. Pass multiple slugs in the categories array: ["crm", "project-management", "email-marketing"]. The actor processes all categories in parallel using a shared crawler queue. Deduplication operates across the entire run, so a company appearing in two categories is only returned once when deduplicateByDomain: true.

How long does a typical software directory scraping run take? A single category at maxCompaniesPerCategory: 50 from both sources typically completes in 3–6 minutes. Five categories at the same limit take 10–20 minutes. TrustRadius runs slightly longer because each product requires an individual page fetch after sitemap parsing.

Can I filter out free and open-source software from the results? There is no dedicated filter for this, but you can filter the output dataset by the pricingTier field. Records where pricingTier is "Free" or "Open Source" can be excluded in post-processing in Excel, Google Sheets, or your pipeline code.
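
In pipeline code, that post-processing step could look like the following sketch. The function name is hypothetical, and keeping records with a null pricingTier (and Freemium products) is a choice you may want to change.

```python
# Tier labels taken from the pricing normalization described in this documentation.
EXCLUDED_TIERS = {"Free", "Open Source"}


def paid_only(items):
    """Drop the summary record and any product tagged Free or Open Source."""
    return [
        item
        for item in items
        if item.get("type") != "summary"
        and item.get("pricingTier") not in EXCLUDED_TIERS
    ]
```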

Is it legal to scrape SourceForge and TrustRadius? Scraping publicly available data from software directories is generally considered lawful in most jurisdictions. Both SourceForge and TrustRadius publish their listings publicly without authentication requirements. Always respect the platforms' terms of service and use the data responsibly. See Apify's web scraping legality guide for a detailed overview.

Can I schedule this actor to run automatically every week? Yes. Use Apify's built-in scheduler to run on any cron schedule — daily, weekly, or monthly. Weekly runs against fast-moving categories like ai-tools or marketing-automation keep your lead list current as new products are added to the directories.

What happens if the same company appears on both SourceForge and TrustRadius? With deduplicateByDomain: true (the default), the first occurrence is kept and the duplicate is skipped. The source field on the kept record shows which directory found it first. With deduplicateByDomain: false, both records are returned so you can compare ratings and review counts across sources.

Can I connect the output directly to HubSpot or Salesforce? Yes. Use HubSpot Lead Pusher to push company records into HubSpot, or use Apify's Zapier or Make integrations to route data to Salesforce, Pipedrive, or any other CRM. The domain field is a reliable unique key for CRM deduplication.

Help us improve

If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:

  1. Go to Account Settings > Privacy
  2. Enable Share runs with public Actor creators

This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.

Support

Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.
