Agency Directory Scraper
**Agency directory scraper** that harvests Clutch, Sortlist, AgencySpotter, and DesignRush in a single run — giving you a deduplicated list of marketing, design, and technology agencies with their website, services, location, team size, minimum budget, review count, and star rating. Built for sales teams, consultants, and researchers who need structured agency data at scale without manual browsing.
Pull up to 500 agencies per source (2,000 total across all four directories) in one run. The actor uses a Playwright browser with residential proxy rotation and browser fingerprint spoofing to navigate directories that block datacenter IPs. Every record is deduplicated by domain before it reaches your dataset, so the same agency appearing on multiple directories counts only once.
What data can you extract from agency directories?
| Data Point | Source | Example |
|---|---|---|
| 📛 Agency name | All four directories | Pinnacle Digital Group |
| 🌐 Website URL | All four directories | https://pinnacledg.com |
| 🔗 Domain | Extracted from website | pinnacledg.com |
| 🏷️ Services | All four directories | ["SEO", "PPC", "Content Marketing"] |
| 📍 Location | All four directories | Chicago, IL |
| 👥 Employee count | Clutch, DesignRush, AgencySpotter | 10–49 |
| 💰 Min project size | Clutch, DesignRush | $5,000+ |
| ⭐ Star rating | Clutch, Sortlist, DesignRush | 4.8 |
| 💬 Review count | Clutch, Sortlist, DesignRush | 143 |
| 📂 Source directory | All records | clutch |
| 🔎 Source profile URL | All four directories | https://clutch.co/profile/pinnacle-digital |
| 🕐 Scraped timestamp | All records | 2026-03-22T10:14:33.000Z |
Why use Agency Directory Scraper?
Manually browsing Clutch or DesignRush for agency leads is a multi-hour slog. Clutch alone lists thousands of agencies across dozens of paginated category pages. There is no export button, no bulk download, and no API. Copy-pasting agency details one by one is error-prone and takes the better part of a day to collect even 200 records — a day you could spend actually talking to prospects.
This actor automates the entire agency directory scraping process — crawling all four directories simultaneously, turning a full day of manual research into a 15–20 minute automated run for under $15.
- Scheduling — run daily, weekly, or monthly to refresh your agency database as new firms join the directories
- API access — trigger runs from Python, JavaScript, or any HTTP client to integrate with your CRM pipeline
- Proxy rotation — residential proxy support keeps scraping reliable even on directories with aggressive bot detection
- Monitoring — get Slack or email alerts when runs fail or produce fewer results than expected
- Integrations — connect to Zapier, Make, Google Sheets, HubSpot, or webhooks to push results directly into your workflow
Features
- Four-directory coverage — Clutch, Sortlist, AgencySpotter, and DesignRush all in one run, with per-source page crawling up to 500 agencies each
- Domain-based deduplication — a shared `seenDomains` Set across all sources ensures each agency website is output only once, even if the same firm appears on multiple directories (see the sketch after this list)
- Playwright browser rendering — full headless Chromium execution handles JavaScript-heavy listing pages that HTTP-only scrapers cannot reach
- Browser fingerprint spoofing — randomised Chrome fingerprints across Linux, Windows, and macOS profiles with `--disable-blink-features=AutomationControlled` to evade bot detection
- Automatic pagination — detects `a[rel="next"]` and pagination controls on each directory and queues subsequent pages automatically until the per-source limit is met
- Service tag extraction — collects up to 8 service and specialty tags per agency card from every directory
- Normalised website URLs — raw href values are cleaned into canonical absolute URLs; relative paths, protocol-relative URLs, and fragment-only values are handled or discarded
- Structured numeric fields — review counts like "1,234 reviews" and ratings like "4.8/5 stars" are parsed into clean integers and floats using dedicated parsers
- Per-source result cap — `maxAgenciesPerSource` is enforced independently per directory, so you can cap Clutch at 100 while pulling 500 from DesignRush
- Spending limit enforcement — PPE charges halt when your configured budget ceiling is reached; remaining records are not pushed and a summary record marks the early stop
- Graceful partial results — if the crawl encounters an error mid-run, all agencies collected so far are pushed to the dataset rather than discarded
- Run summary record — a final `type: "summary"` record is always appended with source breakdown, total count, and metadata
Use cases for agency directory scraping
Sales prospecting for SaaS and technology vendors
Technology vendors targeting digital marketing agencies — from SEO software to white-label ad platforms — need current, segmented agency lists to fuel outbound. Manually building a list of 300 SEO agencies in North America could take two days of browsing. With this actor, a sales team pulls that list in minutes, filtered by service category and location, then feeds it directly into their outreach sequence or CRM.
Agency market mapping and competitive research
Strategy consultants and M&A researchers use agency directories to map the competitive landscape: who operates in a given city, what services they offer, how large they are, and how they are reviewed. Running this actor across multiple service slugs — seo, ppc, web-design — produces a structured market map that would take weeks to assemble manually.
Recruiting and talent sourcing
Recruiters placing senior marketing hires often want to identify mid-size agencies (10–49 employees) in specific locations as target employers. The employeeCount and location fields make it straightforward to filter the dataset to exactly that segment and then enrich the records with decision-maker contact details using Waterfall Contact Enrichment.
Vendor evaluation and procurement
Procurement teams comparing agencies before a pitch process use directory listings to generate a long-list quickly. The rating, reviewCount, and minProjectSize fields provide the first-pass scoring criteria without requiring individual website visits. Export to Google Sheets and share with stakeholders for collaborative shortlisting.
White-label agency partnership development
Larger agencies looking for white-label partners in specialist disciplines — video production, accessibility auditing, PR, translation — can filter results by service category and location to identify candidates, then review the sourceUrl profile links to assess social proof before outreach.
Data enrichment for existing CRM records
If your CRM already has agency company names but is missing website, location, or service data, the scraped dataset can serve as a reference lookup to fill gaps — especially effective when combined with Website Contact Scraper to add email addresses from each agency's website.
How to scrape agency directories
1. Choose your directories — select one or more of `clutch`, `sortlist`, `agencyspotter`, `designrush` in the Sources field. Leave all four checked to maximise coverage.
2. Set a service category — type a slug like `seo`, `ppc`, `web-design`, or `digital-marketing`. This maps directly to the URL path on each directory (e.g. `clutch.co/agencies/seo`).
3. Run the actor — click Start and wait. A run pulling 50 agencies from each of four sources typically completes in 15–20 minutes.
4. Download results — open the Dataset tab and export as JSON, CSV, or Excel. Filter by `source`, `location`, or `rating` in the dataset UI before exporting.
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `sources` | array | Yes | `["clutch","sortlist","agencyspotter","designrush"]` | Which directories to crawl. Valid values: `clutch`, `sortlist`, `agencyspotter`, `designrush`. |
| `services` | string | No | `"digital-marketing"` | Service category slug to filter by. Maps to the directory URL path (e.g. `seo`, `ppc`, `web-design`). |
| `location` | string | No | `""` | City or country to filter agencies by. Leave blank for global results. |
| `maxAgenciesPerSource` | integer | No | `50` | Maximum agencies to collect per directory. Range: 1–500. Total output can be up to 4× this value. |
| `proxyConfiguration` | object | No | `{"useApifyProxy":true,"apifyProxyGroups":["RESIDENTIAL"]}` | Proxy settings. All four directories block datacenter IPs — residential proxies are required for reliable results. |
Input examples
Most common: all four directories, SEO agencies, global:
```json
{
  "sources": ["clutch", "sortlist", "agencyspotter", "designrush"],
  "services": "seo",
  "location": "",
  "maxAgenciesPerSource": 50,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```
Targeted: Clutch and DesignRush, PPC agencies in the United States, large batch:
```json
{
  "sources": ["clutch", "designrush"],
  "services": "ppc",
  "location": "United States",
  "maxAgenciesPerSource": 200,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```
Quick test: single source, small cap, fast feedback:
```json
{
  "sources": ["clutch"],
  "services": "digital-marketing",
  "location": "",
  "maxAgenciesPerSource": 10,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```
Input tips
- Start with the default service slug — `digital-marketing` is the broadest category and a good smoke test that the proxy and selector configuration is working before narrowing to a niche.
- Use location sparingly — location filtering is appended to the directory URL where the directory supports it, but all four directories handle it differently. For reliable coverage, omit `location` and filter the output dataset by the `location` field after the run.
- Set a spending limit for large batches — at 4 sources × 500 agencies = 2,000 records, the maximum cost is $100. Configure a spending limit in the Apify run settings to cap spend in case you intended a smaller batch.
- Always use the RESIDENTIAL proxy group — all four directories detect and block datacenter IPs. Removing the proxy group is the most common cause of zero-result runs.
- Batch service categories in separate runs — if you need both `seo` and `ppc` agencies, run them as two separate inputs. Each run maintains its own deduplication state (see the sketch below).
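For that last tip, here is a minimal sketch of running two service categories back-to-back with the JavaScript API client, using the same actor ID as the examples later on this page:

```typescript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

// Run each service category as its own actor run — each run keeps
// its own deduplication state, as noted in the tip above.
for (const service of ["seo", "ppc"]) {
  const run = await client.actor("ryanclinton/agency-directory-scraper").call({
    sources: ["clutch", "sortlist", "agencyspotter", "designrush"],
    services: service,
    maxAgenciesPerSource: 50,
    proxyConfiguration: { useApifyProxy: true, apifyProxyGroups: ["RESIDENTIAL"] },
  });
  console.log(`${service}: dataset ${run.defaultDatasetId}`);
}
```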
Output example
```json
{
  "agencyName": "Meridian Growth Partners",
  "website": "https://meridiangrowthpartners.com",
  "domain": "meridiangrowthpartners.com",
  "services": ["SEO", "PPC", "Content Marketing", "Email Marketing"],
  "location": "Austin, TX",
  "employeeCount": "10-49",
  "minProjectSize": "$5,000+",
  "reviewCount": 87,
  "rating": 4.9,
  "source": "clutch",
  "sourceUrl": "https://clutch.co/profile/meridian-growth-partners",
  "scrapedAt": "2026-03-22T10:14:33.121Z"
}
```
The final record in every dataset is a run summary:
```json
{
  "type": "summary",
  "totalAgencies": 186,
  "sourcesScraped": ["clutch", "sortlist", "agencyspotter", "designrush"],
  "sourceBreakdown": {
    "clutch": 50,
    "sortlist": 47,
    "agencyspotter": 44,
    "designrush": 45
  },
  "service": "seo",
  "location": null,
  "maxAgenciesPerSource": 50,
  "spendingLimitReached": false,
  "scrapedAt": "2026-03-22T10:31:07.448Z"
}
```
Output fields
| Field | Type | Description |
|---|---|---|
| `agencyName` | string | Agency display name as it appears on the directory listing |
| `website` | string \| null | Normalised absolute URL of the agency's website |
| `domain` | string \| null | Registrable domain extracted from `website` (e.g. `example.com`), used for deduplication |
| `services` | string[] | Service and specialty tags from the agency card, up to 8 per source |
| `location` | string \| null | City and/or country as shown on the directory listing |
| `employeeCount` | string \| null | Team size range, e.g. `10-49`, `50-249`, `250+` |
| `minProjectSize` | string \| null | Minimum project budget, e.g. `$5,000+`, `$10,000+` (Clutch and DesignRush only) |
| `reviewCount` | number \| null | Total number of client reviews on the directory listing |
| `rating` | number \| null | Average star rating parsed as a float, e.g. `4.8` |
| `source` | string | Directory the record came from: `clutch`, `sortlist`, `agencyspotter`, or `designrush` |
| `sourceUrl` | string | Direct URL to the agency's profile page on the source directory |
| `scrapedAt` | string | ISO 8601 timestamp of when the record was extracted |
How much does it cost to scrape agency directories?
Agency Directory Scraper uses pay-per-event pricing — you pay $0.05 per agency extracted and deduplicated. Platform compute costs are included. You are never charged for duplicates removed during deduplication or for failed page loads.
| Scenario | Agencies | Cost per agency | Total cost |
|---|---|---|---|
| Quick test (1 source, 10 agencies) | 10 | $0.05 | $0.50 |
| Small batch (2 sources, 25 each) | 50 | $0.05 | $2.50 |
| Standard run (4 sources, 50 each) | ~200 | $0.05 | ~$10.00 |
| Large run (4 sources, 100 each) | ~400 | $0.05 | ~$20.00 |
| Maximum batch (4 sources, 500 each) | ~2,000 | $0.05 | ~$100.00 |
You can set a maximum spending limit per run in the Apify console to control costs. The actor stops pushing records when your budget is reached and always outputs a summary record indicating whether the limit was hit.
Compare this to B2B data platforms like Apollo or ZoomInfo at $49–$199/month for similar (but less agency-specific) data. Most users building or refreshing an agency prospecting list spend $5–$20 per run with no subscription commitment.
Agency directory scraping using the API
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/agency-directory-scraper").call(run_input={
    "sources": ["clutch", "sortlist", "agencyspotter", "designrush"],
    "services": "seo",
    "location": "",
    "maxAgenciesPerSource": 50,
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"]
    }
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("type") == "summary":
        print(f"Total agencies: {item['totalAgencies']}")
        continue
    print(f"{item['agencyName']} | {item['domain']} | {item.get('rating')} stars | {item.get('location')}")
```
JavaScript
```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/agency-directory-scraper").call({
  sources: ["clutch", "sortlist", "agencyspotter", "designrush"],
  services: "seo",
  location: "",
  maxAgenciesPerSource: 50,
  proxyConfiguration: {
    useApifyProxy: true,
    apifyProxyGroups: ["RESIDENTIAL"]
  }
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
  if (item.type === "summary") continue;
  console.log(`${item.agencyName} | ${item.domain} | ${item.services.join(", ")}`);
}
```
cURL
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~agency-directory-scraper/runs?token=YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"sources": ["clutch", "sortlist", "agencyspotter", "designrush"],
"services": "seo",
"location": "",
"maxAgenciesPerSource": 50,
"proxyConfiguration": {
"useApifyProxy": true,
"apifyProxyGroups": ["RESIDENTIAL"]
}
}'
# Fetch results once the run completes (replace DATASET_ID from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
How Agency Directory Scraper works
Phase 1 — URL construction and seed requests
For each selected source, dedicated URL builder functions construct the correct directory listing URL from the services slug and page number. Clutch uses clutch.co/agencies/{service}?page={n-1} (page 1 omits the parameter), DesignRush uses /agency/{service}/p/{n}, Sortlist uses /agencies/{service}?p={n}, and AgencySpotter uses /agencies/{service}?page={n}. All seed URLs for all selected sources are queued simultaneously at the start of the run, so directories are crawled in parallel at a concurrency of 2 to avoid triggering rate limits.
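As an illustration, a URL builder following the patterns above might look like the sketch below. The hostnames and `www.` prefixes are assumptions, and the actor's actual helper functions may be named and structured differently:

```typescript
// Hypothetical sketch of a per-source listing URL builder,
// based on the URL patterns described above.
function buildListingUrl(source: string, service: string, page: number): string {
  switch (source) {
    case "clutch":
      // Page 1 omits the parameter; later pages are zero-indexed (?page=1 is page 2).
      return page === 1
        ? `https://clutch.co/agencies/${service}`
        : `https://clutch.co/agencies/${service}?page=${page - 1}`;
    case "designrush":
      return `https://www.designrush.com/agency/${service}/p/${page}`;
    case "sortlist":
      return `https://www.sortlist.com/agencies/${service}?p=${page}`;
    case "agencyspotter":
      return `https://www.agencyspotter.com/agencies/${service}?page=${page}`;
    default:
      throw new Error(`Unknown source: ${source}`);
  }
}
```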
Phase 2 — Playwright rendering and card extraction
Each route handler waits for its directory's specific agency card selector before proceeding: Clutch waits for [class*="provider-row"], Sortlist for [class*="agency-card"], AgencySpotter for [class*="listing-card"], and DesignRush for [class*="firm-card"]. Selectors use CSS substring class matching to tolerate minor HTML changes. Each card element is iterated with Playwright's .locator() API, extracting agency name, website href, rating text, review text, location, employee count, minimum project size, and up to 8 service tag elements. Every field extraction uses .catch(() => null) so a single missing element never drops the entire card.
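A simplified sketch of what one route handler's card extraction could look like follows. The inner field selectors here are illustrative assumptions, not the actor's actual ones; only the per-field `.catch(() => null)` pattern is taken from the description above:

```typescript
import type { Page } from "playwright";

// Illustrative card extraction for one directory's listing page.
async function extractCards(page: Page, cardSelector: string) {
  // Wait for the directory-specific agency card selector before extracting.
  await page.waitForSelector(cardSelector);
  const results: Array<Record<string, string | null>> = [];
  for (const card of await page.locator(cardSelector).all()) {
    results.push({
      // .catch(() => null) ensures one missing element never drops the whole card.
      agencyName: await card.locator("h3 a").first().textContent().catch(() => null),
      website: await card.locator("a[href*='http']").first().getAttribute("href").catch(() => null),
      ratingText: await card.locator("[class*='rating']").first().textContent().catch(() => null),
    });
  }
  return results;
}
```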
Phase 3 — Normalisation and deduplication
Raw extracted strings pass through pure utility functions in extractors.ts. normaliseWebsite canonicalises relative URLs to absolute form using the WHATWG URL API. extractDomain strips www. prefixes and returns the registrable domain (e.g. pinnacledg.com). parseReviewCount strips commas and non-numeric characters from strings like "1,234 reviews". parseRating handles formats including 4.8, 4.8/5, and 4.8 stars. A shared seenDomains Set in RouteContext is checked before each record is registered — if the domain is already present from any source, the card is skipped.
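The helpers below are a hedged sketch of these parsers. The names follow the description above, but the actual implementations in extractors.ts may differ:

```typescript
// Canonicalise an href to an absolute URL; discard fragments and non-HTTP values.
function normaliseWebsite(href: string, base: string): string | null {
  if (!href || href.startsWith("#")) return null; // fragment-only values are discarded
  try {
    const url = new URL(href, base); // WHATWG URL resolves relative and protocol-relative paths
    return url.protocol.startsWith("http") ? url.href : null;
  } catch {
    return null;
  }
}

// Strip the www. prefix and return the registrable domain.
function extractDomain(website: string): string | null {
  try {
    return new URL(website).hostname.replace(/^www\./, "");
  } catch {
    return null;
  }
}

// "1,234 reviews" -> 1234
function parseReviewCount(text: string): number | null {
  const digits = text.replace(/,/g, "").match(/\d+/);
  return digits ? parseInt(digits[0], 10) : null;
}

// "4.8", "4.8/5", "4.8 stars" -> 4.8
function parseRating(text: string): number | null {
  const match = text.match(/(\d+(?:\.\d+)?)/);
  return match ? parseFloat(match[1]) : null;
}
```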
Phase 4 — PPE charging and output
After the crawl completes, collected records are iterated and pushed one by one to the Apify dataset. In pay-per-event mode, Actor.charge({ eventName: 'agency-found', count: 1 }) is called after each push. If eventChargeLimitReached is returned true, the loop exits and remaining records are not pushed. A final summary record is always appended — whether the run completed normally, hit a spending limit, or encountered a crawl error — so you always know how many records were collected and from which sources.
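A minimal sketch of that charge-then-stop loop, using the Apify SDK's pay-per-event API (the event name and limit flag come from the description above; the surrounding structure is assumed):

```typescript
import { Actor } from "apify";

// Push collected records one by one, charging per agency, and stop
// cleanly when the configured spending limit is reached.
async function pushAndCharge(records: Record<string, unknown>[]) {
  for (const record of records) {
    await Actor.pushData(record);
    const result = await Actor.charge({ eventName: "agency-found", count: 1 });
    if (result.eventChargeLimitReached) {
      // Budget ceiling hit: remaining records are not pushed;
      // the summary record will mark the early stop.
      break;
    }
  }
}
```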
Tips for best results
- **Always use the RESIDENTIAL proxy group.** All four directories detect and block datacenter IPs. Running without `"apifyProxyGroups": ["RESIDENTIAL"]` — or with a misconfigured proxy — is the leading cause of zero-result runs. The default `proxyConfiguration` is already set correctly; do not remove it.
- **Test with a single source first.** Before committing to a large 4-source run, test with `"sources": ["clutch"]` and `maxAgenciesPerSource: 10`. This verifies your service slug is valid and your proxy is working before spending on a full batch.
- **Service slugs must match directory URL paths.** Slugs like `seo`, `digital-marketing`, and `ppc` are consistent across all four directories. Niche slugs like `influencer-marketing` may not exist on AgencySpotter or Sortlist — those sources will return zero results for unrecognised slugs rather than erroring out.
- **Filter by location post-run for best coverage.** Passing a `location` value narrows results by modifying the directory URL, but the four directories handle location strings inconsistently. Omitting `location` and filtering the output dataset by the `location` field gives you full coverage plus filtering flexibility (see the sketch after this list).
- **Combine with Website Contact Scraper for outreach lists.** Once you have a set of agency domains, feed the `domain` column into Website Contact Scraper to extract email addresses and phone numbers from each agency's website — turning a directory list into a full outreach-ready dataset in one pipeline.
- **Schedule weekly runs for a living database.** New agencies list on these directories regularly. A weekly scheduled run with downstream deduplication by domain keeps your prospecting list current without manual effort.
- **Set a spending limit for exploratory runs.** When experimenting with a new service slug or location string, set a $3–$5 spending limit in the run settings. The actor stops cleanly when the limit is reached and outputs whatever it has collected, so you can assess quality before committing to a full run.
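For the post-run location filtering recommended above, a minimal sketch using the API client (replace `DATASET_ID` with the dataset ID from a completed run; the filter string is just an example):

```typescript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

// Fetch a completed run's dataset and filter on the `location` output
// field client-side — more reliable than the directories' own location URLs.
const { items } = await client.dataset("DATASET_ID").listItems();
const austinAgencies = items.filter(
  (item) => item.type !== "summary" && String(item.location ?? "").includes("Austin")
);
console.log(`${austinAgencies.length} agencies in Austin`);
```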
Combine with other Apify actors
| Actor | How to combine |
|---|---|
| Website Contact Scraper | Feed the domain output into Website Contact Scraper to add email addresses and phone numbers to each agency record for outreach |
| Email Pattern Finder | Run Email Pattern Finder on each domain to detect the naming convention (e.g. [email protected]) before crafting personalised outreach |
| Waterfall Contact Enrichment | Enrich each agency domain through a 10-step contact enrichment cascade to find decision-maker names and emails |
| Bulk Email Verifier | Verify email addresses found for agencies before adding them to outreach sequences to protect sender reputation |
| HubSpot Lead Pusher | Push the completed agency dataset directly into HubSpot as company records with associated contact data |
| Website Tech Stack Detector | Detect which marketing tools each agency runs — useful for targeting agencies that use a specific platform your product integrates with |
| B2B Lead Qualifier | Score the scraped agency list on 30+ signals to prioritise outreach to the highest-fit prospects first |
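As a sketch of the first pipeline in the table above, the snippet below collects deduplicated domains from a completed run and passes them to a contact-scraping actor. The downstream actor ID and its `domains` input field are placeholders — check that actor's own documentation for the real names:

```typescript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

// Collect domains from a completed agency run, skipping the summary record.
const { items } = await client.dataset("AGENCY_DATASET_ID").listItems();
const domains = items
  .filter((item) => item.type !== "summary" && item.domain)
  .map((item) => String(item.domain));

// Feed them into a contact-scraping actor. The actor ID and input
// field below are hypothetical placeholders.
await client.actor("CONTACT_SCRAPER_ACTOR_ID").call({ domains });
```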
Limitations
- No JavaScript-free fallback. The actor uses Playwright for full browser rendering. This adds startup overhead — runs are slower than HTTP-only scrapers. Expect 15–25 minutes for 200 agencies across four sources.
- Directory HTML changes can break extraction. Selectors use class-name substring matching to tolerate minor changes, but a full directory redesign may require selector updates. Open an issue in the Issues tab if results suddenly drop to zero.
- Sortlist does not consistently expose minimum project size. The `minProjectSize` field will be `null` for all Sortlist records.
- AgencySpotter does not consistently expose ratings or review counts. The `rating` and `reviewCount` fields will be `null` for most AgencySpotter records.
- Location filtering is best-effort. The `location` value is appended to the directory URL where supported, but the actor does not apply its own post-filter. Non-English city names may be ignored or misinterpreted by some directories.
- Hard cap of 500 agencies per source. For directories with thousands of listings, the actor covers the first N pages sorted by the directory's default ranking — typically by review count or featured status. The highest-reviewed agencies appear first.
- Deduplication is domain-based, and only within a single run. Two agencies at different domains that are the same company (e.g. after a rebrand) will both appear. Merging datasets across multiple runs will introduce duplicates — filter by `domain` in your downstream tooling (see the sketch after this list).
- No individual profile page crawling. Data is extracted from listing cards only. Agency profile pages contain additional case studies, client lists, and award data that this actor does not visit.
- Residential proxy costs are separate. Apify residential proxy bandwidth is billed at standard Apify platform rates in addition to the per-agency PPE charge of $0.05.
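For the cross-run deduplication caveat above, a small sketch of merging datasets from multiple runs by domain (your downstream tooling may differ):

```typescript
// Merge items from several run datasets, keeping the first record
// seen for each domain. The actor itself only deduplicates within a run.
function mergeRuns(...datasets: Array<Array<{ domain?: string | null }>>) {
  const seen = new Set<string>();
  const merged: Array<{ domain?: string | null }> = [];
  for (const items of datasets) {
    for (const item of items) {
      if (!item.domain || seen.has(item.domain)) continue;
      seen.add(item.domain);
      merged.push(item);
    }
  }
  return merged;
}
```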
Integrations
- Zapier — trigger a Zap when a run completes to route high-rated agencies directly into a CRM deal stage or sales sequence
- Make — build a scenario that pulls agency results after each run and cross-references them against existing CRM contacts before creating new records
- Google Sheets — append scraped agency rows to a shared spreadsheet for team review and manual qualification before outreach
- Apify API — trigger runs programmatically from your internal tooling and retrieve results in JSON or CSV for downstream processing
- Webhooks — post the completed dataset URL to a Slack channel or internal endpoint the moment a run finishes
- LangChain / LlamaIndex — load agency records into a vector store to power an AI assistant that answers questions about the agency landscape in a given market
Troubleshooting
- **Zero results from one or more sources** — The most common cause is a missing or misconfigured proxy. Confirm `proxyConfiguration.useApifyProxy` is `true` and `apifyProxyGroups` includes `"RESIDENTIAL"`. Datacenter IPs are blocked by all four directories. The second common cause is an unrecognised service slug — verify the slug exists as a URL path on the target directory before running at scale.
- **Run completes but most fields are null** — Individual fields like `reviewCount`, `rating`, and `minProjectSize` will be null when the directory does not display that information on its listing cards. AgencySpotter does not consistently show ratings. This is expected behaviour, not a scraping error.
- **Fewer agencies than expected from a source** — Each directory's default sort order shows featured and sponsored listings first. If the chosen service category has fewer agencies than `maxAgenciesPerSource`, the actor returns all available listings and stops paginating. This is normal.
- **Run timing out on large batches** — Playwright-based scraping with residential proxies is slower than HTTP scraping. Runs collecting 500 agencies per source may take 60–90 minutes. Increase the actor's timeout in the run settings if needed, or split the work across multiple smaller runs.
- **Duplicate agencies in merged datasets** — Deduplication operates within a single run by domain. If you merge datasets from multiple runs, duplicates will appear. Filter by `domain` in your downstream tooling to deduplicate across runs.
Responsible use
- This actor accesses only publicly available agency listing data from directories whose core business model is based on public discovery of agency firms.
- Respect the terms of service of each directory. Do not use this actor to systematically republish directory content or create a competing agency database.
- When using scraped agency data for outreach, comply with CAN-SPAM, GDPR, and all other applicable data protection regulations in your jurisdiction.
- Do not use extracted data for spam, harassment, or any unsolicited commercial contact that violates applicable law.
- For guidance on web scraping legality, see Apify's guide.
FAQ
How many agencies can I scrape from agency directories in one run?
Up to 500 agencies per source directory across four sources — giving a maximum of approximately 2,000 deduplicated agency records per run. In practice, most runs targeting a specific service category return fewer, because not every directory has 500 listings for every niche.
Which agency directories does this actor support?
The actor supports Clutch, Sortlist, AgencySpotter, and DesignRush. You can scrape all four simultaneously or any subset by setting the sources input parameter.
Does agency directory scraping work without a proxy?
No. All four directories block datacenter IP addresses. You must use residential proxies — the default `proxyConfiguration` in the input is already set to Apify's RESIDENTIAL proxy group. Removing it will almost certainly produce a zero-result run.
What service category slugs can I use for agency directory scraping?
Common slugs include digital-marketing, seo, ppc, web-design, web-development, social-media-marketing, content-marketing, branding, video-production, and email-marketing. The slug maps to the URL path used on each directory. Not all slugs exist on all four directories — unrecognised slugs produce zero results for that source without breaking the run.
How long does a typical agency directory scraping run take?
A standard run with 4 sources at 50 agencies each takes 15–25 minutes. Runs at the maximum of 500 agencies per source may take 60–90 minutes due to Playwright browser overhead and residential proxy latency.
How accurate is the extracted agency data?
Agency names, websites, and service tags are reliably extracted from listing cards. Ratings and review counts are present on Clutch, Sortlist, and DesignRush but not consistently on AgencySpotter. Minimum project size is available only from Clutch and DesignRush.
How is this actor different from scraping Clutch directly with a generic scraper?
This actor handles all four directories with a single run and a single input configuration. It includes directory-specific selectors for each site's HTML structure, domain-based deduplication across sources, parsed and normalised numeric fields, and pre-configured residential proxy integration. A generic scraper would require you to build and maintain all of that separately for each directory.
Can I filter agency directory results by location?
You can pass a location string in the input and the actor will attempt to apply it to each directory's URL. However, the four directories handle location filtering inconsistently in their URL structures. For the most reliable results, omit the location filter and filter the output dataset by the location field after the run completes.
Can I schedule this actor to run periodically?
Yes. Apify's scheduler lets you run the actor on any cron schedule — daily, weekly, or monthly. Each run produces a fresh dataset. Use the Apify API or a Make/Zapier integration to merge new results into your CRM while deduplicating by domain.
Is it legal to scrape agency directories like Clutch and DesignRush?
These directories publish agency information publicly as their core business model — the data is intentionally available for anyone to browse. Scraping publicly visible business information for research and prospecting is generally lawful in most jurisdictions. Review each directory's terms of service before large-scale use. For a detailed analysis of web scraping legality, see Apify's guide.
Can I use the output with other Apify actors to get contact emails?
Yes. Feed the domain field from this actor into Website Contact Scraper to extract emails and phone numbers, or into Waterfall Contact Enrichment for a broader multi-step enrichment pipeline. The domain field is structured specifically to serve as input for downstream contact extraction actors.
What happens if a directory changes its HTML structure and breaks extraction?
The selectors use broad CSS class-name substring matching (e.g. [class*="provider-row"]) to tolerate minor HTML changes. A full directory redesign may break extraction for that source. If a source starts returning zero results without a proxy issue, open an issue in the Issues tab with your run ID so the selectors can be updated.
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.