DEVELOPER TOOLS

Schema Registry

Central registry for actor output schemas. Store, version, and share dataset schemas across your actor portfolio. Detect schema drift between builds.


Maintenance Pulse

Score: 90/100
Last build: today
Last version: 1d ago
Builds (30d): 8
Issue response: N/A


Pricing

Pay Per Event model. You only pay for what you use.

Event | Description | Price
--- | --- | ---
schema-registered | Charged per schema registration. | $0.25

Example: 100 events = $25.00 · 1,000 events = $250.00

Documentation

Actor output schema registry for Apify lets you search and browse every dataset field across your entire actor fleet in one run. Instead of opening each actor's build page to check what it outputs, this tool builds a cross-fleet index and answers "which of my actors produce an email field?" in seconds.

Built for Apify developers who manage multiple actors, this tool scans up to 500 actors, indexes every field from each actor's dataset schema definition, and returns either a full registry or targeted search results. It requires no browser, no proxies, and no manual clicking — just a single API-driven scan of your account's latest builds.

What data can you extract?

Data Point | Source | Example
--- | --- | ---
📊 Total actors scanned | Account API scan | 52 actors in account
Actors with schemas | Build schema detection | 34 actors have dataset definitions
🔢 Unique field count | Cross-fleet field index | 412 unique field names indexed
🔍 Search match count | Field name substring match | 6 matches for "email"
📋 Field name | Actor build definition | email, contactEmail, emailAddress
🏷️ Field type | Schema type annotation | string, number, boolean, array
📝 Field description | Schema description text | "Primary contact email address"
🎭 Actor full name | Account actor list | ryanclinton/website-contact-scraper
🔑 Actor ID | Account actor list | abc123XYZ789
📁 Category filter applied | Input parameter | LEAD_GENERATION
🕐 Registry timestamp | Run completion | 2026-03-20T14:32:00.000Z

Why use Actor Schema Registry?

Managing more than a handful of actors means losing track of what each one outputs. Finding a specific field means opening the Apify console, navigating to each actor, locating the build, and checking the schema definition manually, one actor at a time. At 30-60 seconds per actor, a 50-actor fleet takes 25-50 minutes, and you still might miss variants like contactEmail vs emailAddress.

This actor automates the entire process: one run scans your full fleet, builds a searchable index, and tells you exactly which actors output the field you need — along with the field type and actor ID for immediate API use.

  • Scheduling — run weekly to keep your schema index fresh as you build and update actors
  • API access — trigger registry scans from Python, JavaScript, or any HTTP client to power internal tooling
  • Monitoring — get Slack or email alerts when runs fail or the schema count changes unexpectedly
  • Integrations — pipe registry output into Google Sheets, Notion, or webhooks to maintain living documentation
  • Low cost — $0.20 per scan covers your entire fleet with no per-actor charges

Features

  • Fleet-wide schema indexing — calls the Apify Builds API for every actor in your account, reads actorDefinition.storages.dataset.fields from each latest build, and assembles a unified index in a single run
  • Field name search — case-insensitive substring matching across all indexed field names, so searching email matches email, contactEmail, emailAddress, and any other variant
  • Sorted search results — matching fields ranked by the number of actors that contain them, so your most-used fields appear first
  • Full browse mode — omit searchField to get a complete per-actor registry showing every field name, type, and description for each actor that has a dataset schema defined
  • Category filtering — supply an Apify Store category string (e.g., LEAD_GENERATION, SEO_TOOLS) to narrow the scan to a subset of actors before indexing
  • Type reporting — extracts the JSON Schema type annotation for each field: string, number, boolean, array, object, or unknown when the type is not declared
  • Timeout protection — all API calls use AbortSignal.timeout(30000), so a slow or unresponsive build endpoint never hangs the run indefinitely
  • Error resilience — actors with missing builds, failed build fetches, or no dataset schema are silently skipped rather than causing run failure
  • Minimal footprint — runs in 128MB memory, makes only Apify API calls (no external HTTP), and produces a single compact report object
  • Pay-per-event pricing — charges only on successful schema-search completion; no charge if the run errors before producing output

Use cases for actor output schema registry

Actor pipeline design

When you need to chain actors together — for example, feeding one actor's output into another's input — you need to know the exact field names both actors use. The schema registry answers this immediately: search for the output field name from actor A, confirm it matches the expected input field of actor B, and build your pipeline with confidence. Combine with Pipeline Builder to automate the entire workflow design step.

Schema documentation and auditing

Growing actor portfolios quickly become undocumented. This actor generates a structured inventory of every dataset field across your fleet, usable as living documentation for your team. Run it weekly on a schedule and push the output to a Google Sheet to maintain an always-current field reference without any manual work.

Data integration planning

Before connecting an actor's output to a downstream system — HubSpot, a data warehouse, a Zapier workflow — you need to know the exact field names and types the actor produces. The schema registry gives you this in a structured JSON format you can parse programmatically rather than reading through each actor's README.

Actor quality audit

Use the full browse mode to spot actors in your fleet that have no dataset schema defined. Actors without schemas produce undocumented output that is harder to integrate and harder to validate. The gap between actorsWithSchema and totalActors shows how many actors lack a schema, and any actor missing from the registry array is one that needs schema annotations added. Pair with Schema Validator to enforce field contracts on those actors.

Finding overlapping outputs across actors

When multiple actors in your fleet potentially produce similar data (e.g., several lead generation actors all extracting email addresses), searching for email shows every actor that outputs that field, the exact field name variant each uses, and the actor IDs. This is the fastest way to identify redundancy, inconsistent naming, or opportunities to standardize output fields across your portfolio.
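
One way to act on such a search result is to regroup it by actor, so that naming inconsistencies are visible at a glance. The sketch below is illustrative only: variants_by_actor is a hypothetical helper, and the sample data (actor names and IDs included) is made up to mimic the searchResults shape shown in the output example.

```python
def variants_by_actor(search_results):
    """Map each actor name to the matching field-name variants it declares."""
    mapping = {}
    for result in search_results:
        for hit in result["foundIn"]:
            mapping.setdefault(hit["actor"], []).append(result["field"])
    return mapping


# Hypothetical sample in the shape of the report's searchResults array
sample_results = [
    {"field": "email", "type": "string", "foundIn": [
        {"actor": "acme/contact-scraper", "actorId": "id1"},
        {"actor": "acme/lead-qualifier", "actorId": "id2"},
    ]},
    {"field": "emailAddress", "type": "string", "foundIn": [
        {"actor": "acme/lead-qualifier", "actorId": "id2"},
    ]},
]

for actor, fields in variants_by_actor(sample_results).items():
    print(actor, fields)
```

An actor that shows up with more than one variant, or a fleet where each actor uses a different variant, is a candidate for standardizing field names.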

Developer onboarding

When a new developer joins your team and needs to understand which actors produce which data, the schema registry gives them a complete field inventory in one run instead of requiring them to read through dozens of READMEs and build pages.

How to search actor output schemas

  1. Go to the actor input panel — navigate to the Actor Schema Registry page on Apify and open the input panel. No configuration is required to get started.
  2. Enter a field name to search — type a field name like email, price, or rating in the Search Field box. Leave it blank to get the full registry of all actors and their fields. Optionally add a category like LEAD_GENERATION to narrow the scan.
  3. Click Start and wait — the actor scans all actors in your account, fetches their latest build schemas, and builds the index. A typical run covering 50 actors completes in under 60 seconds.
  4. Download results — open the Dataset tab, download as JSON or CSV, or read the single report object directly from the run output.

Input parameters

Parameter | Type | Required | Default | Description
--- | --- | --- | --- | ---
searchField | string | No | (none) | Field name to search for across all actor dataset schemas. Case-insensitive substring match: email matches email, emailAddress, contactEmail. Omit to get the full registry.
category | string | No | (none) | Apify Store category to filter actors before scanning. Examples: LEAD_GENERATION, SEO_TOOLS, SOCIAL_MEDIA. Omit to scan all actors.

Input examples

Search for all actors that output an email field:

{
  "searchField": "email"
}

Browse the full schema registry for lead generation actors only:

{
  "category": "LEAD_GENERATION"
}

Search within a specific category:

{
  "searchField": "price",
  "category": "ECOMMERCE"
}

Input tips

  • Omit both fields for a full inventory — running with no inputs returns the complete registry of every actor with a dataset schema, which is the best starting point for documentation or auditing.
  • Use short substrings for broader matches — searching url will match url, pageUrl, sourceUrl, profileUrl, giving you the full picture of URL-type fields across your fleet.
  • Use category filtering on large accounts — if you have 200+ actors, adding a category cuts the number of build fetches and speeds up the run noticeably.
  • Pipe results into Schema Diff — once you find two actors that both output an email field, pass their IDs to Schema Diff to see how their complete schemas compare.

Output example

{
  "totalActors": 52,
  "actorsWithSchema": 34,
  "totalFields": 412,
  "searchField": "email",
  "searchResults": [
    {
      "field": "email",
      "type": "string",
      "foundIn": [
        { "actor": "ryanclinton/website-contact-scraper", "actorId": "tF3mNxKpWqR8vBzL" },
        { "actor": "ryanclinton/google-maps-email-extractor", "actorId": "gM9cYjSdR2xKpNwV" },
        { "actor": "ryanclinton/event-lead-extractor", "actorId": "hQ5rXbLnD4tPmCwZ" },
        { "actor": "ryanclinton/b2b-lead-qualifier", "actorId": "kR7sZdMqN3yJvBxT" }
      ]
    },
    {
      "field": "emailAddress",
      "type": "string",
      "foundIn": [
        { "actor": "ryanclinton/waterfall-contact-enrichment", "actorId": "pN2wVxLkR6cDtYqB" },
        { "actor": "ryanclinton/email-pattern-finder", "actorId": "mJ4cBsXrT8nKpLwV" }
      ]
    },
    {
      "field": "contactEmail",
      "type": "string",
      "foundIn": [
        { "actor": "ryanclinton/company-deep-research", "actorId": "vL6nTrKxB9mQsYdP" }
      ]
    }
  ],
  "registry": null,
  "registryAt": "2026-03-20T14:32:17.841Z"
}

Output fields

Field | Type | Description
--- | --- | ---
totalActors | number | Count of actors scanned (after category filter applied)
actorsWithSchema | number | Count of actors that had a readable dataset schema in their latest build
totalFields | number | Count of unique field names found across all schemas
searchField | string or null | The search term supplied in input, or null if browse mode was used
searchResults | array or undefined | Present when searchField was provided. Array of matching field objects sorted by occurrence count descending.
searchResults[].field | string | Exact field name as declared in the actor's schema
searchResults[].type | string | JSON Schema type of the field (string, number, boolean, array, object, unknown)
searchResults[].foundIn | array | List of actors that contain this field
searchResults[].foundIn[].actor | string | Full actor name in username/actorname format
searchResults[].foundIn[].actorId | string | Apify actor ID, usable directly in API calls
registry | array or undefined | Present when no searchField was provided. Per-actor listing of all fields.
registry[].actorName | string | Full actor name in username/actorname format
registry[].actorId | string | Apify actor ID
registry[].fields | array | All dataset fields declared in the actor's schema
registry[].fields[].name | string | Field name
registry[].fields[].type | string | JSON Schema type
registry[].fields[].description | string or undefined | Field description if declared in the schema
registryAt | string | ISO 8601 timestamp of when the registry was built

How much does it cost to search actor output schemas?

Actor Schema Registry uses pay-per-event pricing — you pay $0.20 per schema search. Platform compute costs are included. The actor charges once per successful run, regardless of how many actors it scans.

Scenario | Runs | Cost per run | Total cost
--- | --- | --- | ---
Quick test | 1 | $0.20 | $0.20
Daily schema search | 7 | $0.20 | $1.40
Weekly documentation refresh | 4 | $0.20 | $0.80/month
Team of 5 developers | 20 | $0.20 | $4.00/month
Automated CI pipeline checks | 100 | $0.20 | $20.00/month

You can set a maximum spending limit per run to control costs. The actor stops when your budget is reached.

There is no comparable self-service tool for schema discovery across Apify actor fleets. The alternative is manual — opening each actor's build page in the console — which takes 30-60 seconds per actor. For a 50-actor fleet, that is 25-50 minutes of manual work replaced by a $0.20 run.
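
The arithmetic behind these figures is simple enough to reproduce for your own fleet size. The sketch below assumes the 30-60 seconds per actor and the $0.20 per-run price quoted in this section; manual_minutes and total_cost are hypothetical helper names, and the price is a parameter so you can substitute whatever rate your account is actually charged.

```python
def manual_minutes(actor_count, seconds_low=30, seconds_high=60):
    """Return the (low, high) manual inspection time in minutes."""
    return (actor_count * seconds_low / 60, actor_count * seconds_high / 60)


def total_cost(runs, price_per_run=0.20):
    """Total charge in dollars for a number of successful runs."""
    return round(runs * price_per_run, 2)


print(manual_minutes(50))  # manual effort for a 50-actor fleet, in minutes
print(total_cost(100))     # cost of 100 runs at the assumed price
```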

Search actor output schemas using the API

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/actor-schema-registry").call(run_input={
    "searchField": "email",
    "category": "LEAD_GENERATION"
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"Scanned {item['totalActors']} actors, found schema in {item['actorsWithSchema']}")
    if item.get("searchResults"):
        for result in item["searchResults"]:
            actors = ", ".join(r["actor"] for r in result["foundIn"])
            print(f"  Field '{result['field']}' ({result['type']}) found in: {actors}")

JavaScript

import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/actor-schema-registry").call({
    searchField: "email",
    category: "LEAD_GENERATION"
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    console.log(`Scanned ${item.totalActors} actors, ${item.actorsWithSchema} with schemas`);
    if (item.searchResults) {
        for (const result of item.searchResults) {
            const actors = result.foundIn.map(r => r.actor).join(", ");
            console.log(`  '${result.field}' (${result.type}) → ${actors}`);
        }
    }
}

cURL

# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~actor-schema-registry/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchField": "email", "category": "LEAD_GENERATION"}'

# Fetch results once the run has finished (replace DATASET_ID with defaultDatasetId from the run response above)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"

How Actor Schema Registry works

Phase 1: Actor fleet enumeration

The actor calls GET /v2/acts?token=...&limit=500&my=true to retrieve all actors owned by the token's account, up to 500 at a time. If a category filter was supplied, the list is filtered in memory against each actor's categories array before any build fetches begin. This means the category filter happens client-side against the actor metadata, not as an API-level query parameter.
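
The in-memory category filter can be pictured roughly as follows. This is a sketch under stated assumptions, not the actor's source: filter_by_category is a hypothetical name, the sample actors are invented, and the exact-match comparison reflects the case-sensitive behaviour described in the Troubleshooting section.

```python
from typing import Optional


def filter_by_category(actors: list, category: Optional[str]) -> list:
    """Keep actors whose `categories` list contains `category` (exact match)."""
    if not category:
        return actors  # no filter supplied: scan everything
    return [a for a in actors if category in (a.get("categories") or [])]


# Hypothetical items in the shape of the actor list response
sample_actors = [
    {"id": "a1", "name": "contact-scraper", "categories": ["LEAD_GENERATION"]},
    {"id": "a2", "name": "seo-audit", "categories": ["SEO_TOOLS"]},
    {"id": "a3", "name": "untagged"},  # no categories array at all
]

print([a["id"] for a in filter_by_category(sample_actors, "LEAD_GENERATION")])
```

Because the filter runs on the already-fetched list, it saves build fetches in the next phase but not the initial list call.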

Phase 2: Build schema extraction

For each actor in the filtered list, the code reads actor.taggedBuilds?.latest?.buildId from the actor metadata. If no latest build exists (the actor has never been built), it is skipped. For actors with a build ID, the actor calls GET /v2/acts/{actorId}/builds/{buildId} and reads the data.actorDefinition.storages.dataset.fields object from the response. This object contains the actor's declared output schema. Each field entry is an object with optional type, title, and description properties following JSON Schema conventions. All fetch calls use AbortSignal.timeout(30000) to enforce a 30-second timeout per request. Any build fetch that fails or times out is silently skipped — the run continues with remaining actors.
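
As a rough illustration of the extraction step, the sketch below walks a build response down to data.actorDefinition.storages.dataset.fields and normalizes each entry. It operates on an already-parsed JSON payload rather than making a network call; extract_fields is a hypothetical helper, and the sample payload is invented to match the structure described above.

```python
def extract_fields(build_response):
    """Return a list of {name, type, description?} dicts, or [] if no schema."""
    node = build_response
    for key in ("data", "actorDefinition", "storages", "dataset", "fields"):
        if not isinstance(node, dict):
            return []  # a level is missing or not an object: treat as no schema
        node = node.get(key)
    if not isinstance(node, dict):
        return []

    fields = []
    for name, spec in node.items():
        spec = spec if isinstance(spec, dict) else {}
        entry = {"name": name, "type": spec.get("type", "unknown")}
        if "description" in spec:
            entry["description"] = spec["description"]
        fields.append(entry)
    return fields


# Hypothetical build payload
sample_build = {"data": {"actorDefinition": {"storages": {"dataset": {"fields": {
    "email": {"type": "string", "description": "Primary contact email address"},
    "score": {},  # no type declared
}}}}}}

print(extract_fields(sample_build))
```

Returning an empty list for any missing level mirrors the "silently skipped" behaviour: the caller can simply leave such actors out of the index.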

Phase 3: Index construction

Two data structures are built in parallel. The registry array accumulates one entry per actor that had a parseable schema, containing the actor name, actor ID, and the full array of field objects. The fieldIndex map inverts this structure: keys are field names, values are arrays of { actor, actorId, type } objects. This inverted index is what powers the fast field name search.
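
A minimal sketch of the two structures, assuming each actor's fields have already been extracted into a list of {name, type} dicts. The function name build_index and the sample tuples are illustrative, not taken from the actor's code.

```python
from collections import defaultdict


def build_index(actor_schemas):
    """Build the per-actor registry and the inverted field index in one pass.

    actor_schemas: iterable of (actor_name, actor_id, fields) tuples.
    """
    registry = []
    field_index = defaultdict(list)
    for actor_name, actor_id, fields in actor_schemas:
        if not fields:
            continue  # actors without a parseable schema are left out
        registry.append({"actorName": actor_name, "actorId": actor_id, "fields": fields})
        for field in fields:
            field_index[field["name"]].append(
                {"actor": actor_name, "actorId": actor_id, "type": field["type"]}
            )
    return registry, dict(field_index)


# Hypothetical input
sample = [
    ("acme/contact-scraper", "id1", [{"name": "email", "type": "string"},
                                     {"name": "url", "type": "string"}]),
    ("acme/lead-qualifier", "id2", [{"name": "email", "type": "string"}]),
    ("acme/no-schema", "id3", []),
]

registry, field_index = build_index(sample)
print(len(registry), sorted(field_index))
```

In this picture, actorsWithSchema corresponds to len(registry) and totalFields to len(field_index).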

Phase 4: Search or browse output

When searchField is provided, the code filters fieldIndex keys using a case-insensitive includes() check against the lowercased search term. Matching entries are mapped to search result objects and sorted descending by foundIn.length — fields present in more actors appear first. The final report object sets searchResults and leaves registry as undefined. In browse mode (no searchField), the report sets registry with the full per-actor listing and leaves searchResults as undefined. The report is pushed to the dataset as a single item via Actor.pushData().
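
The search step can be sketched as below. This is an approximation under assumptions: search_fields is a hypothetical name, the index shape follows the description in Phase 3, and taking the reported type from the first actor that declares the field is a guess, since the source does not say how conflicting types are resolved.

```python
def search_fields(field_index, search_field):
    """Case-insensitive substring search over field names, most-used first."""
    needle = search_field.lower()
    results = []
    for name, hits in field_index.items():
        if needle in name.lower():
            results.append({
                "field": name,
                "type": hits[0]["type"],  # assumption: first declaration wins
                "foundIn": [{"actor": h["actor"], "actorId": h["actorId"]} for h in hits],
            })
    # Fields present in more actors come first
    results.sort(key=lambda r: len(r["foundIn"]), reverse=True)
    return results


# Hypothetical inverted index
sample_index = {
    "contactEmail": [{"actor": "acme/research", "actorId": "id3", "type": "string"}],
    "email": [
        {"actor": "acme/contact-scraper", "actorId": "id1", "type": "string"},
        {"actor": "acme/lead-qualifier", "actorId": "id2", "type": "string"},
    ],
    "price": [{"actor": "acme/shop-scraper", "actorId": "id4", "type": "number"}],
}

print([r["field"] for r in search_fields(sample_index, "EMAIL")])
```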

Tips for best results

  1. Run in browse mode first. Start your first run with no inputs to get a complete inventory. This shows you both how many actors have schemas and which ones are missing definitions entirely.
  2. Use short search terms for discovery. Searching url returns all URL-variant fields across your fleet. Use longer terms like contactEmail when you know the exact field name you are looking for.
  3. Combine category filter with search. If you have actors across many categories, adding category: "ECOMMERCE" before searching for price eliminates noise from non-commerce actors and speeds up the run.
  4. Schedule weekly for documentation. A $0.20 weekly scheduled run keeps a living record of your fleet's output fields. Connect the output to a Google Sheet via Apify integrations to maintain auto-updating documentation.
  5. Use actor IDs from results directly. The actorId field in every result is the exact ID you need for Apify API calls — no additional lookup required. Copy it directly into your API requests or Pipeline Builder configurations.
  6. Run before pipeline design. Before building a multi-actor pipeline, run the schema registry to confirm which actors produce the fields you need and what those fields are typed as. This prevents integration failures caused by field name assumptions.
  7. Track actorsWithSchema vs totalActors as a quality metric. A large gap between these two numbers means many actors in your fleet have undocumented output. Use this as a signal to add storages.dataset definitions to your actor builds.
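
The quality metric from tip 7 can be computed directly from a report object. The helper below is a small sketch: schema_coverage is a hypothetical name, and the sample numbers are the ones from the output example in this document.

```python
def schema_coverage(report):
    """Return how many actors lack a schema and the share that have one."""
    total = report.get("totalActors", 0)
    with_schema = report.get("actorsWithSchema", 0)
    coverage = with_schema / total if total else 0.0
    return {"missing": total - with_schema, "coverage": round(coverage, 3)}


report = {"totalActors": 52, "actorsWithSchema": 34}
print(schema_coverage(report))  # {'missing': 18, 'coverage': 0.654}
```

Tracking the coverage value across scheduled runs gives a simple trend line for schema adoption.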

Combine with other Apify actors

Actor | How to combine
--- | ---
Schema Diff | Use Schema Registry to find all actors outputting a target field, then pass two actor IDs to Schema Diff to compare their complete dataset schemas side by side.
Pipeline Builder | Registry identifies which actors produce the output fields you need; Pipeline Builder chains those actors into an automated multi-step workflow.
Schema Validator | Registry surfaces actors with missing schemas; Schema Validator tests whether an actor's actual output matches its declared schema definition.
Actor Quality Audit | Run Schema Registry to find actors without dataset schemas, then feed those actor IDs into Quality Audit for a full compliance review.
B2B Lead Gen Suite | Verify that all lead generation actors in your fleet output consistent field names (e.g., email vs emailAddress) before connecting them to a shared CRM pipeline.
Website Contact Scraper | Use Schema Registry to confirm the exact field names this actor outputs before writing downstream processing code.
Waterfall Contact Enrichment | Search the registry for email variants to identify which enrichment actors produce which field names, ensuring consistent field mapping in your enrichment pipeline.

Limitations

  • Maximum 500 actors per scan. The Apify actors list API is called with limit=500. Accounts with more than 500 actors will only have the first 500 scanned.
  • Only actors with a latest build are scanned. Actors that have been created but never built are skipped silently. Build the actor at least once to make its schema discoverable.
  • Only dataset schemas are indexed. The actor reads actorDefinition.storages.dataset.fields only. Key-value store schemas, request queue schemas, and other storage types are not indexed.
  • Schema must be declared in the build. If an actor outputs data but has not declared a storages.dataset block in its actor definition, it will appear in the totalActors count but not in actorsWithSchema. Many actors on the Apify Store omit formal schema declarations.
  • Field types depend on schema quality. If an actor's schema does not include a type annotation for a field, the type will be reported as unknown. This is a property of the source schema, not a bug in this actor.
  • Latest build only. The actor fetches only the latest tagged build per actor. Historical builds and their schemas are not indexed.
  • Category filtering is client-side. Category filtering happens after fetching the full actor list. It does not reduce the number of API calls to the actors list endpoint.
  • Actor must be owned by the token's account. This actor uses my=true on the actors list call, so it only scans actors you own. It cannot scan another user's actors or actors from a shared organization account unless the token belongs to that account.

Integrations

  • Zapier — trigger an actor schema search when a new actor is deployed, then post the field inventory to a Slack channel or Notion page
  • Make — schedule weekly schema registry runs and pipe the output into a Google Sheet to maintain a living schema documentation table
  • Google Sheets — export the full registry as a spreadsheet where each row is one actor-field combination, giving your team a searchable field reference
  • Apify API — call the actor programmatically from CI/CD pipelines to verify schema presence before deploying new actor versions
  • Webhooks — notify your team's Slack or Teams channel when a schema registry run completes or when the actorsWithSchema count drops below a threshold
  • LangChain / LlamaIndex — feed the schema registry output as structured context into an LLM workflow to let AI assistants answer questions about which actors in your fleet produce specific data types
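
For the spreadsheet-style export mentioned above, where each row is one actor-field combination, the browse-mode registry array needs to be flattened first. The following is a minimal sketch using only the standard library; registry_to_rows and rows_to_csv are hypothetical helpers, the column names are a choice made for this example, and the sample registry is invented.

```python
import csv
import io

COLUMNS = ["actorName", "actorId", "field", "type", "description"]


def registry_to_rows(registry):
    """Flatten the per-actor registry into one row per actor-field pair."""
    rows = []
    for entry in registry:
        for field in entry.get("fields", []):
            rows.append({
                "actorName": entry["actorName"],
                "actorId": entry["actorId"],
                "field": field["name"],
                "type": field.get("type", "unknown"),
                "description": field.get("description", ""),
            })
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical browse-mode registry
sample_registry = [
    {"actorName": "acme/contact-scraper", "actorId": "id1", "fields": [
        {"name": "email", "type": "string", "description": "Primary contact email address"},
        {"name": "url", "type": "string"},
    ]},
]

print(rows_to_csv(registry_to_rows(sample_registry)))
```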

Troubleshooting

  • actorsWithSchema is much lower than totalActors — This is expected for many Apify actor portfolios. Most actors do not declare a formal storages.dataset block in their actor definition. To fix this, add a storages.dataset.fields declaration to your actor's actor.json or .actor/actor.json and rebuild. After rebuilding, a fresh registry scan will include the newly declared fields.

  • Run returns 0 actors — Verify that the APIFY_TOKEN environment variable is available. When running via the Apify platform this is set automatically. If running locally, ensure the token is configured. Also confirm the token has access to the actors you expect — the scan uses my=true, which returns only actors owned by the token's account.

  • Category filter returns fewer actors than expected — Category values are case-sensitive strings matching the Apify Store categories exactly: LEAD_GENERATION, SEO_TOOLS, SOCIAL_MEDIA, ECOMMERCE, etc. Check the actor's category assignment in the Apify console if an actor you expect to appear is missing from filtered results.

  • Run times out on a large fleet — Each build fetch carries a 30-second timeout. For accounts with many actors and slow build API responses, total run time grows proportionally. Applying a category filter reduces the number of build fetches and cuts run time. If the issue persists, contact support with run sharing enabled (see below).

  • Search returns no results despite knowing a field exists — Confirm the actor has been built (a latest build must exist) and that its actor definition includes a storages.dataset.fields block. If the field is declared but the actor was built before the schema was added, rebuild the actor to update the stored build definition.

Responsible use

  • This actor only accesses actors and build data owned by the API token's account.
  • It does not access any public or third-party actor schemas without authorization.
  • Schema data belongs to the actor developer — treat extracted field definitions as internal intellectual property.
  • For guidance on Apify API usage limits and fair use, see Apify's documentation.

FAQ

How does Actor Schema Registry find actor output fields? It calls the Apify Builds API for each actor's latest build and reads the actorDefinition.storages.dataset.fields object. This is the formal dataset schema declaration, separate from the actor's README or output examples. Only actors that have this block defined in their build will appear in the actorsWithSchema count.

How many actors can the schema registry scan in one run? Up to 500 actors per run. The Apify actors list API is queried with limit=500&my=true. If your account has more than 500 actors, only the first 500 returned by the API will be scanned.

Does searching actor schemas require running the actors themselves? No. The registry reads static schema definitions from build metadata only. No actor runs are triggered, no scraping happens, and no external websites are accessed. The scan is purely API-driven against your Apify account data.

What is the difference between search mode and browse mode? When you provide a searchField, the actor returns only the fields that match your search term, sorted by how many actors contain them. When searchField is omitted, the actor returns the complete registry — one entry per actor, listing all of that actor's declared fields. Browse mode is better for auditing and documentation; search mode is better when you know what field you are looking for.

How accurate is the field type information? Type accuracy depends entirely on the quality of the source schema. Fields declared with a type property (e.g., "type": "string") are reported accurately. Fields with no type declaration are reported as unknown. The registry does not validate or infer types from actual actor output — it only reads what is declared in the schema definition.

How long does a schema registry run take? A 50-actor fleet typically completes in under 60 seconds. Each actor requires one build API call with a 30-second timeout. Total time scales roughly linearly with actor count. Applying a category filter reduces the number of actors processed and speeds up the run proportionally.

Can I use the schema registry to search another developer's actors? No. The actor uses my=true on the actors list call, which returns only actors owned by the API token's account. To search another developer's actors, you would need a token that belongs to their account.

How is the schema registry different from browsing the Apify console manually? The console requires you to open each actor individually, navigate to the build page, and inspect the actor definition. For a 50-actor fleet, at 30-60 seconds per actor, that is roughly 25-50 minutes of manual navigation. This actor completes the same task in under 60 seconds and returns structured, searchable JSON output.

Can I schedule the schema registry to run automatically? Yes. Use Apify's built-in scheduler to run this actor on any interval — daily, weekly, or monthly. The output can be connected to Google Sheets or webhooks to maintain automatically updated schema documentation.

What happens if an actor has no latest build? Actors with no taggedBuilds.latest.buildId are skipped silently. They are counted in totalActors but not in actorsWithSchema. Build the actor at least once to make its schema available for indexing.

Does the schema registry support searching by field type? Not currently. Search is by field name only (case-insensitive substring match). You can filter results by type after downloading the output if you need type-based filtering.
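
Filtering by type after download takes only a few lines. The sketch below assumes you have already loaded the report's searchResults array; filter_by_type is a hypothetical helper and the sample results are invented.

```python
def filter_by_type(search_results, wanted_type):
    """Keep only search results whose declared type equals wanted_type."""
    return [r for r in search_results if r.get("type") == wanted_type]


# Hypothetical searchResults from a search for "price"
sample_results = [
    {"field": "price", "type": "number", "foundIn": []},
    {"field": "priceText", "type": "string", "foundIn": []},
    {"field": "priceRange", "type": "unknown", "foundIn": []},
]

print([r["field"] for r in filter_by_type(sample_results, "number")])
```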

Is it legal to scan actor schemas using the API? Yes. You are accessing your own account's data using your own API token. The Apify API is designed for programmatic account access. Scanning your own actors' build metadata is explicitly supported and within normal API usage.

Help us improve

If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:

  1. Go to Account Settings > Privacy
  2. Enable Share runs with public Actor creators

This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.

Support

Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.

