Schema Diff

Compare schema versions between actor builds. Detects added, removed, and changed fields in input and output schemas. Helps prevent breaking changes.

Pricing

Pay Per Event model. You only pay for what you use.

Event | Description | Price
schema-diff | Charged per schema comparison. | $0.25

Example: 100 events = $25.00 · 1,000 events = $250.00

Documentation

Actor Schema Diff compares the dataset output schemas of any two Apify actors side by side, giving you a field-by-field compatibility report in seconds. Built for developers who maintain multi-actor pipelines, evaluate actor replacements, or need to merge outputs from two data sources without surprises. Swap actors confidently — know exactly what breaks before you touch any code.

The actor calls the Apify REST API directly against both actors' latest builds, extracts their dataset field definitions, and computes a compatibility score from 0 to 100. No actor runs are triggered, so no credits are spent on actual scraping. The entire comparison costs $0.20 per pair.

What data can you extract from an actor schema comparison?

Data Point | Source | Example
📊 Compatibility score | Field overlap ratio | 72%
Shared fields | Name + type match | url (string), emails (array)
⚠️ Type mismatches | Same field name, different types | phone: string vs array
🅰️ Fields unique to Actor A | Present in A, absent in B | domain, socialLinks, pagesScraped
🅱️ Fields unique to Actor B | Present in B, absent in A | businessName, rating, address
📝 Migration notes | Auto-generated guidance | 'phone' needs transformation: string → array
🔢 Field counts | Per-actor totals | Actor A: 14 fields, Actor B: 9 fields
🕐 Comparison timestamp | ISO 8601 | 2026-03-20T14:32:00.000Z

Why use Actor Schema Diff to compare Apify actor outputs?

Integrating two actors into a pipeline means trusting their outputs fit together. Without a schema comparison tool, you discover mismatches at runtime — a field you expected as a string arrives as an array, a field your downstream code depends on simply does not exist in the replacement actor, or a migration sends data to the wrong place. Debugging those failures takes hours.

This actor automates the entire comparison in under 10 seconds. Paste in two actor IDs, get back a structured compatibility report with a numeric score, a categorised field breakdown, and migration notes written in plain English.

  • Scheduling — run Schema Diff on a daily schedule to detect when an upstream actor's schema changes under you
  • API access — trigger comparisons from Python, JavaScript, or any HTTP client before deploying pipeline changes
  • Proxy rotation — not needed; this actor calls the Apify API directly with your token
  • Monitoring — get Slack or email alerts when a comparison returns a compatibility score below your threshold
  • Integrations — connect to Zapier or Make to notify your team whenever a schema drift is detected

Features

  • Field-by-field comparison — inspects every field in both actors' latest-build dataset schemas; no sampling, no approximation
  • Four-category classification — fields are sorted into: shared (name and type match), type mismatches (same name, incompatible types), unique to Actor A, unique to Actor B
  • Compatibility score 0–100 — calculated as sharedFields / (totalUniqueFields) expressed as a percentage; 100 means the schemas are identical, 0 means no overlap
  • Auto-generated migration notes — for each type mismatch, a note of the form 'fieldName' needs transformation: typeA → typeB; for unique fields, a count of unmapped fields per side
  • Latest build resolution — fetches taggedBuilds.latest.buildId from the actor metadata endpoint, then reads actorDefinition.storages.dataset.fields from that build; always reflects the current published schema
  • Actor ID normalisation — accepts both username/actor-name and raw Apify actor IDs; converts / to ~ automatically for the REST API
  • No actor runs triggered — the comparison is pure metadata analysis; you are charged $0.20 per comparison, not per field or per actor run
  • Minimal footprint — runs in 128MB of memory; completes in under 30 seconds for any valid actor pair
  • Parallel fetching — both actor schemas are fetched simultaneously via Promise.all, halving round-trip time
  • Structured error handling — if either actor ID is invalid or inaccessible, a structured error record is pushed to the dataset rather than an unhandled crash

Use cases for Apify actor schema comparison

Pipeline compatibility validation before deployment

Developers building multi-step pipelines need to confirm that the output of Step N feeds correctly into Step N+1. Running Schema Diff before deploying a pipeline change catches field name mismatches and type incompatibilities before they cause data loss or downstream errors in production.

Actor replacement and migration planning

When an actor you depend on is deprecated or you find a better alternative, Schema Diff tells you exactly what changed. A 72% compatibility score with three migration notes tells you the swap is manageable; a 15% score with 20 unique fields tells you a full ETL rewrite is needed. Make that decision in 10 seconds, not after a failed migration.

Continuous schema drift detection

Actor schemas can change between builds without notice. Schedule Schema Diff to run daily against actor pairs in your live pipelines. If the compatibility score drops below a set threshold, trigger a Slack alert or webhook to notify your team before bad data reaches downstream systems.
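A scheduled check of this kind could be sketched as follows. This is only an illustration: the function name `check_drift` and the default threshold of 80 are assumptions, and the report keys are the ones listed under "Output fields".

```python
from typing import Optional


def check_drift(report: dict, threshold: int = 80) -> Optional[str]:
    """Return an alert message if the report signals drift, otherwise None.

    `threshold` is a hypothetical default; pick one that suits your pipeline.
    """
    # An error record means the comparison itself failed, which is worth an alert.
    if "error" in report:
        return f"Schema Diff failed: {report['error']}"

    score = report.get("compatibilityScore", 0)
    if score < threshold:
        mismatched = ", ".join(m["field"] for m in report.get("typeMismatches", []))
        return (
            f"Compatibility dropped to {score}% (threshold {threshold}%). "
            f"Type mismatches: {mismatched or 'none'}"
        )
    return None


# Example with a trimmed-down report
alert = check_drift({
    "compatibilityScore": 58,
    "typeMismatches": [{"field": "phone", "typeA": "string", "typeB": "array"}],
})
print(alert)
```

The returned message can be forwarded to whatever alerting channel the pipeline already uses; a `None` result means no action is needed.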

Data merging and output unification

When combining outputs from two actors that cover the same data domain — for example, two contact scrapers or two business data sources — Schema Diff identifies the exact overlap. Fields in sharedFields can be merged directly; fields in uniqueToA and uniqueToB tell you what new columns your merged schema needs.
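One way to turn a report into a merge plan is sketched below. The helper `plan_merge` and the key names of its return value are hypothetical; only the report fields (`sharedFields`, `typeMismatches`, `uniqueToA`, `uniqueToB`) come from the actor's output.

```python
def plan_merge(report: dict) -> dict:
    """Group report fields by how much work they need in a merged schema."""
    # Shared fields match in name and type, so they can be copied as-is.
    copy_directly = [f["field"] for f in report["sharedFields"]]
    # Mismatched fields exist on both sides but need a type conversion first.
    convert_first = [m["field"] for m in report["typeMismatches"]]
    # The merged schema needs one column per distinct field name.
    merged_columns = (
        copy_directly
        + convert_first
        + list(report["uniqueToA"])
        + list(report["uniqueToB"])
    )
    return {
        "copy_directly": copy_directly,
        "convert_first": convert_first,
        "merged_columns": merged_columns,
    }


plan = plan_merge({
    "sharedFields": [{"field": "url", "type": "string"}],
    "typeMismatches": [{"field": "phone", "typeA": "string", "typeB": "array"}],
    "uniqueToA": ["socialLinks"],
    "uniqueToB": ["rating"],
})
print(plan["merged_columns"])
```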

Evaluating third-party actors before adoption

Before committing to a third-party actor from the Apify Store, compare its output schema against the actor it will replace in your workflow. If the compatibility score is high, integration work is minimal. If it is low, the migration notes give you a pre-written checklist of transformation tasks.

Building actor selection tools and dashboards

Platform teams building internal tooling around the Apify Store can use Schema Diff programmatically to power actor comparison UIs, compatibility matrices, and pipeline health dashboards. The structured JSON output integrates directly into any analytics or observability stack.

How to compare Apify actor output schemas

  1. Get the actor IDs: find the two actors you want to compare. You can use either the raw Apify actor ID or the username/actor-name format from the Store URL (for example, ryanclinton/website-contact-scraper).
  2. Paste both IDs into the input — enter the first actor in the "Actor A" field and the second in the "Actor B" field. No other configuration is needed.
  3. Run the actor — click "Start". The comparison completes in under 30 seconds. No credit is spent on running the source actors themselves.
  4. Download the report — open the Dataset tab to see the full JSON comparison. Export as JSON or CSV to use in your own tooling.

Input parameters

Parameter | Type | Required | Default | Description
actorIdA | string | Yes | ryanclinton/fred-economic-data | First actor to compare. Accepts username/actor-name or a raw Apify actor ID.
actorIdB | string | Yes | ryanclinton/usgs-earthquake-search | Second actor to compare. Accepts username/actor-name or a raw Apify actor ID.

Input examples

Compare two contact scrapers before swapping one for the other:

{
    "actorIdA": "ryanclinton/website-contact-scraper",
    "actorIdB": "ryanclinton/google-maps-email-extractor"
}

Compare a third-party actor against your current actor before adoption:

{
    "actorIdA": "ryanclinton/email-pattern-finder",
    "actorIdB": "apify/web-scraper"
}

Compare two actors using raw Apify actor IDs:

{
    "actorIdA": "s2FM5uFH9uFH9F2eZ4pZ9~actor-schema-diff",
    "actorIdB": "s2FM5uFH9uFH9F2eZ4pZ9~actor-schema-validator"
}

Input tips

  • Use username/actor-name format — it is easier to read and debug than a raw ID; the actor normalises it automatically
  • Compare against yourself — running a diff between two versions of your own actor (if you have a staging variant) is the fastest way to validate a schema change before publishing
  • Both actors must be public or accessible with your token — private actors belonging to other users will return a fetch error

Output example

{
    "actorA": "ryanclinton/website-contact-scraper",
    "actorB": "ryanclinton/google-maps-email-extractor",
    "actorIdA": "ryanclinton/website-contact-scraper",
    "actorIdB": "ryanclinton/google-maps-email-extractor",
    "fieldsInA": 14,
    "fieldsInB": 11,
    "compatibilityScore": 58,
    "sharedFields": [
        { "field": "url", "type": "string" },
        { "field": "emails", "type": "array" },
        { "field": "phones", "type": "array" },
        { "field": "domain", "type": "string" }
    ],
    "typeMismatches": [
        { "field": "phone", "typeA": "string", "typeB": "array" },
        { "field": "address", "typeA": "object", "typeB": "string" }
    ],
    "uniqueToA": [
        "socialLinks",
        "pagesScraped",
        "linkedInUrl",
        "twitterUrl",
        "facebookUrl",
        "contactForms"
    ],
    "uniqueToB": [
        "businessName",
        "rating",
        "reviewCount",
        "category",
        "googleMapsUrl"
    ],
    "migrationNotes": [
        "Field 'phone' needs transformation: string -> array",
        "Field 'address' needs transformation: object -> string",
        "6 fields in ryanclinton/website-contact-scraper have no equivalent in ryanclinton/google-maps-email-extractor",
        "5 fields in ryanclinton/google-maps-email-extractor are new (not in ryanclinton/website-contact-scraper)"
    ],
    "comparedAt": "2026-03-20T14:32:05.812Z"
}

Output fields

Field | Type | Description
actorA | string | Resolved display name of the first actor (username/name)
actorB | string | Resolved display name of the second actor (username/name)
actorIdA | string | Actor A ID as provided in the input
actorIdB | string | Actor B ID as provided in the input
fieldsInA | number | Total number of dataset fields defined in Actor A's latest build
fieldsInB | number | Total number of dataset fields defined in Actor B's latest build
compatibilityScore | number | Percentage of fields shared between both schemas (0–100)
sharedFields | array | Fields with matching name and type in both actors
sharedFields[].field | string | Field name
sharedFields[].type | string | Field type (e.g. string, array, object, number)
typeMismatches | array | Fields with the same name but different types
typeMismatches[].field | string | Field name
typeMismatches[].typeA | string | Field type in Actor A
typeMismatches[].typeB | string | Field type in Actor B
uniqueToA | array | Field names present in Actor A but absent in Actor B
uniqueToB | array | Field names present in Actor B but absent in Actor A
migrationNotes | array | Plain-English notes describing required transformations
comparedAt | string | ISO 8601 timestamp of when the comparison ran
error | string | Present only when one or both actors could not be fetched
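Because `error` appears only on failed comparisons, consumer code should probably check for it before reading the other fields. A minimal sketch, assuming the field names in the table above (the `summarise` helper itself is hypothetical):

```python
def summarise(report: dict) -> str:
    """Produce a one-line summary of a Schema Diff record, handling error records."""
    # Error records may not carry the comparison fields, so check this first.
    if "error" in report:
        return f"Comparison failed: {report['error']}"

    return (
        f"{report['actorA']} vs {report['actorB']}: "
        f"{report['compatibilityScore']}% compatible, "
        f"{len(report['sharedFields'])} shared, "
        f"{len(report['typeMismatches'])} mismatched, "
        f"{len(report['uniqueToA'])} only in A, "
        f"{len(report['uniqueToB'])} only in B"
    )


print(summarise({"error": "Failed to fetch one or both actors"}))
```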

How much does it cost to compare Apify actor schemas?

Actor Schema Diff uses pay-per-event pricing — you pay $0.20 per comparison. Platform compute costs are included. No actor runs are triggered, so there are no additional scraping charges.

Scenario | Comparisons | Cost per comparison | Total cost
Quick test | 1 | $0.20 | $0.20
Small batch | 10 | $0.20 | $2.00
Medium batch | 50 | $0.20 | $10.00
Large batch | 200 | $0.20 | $40.00
Enterprise | 1,000 | $0.20 | $200.00

You can set a maximum spending limit per run to control costs. The actor stops when your budget is reached.

Apify's free tier includes $5 of monthly platform credits — enough for 25 comparisons at no cost. Compare this to building and maintaining your own schema comparison tooling, which requires custom API integration code, build pipeline parsing logic, and ongoing maintenance whenever the Apify API changes.
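The arithmetic behind these figures can be reproduced with a short sketch. The function names are hypothetical, and the per-comparison price is passed as a parameter (defaulting to the $0.20 quoted in this section) so it can be adjusted if the listed price differs. Amounts are kept in integer cents to avoid floating-point rounding.

```python
def total_cost_cents(comparisons: int, price_cents: int = 20) -> int:
    """Total charge in cents for a number of comparisons."""
    return comparisons * price_cents


def comparisons_within_budget(budget_cents: int, price_cents: int = 20) -> int:
    """How many whole comparisons a budget covers."""
    return budget_cents // price_cents


# $5.00 of credits at $0.20 per comparison
print(comparisons_within_budget(500))        # 25
# 200 comparisons, shown in dollars
print(total_cost_cents(200) / 100)           # 40.0
```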

Compare Apify actor schemas using the API

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/actor-schema-diff").call(run_input={
    "actorIdA": "ryanclinton/website-contact-scraper",
    "actorIdB": "ryanclinton/google-maps-email-extractor"
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"Compatibility: {item['compatibilityScore']}%")
    print(f"Shared fields: {len(item['sharedFields'])}")
    print(f"Type mismatches: {len(item['typeMismatches'])}")
    for note in item.get("migrationNotes", []):
        print(f"  - {note}")

JavaScript

import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/actor-schema-diff").call({
    actorIdA: "ryanclinton/website-contact-scraper",
    actorIdB: "ryanclinton/google-maps-email-extractor"
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    console.log(`Compatibility score: ${item.compatibilityScore}%`);
    console.log(`Shared: ${item.sharedFields.length} fields`);
    console.log(`Type mismatches: ${item.typeMismatches.length}`);
    item.migrationNotes?.forEach(note => console.log(`  ${note}`));
}

cURL

# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~actor-schema-diff/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"actorIdA": "ryanclinton/website-contact-scraper", "actorIdB": "ryanclinton/google-maps-email-extractor"}'

# Fetch results (replace DATASET_ID from the run response above)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"

How Actor Schema Diff works

Phase 1 — Actor metadata resolution

The actor calls GET /v2/acts/{actorId}?token=... for both actors simultaneously via Promise.all. The username/actor-name format is normalised to username~actor-name before the request. From the actor metadata response, the actor extracts the name, username, and taggedBuilds.latest.buildId fields. If either actor is unreachable or the ID is invalid, a structured error record is pushed to the dataset and the run exits cleanly.
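The normalisation step can be illustrated with a minimal sketch. The helper names are hypothetical and this is not the actor's own code (which is JavaScript); it only shows the `/` to `~` conversion and the resulting metadata URL, with the token left out.

```python
API_BASE = "https://api.apify.com/v2/acts"


def normalise_actor_id(actor_id: str) -> str:
    """Convert 'username/actor-name' to 'username~actor-name'.

    IDs without a slash (raw IDs, or IDs already using '~') pass through unchanged.
    """
    return actor_id.strip().replace("/", "~", 1)


def actor_metadata_url(actor_id: str) -> str:
    """Build the actor metadata endpoint URL (token supplied separately)."""
    return f"{API_BASE}/{normalise_actor_id(actor_id)}"


print(actor_metadata_url("ryanclinton/website-contact-scraper"))
# https://api.apify.com/v2/acts/ryanclinton~website-contact-scraper
```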

Phase 2 — Schema extraction from latest build

For each actor, the actor calls GET /v2/acts/{actorId}/builds/{buildId}?token=... using the resolved build ID. The dataset schema is read from buildData.actorDefinition.storages.dataset.fields — the canonical location where Apify stores field definitions for each build. If a build exists but has no dataset schema defined, an empty fields object is returned rather than an error, allowing the comparison to proceed and surface a 0% compatibility score.
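The "empty object instead of an error" behaviour could look roughly like the sketch below. It assumes `build_data` is the build object as a plain dictionary; the function name is hypothetical, and only the key path `actorDefinition.storages.dataset.fields` is taken from the description above.

```python
from typing import Any


def extract_dataset_fields(build_data: dict[str, Any]) -> dict[str, Any]:
    """Walk actorDefinition.storages.dataset.fields, returning {} if any level is missing."""
    node: Any = build_data
    for key in ("actorDefinition", "storages", "dataset", "fields"):
        # Stop as soon as the path breaks (missing key, null value, wrong type).
        if not isinstance(node, dict):
            return {}
        node = node.get(key)
    return node if isinstance(node, dict) else {}


with_schema = {"actorDefinition": {"storages": {"dataset": {"fields": {"url": {"type": "string"}}}}}}
without_schema = {"actorDefinition": {"storages": None}}

print(extract_dataset_fields(with_schema))     # {'url': {'type': 'string'}}
print(extract_dataset_fields(without_schema))  # {}
```

Returning an empty mapping keeps the later classification step simple: an actor without a declared schema just contributes zero fields.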

Phase 3 — Field classification

Fields from both schemas are loaded into JavaScript Set objects for O(1) membership testing. A single pass through Actor A's fields classifies each as: shared (present in B with the same type), a type mismatch (present in B with a different type), or unique to A. A second pass through Actor B's fields collects anything not already seen in A as unique to B.
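The two-pass classification can be sketched as follows. This is an illustration under assumptions, not the actor's source: it takes each schema as a simple mapping of field name to type string and uses dictionary lookups in place of JavaScript Set objects (both give constant-time membership tests).

```python
def classify_fields(fields_a: dict[str, str], fields_b: dict[str, str]) -> dict:
    """Sort fields into shared, type mismatches, unique to A, and unique to B."""
    shared, mismatches, unique_a = [], [], []

    # Pass 1: every field of A is shared, mismatched, or unique to A.
    for name, type_a in fields_a.items():
        if name not in fields_b:
            unique_a.append(name)
        elif fields_b[name] == type_a:
            shared.append({"field": name, "type": type_a})
        else:
            mismatches.append({"field": name, "typeA": type_a, "typeB": fields_b[name]})

    # Pass 2: anything in B that A never mentioned is unique to B.
    unique_b = [name for name in fields_b if name not in fields_a]

    return {
        "sharedFields": shared,
        "typeMismatches": mismatches,
        "uniqueToA": unique_a,
        "uniqueToB": unique_b,
    }


result = classify_fields(
    {"url": "string", "phone": "string", "domain": "string"},
    {"url": "string", "phone": "array", "rating": "number"},
)
print(result["typeMismatches"])
# [{'field': 'phone', 'typeA': 'string', 'typeB': 'array'}]
```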

Phase 4 — Scoring and migration note generation

The compatibility score is calculated as Math.round(sharedFields.length / totalUniqueFieldCount * 100), where totalUniqueFieldCount is the number of distinct field names across both schemas, that is, the combined field count of A and B with names present in both counted once. Migration notes are generated programmatically: one note per type mismatch ('field' needs transformation: typeA → typeB), plus summary notes for unique field counts on each side. The full report is pushed as a single dataset record.
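A sketch of the scoring and note generation follows. It assumes the denominator is the number of distinct field names across both schemas (the only reading under which identical schemas score 100), treats two empty schemas as a score of 0, and copies the note wording from the output example. The function names are hypothetical. `math.floor(x + 0.5)` is used because Python's built-in `round` rounds halves to even, unlike JavaScript's `Math.round`.

```python
import math


def compatibility_score(shared: int, mismatched: int, unique_a: int, unique_b: int) -> int:
    """Shared fields as a percentage of all distinct field names (assumed denominator)."""
    total = shared + mismatched + unique_a + unique_b
    if total == 0:
        return 0  # assumption: no fields on either side scores 0
    return math.floor(shared / total * 100 + 0.5)


def migration_notes(name_a: str, name_b: str, mismatches: list,
                    unique_a: list, unique_b: list) -> list:
    """One note per type mismatch, plus a summary note per side with unmapped fields."""
    notes = [
        f"Field '{m['field']}' needs transformation: {m['typeA']} -> {m['typeB']}"
        for m in mismatches
    ]
    if unique_a:
        notes.append(f"{len(unique_a)} fields in {name_a} have no equivalent in {name_b}")
    if unique_b:
        notes.append(f"{len(unique_b)} fields in {name_b} are new (not in {name_a})")
    return notes


print(compatibility_score(shared=1, mismatched=1, unique_a=1, unique_b=1))  # 25
```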

Tips for best results

  1. Run Schema Diff before any pipeline change. Treat it as a pre-flight check: compare the current actor and its replacement before touching any downstream code. A comparison that finishes in seconds can save hours of debugging.
  2. Interpret the compatibility score in context. A score of 60% between two contact scrapers may be fine if the shared fields include emails, phones, and url. A score of 95% with one type mismatch on a critical field may require more transformation work than the number suggests. Read the typeMismatches array alongside the score.
  3. Use migrationNotes as a task checklist. Each note maps directly to a transformation step in your pipeline code. Copy them into a GitHub issue or Jira ticket to track migration progress.
  4. Schedule daily comparisons for production pipelines. Actor schemas can change silently between builds. A daily scheduled run with a Slack webhook alert when compatibilityScore drops is a low-cost schema monitoring strategy.
  5. Combine with Schema Validator for full coverage. Schema Diff tells you whether two actors are compatible with each other; Actor Schema Validator tells you whether a specific run's output matches the expected schema. Use both together for end-to-end pipeline integrity.
  6. Diff before and after a major actor update. When you publish a new build of your own actor, run Schema Diff comparing the old slug to the new one to generate an automatic changelog of field-level changes for downstream users.
  7. A score of 0 is informative, not an error. Two actors can have completely non-overlapping schemas and still be combinable — uniqueToA and uniqueToB tell you exactly what columns a merge operation needs to accommodate.

Combine with other Apify actors

Actor | How to combine
Actor Schema Validator | Use Schema Diff to confirm two actors are compatible, then use Schema Validator to verify actual run output matches the expected schema at runtime.
Actor Schema Registry | Search the registry for actors with fields matching your requirements, then use Schema Diff to compare the top candidates before choosing one.
Actor Pipeline Builder | After confirming compatibility with Schema Diff, use Pipeline Builder to wire the two actors into a live data pipeline.
Website Contact Scraper | Diff against Google Maps Email Extractor to plan a merge of both contact sources into a unified lead dataset.
Google Maps Email Extractor | Compare against Website Contact Scraper to identify overlapping fields before combining results.
B2B Lead Gen Suite | Before plugging a new data source into the suite, run Schema Diff to confirm its output fields align with what the suite expects.
Waterfall Contact Enrichment | Diff the enrichment actor's output schema against your CRM's import schema to plan field mapping before ingestion.

Limitations

  • Only compares dataset output schemas — input schemas, key-value store schemas, and request queue schemas are not compared. If you need to verify input schema compatibility, check the actors' input_schema.json directly.
  • Requires a defined dataset schema in the build — actors that do not declare a dataset schema in their actorDefinition (many older actors) will show 0 fields and a 0% compatibility score. This is a limitation of the Apify build system, not of this actor.
  • Always compares latest builds — there is no option to compare a specific build tag or version. If you need to compare a staging build against production, publish the staging build as latest first.
  • Field type information may be coarse — Apify dataset schemas record types at the JSON Schema level (string, array, object, number, boolean). Fine-grained type differences (e.g., array of strings vs array of objects) are not distinguished in the type mismatch analysis.
  • Private actors require your own token — the actor uses APIFY_TOKEN from the environment. It can only access actors that are public or owned by the account whose token is in use. Comparing two actors from different private accounts is not supported.
  • No historical schema tracking — this actor produces a point-in-time comparison. It does not store a history of past schemas or generate diffs between build versions over time. For historical tracking, schedule runs and store results in your own dataset.
  • Actor ID must be exact — partial name matches and fuzzy search are not supported. If the actor is not found, a structured error is returned. Use the exact username/actor-name from the Apify Store URL.

Integrations

  • Zapier — trigger a schema comparison automatically whenever an actor in your pipeline receives a new build, and route the result to Slack or email
  • Make — build a scenario that runs Schema Diff on a schedule and posts compatibility score changes to a monitoring channel
  • Google Sheets — export comparison reports to a sheet to maintain a compatibility matrix across your actor portfolio
  • Apify API — integrate schema comparison into your CI/CD pipeline so compatibility is verified before any deployment goes live
  • Webhooks — fire a webhook when a run completes and pass the compatibilityScore to a downstream alerting system
  • LangChain / LlamaIndex — use schema comparison reports as structured context when asking an LLM to generate transformation code between two actor output formats

Troubleshooting

Compatibility score is 0% but both actors clearly produce similar data. The most common cause is that one or both actors do not define a dataset schema in their actorDefinition. Many actors on the Apify Store predate the dataset schema feature or were built without a schema declaration. In that case, fieldsInA or fieldsInB will be 0 in the output. This is a limitation of the source actor's build, not an error in Schema Diff. Check the actor's source code or contact the author to confirm whether a schema is defined.

"Failed to fetch one or both actors" error in the output. This means the Apify API returned a non-200 response for one of the actor IDs. Verify the actor ID is correct using the exact format from the Store URL (e.g., ryanclinton/website-contact-scraper). Confirm the actor is public, or that you are using a token from the account that owns it.

actorIdA and actorIdB are the same actor and the score is 100%. This is expected behaviour. Comparing an actor against itself is a valid way to verify that the schema is consistently defined. Use it to confirm a schema was not accidentally cleared in a recent build.

The run completed but the dataset is empty. If both required input fields are missing, the actor pushes an error record and exits. Check the run logs for the message Missing: actorIdA and actorIdB. Ensure both fields are present in your input JSON.

Score is lower than expected after an actor update. The comparison always uses the latest build. If an actor was recently updated and its schema changed, the score reflects the new schema. Run Schema Diff again and review migrationNotes to understand what changed.

Responsible use

  • This actor only reads actor metadata that your API token is permitted to access via the Apify REST API.
  • It does not scrape websites, access private data, or trigger actor runs on external systems.
  • Ensure that any actor IDs you provide belong to actors you have permission to inspect.
  • Use the comparison results to improve pipeline reliability, not to reverse-engineer proprietary actor implementations.

FAQ

How does Actor Schema Diff compare Apify actor output schemas? It calls the Apify REST API to fetch the latest build of each actor, reads the actorDefinition.storages.dataset.fields object from that build, and classifies every field into one of four categories: shared (same name and type), type mismatch, unique to Actor A, or unique to Actor B. The compatibility score is the percentage of fields that appear in both schemas with matching types.

What does a compatibility score of 100% mean? Both actors define exactly the same dataset fields with exactly the same types. Their outputs can be used interchangeably without any transformation or field mapping. A score below 100% means at least one field differs by name, type, or existence.

Does Actor Schema Diff trigger actual runs of the actors being compared? No. The comparison is pure metadata analysis. It reads build definitions from the Apify API. The actors being compared are never executed, and no additional Apify compute credits are consumed beyond the $0.20 fixed charge for the comparison itself.

How accurate is the field type comparison? Types are compared at the JSON Schema primitive level: string, array, object, number, boolean. If both actors declare a field as array but one stores strings and the other stores objects, that difference is not visible in the type mismatch analysis, and the field is reported as shared. Review the sharedFields array alongside actual actor output samples for full type confidence.

Can I compare a private actor against a public actor? Yes, as long as the private actor belongs to the account whose API token is running the comparison. The actor uses the APIFY_TOKEN environment variable that Apify injects automatically. You cannot compare two private actors from different accounts.

How is Actor Schema Diff different from reading the input_schema.json manually? Schema Diff compares dataset output schemas, not input schemas. It also resolves the schema from the compiled build rather than the source file, which means it reflects the schema as it actually exists in the deployed version. Manual inspection of source files does not account for build-time transformations or missing schema declarations.

How long does a typical schema comparison run take? Under 30 seconds for any valid actor pair. Both schemas are fetched in parallel. The computation itself is near-instant. Total run time is dominated by two sequential Apify API calls (actor metadata + build detail) per actor.

Can I compare more than two actors at once? Each run compares exactly one pair. To compare multiple pairs, trigger separate runs — either manually or via the API in a loop from your own code. Batch the runs in parallel to compare an entire actor catalogue efficiently.
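Generating the run inputs for every pair in a catalogue could be sketched like this. The helper name is hypothetical; each resulting dictionary is a valid input for one run and can be passed to the API client shown in "Compare Apify actor schemas using the API".

```python
from itertools import combinations


def build_pair_inputs(actor_ids: list) -> list:
    """Return one Schema Diff input per unordered pair of actors."""
    return [{"actorIdA": a, "actorIdB": b} for a, b in combinations(actor_ids, 2)]


catalogue = [
    "ryanclinton/website-contact-scraper",
    "ryanclinton/google-maps-email-extractor",
    "ryanclinton/email-pattern-finder",
]

inputs = build_pair_inputs(catalogue)
print(len(inputs))  # 3
```

The number of pairs grows as n(n-1)/2, so a catalogue of 10 actors needs 45 runs; it is worth estimating the total charge before comparing a large catalogue.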

What happens if one of the actors has no dataset schema defined? The actor returns a result with fieldsInA or fieldsInB set to 0 and a compatibility score of 0%. This is not an error — it means the actor does not declare a schema in its build definition. The migrationNotes will indicate how many fields are missing.

Is it legal to inspect Apify actor schemas this way? Yes. Actor schemas are part of the public Apify API. Fetching build metadata for publicly listed actors is within normal API usage. For private actors, you are only accessing data you already have permission to view via your own API token.

Can I schedule Actor Schema Diff to run automatically? Yes. Use Apify's built-in scheduling to run the actor daily, weekly, or on any custom interval. Combine with a webhook to receive alerts when the compatibility score drops below a threshold. This is the recommended approach for monitoring schema stability in production pipelines.

How is this different from the Actor Schema Validator? Actor Schema Validator checks whether a specific actor run's actual output matches a predefined expected schema. Schema Diff compares the defined schemas of two different actors to measure compatibility between them. Use Schema Diff for migration planning and compatibility checks; use Schema Validator for runtime output quality assurance.

Help us improve

If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:

  1. Go to Account Settings > Privacy
  2. Enable Share runs with public Actor creators

This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.

Support

Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.
