Catch schema violations before Apify does
Output Guard is a data quality tool that runs your Apify actor with test input and compares every output field against its declared dataset_schema.json. It checks 5 validation categories — type mismatches, missing fields, undeclared fields, nullable violations, and type consistency — and produces a 0-100 compliance score in under 60 seconds for $0.35 per validation.
Schema compliance is a requirement of the Apify Store publication checklist. Output Guard automates checks that take 15-30 minutes per actor when done manually, and it catches issues that trigger maintenance flags before Apify's automated quality reviews do.
Maps 6 declared schema types (string, integer, number, boolean, array, object) to JavaScript runtime types and flags every mismatch with the exact field path and expected vs. actual type.
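As a rough illustration of this kind of type check, the sketch below maps JavaScript values to the six schema type names and reports mismatches with their field path. The flat schema shape (`{ fieldName: { type } }`) and the helper names `runtimeTypeOf`, `matchesDeclaredType`, and `findTypeMismatches` are assumptions made for the example, not Output Guard's actual implementation.

```javascript
// Hypothetical helper: reduce a JavaScript value to a schema-style type name.
function runtimeTypeOf(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (typeof value === "number") return Number.isInteger(value) ? "integer" : "number";
  return typeof value; // "string", "boolean", "object", ...
}

// An integer value is also accepted where the schema declares "number".
function matchesDeclaredType(declared, value) {
  const actual = runtimeTypeOf(value);
  return actual === declared || (declared === "number" && actual === "integer");
}

// Assumed simplified schema shape: { fieldName: { type: "string" } }.
// Only top-level fields are checked; null and absent values are left to other checks.
function findTypeMismatches(schemaFields, items) {
  const mismatches = [];
  const seen = new Set();
  for (const item of items) {
    for (const [path, spec] of Object.entries(schemaFields)) {
      const value = item[path];
      if (value === undefined || value === null) continue;
      if (matchesDeclaredType(spec.type, value)) continue;
      const actual = runtimeTypeOf(value);
      const key = path + ":" + actual;
      if (!seen.has(key)) {
        seen.add(key); // report each (field, actual type) pair once
        mismatches.push({ path, expected: spec.type, actual });
      }
    }
  }
  return mismatches;
}

const typeMismatchDemo = findTypeMismatches(
  { price: { type: "number" }, title: { type: "string" } },
  [{ price: "19.99", title: "Widget" }, { price: 5, title: "Gadget" }]
);
console.log(typeMismatchDemo);
// [ { path: 'price', expected: 'number', actual: 'string' } ]
```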
Lists every output field not defined in the dataset_schema.json. Undeclared fields do not appear in the Apify Store's table view and may indicate data leaks or schema drift.
Checks that every schema-declared field exists in at least one output item. Missing fields indicate broken extractors, changed data sources, or incomplete crawl coverage.
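Undeclared and missing fields are two directions of the same set comparison. The sketch below assumes the declared field names are already available as a plain list and looks only at top-level keys; `compareFieldSets` is a hypothetical name used for illustration.

```javascript
// Compare declared field names with the keys actually seen in the output.
function compareFieldSets(declaredNames, items) {
  const declared = new Set(declaredNames);
  const observed = new Set();
  for (const item of items) {
    for (const key of Object.keys(item)) observed.add(key);
  }
  return {
    // present in the output but absent from the schema
    undeclaredFields: [...observed].filter((name) => !declared.has(name)),
    // declared in the schema but absent from every output item
    missingFields: [...declared].filter((name) => !observed.has(name)),
  };
}

const fieldSetDemo = compareFieldSets(
  ["url", "email", "phoneNumber"],
  [
    { url: "https://example.com", email: "a@example.com", _debug: true },
    { url: "https://example.org", scrapedAt: "2024-01-01" },
  ]
);
console.log(fieldSetDemo);
// { undeclaredFields: [ '_debug', 'scrapedAt' ], missingFields: [ 'phoneNumber' ] }
```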
Identifies fields with null values where the schema does not declare nullable: true — a common source of downstream pipeline errors and Apify Store maintenance flags.
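A nullable check can be sketched as follows, assuming a simplified schema shape in which each field may carry `nullable: true`. The function name `findNullableViolations` and the `nullCount` property are illustrative, not part of Output Guard's report format.

```javascript
// Assumed schema shape: { fieldName: { type: "string", nullable: true } }.
function findNullableViolations(schemaFields, items) {
  const violations = [];
  for (const [path, spec] of Object.entries(schemaFields)) {
    if (spec.nullable === true) continue; // nulls are explicitly allowed here
    const nullCount = items.filter((item) => item[path] === null).length;
    if (nullCount > 0) violations.push({ path, nullCount });
  }
  return violations;
}

const nullableDemo = findNullableViolations(
  {
    email: { type: "string" },
    phone: { type: "string", nullable: true },
  },
  [
    { email: null, phone: null },
    { email: "a@example.com", phone: null },
  ]
);
console.log(nullableDemo);
// [ { path: 'email', nullCount: 1 } ]
```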
Flags fields with mixed types across output items (e.g., a rating field that is sometimes a string and sometimes a number) regardless of schema declarations.
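Because this check ignores the schema, it only needs the output items. A minimal sketch, with `findTypeInconsistencies` as a hypothetical helper that skips null values and inspects top-level fields only:

```javascript
// Collect the set of runtime types observed for each field across all items.
function findTypeInconsistencies(items) {
  const typesByField = new Map();
  for (const item of items) {
    for (const [path, value] of Object.entries(item)) {
      if (value === null || value === undefined) continue;
      const type = Array.isArray(value) ? "array" : typeof value;
      if (!typesByField.has(path)) typesByField.set(path, new Set());
      typesByField.get(path).add(type);
    }
  }
  const inconsistencies = [];
  for (const [path, types] of typesByField) {
    if (types.size > 1) inconsistencies.push({ path, types: [...types] });
  }
  return inconsistencies;
}

const consistencyDemo = findTypeInconsistencies([
  { rating: "4.5", name: "A" },
  { rating: 4.5, name: "B" },
]);
console.log(consistencyDemo);
// [ { path: 'rating', types: [ 'string', 'number' ] } ]
```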
Weighted scoring: type mismatches deduct 10 points, type inconsistencies deduct 5, nullable violations deduct 5, warnings deduct 3, and undeclared fields deduct 2. A score of 90+ indicates minor issues only.
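The weights above can be read as a simple deduction from 100. The sketch below assumes each deduction applies per issue and that the score is floored at 0; neither detail is confirmed by the text, and `complianceScore` is a hypothetical name.

```javascript
// Deduction weights as listed above (points per issue, by assumption).
const DEDUCTIONS = {
  typeMismatches: 10,
  typeInconsistencies: 5,
  nullableViolations: 5,
  warnings: 3,
  undeclaredFields: 2,
};

// counts: number of issues per category; categories left out count as zero.
function complianceScore(counts) {
  let score = 100;
  for (const [category, weight] of Object.entries(DEDUCTIONS)) {
    score -= weight * (counts[category] || 0);
  }
  return Math.max(0, score); // assumed floor so the score stays within 0-100
}

console.log(complianceScore({ typeMismatches: 1, nullableViolations: 1, undeclaredFields: 3 }));
// 79  (100 - 10 - 5 - 6)
```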
There are 4 common approaches to validating Apify actor output against dataset schemas. Each has trade-offs in speed, coverage, and automation level.
| Method | Time per actor | Checks performed | Automation | Cost |
|---|---|---|---|---|
| Output Guard | Under 60 seconds | 5 categories: types, missing, undeclared, nullable, consistency | Fully automated (paste actor ID, click run) | $0.35/run |
| Manual Apify Console check | 15-30 minutes | Visual inspection of output table vs. schema file | Fully manual | Free (time cost only) |
| Custom AJV script | 2-4 hours (initial setup), seconds per run | Type validation only (no undeclared fields, no consistency) | Automated after setup | Free (development time cost) |
| Apify Store automated review | Hours to days (runs on Apify's schedule) | Full publication checklist including schema | Automated but reactive (flags issues after publish) | Free (but flags are public) |
No single validation method catches every issue — the most reliable approach combines automated schema validation with manual output review for edge cases.
```json
{
  "actorName": "ryanclinton/website-contact-scraper",
  "schemaFound": true,
  "schemaFields": 12,
  "outputFields": 15,
  "totalItems": 3,
  "mismatches": [
    { "path": "price", "expected": "number", "actual": "string", "severity": "error" },
    { "path": "email", "expected": "non-null", "actual": "null values found", "severity": "warning" }
  ],
  "undeclaredFields": ["_debug", "scrapedAt", "rawHtml"],
  "missingRequired": ["phoneNumber"],
  "typeInconsistencies": [
    { "path": "rating", "types": ["string", "number"], "severity": "warning" }
  ],
  "score": 72,
  "passed": false
}
```
Connect your Apify token and enter the actor ID to validate
Output Guard runs the actor on your account and reads its dataset_schema.json
Get a compliance report with every violation, a 0-100 score, and actionable fixes — results cached for free
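A report shaped like the sample output above could be consumed by a pre-publish script. The sketch below is one possible gate, assuming the field names from that sample; the `summarizeReport` helper and the default threshold of 90 are illustrative choices rather than part of the tool.

```javascript
// Decide whether a report is good enough to publish (hypothetical policy).
function summarizeReport(report, minScore = 90) {
  const errors = report.mismatches.filter((m) => m.severity === "error");
  return {
    ok: report.schemaFound && errors.length === 0 && report.score >= minScore,
    errorPaths: errors.map((m) => m.path),
    score: report.score,
  };
}

// Trimmed copy of the sample report shown earlier.
const sampleReport = {
  schemaFound: true,
  mismatches: [
    { path: "price", expected: "number", actual: "string", severity: "error" },
    { path: "email", expected: "non-null", actual: "null values found", severity: "warning" },
  ],
  score: 72,
};

console.log(summarizeReport(sampleReport));
// { ok: false, errorPaths: [ 'price' ], score: 72 }
```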
There are several approaches to validating Apify actor output, from fully manual to fully automated. The right choice depends on how many actors you maintain and how often you ship updates.
Open the actor run in the Apify Console, switch to the "Table" view, and visually compare each column against the dataset_schema.json file. Works for one-off checks but scales poorly beyond 5 actors.
Best for: occasional checks on a single actor before publishing.
Write a Node.js script using the AJV JSON Schema validator library to validate actor output programmatically. Requires 2-4 hours of initial development and ongoing maintenance as schemas evolve.
Best for: teams with existing CI/CD pipelines who want free, customizable validation.
Apify runs automated quality checks on published actors, including schema compliance. Catches issues but only after publication — flags are visible to Store users and can affect actor ranking.
Best for: a safety net after publishing, not a pre-publish validation step.
Generic JSON Schema validators like jsonschemavalidator.net can check output against a schema. However, they do not understand Apify-specific conventions like dataset_schema.json format, nullable declarations, or the distinction between declared and undeclared fields.
Best for: validating generic JSON Schema compliance outside the Apify ecosystem.
Output Guard provides automated end-to-end validation: it runs the actor, reads the schema, checks 5 validation categories, and produces a 0-100 compliance score with specific field-level issues. No scripting required. $0.35 per run.
Best for: developers who maintain multiple actors and want fast, repeatable pre-publish validation.
Each approach has trade-offs in setup time, coverage depth, and maintenance burden. The right choice depends on your team size and deployment frequency.
Every validation run executes on your own Apify account at the standard pay-per-event rate of $0.35 per validation. The ApifyForge platform itself is free — no subscription, no premium tier. The charge appears in your Apify console like any other actor run. Apify's free plan includes $5/month in credits, enough for approximately 14 validations per month.
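The figure of roughly 14 validations follows from dividing the monthly credit by the per-run price. A small sketch of that arithmetic, working in cents to avoid floating-point rounding (`validationsPerMonth` is a hypothetical helper):

```javascript
// How many whole runs fit into a monthly credit, with both amounts in cents.
function validationsPerMonth(creditCents, priceCents) {
  return Math.floor(creditCents / priceCents);
}

console.log(validationsPerMonth(500, 35)); // 14  ($5.00 credit at $0.35 per run)
```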
Output Guard runs your actor with test input on your own Apify account, then compares every field in the output against the actor's declared dataset_schema.json. It checks 5 categories: type mismatches (string vs number), missing required fields, undeclared fields not in the schema, nullable violations (null values where nullable: true is not declared), and type consistency across multiple output items.
Apify's Store quality requirements, documented in the Apify Actor publication checklist, require that actor output matches its declared dataset schema. Type mismatches, missing fields, and undeclared properties can trigger maintenance flags or prevent publication. Output Guard catches these issues before Apify's automated checks do.
Each Output Guard run costs $0.35, charged as a pay-per-event (PPE) fee on your own Apify account. ApifyForge has no platform fee or subscription. Apify's free tier includes $5/month in credits, enough for approximately 14 schema validations per month.
Output Guard calculates a weighted compliance score: type mismatches deduct 10 points each, type inconsistencies deduct 5, nullable violations deduct 5, warnings deduct 3, and undeclared fields deduct 2. A score of 90 or above indicates minor issues only. A score below 70 indicates structural problems that are likely to trigger maintenance flags.
Yes. Output Guard works with any public Apify actor that has a dataset_schema.json file. Enter any actor ID (e.g., apify/web-scraper) and the tool will run it on your account with minimal input, then validate the output. This is useful for evaluating actor quality before integrating third-party scrapers into your pipeline.
If the actor lacks a dataset_schema.json, Output Guard reports schemaFound: false and cannot perform field-level validation. The Apify Store publication checklist recommends every actor include a dataset schema. Output Guard will still analyze the output structure and report field types for reference.
Manual validation requires downloading output from the Apify Console, opening the dataset schema file, and comparing fields by hand — a process that takes 15-30 minutes per actor and is error-prone for actors with 20+ output fields. Output Guard automates all 5 check categories in a single run that takes under 60 seconds.
No. Schema compliance is one of several Store quality requirements. Apify also checks README completeness, input schema quality, error handling, and output volume. ApifyForge offers additional tools for these checks: Input Tester validates input schemas, and the Quality Audit actor covers the full publication checklist.