Output Guard (Schema Validator)

Catch schema violations before Apify does

Output Guard is a data quality tool that runs your Apify actor with test input and compares every output field against its declared dataset_schema.json. It checks 5 validation categories — type mismatches, missing fields, undeclared fields, nullable violations, and type consistency — and produces a 0-100 compliance score in under 60 seconds for $0.35 per validation.

Schema compliance is a requirement on the Apify Store publication checklist. Output Guard automates checks that would take 15-30 minutes per actor when done manually, and it catches issues that trigger maintenance flags before Apify's automated quality reviews do.

What Output Guard checks

Type mismatch detection

Maps 6 declared schema types (string, integer, number, boolean, array, object) to JavaScript runtime types and flags every mismatch with the exact field path and expected vs. actual type.
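
As a rough illustration of this check, the sketch below does the same kind of mapping in Python rather than JavaScript. The flat `declared` dictionary and the helpers `runtime_type` and `find_type_mismatches` are hypothetical; they are not Output Guard's code or the real dataset_schema.json layout. It also assumes that an integer value satisfies a declared `number`.

```python
def runtime_type(value):
    """Name a value using the six schema types, or 'null'."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return type(value).__name__


def find_type_mismatches(items, declared_types):
    """Return one entry per (field, actual type) pair that breaks the declaration."""
    mismatches = []
    seen = set()
    for item in items:
        for path, expected in declared_types.items():
            if path not in item or item[path] is None:
                continue  # missing and null values belong to other checks
            actual = runtime_type(item[path])
            # Assumption: an integer value also satisfies a declared "number".
            ok = actual == expected or (expected == "number" and actual == "integer")
            if not ok and (path, actual) not in seen:
                seen.add((path, actual))
                mismatches.append({"path": path, "expected": expected, "actual": actual})
    return mismatches


# Hypothetical simplified schema: field name -> declared type.
declared = {"price": "number", "title": "string", "stock": "integer"}
items = [
    {"price": "19.99", "title": "Widget", "stock": 3},
    {"price": 5, "title": "Gadget", "stock": True},
]
mismatches = find_type_mismatches(items, declared)
```

Here `price` is flagged in the first item (a string where a number is declared) and `stock` in the second (a boolean where an integer is declared).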

Undeclared field detection

Lists every output field not defined in the dataset_schema.json. Undeclared fields do not appear in the Apify Store's table view and may indicate data leaks or schema drift.
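
A minimal sketch of this comparison, assuming the declared field names have already been extracted from the schema into a plain list. The function name `find_undeclared_fields` is illustrative only, and only top-level fields are considered.

```python
def find_undeclared_fields(items, declared_fields):
    """List top-level output fields that the schema does not declare."""
    declared = set(declared_fields)
    undeclared = []
    for item in items:
        for name in item:
            # Keep first-seen order and report each field once.
            if name not in declared and name not in undeclared:
                undeclared.append(name)
    return undeclared


items = [
    {"url": "https://example.com", "email": "a@example.com", "_debug": {}},
    {"url": "https://example.org", "email": None, "scrapedAt": "2024-01-01"},
]
undeclared = find_undeclared_fields(items, ["url", "email"])
```

With this sample data, `undeclared` holds `_debug` and `scrapedAt`.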

Missing required fields

Checks that every schema-declared field exists in at least one output item. Missing fields indicate broken extractors, changed data sources, or incomplete crawl coverage.
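
The "at least one output item" rule can be sketched as follows. This assumes a field counts as present when its key appears in any item, even with a null value; the source does not say how nulls are treated here, and `find_missing_fields` is a hypothetical name.

```python
def find_missing_fields(items, declared_fields):
    """Return declared fields that never appear as a key in any output item."""
    present = set()
    for item in items:
        present.update(item.keys())
    return [name for name in declared_fields if name not in present]


items = [
    {"url": "https://example.com", "email": "a@example.com"},
    {"url": "https://example.org", "email": None},
]
missing = find_missing_fields(items, ["url", "email", "phoneNumber"])
```

In this sample, `phoneNumber` is declared but never produced, so it is the only missing field.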

Nullable violation checks

Identifies fields with null values where the schema does not declare nullable: true — a common source of downstream pipeline errors and Apify Store maintenance flags.
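
One way to picture this check is the sketch below. It assumes a simplified schema shape (field name mapped to a small dictionary with an optional `nullable` flag) rather than the real dataset_schema.json layout, and the `nullCount` key in the result is made up for illustration.

```python
def find_nullable_violations(items, schema_fields):
    """Report fields holding null values without nullable: true in the schema."""
    violations = []
    for path, spec in schema_fields.items():
        if spec.get("nullable") is True:
            continue  # nulls are explicitly allowed for this field
        null_count = sum(1 for item in items if path in item and item[path] is None)
        if null_count:
            violations.append({"path": path, "nullCount": null_count})
    return violations


# Hypothetical simplified schema, not the actual file format.
schema_fields = {
    "email": {"type": "string"},
    "phone": {"type": "string", "nullable": True},
}
items = [
    {"email": None, "phone": None},
    {"email": "a@example.com", "phone": "555-0100"},
]
violations = find_nullable_violations(items, schema_fields)
```

Only `email` is reported: `phone` is also null in the first item, but its schema entry allows that.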

Type consistency analysis

Flags fields with mixed types across output items (e.g., a rating field that is sometimes a string and sometimes a number) regardless of schema declarations.
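
Because this check ignores the schema, it can be sketched with output items alone. The version below is an assumption-laden illustration: it groups integers and floats as `number` (as a JavaScript runtime would), skips nulls, and uses the hypothetical names `type_name` and `find_type_inconsistencies`.

```python
def type_name(value):
    """JavaScript-like type label; ints and floats both count as 'number'."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return type(value).__name__


def find_type_inconsistencies(items):
    """Return fields whose non-null values use more than one type across items."""
    seen = {}  # field name -> types in first-seen order
    for item in items:
        for path, value in item.items():
            if value is None:
                continue  # assumption: nulls are left to the nullable check
            types = seen.setdefault(path, [])
            label = type_name(value)
            if label not in types:
                types.append(label)
    return [{"path": p, "types": t} for p, t in seen.items() if len(t) > 1]


items = [
    {"name": "A", "rating": "4.5"},
    {"name": "B", "rating": 4.5},
    {"name": "C", "rating": 4},
]
inconsistencies = find_type_inconsistencies(items)
```

The `rating` field is flagged because it is a string in one item and a number in the others.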

0-100 compliance score

Weighted scoring: type mismatches deduct 10 points, type inconsistencies deduct 5, nullable violations deduct 5, warnings deduct 3, and undeclared fields deduct 2. A score of 90+ indicates minor issues only.
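
The deductions above can be turned into a small worked calculation. This sketch assumes the score starts at 100, that each weight applies per issue, and that the result is floored at 0. How Output Guard counts "warnings" or weights missing fields is not stated here, so this may not reproduce the scores the tool reports. The key names in `WEIGHTS` are illustrative.

```python
# Points deducted per issue, taken from the weights listed above.
WEIGHTS = {
    "type_mismatch": 10,
    "type_inconsistency": 5,
    "nullable_violation": 5,
    "warning": 3,
    "undeclared_field": 2,
}


def compliance_score(issue_counts):
    """Subtract weighted deductions from 100, never going below 0 (assumed floor)."""
    deduction = sum(WEIGHTS[kind] * count for kind, count in issue_counts.items())
    return max(0, 100 - deduction)


# One type mismatch, one nullable violation, three undeclared fields:
# 100 - 10 - 5 - (3 * 2) = 79
score = compliance_score({"type_mismatch": 1, "nullable_violation": 1, "undeclared_field": 3})
```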

Schema validation methods compared

There are 4 common approaches to validating Apify actor output against dataset schemas. Each has trade-offs in speed, coverage, and automation level.

| Method | Time per actor | Checks performed | Automation | Cost |
| --- | --- | --- | --- | --- |
| Output Guard | Under 60 seconds | 5 categories: types, missing, undeclared, nullable, consistency | Fully automated (paste actor ID, click run) | $0.35/run |
| Manual Apify Console check | 15-30 minutes | Visual inspection of output table vs. schema file | Fully manual | Free (time cost only) |
| Custom AJV script | 2-4 hours (initial setup), seconds per run | Type validation only (no undeclared fields, no consistency) | Automated after setup | Free (development time cost) |
| Apify Store automated review | Hours to days (runs on Apify's schedule) | Full publication checklist including schema | Automated but reactive (flags issues after publish) | Free (but flags are public) |

No single validation method catches every issue — the most reliable approach combines automated schema validation with manual output review for edge cases.

Example Output Guard output

{
  "actorName": "ryanclinton/website-contact-scraper",
  "schemaFound": true,
  "schemaFields": 12,
  "outputFields": 15,
  "totalItems": 3,
  "mismatches": [
    { "path": "price", "expected": "number", "actual": "string", "severity": "error" },
    { "path": "email", "expected": "non-null", "actual": "null values found", "severity": "warning" }
  ],
  "undeclaredFields": ["_debug", "scrapedAt", "rawHtml"],
  "missingRequired": ["phoneNumber"],
  "typeInconsistencies": [
    { "path": "rating", "types": ["string", "number"], "severity": "warning" }
  ],
  "score": 72,
  "passed": false
}
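
A report in this shape is straightforward to consume from a script, for example to print a short summary or fail a pre-publish check. The sketch below relies only on the field names shown in the sample above; the `summarize` helper and its line format are hypothetical.

```python
import json

# Copy of the sample report above, embedded so the example is self-contained.
REPORT_JSON = """
{
  "actorName": "ryanclinton/website-contact-scraper",
  "schemaFound": true,
  "schemaFields": 12,
  "outputFields": 15,
  "totalItems": 3,
  "mismatches": [
    { "path": "price", "expected": "number", "actual": "string", "severity": "error" },
    { "path": "email", "expected": "non-null", "actual": "null values found", "severity": "warning" }
  ],
  "undeclaredFields": ["_debug", "scrapedAt", "rawHtml"],
  "missingRequired": ["phoneNumber"],
  "typeInconsistencies": [
    { "path": "rating", "types": ["string", "number"], "severity": "warning" }
  ],
  "score": 72,
  "passed": false
}
"""


def summarize(report):
    """Flatten the report into one human-readable line per issue."""
    lines = []
    for m in report.get("mismatches", []):
        lines.append(f'{m["severity"]}: {m["path"]} expected {m["expected"]}, got {m["actual"]}')
    for name in report.get("undeclaredFields", []):
        lines.append(f"undeclared: {name}")
    for name in report.get("missingRequired", []):
        lines.append(f"missing: {name}")
    for t in report.get("typeInconsistencies", []):
        lines.append(f'{t["severity"]}: {t["path"]} has mixed types {", ".join(t["types"])}')
    return lines


report = json.loads(REPORT_JSON)
summary = summarize(report)
for line in summary:
    print(line)
```

A CI step could then exit with a non-zero status whenever `report["passed"]` is false.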

How Output Guard works

1. Connect your Apify token and enter the actor ID to validate.

2. Output Guard runs the actor on your account and reads its dataset_schema.json.

3. Get a compliance report with every violation, a 0-100 score, and actionable fixes. Results are cached for free.

Alternatives to Output Guard

There are several approaches to validating Apify actor output, from fully manual to fully automated. The right choice depends on how many actors you maintain and how often you ship updates.

Manual Apify Console inspection

Open the actor run in the Apify Console, switch to the "Table" view, and visually compare each column against the dataset_schema.json file. Works for one-off checks but scales poorly beyond 5 actors.

Best for: occasional checks on a single actor before publishing.

Custom AJV validation script

Write a Node.js script using the AJV JSON Schema validator library to validate actor output programmatically. Requires 2-4 hours of initial development and ongoing maintenance as schemas evolve.

Best for: teams with existing CI/CD pipelines who want free, customizable validation.

Apify Store automated review

Apify runs automated quality checks on published actors, including schema compliance. Catches issues but only after publication — flags are visible to Store users and can affect actor ranking.

Best for: a safety net after publishing, not a pre-publish validation step.

JSON Schema online validators

Generic JSON Schema validators like jsonschemavalidator.net can check output against a schema. However, they do not understand Apify-specific conventions like dataset_schema.json format, nullable declarations, or the distinction between declared and undeclared fields.

Best for: validating generic JSON Schema compliance outside the Apify ecosystem.

Output Guard

Automated end-to-end validation: runs the actor, reads the schema, checks 5 validation categories, and produces a 0-100 compliance score with specific field-level issues. No scripting required. $0.35 per run.

Best for: developers who maintain multiple actors and want fast, repeatable pre-publish validation.

Each approach has trade-offs in setup time, coverage depth, and maintenance burden. The right choice depends on your team size and deployment frequency.

Limitations

  1. Schema validation only. Output Guard checks output structure against the declared schema. It does not validate data accuracy, completeness, or business logic: a field can be the correct type but contain wrong data.
  2. Requires dataset_schema.json. Actors without a dataset schema file cannot be validated. The tool reports schemaFound: false and provides output structure analysis instead.
  3. Sample-based validation. The tool validates output from a single test run. Edge cases that appear only with specific inputs or at scale may not surface in that run.
  4. Not a full quality audit. Schema compliance is one of several Apify Store publication requirements. README quality, input schema correctness, error handling, and output volume are checked separately by ApifyForge's Input Tester and other tools.
  5. Requires Apify account. Validation runs execute on your own Apify account at pay-per-event (PPE) rates. You need a valid Apify API token to use the tool.

What Output Guard costs

Every validation run executes on your own Apify account at the standard pay-per-event rate of $0.35 per validation. The ApifyForge platform itself is free — no subscription, no premium tier. The charge appears in your Apify console like any other actor run. Apify's free plan includes $5/month in credits, enough for approximately 14 validations per month.
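
The "approximately 14" figure follows from simple division. The sketch below reproduces it, assuming the whole monthly credit goes to validation charges and nothing else on the account draws from it. `validations_per_month` is an illustrative helper.

```python
def validations_per_month(monthly_credit_usd, price_per_validation_usd):
    """Whole validations affordable from a monthly credit, computed in cents."""
    credit_cents = round(monthly_credit_usd * 100)
    price_cents = round(price_per_validation_usd * 100)
    return credit_cents // price_cents


# $5.00 of credit at $0.35 per validation: 500 // 35 = 14, with $0.10 left over.
runs = validations_per_month(5.00, 0.35)
```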

Frequently asked questions

What does Output Guard actually check?

Output Guard runs your actor with test input on your own Apify account, then compares every field in the output against the actor's declared dataset_schema.json. It checks 5 categories: type mismatches (string vs number), missing required fields, undeclared fields not in the schema, nullable violations (null values where nullable: true is not declared), and type consistency across multiple output items.

Why do Apify actors get maintenance flags for schema issues?

Apify's Store quality requirements, documented in the Apify Actor publication checklist, require that actor output matches its declared dataset schema. Type mismatches, missing fields, and undeclared properties can trigger maintenance flags or prevent publication. Output Guard catches these issues before Apify's automated checks do.

How much does a single schema validation cost?

Each Output Guard run costs $0.35, charged as a pay-per-event (PPE) fee on your own Apify account. ApifyForge has no platform fee or subscription. Apify's free tier includes $5/month in credits, enough for approximately 14 schema validations per month.

What is the 0-100 compliance score?

Output Guard calculates a weighted compliance score: type mismatches deduct 10 points each, type inconsistencies deduct 5, nullable violations deduct 5, warnings deduct 3, and undeclared fields deduct 2. A score of 90 or above indicates minor issues only. A score below 70 indicates structural problems that are likely to trigger maintenance flags.

Can I validate actors I did not build?

Yes. Output Guard works with any public Apify actor that has a dataset_schema.json file. Enter any actor ID (e.g., apify/web-scraper) and the tool will run it on your account with minimal input, then validate the output. This is useful for evaluating actor quality before integrating third-party scrapers into your pipeline.

What happens if my actor has no dataset_schema.json?

If the actor lacks a dataset_schema.json, Output Guard reports schemaFound: false and cannot perform field-level validation. The Apify Store publication checklist recommends every actor include a dataset schema. Output Guard will still analyze the output structure and report field types for reference.
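
To give a feel for what a schema-less structure report could contain, here is a minimal sketch that infers the observed types per field from output items. Apart from `schemaFound`, the result shape and the names `type_label` and `describe_output_structure` are assumptions, not Output Guard's actual format.

```python
def type_label(value):
    """Label a value with a JSON-style type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return type(value).__name__


def describe_output_structure(items):
    """Collect the set of observed types for every top-level field."""
    observed = {}
    for item in items:
        for name, value in item.items():
            observed.setdefault(name, set()).add(type_label(value))
    return {
        "schemaFound": False,
        "fields": {name: sorted(types) for name, types in observed.items()},
    }


items = [
    {"title": "A", "price": 9.5, "tags": ["x"]},
    {"title": "B", "price": None},
]
structure = describe_output_structure(items)
```

A listing like this can serve as a starting point for writing the missing dataset_schema.json by hand.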

How is this different from manually checking actor output?

Manual validation requires downloading output from the Apify Console, opening the dataset schema file, and comparing fields by hand — a process that takes 15-30 minutes per actor and is error-prone for actors with 20+ output fields. Output Guard automates all 5 check categories in a single run that takes under 60 seconds.

Does schema validation prevent all Apify Store rejections?

No. Schema compliance is one of several Store quality requirements. Apify also checks README completeness, input schema quality, error handling, and output volume. ApifyForge offers additional tools for these checks: Input Tester validates input schemas, and the Quality Audit actor covers the full publication checklist.