Dataset Schema
An Apify Dataset Schema is a JSON file at .actor/dataset_schema.json that defines the structure of an actor's output data. It specifies field names, data types (string, integer, number, boolean, array, object), required fields, field descriptions, and display configuration (which fields appear in which views). Apify uses this schema for two purposes: runtime validation of every Actor.pushData() call, and generating the output preview table in the Console UI.

Dataset schemas matter because they are the contract between your actor and its users. When someone integrates your actor into a pipeline, they depend on the output structure being consistent and predictable. With a schema, every run either produces the declared fields with the declared types or fails loudly. Without a schema, your actor might silently output null values, missing fields, or wrong types that break downstream systems days or weeks later, when someone finally notices.

More critically, dataset schema validation failures are the number one cause of actors going under maintenance on the Apify platform. If your code pushes a field that the schema does not allow, or pushes a value with the wrong type (null where string is expected, string where integer is expected), the pushData call fails with a 400 error. If this happens consistently on the default input, Apify's automated quality checks flag the actor as UNDER_MAINTENANCE within 3 days, displaying a warning banner to all users and dropping the actor in Store search rankings.

Here is a dataset schema example:

    {
      "actorSpecification": 1,
      "fields": {
        "type": "object",
        "properties": {
          "title": { "type": "string", "description": "Product title" },
          "price": { "type": "number", "description": "Product price, in the currency given by the currency field" },
          "currency": { "type": "string", "description": "Currency code (e.g. USD, EUR)" },
          "url": { "type": "string", "description": "Product page URL", "format": "uri" },
          "inStock": { "type": "boolean", "description": "Whether the product is in stock" },
          "rating": { "type": "number", "description": "Average user rating (0-5)" },
          "reviewCount": { "type": "integer", "description": "Number of user reviews" },
          "images": { "type": "array", "description": "Product image URLs" },
          "scrapedAt": { "type": "string", "description": "ISO 8601 timestamp of when the data was scraped", "format": "date-time" }
        },
        "required": ["title", "price", "currency", "url", "scrapedAt"]
      },
      "views": {
        "overview": {
          "title": "Overview",
          "transformation": { "fields": ["title", "price", "inStock", "url"] }
        },
        "detailed": {
          "title": "Full Details",
          "transformation": { "fields": ["title", "price", "currency", "rating", "reviewCount", "inStock", "url", "scrapedAt"] }
        }
      }
    }

The fields section is a JSON Schema object: properties describes each output field, and the required array lists the fields that every record must contain. The views section controls how data appears in the Console's dataset preview table. Define an 'overview' view with the most important fields (4-6) for quick scanning, and a 'detailed' view with all, or nearly all, fields. Users can switch between views in the Console UI.

The most common mistake is defining a schema and then letting it drift out of sync with your code. When you add a new output field in your code, you must also add it to the schema. When you change a field type (e.g., from string to number), update the schema as well. Any mismatch causes runtime validation failures.

Another critical mistake is leaving the properties object in the schema empty. Apify interprets this as an undefined output structure and penalizes the actor heavily. Always define at least your core output fields.

The most insidious mistake is handling optional fields incorrectly. If a field is not marked as required in the schema but your code sometimes pushes null for it, either make the schema type allow null (for example, "type": ["number", "null"]) or omit the field from records where it has no value: do not push { rating: null }; leave out the rating key instead.
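The advice on optional fields can be sketched as a small helper that drops empty keys before a record is pushed. This is a minimal illustration, not part of the Apify SDK: the helper name omitEmptyFields and the sample record are assumptions made for this example, and the field names simply follow the example schema.

```typescript
// Hypothetical helper (not an Apify API): returns a shallow copy of a record
// without the keys whose value is null or undefined. Falsy but meaningful
// values such as false, 0, or "" are kept.
type OutputRecord = { [key: string]: unknown };

function omitEmptyFields(record: OutputRecord): OutputRecord {
  const cleaned: OutputRecord = {};
  for (const key of Object.keys(record)) {
    const value = record[key];
    if (value !== null && value !== undefined) {
      cleaned[key] = value;
    }
  }
  return cleaned;
}

// Sample record, invented for illustration.
const scraped: OutputRecord = {
  title: "Desk lamp",
  price: 24.99,
  currency: "USD",
  url: "https://example.com/products/desk-lamp",
  inStock: false,   // kept: false is a real value
  rating: null,     // dropped: no rating was found on the page
  scrapedAt: "2024-01-01T00:00:00.000Z",
};

const cleaned = omitEmptyFields(scraped);
console.log(Object.keys(cleaned));
```

In an actor, the cleaned record rather than the raw one would be passed to Actor.pushData(), so an optional field typed as number never receives null.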
Test your schema by running the actor against multiple inputs and verifying that every output record passes validation. Run a JSON Schema validator locally against sample records before you deploy the actor, so errors are caught before they reach production.

Related concepts: Dataset, Actor, Actor Build, Maintenance Flag, Store Quality Score.
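A local pre-flight check of sample records might look like the sketch below. It is a deliberately simplified stand-in for a real JSON Schema validator: it only checks required fields and the type keyword, and it rejects unknown fields to mirror the rule described in this article. The names FieldsSchema, matchesType, and findViolations are hypothetical, and a production setup would use a full validator library instead.

```typescript
// Simplified stand-in for a JSON Schema validator (assumption: only `required`
// and `type` are checked; formats, nested schemas, etc. are ignored).
type FieldType = "string" | "integer" | "number" | "boolean" | "array" | "object" | "null";

interface FieldsSchema {
  properties: { [name: string]: { type: FieldType | FieldType[] } };
  required?: string[];
}

function matchesType(value: unknown, type: FieldType): boolean {
  switch (type) {
    case "string": return typeof value === "string";
    case "number": return typeof value === "number" && isFinite(value);
    case "integer": return typeof value === "number" && isFinite(value) && Math.floor(value) === value;
    case "boolean": return typeof value === "boolean";
    case "array": return Array.isArray(value);
    case "object": return typeof value === "object" && value !== null && !Array.isArray(value);
    case "null": return value === null;
    default: return false;
  }
}

// Returns a list of human-readable problems; an empty list means the record passed.
function findViolations(record: { [key: string]: unknown }, schema: FieldsSchema): string[] {
  const violations: string[] = [];
  for (const name of schema.required || []) {
    if (!(name in record)) violations.push("missing required field: " + name);
  }
  for (const name of Object.keys(record)) {
    const field = schema.properties[name];
    if (field === undefined) {
      // Mirrors the article's rule that fields absent from the schema are rejected.
      violations.push("field not in schema: " + name);
      continue;
    }
    const allowed = Array.isArray(field.type) ? field.type : [field.type];
    if (!allowed.some((t) => matchesType(record[name], t))) {
      violations.push("wrong type for field: " + name);
    }
  }
  return violations;
}

// A subset of the example schema, with rating allowed to be null.
const fields: FieldsSchema = {
  properties: {
    title: { type: "string" },
    price: { type: "number" },
    reviewCount: { type: "integer" },
    rating: { type: ["number", "null"] },
  },
  required: ["title", "price"],
};

const good = findViolations({ title: "Desk lamp", price: 24.99, rating: null }, fields);
const bad = findViolations({ title: "Desk lamp", price: "24.99", reviewCount: 3.5, color: "red" }, fields);
console.log(good);
console.log(bad);
```

Running a check like this over a handful of saved sample records gives a quick signal that code and schema have drifted apart, before the mismatch surfaces as a failed pushData call.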