The ApifyForge Testing Suite

Five cloud-powered testing tools for Apify actors: Schema Validator, Test Runner, Cloud Staging, Regression Suite, and MCP Debugger. How they work together and when to use each one.

By Ryan Clinton. Last updated: March 19, 2026

The ApifyForge Testing Suite is a set of five cloud-powered actors that test your Apify actors before, during, and after deployment. Each tool targets a specific failure mode: schema violations, functional regressions, production environment issues, MCP protocol errors, and output quality drift. Together they form a complete quality pipeline that catches problems before your users do.

Every tool in the suite runs as an Apify actor on your account. You trigger them through the ApifyForge dashboard or via the Apify API. Each tool charges a flat PPE fee per run — you pay once regardless of how many checks the tool performs internally. Results are cached in your dashboard so you never pay twice to view previous reports.

The five tools at a glance

| Tool | What it checks | When to use it | Cost |
|------|---------------|----------------|------|
| **Schema Validator** | Output fields match declared schema types | Before every push | $0.35/run |
| **Test Runner** | Multiple test cases with assertions | Before every push, in CI/CD | $0.35/suite |
| **Cloud Staging** | Full production environment validation | Before publishing to Store | $0.50/run |
| **Regression Suite** | Historical comparison: what changed since last run | After code changes, weekly | $0.35/suite |
| **MCP Debugger** | MCP server handshake, tools, latency | After MCP deploy, for troubleshooting | $0.15/debug |

Tool 1: Schema Validator ($0.35/run)

The Schema Validator fetches your actor's declared dataset schema from its latest build, runs the actor with your test input, then compares every output field against the schema definition. It checks:

- **Type mismatches**: schema says number, actor outputs "$19.99" as a string
- **Missing required fields**: schema declares phoneNumber but no output item has it
- **Undeclared fields**: output contains _debug or scrapedAt not in the schema
- **Nullable violations**: field has null values but schema doesn't declare nullable: true
- **Type inconsistencies**: rating is sometimes a string, sometimes a number
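The five checks above can be sketched in a few lines of Python. This is only an illustration of the idea, not the validator's actual implementation: the `validate` function, the report keys, and the simplified schema shape (`{"field": {"type": ..., "required": ..., "nullable": ...}}`) are all assumptions made for the example, and Apify's real dataset schema format is richer than this.

```python
def json_type(value):
    """Map a Python value to a JSON-style type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def validate(items, schema):
    """Compare output items against a simplified, hypothetical schema dict."""
    report = {"mismatches": [], "missing": [], "undeclared": [],
              "nullable": [], "inconsistent": []}

    # Collect every type observed for each field across all items.
    seen_types = {}
    for item in items:
        for field, value in item.items():
            seen_types.setdefault(field, set()).add(json_type(value))

    for field, types in seen_types.items():
        if field not in schema:
            report["undeclared"].append(field)
            continue
        spec = schema[field]
        non_null = types - {"null"}
        if "null" in types and not spec.get("nullable", False):
            report["nullable"].append(field)
        if len(non_null) > 1:
            report["inconsistent"].append(field)
        elif non_null and non_null != {spec["type"]}:
            report["mismatches"].append(field)

    # Required fields that never appear in any item.
    for field, spec in schema.items():
        if spec.get("required") and field not in seen_types:
            report["missing"].append(field)
    return report
```

Calling `validate(items, schema)` on a handful of dataset items returns one list of field names per check, which is enough to see which category a problem falls into before running the real tool.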

The report includes a **0-100 compliance score** weighted by severity. Errors deduct 10 points, warnings 3, undeclared fields 2, type inconsistencies 5. A score of 90+ means minor issues only. Below 70 means serious violations that will trigger maintenance flags.
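To make the weighting concrete, here is a minimal sketch of the arithmetic using the deductions quoted above. It assumes the score starts at 100 and is clamped at 0; the function names and the label for the 70 to 89 band are hypothetical, and the validator's real formula may differ in how it groups issues.

```python
# Point deductions per issue, as quoted in the text above.
DEDUCTIONS = {
    "error": 10,
    "warning": 3,
    "undeclared": 2,
    "type_inconsistency": 5,
}


def compliance_score(counts):
    """Return a 0-100 score from per-category issue counts.

    Assumption: start at 100, subtract weighted penalties, clamp at 0.
    """
    penalty = sum(DEDUCTIONS[kind] * n for kind, n in counts.items())
    return max(0, 100 - penalty)


def band(score):
    """Translate a score into the bands described in the text."""
    if score >= 90:
        return "minor issues only"
    if score < 70:
        return "serious violations"
    return "needs attention"  # hypothetical label; the text does not name this band
```

For example, `compliance_score({"error": 1, "warning": 1, "undeclared": 3})` subtracts 10 + 3 + 6 points under these assumptions.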

When to use the Schema Validator

- **Before every apify push**: catch schema drift before it reaches production
- **After changing output structure**: verify new fields are declared in the schema
- **When building a new dataset schema**: iterate by adding fields, validating, fixing, and repeating
- **When evaluating third-party actors**: check if their schema matches actual output

Example: validating a scraper

```json
{
  "targetActorId": "ryanclinton/website-contact-scraper",
  "testInput": {
    "urls": ["https://example.com"],
    "maxPagesPerDomain": 3
  }
}
```

The validator runs the actor, fetches the schema from the latest build, compares every field, and returns a report like:

Score: 72/100 — FAIL
Mismatches:
  [error] price: expected number, got string
  [warning] email: null values found, schema says non-null
Undeclared: _debug, scrapedAt, rawHtml
Missing: phoneNumber

Tool 2: Test Runner ($0.35/suite)

The Test Runner executes your actor multiple times with different inputs, each with its own assertion set. This catches functional issues that single-input testing misses: edge cases, boundary conditions, and input-specific bugs.

Assertion types

| Assertion | What it checks | Example |
|-----------|---------------|---------|
| minResults | Dataset has at least N items | "minResults": 3 |
| maxResults | Dataset has at most N items | "maxResults": 100 |
| requiredFields | Fields exist with non-null values | ["name", "url"] |
| fieldTypes | Field values match declared types | {"rating": "number"} |
| maxDuration | Test completes within N seconds | "maxDuration": 60 |
| noEmptyFields | No null, empty string, or empty array | ["name", "email"] |
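The assertion semantics in the table can be illustrated with a small evaluator. This is a sketch under stated assumptions, not the Test Runner's code: `check_assertions`, `matches_type`, and the set of type names in `TYPE_MAP` are hypothetical, and null values are skipped by the `fieldTypes` check here because `requiredFields` already covers them.

```python
TYPE_MAP = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def matches_type(value, type_name):
    # bool is a subclass of int in Python, so exclude it from "number".
    if type_name == "number" and isinstance(value, bool):
        return False
    return isinstance(value, TYPE_MAP[type_name])


def check_assertions(items, assertions, duration_secs):
    """Return a list of failure messages; an empty list means the case passed."""
    failures = []
    if len(items) < assertions.get("minResults", 0):
        failures.append(f"minResults: got {len(items)} items")
    if "maxResults" in assertions and len(items) > assertions["maxResults"]:
        failures.append(f"maxResults: got {len(items)} items")
    for field in assertions.get("requiredFields", []):
        if any(item.get(field) is None for item in items):
            failures.append(f"requiredFields: {field} missing or null")
    for field, type_name in assertions.get("fieldTypes", {}).items():
        if any(item.get(field) is not None
               and not matches_type(item[field], type_name) for item in items):
            failures.append(f"fieldTypes: {field} is not {type_name}")
    if "maxDuration" in assertions and duration_secs > assertions["maxDuration"]:
        failures.append(f"maxDuration: took {duration_secs}s")
    for field in assertions.get("noEmptyFields", []):
        if any(item.get(field) in (None, "", []) for item in items):
            failures.append(f"noEmptyFields: {field} is empty")
    return failures
```

Returning messages rather than a single boolean keeps the sketch close to how a report would list each failed assertion separately.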

Example: multi-case test suite

```json
{
  "targetActorId": "ryanclinton/google-maps-email-extractor",
  "testCases": [
    {
      "name": "Basic search",
      "input": { "query": "plumbers Chicago", "maxResults": 5 },
      "assertions": {
        "minResults": 3,
        "requiredFields": ["businessName", "address"],
        "maxDuration": 60
      }
    },
    {
      "name": "Single result",
      "input": { "query": "Statue of Liberty", "maxResults": 1 },
      "assertions": {
        "minResults": 1,
        "maxResults": 1,
        "requiredFields": ["businessName", "rating"]
      }
    },
    {
      "name": "Performance check",
      "input": { "query": "restaurants NYC", "maxResults": 20 },
      "assertions": {
        "minResults": 15,
        "maxDuration": 120,
        "noEmptyFields": ["businessName"]
      }
    }
  ]
}
```

Test cases run **sequentially** to avoid overwhelming the target actor. One PPE charge covers the entire suite regardless of how many test cases you include.

When to use the Test Runner

- **Before every deploy**: run your standard test suite as a quality gate
- **In CI/CD pipelines**: trigger via API, parse the JSON report, block deploys on failure
- **When onboarding a new actor**: establish baseline test cases that define "working correctly"
- **For edge case coverage**: test empty inputs, special characters, boundary values

Tool 3: Cloud Staging ($0.50/run)

Cloud Staging runs your actor in Apify's actual production environment — the same Docker container, network, and proxy infrastructure your users will see. It validates:

- **Docker build success**: your Dockerfile builds on Apify's infrastructure
- **Schema compliance**: output matches the declared dataset schema in production
- **Structural validation**: field consistency, type consistency, empty array detection
- **Custom assertions**: minResults, requiredFields, fieldTypes (same as Test Runner)
- **Run success**: the actor completes without crashing

The local-vs-cloud gap

An actor that works locally can still fail in the cloud. Common causes:

- **Missing dependencies**: a package in devDependencies is used in production code
- **Docker build issues**: Dockerfile installs packages in a different order than local npm
- **Proxy differences**: local runs use your IP, cloud runs use Apify's proxy pool
- **Memory limits**: local machines often have 16 GB of RAM or more, while Apify actors get 256 MB to 4096 MB
- **Network routing**: some websites block Apify's IP ranges but not your home IP

Cloud Staging catches all of these by running in the real environment.
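The missing-dependency cause is also cheap to check locally before a staging run. The sketch below is a hypothetical helper, not part of Cloud Staging: it assumes you already have the parsed package.json as a dict and a set of package names imported by production code, and it reports names that are declared only under devDependencies.

```python
def dev_only_imports(package_json, imported):
    """Return imported package names that appear only in devDependencies.

    package_json: parsed contents of package.json (a dict)
    imported: set of package names used by production code (assumed known)
    """
    prod = set(package_json.get("dependencies", {}))
    dev = set(package_json.get("devDependencies", {}))
    return {name for name in imported if name in dev and name not in prod}
```

Any name this returns would be absent from a production install that omits devDependencies, which is one way the local-vs-cloud gap shows up.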

When to use Cloud Staging

- **Before publishing to the Store**: the highest-stakes moment for your actor
- **After Dockerfile changes**: verify the build works on Apify's infrastructure
- **After dependency updates**: catch breaking changes from package upgrades
- **When switching proxy types**: verify the new proxy works in production

Tool 4: Regression Suite ($0.35/suite)

The Regression Suite extends the Test Runner with historical comparison. It runs the same test cases and adds a classification layer: was this test passing before? Is it failing now? Each test gets one of six statuses:

| Previous | Current | Classification | What it means |
|----------|---------|---------------|---------------|
| pass | pass | **pass** | Stable, no change |
| pass | fail | **regression** | Something broke |
| fail | pass | **resolved** | Something got fixed |
| fail | fail | **fail** | Known issue, unchanged |
| (new) | pass | **new_pass** | New test, passes |
| (new) | fail | **new_fail** | New test, fails |
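The classification table maps directly onto a small function. The sketch below is an illustration of the table, not the Regression Suite's code: `classify` and `classify_suite` are hypothetical names, and representing a new test as `None` is an assumption made for the example.

```python
def classify(previous, current):
    """Classify one test.

    previous: "pass", "fail", or None when the test has no prior result
    current: "pass" or "fail"
    """
    if previous is None:
        return "new_pass" if current == "pass" else "new_fail"
    if previous == "pass":
        return "pass" if current == "pass" else "regression"
    return "resolved" if current == "pass" else "fail"


def classify_suite(previous_results, current_results):
    """Classify every test in the current run against the prior run.

    Both arguments map test name to "pass" or "fail".
    """
    return {
        name: classify(previous_results.get(name), status)
        for name, status in current_results.items()
    }
```

On a first run `previous_results` is empty, so every test comes back as new_pass or new_fail, matching the behavior the table describes for new tests.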

Automatic previous result injection

When you use the Regression Suite through the ApifyForge dashboard, previous results are **automatically loaded** from your last cached run. You don't need to manually track or pass previous results — the API route handles it.

On first run, all tests are classified as new_pass or new_fail. On subsequent runs, the system compares against the prior run and highlights regressions and resolutions.

When to use the Regression Suite

- **After every code change**: detect regressions before they reach users
- **Weekly scheduled runs**: catch upstream changes (website redesigns, API changes)
- **After migrations**: when switching scraping approach, run the suite before and after
- **For release notes**: "2 regressions fixed, 1 new test added, 0 regressions introduced"

Tool 5: MCP Debugger ($0.15/debug)

The MCP Debugger sends a real MCP protocol handshake to any standby URL and diagnoses connection issues. It performs two requests:

1. **Initialize**: sends a JSON-RPC initialize request with protocol version 2025-03-26
2. **Tools list**: sends a tools/list request to discover available tools
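For reference, the two requests look roughly like the JSON-RPC 2.0 payloads built below. This is a sketch of the message shapes only: the helper names, the request ids, and the `clientInfo` values are placeholders chosen for the example, and it does not show how the debugger itself transports or authenticates the requests.

```python
import json

PROTOCOL_VERSION = "2025-03-26"  # version named in the text above


def initialize_request(request_id=1):
    """Build a JSON-RPC 2.0 initialize request for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            # Placeholder client identity, not the debugger's real one.
            "clientInfo": {"name": "example-debugger", "version": "0.1.0"},
        },
    }


def tools_list_request(request_id=2):
    """Build a JSON-RPC 2.0 tools/list request."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


if __name__ == "__main__":
    print(json.dumps(initialize_request(), indent=2))
    print(json.dumps(tools_list_request(), indent=2))
```

Posting these bodies to a standby URL by hand is a quick way to cross-check a debugger report when a server behaves unexpectedly.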

The report includes connection status (healthy/degraded/unhealthy/unreachable), latency, transport type, protocol version, server name, tool count, tool names with descriptions, detected issues, and actionable fix suggestions.

Diagnostic mappings

| Symptom | Diagnosis | Fix |
|---------|-----------|-----|
| HTTP 404 | Endpoint not found | Check webServerMcpPath is "/mcp" in actor.json |
| HTTP 401/403 | Auth failed | Provide a valid API token |
| Timeout | Server not running | Enable usesStandbyMode: true, run actor once to warm up |
| DNS failure | URL doesn't resolve | Check URL for typos |
| 0 tools | No tools registered | Register tools before transport.handleRequest() |
| >5000ms latency | Cold start | First request after idle triggers warmup |
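The mapping table can be read as a simple rule lookup. The sketch below encodes the rows as written; the `diagnose` function, its parameters, and the order in which rules are checked are assumptions for illustration and may not match how the debugger prioritizes symptoms.

```python
def diagnose(status_code=None, timed_out=False, dns_failed=False,
             tool_count=None, latency_ms=None):
    """Return a list of (diagnosis, fix) pairs for the observed symptoms."""
    # Connection-level failures end the check early: nothing else was observed.
    if dns_failed:
        return [("URL doesn't resolve", "Check URL for typos")]
    if timed_out:
        return [("Server not running",
                 "Enable usesStandbyMode: true, run actor once to warm up")]

    findings = []
    if status_code == 404:
        findings.append(("Endpoint not found",
                         'Check webServerMcpPath is "/mcp" in actor.json'))
    if status_code in (401, 403):
        findings.append(("Auth failed", "Provide a valid API token"))
    if tool_count == 0:
        findings.append(("No tools registered",
                         "Register tools before transport.handleRequest()"))
    if latency_ms is not None and latency_ms > 5000:
        findings.append(("Cold start",
                         "First request after idle triggers warmup"))
    return findings
```

A healthy server with registered tools and low latency yields an empty list, while a slow server with no tools yields two findings at once.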

When to use the MCP Debugger

- **After deploying a new MCP server**: verify the handshake works before sharing the URL
- **When users report connection issues**: reproduce and diagnose with a structured report
- **For health monitoring**: schedule periodic checks across all your MCP servers
- **Before publishing MCP updates**: verify tools are still registered after code changes

Combining the tools: the recommended workflow

The five tools work best as a pipeline, not in isolation. Here is the recommended workflow for a typical actor deployment:

Pre-push (catches 80% of issues)

1. **Schema Validator**: Run against your actor with test input. Fix any type mismatches or undeclared fields. This takes 1-2 minutes and costs $0.35.
2. **Test Runner**: Run your standard test suite (3-5 test cases). Fix any assertion failures. This takes 2-5 minutes and costs $0.35.

Pre-publish (catches the remaining 20%)

3. **Cloud Staging** — Run in Apify's production environment. Verify Docker build, schema compliance, and output quality in the real environment. This takes 2-5 minutes and costs $0.50.

Post-publish (ongoing quality)

4. **Regression Suite**: Run weekly or after every code change. Compare results against previous runs. Investigate any regressions immediately. This costs $0.35 per run.
5. **MCP Debugger**: For MCP servers only. Run after every deploy and on a weekly schedule. This costs $0.15 per debug.

Total cost per deployment cycle

| Step | Tool | Cost |
|------|------|------|
| Pre-push | Schema Validator | $0.35 |
| Pre-push | Test Runner | $0.35 |
| Pre-publish | Cloud Staging | $0.50 |
| Post-publish | Regression Suite | $0.35 |
| **Total** | | **$1.55** |
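The total in the table covers the four tools that apply to every actor; the MCP Debugger is listed separately because it applies to MCP servers only. A short sketch of the arithmetic, using the per-run prices from this guide (the `cycle_cost` helper is hypothetical):

```python
from decimal import Decimal

# Per-run prices as listed in this guide.
CORE_COSTS = {
    "Schema Validator": Decimal("0.35"),
    "Test Runner": Decimal("0.35"),
    "Cloud Staging": Decimal("0.50"),
    "Regression Suite": Decimal("0.35"),
}
MCP_DEBUGGER_COST = Decimal("0.15")


def cycle_cost(include_mcp=False):
    """Sum the cost of one deployment cycle, optionally adding one MCP debug run."""
    total = sum(CORE_COSTS.values(), Decimal("0"))
    if include_mcp:
        total += MCP_DEBUGGER_COST
    return total
```

Decimal is used instead of float so that the cents add up exactly.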

For context, a single maintenance flag on the Apify Store can reduce your actor's visibility for weeks, costing far more in lost PPE revenue than $1.55 spent on pre-deploy testing.

API integration

Every tool in the suite can be triggered via the Apify API, making them ideal for CI/CD pipelines.

Python example: CI/CD quality gate

```python
import sys

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Step 1: Schema validation
schema_run = client.actor("ryanclinton/actor-schema-validator").call(run_input={
    "targetActorId": "your-username/your-actor",
    "testInput": {"query": "test", "maxResults": 3},
})
# Take the first item in the run's default dataset as the report.
schema_report = list(client.dataset(schema_run["defaultDatasetId"]).iterate_items())[0]

if not schema_report["passed"]:
    print(f"Schema validation FAILED (score: {schema_report['score']})")
    for m in schema_report["mismatches"]:
        print(f"  [{m['severity']}] {m['path']}: expected {m['expected']}, got {m['actual']}")
    sys.exit(1)  # non-zero exit code blocks the pipeline

# Step 2: Test suite
test_run = client.actor("ryanclinton/actor-test-runner").call(run_input={
    "targetActorId": "your-username/your-actor",
    "testCases": [
        {"name": "Basic", "input": {"query": "test"}, "assertions": {"minResults": 1}},
    ],
})
test_report = list(client.dataset(test_run["defaultDatasetId"]).iterate_items())[0]

if test_report["failed"] > 0:
    print(f"Test suite FAILED: {test_report['failed']}/{test_report['totalTests']} failed")
    sys.exit(1)

print("All checks passed, safe to deploy")
```

Dashboard access

All five tools are available in the ApifyForge dashboard under the **Tools** section in the sidebar:

- **/dashboard/tools/schema-validator**: Schema Validator
- **/dashboard/tools/test-runner**: Test Runner
- **/dashboard/tools/cloud-staging**: Cloud Staging
- **/dashboard/tools/regression-tests**: Regression Suite
- **/dashboard/tools/mcp-debugger**: MCP Debugger

Each page follows the same pattern: configure inputs, click Run, view results. Previous results are cached and loaded automatically on page load.

Related guides

- **Actor Testing Best Practices** (/learn/actor-testing): Local testing strategies, pre-push hooks, and debugging failed runs
- **Store SEO Optimization** (/learn/store-seo): How quality score (which testing improves) affects Store ranking
- **Schema Tools** (/learn/schema-tools): Deep dive into schema validation and the Schema Registry
- **PPE Pricing** (/learn/ppe-pricing): How to price your actors and track revenue
