Output Completeness Monitor
Track output volume trends and detect data quality drift across your actor fleet. Alerts when actors start producing fewer results than their historical baseline.
Pricing
Pay Per Event model. You only pay for what you use.
| Event | Description | Price |
|---|---|---|
| completeness-check | Charged per completeness analysis. | $0.05 |

Example: 100 events = $5.00 · 1,000 events = $50.00
Documentation
Detect silent data quality degradation across your Apify actor fleet. When an actor stops throwing errors but starts returning fewer results than it used to — 100 results last week, 30 this week — that is the hardest kind of failure to catch. This actor catches it automatically by comparing current result counts against historical baselines for every actor in your account.
Why this matters: Actors break in two ways. Loud failures throw errors and show up in your run history as FAILED. Silent failures succeed with status SUCCEEDED but return partial or empty data. A site changes its pagination, an API starts rate-limiting without returning errors, or a selector stops matching — the run completes fine, but your data is incomplete. This actor detects those silent failures.
Features
- Automatic baseline comparison — Compares recent successful run output counts against historical averages for each actor
- Configurable sensitivity — Set your own degradation threshold (default 50% drop) and minimum runs required for analysis
- Trend detection — Identifies whether degradation is declining, stable, volatile, or improving over time
- Smart data splitting — Works even when all runs are within the analysis window by splitting data proportionally
- Fleet-wide coverage — One run checks every actor in your account, regardless of fleet size
- Actionable recommendations — Each degraded actor gets a specific recommendation based on severity and trend
Use cases
Scheduled data quality monitoring
Run weekly on a schedule. Get alerted when any scraper starts returning fewer results than its baseline. Catch site changes before your downstream data consumers notice gaps.
Post-deployment validation
After pushing actor updates, run this to verify output volume hasn't dropped. Compare before and after by adjusting the hoursBack window.
SLA compliance
If you deliver data to clients who expect consistent volumes, use this to verify your actors maintain output levels. A 50% drop in results could mean a 50% drop in data delivery.
Multi-actor fleet management
When you run dozens or hundreds of actors, manually checking output counts is impractical. This actor surfaces degradation across your entire fleet in one dataset.
Scraper maintenance prioritization
Sort degraded actors by drop percentage to prioritize which scrapers need immediate attention versus which can wait.
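The prioritization step above can be scripted directly against the output dataset. A minimal sketch (the field names come from the output schema documented below; the sample records are hypothetical):

```python
def prioritize(items):
    """Sort degraded actors so the worst drops surface first.

    `items` is assumed to be the list of dataset records from one run."""
    degraded = [r for r in items if r.get("status") == "DEGRADED"]
    return sorted(degraded, key=lambda r: r["dropPercentage"], reverse=True)

# Hypothetical records illustrating the record shape
items = [
    {"actorName": "shop-scraper", "status": "DEGRADED", "dropPercentage": 45},
    {"actorName": "news-scraper", "status": "HEALTHY"},
    {"actorName": "contact-scraper", "status": "DEGRADED", "dropPercentage": 62},
]
for r in prioritize(items):
    print(f"{r['actorName']}: {r['dropPercentage']}% drop")
```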
Input
| Field | Type | Required | Description | Default |
|---|---|---|---|---|
| `apiToken` | String (secret) | Yes | Your Apify API token | — |
| `hoursBack` | Integer (24-2160) | No | How far back to analyze run history | 168 (7 days) |
| `degradationThreshold` | Number (0.05-0.95) | No | Alert if results drop below this fraction of the historical average (0.5 = 50%) | 0.5 |
| `minRunsToAnalyze` | Integer (2-50) | No | Skip actors with fewer successful runs than this | 3 |
Example input
```json
{
  "apiToken": "apify_api_YOUR_TOKEN_HERE",
  "hoursBack": 168,
  "degradationThreshold": 0.5,
  "minRunsToAnalyze": 3
}
```
Output
Each actor with detected degradation produces one record. The final record is always a fleet summary.
Degradation report example
```json
{
  "actorName": "website-contact-scraper",
  "actorId": "BCq991ez5HObhS5n0",
  "status": "DEGRADED",
  "currentAvgResults": 3.2,
  "historicalAvgResults": 8.5,
  "dropPercentage": 62,
  "recentRuns": 5,
  "totalSuccessfulRuns": 18,
  "trend": "declining",
  "latestRunResults": 2,
  "recommendation": "Result count dropped 62% and continues to decline — possible site structure change, rate limiting, or partial blocking",
  "checkedAt": "2026-03-18T14:30:00.000Z"
}
```
Fleet summary example
```json
{
  "type": "summary",
  "totalActors": 294,
  "actorsAnalyzed": 180,
  "actorsDegraded": 4,
  "actorsHealthy": 162,
  "actorsInsufficientData": 14,
  "worstDegradation": "website-contact-scraper (62% drop)",
  "checkedAt": "2026-03-18T14:30:00.000Z"
}
```
Output fields — Degradation report
| Field | Type | Description |
|---|---|---|
| `actorName` | String | Actor name |
| `actorId` | String | Apify actor ID |
| `status` | String | `DEGRADED`, `HEALTHY`, or `INSUFFICIENT_DATA` |
| `currentAvgResults` | Number | Average results per successful run in the recent period |
| `historicalAvgResults` | Number | Average results per successful run in the historical period |
| `dropPercentage` | Integer | Percentage drop from historical to current (62 = 62% fewer results) |
| `recentRuns` | Integer | Number of recent successful runs analyzed |
| `totalSuccessfulRuns` | Integer | Total successful runs across both periods |
| `trend` | String | `declining`, `stable`, `improving`, or `volatile` |
| `latestRunResults` | Integer | Result count from the most recent successful run |
| `recommendation` | String | Actionable suggestion based on severity and trend |
| `checkedAt` | String | ISO timestamp |
Output fields — Fleet summary
| Field | Type | Description |
|---|---|---|
| `type` | String | Always "summary" |
| `totalActors` | Integer | Total actors in your account |
| `actorsAnalyzed` | Integer | Actors with enough data to analyze |
| `actorsDegraded` | Integer | Actors with output below threshold |
| `actorsHealthy` | Integer | Actors with stable or improving output |
| `actorsInsufficientData` | Integer | Actors skipped due to zero historical baseline |
| `worstDegradation` | String | Actor with the largest percentage drop |
| `checkedAt` | String | ISO timestamp |
How to use the API
Python
```python
from apify_client import ApifyClient

client = ApifyClient(token="YOUR_API_TOKEN")

run = client.actor("ryanclinton/output-completeness-monitor").call(
    run_input={
        "apiToken": "YOUR_API_TOKEN",
        "hoursBack": 168,
        "degradationThreshold": 0.5,
    }
)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("type") == "summary":
        print(f"Fleet: {item['actorsAnalyzed']} analyzed, {item['actorsDegraded']} degraded")
    else:
        print(f"[{item['status']}] {item['actorName']}: {item['dropPercentage']}% drop ({item['trend']})")
        print(f"  {item['recommendation']}")
```
JavaScript / Node.js
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('ryanclinton/output-completeness-monitor').call({
    apiToken: 'YOUR_API_TOKEN',
    hoursBack: 168,
    degradationThreshold: 0.5,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();

const summary = items.find(i => i.type === 'summary');
console.log(`Fleet: ${summary.actorsAnalyzed} analyzed, ${summary.actorsDegraded} degraded`);

const degraded = items.filter(i => i.status === 'DEGRADED');
degraded.forEach(r => {
    console.log(`${r.actorName}: ${r.dropPercentage}% drop (${r.trend})`);
    console.log(`  ${r.recommendation}`);
});
```
cURL
```shell
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~output-completeness-monitor/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "apiToken": "YOUR_API_TOKEN",
    "hoursBack": 168,
    "degradationThreshold": 0.5,
    "minRunsToAnalyze": 3
  }'
```
How it works
- Fetches all your actors via the Apify API using your token
- Pulls run history for each actor (up to 200 most recent runs)
- Filters to successful runs only — ignores FAILED, TIMED-OUT, and ABORTED runs since those are a separate problem
- Splits runs into two periods — recent (within `hoursBack`) and historical (before `hoursBack`). If all runs fall within the window, splits the data proportionally (60% historical, 40% recent)
- Computes averages — average dataset item count per successful run in each period
- Detects degradation — flags actors where `currentAvg / historicalAvg` falls below the `degradationThreshold`
- Analyzes trends — splits the full run history into halves and compares averages. A high coefficient of variation indicates volatility
- Generates recommendations — based on drop severity (30%, 50%, 80%) and trend direction
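The splitting and comparison steps above can be sketched as follows. This is a simplified illustration of the described logic, not the actor's actual implementation; the 60/40 fallback split and the newest-first ordering are assumptions taken from the description:

```python
from datetime import datetime, timedelta, timezone

def analyze(runs, hours_back=168, threshold=0.5):
    """runs: list of (finished_at, item_count) for SUCCEEDED runs, newest first.

    Returns (drop_fraction, degraded) or None when there is no usable baseline.
    drop_fraction can be negative if output is improving."""
    cutoff = datetime.now(timezone.utc) - timedelta(hours=hours_back)
    recent = [n for t, n in runs if t >= cutoff]
    historical = [n for t, n in runs if t < cutoff]
    if not historical:
        # All runs fall inside the window: fall back to a proportional split
        # (newest 40% as the recent period, oldest 60% as the baseline).
        split = int(len(runs) * 0.4)
        recent = [n for _, n in runs[:split]]
        historical = [n for _, n in runs[split:]]
    if not recent or not historical or sum(historical) == 0:
        return None  # zero or missing baseline — skip this actor
    cur = sum(recent) / len(recent)
    hist = sum(historical) / len(historical)
    return 1 - cur / hist, (cur / hist) < threshold
```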
Understanding the results
Drop percentage
- 30-50% drop — Gradual degradation. May indicate soft rate limiting, reduced content on target sites, or pagination issues
- 50-80% drop — Significant degradation. Likely a site structure change, API endpoint modification, or partial blocking
- 80%+ drop — Near-total failure. Almost certainly a breaking change: new anti-bot measures, complete site redesign, or API deprecation
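The severity bands above map naturally to a small lookup; a sketch with illustrative labels (the band edges come from the list above, the return strings are hypothetical):

```python
def severity(drop_pct):
    """Map a drop percentage to the severity bands described above."""
    if drop_pct >= 80:
        return "near-total failure"
    if drop_pct >= 50:
        return "significant degradation"
    if drop_pct >= 30:
        return "gradual degradation"
    return "within normal variance"
```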
Trend values
| Trend | Meaning |
|---|---|
| `declining` | Second half of runs produces fewer results than first half — getting worse |
| `stable` | Output volume consistent (but may still be below historical baseline) |
| `improving` | Second half produces more results than first half — recovering |
| `volatile` | High variance between runs — inconsistent extraction |
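One way these four labels could be derived, assuming the halves-comparison and coefficient-of-variation approach described in "How it works" (the 0.5 CV cutoff and 15% change margin are illustrative, not the actor's actual thresholds):

```python
from statistics import mean, pstdev

def classify_trend(counts, volatile_cv=0.5, change=0.15):
    """counts: per-run result counts in chronological order (oldest first)."""
    if len(counts) < 2:
        return "stable"
    avg = mean(counts)
    cv = pstdev(counts) / avg if avg else 0  # coefficient of variation
    if cv > volatile_cv:
        return "volatile"
    half = len(counts) // 2
    first, second = mean(counts[:half]), mean(counts[half:])
    if second < first * (1 - change):
        return "declining"
    if second > first * (1 + change):
        return "improving"
    return "stable"
```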
Limitations
- 200 run limit per actor — Only fetches the 200 most recent runs per actor. Very high-frequency actors may not have enough historical data outside the analysis window.
- Successful runs only — Only compares SUCCEEDED runs. If an actor starts failing entirely, use the Actor Health Monitor instead.
- Zero-result baseline — Actors that historically return zero results are skipped, since there is no meaningful baseline to compare against.
- No per-field analysis — Compares total result counts, not individual field completeness. An actor could return the same number of items but with missing fields — this monitor would not catch that.
- Stats availability — Relies on run stats (`outputItems`, `resultCount`, `datasetItemCount`) being populated. Some very old runs may not have these stats.
FAQ
Q: How is this different from the Actor Health Monitor?
A: The Actor Health Monitor catches loud failures — runs that crash, time out, or return errors. This monitor catches silent failures — runs that succeed but return less data than expected. Use both together for complete fleet coverage.

Q: What if an actor intentionally returns different amounts of data?
A: Set the `degradationThreshold` lower (e.g., 0.2) for actors with naturally variable output, or increase `minRunsToAnalyze` to smooth out variability.

Q: How much does it cost to run?
A: $0.05 per completeness check. One run covers your entire fleet. A weekly check costs about $0.22/month.

Q: Can I monitor specific actors only?
A: Currently it monitors all actors in your account. Filter the output dataset to focus on specific actors.

Q: What if I just deployed a new actor with only 2 runs?
A: It will be skipped (the default `minRunsToAnalyze` is 3). Once it accumulates enough runs, it will be included in future checks.

Q: Does this work with PPE-priced actors?
A: Yes. It compares dataset item counts regardless of pricing model.
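Until per-actor filtering is built in, the "filter the output dataset" approach from the FAQ can look like this (the watchlist names are hypothetical; `items` is the fetched dataset record list):

```python
WATCHLIST = {"website-contact-scraper", "shop-scraper"}  # hypothetical names

def filter_watchlist(items, watchlist=WATCHLIST):
    """Keep only records for actors you care about, plus the fleet summary."""
    return [r for r in items
            if r.get("type") == "summary" or r.get("actorName") in watchlist]
```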
Integrations
Use this actor with:
- Zapier for automated workflows when degradation is detected
- Make for complex remediation automations
- Google Sheets for tracking output trends over time
- The Apify API for programmatic access
- Actor Health Monitor for complete fleet monitoring (failures + degradation)
Related actors
- ryanclinton/actor-health-monitor — Monitor actor failures, diagnose root causes, track trends
- ryanclinton/cost-watchdog — Monitor and control Apify spending
- ryanclinton/actor-portfolio-analytics — Analyze your actor portfolio performance
Pricing
- $0.05 per completeness check — one check covers your entire fleet
| Frequency | Monthly cost |
|---|---|
| Weekly | ~$0.22 |
| Daily | ~$1.50 |
| Twice daily | ~$3.00 |
Changelog
v1.0.0 (2026-03-18)
- Initial release
- Baseline comparison: current vs historical result counts
- Trend detection with declining/stable/improving/volatile classification
- Smart data splitting for actors with limited historical runs
- Fleet-wide summary with worst degradation identification
- Actionable recommendations based on severity and trend