Cut your token costs by 40-70%
ApifyForge LLM Output Optimizer is a data optimization tool that analyzes your Apify actor output and identifies low-value fields that waste LLM tokens: raw HTML, internal IDs, timestamps, and debug data. It scores every field by information density, recommends what to drop, and shows exactly how many tokens you'll save (typically a 40-70% reduction) for $0.20 per analysis.
Actor output is designed for data storage, not LLM consumption. Fields like rawHtml (2,000+ characters), internalId, and crawledAt consume tokens without adding information value for AI pipeline tasks like summarization, extraction, or classification. ApifyForge LLM Output Optimizer identifies these fields so you can filter them before the LLM API call.
Calculates token cost for every field across your sample output using character-based approximation (~4 chars/token). Shows which fields consume the most tokens.
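The per-field token accounting can be sketched in a few lines of Python. This is an illustrative approximation, not the tool's implementation: it serializes each value to JSON, applies the ~4 chars/token ratio from the description, and averages over the sample (the field names are hypothetical).

```python
import json

def estimate_field_tokens(items):
    """Approximate average token cost of each field across sample items,
    using the ~4 characters per token heuristic."""
    totals = {}
    for item in items:
        for field, value in item.items():
            chars = len(json.dumps(value, ensure_ascii=False))
            totals[field] = totals.get(field, 0) + chars
    return {f: round(chars / 4 / len(items)) for f, chars in totals.items()}

sample = [
    {"url": "https://example.com", "rawHtml": "<html>" + "x" * 994 + "</html>"},
    {"url": "https://example.org", "rawHtml": "<html>" + "y" * 994 + "</html>"},
]
print(estimate_field_tokens(sample))  # rawHtml dwarfs url in token cost
```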
Scores each field as high-value (name, url, email), medium-value (generic text), or low-value (IDs, timestamps, raw HTML). Based on information density for LLM tasks.
Fields with >80% null values are flagged for removal — they consume tokens on empty data in most items while providing value in a minority of cases.
Fields averaging >500 characters (raw HTML, page content, base64 images) are flagged for truncation or removal. A single rawHtml field can consume 500+ tokens per item.
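The long-field heuristic amounts to an average-length check per field. A minimal sketch, assuming the 500-character threshold stated above (the `rawHtml`/`title` field names are illustrative):

```python
def flag_long_fields(items, max_avg_chars=500):
    """Return fields whose average stringified length exceeds the threshold,
    as candidates for truncation or removal."""
    lengths = {}
    for item in items:
        for field, value in item.items():
            lengths.setdefault(field, []).append(len(str(value)))
    return [f for f, ls in lengths.items() if sum(ls) / len(ls) > max_avg_chars]

items = [{"rawHtml": "<div>" * 300, "title": "Home"}]
print(flag_long_fields(items))  # only rawHtml exceeds 500 chars on average
```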
Generates a recommended field list that keeps high-value data and drops the rest. Use directly in your pipeline to filter output before sending to any LLM API.
Original vs optimized token count with percentage savings. See the exact impact before making changes — typical savings are 40-70% of total token consumption.
There are several ways to reduce token consumption when feeding web scraping data to LLMs. Each trades off effort, precision, and ongoing maintenance.
| Method | Typical savings | Setup time | Cost |
|---|---|---|---|
| ApifyForge LLM Output Optimizer | 40-70% with field-level analysis | Under 30 seconds | $0.20/analysis |
| Manual field review | 20-50% (varies by developer knowledge) | 30-60 minutes per actor | Free (time cost) |
| Custom preprocessing script | Variable — depends on implementation | 1-3 hours per actor | Free (development time) |
| No optimization (raw output to LLM) | 0% — full token waste | Zero | 2-3x higher LLM API costs |
```json
{
  "actorName": "ryanclinton/website-contact-scraper",
  "originalTokens": 4200,
  "optimizedTokens": 1680,
  "savingsPercent": 60,
  "fieldAnalysis": [
    { "field": "rawHtml", "tokens": 2800, "value": "low", "action": "drop" },
    { "field": "url", "tokens": 45, "value": "high", "action": "keep" },
    { "field": "emails", "tokens": 120, "value": "high", "action": "keep" }
  ],
  "optimizedSchema": ["url", "domain", "emails", "phones"],
  "recommendations": ["Drop 3 low-value fields — saves 60%"]
}
```

Connect your Apify token and enter the actor ID to analyze
ApifyForge LLM Output Optimizer reads output from a recent run and scores every field by information density
Get an optimized field list with exact token savings — ready to apply to your LLM pipeline
Several approaches exist for reducing token consumption in LLM pipelines fed by web scraping data. The right choice depends on how many actors you optimize and your team's LLM expertise.
Examine actor output JSON, identify large or irrelevant fields, and manually create a filter list. Requires understanding both the data and LLM tokenization. Takes 30-60 minutes per actor and produces inconsistent results across team members.
Best for: developers who deeply understand their data and LLM requirements.
Write a Node.js or Python script that filters, truncates, or transforms actor output before sending to the LLM API. Fully customizable but requires 1-3 hours per actor and ongoing maintenance as actor schemas change.
Best for: teams with dedicated data engineering resources and stable actor schemas.
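A custom preprocessing script of the kind described above is usually a filter-plus-truncate pass. A minimal hand-rolled sketch (the field list and 500-character cutoff are assumptions you would maintain yourself):

```python
def preprocess(items, keep_fields, truncate_at=500):
    """Keep only hand-picked fields and truncate long string values
    before sending items to an LLM API."""
    out = []
    for item in items:
        slim = {}
        for field in keep_fields:
            value = item.get(field)
            if isinstance(value, str) and len(value) > truncate_at:
                value = value[:truncate_at]
            slim[field] = value
        out.append(slim)
    return out

raw = [{"url": "https://a.com", "rawHtml": "<p>" * 1000, "internalId": 42}]
print(preprocess(raw, ["url", "rawHtml"]))
```

The maintenance burden mentioned above lives in `keep_fields`: every time the actor's schema changes, the list has to be reviewed by hand.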
Tools like LangChain's text splitters or LlamaIndex's node parsers manage context windows but don't optimize the source data. They chunk and paginate large inputs rather than removing low-value fields.
Best for: handling large text documents, not structured JSON from web scrapers.
Feed the full actor output to the LLM. Simple, but it wastes 40-70% of tokens on low-value data. At GPT-4 prices ($0.03/1K input tokens), 1,000 items at 4,200 tokens each cost $126 in input tokens, roughly $76 of which is spent on low-value fields at a 60% waste rate.
Best for: prototyping only — not sustainable for production pipelines.
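The cost arithmetic for the no-optimization case works out as follows (the $0.03/1K input-token price and 60% waste share are the figures assumed in the comparison above):

```python
# Worked cost estimate for sending raw actor output to an LLM.
items = 1_000
tokens_per_item = 4_200
price_per_1k = 0.03   # assumed GPT-4-era input price per 1K tokens
waste_share = 0.60    # low-value fields' share of tokens

total_cost = items * tokens_per_item / 1_000 * price_per_1k
wasted_cost = total_cost * waste_share
print(f"total input cost: ${total_cost:.2f}, wasted: ${wasted_cost:.2f}")
```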
Automated field-level analysis with value classification, null ratio detection, and exact token savings calculation. Generates an optimized field list in under 30 seconds. $0.20 per analysis, model-agnostic.
Best for: developers building LLM pipelines with Apify data who want fast, data-driven optimization.
Every optimization analysis executes on your own Apify account at the standard pay-per-event rate of $0.20 per analysis. ApifyForge has no platform fee or subscription. The $0.20 analysis pays for itself after a single LLM API call with the optimized output.
ApifyForge LLM Output Optimizer analyzes every field in your Apify actor's output and scores it by information density for LLM consumption. It classifies fields as high-value (names, URLs, emails), medium-value (generic text), or low-value (raw HTML, internal IDs, timestamps, debug data). Then it recommends which fields to keep and which to drop, with exact token savings calculations. Typical savings are 40-70% of token consumption.
Each ApifyForge LLM Output Optimizer run costs $0.20, charged as a pay-per-event (PPE) fee on your own Apify account. The tool reads actor output data from a recent run — it does not trigger new actor runs. The optimization report pays for itself if your LLM API costs exceed $0.50/month, since a 40-70% token reduction compounds across every subsequent LLM call.
ApifyForge LLM Output Optimizer uses character-based approximation at approximately 4 characters per token, which aligns with tokenizer behavior for GPT-3.5/GPT-4 and Claude on English text. While exact token counts vary by model and content, the approximation is accurate within 10-15% for typical web scraping output. The relative savings percentage (original vs optimized) is consistent regardless of the exact tokenizer used.
ApifyForge LLM Output Optimizer flags fields as low-value based on several signals: raw HTML content (>500 characters average), internal system IDs, timestamps that don't add semantic meaning, debug/meta fields, fields with >80% null values, and fields that repeat identical content across items. These fields consume tokens without adding information value for LLM processing tasks like summarization, extraction, or classification.
Yes. ApifyForge LLM Output Optimizer outputs an optimizedSchema array listing only the fields recommended for LLM consumption. You can use this list directly in your pipeline to filter actor output before sending it to the LLM API. The optimized schema is a subset of the original fields — no data transformation required, just field filtering.
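Applying the `optimizedSchema` list is plain field filtering. A minimal sketch using the schema from the example report above (the input item is hypothetical):

```python
optimized_schema = ["url", "domain", "emails", "phones"]

def apply_schema(items, schema):
    """Keep only the recommended fields; missing fields are skipped."""
    return [{f: item[f] for f in schema if f in item} for item in items]

raw = [{"url": "https://a.com", "rawHtml": "<html>...</html>", "emails": ["x@a.com"]}]
print(apply_schema(raw, optimized_schema))
# → [{'url': 'https://a.com', 'emails': ['x@a.com']}]
```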
Yes. ApifyForge LLM Output Optimizer reduces the data volume before it reaches any LLM API. The token savings apply equally to OpenAI (GPT-4, GPT-3.5), Anthropic (Claude), Google (Gemini), and any other LLM that charges per token. The optimization is model-agnostic — it reduces input data, not model-specific tokens.
ApifyForge LLM Output Optimizer provides recommendations, not requirements. If your use case genuinely needs raw HTML or timestamp fields, keep them. The value classification helps you make informed decisions: you might keep a field classified as low-value if it's critical for your specific LLM task. The savings calculation shows the impact of each field so you can decide individually.
ApifyForge LLM Output Optimizer checks the percentage of items where each field is null, empty string, or undefined. Fields with >80% null values are flagged for removal because they consume tokens on empty data across most items while providing value in only a minority of cases. For example, a 'faxNumber' field that is null in 95% of items wastes tokens on 95 null representations for every 5 actual values.
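The null-ratio check described above can be sketched as follows; this treats `None`, empty strings, and missing keys as empty, per the FAQ, with the 80% threshold as a parameter (the `faxNumber` example mirrors the one in the text):

```python
def null_ratio_flags(items, threshold=0.8):
    """Flag fields that are null/empty/missing in more than `threshold`
    of the sampled items."""
    fields = {f for item in items for f in item}
    flags = []
    for field in fields:
        empties = sum(1 for item in items if item.get(field) in (None, ""))
        if empties / len(items) > threshold:
            flags.append(field)
    return flags

items = [{"faxNumber": None, "url": "https://a.com"}] * 19 \
      + [{"faxNumber": "+1-555-0100", "url": "https://b.com"}]
print(null_ratio_flags(items))  # faxNumber is null in 95% of items
```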