01 The Measurement Problem
Vendors report automation rates, deflection percentages, and efficiency gains based on their own telemetry. These metrics often conflate AI-initiated work with AI-completed work.
Example: Inflated automation rates
A regional bank deploys an AI agent for handling routine account inquiries — balance checks, transaction history, payment due dates. The vendor reports a 78% automation rate: 78% of customer inquiries were "handled by AI." The bank's operations team investigates. Within that 78%, roughly 15 percentage points were inquiries where the AI retrieved account data but could not answer the customer's actual question; the customer was then transferred to a human agent who resolved the issue. Another 8 points were cases where the AI provided an answer the customer rejected, resulting in a callback handled by a human. The actual full-automation rate — inquiries where the AI independently produced an accepted outcome with no human involvement — is 55%.
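The arithmetic behind this decomposition is simple enough to script. A minimal sketch using the figures from this example (variable names are illustrative):

```python
# Decompose a vendor-reported "automation rate" into a verified
# full-automation rate. All values are percentage points of total inquiries.
vendor_reported = 78.0   # inquiries the vendor counts as "handled by AI"
transferred = 15.0       # AI fetched data, but a human resolved the question
rejected_callback = 8.0  # customer rejected the AI answer; human handled callback

full_automation = vendor_reported - transferred - rejected_callback
print(f"Verified full-automation rate: {full_automation:.0f}%")  # 55%
```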
Without independent measurement, organizations cannot distinguish genuine automation — where AI independently produces accepted outcomes — from expensive suggestion engines that generate work for human agents to finish.
02 Actual vs Claimed Outcomes
Accurate ROI measurement requires tracking the full lifecycle of each task. The relevant questions are:
- What event initiated the task, and which actors — AI or human — contributed, in what sequence?
- What modifications were applied to the AI's output before delivery?
- Was the final outcome accepted as-is, reworked, or rejected?
Trusted Work Units capture this full lifecycle. Each TWU records the initiating event, the sequence of actor contributions, the modifications applied, and the final outcome classification. This enables organizations to calculate true automation rates based on accepted outcomes rather than initiated interactions.
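As an illustration of that structure, here is a minimal TWU record sketched in Python. The class and field names are assumptions for exposition, not Veratrace's actual schema; the outcome categories mirror the classification used in the example below.

```python
from dataclasses import dataclass, field
from enum import Enum

class Outcome(Enum):
    FULLY_AUTOMATED = "fully_automated"  # AI produced the accepted outcome, no human involvement
    AI_ASSISTED = "ai_assisted"          # human made minor modifications before delivery
    HUMAN_COMPLETED = "human_completed"  # AI draft substantially rewritten or reversed
    AI_FAILED = "ai_failed"              # routed to a human from the start

@dataclass
class TrustedWorkUnit:
    """Illustrative TWU record: initiating event, contributions, modifications, outcome."""
    initiating_event: str                                   # e.g. "inquiry:balance_check"
    contributions: list[str] = field(default_factory=list)  # ordered actor contributions
    modifications: list[str] = field(default_factory=list)  # edits applied before delivery
    outcome: Outcome = Outcome.AI_FAILED                    # final outcome classification
```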
Example: True automation classification
A TWU ledger for a telecom's support operation shows the following breakdown for 10,000 AI-involved interactions in January:
- 5,200 fully automated: the AI produced the accepted outcome with no human involvement
- 2,800 AI-assisted: the AI drafted a response; a human made minor modifications (formatting, tone adjustments) before delivery
- 1,400 AI-initiated, human-completed: the AI's draft was substantially rewritten, or the AI's action was reversed by a human
- 600 AI-failed: the AI could not produce any output, and the interaction was routed to a human from the start
The vendor reported all 10,000 as "AI-handled interactions." The verified full-automation rate: 52%, not 100%.
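Computing the verified rate from such a ledger is a straightforward tally. A sketch using the January figures above:

```python
from collections import Counter

# January outcome tallies from the telecom example.
ledger = Counter(
    fully_automated=5_200,
    ai_assisted=2_800,
    human_completed=1_400,  # AI-initiated, human-completed
    ai_failed=600,
)

total = sum(ledger.values())                    # 10,000 "AI-handled" per the vendor
verified_rate = ledger["fully_automated"] / total
print(f"Verified full-automation rate: {verified_rate:.0%}")  # 52%, not 100%
```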
03 The Rework Factor
Rework is the hidden cost of AI automation. When human agents spend time correcting, completing, or redoing AI-generated work, the cost of that rework must be factored into ROI calculations. Most ROI models ignore it entirely.
Example: Quantifying rework cost
A BPO operates a 200-seat contact center for a healthcare client. The AI agent handles first-response on insurance eligibility inquiries. The vendor reports a 75% automation rate across 40,000 monthly interactions. Independent TWU analysis reveals that 12,000 of those "automated" interactions required significant human rework — agents spending an average of 5.2 minutes correcting AI-generated responses. At a fully loaded agent cost of $22 per hour, the monthly rework cost is $22,880. The vendor's ROI model shows $34,000 in AI costs producing $68,000 in labor savings (30,000 automated interactions at roughly $2.27 saved per interaction). The actual model: $34,000 in AI costs plus $22,880 in rework costs, leaving $45,120 in net labor savings. The real net benefit — $11,120, against the $34,000 the vendor's model implies — is roughly 67% lower than claimed.
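The same recalculation, scripted from the figures in this example (the vendor's implied net benefit is gross savings minus AI cost):

```python
ai_cost = 34_000.0        # monthly AI platform cost ($)
gross_savings = 68_000.0  # vendor-claimed labor savings ($)
reworked = 12_000         # "automated" interactions needing rework
rework_minutes = 5.2      # average correction time per interaction
agent_rate = 22.0         # fully loaded agent cost per hour ($)

rework_cost = reworked * rework_minutes / 60 * agent_rate  # $22,880
net_savings = gross_savings - rework_cost                  # $45,120

vendor_net_benefit = gross_savings - ai_cost               # $34,000
actual_net_benefit = net_savings - ai_cost                 # $11,120
shortfall = 1 - actual_net_benefit / vendor_net_benefit
print(f"Net benefit is {shortfall:.0%} lower than claimed")  # ~67%
```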
Veratrace's rework detection quantifies this hidden cost by comparing AI-generated outputs against delivered outputs within each TWU. When the difference exceeds configurable thresholds, the attribution engine recalculates the AI contribution accordingly.
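The exact comparison method Veratrace uses is not detailed here. As a rough sketch of the idea, a text-similarity ratio with configurable thresholds can classify how much of an AI draft survived into the delivered output; the threshold values and category names below are assumptions.

```python
import difflib

ASSISTED_THRESHOLD = 0.90  # at or above: minor edits only (formatting, tone)
REWORK_THRESHOLD = 0.50    # below: draft substantially rewritten

def classify_delivery(ai_output: str, delivered: str) -> str:
    """Classify rework by how similar the delivered text is to the AI draft."""
    similarity = difflib.SequenceMatcher(None, ai_output, delivered).ratio()
    if similarity >= ASSISTED_THRESHOLD:
        return "ai_assisted"
    if similarity >= REWORK_THRESHOLD:
        return "partially_reworked"
    return "human_completed"

print(classify_delivery(
    "Your plan renews on May 1 at $49/month.",
    "Hi! Your plan renews on May 1 at $49/month. Let us know if you have questions.",
))
```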
04 Cost Per Verified Outcome
The most meaningful ROI metric is cost per verified outcome (CPVO) — the total cost to produce a task result that was accepted without rework. CPVO accounts for:
- direct AI costs (licensing, inference, platform fees)
- the cost of human rework spent correcting or completing AI output
- retry duplicates, where the same task is attempted more than once
- partial completions that a human had to finish
CPVO provides the basis for honest vendor reconciliation. When CPVO exceeds the cost of fully human execution for specific task categories, the AI deployment is destroying value rather than creating it. This signal is invisible without independent measurement.
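A minimal sketch of the calculation and the value-destruction check follows. The retry figure and outcome count are made-up illustrative inputs; $4.12 is the fully human baseline from the reporting example below.

```python
def cost_per_verified_outcome(ai_cost, rework_cost, retry_cost, verified_outcomes):
    """Total cost divided by results accepted without rework."""
    return (ai_cost + rework_cost + retry_cost) / verified_outcomes

cpvo = cost_per_verified_outcome(
    ai_cost=34_000,            # AI licensing/inference spend for the period
    rework_cost=22_880,        # human time spent correcting AI output
    retry_cost=3_000,          # illustrative: duplicated attempts at the same task
    verified_outcomes=15_000,  # illustrative count of accepted, rework-free results
)
human_baseline = 4.12  # cost per outcome under fully human execution ($)

if cpvo > human_baseline:
    print(f"CPVO ${cpvo:.2f} > ${human_baseline:.2f}: deployment is destroying value")
else:
    print(f"CPVO ${cpvo:.2f} <= ${human_baseline:.2f}: deployment is creating value")
```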
05 From Vendor Metrics to Enterprise Truth
Enterprise AI leaders need metrics they can present to CFOs and boards with confidence.
Example: Board-ready reporting
A VP of Operations presents AI automation results to the executive committee. The vendor-sourced version: "We automated 75% of customer interactions, saving an estimated $816,000 annually." The TWU-verified version: "Independent measurement shows 52% full automation and 23% partial automation requiring an average of 5.2 minutes of human rework per interaction. After accounting for rework costs, retry duplicates, and partial completions, verified annual savings are $541,440 at a cost per verified outcome of $3.47 versus $4.12 for fully human execution. Net ROI: 15.8%. We recommend renegotiating the vendor contract to exclude reworked interactions from billing."
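The verified figures reconcile arithmetically. Assuming the annual savings here annualize the monthly net savings from the BPO example (an inference from the figures, not stated in the report), a quick check:

```python
monthly_net_savings = 45_120.0  # after rework, retries, partial completions
print(f"Vendor annual claim:     ${68_000 * 12:,}")                   # $816,000
print(f"Verified annual savings: ${monthly_net_savings * 12:,.0f}")  # $541,440

cpvo, human_cost = 3.47, 4.12  # per-outcome costs from the board report
net_roi = (human_cost - cpvo) / human_cost
print(f"Net ROI: {net_roi:.1%}")  # 15.8%
```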
The difference is evidence. Vendor metrics are assertions. Independently measured ROI is verifiable. The governance infrastructure that produces this measurement is not overhead — it is the prerequisite for making AI investment decisions based on reality rather than vendor narratives.
