A practical checklist for preventing silent quality drops caused by small system-prompt or product changes, using eval gates, ablations, golden sets, canaries, and rollbacks.
Voice AI Agent Evaluation Checklist (Vapi/Retell)
A practical checklist to evaluate Voice AI agents: latency, interruptions, ASR/WER, NLU, tool calls, safety/PII, containment, handoff, and test harnesses.