SIMULATE · EVALUATE · IMPROVE

Voice agents for booking, dispatch, scheduling & after-hours

Your booking agent has to get the address right.

Dispatch calls turn on the exact street, the exact window, and a quote the company can actually honor. One misheard address and the truck rolls to the wrong house. Roark scores every call — on the audio.

Book a demo See what breaks

Backed by YC

Live · scoring every call · 1,284 today

Caller: It's 80 Maple, not 18.

Agent: Got it — sending the tech to 18 Maple.

Misheard address: street number flattened by the ASR. Heard by the audio model, invisible to the transcript.

Address capture

Scoring production voice AI for teams at

radiantgraph

§01 · When the call goes wrong

Here's how a booking call goes wrong.

Each one is a wasted truck roll, a no-show window, or a quote you have to eat at the door, and most are invisible to a tool that only reads the transcript.

01 · Booking

The 'Eighty' vs 'Eighteen' address

Caller: It's 80 Maple, not 18.

Agent: Got it — sending the tech to 18 Maple.

The transcript reads clean; the audio is wrong. Only an audio model hears the digit the ASR flattened — and that truck rolls to the wrong house.

Address capture

02 · Scheduling

The wrong arrival window

Caller: Morning only — I work after noon.

Agent: Perfect, you’re booked 2–4pm.

The constraint was stated and ignored. Roark fails any booking where the confirmed window contradicts what the caller asked for.

Window accuracy
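One way to picture the window check is the minimal sketch below. It assumes the caller's stated availability and the confirmed slot have already been extracted as hour ranges. The `TimeWindow` type, the `windowContradicts` function, and the 08:00 to 12:00 reading of "morning only" are hypothetical and chosen for illustration; they are not Roark's API.

```typescript
// Hypothetical type for illustration; not part of any Roark SDK.
interface TimeWindow {
  startHour: number; // 24-hour clock, inclusive
  endHour: number;   // 24-hour clock, exclusive
}

// A booking contradicts the caller's request when the confirmed window
// is not fully inside the window the caller said they were available.
function windowContradicts(requested: TimeWindow, confirmed: TimeWindow): boolean {
  return confirmed.startHour < requested.startHour || confirmed.endHour > requested.endHour;
}

// "Morning only" is read here as 08:00-12:00 (an assumption); the agent confirmed 2-4pm.
const morningOnly: TimeWindow = { startHour: 8, endHour: 12 };
const confirmedSlot: TimeWindow = { startHour: 14, endHour: 16 };

console.log(windowContradicts(morningOnly, confirmedSlot)); // true: the booking fails
```

A real check would also need to handle open-ended constraints such as "after 3pm" or "not Tuesdays", which this sketch leaves out.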

03 · Quoting

The price you can’t honor

Caller: So how much for the drain clear?

Agent: That’ll be a flat $89, all in.

A number invented on the call that no price book backs. The customer holds you to it at the door — Roark flags any quote the agent can’t source.

Quote integrity
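A quote check of this kind could look roughly like the sketch below. It assumes the quoted service and amount have already been pulled from the call. The `priceBook` contents and the `isQuoteSourced` helper are invented for illustration and do not reflect a real price book or Roark's implementation.

```typescript
// Hypothetical price book; service names and amounts are made up for illustration.
const priceBook: Record<string, number[]> = {
  "drain clear": [129, 179],
};

// A quote counts as sourced only if the price book lists that exact amount
// for the named service. Unknown services have no allowed prices.
function isQuoteSourced(service: string, quotedUsd: number): boolean {
  const allowed = priceBook[service] ?? [];
  return allowed.some((price) => price === quotedUsd);
}

console.log(isQuoteSourced("drain clear", 89));  // false: flag the call
console.log(isQuoteSourced("drain clear", 129)); // true: the price book backs it
```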

04 · After-hours

Dead air on the emergency call

Caller: My basement is flooding right now.

Agent: …

A four-second silence on a 2am burst pipe and the caller hangs up and dials a competitor. Roark scores response time and dead air the transcript never shows.

Response time
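Dead air can be measured from turn timestamps rather than from the words. The sketch below assumes each turn carries start and end offsets in seconds. The `Turn` shape, the `maxResponseGapSec` helper, and the 2-second limit are hypothetical choices for illustration, not Roark's scoring rules.

```typescript
// Hypothetical turn shape: speaker plus offsets in seconds from the start of the call.
interface Turn {
  speaker: "caller" | "agent";
  startSec: number;
  endSec: number;
}

// Longest gap between the end of a caller turn and the start of the agent's reply.
function maxResponseGapSec(turns: Turn[]): number {
  let worst = 0;
  let prev: Turn | undefined;
  for (const cur of turns) {
    if (prev && prev.speaker === "caller" && cur.speaker === "agent") {
      worst = Math.max(worst, cur.startSec - prev.endSec);
    }
    prev = cur;
  }
  return worst;
}

const emergencyCall: Turn[] = [
  { speaker: "caller", startSec: 0, endSec: 2.5 },
  { speaker: "agent", startSec: 6.5, endSec: 9 },
];
const DEAD_AIR_LIMIT_SEC = 2; // assumed threshold

console.log(maxResponseGapSec(emergencyCall) > DEAD_AIR_LIMIT_SEC); // true: 4 s of silence
```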

05 · Upsell

The pushy add-on tone

Caller: I just need the one repair, thanks.

Agent: You really should add the $400 membership today.

The words might pass; the delivery is aggressive. The audio model scores tone and pressure the transcript reads as a polite suggestion.

Tone

§02 · From caught to fixed

Roark catches every one of these — and proves the fix.

Each failure above is filed with its evidence, replayed as a repeatable simulation until a candidate fix passes, and verified on your next thousand live calls.

01 · Catch

The ledger above — every failure filed live, evidence attached.

See what breaks

02 · Simulate

Your fix, replayed against the exact failures above.

Testing your candidates · 82 / 240

prompt · confirm address read-back · fail

asr · numeric-bias decoding · fail

voice · slower digit pacing · fail

03 · Review

Every change explicit and diffed — you apply it.

Your fix, diffed · support_v3 → v4

Prompt · Tool · Voice

− Book the appointment for the caller.

+ Read the full address back and confirm the digits before booking anything.

04 · Verify

You ship — Roark confirms the metric moved on live calls.

Verifying support_v4 in production · 126 calls scored

Address capture, since your deploy: 71 → 78

Issue recurrence: watching…

Quality score: 78 ↑

Regressions on other metrics: none

you ship it — Roark verifies every call, same booking flow, every channel

…and the loop runs again on the next call.

§03 · Simulate before launch

Break it in staging, not in production.

Run your agent against hundreds of simulated callers — realistic personas, accents, background noise and edge cases — and get every conversation scored before a customer ever dials in.

Scenarios & personas

Hundreds of simulated callers — the angry one, the rambler, the interrupter — built from your real call types.

45 languages & accents

Native accents, code-switching and background noise — in every market your agent answers.

Load & health tests

Peak-volume concurrency and always-on health checks, so the agent that passed in staging survives launch day.

Run it in CI

Every prompt or model change runs the suite before it merges — quality gates for conversations, not just code.

Pre-launch suite · home_services_v1 · 182 / 200 passed

Booking · The 'Eighty' vs 'Eighteen' address · pass · 92

Scheduling · The wrong arrival window · pass · 88

Quoting · The price you can’t honor · fail · 61

After-hours · Dead air on the emergency call · pass · 85

Upsell · The pushy add-on tone · pass · 90

1 failure filed as an issue — fix it before launch, not after
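A CI quality gate over results like these might be sketched as follows. The scores are the five shown above. The `ScenarioResult` shape, the `failingScenarios` helper, and the pass threshold of 80 are assumptions for illustration: the page shows a pass at 85 and a fail at 61 but does not state the real cutoff.

```typescript
// Hypothetical result shape; not the format of any Roark API response.
interface ScenarioResult {
  name: string;
  score: number;
}

const PASS_THRESHOLD = 80; // assumed cutoff; the real one is not stated

// Return every scenario that scored below the threshold.
function failingScenarios(results: ScenarioResult[], threshold: number): ScenarioResult[] {
  return results.filter((r) => r.score < threshold);
}

const suiteResults: ScenarioResult[] = [
  { name: "Booking · address", score: 92 },
  { name: "Scheduling · arrival window", score: 88 },
  { name: "Quoting · price", score: 61 },
  { name: "After-hours · dead air", score: 85 },
  { name: "Upsell · tone", score: 90 },
];

const suiteFailures = failingScenarios(suiteResults, PASS_THRESHOLD);
for (const f of suiteFailures) {
  console.log(`fail · ${f.name} · ${f.score}`);
}

// In a real CI job, a non-empty failure list would exit non-zero to block the merge.
const gatePassed = suiteFailures.length === 0;
console.log(gatePassed ? "gate passed" : "gate failed");
```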

Run your first suite

§04 · Evals & observability

64+ metrics. Your models, not just an LLM.

Every production call scored as it lands — issues filed, alerts fired, dashboards and OTEL traces on tap, for voice calls and chat threads alike. And where most tools grade a transcript with an LLM, Roark runs purpose-built audio models on the call itself, measuring what your customer actually heard.

Everyone else

LLM reads the transcript

“The agent said the right words.” Misses how it sounded — the mispronounced drug name, the flat apology, the rushed close.

“…refund within three business days.” ✓ text-match

Roark · audio model · hears the call

Audio models hear the call

Pronunciation, accent, emotion and vocal stress measured from the waveform — the signal an LLM grading text can never see.

emotion · pace: rushed close

Empathy

Audio-native · custom models

Address capture
Accent clarity
Pronunciation
Dead air
Vocal stress
Interruptions

Booking accuracy · task

Window accuracy
Quote integrity
Service-type match
Slot confirmation

Conversational · LLM + rules

Tone
Task success
Hallucination
Repetition
Upsell pressure
Empathy

Performance · latency

Response time
Time-to-first-word
ASR WER
Barge-in handling

64+ metrics out of the box

∞ custom metrics, your rules

Audio + LLM models on every call

§05 · Get started

First call scored in under a minute.

One click on any platform below and production calls stream in on their own — or send any recording with three lines of code.

Read the quickstart

evaluate.ts

import Roark from '@roarkhq/sdk'
const roark = new Roark({ apiKey })
await roark.calls.evaluate({
  recordingUrl, agent: 'support_v2',
}) // scored in seconds

Node · Python · Go — plus a REST API for CI/CD and webhooks the instant a call is scored

Works with