SignalWire

The control plane owns the call.

Most voice platforms are middleware: orchestration that sits outside the media, reassembling state from webhooks. Call Fabric is the opposite shape. The kernel lives inside the media path. State lives in one place. Four surfaces let you program it without leaving the substrate.

01 · MEDIA SUBSTRATE

The kernel lives with the audio.

SignalWire inherits the FreeSWITCH lineage. Voice, video, conferencing, messaging, and AI all run as first-class capabilities in one real-time media substrate. The LLM call, the STT decode, the TTS render: they happen inside the process that's handling the RTP packets.

Bolt-on stacks take a different shape. A media server accepts the call, serializes audio over a socket, ships it to a middleware process, which ships it again to a provider. Every turn crosses the network two or three times. The measured latency is that many round-trips of ceremony.

media-path.txt
# Bolt-on stack — every turn crosses the network twice
caller ──▶ telco ──▶ rtp server ──▶ webhook ──▶ middleware

             ┌─────────────────────────────────────┘

        orchestration ──▶ STT ──▶ LLM ──▶ TTS

             └─▶ webhook ──▶ media server ──▶ telco ──▶ caller

# SignalWire — the kernel lives with the media
caller ──▶ telecom stack (FreeSWITCH lineage) ──▶ caller

                       └─▶ AI kernel: STT / LLM / TTS / SWAIG tools
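The round-trip arithmetic above can be sketched in a few lines. The per-hop number below is an assumption chosen for illustration, not a measured value; the point is that network crossings multiply, in-process hops don't.

```python
# Illustrative latency accounting for one conversational turn.
# NETWORK_HOP_MS is an assumed one-way cost per network crossing,
# not a benchmark figure.
NETWORK_HOP_MS = 40

def turn_latency_ms(processing_ms: int, network_crossings: int) -> int:
    """Total turn latency: STT/LLM/TTS processing plus per-crossing network cost."""
    return processing_ms + network_crossings * NETWORK_HOP_MS

# Bolt-on stack: media server -> middleware -> providers -> back (6 crossings here)
bolt_on = turn_latency_ms(processing_ms=900, network_crossings=6)

# In-process kernel: STT/LLM/TTS run beside the RTP handler (no extra crossings)
in_process = turn_latency_ms(processing_ms=900, network_crossings=0)

print(bolt_on, in_process)  # 1140 900 under these assumptions
```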

interaction.json
// one interaction, one UUID, one state object
{
  "call_id": "31a0-...-f9e2",
  "state": "in_progress",
  "resources": ["+14155551234", "agent:reception", "conference:waiting_room"],
  "history": [
    { "t": 0.0,  "turn": "caller", "text": "I'd like to book an appointment" },
    { "t": 1.05, "turn": "agent",  "tool": "search_slots", "result": [...] },
    { "t": 3.8,  "turn": "agent",  "text": "I have Tuesday at 2pm or..." }
  ],
  "transfers": [],
  "recording": "s3://.../31a0.wav"
}

02 · CONTROL PLANE

One interaction. One UUID. One state model.

Call Fabric owns the state. Media, transfer history, tool-call results, recording location, the caller's phone number: it all lives under one UUID that moves with the call as it transfers, bridges, or escalates.

Middleware-based stacks rebuild this reality in their own database from a stream of webhook events. That stream is lossy, ordering-sensitive, and always slightly behind. The moment state drifts, the caller feels it.
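The single-UUID model can be sketched as one mutable object, echoing the interaction.json shape above. Field and method names here are illustrative, not the Call Fabric schema; the point is that a transfer mutates the object in place and the identifier never changes.

```python
from dataclasses import dataclass, field

@dataclass
class Interaction:
    """Minimal sketch of a one-UUID state model (illustrative names)."""
    call_id: str
    state: str = "created"
    resources: list = field(default_factory=list)
    history: list = field(default_factory=list)
    transfers: list = field(default_factory=list)

    def transfer(self, target: str) -> None:
        # A transfer is a state mutation on the same object;
        # nothing is rebuilt from a webhook stream afterward.
        self.transfers.append(target)
        self.resources.append(target)

call = Interaction(call_id="31a0-...-f9e2", state="in_progress")
call.transfer("agent:reception")
call.transfer("conference:waiting_room")

assert call.call_id == "31a0-...-f9e2"   # same UUID across every transfer
print(call.transfers)
```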

03 · ADDRESSABLE RESOURCES

Everything inside Call Fabric has an address.

Phone numbers, agents, SIP endpoints, conference rooms, SWML scripts, subscribers, queues: each is a resource. Resources compose. A number routes to an agent, which transfers to a conference, which bridges to a SIP endpoint. One substrate, one resource graph.
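The composition described above amounts to a routing graph over addresses. A minimal sketch, assuming hypothetical addresses (the SIP endpoint and route table below are made up for illustration):

```python
# Resources as nodes in a route graph: a number routes to an agent,
# the agent to a conference, the conference to a SIP endpoint.
routes = {
    "+14155551234": "agent:reception",
    "agent:reception": "conference:waiting_room",
    "conference:waiting_room": "sip:desk@example.com",
}

def resolve(address: str) -> list:
    """Walk the graph from an address to its terminal resource."""
    path = [address]
    while path[-1] in routes:
        path.append(routes[path[-1]])
    return path

print(resolve("+14155551234"))
# ['+14155551234', 'agent:reception', 'conference:waiting_room', 'sip:desk@example.com']
```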

resource-graph.txt
# Call Fabric · control plane · one UUID per interaction
phone numbers    · E.164
AI agents        · SWAIG
conference rooms · WebRTC + SIP
SIP endpoints    · trunks + UAs
SWML scripts     · declarative
subscribers      · identities

04 · PROGRAMMABLE SURFACES

Four doors into the same substrate.

Which one you pick depends on what you're doing. Declarative call markup, a live WebSocket, code in the language you use, or REST to administer it from ops. Every surface resolves to the same kernel.

SWML · DECLARATIVE

Describe the call.

JSON or YAML. 50+ methods. Write what the platform should do (answer, play, record, transfer, prompt the AI) and let Call Fabric walk the graph.

WHEN · Best when the flow is known ahead of time: IVRs, scripted greetings, approval bots.
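An SWML document is just structured data, so it can be built as a plain dict. The verbs below (answer, play) are among SWML's documented methods, but treat the exact shape as a sketch rather than a schema reference:

```python
import json

# Minimal SWML-shaped document built as a plain dict (a sketch,
# not a schema reference).
swml = {
    "version": "1.0.0",
    "sections": {
        "main": [
            {"answer": {}},
            {"play": {"url": "say:Thanks for calling. One moment."}},
        ]
    },
}

print(json.dumps(swml, indent=2))
```

Describe the graph once; Call Fabric walks it.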

RELAY · REAL-TIME

Hold a live WebSocket.

Subscribe to every event. Inject prompts mid-turn. Transfer. Bridge. Pause recording. Relay hands you the same call that SWML describes, only now you drive.

WHEN · Best for agents whose behavior branches on live context: concierge, legal intake, triage.
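The shape of a Relay-style handler is event in, action out. The event names and dispatch below are generic illustration, not the Relay wire protocol; the point is that the handler branches on live context instead of following a fixed script.

```python
# Generic event-loop sketch (event types and action tuples are illustrative,
# not Relay's actual protocol).
def handle_event(event: dict, actions: list) -> None:
    """Drive the call from live context: inject, bridge, transfer as events arrive."""
    if event["type"] == "caller.speech" and "supervisor" in event["text"]:
        actions.append(("bridge", "agent:supervisor"))
    elif event["type"] == "turn.start":
        actions.append(("inject_prompt", "Greet the caller by name."))

actions: list = []
handle_event({"type": "turn.start"}, actions)
handle_event({"type": "caller.speech", "text": "get me a supervisor"}, actions)
print(actions)
```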

SDK · CODE

Write the agent.

Python, TypeScript, Go, Java, C#, Rust, Ruby, PHP, Perl, C++. SDKs produce SWML and open Relay connections. You never see the wire format unless you want to.

WHEN · Best when the agent pattern lives in a codebase with tests and CI.

REST · ADMIN

Administer by UUID.

Provision numbers, swap agents on a live address, query recordings, rotate keys. Every resource in Call Fabric is reachable by its UUID.

WHEN · Best for ops dashboards, provisioning scripts, audit pipelines.
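Because every resource hangs off a UUID, admin tooling reduces to building requests against that identifier. The base URL and path below are placeholders, not SignalWire's documented REST routes:

```python
from urllib.request import Request

# Placeholder base URL; the real API host and paths will differ.
BASE = "https://example.signalwire.com/api/fabric"

def admin_request(method: str, resource_uuid: str, body: bytes = b"") -> Request:
    """Every resource is reachable by UUID; ops tooling only needs the ID."""
    return Request(f"{BASE}/resources/{resource_uuid}", data=body or None, method=method)

req = admin_request("GET", "31a0-...-f9e2")
print(req.method, req.full_url)
```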

05 · WHAT THIS ENABLES

Four things the shape makes possible.

The architecture choice determines what becomes possible, and what stays out of reach.

01

Millisecond turn latency

A tuned SignalWire deployment tops out around 1.10s end-to-end on a conversational turn. A tuned LiveKit stack averages 1.75s; Vapi averages 1.85s. The gap is the network ceremony a bolt-on stack pays on every turn.

02

Mid-turn interventions

Inject a prompt, transfer the call, bridge a supervisor, or pause recording while the caller is still speaking. All four land as state mutations on the same UUID.

03

System-Directed AI, not prompt-and-pray

Prompts, tools, and context scope per step, per role, per function. Sensitive data passes through a layer the model never sees. Agents can't talk their way out of a scoped step, because the step owns the boundary, not the prompt.
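A step owning its boundary can be sketched as a table the runtime consults before the model sees anything. Step names, tool names, and fields below are invented for illustration:

```python
# Sketch: the step definition, not the prompt, decides what the model
# can call and see. All names here are illustrative.
STEPS = {
    "collect_dob": {"tools": ["verify_identity"], "redact": ["ssn"]},
    "book_slot":   {"tools": ["search_slots", "book"], "redact": []},
}

def allowed(step: str, tool: str) -> bool:
    """An out-of-scope tool call is refused before it reaches the model."""
    return tool in STEPS[step]["tools"]

def scrub(step: str, record: dict) -> dict:
    """Sensitive fields are stripped before the model ever sees the record."""
    return {k: v for k, v in record.items() if k not in STEPS[step]["redact"]}

assert not allowed("collect_dob", "book")   # the step, not the prompt, says no
assert "ssn" not in scrub("collect_dob", {"name": "A", "ssn": "redact-me"})
```

No amount of prompt injection widens the table: the check runs outside the model.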

04

Structured observability

Every turn emits a structured event with per-component timing: STT latency, LLM first-token, TTS first-audio, external tool RTT. Not a log, a trace.
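A per-turn trace event of the kind described might look like the dict below. Field names and timings are illustrative, not the platform's telemetry schema; the point is that the timings are queryable structure, not a log line to regex later.

```python
import json

# Sketch of one turn's trace event with per-component timings (illustrative).
turn_event = {
    "call_id": "31a0-...-f9e2",
    "turn": 7,
    "timing_ms": {
        "stt": 180,              # speech-to-text decode
        "llm_first_token": 420,  # time to first model token
        "tts_first_audio": 150,  # time to first synthesized audio
        "tool_rtt": 95,          # external SWAIG tool round-trip
    },
}

total = sum(turn_event["timing_ms"].values())
print(json.dumps(turn_event), total)  # total = 845 under these numbers
```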

NEXT

Everything else is this shape.

The landing page has the product view: benchmark numbers, interfaces, pricing, migration. Come back once the architecture makes sense.