WebCallHub.ai

How WebCallHub AI works

Browser → AI voice → Browser in <800ms. Built on best-in-class open infrastructure.

The voice loop (real-time)

Browser → WebRTC + DTLS-SRTP → STT (Deepgram) → LLM (Claude/GPT) → Response → TTS streaming → WebRTC → Browser
End-to-end latency target: < 800ms
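
To make the 800ms target concrete, here is a rough budget sketch in Python. Only the ~200ms transcription figure and the sub-100ms network round trip come from this page; the LLM and TTS numbers are illustrative assumptions, not measured values.

```python
# Illustrative latency budget for one voice turn (all values in milliseconds).
TARGET_MS = 800

budget_ms = {
    "network_round_trip": 100,   # from this page: sub-100ms on most networks
    "speech_to_text": 200,       # from this page: ~200ms transcription latency
    "llm_first_sentence": 300,   # ASSUMPTION: time until the first full sentence
    "tts_first_audio": 150,      # ASSUMPTION: time until the first audio chunk
}


def remaining_headroom(budget: dict[str, int], target: int = TARGET_MS) -> int:
    """Return how many milliseconds are left under the target (negative if over)."""
    return target - sum(budget.values())


if __name__ == "__main__":
    total = sum(budget_ms.values())
    print(f"total={total}ms, headroom={remaining_headroom(budget_ms)}ms")
```

With these assumed numbers the stages sum to 750ms, so any stage that overruns by more than 50ms pushes the turn past the target.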

Technical details

Real-time transport

WebRTC via JsSIP and an Asterisk PBX. Media is encrypted with DTLS-SRTP between the browser and our servers (the same media encryption Google Meet uses). Adaptive bitrate. Geo-distributed STUN/TURN servers. Sub-100ms round-trip on most networks.

Speech-to-text

Primary: Deepgram Nova-2 (real-time, 7+ languages). Fallback: Whisper-large on self-hosted GPUs. ~200ms transcription latency. Automatic language detection.

LLM orchestration

Multi-model routing: Claude Sonnet for nuanced conversations, GPT-4o for tool use, fine-tuned Llama for high-volume tier-1. Per-customer routing rules. RAG via pgvector + Qdrant.
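
The routing idea above can be sketched as a small lookup with per-customer overrides. This is a minimal illustration, not the production router: the model labels, the `CustomerRules` class, and `pick_model` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Default model per conversation type, mirroring the routing described above.
# The strings are hypothetical labels, not real API model identifiers.
DEFAULT_ROUTES = {
    "nuanced": "claude-sonnet",
    "tool_use": "gpt-4o",
    "tier1": "llama-finetuned",
}


@dataclass
class CustomerRules:
    """Hypothetical per-customer overrides, keyed by conversation type."""
    overrides: dict[str, str] = field(default_factory=dict)


def pick_model(kind: str, rules: Optional[CustomerRules] = None) -> str:
    """Customer override first, then the default route, then the nuanced model."""
    if rules is not None and kind in rules.overrides:
        return rules.overrides[kind]
    return DEFAULT_ROUTES.get(kind, DEFAULT_ROUTES["nuanced"])


if __name__ == "__main__":
    acme = CustomerRules(overrides={"tier1": "gpt-4o"})
    print(pick_model("tier1"))        # default route
    print(pick_model("tier1", acme))  # customer override wins
```

Falling back to the nuanced-conversation model for unknown conversation types is a design assumption here; a real router might reject unknown types instead.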

Text-to-speech

ElevenLabs Turbo (lowest latency) for premium voices. Azure Neural TTS as fallback. Custom voice cloning available (Enterprise). Audio streams sentence by sentence instead of waiting for the complete response.
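
Sentence-level streaming means each finished sentence is handed to TTS while the LLM is still generating the rest. The sketch below shows one way to cut a token stream into sentences; the function name and the naive punctuation rule are assumptions for illustration, and no TTS provider API is called.

```python
import re
from typing import Iterable, Iterator

# A sentence ends at ., ! or ? followed by whitespace. Deliberately naive:
# abbreviations such as "e.g. " would be split as well.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a token stream."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            yield sentence
        buffer = parts[-1]
    # Flush whatever is left when the stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    llm_tokens = ["Hi", " there.", " How", " can I", " help?", " One", " moment"]
    for sentence in sentences_from_tokens(llm_tokens):
        print(sentence)  # each sentence could be sent to TTS at this point
```

A sentence is released only once the text after its closing punctuation arrives, so the last sentence is emitted when the stream ends.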

Data & compliance

PostgreSQL primary store, encrypted at rest (AES-256). EU-residency option via Finnish-hosted infrastructure. Audio is never used to train external models. SOC 2 Type II audit in progress.

Reliability

99.9% uptime SLA (99.99% Enterprise). Multi-region active-active. Automatic failover between STT/LLM/TTS providers. Real-time status page. 24/7 on-call via PagerDuty.
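
Provider failover can be pictured as an ordered chain: try the primary, and move to the next provider if it raises. The sketch below is a simplified illustration; `call_with_failover`, `AllProvidersFailed`, and the two stand-in provider functions are hypothetical and do not call any real STT, LLM, or TTS API.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def call_with_failover(
    providers: Sequence[tuple[str, Callable[[], T]]],
) -> tuple[str, T]:
    """Try each (name, call) pair in order; return the first successful result."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call()
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def flaky_primary() -> str:
    """Stand-in for a primary provider that is currently timing out."""
    raise TimeoutError("primary timed out")


def healthy_fallback() -> str:
    """Stand-in for a fallback provider that answers normally."""
    return "hello world"


if __name__ == "__main__":
    used, text = call_with_failover(
        [("primary", flaky_primary), ("fallback", healthy_fallback)]
    )
    print(used, text)
```

Collecting the error from each failed provider keeps the final exception useful for on-call debugging when the whole chain is down.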

Self-host? Bring your own models?

Enterprise customers can bring their own OpenAI / Anthropic keys, route via Azure OpenAI, or run entirely on-prem.

Talk to Enterprise →