Classes

Concrete class implementations — providers, orchestrators, and core SDK components.

AgentStateMachine — Orchestrator state machine that derives a high-level AgentState from three underlying sub-machines: capture, playback, and processing.
AnthropicLLM — Anthropic LLM provider for Claude models.
AssemblyAISTT — AssemblyAI real-time STT provider using a raw WebSocket connection.
AudioBufferQueue — Bounded FIFO queue that buffers audio chunks between pipeline stages.
AudioCapture — Manages microphone audio capture using the Web Audio API.
AudioHeaderCache — Caches the audio container header from a stream for re-injection on reconnect.
AudioPlayer — Manages audio playback using the Web Audio API with support for both complete and streaming playback modes.
BrowserAudioOutput — Browser audio output provider that plays audio through the Web Audio API.
BufferInput — Server-side audio input provider that accepts pushed audio buffers.
CartesiaTTS — Cartesia TTS provider for low-latency real-time streaming text-to-speech via WebSocket.
CompositeVoice — The primary class of the CompositeVoice SDK, orchestrating a complete 5-role audio pipeline from input capture through speech recognition, language model…
DeepgramFlux — Deepgram Flux (V2) real-time STT provider — DISABLED.
DeepgramSTT — Deepgram real-time STT provider using native WebSocket (no SDK required).
DeepgramTTS — Deepgram TTS provider for real-time streaming text-to-speech via native WebSocket.
ElevenLabsSTT — ElevenLabs STT provider for real-time streaming speech-to-text via WebSocket.
ElevenLabsTTS — ElevenLabs TTS provider for real-time streaming text-to-speech via WebSocket.
EventEmitter — A type-safe event emitter with support for wildcard listeners and both synchronous and asynchronous event dispatch.
GeminiLLM — Google Gemini LLM provider.
GroqLLM — Groq LLM provider for ultra-fast inference.
Logger — Structured logger with context-aware formatting and configurable levels.
MicrophoneInput — Browser audio input provider that captures audio from the microphone.
MistralLLM — Mistral LLM provider.
NativeSTT — Native browser STT provider backed by the Web Speech API (SpeechRecognition).
NativeTTS — Native browser TTS provider using the Web Speech API (SpeechSynthesis).
NullOutput — No-op audio output provider that discards all audio.
OpenAICompatibleLLM — Base LLM provider for any service that speaks the OpenAI chat completions format.
OpenAILLM — OpenAI LLM provider for GPT models.
OpenAITTS — OpenAI TTS provider using the official OpenAI SDK for text-to-speech synthesis.
WebLLMLLM — WebLLM in-browser LLM provider.
WebSocketManager — Managed WebSocket connection with automatic reconnection and exponential backoff.
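
As a rough illustration of the "automatic reconnection and exponential backoff" mentioned for WebSocketManager, the sketch below computes a capped sequence of retry delays. It is not the SDK's implementation: the `backoffDelay` function, the `BackoffOptions` shape, and the values for `baseMs`, `factor`, and `maxMs` are all hypothetical, and the real class may expose different options and defaults.

```typescript
// Hypothetical helper for illustration only; not part of the CompositeVoice SDK.
interface BackoffOptions {
  baseMs: number; // delay before the first retry
  factor: number; // multiplier applied on each further attempt
  maxMs: number;  // upper bound on any single delay
}

// Returns the delay for a 0-based retry attempt:
// attempt 0 -> baseMs, attempt 1 -> baseMs * factor, and so on, capped at maxMs.
function backoffDelay(attempt: number, opts: BackoffOptions): number {
  const raw = opts.baseMs * Math.pow(opts.factor, attempt);
  return Math.min(raw, opts.maxMs);
}

// Assumed example values, chosen only to make the progression easy to read.
const opts: BackoffOptions = { baseMs: 500, factor: 2, maxMs: 8000 };
const delays = [0, 1, 2, 3, 4, 5].map((n) => backoffDelay(n, opts));
console.log(delays); // [500, 1000, 2000, 4000, 8000, 8000]
```

Reconnect logic of this kind often adds random jitter to each delay so that many clients do not retry in lockstep. Whether WebSocketManager does so is not stated here.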