Classes
Concrete class implementations — providers, orchestrators, and core SDK components.
- AgentStateMachine — Orchestrator state machine that derives a high-level AgentState from three underlying sub-machines: capture, playback, and processing.
- AnthropicLLM — Anthropic LLM provider for Claude models.
- AssemblyAISTT — AssemblyAI real-time STT provider using a raw WebSocket connection.
- AudioBufferQueue — Bounded FIFO queue that buffers audio chunks between pipeline stages.
- AudioCapture — Manages microphone audio capture using the Web Audio API.
- AudioHeaderCache — Caches the audio container header from a stream for re-injection on reconnect.
- AudioPlayer — Manages audio playback using the Web Audio API with support for both complete and streaming playback modes.
- BrowserAudioOutput — Browser audio output provider that plays audio through the Web Audio API.
- BufferInput — Server-side audio input provider that accepts pushed audio buffers.
- CartesiaTTS — Cartesia TTS provider for low-latency real-time streaming text-to-speech via WebSocket.
- CompositeVoice — The primary class of the CompositeVoice SDK, orchestrating a complete 5-role audio pipeline from input capture through speech recognition, language model…
- DeepgramFlux — Deepgram Flux (V2) real-time STT provider — DISABLED.
- DeepgramSTT — Deepgram real-time STT provider using native WebSocket (no SDK required).
- DeepgramTTS — Deepgram TTS provider for real-time streaming text-to-speech via native WebSocket.
- ElevenLabsSTT — ElevenLabs STT provider for real-time streaming speech-to-text via WebSocket.
- ElevenLabsTTS — ElevenLabs TTS provider for real-time streaming text-to-speech via WebSocket.
- EventEmitter — A type-safe event emitter with support for wildcard listeners and both synchronous and asynchronous event dispatch.
- GeminiLLM — Google Gemini LLM provider.
- GroqLLM — Groq LLM provider for ultra-fast inference.
- Logger — Structured logger with context-aware formatting and configurable levels.
- MicrophoneInput — Browser audio input provider that captures audio from the microphone.
- MistralLLM — Mistral LLM provider.
- NativeSTT — Native browser STT provider backed by the Web Speech API (SpeechRecognition).
- NativeTTS — Native browser TTS provider using the Web Speech API (SpeechSynthesis).
- NullOutput — No-op audio output provider that discards all audio.
- OpenAICompatibleLLM — Base LLM provider for any service that speaks the OpenAI chat completions format.
- OpenAILLM — OpenAI LLM provider for GPT models.
- OpenAITTS — OpenAI TTS provider using the official OpenAI SDK for text-to-speech synthesis.
- WebLLMLLM — WebLLM in-browser LLM provider.
- WebSocketManager — Managed WebSocket connection with automatic reconnection and exponential backoff.
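
The WebSocketManager entry above mentions automatic reconnection with exponential backoff. As a rough illustration of that general technique only, the sketch below computes capped retry delays. The `BackoffOptions` interface, the `backoffDelay` function, and all numeric values are hypothetical and are not taken from the SDK; the real class may use different names, defaults, or added jitter.

```typescript
// Hypothetical option names for illustration; not the SDK's actual API.
interface BackoffOptions {
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number;  // upper bound on any single delay
  factor: number;      // multiplier applied on each further attempt
}

// Delay before retry `attempt` (0-based): base * factor^attempt, capped at max.
function backoffDelay(attempt: number, opts: BackoffOptions): number {
  const raw = opts.baseDelayMs * Math.pow(opts.factor, attempt);
  return Math.min(raw, opts.maxDelayMs);
}

// Assumed example values: start at 500 ms, double each time, cap at 8 s.
const opts: BackoffOptions = { baseDelayMs: 500, maxDelayMs: 8000, factor: 2 };
const delays = [0, 1, 2, 3, 4, 5].map((n) => backoffDelay(n, opts));
console.log(delays); // [500, 1000, 2000, 4000, 8000, 8000]
```

Capping the delay keeps a long outage from pushing the wait between attempts arbitrarily high, while the growth factor keeps early retries from hammering a server that is still recovering.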