Classes

Concrete class implementations — providers, orchestrators, and core SDK components.

  • AgentStateMachine — Orchestrator state machine that derives a high-level AgentState from three underlying sub-machines: capture, playback, and processing.
  • AnthropicLLM — Anthropic LLM provider for Claude models.
  • AssemblyAISTT — AssemblyAI real-time STT provider using a raw WebSocket connection.
  • AudioBufferQueue — Bounded FIFO queue that buffers audio chunks between pipeline stages.
  • AudioCapture — Manages microphone audio capture using the Web Audio API.
  • AudioHeaderCache — Caches the audio container header from a stream for re-injection on reconnect.
  • AudioPlayer — Manages audio playback using the Web Audio API with support for both complete and streaming playback modes.
  • BrowserAudioOutput — Browser audio output provider that plays audio through the Web Audio API.
  • BufferInput — Server-side audio input provider that accepts pushed audio buffers.
  • CartesiaTTS — Cartesia TTS provider for low-latency real-time streaming text-to-speech via WebSocket.
  • CompositeVoice — The primary class of the CompositeVoice SDK, orchestrating a complete 5-role audio pipeline from input capture through speech recognition, language model…
  • DeepgramFlux — Deepgram Flux (V2) real-time STT provider — DISABLED.
  • DeepgramSTT — Deepgram real-time STT provider using native WebSocket (no SDK required).
  • DeepgramTTS — Deepgram TTS provider for real-time streaming text-to-speech via native WebSocket.
  • ElevenLabsSTT — ElevenLabs STT provider for real-time streaming speech-to-text via WebSocket.
  • ElevenLabsTTS — ElevenLabs TTS provider for real-time streaming text-to-speech via WebSocket.
  • EventEmitter — A type-safe event emitter with support for wildcard listeners and both synchronous and asynchronous event dispatch.
  • GeminiLLM — Google Gemini LLM provider.
  • GroqLLM — Groq LLM provider for ultra-fast inference.
  • Logger — Structured logger with context-aware formatting and configurable levels.
  • MicrophoneInput — Browser audio input provider that captures audio from the microphone.
  • MistralLLM — Mistral LLM provider.
  • NativeSTT — Native browser STT provider backed by the Web Speech API (SpeechRecognition).
  • NativeTTS — Native browser TTS provider using the Web Speech API (SpeechSynthesis).
  • NullOutput — No-op audio output provider that discards all audio.
  • OpenAICompatibleLLM — Base LLM provider for any service that speaks the OpenAI chat completions format.
  • OpenAILLM — OpenAI LLM provider for GPT models.
  • OpenAITTS — OpenAI TTS provider using the official OpenAI SDK for text-to-speech synthesis.
  • WebLLMLLM — WebLLM in-browser LLM provider.
  • WebSocketManager — Managed WebSocket connection with automatic reconnection and exponential backoff.
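
To illustrate the "automatic reconnection and exponential backoff" behavior listed for WebSocketManager, here is a minimal TypeScript sketch of how such a retry delay schedule is commonly computed. The function name `backoffDelay`, the `BackoffOptions` fields, and the numeric values are assumptions made for this example; they are not taken from the CompositeVoice API, whose actual option names and defaults may differ.

```typescript
// Hypothetical helper for illustration only; not part of the CompositeVoice SDK.
interface BackoffOptions {
  baseDelayMs: number; // delay before the first reconnect attempt
  maxDelayMs: number;  // upper bound on any single delay
  factor: number;      // multiplier applied for each further attempt
}

// Returns the delay for a 0-based attempt number:
// attempt 0 -> base, attempt 1 -> base * factor, and so on, capped at maxDelayMs.
function backoffDelay(attempt: number, opts: BackoffOptions): number {
  const raw = opts.baseDelayMs * Math.pow(opts.factor, attempt);
  return Math.min(raw, opts.maxDelayMs);
}

// Assumed example values.
const opts: BackoffOptions = { baseDelayMs: 500, maxDelayMs: 8000, factor: 2 };
const delays = [0, 1, 2, 3, 4, 5].map((n) => backoffDelay(n, opts));
console.log(delays); // [500, 1000, 2000, 4000, 8000, 8000]
```

Capping the delay keeps a long outage from pushing reconnect attempts arbitrarily far apart. Production implementations often add random jitter on top of this schedule so that many clients do not reconnect at the same instant.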

© 2026 CompositeVoice. All rights reserved.
