ElevenLabsSTTConfig

Configuration options for the ElevenLabs STT provider.

Defined in: src/providers/stt/elevenlabs/ElevenLabsSTT.ts:85

Remarks

Extends STTProviderConfig with ElevenLabs-specific settings. You must provide either apiKey or token (for direct browser-to-ElevenLabs connections) or proxyUrl (for a server-side proxy that injects the API key). If proxyUrl is provided, it takes precedence over apiKey and token.
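
The precedence rule above can be sketched as a small helper. This is an illustrative sketch only, not the SDK's implementation: resolveConnectionMode, AuthFields, and ConnectionMode are hypothetical names, and AuthFields is a local stand-in for the three authentication fields of ElevenLabsSTTConfig.

```typescript
// Local stand-in for the authentication fields of ElevenLabsSTTConfig
// (hypothetical type, declared here so the sketch is self-contained).
interface AuthFields {
  apiKey?: string;
  token?: string;
  proxyUrl?: string;
}

type ConnectionMode = 'proxy' | 'direct';

// Hypothetical helper, not part of the SDK: decides how a connection
// would be made under the documented rule.
function resolveConnectionMode(config: AuthFields): ConnectionMode {
  // proxyUrl takes precedence, even if apiKey or token is also set.
  if (config.proxyUrl) return 'proxy';
  // Otherwise a direct connection needs apiKey or token.
  if (config.apiKey || config.token) return 'direct';
  throw new Error('Provide apiKey, token, or proxyUrl');
}
```

Failing fast on an empty configuration, as this sketch does, is one possible design choice; the source does not say how the provider itself reports a missing credential.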

Example

// Direct connection (API key exposed to browser -- development only)
const directConfig: ElevenLabsSTTConfig = {
  apiKey: 'el-xxxxxxxxxxxx',
  model: 'scribe_v2_realtime',
  commitStrategy: 'vad',
};

// Proxy connection (recommended for production)
// A distinct variable name avoids redeclaring a const in the same scope.
const proxyConfig: ElevenLabsSTTConfig = {
  proxyUrl: 'http://localhost:3001/api/proxy/elevenlabs',
  audioFormat: 'pcm_16000',
  includeTimestamps: true,
};

Extends

STTProviderConfig

Properties

| Property | Type | Default value | Description | Overrides | Inherited from | Defined in |
| --- | --- | --- | --- | --- | --- | --- |
| apiKey? | string | undefined | ElevenLabs API key for direct authentication. Required when connecting directly to ElevenLabs without a proxy. | STTProviderConfig.apiKey | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:90 |
| audioFormat? | ElevenLabsSTTAudioFormat | undefined | Audio encoding format sent to the API. Default: 'pcm_16000' | - | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:122 |
| commitStrategy? | "vad" \| "manual" | undefined | Strategy for committing transcription segments. 'vad': Voice Activity Detection automatically commits when silence is detected. 'manual': the application controls when to commit via explicit signals. Default: 'vad' | - | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:116 |
| debug? | boolean | false | Whether to enable debug logging for this provider. Remarks: When true, the provider emits detailed internal logs. This is separate from the SDK-level LoggingConfig. | - | STTProviderConfig.debug | src/core/types/providers.ts:86 |
| enableLogging? | boolean | undefined | Whether to enable logging on the ElevenLabs side. When false, enables zero-retention mode where no audio or transcripts are stored by ElevenLabs. | - | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:171 |
| endpoint? | string | undefined | Custom endpoint URL to override the provider's default API endpoint. Remarks: Useful for self-hosted instances, proxy servers, or development environments. | - | STTProviderConfig.endpoint | src/core/types/providers.ts:75 |
| includeLanguageDetection? | boolean | undefined | Whether to include automatic language detection in results. | - | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:157 |
| includeTimestamps? | boolean | undefined | Whether to include word-level timestamps in transcription results. Default: false | - | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:152 |
| interimResults? | boolean | undefined | Whether to enable interim (partial) transcription results. Remarks: When true, the provider emits results as the user speaks, before the utterance is complete. Only applicable to live/WebSocket providers. | - | STTProviderConfig.interimResults | src/core/types/providers.ts:352 |
| keywords? | string[] | undefined | Custom vocabulary or keyword phrases to boost recognition accuracy. Remarks: Useful for domain-specific terminology, product names, or proper nouns that the model might not recognize well by default. | - | STTProviderConfig.keywords | src/core/types/providers.ts:366 |
| language? | string | undefined | Language code for transcription. Remarks: Uses BCP 47 language tags (e.g., 'en-US', 'es-ES', 'fr-FR'). The supported languages depend on the provider and model. | - | STTProviderConfig.language | src/core/types/providers.ts:335 |
| minSilenceDurationMs? | number | undefined | Minimum silence duration in milliseconds before a speech segment ends. | - | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:146 |
| minSpeechDurationMs? | number | undefined | Minimum speech duration in milliseconds before it is considered valid. Helps filter out very short noise bursts. | - | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:141 |
| model? | ElevenLabsSTTModel | undefined | STT model to use for transcription. Default: 'scribe_v2_realtime' | STTProviderConfig.model | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:108 |
| previousText? | string | undefined | Previous text context to improve transcription accuracy. Useful for maintaining context across sessions. Should be kept short (approximately 50 characters or less). | - | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:164 |
| proxyUrl? | string | undefined | URL of the CompositeVoice proxy server's ElevenLabs endpoint. Example: 'http://localhost:3000/api/proxy/elevenlabs' | - | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:96 |
| punctuation? | boolean | undefined | Whether to enable automatic punctuation in transcription results. | - | STTProviderConfig.punctuation | src/core/types/providers.ts:357 |
| timeout? | number | undefined | Request timeout in milliseconds. Remarks: Applies to HTTP requests (REST providers) and connection establishment (WebSocket providers). Set to 0 for no timeout. | - | STTProviderConfig.timeout | src/core/types/providers.ts:95 |
| token? | string | undefined | Temporary authentication token for WebSocket connections. Alternative to apiKey for short-lived browser sessions. | - | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:102 |
| vadSilenceThresholdSecs? | number | undefined | Duration of silence (in seconds) before VAD considers speech ended. Only applies when commitStrategy is 'vad'. | - | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:128 |
| vadThreshold? | number | undefined | VAD sensitivity threshold (0.0 to 1.0). Higher values require louder speech to trigger detection. Only applies when commitStrategy is 'vad'. | - | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:135 |
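
The VAD-related properties carry two documented constraints: vadThreshold lies between 0.0 and 1.0, and both vadThreshold and vadSilenceThresholdSecs only apply when commitStrategy is 'vad'. A pre-flight check along those lines might look like the sketch below. It is a hedged illustration, not SDK behavior: checkVadSettings and VadFields are hypothetical names, VadFields is a local stand-in for the relevant fields, and the source does not say whether the provider validates these values itself.

```typescript
// Local stand-in for the VAD-related fields of ElevenLabsSTTConfig
// (hypothetical type, declared here so the sketch is self-contained).
interface VadFields {
  commitStrategy?: 'vad' | 'manual';
  vadThreshold?: number;
  vadSilenceThresholdSecs?: number;
}

// Hypothetical helper, not part of the SDK: returns human-readable
// problems, or an empty array when the settings look consistent.
function checkVadSettings(config: VadFields): string[] {
  const problems: string[] = [];
  // 'vad' is the documented default commit strategy.
  const strategy = config.commitStrategy ?? 'vad';
  const { vadThreshold, vadSilenceThresholdSecs } = config;

  // The negated form also rejects NaN.
  if (vadThreshold !== undefined && !(vadThreshold >= 0 && vadThreshold <= 1)) {
    problems.push('vadThreshold must be between 0.0 and 1.0');
  }
  if (strategy === 'manual') {
    if (vadThreshold !== undefined) {
      problems.push("vadThreshold only applies when commitStrategy is 'vad'");
    }
    if (vadSilenceThresholdSecs !== undefined) {
      problems.push("vadSilenceThresholdSecs only applies when commitStrategy is 'vad'");
    }
  }
  return problems;
}

// Illustrative values only; they are not recommended settings.
const tuned: VadFields = { vadThreshold: 0.6, vadSilenceThresholdSecs: 1.2 };
console.log(checkVadSettings(tuned));
```

Returning a list of messages instead of throwing lets a caller decide whether a mismatch, such as VAD tuning combined with 'manual' commits, is an error or only a warning.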

© 2026 CompositeVoice. All rights reserved.