CartesiaTTSConfig

Configuration for the CartesiaTTS provider.

Defined in: src/providers/tts/cartesia/CartesiaTTS.ts:114

Remarks

Provide either apiKey (for direct API access) or proxyUrl (for server-side proxy). At least one must be set. If both are provided, proxyUrl takes precedence and the API key is not sent to the client. The voiceId is always required.

Example

// Direct API access
const config: CartesiaTTSConfig = {
  apiKey: 'cart-xxxxxxxxxxxx',
  voiceId: 'a0e99841-438c-4a64-b679-ae501e7d6091',
  modelId: 'sonic-2',
  language: 'en',
  outputEncoding: 'pcm_s16le',
  outputSampleRate: 24000,
};

// Via proxy server
const proxyConfig: CartesiaTTSConfig = {
  proxyUrl: 'http://localhost:3001/api/proxy/cartesia',
  voiceId: 'a0e99841-438c-4a64-b679-ae501e7d6091',
};
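
The rules in the Remarks (voiceId is always required, at least one of apiKey or proxyUrl must be set, and proxyUrl wins when both are present) can be checked before a provider is constructed. The following is a minimal sketch, not the SDK's own validation: `CartesiaAuthFields` is a local stand-in for the three relevant fields, and `resolveConnectionMode` is a hypothetical helper introduced only for illustration.

```typescript
// Local stand-in for the fields the Remarks discuss (an assumption for
// illustration; the real CartesiaTTSConfig has more fields).
interface CartesiaAuthFields {
  apiKey?: string;
  proxyUrl?: string;
  voiceId: string;
}

type ConnectionMode = 'proxy' | 'direct';

// Hypothetical helper, not an SDK function: applies the documented rules.
function resolveConnectionMode(config: CartesiaAuthFields): ConnectionMode {
  if (!config.voiceId) {
    throw new Error('voiceId is required');
  }
  // proxyUrl takes precedence when both proxyUrl and apiKey are provided.
  if (config.proxyUrl) {
    return 'proxy';
  }
  if (config.apiKey) {
    return 'direct';
  }
  throw new Error('Provide either apiKey or proxyUrl');
}
```

Called with the two example configs above, this sketch would classify the first as a direct connection and the second as a proxy connection.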

Extends

TTSProviderConfig

Properties

| Property | Type | Default value | Description | Overrides | Inherited from | Defined in |
| --- | --- | --- | --- | --- | --- | --- |
| apiKey? | string | undefined | Cartesia API key for direct authentication. Remarks: Required when connecting directly to Cartesia (no proxy). Omit when using proxyUrl; the proxy server supplies the key server-side. | TTSProviderConfig.apiKey | - | src/providers/tts/cartesia/CartesiaTTS.ts:122 |
| cartesiaVersion? | string | '2024-06-10' | Cartesia API version string. Remarks: Used as a query parameter in the WebSocket URL for direct connections. | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:209 |
| debug? | boolean | false | Whether to enable debug logging for this provider. Remarks: When true, the provider emits detailed internal logs. This is separate from the SDK-level LoggingConfig. | - | TTSProviderConfig.debug | src/core/types/providers.ts:86 |
| emotion? | string[] | undefined | Emotion controls for voice expression. Remarks: An array of emotion tags that influence the voice’s expressiveness. Example tags: 'positivity:high', 'curiosity', 'anger:low'. Example: { emotion: ['positivity:high', 'curiosity'] } | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:199 |
| endpoint? | string | undefined | Custom endpoint URL to override the provider’s default API endpoint. Remarks: Useful for self-hosted instances, proxy servers, or development environments. | - | TTSProviderConfig.endpoint | src/core/types/providers.ts:75 |
| language? | string | 'en' | BCP 47 language code for synthesis. | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:158 |
| model? | string | undefined | Model to use for text-to-speech synthesis. Remarks: Provider-specific model identifier (e.g., 'aura-2' for Deepgram). | - | TTSProviderConfig.model | src/core/types/providers.ts:975 |
| modelId? | CartesiaTTSModel | 'sonic-2' | Model ID to use for synthesis. See: CartesiaTTSModel | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:151 |
| outputEncoding? | CartesiaOutputEncoding | 'pcm_s16le' | Output audio encoding format. See: CartesiaOutputEncoding | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:166 |
| outputFormat? | string | undefined | Output audio format identifier. Remarks: Provider-specific format string (e.g., 'linear16', 'mp3', 'opus'). | - | TTSProviderConfig.outputFormat | src/core/types/providers.ts:1000 |
| outputSampleRate? | number | 16000 | Output audio sample rate in Hz. | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:173 |
| pitch? | number | undefined | Pitch adjustment in semitones. Remarks: Values from -20 to +20 semitones. Not all providers support pitch adjustment. | - | TTSProviderConfig.pitch | src/core/types/providers.ts:992 |
| proxyUrl? | string | undefined | URL of the CompositeVoice proxy server’s Cartesia endpoint. Remarks: When set, the WebSocket connection is routed through the proxy and the apiKey is not required on the client side. The HTTP URL is automatically converted to a WebSocket URL (ws:// or wss://). Example: 'http://localhost:3001/api/proxy/cartesia' | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:134 |
| rate? | number | undefined | Speech rate multiplier. Remarks: Values from 0.25 (quarter speed) to 4.0 (quadruple speed), where 1.0 is normal speed. Not all providers support rate adjustment. | - | TTSProviderConfig.rate | src/core/types/providers.ts:984 |
| sampleRate? | number | undefined | Sample rate for the output audio in Hz. Remarks: Common values are 16000, 24000, and 48000. Must match the format capabilities of the chosen voice and model. | - | TTSProviderConfig.sampleRate | src/core/types/providers.ts:1009 |
| speed? | number | undefined (uses Cartesia’s default) | Speech speed multiplier. Remarks: Values greater than 1 speed up speech; values less than 1 slow it down. | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:183 |
| timeout? | number | undefined | Request timeout in milliseconds. Remarks: Applies to HTTP requests (REST providers) and connection establishment (WebSocket providers). Set to 0 for no timeout. | - | TTSProviderConfig.timeout | src/core/types/providers.ts:95 |
| voice? | string | undefined | Voice ID or name to use for synthesis. Remarks: Provider-specific voice identifier. For example, Deepgram uses identifiers like 'aura-asteria-en', while ElevenLabs uses voice IDs. | - | TTSProviderConfig.voice | src/core/types/providers.ts:967 |
| voiceId | string | undefined | Cartesia voice ID (required). Remarks: Find voice IDs via the Cartesia Voice Library or the API’s list voices endpoint. | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:143 |
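
Two behaviours described in the table can be illustrated in a few lines: the proxyUrl entry says an HTTP URL is converted to a WebSocket URL, and several Cartesia-specific fields have documented defaults. The sketch below shows one way each could work. How the SDK implements them internally is not shown on this page, so `toWebSocketUrl`, `withDocumentedDefaults`, and the `CartesiaDefaultedFields` stand-in type are hypothetical names introduced here, and the model and encoding types are widened to `string` as an assumption.

```typescript
// Hypothetical helper, not the SDK's code: maps http:// to ws:// and
// https:// to wss://, and passes existing WebSocket URLs through unchanged.
function toWebSocketUrl(url: string): string {
  if (/^wss?:\/\//i.test(url)) {
    return url;
  }
  if (/^https?:\/\//i.test(url)) {
    // Replacing the leading "http" turns "http" into "ws" and "https" into "wss".
    return url.replace(/^http/i, 'ws');
  }
  throw new Error(`Unsupported URL scheme: ${url}`);
}

// Stand-in for the optional fields that have documented default values.
interface CartesiaDefaultedFields {
  cartesiaVersion?: string;
  language?: string;
  modelId?: string; // CartesiaTTSModel in the SDK; widened here
  outputEncoding?: string; // CartesiaOutputEncoding in the SDK; widened here
  outputSampleRate?: number;
}

// Default values copied from the "Default value" column of the table.
const DOCUMENTED_DEFAULTS: Required<CartesiaDefaultedFields> = {
  cartesiaVersion: '2024-06-10',
  language: 'en',
  modelId: 'sonic-2',
  outputEncoding: 'pcm_s16le',
  outputSampleRate: 16000,
};

// Hypothetical helper: fills in any field the caller left undefined.
function withDocumentedDefaults(
  config: CartesiaDefaultedFields,
): Required<CartesiaDefaultedFields> {
  return {
    cartesiaVersion: config.cartesiaVersion ?? DOCUMENTED_DEFAULTS.cartesiaVersion,
    language: config.language ?? DOCUMENTED_DEFAULTS.language,
    modelId: config.modelId ?? DOCUMENTED_DEFAULTS.modelId,
    outputEncoding: config.outputEncoding ?? DOCUMENTED_DEFAULTS.outputEncoding,
    outputSampleRate: config.outputSampleRate ?? DOCUMENTED_DEFAULTS.outputSampleRate,
  };
}
```

Using `??` rather than object spread means a field explicitly set to undefined still falls back to its documented default.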

© 2026 CompositeVoice. All rights reserved.
