CartesiaTTSConfig
Configuration for the CartesiaTTS provider.
Defined in: src/providers/tts/cartesia/CartesiaTTS.ts:114
Remarks
Provide either apiKey (for direct API access) or proxyUrl (to route requests through a server-side proxy); at least one must be set. If both are provided, proxyUrl takes precedence and the apiKey is not sent from the client. voiceId is always required.
Example
```typescript
// Direct API access
const config: CartesiaTTSConfig = {
  apiKey: 'cart-xxxxxxxxxxxx',
  voiceId: 'a0e99841-438c-4a64-b679-ae501e7d6091',
  modelId: 'sonic-2',
  language: 'en',
  outputEncoding: 'pcm_s16le',
  outputSampleRate: 24000,
};

// Via proxy server
const proxyConfig: CartesiaTTSConfig = {
  proxyUrl: 'http://localhost:3001/api/proxy/cartesia',
  voiceId: 'a0e99841-438c-4a64-b679-ae501e7d6091',
};
```
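The rules in the remarks above (at least one of apiKey or proxyUrl, proxyUrl wins when both are set, voiceId always required) can be sketched as a small check. This is an illustrative sketch only: ConfigSketch and resolveConnectionMode are hypothetical names that are not part of the SDK, and the provider's actual validation and error messages may differ.

```typescript
// Hypothetical subset of CartesiaTTSConfig, redeclared so the sketch is self-contained.
interface ConfigSketch {
  apiKey?: string;
  proxyUrl?: string;
  voiceId: string;
}

type ConnectionMode = 'proxy' | 'direct';

// Hypothetical helper (not an SDK function): decides how the provider would connect.
function resolveConnectionMode(config: ConfigSketch): ConnectionMode {
  // voiceId is always required.
  if (!config.voiceId) {
    throw new Error('voiceId is required');
  }
  // proxyUrl takes precedence when both proxyUrl and apiKey are set.
  if (config.proxyUrl) {
    return 'proxy';
  }
  if (config.apiKey) {
    return 'direct';
  }
  // Neither credential source was supplied.
  throw new Error('Provide either apiKey or proxyUrl');
}
```

Under these assumptions, a config that sets both apiKey and proxyUrl resolves to 'proxy', and a config with only voiceId throws.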
See
- CartesiaTTSModel - Available model options.
- CartesiaOutputEncoding - Available encoding options.
Extends
- TTSProviderConfig
Properties
| Property | Type | Default value | Description | Overrides | Inherited from | Defined in |
|---|---|---|---|---|---|---|
| apiKey? | string | undefined | Cartesia API key for direct authentication. Remarks: Required when connecting directly to Cartesia (no proxy). Omit when using proxyUrl; the proxy server supplies the key server-side. | TTSProviderConfig.apiKey | - | src/providers/tts/cartesia/CartesiaTTS.ts:122 |
| cartesiaVersion? | string | '2024-06-10' | Cartesia API version string. Remarks: Used as a query parameter in the WebSocket URL for direct connections. | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:209 |
| debug? | boolean | false | Whether to enable debug logging for this provider. Remarks: When true, the provider emits detailed internal logs. This is separate from the SDK-level LoggingConfig. | - | TTSProviderConfig.debug | src/core/types/providers.ts:86 |
| emotion? | string[] | undefined | Emotion controls for voice expression. Remarks: An array of emotion tags that influence the voice’s expressiveness. Example tags: 'positivity:high', 'curiosity', 'anger:low'. Example: { emotion: ['positivity:high', 'curiosity'] } | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:199 |
| endpoint? | string | undefined | Custom endpoint URL to override the provider’s default API endpoint. Remarks: Useful for self-hosted instances, proxy servers, or development environments. | - | TTSProviderConfig.endpoint | src/core/types/providers.ts:75 |
| language? | string | 'en' | BCP 47 language code for synthesis. | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:158 |
| model? | string | undefined | Model to use for text-to-speech synthesis. Remarks: Provider-specific model identifier (e.g., 'aura-2' for Deepgram). | - | TTSProviderConfig.model | src/core/types/providers.ts:975 |
| modelId? | CartesiaTTSModel | 'sonic-2' | Model ID to use for synthesis. See CartesiaTTSModel. | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:151 |
| outputEncoding? | CartesiaOutputEncoding | 'pcm_s16le' | Output audio encoding format. See CartesiaOutputEncoding. | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:166 |
| outputFormat? | string | undefined | Output audio format identifier. Remarks: Provider-specific format string (e.g., 'linear16', 'mp3', 'opus'). | - | TTSProviderConfig.outputFormat | src/core/types/providers.ts:1000 |
| outputSampleRate? | number | 16000 | Output audio sample rate in Hz. | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:173 |
| pitch? | number | undefined | Pitch adjustment in semitones. Remarks: Values from -20 to +20 semitones. Not all providers support pitch adjustment. | - | TTSProviderConfig.pitch | src/core/types/providers.ts:992 |
| proxyUrl? | string | undefined | URL of the CompositeVoice proxy server’s Cartesia endpoint. Remarks: When set, the WebSocket connection is routed through the proxy and the apiKey is not required on the client side. The HTTP URL is automatically converted to a WebSocket URL (ws:// or wss://). Example: 'http://localhost:3001/api/proxy/cartesia' | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:134 |
| rate? | number | undefined | Speech rate multiplier. Remarks: Values from 0.25 (quarter speed) to 4.0 (quadruple speed), where 1.0 is normal speed. Not all providers support rate adjustment. | - | TTSProviderConfig.rate | src/core/types/providers.ts:984 |
| sampleRate? | number | undefined | Sample rate for the output audio in Hz. Remarks: Common values are 16000, 24000, and 48000. Must match the format capabilities of the chosen voice and model. | - | TTSProviderConfig.sampleRate | src/core/types/providers.ts:1009 |
| speed? | number | undefined (uses Cartesia’s default) | Speech speed multiplier. Remarks: Values greater than 1 speed up speech; values less than 1 slow it down. | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:183 |
| timeout? | number | undefined | Request timeout in milliseconds. Remarks: Applies to HTTP requests (REST providers) and connection establishment (WebSocket providers). Set to 0 for no timeout. | - | TTSProviderConfig.timeout | src/core/types/providers.ts:95 |
| voice? | string | undefined | Voice ID or name to use for synthesis. Remarks: Provider-specific voice identifier. For example, Deepgram uses identifiers like 'aura-asteria-en', while ElevenLabs uses voice IDs. | - | TTSProviderConfig.voice | src/core/types/providers.ts:967 |
| voiceId | string | undefined | Cartesia voice ID (required). Remarks: Find voice IDs via the Cartesia Voice Library or the API’s list voices endpoint. | - | - | src/providers/tts/cartesia/CartesiaTTS.ts:143 |
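
The proxyUrl remarks state that the HTTP URL is converted to a WebSocket URL (ws:// or wss://). A minimal sketch of one way such a conversion could work, assuming http maps to ws and https maps to wss; toWebSocketUrl is a hypothetical helper for illustration, not the provider's own code.

```typescript
// Hypothetical helper (not an SDK function): swaps an http(s) scheme for ws(s).
function toWebSocketUrl(httpUrl: string): string {
  // "http://"  -> "ws://"
  // "https://" -> "wss://"
  // Any other scheme (including an existing ws:// or wss://) is returned unchanged.
  return httpUrl.replace(/^http(s?):\/\//, 'ws$1://');
}

// The example value from the proxyUrl row:
const wsUrl = toWebSocketUrl('http://localhost:3001/api/proxy/cartesia');
// wsUrl is 'ws://localhost:3001/api/proxy/cartesia'
```

Keeping the trailing "s" from the original scheme means a TLS-protected proxy URL stays on a secure WebSocket, which matters when the page itself is served over HTTPS.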