
CartesiaTTS

Cartesia TTS provider for low-latency real-time streaming text-to-speech via WebSocket.

Defined in: src/providers/tts/cartesia/CartesiaTTS.ts:273

Remarks

This provider establishes a WebSocket connection to the Cartesia TTS API (or a proxy). It uses Cartesia’s context-based streaming protocol, where a context_id links multiple text chunks into a single coherent utterance. The continue flag indicates whether a chunk continues an existing context or starts a new one.
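The context mechanics described above can be pictured with a small tracker. This is an illustrative sketch only, not the provider's source: `ContextTracker`, `begin`, `next`, and `reset` are hypothetical names, and the `ctx-N` ID format is invented for the example.

```typescript
// Hypothetical helper, not part of composite-voice.
// It models how one context_id can link several chunks into one utterance.
class ContextTracker {
  private counter = 0;
  private contextId: string | null = null;
  private chunksSent = 0;

  /** Start a new context and return its ID (invented format). */
  begin(): string {
    this.counter += 1;
    this.contextId = `ctx-${this.counter}`;
    this.chunksSent = 0;
    return this.contextId;
  }

  /** Flags for the next chunk: continue is false first, true afterwards. */
  next(): { context_id: string; continue: boolean } {
    if (this.contextId === null) {
      throw new Error('No active context');
    }
    const flags = { context_id: this.contextId, continue: this.chunksSent > 0 };
    this.chunksSent += 1;
    return flags;
  }

  /** Forget the current context so the next utterance starts fresh. */
  reset(): void {
    this.contextId = null;
    this.chunksSent = 0;
  }
}
```

Every chunk of one utterance shares the same `context_id`; only the `continue` flag changes between the first chunk and the rest.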

The lifecycle is:

  1. Construct with CartesiaTTSConfig
  2. Call initialize() to validate configuration
  3. Call connect() to open the WebSocket and generate a context ID
  4. Call sendText() to stream text for synthesis (uses context continuation)
  5. Call finalize() to send end-of-input and flush remaining audio
  6. Call disconnect() to close the WebSocket
  7. Call dispose() to release all resources

Audio flow: Text chunks -> WebSocket -> Cartesia -> Audio chunks -> onAudio callback

Example

import { CartesiaTTS } from 'composite-voice';

const tts = new CartesiaTTS({
  apiKey: 'cart-xxxxxxxxxxxx',
  voiceId: 'a0e99841-438c-4a64-b679-ae501e7d6091',
  modelId: 'sonic-2',
  outputEncoding: 'pcm_s16le',
  outputSampleRate: 24000,
});

await tts.initialize();
await tts.connect();

tts.onAudio((chunk) => {
  // Process audio chunk
});

tts.sendText('Hello, ');
tts.sendText('world!');
await tts.finalize();
await tts.disconnect();
await tts.dispose();
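The example above follows the happy path. One way to make sure cleanup still runs when synthesis fails is to wrap the lifecycle in `try`/`finally`. The sketch below is an assumption about usage, not library code: `StreamingTts` is a hypothetical structural type listing only the methods documented on this page, and `speak` is a hypothetical helper.

```typescript
// Hypothetical structural type: only the lifecycle methods documented here.
interface StreamingTts {
  initialize(): Promise<void>;
  connect(): Promise<void>;
  sendText(chunk: string): void;
  finalize(): Promise<void>;
  disconnect(): Promise<void>;
  dispose(): Promise<void>;
}

// Hypothetical helper: runs the documented lifecycle for one utterance.
async function speak(tts: StreamingTts, chunks: string[]): Promise<void> {
  await tts.initialize();
  try {
    await tts.connect();
    for (const chunk of chunks) {
      tts.sendText(chunk);
    }
    await tts.finalize();
  } finally {
    // Runs whether or not synthesis succeeded.
    await tts.disconnect();
    await tts.dispose();
  }
}
```

A `CartesiaTTS` instance should satisfy this shape structurally, so it could be passed as `speak(tts, ['Hello, ', 'world!'])` after registering an `onAudio` callback.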

See

  • LiveTTSProvider - The base class this provider extends.
  • CartesiaTTSConfig - Configuration options for this provider.
  • WebSocketManager - The WebSocket manager used for connection handling.

Extends

  • LiveTTSProvider

Constructors

Constructor

new CartesiaTTS(config, logger?): CartesiaTTS;

Defined in: src/providers/tts/cartesia/CartesiaTTS.ts:295

Creates a new CartesiaTTS provider instance.

Parameters

Parameter | Type | Description
config | CartesiaTTSConfig | Configuration for the Cartesia TTS provider. The voiceId property is required.
logger? | Logger | Optional logger instance for debug and diagnostic output.

Returns

CartesiaTTS

Example

const tts = new CartesiaTTS({
  apiKey: 'cart-xxxxxxxxxxxx',
  voiceId: 'a0e99841-438c-4a64-b679-ae501e7d6091',
});

Overrides

LiveTTSProvider.constructor

Properties

Property | Modifier | Type | Default value | Description | Overrides | Inherited from | Defined in
audioCallback? | protected | (chunk) => void | undefined | Callback registered by the SDK or consumer to receive audio chunks. Set via onAudio. | - | LiveTTSProvider.audioCallback | src/providers/base/BaseTTSProvider.ts:79
config | public | CartesiaTTSConfig | undefined | TTS-specific provider configuration. | LiveTTSProvider.config | - | src/providers/tts/cartesia/CartesiaTTS.ts:274
initialized | protected | boolean | false | Tracks whether initialize has completed successfully. | - | LiveTTSProvider.initialized | src/providers/base/BaseProvider.ts:97
logger | protected | Logger | undefined | Scoped logger instance for this provider. | - | LiveTTSProvider.logger | src/providers/base/BaseProvider.ts:94
metadataCallback? | protected | (metadata) => void | undefined | Callback registered by the SDK or consumer to receive audio metadata. Set via onMetadata. | - | LiveTTSProvider.metadataCallback | src/providers/base/BaseTTSProvider.ts:85
roles | readonly | readonly ProviderRole[] | undefined | TTS providers cover the 'tts' pipeline role by default. | - | LiveTTSProvider.roles | src/providers/base/BaseTTSProvider.ts:70
type | readonly | ProviderType | undefined | Communication transport this provider uses ('rest' or 'websocket'). | - | LiveTTSProvider.type | src/providers/base/BaseProvider.ts:74

Methods

assertReady()

protected assertReady(): void;

Defined in: src/providers/base/BaseProvider.ts:255

Guard that throws if the provider has not been initialized.

Returns

void

Remarks

Call at the start of any method that requires the provider to be ready.

Throws

Error: thrown with a descriptive message when initialized is false.
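The guard pattern described here can be sketched in a few lines. `GuardedProvider` and `markInitialized` are hypothetical stand-ins for illustration; the real base class and its error message may differ.

```typescript
// Hypothetical stand-in for the base class; only the guard logic is shown.
class GuardedProvider {
  protected initialized = false;

  /** Test hook standing in for a successful initialize(). */
  markInitialized(): void {
    this.initialized = true;
  }

  protected assertReady(): void {
    if (!this.initialized) {
      throw new Error(
        `${this.constructor.name} is not initialized. Call initialize() first.`,
      );
    }
  }

  /** Example of a method that needs the provider to be ready. */
  sendText(chunk: string): number {
    this.assertReady(); // fail fast before touching any connection state
    return chunk.length;
  }
}
```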

Inherited from

LiveTTSProvider.assertReady

connect()

connect(): Promise<void>;

Defined in: src/providers/tts/cartesia/CartesiaTTS.ts:402

Connects to the Cartesia WebSocket for real-time TTS streaming.

Returns

Promise<void>

Remarks

Establishes a WebSocket connection and generates a fresh context ID for the session. Auto-reconnect is disabled for TTS sessions since each session is typically short-lived.

This method is idempotent: calling it when already connected is a no-op.
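A minimal sketch of idempotent connect behaviour, under stated assumptions: `DemoConnection` is a hypothetical class, the counter stands in for opening a real WebSocket, and the context ID format is invented.

```typescript
// Hypothetical model of an idempotent connect(); no real socket is opened.
class DemoConnection {
  private connected = false;
  openCount = 0;
  contextId: string | null = null;

  isConnected(): boolean {
    return this.connected;
  }

  async connect(): Promise<void> {
    if (this.connected) {
      return; // already connected: nothing to do
    }
    this.openCount += 1; // stands in for opening the WebSocket
    this.contextId = `ctx-${this.openCount}`; // fresh context per session
    this.connected = true;
  }

  async disconnect(): Promise<void> {
    this.connected = false;
    this.contextId = null; // context state is reset on disconnect
  }
}
```

Calling `connect()` twice in a row opens one session; calling it again after `disconnect()` opens a second session with a new context ID.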

Throws

ProviderConnectionError if the WebSocket connection fails.

Overrides

LiveTTSProvider.connect

disconnect()

disconnect(): Promise<void>;

Defined in: src/providers/tts/cartesia/CartesiaTTS.ts:694

Disconnects from the Cartesia WebSocket.

Returns

Promise<void>

Remarks

Gracefully closes the WebSocket connection and releases the WebSocketManager instance. Also resets the context ID and chunk tracking state.

Throws

Rethrows any error that occurs during disconnection.

Overrides

LiveTTSProvider.disconnect

dispose()

dispose(): Promise<void>;

Defined in: src/providers/base/BaseProvider.ts:154

Clean up resources and dispose of the provider.

Returns

Promise<void>

Remarks

Delegates to the subclass hook onDispose and resets the initialized flag. If the provider is not initialized, the call is a no-op.

Throws

Re-throws any error raised by onDispose.
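The delegation described above is a template-method pattern. The sketch below is a hypothetical model (`DisposableBase` is not a library class). It assumes the flag is reset only after the hook succeeds, which this page does not specify.

```typescript
// Hypothetical model of dispose() delegating to a subclass hook.
abstract class DisposableBase {
  protected initialized = true; // assume initialize() already ran

  protected abstract onDispose(): Promise<void>;

  async dispose(): Promise<void> {
    if (!this.initialized) {
      return; // not initialized: no-op
    }
    await this.onDispose(); // any error from the hook propagates to the caller
    this.initialized = false; // assumption: reset only after the hook succeeds
  }

  isReady(): boolean {
    return this.initialized;
  }
}
```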

Inherited from

LiveTTSProvider.dispose

emitAudio()

protected emitAudio(chunk): void;

Defined in: src/providers/base/BaseTTSProvider.ts:138

Emit a synthesized audio chunk to the registered callback.

Parameters

Parameter | Type | Description
chunk | AudioChunk | The audio chunk to emit.

Returns

void

Remarks

Subclasses call this method for each chunk of audio produced during synthesis. If no callback has been registered the chunk is silently dropped.
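The registration and emit behaviour can be sketched as follows. `AudioEmitter` and `DemoAudioChunk` are hypothetical; the real `AudioChunk` type likely carries more fields than the single `data` property assumed here.

```typescript
// Hypothetical chunk shape; the real AudioChunk type is not shown on this page.
interface DemoAudioChunk {
  data: Uint8Array;
}

// Hypothetical model of callback registration plus emit-or-drop.
class AudioEmitter {
  private audioCallback?: (chunk: DemoAudioChunk) => void;

  onAudio(callback: (chunk: DemoAudioChunk) => void): void {
    this.audioCallback = callback;
  }

  emitAudio(chunk: DemoAudioChunk): void {
    // Optional call: when no callback is registered the chunk is dropped.
    this.audioCallback?.(chunk);
  }
}
```

Because unclaimed chunks are dropped rather than buffered, a consumer would normally register its callback before sending any text.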

Inherited from

LiveTTSProvider.emitAudio

emitMetadata()

protected emitMetadata(metadata): void;

Defined in: src/providers/base/BaseTTSProvider.ts:154

Emit audio metadata to the registered callback.

Parameters

Parameter | Type | Description
metadata | AudioMetadata | The audio metadata to emit.

Returns

void

Remarks

Typically called once at the start of synthesis when the provider knows the output format. If no callback has been registered the metadata is silently dropped.

Inherited from

LiveTTSProvider.emitMetadata

finalize()

finalize(): Promise<void>;

Defined in: src/providers/tts/cartesia/CartesiaTTS.ts:638

Finalizes the current synthesis session by sending an end-of-input signal.

Returns

Promise<void>

Remarks

Sends an empty transcript with continue: false to signal that no more text will be sent for the current context. Waits up to 2 seconds for any remaining audio to arrive, then resets the context ID for the next utterance.
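A bounded wait like the one described can be built from a promise and a timer. This is a sketch of the general technique, not the provider's code: `waitWithTimeout` is a hypothetical helper, and how the provider detects that remaining audio has arrived is not shown on this page.

```typescript
// Hypothetical helper: settle when `done` settles or when the timer fires.
function waitWithTimeout(
  done: Promise<void>,
  timeoutMs: number,
): Promise<'done' | 'timeout'> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve('timeout'), timeoutMs);
    done.then(
      () => {
        clearTimeout(timer); // avoid a stray timer once audio has finished
        resolve('done');
      },
      (err) => {
        clearTimeout(timer);
        reject(err); // surface errors to the caller, as finalize() rethrows
      },
    );
  });
}
```

With a two-second budget this would be called as `await waitWithTimeout(audioFinished, 2000)`, where `audioFinished` is an assumed promise that resolves when the last audio chunk arrives.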

Throws

Rethrows any error that occurs during finalization.

Overrides

LiveTTSProvider.finalize

getConfig()

getConfig(): TTSProviderConfig;

Defined in: src/providers/base/BaseTTSProvider.ts:165

Get a shallow copy of the current TTS configuration.

Returns

TTSProviderConfig

A new TTSProviderConfig object.
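A shallow copy protects top-level fields only. The sketch below illustrates that with an object spread, which is an assumption about how the copy is made. `DemoConfig` and its nested `extra` field are hypothetical and exist only to show the nested case.

```typescript
// Hypothetical config shape with one nested object.
interface DemoConfig {
  voiceId: string;
  extra: { speed: number };
}

// Shallow copy: new outer object, nested objects still shared.
function shallowCopy<T extends object>(config: T): T {
  return { ...config };
}

const original: DemoConfig = { voiceId: 'voice-a', extra: { speed: 1 } };
const copy = shallowCopy(original);

copy.voiceId = 'voice-b'; // top-level change stays in the copy
copy.extra.speed = 2; // nested change is visible through both objects
```

After these two assignments `original.voiceId` is still `'voice-a'`, while `original.extra.speed` is `2`.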

Inherited from

LiveTTSProvider.getConfig

initialize()

initialize(): Promise<void>;

Defined in: src/providers/base/BaseProvider.ts:127

Initialize the provider, making it ready for use.

Returns

Promise<void>

Remarks

Calls the subclass hook onInitialize. If the provider has already been initialized the call is a no-op.

Throws

ProviderInitializationError: thrown when onInitialize rejects. The original error is wrapped with the provider class name for diagnostics.
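The no-op and error-wrapping behaviour can be sketched as below. `InitError` is a hypothetical stand-in for ProviderInitializationError, whose real constructor signature is not shown on this page, and `InitializableBase` is likewise illustrative.

```typescript
// Hypothetical stand-in for ProviderInitializationError.
class InitError extends Error {
  readonly original: unknown;

  constructor(providerName: string, original: unknown) {
    super(`${providerName} failed to initialize: ${String(original)}`);
    this.name = 'InitError';
    this.original = original;
  }
}

// Hypothetical model of initialize() delegating to a subclass hook.
abstract class InitializableBase {
  protected initialized = false;

  protected abstract onInitialize(): Promise<void>;

  async initialize(): Promise<void> {
    if (this.initialized) {
      return; // already initialized: no-op
    }
    try {
      await this.onInitialize();
    } catch (err) {
      // Wrap the failure with the class name for diagnostics.
      throw new InitError(this.constructor.name, err);
    }
    this.initialized = true;
  }

  isReady(): boolean {
    return this.initialized;
  }
}
```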

Inherited from

LiveTTSProvider.initialize

isReady()

isReady(): boolean;

Defined in: src/providers/base/BaseProvider.ts:178

Check whether the provider has been initialized and is ready.

Returns

boolean

true when initialize has completed successfully and dispose has not yet been called.

Inherited from

LiveTTSProvider.isReady

isWebSocketConnected()

isWebSocketConnected(): boolean;

Defined in: src/providers/tts/cartesia/CartesiaTTS.ts:722

Checks whether the WebSocket connection to Cartesia is currently active.

Returns

boolean

true if the WebSocket is connected, false otherwise.


onAudio()

onAudio(callback): void;

Defined in: src/providers/base/BaseTTSProvider.ts:109

Register a callback to receive synthesized audio chunks.

Parameters

Parameter | Type | Description
callback | (chunk) => void | Function invoked with each AudioChunk.

Returns

void

Remarks

All TTS providers, regardless of transport, deliver audio through this callback. CompositeVoice registers it during pipeline setup so that audio data flows into the AudioPlayer.

Inherited from

LiveTTSProvider.onAudio

onConfigUpdate()

protected onConfigUpdate(_config): void;

Defined in: src/providers/base/BaseProvider.ts:242

Hook called after updateConfig merges new values.

Parameters

Parameter | Type | Description
_config | Partial<BaseProviderConfig> | The partial configuration that was merged.

Returns

void

Remarks

The default implementation is a no-op. Override in subclasses to react to runtime configuration changes (e.g. reconnect with a new API key).

Inherited from

LiveTTSProvider.onConfigUpdate

onDispose()

protected onDispose(): Promise<void>;

Defined in: src/providers/tts/cartesia/CartesiaTTS.ts:341

Disposes the provider, disconnecting from the WebSocket and releasing resources.

Returns

Promise<void>

Overrides

LiveTTSProvider.onDispose

onInitialize()

protected onInitialize(): Promise<void>;

Defined in: src/providers/tts/cartesia/CartesiaTTS.ts:314

Validates configuration and prepares the provider for connection.

Returns

Promise<void>

Throws

ProviderInitializationError if neither apiKey nor proxyUrl is configured.

Throws

ProviderInitializationError if voiceId is not provided.
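The two validation rules can be sketched as a standalone function. `DemoCartesiaConfig` and `validateCartesiaConfig` are hypothetical. The sketch throws a plain `Error` rather than ProviderInitializationError, and its messages are invented.

```typescript
// Hypothetical subset of the config: only the fields the rules mention.
interface DemoCartesiaConfig {
  apiKey?: string;
  proxyUrl?: string;
  voiceId?: string;
}

// Hypothetical validator mirroring the two documented failure cases.
function validateCartesiaConfig(config: DemoCartesiaConfig): void {
  if (!config.apiKey && !config.proxyUrl) {
    throw new Error('Either apiKey or proxyUrl must be configured');
  }
  if (!config.voiceId) {
    throw new Error('voiceId is required');
  }
}
```

A proxy-only configuration such as `{ proxyUrl: 'wss://example.invalid/tts', voiceId: 'some-voice' }` passes, since either credential source satisfies the first rule.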

Overrides

LiveTTSProvider.onInitialize

onMetadata()

onMetadata(callback): void;

Defined in: src/providers/base/BaseTTSProvider.ts:124

Register a callback to receive audio metadata.

Parameters

Parameter | Type | Description
callback | (metadata) => void | Function invoked with AudioMetadata when available.

Returns

void

Remarks

Metadata (sample rate, encoding, channels, etc.) helps the AudioPlayer configure playback correctly. Providers may emit metadata once at the start of synthesis but are not required to.
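As an example of why these fields matter, the duration of a raw PCM chunk follows directly from its byte length, sample width, channel count, and sample rate. The sketch assumes a hypothetical `DemoAudioMetadata` shape; the real AudioMetadata field names may differ.

```typescript
// Hypothetical metadata shape; field names are assumptions.
interface DemoAudioMetadata {
  sampleRate: number; // frames per second
  encoding: 'pcm_s16le' | 'pcm_f32le';
  channels: number;
}

// Duration of a raw PCM chunk in milliseconds.
function chunkDurationMs(byteLength: number, meta: DemoAudioMetadata): number {
  const bytesPerSample = meta.encoding === 'pcm_s16le' ? 2 : 4;
  const frames = byteLength / (bytesPerSample * meta.channels);
  return (frames * 1000) / meta.sampleRate;
}
```

For 4800 bytes of mono `pcm_s16le` at 24000 Hz: 4800 / 2 = 2400 frames, and 2400 * 1000 / 24000 = 100 ms.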

Inherited from

LiveTTSProvider.onMetadata

sendText()

sendText(chunk): void;

Defined in: src/providers/tts/cartesia/CartesiaTTS.ts:588

Sends a text chunk to Cartesia for real-time synthesis.

Parameters

Parameter | Type | Description
chunk | string | The text to synthesize into speech.

Returns

void

Remarks

Each message includes the model ID, voice reference, output format, and a context_id for streaming continuation. The continue flag is false for the first chunk and true for subsequent chunks, allowing Cartesia to maintain prosody across multiple text segments.

Optional parameters (language, speed, emotion) are included when configured.
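The message shape described above might look roughly like the sketch below. The field names (`model_id`, `transcript`, `voice`, `output_format`, `context_id`, `continue`, `language`) are assumptions based on Cartesia's public WebSocket API and are not confirmed by this page. `DemoSettings`, `DemoTtsMessage`, and `buildMessage` are hypothetical. Speed and emotion are left out because their wire placement is not described here.

```typescript
// Hypothetical settings; names mirror the constructor example on this page.
interface DemoSettings {
  modelId: string;
  voiceId: string;
  outputEncoding: string;
  outputSampleRate: number;
  language?: string;
}

// Assumed wire shape; verify against the Cartesia API reference before use.
interface DemoTtsMessage {
  model_id: string;
  transcript: string;
  voice: { mode: 'id'; id: string };
  output_format: { container: 'raw'; encoding: string; sample_rate: number };
  context_id: string;
  continue: boolean;
  language?: string;
}

function buildMessage(
  settings: DemoSettings,
  transcript: string,
  contextId: string,
  isFirstChunk: boolean,
): DemoTtsMessage {
  const message: DemoTtsMessage = {
    model_id: settings.modelId,
    transcript,
    voice: { mode: 'id', id: settings.voiceId },
    output_format: {
      container: 'raw',
      encoding: settings.outputEncoding,
      sample_rate: settings.outputSampleRate,
    },
    context_id: contextId,
    continue: !isFirstChunk, // false for the first chunk, true afterwards
  };
  if (settings.language !== undefined) {
    message.language = settings.language; // optional field only when configured
  }
  return message;
}
```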

Overrides

LiveTTSProvider.sendText

updateConfig()

updateConfig(config): void;

Defined in: src/providers/base/BaseProvider.ts:201

Merge partial configuration updates into the current config.

Parameters

Parameter | Type | Description
config | Partial<BaseProviderConfig> | A partial configuration object whose keys will overwrite existing values.

Returns

void

Remarks

After merging, the subclass hook onConfigUpdate is called so providers can react to changed values at runtime.
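The merge-then-notify flow can be sketched as follows. `ConfigurableProvider` and `DemoBaseConfig` are hypothetical, and the spread-based merge is an assumption about the implementation. The hook here records its argument only so the effect is observable.

```typescript
// Hypothetical config shape for illustration.
interface DemoBaseConfig {
  apiKey?: string;
  debug?: boolean;
}

// Hypothetical model of updateConfig() followed by the onConfigUpdate hook.
class ConfigurableProvider {
  protected config: DemoBaseConfig;
  lastUpdate: Partial<DemoBaseConfig> | null = null;

  constructor(config: DemoBaseConfig) {
    this.config = { ...config };
  }

  updateConfig(partial: Partial<DemoBaseConfig>): void {
    this.config = { ...this.config, ...partial }; // later keys overwrite
    this.onConfigUpdate(partial);
  }

  /** Subclasses would override this to react, e.g. by reconnecting. */
  protected onConfigUpdate(partial: Partial<DemoBaseConfig>): void {
    this.lastUpdate = partial;
  }

  getConfig(): DemoBaseConfig {
    return { ...this.config };
  }
}
```

Keys absent from the partial object keep their previous values; only the supplied keys are overwritten.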

Inherited from

LiveTTSProvider.updateConfig

© 2026 CompositeVoice. All rights reserved.
