NativeSTT

Native browser STT provider backed by the Web Speech API (SpeechRecognition).

Defined in: src/providers/stt/native/NativeSTT.ts:168

Remarks

Unlike other STT providers, NativeSTT manages its own audio pipeline: the browser’s SpeechRecognition API accesses the microphone directly. For this reason the provider declares roles: ['input', 'stt'], and CompositeVoice does not set up a separate AudioInputProvider. The sendAudio method is a no-op.
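
The role declaration can be read as a simple membership check. The following is a minimal sketch of how an orchestrator might decide whether a separate input provider is needed; it is not the library's actual resolution code. The RoleSketch union (beyond 'input' and 'stt'), the RoleCarrier interface and the needsSeparateInput function are assumptions made for illustration.

```typescript
// Illustrative union; only 'input' and 'stt' are confirmed by this page.
type RoleSketch = 'input' | 'stt' | 'llm' | 'tts' | 'output';

interface RoleCarrier {
  readonly roles: readonly RoleSketch[];
}

// A separate AudioInputProvider is needed only when the STT provider
// does not already cover the 'input' role.
function needsSeparateInput(stt: RoleCarrier): boolean {
  return !stt.roles.includes('input');
}

const nativeLike: RoleCarrier = { roles: ['input', 'stt'] };
const streamingLike: RoleCarrier = { roles: ['stt'] };

console.log(needsSeparateInput(nativeLike)); // false
console.log(needsSeparateInput(streamingLike)); // true
```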

Transport: WebSocket-like (browser-managed, extends LiveSTTProvider)

Browser support:

Chrome / Edge (Chromium): Full support via SpeechRecognition
Safari: Partial support via webkitSpeechRecognition
Firefox: Not supported (as of 2025)
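
Given the support matrix above, it can help to feature-detect before constructing the provider. The sketch below assumes only the two global constructor names listed above; the findSpeechRecognition helper and the GlobalLike type are hypothetical and not part of composite-voice.

```typescript
// Hypothetical helper, not part of composite-voice.
type RecognitionCtor = new () => unknown;

interface GlobalLike {
  SpeechRecognition?: RecognitionCtor;
  webkitSpeechRecognition?: RecognitionCtor;
}

// Returns the unprefixed constructor when present, otherwise the
// webkit-prefixed one, otherwise undefined (no Web Speech API support).
function findSpeechRecognition(scope: GlobalLike): RecognitionCtor | undefined {
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition;
}

// Fake scopes standing in for different browsers.
class FakeRecognition {}
const chromiumLike: GlobalLike = { SpeechRecognition: FakeRecognition };
const safariLike: GlobalLike = { webkitSpeechRecognition: FakeRecognition };
const unsupportedLike: GlobalLike = {};

console.log(findSpeechRecognition(chromiumLike) !== undefined); // true
console.log(findSpeechRecognition(unsupportedLike) !== undefined); // false
```

In a browser, the scope argument would be window (cast to GlobalLike); a result of undefined suggests falling back to another STT provider.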

Data flow:

Microphone -> SpeechRecognition API (browser) -> onresult event
                                                     |
CompositeVoice <- onTranscription(result) <---------+

Example

import { NativeSTT } from 'composite-voice';

const stt = new NativeSTT({
  language: 'en-US',
  continuous: true,
  interimResults: true,
  maxAlternatives: 1,
});

await stt.initialize();

stt.onTranscription((result) => {
  console.log(result.text, result.isFinal);
});

await stt.connect(); // starts listening
// ... later ...
await stt.disconnect(); // stops listening

See

LiveSTTProvider for the base WebSocket STT class
NativeSTTConfig for configuration options
DeepgramSTT for an alternative real-time STT provider

Extends

LiveSTTProvider

Constructors

Constructor

new NativeSTT(config?, logger?): NativeSTT;

Defined in: src/providers/stt/native/NativeSTT.ts:203

Create a new NativeSTT provider.

Parameters

Parameter	Type	Description
`config?`	`Partial`<`NativeSTTConfig`>	Partial configuration; unset values receive sensible defaults (`language: 'en-US'`, `continuous: true`, `interimResults: true`, `maxAlternatives: 1`).
`logger?`	`Logger`	Optional parent logger; a child will be derived.

Returns

NativeSTT

Example

const stt = new NativeSTT({ language: 'fr-FR', continuous: false });

Overrides

LiveSTTProvider.constructor

Properties

Property	Modifier	Type	Default value	Description	Overrides	Inherited from	Defined in
`config`	`public`	`NativeSTTConfig`	`undefined`	STT-specific provider configuration.	`LiveSTTProvider.config`	-	src/providers/stt/native/NativeSTT.ts:169
`initialized`	`protected`	`boolean`	`false`	Tracks whether initialize has completed successfully.	-	`LiveSTTProvider.initialized`	src/providers/base/BaseProvider.ts:97
`logger`	`protected`	`Logger`	`undefined`	Scoped logger instance for this provider.	-	`LiveSTTProvider.logger`	src/providers/base/BaseProvider.ts:94
`roles`	`readonly`	readonly `ProviderRole`[]	`undefined`	NativeSTT covers both `'input'` and `'stt'` pipeline roles. The browser’s `SpeechRecognition` API handles microphone access and transcription internally, so this provider fills both the input capture and speech-to-text slots in the pipeline.	`LiveSTTProvider.roles`	-	src/providers/stt/native/NativeSTT.ts:179
`transcriptionCallback?`	`protected`	(`result`) => `void`	`undefined`	Callback registered by the SDK or consumer to receive transcription results. Set via onTranscription.	-	`LiveSTTProvider.transcriptionCallback`	src/providers/base/BaseSTTProvider.ts:73
`type`	`readonly`	`ProviderType`	`undefined`	Communication transport this provider uses (`'rest'` or `'websocket'`).	-	`LiveSTTProvider.type`	src/providers/base/BaseProvider.ts:74

Methods

assertReady()

protected assertReady(): void;

Defined in: src/providers/base/BaseProvider.ts:255

Guard that throws if the provider has not been initialized.

Returns

void

Remarks

Call at the start of any method that requires the provider to be ready.

Throws

Error Thrown with a descriptive message when initialized is false.

Inherited from

LiveSTTProvider.assertReady

connect()

connect(): Promise<void>;

Defined in: src/providers/stt/native/NativeSTT.ts:415

Start the browser’s speech recognition engine.

Returns

Promise<void>

Remarks

Checks microphone permission, then calls SpeechRecognition.start(). The returned promise resolves once the browser fires the onstart event, or rejects if the start times out or permission is denied.

Throws

ProviderConnectionError Thrown when the provider is not initialized, microphone permission is denied, or the recognition engine does not start within startTimeout milliseconds.
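
The start-or-timeout behaviour described above can be sketched as a promise that settles on whichever happens first. This is an illustration under stated assumptions, not the provider's source: RecognitionLike is a minimal stand-in for SpeechRecognition, startWithTimeout is a hypothetical name, a plain Error stands in for ProviderConnectionError, and the microphone permission check is omitted.

```typescript
// Minimal stand-in for the parts of SpeechRecognition used here.
interface RecognitionLike {
  onstart: (() => void) | null;
  start(): void;
}

// Resolves when `onstart` fires; rejects if it has not fired within timeoutMs.
function startWithTimeout(recognition: RecognitionLike, timeoutMs: number): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const timer = setTimeout(() => {
      recognition.onstart = null;
      reject(new Error(`Recognition did not start within ${timeoutMs} ms`));
    }, timeoutMs);

    // Register the handler before calling start() so the event cannot be missed.
    recognition.onstart = () => {
      clearTimeout(timer);
      resolve();
    };

    try {
      recognition.start();
    } catch (err) {
      // start() can throw synchronously, for example when already started.
      clearTimeout(timer);
      reject(err);
    }
  });
}
```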

Overrides

LiveSTTProvider.connect

disconnect()

disconnect(): Promise<void>;

Defined in: src/providers/stt/native/NativeSTT.ts:508

Stop the browser’s speech recognition engine.

Returns

Promise<void>

Resolves immediately after requesting the stop.

Remarks

Calls SpeechRecognition.stop(). If the recognition instance has already been stopped, any resulting error is silently ignored.

Overrides

LiveSTTProvider.disconnect

dispose()

dispose(): Promise<void>;

Defined in: src/providers/base/BaseProvider.ts:154

Clean up resources and dispose of the provider.

Returns

Promise<void>

Remarks

Delegates to the subclass hook onDispose and resets the initialized flag. If the provider is not initialized, the call is a no-op.

Throws

Re-throws any error raised by onDispose.

Inherited from

LiveSTTProvider.dispose

emitTranscription()

protected emitTranscription(result): void;

Defined in: src/providers/base/BaseSTTProvider.ts:113

Emit a transcription result to the registered callback.

Parameters

Parameter	Type	Description
`result`	`TranscriptionResult`	The transcription result to emit.

Returns

void

Remarks

Subclasses call this method whenever transcribed text is available. If no callback has been registered via onTranscription, the result is logged as a warning and dropped.
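
A minimal sketch of the register-then-emit behaviour described above. EmitterSketch and ResultSketch are hypothetical names; only the text and isFinal fields are taken from the example on this page, a string array stands in for the logger, and emitTranscription is made public here purely so it can be called from outside the class.

```typescript
// Only `text` and `isFinal` are taken from the example on this page.
interface ResultSketch {
  text: string;
  isFinal: boolean;
}

class EmitterSketch {
  private callback?: (result: ResultSketch) => void;
  // Stand-in for the provider's logger.
  readonly warnings: string[] = [];

  onTranscription(callback: (result: ResultSketch) => void): void {
    this.callback = callback;
  }

  // Public here for demonstration; the documented method is protected.
  emitTranscription(result: ResultSketch): void {
    if (!this.callback) {
      this.warnings.push(`No transcription callback; dropped "${result.text}"`);
      return;
    }
    this.callback(result);
  }
}

const emitter = new EmitterSketch();
emitter.emitTranscription({ text: 'lost', isFinal: true }); // dropped with a warning

const received: string[] = [];
emitter.onTranscription((r) => received.push(r.text));
emitter.emitTranscription({ text: 'hello', isFinal: true }); // delivered
```

The practical consequence is ordering: registering the callback before connect() avoids losing early results.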

Inherited from

LiveSTTProvider.emitTranscription

getConfig()

getConfig(): STTProviderConfig;

Defined in: src/providers/base/BaseSTTProvider.ts:132

Get a shallow copy of the current STT configuration.

Returns

STTProviderConfig

A new STTProviderConfig object.
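
Because the copy is shallow, top-level edits to the returned object stay local, while nested objects remain shared with the provider. The sketch below illustrates the general semantics only; ConfigSketch, ConfigHolder and the nested keywords field are hypothetical and chosen to make the effect visible.

```typescript
// `keywords` is a hypothetical nested field used to show the shallow-copy effect.
interface ConfigSketch {
  language: string;
  keywords: string[];
}

class ConfigHolder {
  private config: ConfigSketch = { language: 'en-US', keywords: ['hello'] };

  // Shallow copy: top-level properties are copied, nested objects are shared.
  getConfig(): ConfigSketch {
    return { ...this.config };
  }
}

const holder = new ConfigHolder();
const snapshot = holder.getConfig();

snapshot.language = 'fr-FR';     // affects only the copy
snapshot.keywords.push('world'); // mutates the array shared with the holder

console.log(holder.getConfig().language); // en-US
console.log(holder.getConfig().keywords); // [ 'hello', 'world' ]
```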

Inherited from

LiveSTTProvider.getConfig

getMetadata()

getMetadata(): AudioMetadata;

Defined in: src/providers/stt/native/NativeSTT.ts:659

Returns sensible audio metadata defaults for the Web Speech API.

Returns

AudioMetadata

AudioMetadata with sampleRate: 16000, encoding: 'linear16', channels: 1, bitDepth: 16

Remarks

The Web Speech API does not expose the actual audio format it uses internally, so this returns reasonable defaults matching the most common browser configuration. These values are used by the pipeline’s STT metadata auto-configuration when NativeSTT is the input provider.
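
As a worked example of what these defaults imply, the nominal data rate of uncompressed PCM audio is sampleRate × channels × bitDepth / 8. MetadataSketch and bytesPerSecond are hypothetical names; since NativeSTT never emits raw audio, the figure is purely nominal.

```typescript
// Field names follow the return value documented above.
interface MetadataSketch {
  sampleRate: number;
  encoding: string;
  channels: number;
  bitDepth: number;
}

const nativeDefaults: MetadataSketch = {
  sampleRate: 16000,
  encoding: 'linear16',
  channels: 1,
  bitDepth: 16,
};

// Nominal bytes per second of uncompressed PCM audio.
function bytesPerSecond(meta: MetadataSketch): number {
  return meta.sampleRate * meta.channels * (meta.bitDepth / 8);
}

console.log(bytesPerSecond(nativeDefaults)); // 32000
```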

See

AudioInputProvider.getMetadata

initialize()

initialize(): Promise<void>;

Defined in: src/providers/base/BaseProvider.ts:127

Initialize the provider, making it ready for use.

Returns

Promise<void>

Remarks

Calls the subclass hook onInitialize. If the provider has already been initialized the call is a no-op.

Throws

ProviderInitializationError Thrown when onInitialize rejects. The original error is wrapped with the provider class name for diagnostics.
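
The idempotent call and the error wrapping described above follow the template-method pattern. A minimal sketch, assuming nothing about the real base class beyond this page: LifecycleSketch, InitErrorSketch and the two subclasses are hypothetical, and InitErrorSketch only approximates ProviderInitializationError, whose real shape is not shown here.

```typescript
// Stand-in for ProviderInitializationError; its real shape is not shown on this page.
class InitErrorSketch extends Error {
  readonly original: unknown;

  constructor(providerName: string, original: unknown) {
    super(`${providerName} failed to initialize: ${String(original)}`);
    this.name = 'InitErrorSketch';
    this.original = original;
  }
}

abstract class LifecycleSketch {
  protected initialized = false;

  protected abstract onInitialize(): Promise<void>;

  async initialize(): Promise<void> {
    if (this.initialized) return; // repeat calls are a no-op
    try {
      await this.onInitialize();
    } catch (err) {
      // Wrap with the class name for diagnostics.
      throw new InitErrorSketch(this.constructor.name, err);
    }
    this.initialized = true;
  }

  isReady(): boolean {
    return this.initialized;
  }
}

class CountingProvider extends LifecycleSketch {
  calls = 0;
  protected async onInitialize(): Promise<void> {
    this.calls += 1;
  }
}

class FailingProvider extends LifecycleSketch {
  protected async onInitialize(): Promise<void> {
    throw new Error('Web Speech API unavailable');
  }
}
```

Calling initialize() twice on CountingProvider runs the hook once; FailingProvider rejects with a message that names the class.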

Inherited from

LiveSTTProvider.initialize

isActive()

isActive(): boolean;

Defined in: src/providers/stt/native/NativeSTT.ts:625

Check whether the SpeechRecognition engine is actively listening.

Returns

boolean

true when recognition is active (between connect() and disconnect()).

See

AudioInputProvider.isActive

isConnected()

isConnected(): boolean;

Defined in: src/providers/stt/native/NativeSTT.ts:532

Check whether the SpeechRecognition instance exists and is ready.

Returns

boolean

true when the recognition object has been created (after initialization).

isReady()

isReady(): boolean;

Defined in: src/providers/base/BaseProvider.ts:178

Check whether the provider has been initialized and is ready.

Returns

boolean

true when initialize has completed successfully and dispose has not yet been called.

Inherited from

LiveSTTProvider.isReady

onAudio()

onAudio(_callback): void;

Defined in: src/providers/stt/native/NativeSTT.ts:641

No-op — NativeSTT directly accesses the microphone via the browser’s SpeechRecognition API and does not emit raw audio chunks.

Parameters

Parameter	Type	Description
`_callback`	(`chunk`) => `void`	Audio callback (unused).

Returns

void

Remarks

The browser handles audio capture internally. This method exists solely to satisfy the AudioInputProvider interface.

See

AudioInputProvider.onAudio

onConfigUpdate()

protected onConfigUpdate(_config): void;

Defined in: src/providers/base/BaseProvider.ts:242

Hook called after updateConfig merges new values.

Parameters

Parameter	Type	Description
`_config`	`Partial`<`BaseProviderConfig`>	The partial configuration that was merged.

Returns

void

Remarks

The default implementation is a no-op. Override in subclasses to react to runtime configuration changes (e.g. reconnect with a new API key).

Inherited from

LiveSTTProvider.onConfigUpdate

onDispose()

protected onDispose(): Promise<void>;

Defined in: src/providers/stt/native/NativeSTT.ts:253

Disconnect and release the SpeechRecognition instance.

Returns

Promise<void>

Overrides

LiveSTTProvider.onDispose

onInitialize()

protected onInitialize(): Promise<void>;

Defined in: src/providers/stt/native/NativeSTT.ts:220

Initialize the SpeechRecognition instance and configure it.

Returns

Promise<void>

Throws

Error Thrown when the Web Speech API is not available in the current browser.

Overrides

LiveSTTProvider.onInitialize

onTranscription()

onTranscription(callback): void;

Defined in: src/providers/base/BaseSTTProvider.ts:98

Register a callback that receives each transcription result.

Parameters

Parameter	Type	Description
`callback`	(`result`) => `void`	Function invoked with each TranscriptionResult.

Returns

void

Remarks

All STT providers — regardless of transport — deliver text through this callback. CompositeVoice registers it during pipeline setup so that transcription results flow into the conversation manager and, ultimately, the LLM provider.
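
When interimResults is enabled, a consumer callback typically needs to separate interim guesses from final text. One possible approach is sketched below; TranscriptBuffer and ResultSketch are hypothetical, and only the text and isFinal fields are taken from the example on this page.

```typescript
// Only `text` and `isFinal` are taken from the example on this page.
interface ResultSketch {
  text: string;
  isFinal: boolean;
}

class TranscriptBuffer {
  private finals: string[] = [];
  private interim = '';

  // Arrow property so it can be passed directly as a callback.
  handle = (result: ResultSketch): void => {
    if (result.isFinal) {
      this.finals.push(result.text.trim());
      this.interim = '';
    } else {
      // Each interim result replaces the previous guess.
      this.interim = result.text.trim();
    }
  };

  // Finalized text only, e.g. for forwarding downstream.
  get committed(): string {
    return this.finals.join(' ');
  }

  // Finalized text plus the current interim guess, e.g. for display.
  get display(): string {
    return [...this.finals, this.interim].filter((s) => s.length > 0).join(' ');
  }
}

const buffer = new TranscriptBuffer();
buffer.handle({ text: 'turn on', isFinal: false });
buffer.handle({ text: 'turn on the lights', isFinal: true });
buffer.handle({ text: 'in the', isFinal: false });

console.log(buffer.committed); // turn on the lights
console.log(buffer.display);   // turn on the lights in the
```

With a real provider, buffer.handle would be passed to stt.onTranscription.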

Inherited from

LiveSTTProvider.onTranscription

pause()

pause(): void;

Defined in: src/providers/stt/native/NativeSTT.ts:596

Pause audio capture by stopping recognition.

Returns

void

Remarks

The Web Speech API’s SpeechRecognition does not support a native pause operation, so this delegates to disconnect() to halt recognition. Use resume() to restart.

See

AudioInputProvider.pause

resume()

resume(): void;

Defined in: src/providers/stt/native/NativeSTT.ts:611

Resume audio capture after a pause.

Returns

void

Remarks

Delegates to connect() to restart the SpeechRecognition engine after a pause().
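
The pause/resume delegation can be sketched as follows. How the real provider bridges the void return type with the asynchronous connect() and disconnect() is not shown on this page; catching and recording the error, as done here, is an assumption, and PauseResumeSketch is a hypothetical class with a flag in place of the recognition engine.

```typescript
class PauseResumeSketch {
  private listening = false;
  // Stand-in for the provider's logger.
  readonly errors: unknown[] = [];

  async connect(): Promise<void> {
    this.listening = true; // the real provider starts SpeechRecognition here
  }

  async disconnect(): Promise<void> {
    this.listening = false; // the real provider stops SpeechRecognition here
  }

  // No native pause exists, so pausing is a full stop.
  pause(): void {
    this.disconnect().catch((err) => this.errors.push(err));
  }

  // Resuming starts a fresh recognition session.
  resume(): void {
    this.connect().catch((err) => this.errors.push(err));
  }

  isActive(): boolean {
    return this.listening;
  }
}

const session = new PauseResumeSketch();
session.resume();
console.log(session.isActive()); // true
session.pause();
console.log(session.isActive()); // false
```

Since a pause is a full stop, speech uttered while paused is not captured.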

See

AudioInputProvider.resume

sendAudio()

sendAudio(_chunk): void;

Defined in: src/providers/stt/native/NativeSTT.ts:546

No-op — NativeSTT directly accesses the microphone via the SpeechRecognition API and does not accept external audio data.

Parameters

Parameter	Type	Description
`_chunk`	`ArrayBuffer`	Audio chunk (unused).

Returns

void

Remarks

CompositeVoice should not call this method because NativeSTT covers the 'input' role internally. Any invocation is silently ignored.

Overrides

LiveSTTProvider.sendAudio

start()

start(): void;

Defined in: src/providers/stt/native/NativeSTT.ts:565

Start capturing audio via the browser’s SpeechRecognition API.

Returns

void

Remarks

Delegates to connect(). This method exists to satisfy the AudioInputProvider interface for duck-type validation in the provider resolution algorithm. In the multi-role simplified path, the orchestrator calls connect() directly.
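
Duck-type validation of this kind usually means checking that an object exposes the expected methods. A sketch under assumptions: the method list is taken from the AudioInputProvider references on this page and may be incomplete, and looksLikeAudioInput is a hypothetical name, not the library's resolver.

```typescript
// Method names taken from the AudioInputProvider references on this page.
const AUDIO_INPUT_METHODS = [
  'start', 'stop', 'pause', 'resume', 'isActive', 'onAudio', 'getMetadata',
] as const;

// True when every expected method is present and callable.
function looksLikeAudioInput(candidate: unknown): boolean {
  if (typeof candidate !== 'object' || candidate === null) return false;
  const record = candidate as Record<string, unknown>;
  return AUDIO_INPUT_METHODS.every((method) => typeof record[method] === 'function');
}

const noop = (): void => {};
const fullInput = {
  start: noop, stop: noop, pause: noop, resume: noop,
  isActive: () => false, onAudio: noop, getMetadata: () => ({}),
};
const sttOnly = { connect: noop, disconnect: noop, sendAudio: noop };

console.log(looksLikeAudioInput(fullInput)); // true
console.log(looksLikeAudioInput(sttOnly));   // false
```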

See

AudioInputProvider.start

stop()

stop(): void;

Defined in: src/providers/stt/native/NativeSTT.ts:579

Stop capturing audio via the browser’s SpeechRecognition API.

Returns

void

Remarks

Delegates to disconnect().

See

AudioInputProvider.stop

updateConfig()

updateConfig(config): void;

Defined in: src/providers/base/BaseProvider.ts:201

Merge partial configuration updates into the current config.

Parameters

Parameter	Type	Description
`config`	`Partial`<`BaseProviderConfig`>	A partial configuration object whose keys will overwrite existing values.

Returns

void

Remarks

After merging, the subclass hook onConfigUpdate is called so providers can react to changed values at runtime.
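
The merge-then-notify sequence can be sketched as below. UpdatableSketch and BaseConfigSketch are hypothetical, and the debug field is invented for illustration; only language appears on this page.

```typescript
// `debug` is a hypothetical field; only `language` appears on this page.
interface BaseConfigSketch {
  language: string;
  debug: boolean;
}

class UpdatableSketch {
  protected config: BaseConfigSketch = { language: 'en-US', debug: false };
  // Records what the hook received, for demonstration.
  readonly seenUpdates: Array<Partial<BaseConfigSketch>> = [];

  updateConfig(partial: Partial<BaseConfigSketch>): void {
    // Keys in `partial` overwrite existing values; other keys are kept.
    this.config = { ...this.config, ...partial };
    this.onConfigUpdate(partial);
  }

  // Hook: subclasses would override this to react to changes.
  protected onConfigUpdate(partial: Partial<BaseConfigSketch>): void {
    this.seenUpdates.push(partial);
  }

  getConfig(): BaseConfigSketch {
    return { ...this.config };
  }
}

const updatable = new UpdatableSketch();
updatable.updateConfig({ language: 'fr-FR' });

console.log(updatable.getConfig()); // { language: 'fr-FR', debug: false }
```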

Inherited from

LiveSTTProvider.updateConfig