NativeSTT
Native browser STT provider backed by the Web Speech API (SpeechRecognition).
Defined in: src/providers/stt/native/NativeSTT.ts:168
Remarks
Unlike other STT providers, NativeSTT manages its own audio pipeline — the browser’s SpeechRecognition API directly accesses the microphone. Because of this, the provider declares roles: ['input', 'stt'] and CompositeVoice will not set up a separate AudioInputProvider. The sendAudio method is a no-op.
Transport: WebSocket-like (browser-managed, extends LiveSTTProvider)
Browser support:
- Chrome / Edge (Chromium): Full support via SpeechRecognition
- Safari: Partial support via webkitSpeechRecognition
- Firefox: Not supported (as of 2025)
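The support list above suggests checking which constructor the browser exposes before creating the provider. The following is a minimal sketch of such a check; `detectSpeechSupport` and the `SpeechGlobals` type are hypothetical helpers written for illustration and are not part of composite-voice.

```typescript
// Structural view of the global object: only the two constructor names
// from the support list are inspected.
interface SpeechGlobals {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

type SpeechSupport = 'standard' | 'prefixed' | 'none';

// Hypothetical helper (an assumption, not a library function): reports
// which constructor, if any, the given global object exposes.
function detectSpeechSupport(globals: SpeechGlobals): SpeechSupport {
  if (typeof globals.SpeechRecognition === 'function') return 'standard';
  if (typeof globals.webkitSpeechRecognition === 'function') return 'prefixed';
  return 'none';
}
```

In a browser the argument would be the global object (for example `window`); a result of `'none'` corresponds to the case in which `onInitialize()` is documented to throw.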
Data flow:
Microphone -> SpeechRecognition API (browser) -> onresult event -> onTranscription(result) -> CompositeVoice
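To make the data flow concrete, here is a sketch of how an `onresult` event could be turned into the `{ text, isFinal }` values that reach `onTranscription`. The event fields used (`resultIndex`, `results`, `isFinal`, `transcript`) follow the Web Speech API; the `*Like` types and `toTranscriptions` are illustrative assumptions, and the actual mapping inside NativeSTT may differ.

```typescript
// Structural stand-ins for the Web Speech API event types, so the sketch
// does not depend on DOM typings.
interface AlternativeLike { transcript: string; confidence: number; }
interface RecognitionResultLike {
  isFinal: boolean;
  length: number;
  [index: number]: AlternativeLike;
}
interface RecognitionEventLike {
  resultIndex: number;
  results: ArrayLike<RecognitionResultLike>;
}

// Assumed result shape, based on the fields used in the example above.
interface SketchTranscription { text: string; isFinal: boolean; }

// Hypothetical mapper: takes the results that changed in this event
// (from resultIndex onward) and keeps the top alternative of each.
function toTranscriptions(ev: RecognitionEventLike): SketchTranscription[] {
  const out: SketchTranscription[] = [];
  for (let i = ev.resultIndex; i < ev.results.length; i++) {
    const res = ev.results[i];
    if (!res) continue;
    const best = res[0];
    if (!best) continue;
    out.push({ text: best.transcript, isFinal: res.isFinal });
  }
  return out;
}
```

Starting at `resultIndex` avoids re-emitting results that earlier events already delivered.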
Example
import { NativeSTT } from 'composite-voice';
const stt = new NativeSTT({
language: 'en-US',
continuous: true,
interimResults: true,
maxAlternatives: 1,
});
await stt.initialize();
stt.onTranscription((result) => {
console.log(result.text, result.isFinal);
});
await stt.connect(); // starts listening
// ... later ...
await stt.disconnect(); // stops listening
See
- LiveSTTProvider for the base WebSocket STT class
- NativeSTTConfig for configuration options
- DeepgramSTT for an alternative real-time STT provider
Extends
LiveSTTProvider
Constructors
Constructor
new NativeSTT(config?, logger?): NativeSTT;
Defined in: src/providers/stt/native/NativeSTT.ts:203
Create a new NativeSTT provider.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config? | Partial<NativeSTTConfig> | Partial configuration; unset values receive the defaults language: 'en-US', continuous: true, interimResults: true, maxAlternatives: 1. |
| logger? | Logger | Optional parent logger; a child logger is derived from it. |
Returns
NativeSTT
Example
const stt = new NativeSTT({ language: 'fr-FR', continuous: false });
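The defaulting described in the parameter table can be pictured as an object spread over a defaults object. This is a sketch under that assumption; `SketchNativeConfig`, `SKETCH_DEFAULTS`, and `resolveSketchConfig` are illustrative names, and only the four documented fields are modeled.

```typescript
// Only the four fields named in the constructor docs are modeled here.
interface SketchNativeConfig {
  language: string;
  continuous: boolean;
  interimResults: boolean;
  maxAlternatives: number;
}

// Default values as listed in the parameter description.
const SKETCH_DEFAULTS: SketchNativeConfig = {
  language: 'en-US',
  continuous: true,
  interimResults: true,
  maxAlternatives: 1,
};

// Hypothetical helper: later spreads win, so caller values override defaults.
function resolveSketchConfig(
  partial: Partial<SketchNativeConfig> = {},
): SketchNativeConfig {
  return { ...SKETCH_DEFAULTS, ...partial };
}
```

With the constructor example above, `{ language: 'fr-FR', continuous: false }` would keep `interimResults: true` and `maxAlternatives: 1` from the defaults.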
Overrides
LiveSTTProvider.constructor
Properties
| Property | Modifier | Type | Default value | Description | Overrides | Inherited from | Defined in |
|---|---|---|---|---|---|---|---|
| config | public | NativeSTTConfig | undefined | STT-specific provider configuration. | LiveSTTProvider.config | - | src/providers/stt/native/NativeSTT.ts:169 |
| initialized | protected | boolean | false | Tracks whether initialize has completed successfully. | - | LiveSTTProvider.initialized | src/providers/base/BaseProvider.ts:97 |
| logger | protected | Logger | undefined | Scoped logger instance for this provider. | - | LiveSTTProvider.logger | src/providers/base/BaseProvider.ts:94 |
| roles | readonly | readonly ProviderRole[] | undefined | NativeSTT covers both 'input' and 'stt' pipeline roles. Remarks: The browser’s SpeechRecognition API handles microphone access and transcription internally, so this provider fills both the input capture and speech-to-text slots in the pipeline. | LiveSTTProvider.roles | - | src/providers/stt/native/NativeSTT.ts:179 |
| transcriptionCallback? | protected | (result) => void | undefined | Callback registered by the SDK or consumer to receive transcription results. Set via onTranscription. | - | LiveSTTProvider.transcriptionCallback | src/providers/base/BaseSTTProvider.ts:73 |
| type | readonly | ProviderType | undefined | Communication transport this provider uses ('rest' or 'websocket'). | - | LiveSTTProvider.type | src/providers/base/BaseProvider.ts:74 |
Methods
assertReady()
protected assertReady(): void;
Defined in: src/providers/base/BaseProvider.ts:255
Guard that throws if the provider has not been initialized.
Returns
void
Remarks
Call at the start of any method that requires the provider to be ready.
Throws
Error Thrown with a descriptive message when initialized is false.
Inherited from
LiveSTTProvider.assertReady
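The guard pattern described for `assertReady()` can be sketched as follows. `SketchGuardedProvider`, `markInitialized()`, and `transcribeOnce()` are hypothetical stand-ins; the real base class sets the flag inside `initialize()`, and its error message is not reproduced here.

```typescript
// Minimal stand-in for a provider base class with a readiness guard.
class SketchGuardedProvider {
  protected initialized = false;

  // Stands in for a successful initialize(); the real method is async.
  markInitialized(): void {
    this.initialized = true;
  }

  // Throws a descriptive error when the provider is not ready.
  protected assertReady(): void {
    if (!this.initialized) {
      throw new Error(
        `${this.constructor.name} is not initialized; call initialize() first`,
      );
    }
  }

  // Any method that needs a ready provider calls the guard first.
  transcribeOnce(): string {
    this.assertReady();
    return 'ok';
  }
}
```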
connect()
connect(): Promise<void>;
Defined in: src/providers/stt/native/NativeSTT.ts:415
Start the browser’s speech recognition engine.
Returns
Promise<void>
Remarks
Checks microphone permission, then calls SpeechRecognition.start(). The returned promise resolves once the browser fires the onstart event, or rejects if the start times out or permission is denied.
Throws
ProviderConnectionError Thrown when the provider is not initialized, microphone permission is denied, or the recognition engine does not start within startTimeout milliseconds.
Overrides
LiveSTTProvider.connect
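The resolve-on-`onstart`, reject-on-timeout behaviour described for `connect()` can be sketched as a promise wrapper around a recognition object. This is an illustration only: `startWithTimeout` and `StartableRecognition` are hypothetical, the sketch rejects with a plain `Error` rather than `ProviderConnectionError`, and it omits the microphone permission check.

```typescript
// Structural stand-in for the parts of SpeechRecognition used here.
interface StartableRecognition {
  onstart: (() => void) | null;
  onerror: ((err: { error: string }) => void) | null;
  start(): void;
}

// Hypothetical helper: resolves when onstart fires, rejects on an error
// event, a synchronous start() failure, or after timeoutMs milliseconds.
function startWithTimeout(
  rec: StartableRecognition,
  timeoutMs: number,
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`recognition did not start within ${timeoutMs} ms`));
    }, timeoutMs);

    rec.onstart = () => {
      clearTimeout(timer);
      resolve();
    };
    rec.onerror = (err) => {
      clearTimeout(timer);
      reject(new Error(`recognition error: ${err.error}`));
    };

    try {
      rec.start();
    } catch (e) {
      clearTimeout(timer);
      reject(e instanceof Error ? e : new Error(String(e)));
    }
  });
}
```

Clearing the timer on every exit path keeps a late timeout from firing after the promise has already settled.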
disconnect()
disconnect(): Promise<void>;
Defined in: src/providers/stt/native/NativeSTT.ts:508
Stop the browser’s speech recognition engine.
Returns
Promise<void>
Resolves immediately after requesting the stop.
Remarks
Calls SpeechRecognition.stop(). If the recognition instance has already been stopped, the error is silently ignored.
Overrides
LiveSTTProvider.disconnect
dispose()
dispose(): Promise<void>;
Defined in: src/providers/base/BaseProvider.ts:154
Clean up resources and dispose of the provider.
Returns
Promise<void>
Remarks
Delegates to the subclass hook onDispose and resets the initialized flag. If the provider is not initialized, the call is a no-op.
Throws
Re-throws any error raised by onDispose.
Inherited from
LiveSTTProvider.dispose
emitTranscription()
protected emitTranscription(result): void;
Defined in: src/providers/base/BaseSTTProvider.ts:113
Emit a transcription result to the registered callback.
Parameters
| Parameter | Type | Description |
|---|---|---|
| result | TranscriptionResult | The transcription result to emit. |
Returns
void
Remarks
Subclasses call this method whenever transcribed text is available. If no callback has been registered via onTranscription, the result is logged as a warning and dropped.
Inherited from
LiveSTTProvider.emitTranscription
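The relationship between `onTranscription()` and `emitTranscription()` can be sketched with a small stand-in class. `SketchSTTBase` and `SketchTranscriptResult` are illustrative assumptions; the sketch records warnings in an array where the real provider uses its logger.

```typescript
// Assumed result shape, limited to the two fields used in the class example.
interface SketchTranscriptResult { text: string; isFinal: boolean; }

class SketchSTTBase {
  private callback: ((result: SketchTranscriptResult) => void) | undefined;
  // Collected here for illustration instead of being sent to a logger.
  readonly warnings: string[] = [];

  // Register the single consumer of transcription results.
  onTranscription(callback: (result: SketchTranscriptResult) => void): void {
    this.callback = callback;
  }

  // Deliver a result, or warn and drop it when nobody is listening.
  emitTranscription(result: SketchTranscriptResult): void {
    if (this.callback === undefined) {
      this.warnings.push(`dropped transcription: "${result.text}"`);
      return;
    }
    this.callback(result);
  }
}
```

Registering the callback before starting recognition avoids losing early results to the warn-and-drop path.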
getConfig()
getConfig(): STTProviderConfig;
Defined in: src/providers/base/BaseSTTProvider.ts:132
Get a shallow copy of the current STT configuration.
Returns
A new STTProviderConfig object.
Inherited from
LiveSTTProvider.getConfig
getMetadata()
getMetadata(): AudioMetadata;
Defined in: src/providers/stt/native/NativeSTT.ts:659
Returns sensible audio metadata defaults for the Web Speech API.
Returns
AudioMetadata with sampleRate: 16000, encoding: 'linear16', channels: 1, bitDepth: 16
Remarks
The Web Speech API does not expose the actual audio format it uses internally, so this returns reasonable defaults matching the most common browser configuration. These values are used by the pipeline’s STT metadata auto-configuration when NativeSTT is the input provider.
See
AudioInputProvider.getMetadata
initialize()
initialize(): Promise<void>;
Defined in: src/providers/base/BaseProvider.ts:127
Initialize the provider, making it ready for use.
Returns
Promise<void>
Remarks
Calls the subclass hook onInitialize. If the provider has already been initialized the call is a no-op.
Throws
ProviderInitializationError Thrown when onInitialize rejects. The original error is wrapped with the provider class name for diagnostics.
Inherited from
LiveSTTProvider.initialize
isActive()
isActive(): boolean;
Defined in: src/providers/stt/native/NativeSTT.ts:625
Check whether the SpeechRecognition engine is actively listening.
Returns
boolean
true when recognition is active (between connect() and disconnect()).
isConnected()
isConnected(): boolean;
Defined in: src/providers/stt/native/NativeSTT.ts:532
Check whether the SpeechRecognition instance exists and is ready.
Returns
boolean
true when the recognition object has been created (after initialization).
isReady()
isReady(): boolean;
Defined in: src/providers/base/BaseProvider.ts:178
Check whether the provider has been initialized and is ready.
Returns
boolean
true when initialize has completed successfully and dispose has not yet been called.
Inherited from
LiveSTTProvider.isReady
onAudio()
onAudio(_callback): void;
Defined in: src/providers/stt/native/NativeSTT.ts:641
No-op — NativeSTT directly accesses the microphone via the browser’s SpeechRecognition API and does not emit raw audio chunks.
Parameters
| Parameter | Type | Description |
|---|---|---|
| _callback | (chunk) => void | Audio callback (unused). |
Returns
void
Remarks
The browser handles audio capture internally. This method exists solely to satisfy the AudioInputProvider interface.
onConfigUpdate()
protected onConfigUpdate(_config): void;
Defined in: src/providers/base/BaseProvider.ts:242
Hook called after updateConfig merges new values.
Parameters
| Parameter | Type | Description |
|---|---|---|
| _config | Partial<BaseProviderConfig> | The partial configuration that was merged. |
Returns
void
Remarks
The default implementation is a no-op. Override in subclasses to react to runtime configuration changes (e.g. reconnect with a new API key).
Inherited from
LiveSTTProvider.onConfigUpdate
onDispose()
protected onDispose(): Promise<void>;
Defined in: src/providers/stt/native/NativeSTT.ts:253
Disconnect and release the SpeechRecognition instance.
Returns
Promise<void>
Overrides
LiveSTTProvider.onDispose
onInitialize()
protected onInitialize(): Promise<void>;
Defined in: src/providers/stt/native/NativeSTT.ts:220
Initialize the SpeechRecognition instance and configure it.
Returns
Promise<void>
Throws
Error Thrown when the Web Speech API is not available in the current browser.
Overrides
LiveSTTProvider.onInitialize
onTranscription()
onTranscription(callback): void;
Defined in: src/providers/base/BaseSTTProvider.ts:98
Register a callback to receive transcription results.
Parameters
| Parameter | Type | Description |
|---|---|---|
| callback | (result) => void | Function invoked with each TranscriptionResult. |
Returns
void
Remarks
All STT providers — regardless of transport — deliver text through this callback. CompositeVoice registers it during pipeline setup so that transcription results flow into the conversation manager and, ultimately, the LLM provider.
Inherited from
LiveSTTProvider.onTranscription
pause()
pause(): void;
Defined in: src/providers/stt/native/NativeSTT.ts:596
Pause audio capture by stopping recognition.
Returns
void
Remarks
The Web Speech API’s SpeechRecognition does not support a native pause operation, so this delegates to disconnect() to halt recognition. Use resume() to restart.
resume()
resume(): void;
Defined in: src/providers/stt/native/NativeSTT.ts:611
Resume audio capture after a pause.
Returns
void
Remarks
Delegates to connect() to restart the SpeechRecognition engine after a pause().
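The delegation described for `pause()` and `resume()` can be sketched as synchronous wrappers around the asynchronous `disconnect()` and `connect()`. `SketchListener` is a hypothetical stand-in; how NativeSTT handles the promise returned by the delegated call is not stated in this reference, so the fire-and-forget `void` below is an assumption.

```typescript
// Stand-in provider: a boolean replaces the real SpeechRecognition engine.
class SketchListener {
  private listening = false;

  async connect(): Promise<void> {
    this.listening = true; // real code would call SpeechRecognition.start()
  }

  async disconnect(): Promise<void> {
    this.listening = false; // real code would call SpeechRecognition.stop()
  }

  // No native pause exists, so pausing is modeled as a full stop.
  pause(): void {
    void this.disconnect();
  }

  // Resuming restarts recognition from scratch.
  resume(): void {
    void this.connect();
  }

  isActive(): boolean {
    return this.listening;
  }
}
```

Because resuming restarts recognition, any interim result in flight at the time of the pause should not be expected to continue afterwards.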
sendAudio()
sendAudio(_chunk): void;
Defined in: src/providers/stt/native/NativeSTT.ts:546
No-op — NativeSTT directly accesses the microphone via the SpeechRecognition API and does not accept external audio data.
Parameters
| Parameter | Type | Description |
|---|---|---|
| _chunk | ArrayBuffer | Audio chunk (unused). |
Returns
void
Remarks
CompositeVoice should not call this method because NativeSTT covers the 'input' role internally. Any invocation is silently ignored.
Overrides
LiveSTTProvider.sendAudio
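Since `sendAudio()` is never expected to be called when the provider already covers the 'input' role, an orchestrator could decide from `roles` alone whether a separate input provider is needed. The check below is a sketch of that idea; `needsSeparateInput` and `RoleBearing` are hypothetical and do not describe the actual provider resolution algorithm.

```typescript
// Anything that declares pipeline roles, as NativeSTT does via `roles`.
interface RoleBearing {
  readonly roles: readonly string[];
}

// Hypothetical check: a separate AudioInputProvider is only needed when
// the STT provider does not itself cover the 'input' role.
function needsSeparateInput(stt: RoleBearing): boolean {
  return stt.roles.indexOf('input') === -1;
}
```

For a provider declaring `['input', 'stt']` the check returns `false`, which matches the statement that CompositeVoice does not set up a separate AudioInputProvider for NativeSTT.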
start()
start(): void;
Defined in: src/providers/stt/native/NativeSTT.ts:565
Start capturing audio via the browser’s SpeechRecognition API.
Returns
void
Remarks
Delegates to connect(). This method exists to satisfy the AudioInputProvider interface for duck-type validation in the provider resolution algorithm. In the multi-role simplified path, the orchestrator calls connect() directly.
stop()
stop(): void;
Defined in: src/providers/stt/native/NativeSTT.ts:579
Stop capturing audio via the browser’s SpeechRecognition API.
Returns
void
Remarks
Delegates to disconnect().
updateConfig()
updateConfig(config): void;
Defined in: src/providers/base/BaseProvider.ts:201
Merge partial configuration updates into the current config.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config | Partial<BaseProviderConfig> | A partial configuration object whose keys will overwrite existing values. |
Returns
void
Remarks
After merging, the subclass hook onConfigUpdate is called so providers can react to changed values at runtime.
Inherited from
LiveSTTProvider.updateConfig
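The merge-then-hook sequence of `updateConfig()` and `onConfigUpdate()`, together with the shallow copy returned by `getConfig()`, can be sketched as below. `SketchConfigurable` and the `debug` and `timeout` fields are illustrative assumptions, not the actual shape of BaseProviderConfig.

```typescript
// Hypothetical config fields, chosen only for illustration.
interface SketchBaseConfig {
  debug?: boolean;
  timeout?: number;
}

class SketchConfigurable {
  protected config: SketchBaseConfig;
  // Records what the hook received, so the call order can be observed.
  readonly hookCalls: Partial<SketchBaseConfig>[] = [];

  constructor(config: SketchBaseConfig = {}) {
    this.config = { ...config };
  }

  // Merge first, then notify the subclass hook with the partial update.
  updateConfig(partial: Partial<SketchBaseConfig>): void {
    this.config = { ...this.config, ...partial };
    this.onConfigUpdate(partial);
  }

  // Subclasses would override this to react to runtime changes.
  protected onConfigUpdate(partial: Partial<SketchBaseConfig>): void {
    this.hookCalls.push(partial);
  }

  // Shallow copy: mutating the returned object leaves the provider untouched.
  getConfig(): SketchBaseConfig {
    return { ...this.config };
  }
}
```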