EagerLLMConfig
Configuration for the eager LLM pipeline (speculative generation).
Defined in: src/core/types/config.ts:292
Remarks
When enabled, CompositeVoice starts LLM generation speculatively as soon as the STT provider emits a preflight (eager end-of-turn) signal, before speech_final is confirmed. This significantly reduces speech-to-first-token latency when using DeepgramFlux (e.g., flux-general-en).
If speech_final arrives with text that differs from the preflight text, the SDK can cancel the speculative generation and restart with the confirmed text (controlled by cancelOnTextChange and similarityThreshold).
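The flow described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's implementation: `EagerTurn`, `Generate`, `onPreflight`, and `onSpeechFinal` are hypothetical names, and the change check here is plain string equality rather than the similarity comparison governed by similarityThreshold.

```typescript
// Hypothetical stand-in for an LLM call; not part of the SDK's documented API.
type Generate = (prompt: string, signal: AbortSignal) => Promise<string>;

// Hypothetical helper that tracks one speculative generation per turn.
class EagerTurn {
  private generate: Generate;
  private controller: AbortController | null = null;
  private pending: Promise<string> | null = null;
  private preflightText = "";

  constructor(generate: Generate) {
    this.generate = generate;
  }

  // Preflight signal: start generating before speech_final is confirmed.
  onPreflight(text: string): void {
    this.controller?.abort();
    this.controller = new AbortController();
    this.preflightText = text;
    this.pending = this.generate(text, this.controller.signal);
    // Avoid an unhandled rejection if this speculative run is aborted later.
    this.pending.catch(() => {});
  }

  // speech_final: keep the speculative run, or abort it and restart.
  onSpeechFinal(text: string, cancelOnTextChange: boolean): Promise<string> {
    // Assumption: exact equality stands in for the similarity check.
    const textChanged = text !== this.preflightText;
    if (this.pending !== null && !(cancelOnTextChange && textChanged)) {
      return this.pending;
    }
    this.controller?.abort();
    this.controller = new AbortController();
    this.pending = this.generate(text, this.controller.signal);
    return this.pending;
  }
}
```

Passing the `AbortSignal` into the generation call is what lets a cancelled speculative request stop early instead of running to completion in the background.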
Example
const eagerLLM: EagerLLMConfig = {
  enabled: true,
  cancelOnTextChange: true,
};
See
- TranscriptionPreflightEvent for the event that triggers eager generation
- CompositeVoiceConfig for where this is used
Properties
| Property | Type | Default value | Description | Defined in |
|---|---|---|---|---|
| cancelOnTextChange? | boolean | true | Whether to cancel speculative generation if the confirmed text differs beyond the similarityThreshold. Remarks: When true, if speech_final arrives with text whose similarity to the preflight text is below similarityThreshold, the in-flight LLM generation is cancelled via AbortSignal and restarted with the confirmed text. When false, the preflight result is always accepted (lowest latency, small risk of an inaccurate response). | src/core/types/config.ts:313 |
| enabled | boolean | false | Whether to enable eager LLM start on STT preflight events. | src/core/types/config.ts:298 |
| similarityThreshold? | number | 0.8 | Minimum text similarity (0–1) for the eager LLM response to be accepted. Remarks: When the confirmed speech_final text arrives, it is compared to the preflight text using word-overlap similarity. If the score is at or above this threshold, the speculative LLM response is kept. If it is below, the response is cancelled and restarted (when cancelOnTextChange is true). A value of 1.0 requires an exact match (rarely useful in practice). A value of 0.8 allows minor additions at the end of the utterance. A value of 0.5 is very permissive and only cancels when the text changes substantially. | src/core/types/config.ts:332 |
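
To make the interaction between cancelOnTextChange and similarityThreshold concrete, here is a small sketch. The SDK's exact word-overlap formula is not specified on this page, so the version below assumes a Jaccard-style ratio (shared distinct words divided by all distinct words); `wordOverlapSimilarity`, `decide`, and `EagerOptions` are hypothetical names used for illustration only.

```typescript
// Hypothetical subset of the config fields used by this sketch.
interface EagerOptions {
  cancelOnTextChange?: boolean;
  similarityThreshold?: number;
}

// Lowercase, strip basic punctuation, and return the distinct words.
function uniqueWords(text: string): string[] {
  const words = text
    .toLowerCase()
    .replace(/[.,!?;:]/g, "")
    .split(/\s+/)
    .filter((w) => w.length > 0);
  return words.filter((w, i) => words.indexOf(w) === i);
}

// Assumed formula: shared distinct words / all distinct words (Jaccard).
function wordOverlapSimilarity(a: string, b: string): number {
  const wordsA = uniqueWords(a);
  const wordsB = uniqueWords(b);
  if (wordsA.length === 0 && wordsB.length === 0) return 1;
  const shared = wordsA.filter((w) => wordsB.indexOf(w) !== -1).length;
  const union = wordsA.length + wordsB.length - shared;
  return shared / union;
}

// Decide whether to keep the speculative response or restart it.
function decide(
  preflight: string,
  confirmed: string,
  options: EagerOptions = {}
): "keep" | "restart" {
  const cancelOnTextChange = options.cancelOnTextChange ?? true;
  const threshold = options.similarityThreshold ?? 0.8;
  if (!cancelOnTextChange) return "keep"; // preflight result always accepted
  const score = wordOverlapSimilarity(preflight, confirmed);
  return score >= threshold ? "keep" : "restart";
}

// 5 shared words out of 6 distinct words, so the score is about 0.83.
const decision = decide("book a table for two", "book a table for two please");
```

Under this assumed formula, one word appended to a five-word utterance still clears the default threshold of 0.8, which matches the "minor additions at the end of the utterance" behavior described in the table, while a threshold of 1.0 would force a restart for the same pair.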