WebLLMLLM
WebLLM in-browser LLM provider.
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:196
Remarks
Uses @mlc-ai/web-llm to run language models entirely in the browser via WebGPU. This provider is unique among CompositeVoice LLM providers in that it requires no API key, no server, and no network connection after the initial model download.
Key characteristics:
- Zero server cost: All inference runs on the user’s GPU.
- Privacy-first: No data leaves the browser.
- Offline capable: Works without network after the first model download.
- Abort support: Uses `engine.interruptGenerate()` when the abort signal fires, which is more reliable than HTTP cancellation for local inference.
- Resource cleanup: The `dispose()` method calls `engine.unload()` to free GPU memory.
Requirements:
- A browser with WebGPU support (Chrome 113+, Edge 113+); a detection sketch follows this list.
- The `@mlc-ai/web-llm` peer dependency must be installed.
- Sufficient GPU memory for the selected model.
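WebGPU availability can be verified before constructing the provider. A minimal detection sketch, assuming only the standard `navigator.gpu` API (the guard itself is illustrative, not part of CompositeVoice; typings come from `@webgpu/types`):
```ts
// Pre-flight check: bail out early if WebGPU is unavailable.
if (!('gpu' in navigator)) {
  throw new Error('WebGPU is not supported in this browser; WebLLMLLM cannot run.');
}

// Optionally confirm that a GPU adapter can actually be acquired.
const adapter = await navigator.gpu.requestAdapter();
if (adapter === null) {
  throw new Error('No suitable GPU adapter found.');
}
```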
Example
```ts
import { WebLLMLLM } from 'composite-voice';

const llm = new WebLLMLLM({
  model: 'Llama-3.2-1B-Instruct-q4f16_1-MLC',
  systemPrompt: 'You are a helpful local assistant.',
  onLoadProgress: ({ progress, text }) => {
    console.log(`Loading: ${Math.round(progress * 100)}% - ${text}`);
  },
});

await llm.initialize(); // Downloads model on first run

const stream = await llm.generate('Tell me about WebGPU.');
for await (const chunk of stream) {
  document.getElementById('output')!.textContent += chunk;
}

await llm.dispose(); // Frees GPU memory
```
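Because generation accepts an abort `signal` (see LLMGenerationOptions) that the provider maps to `engine.interruptGenerate()`, in-flight local inference can be cancelled. A usage sketch, assuming the standard `AbortController` pattern:
```ts
const controller = new AbortController();

// Cancel generation after five seconds (illustrative timeout).
setTimeout(() => controller.abort(), 5000);

try {
  const stream = await llm.generate('Write a long story.', { signal: controller.signal });
  for await (const chunk of stream) {
    console.log(chunk);
  }
} catch (err) {
  if ((err as Error).name === 'AbortError') {
    console.log('Generation was interrupted.');
  } else {
    throw err;
  }
}
```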
See
- WebLLMLLMConfig for configuration options.
- BaseLLMProvider for the abstract base class.
- OpenAICompatibleLLM for server-side alternatives.
Extends
- BaseLLMProvider
Constructors
Constructor
new WebLLMLLM(config, logger?): WebLLMLLM;
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:207
Creates a new WebLLM in-browser LLM provider instance.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config | WebLLMLLMConfig | WebLLM provider configuration. The only required field is model. |
| logger? | Logger | Optional custom logger instance. If omitted, a default logger is created by the base class. |
Returns
WebLLMLLM
Overrides
BaseLLMProvider.constructor
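Since model is the only required configuration field and no API key or server is needed, the smallest possible construction looks like this (model ID reused from the class example above):
```ts
// Minimal construction: only `model` is required.
const llm = new WebLLMLLM({ model: 'Llama-3.2-1B-Instruct-q4f16_1-MLC' });
```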
Properties
| Property | Modifier | Type | Default value | Description | Overrides | Inherited from | Defined in |
|---|---|---|---|---|---|---|---|
| config | public | WebLLMLLMConfig | undefined | LLM-specific provider configuration. | BaseLLMProvider.config | - | src/providers/llm/webllm/WebLLMLLM.ts:197 |
| initialized | protected | boolean | false | Tracks whether initialize has completed successfully. | - | BaseLLMProvider.initialized | src/providers/base/BaseProvider.ts:97 |
| logger | protected | Logger | undefined | Scoped logger instance for this provider. | - | BaseLLMProvider.logger | src/providers/base/BaseProvider.ts:94 |
| roles | readonly | readonly ProviderRole[] | undefined | LLM providers cover the 'llm' pipeline role by default. | - | BaseLLMProvider.roles | src/providers/base/BaseLLMProvider.ts:77 |
| type | readonly | ProviderType | undefined | Communication transport this provider uses ('rest' or 'websocket'). | - | BaseLLMProvider.type | src/providers/base/BaseProvider.ts:74 |
Accessors
isProxyMode
Get Signature
protected get isProxyMode(): boolean;
Defined in: src/providers/base/BaseProvider.ts:286
Whether the provider is in proxy mode.
Returns
boolean
true when proxyUrl is set.
Inherited from
BaseLLMProvider.isProxyMode
Methods
assertAuth()
protected assertAuth(): void;
Defined in: src/providers/base/BaseProvider.ts:272
Validate that auth is configured (either apiKey or proxyUrl).
Returns
void
Remarks
Call this in onInitialize() for any provider that requires external authentication. Native providers (NativeSTT, NativeTTS) and in-browser providers (WebLLM) should NOT call this method.
Throws
ProviderInitializationError Thrown when neither apiKey nor proxyUrl is set.
Inherited from
BaseLLMProvider.assertAuth
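As a sketch of the intended call site, a hypothetical externally-authenticated provider (the class name and hook body are illustrative) would invoke the guard at the top of `onInitialize()`; WebLLMLLM deliberately does not:
```ts
// Hypothetical provider that requires an external API key.
class ExampleCloudLLM extends BaseLLMProvider {
  protected override async onInitialize(): Promise<void> {
    // Throws ProviderInitializationError unless apiKey or proxyUrl is set.
    this.assertAuth();
    // ...create the SDK client here.
  }
  // (remaining abstract members omitted for brevity)
}
```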
assertReady()
protected assertReady(): void;
Defined in: src/providers/base/BaseProvider.ts:255
Guard that throws if the provider has not been initialized.
Returns
void
Remarks
Call at the start of any method that requires the provider to be ready.
Throws
Error Thrown with a descriptive message when initialized is false.
Inherited from
BaseLLMProvider.assertReady
dispose()
dispose(): Promise<void>;
Defined in: src/providers/base/BaseProvider.ts:154
Clean up resources and dispose of the provider.
Returns
Promise<void>
Remarks
Delegates to the subclass hook onDispose and resets the initialized flag. If the provider is not initialized, the call is a no-op.
Throws
Re-throws any error raised by onDispose.
Inherited from
BaseLLMProvider.dispose
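Because disposal re-throws `onDispose` errors but is a no-op on an uninitialized provider, a `try`/`finally` wrapper is a safe way to guarantee GPU memory is released even when generation fails. A usage sketch:
```ts
await llm.initialize();
try {
  const stream = await llm.generate('Summarize WebGPU in one sentence.');
  for await (const chunk of stream) {
    console.log(chunk);
  }
} finally {
  // Runs on both success and error paths; a second call after disposal is a no-op.
  await llm.dispose();
}
```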
generate()
generate(prompt, options?): Promise<AsyncIterable<string, any, any>>;
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:317
Generate an LLM response from a single text prompt.
Parameters
| Parameter | Type | Description |
|---|---|---|
| prompt | string | The user's input text. |
| options? | LLMGenerationOptions | Optional generation overrides (temperature, maxTokens, signal, etc.). |
Returns
Promise<AsyncIterable<string, any, any>>
An async iterable that yields text chunks. When streaming is enabled (the default), chunks arrive incrementally; otherwise, a single chunk containing the full response is yielded.
Remarks
This is the primary generation method. It converts messages to WebLLM’s ChatCompletionMessageParam format (which matches the OpenAI format) and dispatches to either the streaming or non-streaming code path based on config.stream.
System messages are passed inline (WebLLM supports role: 'system' in the messages array, unlike Anthropic).
Throws
Error Thrown if the provider has not been initialized or the engine is unavailable.
Throws
AbortError Thrown if the provided options.signal is aborted before or during generation.
Example
```ts
const stream = await webllm.generate('What can you do offline?');
for await (const chunk of stream) {
  console.log(chunk);
}
```
Overrides
BaseLLMProvider.generate
generateFromMessages()
generateFromMessages(messages, options?): Promise<AsyncIterable<string, any, any>>;
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:321
Generate a response from a multi-turn conversation.
Parameters
| Parameter | Type | Description |
|---|---|---|
| messages | LLMMessage[] | Array of conversation messages including history. |
| options? | LLMGenerationOptions | Optional generation overrides. |
Returns
Promise<AsyncIterable<string, any, any>>
An async iterable of text chunks.
Remarks
Required by the LLMProvider interface. Subclasses must implement this.
Overrides
BaseLLMProvider.generateFromMessages
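A multi-turn usage sketch (the 'assistant' role for prior model turns is assumed to follow the usual chat-message shape):
```ts
const messages: LLMMessage[] = [
  { role: 'system', content: 'You are a local assistant.' },
  { role: 'user', content: 'What can you do offline?' },
  { role: 'assistant', content: 'I can answer questions without any network connection.' },
  { role: 'user', content: 'How does that work?' },
];

const stream = await webllm.generateFromMessages(messages);
for await (const chunk of stream) {
  console.log(chunk);
}
```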
getConfig()
getConfig(): LLMProviderConfig;
Defined in: src/providers/base/BaseLLMProvider.ts:246
Get a shallow copy of the current LLM configuration.
Returns
LLMProviderConfig
A new LLMProviderConfig object.
Inherited from
BaseLLMProvider.getConfig
initialize()
initialize(): Promise<void>;
Defined in: src/providers/base/BaseProvider.ts:127
Initialize the provider, making it ready for use.
Returns
Promise<void>
Remarks
Calls the subclass hook onInitialize. If the provider has already been initialized the call is a no-op.
Throws
ProviderInitializationError Thrown when onInitialize rejects. The original error is wrapped with the provider class name for diagnostics.
Inherited from
BaseLLMProvider.initialize
isReady()
isReady(): boolean;
Defined in: src/providers/base/BaseProvider.ts:178
Check whether the provider has been initialized and is ready.
Returns
boolean
true when initialize has completed successfully and dispose has not yet been called.
Inherited from
BaseLLMProvider.isReady
isToolCall()
isToolCall(_chunk): boolean;
Defined in: src/providers/base/BaseLLMProvider.ts:179
Check whether a response chunk represents a tool call.
Parameters
| Parameter | Type | Description |
|---|---|---|
| _chunk | unknown | A response chunk to inspect. |
Returns
boolean
true when the chunk represents a tool call.
Remarks
The default implementation returns false. Tool-aware providers override this to detect tool invocations in the response stream.
Inherited from
BaseLLMProvider.isToolCall
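For orientation, an override in a tool-aware provider might look like the following sketch; the chunk shape tested here is purely hypothetical, and WebLLMLLM itself keeps the default behavior:
```ts
class ToolAwareLLM extends BaseLLMProvider {
  // Hypothetical detection: treat chunks carrying a `tool_calls` field as tool calls.
  override isToolCall(chunk: unknown): boolean {
    return typeof chunk === 'object' && chunk !== null && 'tool_calls' in chunk;
  }
  // (remaining abstract members omitted for brevity)
}
```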
mergeOptions()
protected mergeOptions(options?): LLMGenerationOptions;
Defined in: src/providers/base/BaseLLMProvider.ts:224
Merge per-call generation options with the provider’s config defaults.
Parameters
| Parameter | Type | Description |
|---|---|---|
| options? | LLMGenerationOptions | Optional per-call overrides. |
Returns
LLMGenerationOptions
A merged LLMGenerationOptions object.
Remarks
Values supplied in options take precedence over values in config. Only defined values are included in the result, allowing providers to distinguish “not set” from explicit values.
Inherited from
BaseLLMProvider.mergeOptions
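To illustrate the precedence rule, a sketch of what a subclass method would observe (option names follow LLMGenerationOptions; the concrete values are illustrative):
```ts
// Inside a subclass method, with config = { temperature: 0.2, maxTokens: 256 }:
const merged = this.mergeOptions({ temperature: 0.9 });
// merged.temperature === 0.9  (the per-call override wins)
// merged.maxTokens   === 256  (falls back to the config default)
// Keys defined in neither place are absent from `merged`, not set to undefined.
```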
onConfigUpdate()
protected onConfigUpdate(_config): void;
Defined in: src/providers/base/BaseProvider.ts:242
Hook called after updateConfig merges new values.
Parameters
| Parameter | Type | Description |
|---|---|---|
| _config | Partial<BaseProviderConfig> | The partial configuration that was merged. |
Returns
void
Remarks
The default implementation is a no-op. Override in subclasses to react to runtime configuration changes (e.g. reconnect with a new API key).
Inherited from
BaseLLMProvider.onConfigUpdate
onDispose()
protected onDispose(): Promise<void>;
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:272
Dispose of the WebLLM engine and free GPU memory.
Returns
Promise<void>
Remarks
Calls engine.unload() to release all GPU resources (model weights, KV cache, compiled shaders). This is important for freeing VRAM, especially on devices with limited GPU memory.
Called automatically by BaseLLMProvider.dispose.
Overrides
BaseLLMProvider.onDispose
onInitialize()
protected onInitialize(): Promise<void>;
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:230
Initialize the WebLLM engine.
Returns
Promise<void>
Remarks
Dynamically imports the @mlc-ai/web-llm peer dependency and creates an MLC engine. This triggers model weight download (on first use) and WebGPU shader compilation. Progress is reported via the onLoadProgress callback.
This method can take a significant amount of time on first run (minutes for large models) due to the download and compilation steps. Subsequent runs are much faster thanks to browser caching.
Called automatically by BaseLLMProvider.initialize.
Throws
ProviderInitializationError Thrown if the @mlc-ai/web-llm package cannot be found (peer dependency not installed) or if engine creation fails (e.g., no WebGPU support).
Overrides
BaseLLMProvider.onInitialize
processMessages()
processMessages(messages, options?): Promise<AsyncIterable<string, any, any>>;
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:325
Process a conversation and generate a response.
Parameters
| Parameter | Type | Description |
|---|---|---|
| messages | LLMMessage[] | Ordered array of LLMMessage objects representing the conversation history. |
| options? | LLMGenerationOptions | Optional generation overrides. |
Returns
Promise<AsyncIterable<string, any, any>>
An AsyncIterable that yields text chunks as they arrive.
Remarks
Interface: Receive Text -> Send Text. The primary handler method. Returns an AsyncIterable that yields text chunks. When streaming is enabled, multiple chunks are yielded as tokens arrive. When streaming is disabled, a single chunk containing the full response is yielded.
Overrides
BaseLLMProvider.processMessages
processText()
processText(prompt, options?): Promise<AsyncIterable<string, any, any>>;
Defined in: src/providers/base/BaseLLMProvider.ts:163
Process a single text prompt (convenience wrapper).
Parameters
| Parameter | Type | Description |
|---|---|---|
| prompt | string | The user's input text. |
| options? | LLMGenerationOptions | Optional generation overrides. |
Returns
Promise<AsyncIterable<string, any, any>>
An AsyncIterable that yields text chunks as they arrive.
Remarks
Converts the prompt to a messages array (optionally prepending a system message from config) and delegates to processMessages.
Inherited from
BaseLLMProvider.processText
promptToMessages()
protected promptToMessages(prompt): LLMMessage[];
Defined in: src/providers/base/BaseLLMProvider.ts:195
Convert a plain-text prompt into an LLMMessage array.
Parameters
| Parameter | Type | Description |
|---|---|---|
| prompt | string | The user's input text. |
Returns
LLMMessage[]
A messages array suitable for processMessages.
Remarks
If the provider’s config includes a systemPrompt, it is prepended as a system message. The prompt itself becomes a user message.
Inherited from
BaseLLMProvider.promptToMessages
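Concretely, with a configured systemPrompt the conversion behaves like this sketch (called from a subclass, since the method is protected):
```ts
// With config.systemPrompt === 'You are a helpful local assistant.':
const messages = this.promptToMessages('Tell me about WebGPU.');
// messages is equivalent to:
// [
//   { role: 'system', content: 'You are a helpful local assistant.' },
//   { role: 'user', content: 'Tell me about WebGPU.' },
// ]
```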
resolveApiKey()
protected resolveApiKey(): string;
Defined in: src/providers/base/BaseProvider.ts:325
Resolve the API key for this provider.
Returns
string
The configured API key, or 'proxy' in proxy mode.
Remarks
Returns 'proxy' in proxy mode so that SDK clients (which require a non-empty API key string) can be instantiated without the real key.
Inherited from
BaseLLMProvider.resolveApiKey
resolveAuthHeader()
protected resolveAuthHeader(defaultAuthType?): string | undefined;
Defined in: src/providers/base/BaseProvider.ts:364
Resolve Authorization header value for the configured auth type.
Parameters
| Parameter | Type | Default value | Description |
|---|---|---|---|
| defaultAuthType | "token" \| "bearer" | 'token' | The default auth type for this provider. |
Returns
string | undefined
The Authorization header value, or undefined in proxy mode.
Remarks
Returns the header value for REST or server-side WebSocket connections:
- 'token' → 'Token <apiKey>'
- 'bearer' → 'Bearer <apiKey>'
Returns undefined in proxy mode.
Inherited from
BaseLLMProvider.resolveAuthHeader
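A sketch of the intended use from a REST provider's request path (the endpoint URL and request body are illustrative):
```ts
// Inside a subclass: attach the resolved header unless running in proxy mode.
const auth = this.resolveAuthHeader('bearer'); // e.g. 'Bearer <apiKey>' or undefined

const response = await fetch('https://api.example.com/v1/chat', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    ...(auth ? { Authorization: auth } : {}),
  },
  body: JSON.stringify({ prompt: 'Hello' }),
});
```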
resolveBaseUrl()
protected resolveBaseUrl(defaultUrl?): string | undefined;
Defined in: src/providers/base/BaseProvider.ts:307
Resolve the base URL for this provider.
Parameters
| Parameter | Type | Description |
|---|---|---|
| defaultUrl? | string | The provider's default API URL. Pass undefined to let the underlying SDK use its own default. |
Returns
string | undefined
The resolved URL, or undefined when all sources are unset.
Remarks
Priority: proxyUrl > endpoint > defaultUrl.
For WebSocket providers (this.type === 'websocket'), the proxy URL’s http(s) scheme is automatically converted to ws(s).
When no URL is configured and defaultUrl is undefined, the return value is undefined — this lets SDK-based providers (Anthropic, OpenAI) fall back to their own built-in defaults.
Inherited from
BaseLLMProvider.resolveBaseUrl
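A sketch of the priority order from a subclass's point of view (the URLs are illustrative):
```ts
// With config = { proxyUrl: 'https://proxy.example.com' }:
this.resolveBaseUrl('https://api.vendor.com'); // → 'https://proxy.example.com'
// (for a 'websocket' provider this would become 'wss://proxy.example.com')

// With config = { endpoint: 'https://eu.vendor.com' } and no proxyUrl:
this.resolveBaseUrl('https://api.vendor.com'); // → 'https://eu.vendor.com'

// With neither set and no defaultUrl:
this.resolveBaseUrl(); // → undefined (the SDK's built-in default applies)
```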
resolveWsProtocols()
protected resolveWsProtocols(defaultAuthType?): string[] | undefined;
Defined in: src/providers/base/BaseProvider.ts:343
Resolve WebSocket subprotocol for authentication.
Parameters
| Parameter | Type | Default value | Description |
|---|---|---|---|
| defaultAuthType | "token" \| "bearer" | 'token' | The default auth type for this provider. |
Returns
string[] | undefined
Subprotocol array for new WebSocket(url, protocols), or undefined.
Remarks
Returns the subprotocol array for direct mode based on authType:
- 'token' → ['token', apiKey] (Deepgram default)
- 'bearer' → ['bearer', apiKey] (OAuth/Bearer tokens)
Returns undefined in proxy mode (no client-side auth needed).
Inherited from
BaseLLMProvider.resolveWsProtocols
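A usage sketch from inside a hypothetical WebSocket provider (the URL is illustrative):
```ts
// Direct mode yields e.g. ['token', '<apiKey>']; proxy mode yields undefined.
const protocols = this.resolveWsProtocols('token');
const ws = new WebSocket('wss://api.example.com/v1/listen', protocols);
```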
updateConfig()
updateConfig(config): void;
Defined in: src/providers/base/BaseProvider.ts:201
Merge partial configuration updates into the current config.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config | Partial<BaseProviderConfig> | A partial configuration object whose keys will overwrite existing values. |
Returns
void
Remarks
After merging, the subclass hook onConfigUpdate is called so providers can react to changed values at runtime.
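A usage sketch of a runtime update, assuming systemPrompt is among the accepted keys (it appears in the class example's configuration above):
```ts
// Merge a new system prompt into the live config; onConfigUpdate fires after the merge.
llm.updateConfig({ systemPrompt: 'Answer in one short sentence.' });
```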