WebLLMLLM
WebLLM in-browser LLM provider.
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:196
Remarks
Uses @mlc-ai/web-llm to run language models entirely in the browser via WebGPU. This provider is unique among CompositeVoice LLM providers in that it requires no API key, no server, and no network connection after the initial model download.
Key characteristics:
- Zero server cost: All inference runs on the user’s GPU.
- Privacy-first: No data leaves the browser.
- Offline capable: Works without network after the first model download.
- Abort support: Uses `engine.interruptGenerate()` when the abort signal fires, which is more reliable than HTTP cancellation for local inference.
- Resource cleanup: The `dispose()` method calls `engine.unload()` to free GPU memory.
Requirements:
- A browser with WebGPU support (Chrome 113+, Edge 113+); a detection sketch follows this list.
- The `@mlc-ai/web-llm` peer dependency must be installed.
- Sufficient GPU memory for the selected model.
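WebGPU availability can be verified before constructing the provider. A minimal detection sketch, assuming only the standard `navigator.gpu` API (the guard itself is illustrative, not part of CompositeVoice; typings come from `@webgpu/types`):
```ts
// Pre-flight check: bail out early if WebGPU is unavailable.
if (!('gpu' in navigator)) {
  throw new Error('WebGPU is not supported in this browser; WebLLMLLM cannot run.');
}

// Optionally confirm that a GPU adapter can actually be acquired.
const adapter = await navigator.gpu.requestAdapter();
if (adapter === null) {
  throw new Error('No suitable GPU adapter found.');
}
```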
Example
```ts
import { WebLLMLLM } from 'composite-voice';

const llm = new WebLLMLLM({
  model: 'Llama-3.2-1B-Instruct-q4f16_1-MLC',
  systemPrompt: 'You are a helpful local assistant.',
  onLoadProgress: ({ progress, text }) => {
    console.log(`Loading: ${Math.round(progress * 100)}% - ${text}`);
  },
});

await llm.initialize(); // Downloads model on first run

const stream = await llm.generate('Tell me about WebGPU.');
for await (const chunk of stream) {
  document.getElementById('output')!.textContent += chunk;
}

await llm.dispose(); // Frees GPU memory
```
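Because generation accepts an abort `signal` (see LLMGenerationOptions) that the provider maps to `engine.interruptGenerate()`, in-flight local inference can be cancelled. A usage sketch, assuming the standard `AbortController` pattern:
```ts
const controller = new AbortController();

// Cancel generation after five seconds (illustrative timeout).
setTimeout(() => controller.abort(), 5000);

try {
  const stream = await llm.generate('Write a long story.', { signal: controller.signal });
  for await (const chunk of stream) {
    console.log(chunk);
  }
} catch (err) {
  if ((err as Error).name === 'AbortError') {
    console.log('Generation was interrupted.');
  } else {
    throw err;
  }
}
```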
See
- WebLLMLLMConfig for configuration options.
- BaseLLMProvider for the abstract base class.
- OpenAICompatibleLLM for server-side alternatives.
Extends
- BaseLLMProvider
Constructors
Constructor
new WebLLMLLM(config, logger?): WebLLMLLM;
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:207
Creates a new WebLLM in-browser LLM provider instance.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config | WebLLMLLMConfig | WebLLM provider configuration. The only required field is model. |
| logger? | Logger | Optional custom logger instance. If omitted, a default logger is created by the base class. |
Returns
WebLLMLLM
Overrides
BaseLLMProvider.constructor
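Since model is the only required configuration field and no API key or server is needed, the smallest possible construction looks like this (model ID reused from the class example above):
```ts
// Minimal construction: only `model` is required.
const llm = new WebLLMLLM({ model: 'Llama-3.2-1B-Instruct-q4f16_1-MLC' });
```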
Properties
| Property | Modifier | Type | Default value | Description | Overrides | Inherited from | Defined in |
|---|---|---|---|---|---|---|---|
| config | public | WebLLMLLMConfig | undefined | LLM-specific provider configuration. | BaseLLMProvider.config | - | src/providers/llm/webllm/WebLLMLLM.ts:197 |
| initialized | protected | boolean | false | Tracks whether initialize has completed successfully. | - | BaseLLMProvider.initialized | src/providers/base/BaseProvider.ts:97 |
| logger | protected | Logger | undefined | Scoped logger instance for this provider. | - | BaseLLMProvider.logger | src/providers/base/BaseProvider.ts:94 |
| roles | readonly | readonly ProviderRole[] | undefined | LLM providers cover the 'llm' pipeline role by default. | - | BaseLLMProvider.roles | src/providers/base/BaseLLMProvider.ts:77 |
| type | readonly | ProviderType | undefined | Communication transport this provider uses ('rest' or 'websocket'). | - | BaseLLMProvider.type | src/providers/base/BaseProvider.ts:74 |
Accessors
isProxyMode
Get Signature
protected get isProxyMode(): boolean;
Defined in: src/providers/base/BaseProvider.ts:286
Whether the provider is in proxy mode.
Returns
boolean
true when proxyUrl is set.
Inherited from
BaseLLMProvider.isProxyMode
Methods
assertAuth()
protected assertAuth(): void;
Defined in: src/providers/base/BaseProvider.ts:272
Validate that auth is configured (either apiKey or proxyUrl).
Returns
void
Remarks
Call this in onInitialize() for any provider that requires external authentication. Native providers (NativeSTT, NativeTTS) and in-browser providers (WebLLM) should NOT call this method.
Throws
ProviderInitializationError Thrown when neither apiKey nor proxyUrl is set.
Inherited from
BaseLLMProvider.assertAuth
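As a sketch of the intended call site, a hypothetical externally-authenticated provider (the class name and hook body are illustrative) would invoke the guard at the top of `onInitialize()`; WebLLMLLM deliberately does not:
```ts
// Hypothetical provider that requires an external API key.
class ExampleCloudLLM extends BaseLLMProvider {
  protected override async onInitialize(): Promise<void> {
    // Throws ProviderInitializationError unless apiKey or proxyUrl is set.
    this.assertAuth();
    // ...create the SDK client here.
  }
  // (remaining abstract members omitted for brevity)
}
```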
assertReady()
protected assertReady(): void;
Defined in: src/providers/base/BaseProvider.ts:255
Guard that throws if the provider has not been initialized.
Returns
void
Remarks
Call at the start of any method that requires the provider to be ready.
Throws
Error Thrown with a descriptive message when initialized is false.
Inherited from
BaseLLMProvider.assertReady
dispose()
dispose(): Promise<void>;
Defined in: src/providers/base/BaseProvider.ts:154
Clean up resources and dispose of the provider.
Returns
Promise<void>
Remarks
Delegates to the subclass hook onDispose and resets the initialized flag. If the provider is not initialized, the call is a no-op.
Throws
Re-throws any error raised by onDispose.
Inherited from
BaseLLMProvider.dispose
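Because disposal re-throws `onDispose` errors but is a no-op on an uninitialized provider, a `try`/`finally` wrapper is a safe way to guarantee GPU memory is released even when generation fails. A usage sketch:
```ts
await llm.initialize();
try {
  const stream = await llm.generate('Summarize WebGPU in one sentence.');
  for await (const chunk of stream) {
    console.log(chunk);
  }
} finally {
  // Runs on both success and error paths; a second call after disposal is a no-op.
  await llm.dispose();
}
```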
generate()
generate(prompt, options?): Promise<AsyncIterable<string, any, any>>;
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:317
Generate an LLM response from a single text prompt.
Parameters
| Parameter | Type | Description |
|---|---|---|
| prompt | string | The user's input text. |
| options? | LLMGenerationOptions | Optional generation overrides (temperature, maxTokens, signal, etc.). |
Returns
Promise<AsyncIterable<string, any, any>>
An async iterable that yields text chunks. When streaming is enabled (the default), chunks arrive incrementally; otherwise, a single chunk containing the full response is yielded.
Remarks
This is the primary generation method. It converts messages to WebLLM’s ChatCompletionMessageParam format (which matches the OpenAI format) and dispatches to either the streaming or non-streaming code path based on config.stream.
System messages are passed inline (WebLLM supports role: 'system' in the messages array, unlike Anthropic).
Throws
Error Thrown if the provider has not been initialized or the engine is unavailable.
Throws
AbortError Thrown if the provided options.signal is aborted before or during generation.
Example
```ts
const stream = await webllm.generate('What can you do offline?');
for await (const chunk of stream) {
  console.log(chunk);
}
```
Overrides
BaseLLMProvider.generate
generateFromMessages()
generateFromMessages(messages, options?): Promise<AsyncIterable<string, any, any>>;
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:321
Generate a response from a multi-turn conversation.
Parameters
| Parameter | Type | Description |
|---|---|---|
| messages | LLMMessage[] | Array of conversation messages including history. |
| options? | LLMGenerationOptions | Optional generation overrides. |
Returns
Promise<AsyncIterable<string, any, any>>
An async iterable of text chunks.
Remarks
Required by the LLMProvider interface. Subclasses must implement this.
Overrides
BaseLLMProvider.generateFromMessages
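A multi-turn usage sketch (the 'assistant' role for prior model turns is assumed to follow the usual chat-message shape):
```ts
const messages: LLMMessage[] = [
  { role: 'system', content: 'You are a local assistant.' },
  { role: 'user', content: 'What can you do offline?' },
  { role: 'assistant', content: 'I can answer questions without any network connection.' },
  { role: 'user', content: 'How does that work?' },
];

const stream = await webllm.generateFromMessages(messages);
for await (const chunk of stream) {
  console.log(chunk);
}
```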
getConfig()
getConfig(): LLMProviderConfig;
Defined in: src/providers/base/BaseLLMProvider.ts:246
Get a shallow copy of the current LLM configuration.
Returns
LLMProviderConfig
A new LLMProviderConfig object.
Inherited from
BaseLLMProvider.getConfig
initialize()
initialize(): Promise<void>;
Defined in: src/providers/base/BaseProvider.ts:127
Initialize the provider, making it ready for use.
Returns
Promise<void>
Remarks
Calls the subclass hook onInitialize. If the provider has already been initialized the call is a no-op.
Throws
ProviderInitializationError Thrown when onInitialize rejects. The original error is wrapped with the provider class name for diagnostics.
Inherited from
BaseLLMProvider.initialize
isReady()
isReady(): boolean;
Defined in: src/providers/base/BaseProvider.ts:178
Check whether the provider has been initialized and is ready.
Returns
boolean
true when initialize has completed successfully and dispose has not yet been called.
Inherited from
BaseLLMProvider.isReady
isToolCall()
isToolCall(_chunk): boolean;
Defined in: src/providers/base/BaseLLMProvider.ts:179
Check whether a response chunk represents a tool call.
Parameters
| Parameter | Type | Description |
|---|---|---|
| _chunk | unknown | A response chunk to inspect. |
Returns
boolean
true when the chunk represents a tool call.
Remarks
The default implementation returns false. Tool-aware providers override this to detect tool invocations in the response stream.
Inherited from
BaseLLMProvider.isToolCall
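For orientation, an override in a tool-aware provider might look like the following sketch; the chunk shape tested here is purely hypothetical, and WebLLMLLM itself keeps the default behavior:
```ts
class ToolAwareLLM extends BaseLLMProvider {
  // Hypothetical detection: treat chunks carrying a `tool_calls` field as tool calls.
  override isToolCall(chunk: unknown): boolean {
    return typeof chunk === 'object' && chunk !== null && 'tool_calls' in chunk;
  }
  // (remaining abstract members omitted for brevity)
}
```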
mergeOptions()
protected mergeOptions(options?): LLMGenerationOptions;
Defined in: src/providers/base/BaseLLMProvider.ts:224
Merge per-call generation options with the provider’s config defaults.
Parameters
| Parameter | Type | Description |
|---|---|---|
| options? | LLMGenerationOptions | Optional per-call overrides. |
Returns
LLMGenerationOptions
A merged LLMGenerationOptions object.
Remarks
Values supplied in options take precedence over values in config. Only defined values are included in the result, allowing providers to distinguish “not set” from explicit values.
Inherited from
BaseLLMProvider.mergeOptions
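To illustrate the precedence rule, a sketch of what a subclass method would observe (option names follow LLMGenerationOptions; the concrete values are illustrative):
```ts
// Inside a subclass method, with config = { temperature: 0.2, maxTokens: 256 }:
const merged = this.mergeOptions({ temperature: 0.9 });
// merged.temperature === 0.9  (the per-call override wins)
// merged.maxTokens   === 256  (falls back to the config default)
// Keys defined in neither place are absent from `merged`, not set to undefined.
```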
onConfigUpdate()
protected onConfigUpdate(_config): void;
Defined in: src/providers/base/BaseProvider.ts:242
Hook called after updateConfig merges new values.
Parameters
| Parameter | Type | Description |
|---|---|---|
| _config | Partial<BaseProviderConfig> | The partial configuration that was merged. |
Returns
void
Remarks
The default implementation is a no-op. Override in subclasses to react to runtime configuration changes (e.g. reconnect with a new API key).
Inherited from
BaseLLMProvider.onConfigUpdate
onDispose()
protected onDispose(): Promise<void>;
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:272
Dispose of the WebLLM engine and free GPU memory.
Returns
Promise<void>
Remarks
Calls engine.unload() to release all GPU resources (model weights, KV cache, compiled shaders). This is important for freeing VRAM, especially on devices with limited GPU memory.
Called automatically by BaseLLMProvider.dispose.
Overrides
BaseLLMProvider.onDispose
onInitialize()
protected onInitialize(): Promise<void>;
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:230
Initialize the WebLLM engine.
Returns
Promise<void>
Remarks
Dynamically imports the @mlc-ai/web-llm peer dependency and creates an MLC engine. This triggers model weight download (on first use) and WebGPU shader compilation. Progress is reported via the onLoadProgress callback.
This method can take a significant amount of time on first run (minutes for large models) due to the download and compilation steps. Subsequent runs are much faster thanks to browser caching.
Called automatically by BaseLLMProvider.initialize.
Throws
ProviderInitializationError Thrown if the @mlc-ai/web-llm package cannot be found (peer dependency not installed) or if engine creation fails (e.g., no WebGPU support).
Overrides
BaseLLMProvider.onInitialize
processMessages()
processMessages(messages, options?): Promise<AsyncIterable<string, any, any>>;
Defined in: src/providers/llm/webllm/WebLLMLLM.ts:325
Process a conversation and generate a response.
Parameters
| Parameter | Type | Description |
|---|---|---|
| messages | LLMMessage[] | Ordered array of LLMMessage objects representing the conversation history. |
| options? | LLMGenerationOptions | Optional generation overrides. |
Returns
Promise<AsyncIterable<string, any, any>>
An AsyncIterable that yields text chunks as they arrive.
Remarks
Interface: Receive Text -> Send Text. The primary handler method. Returns an AsyncIterable that yields text chunks. When streaming is enabled, multiple chunks are yielded as tokens arrive. When streaming is disabled, a single chunk containing the full response is yielded.
Overrides
BaseLLMProvider.processMessages
processText()
processText(prompt, options?): Promise<AsyncIterable<string, any, any>>;
Defined in: src/providers/base/BaseLLMProvider.ts:163
Process a single text prompt (convenience wrapper).
Parameters
| Parameter | Type | Description |
|---|---|---|
| prompt | string | The user's input text. |
| options? | LLMGenerationOptions | Optional generation overrides. |
Returns
Promise<AsyncIterable<string, any, any>>
An AsyncIterable that yields text chunks as they arrive.
Remarks
Converts the prompt to a messages array (optionally prepending a system message from config) and delegates to processMessages.
Inherited from
BaseLLMProvider.processText
promptToMessages()
protected promptToMessages(prompt): LLMMessage[];
Defined in: src/providers/base/BaseLLMProvider.ts:195
Convert a plain-text prompt into an LLMMessage array.
Parameters
| Parameter | Type | Description |
|---|---|---|
| prompt | string | The user's input text. |
Returns
LLMMessage[]
A messages array suitable for processMessages.
Remarks
If the provider’s config includes a systemPrompt, it is prepended as a system message. The prompt itself becomes a user message.
Inherited from
BaseLLMProvider.promptToMessages
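Concretely, with a configured systemPrompt the conversion behaves like this sketch (called from a subclass, since the method is protected):
```ts
// With config.systemPrompt === 'You are a helpful local assistant.':
const messages = this.promptToMessages('Tell me about WebGPU.');
// messages is equivalent to:
// [
//   { role: 'system', content: 'You are a helpful local assistant.' },
//   { role: 'user', content: 'Tell me about WebGPU.' },
// ]
```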
resolveApiKey()
protected resolveApiKey(): string;
Defined in: src/providers/base/BaseProvider.ts:325
Resolve the API key for this provider.
Returns
string
The configured API key, or 'proxy' in proxy mode.
Remarks
Returns 'proxy' in proxy mode so that SDK clients (which require a non-empty API key string) can be instantiated without the real key.
Inherited from
BaseLLMProvider.resolveApiKey
resolveAuthHeader()
protected resolveAuthHeader(defaultAuthType?): string | undefined;
Defined in: src/providers/base/BaseProvider.ts:364
Resolve Authorization header value for the configured auth type.
Parameters
| Parameter | Type | Default value | Description |
|---|---|---|---|
| defaultAuthType | "token" \| "bearer" | 'token' | The default auth type for this provider. |
Returns
string | undefined
The Authorization header value, or undefined in proxy mode.
Remarks
Returns the header value for REST or server-side WebSocket connections:
- 'token' → 'Token <apiKey>'
- 'bearer' → 'Bearer <apiKey>'
Returns undefined in proxy mode.
Inherited from
BaseLLMProvider.resolveAuthHeader
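A sketch of the intended use from a REST provider's request path (the endpoint URL and request body are illustrative):
```ts
// Inside a subclass: attach the resolved header unless running in proxy mode.
const auth = this.resolveAuthHeader('bearer'); // e.g. 'Bearer <apiKey>' or undefined

const response = await fetch('https://api.example.com/v1/chat', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    ...(auth ? { Authorization: auth } : {}),
  },
  body: JSON.stringify({ prompt: 'Hello' }),
});
```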
resolveBaseUrl()
protected resolveBaseUrl(defaultUrl?): string | undefined;
Defined in: src/providers/base/BaseProvider.ts:307
Resolve the base URL for this provider.
Parameters
| Parameter | Type | Description |
|---|---|---|
| defaultUrl? | string | The provider's default API URL. Pass undefined to let the underlying SDK use its own default. |
Returns
string | undefined
The resolved URL, or undefined when all sources are unset.
Remarks
Priority: proxyUrl > endpoint > defaultUrl.
For WebSocket providers (this.type === 'websocket'), the proxy URL’s http(s) scheme is automatically converted to ws(s).
When no URL is configured and defaultUrl is undefined, the return value is undefined — this lets SDK-based providers (Anthropic, OpenAI) fall back to their own built-in defaults.
Inherited from
BaseLLMProvider.resolveBaseUrl
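A sketch of the priority order from a subclass's point of view (the URLs are illustrative):
```ts
// With config = { proxyUrl: 'https://proxy.example.com' }:
this.resolveBaseUrl('https://api.vendor.com'); // → 'https://proxy.example.com'
// (for a 'websocket' provider this would become 'wss://proxy.example.com')

// With config = { endpoint: 'https://eu.vendor.com' } and no proxyUrl:
this.resolveBaseUrl('https://api.vendor.com'); // → 'https://eu.vendor.com'

// With neither set and no defaultUrl:
this.resolveBaseUrl(); // → undefined (the SDK's built-in default applies)
```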
resolveWsProtocols()
protected resolveWsProtocols(defaultAuthType?): string[] | undefined;
Defined in: src/providers/base/BaseProvider.ts:343
Resolve WebSocket subprotocol for authentication.
Parameters
| Parameter | Type | Default value | Description |
|---|---|---|---|
| defaultAuthType | "token" \| "bearer" | 'token' | The default auth type for this provider. |
Returns
string[] | undefined
Subprotocol array for new WebSocket(url, protocols), or undefined.
Remarks
Returns the subprotocol array for direct mode based on authType:
- 'token' → ['token', apiKey] (Deepgram default)
- 'bearer' → ['bearer', apiKey] (OAuth/Bearer tokens)
Returns undefined in proxy mode (no client-side auth needed).
Inherited from
BaseLLMProvider.resolveWsProtocols
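A usage sketch from inside a hypothetical WebSocket provider (the URL is illustrative):
```ts
// Direct mode yields e.g. ['token', '<apiKey>']; proxy mode yields undefined.
const protocols = this.resolveWsProtocols('token');
const ws = new WebSocket('wss://api.example.com/v1/listen', protocols);
```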
updateConfig()
updateConfig(config): void;
Defined in: src/providers/base/BaseProvider.ts:201
Merge partial configuration updates into the current config.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config | Partial<BaseProviderConfig> | A partial configuration object whose keys will overwrite existing values. |
Returns
void
Remarks
After merging, the subclass hook onConfigUpdate is called so providers can react to changed values at runtime.
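A usage sketch of a runtime update, assuming systemPrompt is among the accepted keys (it appears in the class example's configuration above):
```ts
// Merge a new system prompt into the live config; onConfigUpdate fires after the merge.
llm.updateConfig({ systemPrompt: 'Answer in one short sentence.' });
```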