# OpenAI Compatible

Connect any OpenAI-compatible LLM endpoint to a CompositeVoice pipeline.
Use `OpenAICompatibleLLM` when you need to connect a custom, self-hosted, or third-party LLM that speaks the OpenAI chat completions format. This includes services like Ollama, vLLM, LiteLLM, Together AI, Perplexity, DeepSeek, and any other `/v1/chat/completions` endpoint.
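For reference, the underlying wire format such endpoints accept is a JSON POST to `/v1/chat/completions`. A minimal sketch of the request body (the field names follow the OpenAI chat completions spec; the helper function and values here are illustrative, not part of the SDK):

```typescript
// Builds an OpenAI-style chat completions request body. The field names
// (model, messages, stream, role/content) follow the OpenAI chat spec;
// this helper itself is illustrative and not part of the SDK.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

function buildChatRequest(model: string, systemPrompt: string, userText: string) {
  const messages: ChatMessage[] = [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: userText },
  ];
  return { model, messages, stream: true };
}

// The body would be POSTed as JSON to `${baseURL}/chat/completions`.
const body = buildChatRequest('my-custom-model', 'Be concise.', 'Hello!');
```

Any server that accepts this shape should work with `OpenAICompatibleLLM`.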
## Prerequisites

- An accessible OpenAI-compatible API endpoint
- Install the peer dependency:

```bash
npm install openai
```
## Basic setup

```typescript
import { CompositeVoice, OpenAICompatibleLLM, NativeSTT, NativeTTS } from '@lukeocodes/composite-voice';

const agent = new CompositeVoice({
  stt: new NativeSTT({ language: 'en-US' }),
  llm: new OpenAICompatibleLLM({
    baseURL: 'https://my-model-server.example.com/v1',
    apiKey: 'my-api-key',
    model: 'my-custom-model',
    systemPrompt: 'You are a concise voice assistant. Keep answers under two sentences.',
  }),
  tts: new NativeTTS(),
});

await agent.start();
```
## Configuration options

| Option | Type | Default | Description |
|---|---|---|---|
| `model` | string | (required) | Model identifier recognized by the target endpoint. |
| `baseURL` | string | — | Base URL for the API (e.g., `http://localhost:11434/v1`). |
| `systemPrompt` | string | — | System-level instructions for the assistant. |
| `temperature` | number | — | Randomness (0 = deterministic, 2 = most creative). |
| `maxTokens` | number | — | Maximum tokens per response. |
| `topP` | number | — | Nucleus sampling threshold (0–1). |
| `stream` | boolean | true | Stream tokens incrementally. |
| `proxyUrl` | string | — | CompositeVoice proxy endpoint. Takes precedence over `baseURL`. |
| `apiKey` | string | — | API key for the target endpoint. |
| `maxRetries` | number | 3 | Retry count for failed requests. |
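With `stream: true`, these endpoints respond with server-sent events whose `data:` payloads carry incremental token deltas. A minimal sketch of extracting a delta from one event line (the payload shape follows the OpenAI streaming format; error handling is omitted, and this is not the SDK's internal parser):

```typescript
// Extracts the token delta from one SSE line of an OpenAI-style streaming
// response. A typical event looks like:
//   data: {"choices":[{"delta":{"content":"Hi"}}]}
// and the stream ends with the sentinel:
//   data: [DONE]
function parseDelta(line: string): string | null {
  if (!line.startsWith('data: ')) return null; // skip blank/comment lines
  const payload = line.slice('data: '.length).trim();
  if (payload === '[DONE]') return null;       // end-of-stream sentinel
  const event = JSON.parse(payload);
  return event.choices?.[0]?.delta?.content ?? null;
}
```

If your endpoint cannot produce this event stream, set `stream: false`.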
## Common endpoints

### Ollama (local models)

```typescript
const llm = new OpenAICompatibleLLM({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama', // Ollama ignores the key, but the SDK requires one
  model: 'llama3.2',
  systemPrompt: 'You are a helpful voice assistant.',
});
```

### Together AI

```typescript
const llm = new OpenAICompatibleLLM({
  baseURL: 'https://api.together.xyz/v1',
  apiKey: 'your-together-api-key',
  model: 'meta-llama/Llama-3.3-70B-Instruct-Turbo',
});
```

### DeepSeek

```typescript
const llm = new OpenAICompatibleLLM({
  baseURL: 'https://api.deepseek.com/v1',
  apiKey: 'your-deepseek-api-key',
  model: 'deepseek-chat',
});
```
## Complete example

```typescript
import {
  CompositeVoice,
  OpenAICompatibleLLM,
  DeepgramSTT,
  DeepgramTTS,
} from '@lukeocodes/composite-voice';

const agent = new CompositeVoice({
  stt: new DeepgramSTT({
    proxyUrl: '/api/proxy/deepgram',
    language: 'en',
    options: { model: 'nova-3', smartFormat: true },
  }),
  llm: new OpenAICompatibleLLM({
    baseURL: 'http://localhost:11434/v1',
    apiKey: 'ollama',
    model: 'llama3.2',
    temperature: 0.7,
    maxTokens: 256,
    systemPrompt: 'You are a friendly voice assistant. Answer briefly.',
  }),
  tts: new DeepgramTTS({
    proxyUrl: '/api/proxy/deepgram',
    voice: 'aura-2-thalia-en',
  }),
  conversationHistory: { enabled: true, maxTurns: 10 },
});

await agent.start();
```
## Tips

- Provide either `apiKey` or `proxyUrl`. At least one is required. If both are set, `proxyUrl` takes precedence and the SDK sends a dummy key.
- Verify your endpoint supports streaming. Some self-hosted setups disable SSE streaming. Set `stream: false` if your endpoint does not support it.
- This is the base class for OpenAI, Groq, Mistral, and Gemini. If you use one of those services, prefer their dedicated provider classes: they set correct defaults for `baseURL` and `model`.
- Extend this class for custom providers. Override `providerName` and `buildClientOptions()` to add provider-specific behavior. See the source of `GroqLLM` or `GeminiLLM` for examples.
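The extension point described in the last tip might look roughly like this. The base class below is a local stand-in so the sketch is self-contained; the real `providerName` and `buildClientOptions()` signatures may differ, and `MyProviderLLM`, its default URL, and its header are all hypothetical:

```typescript
// Local stand-in approximating the base class, so this sketch runs on its
// own. Check the real OpenAICompatibleLLM source for the actual signatures.
class OpenAICompatibleBase {
  constructor(readonly options: { baseURL?: string; apiKey?: string; model: string }) {}
  get providerName(): string {
    return 'openai-compatible';
  }
  buildClientOptions(): Record<string, unknown> {
    return { baseURL: this.options.baseURL, apiKey: this.options.apiKey };
  }
}

// Hypothetical custom provider: pins a default baseURL and adds a header.
class MyProviderLLM extends OpenAICompatibleBase {
  get providerName(): string {
    return 'my-provider';
  }
  buildClientOptions(): Record<string, unknown> {
    return {
      ...super.buildClientOptions(),
      baseURL: this.options.baseURL ?? 'https://api.my-provider.example/v1',
      defaultHeaders: { 'X-My-Provider': '1' },
    };
  }
}
```

The pattern keeps provider quirks (default URLs, extra headers) in one place while the shared class handles the chat completions protocol.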
## Related
- Providers reference — all LLM providers at a glance
- API reference — full class documentation
- OpenAI guide — dedicated OpenAI provider
- Groq guide — dedicated Groq provider