OpenAI Compatible

Connect any OpenAI-compatible LLM endpoint to a CompositeVoice pipeline.

Use OpenAICompatibleLLM when you need to connect a custom, self-hosted, or third-party LLM that speaks the OpenAI chat completions format. This includes services like Ollama, vLLM, LiteLLM, Together AI, Perplexity, DeepSeek, and any other /v1/chat/completions endpoint.

Prerequisites

  • An accessible OpenAI-compatible API endpoint
  • Install the peer dependency:
npm install openai

Basic setup

import { CompositeVoice, OpenAICompatibleLLM, NativeSTT, NativeTTS } from '@lukeocodes/composite-voice';

const agent = new CompositeVoice({
  stt: new NativeSTT({ language: 'en-US' }),
  llm: new OpenAICompatibleLLM({
    baseURL: 'https://my-model-server.example.com/v1',
    apiKey: 'my-api-key',
    model: 'my-custom-model',
    systemPrompt: 'You are a concise voice assistant. Keep answers under two sentences.',
  }),
  tts: new NativeTTS(),
});

await agent.start();

Configuration options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| model | string | (required) | Model identifier recognized by the target endpoint. |
| baseURL | string | – | Base URL for the API (e.g., http://localhost:11434/v1). |
| systemPrompt | string | – | System-level instructions for the assistant. |
| temperature | number | – | Sampling randomness (0 = deterministic, 2 = most creative). |
| maxTokens | number | – | Maximum tokens per response. |
| topP | number | – | Nucleus sampling threshold (0–1). |
| stream | boolean | true | Stream tokens incrementally. |
| proxyUrl | string | – | CompositeVoice proxy endpoint. Takes precedence over baseURL. |
| apiKey | string | – | API key for the target endpoint. |
| maxRetries | number | 3 | Retry count for failed requests. |
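The sampling and transport options can be combined in one constructor call. A sketch, where the endpoint URL, key, and model name are placeholders for your own values:

```typescript
const llm = new OpenAICompatibleLLM({
  baseURL: 'https://my-model-server.example.com/v1',
  apiKey: 'my-api-key',
  model: 'my-custom-model',
  temperature: 0.3, // mostly deterministic answers
  topP: 0.9,        // nucleus sampling threshold
  maxTokens: 200,   // keep spoken replies short
  stream: false,    // disable if the endpoint lacks SSE support
  maxRetries: 5,    // retry transient failures more aggressively
});
```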

Common endpoints

Ollama (local models)

const llm = new OpenAICompatibleLLM({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama',  // Ollama ignores the key but the SDK requires one
  model: 'llama3.2',
  systemPrompt: 'You are a helpful voice assistant.',
});

Together AI

const llm = new OpenAICompatibleLLM({
  baseURL: 'https://api.together.xyz/v1',
  apiKey: 'your-together-api-key',
  model: 'meta-llama/Llama-3.3-70B-Instruct-Turbo',
});

DeepSeek

const llm = new OpenAICompatibleLLM({
  baseURL: 'https://api.deepseek.com/v1',
  apiKey: 'your-deepseek-api-key',
  model: 'deepseek-chat',
});
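vLLM

vLLM (mentioned above) also exposes an OpenAI-compatible server, by default on port 8000. A sketch; the model name is a placeholder and must match the model you passed to vllm serve:

```typescript
const llm = new OpenAICompatibleLLM({
  baseURL: 'http://localhost:8000/v1',
  apiKey: 'not-needed', // vLLM ignores the key unless the server was started with --api-key
  model: 'meta-llama/Llama-3.1-8B-Instruct',
});
```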

Complete example

import {
  CompositeVoice,
  OpenAICompatibleLLM,
  DeepgramSTT,
  DeepgramTTS,
} from '@lukeocodes/composite-voice';

const agent = new CompositeVoice({
  stt: new DeepgramSTT({
    proxyUrl: '/api/proxy/deepgram',
    language: 'en',
    options: { model: 'nova-3', smartFormat: true },
  }),
  llm: new OpenAICompatibleLLM({
    baseURL: 'http://localhost:11434/v1',
    apiKey: 'ollama',
    model: 'llama3.2',
    temperature: 0.7,
    maxTokens: 256,
    systemPrompt: 'You are a friendly voice assistant. Answer briefly.',
  }),
  tts: new DeepgramTTS({
    proxyUrl: '/api/proxy/deepgram',
    voice: 'aura-2-thalia-en',
  }),
  conversationHistory: { enabled: true, maxTurns: 10 },
});

await agent.start();

Tips

  • Provide either apiKey or proxyUrl. At least one is required. If both are set, proxyUrl takes precedence and the SDK sends a dummy key.
  • Verify your endpoint supports streaming. Some self-hosted setups disable SSE streaming. Set stream: false if your endpoint does not support it.
  • This is the base class for OpenAI, Groq, Mistral, and Gemini. If you use one of those services, prefer their dedicated provider classes — they set correct defaults for baseURL and model.
  • Extend this class for custom providers. Override providerName and buildClientOptions() to add provider-specific behavior. See the source of GroqLLM or GeminiLLM for examples.
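The last tip can be sketched as follows. The override points (providerName, buildClientOptions()) are the ones named above, but the exact signatures here are assumptions — check the GroqLLM or GeminiLLM source for the real ones:

```typescript
import { OpenAICompatibleLLM } from '@lukeocodes/composite-voice';

// Hypothetical custom provider for Perplexity's OpenAI-compatible API.
class PerplexityLLM extends OpenAICompatibleLLM {
  // Assumed: used in logs and error messages.
  providerName = 'perplexity';

  // Pin the provider's endpoint so callers only supply apiKey and model.
  buildClientOptions() {
    return {
      ...super.buildClientOptions(),
      baseURL: 'https://api.perplexity.ai',
    };
  }
}
```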

© 2026 CompositeVoice. All rights reserved.