Local LLM providers (Ollama & LM Studio)
Overview
Forge works great with local providers. We recommend Ollama for most setups; LM Studio is also supported via its OpenAI-compatible server.
Ollama (recommended)
- Install Ollama:
# macOS
brew install ollama
ollama --version
- Start the Ollama service (if not already running):
ollama serve
- Pull a model:
ollama pull llama3.2:3b
- Verify the API is reachable:
curl http://localhost:11434/api/tags | jq .
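- Optionally, run a quick end-to-end generation check (this assumes the llama3.2:3b model pulled above and the default endpoint):
# Ask for a short, non-streaming completion to confirm the model loads and responds
curl http://localhost:11434/api/generate -d '{"model": "llama3.2:3b", "prompt": "Say hello in one sentence.", "stream": false}'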
Tip: Set OLLAMA_HOST if your endpoint differs (default is http://localhost:11434).
LM Studio
- Install LM Studio from their website.
- Open LM Studio and download a chat model (e.g., GPT‑OSS 20B or Llama variants).
- Start the local server (OpenAI-compatible):
Settings → Developer → Start Server
- Note the server URL (default: http://localhost:1234/v1). You can set LMSTUDIO_HOST to override.
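To confirm the server is reachable, you can exercise its OpenAI-compatible endpoints with curl; the chat example below uses a placeholder model id, so substitute one returned by /v1/models:
# List the models the server currently exposes
curl http://localhost:1234/v1/models | jq .
# Minimal chat request (replace "your-model-id" with an id from the list above)
curl http://localhost:1234/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "your-model-id", "messages": [{"role": "user", "content": "Hello"}]}'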
Using with Forge
- When creating an agent, pick a provider and model (an LM Studio variant is sketched after this list):
poetry run forge create agent my-ollama --provider=ollama --model llama3.2:3b
- Or bootstrap model recommendations for an existing agent:
poetry run forge models bootstrap agents/my-ollama --provider-order ollama,lmstudio --interactive
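The same command shape should work for LM Studio; the model name below is a placeholder, so use whichever id your LM Studio server reports:
poetry run forge create agent my-lmstudio --provider=lmstudio --model your-model-id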
If you run the full stack in Docker, make sure the provider endpoint is reachable from inside the containers (e.g., use host networking, or point OLLAMA_HOST/LMSTUDIO_HOST at an address the containers can resolve).
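As a rough sketch, assuming the stack reads these variables from the host environment (or a .env file your compose setup forwards into the containers) and that you are on Docker Desktop, where host.docker.internal resolves to the host machine:
# Point containerized components at providers running on the Docker host
# (on Linux, map the name with --add-host=host.docker.internal:host-gateway, or use host networking)
export OLLAMA_HOST=http://host.docker.internal:11434
export LMSTUDIO_HOST=http://host.docker.internal:1234/v1
# If a provider only listens on 127.0.0.1, you may need to bind it to all interfaces,
# e.g. restart Ollama on the host with: OLLAMA_HOST=0.0.0.0 ollama serve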