LLM_PROVIDER environment variable, with per-provider API key and model overrides. Defaults are tracked in app/config.py and routing lives in app/services/llm_client.py.
Quick reference
| Provider | LLM_PROVIDER | Auth | Reasoning model default | Toolcall model default |
|---|---|---|---|---|
| Anthropic | anthropic | ANTHROPIC_API_KEY | claude-sonnet-4-6 | claude-haiku-4-5-20251001 |
| OpenAI | openai | OPENAI_API_KEY | gpt-5.4-mini | gpt-5.4-mini |
| OpenRouter | openrouter | OPENROUTER_API_KEY | openrouter/auto | openrouter/auto |
| DeepSeek | deepseek | DEEPSEEK_API_KEY | deepseek-v4-pro | deepseek-v4-flash |
| Google Gemini | gemini | GEMINI_API_KEY | gemini-3.1-pro-preview | gemini-3.1-flash-lite-preview |
| NVIDIA NIM | nvidia | NVIDIA_API_KEY | meta/llama-3.1-405b-instruct | meta/llama-3.1-8b-instruct |
| MiniMax | minimax | MINIMAX_API_KEY | MiniMax-M3 | MiniMax-M2.7-highspeed |
| Amazon Bedrock | bedrock | AWS IAM (AWS_REGION) | us.anthropic.claude-sonnet-4-6 | us.anthropic.claude-haiku-4-5-20251001-v1:0 |
| Ollama (local) | ollama | None (local daemon) | llama3.2 | llama3.2 |
| OpenAI Codex CLI | codex | codex login (CLI) | Codex CLI default | Codex CLI default |
| Claude Code CLI | claude-code | claude login (CLI) | Claude Code CLI default | Claude Code CLI default |
| GitHub Copilot CLI | copilot | copilot login or gh auth login (CLI) | Copilot CLI default | Copilot CLI default |
| Google Antigravity CLI | antigravity-cli | agy (browser OAuth, OS keyring) | Whatever the local agy config is set to (switch via /models inside agy) | same as reasoning model |
- Reasoning model — full-capability model used for diagnosis, claim validation, and multi-step analysis.
- Toolcall model — lightweight, lower-cost model used for tool selection and routing.
Selecting a provider
Set LLM_PROVIDER (default: anthropic) in your environment or .env file:
.env:
/model shows curated quick-pick choices for common models. Providers
with fast-changing or account-gated catalogs (OpenAI, OpenRouter, Gemini, NVIDIA, Bedrock, local
CLIs, Ollama, and DeepSeek) also accept custom model IDs:
LLM_MAX_TOKENS (default 4096) controls the response token budget for every provider.
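As a rough illustration of the two settings above, the sketch below reads LLM_PROVIDER and LLM_MAX_TOKENS with the documented defaults. The `LLMEnv` dataclass and `read_llm_env` helper are hypothetical names, not the real structures in app/config.py, and lowercasing the provider value is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LLMEnv:
    """Hypothetical container for the two settings discussed above."""
    provider: str
    max_tokens: int


def read_llm_env(env: Optional[Mapping[str, str]] = None) -> LLMEnv:
    """Read LLM_PROVIDER / LLM_MAX_TOKENS, applying the documented defaults."""
    source = os.environ if env is None else env
    # Assumption: provider names are compared case-insensitively.
    provider = source.get("LLM_PROVIDER", "").strip().lower() or "anthropic"
    max_tokens = int(source.get("LLM_MAX_TOKENS", "4096"))
    return LLMEnv(provider=provider, max_tokens=max_tokens)
```

Passing a plain dict instead of relying on os.environ keeps the helper easy to exercise in tests.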
API providers
Anthropic
OpenAI
o1, o3, o4, gpt-5*) automatically use max_completion_tokens instead of max_tokens.
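The parameter switch described above could look roughly like the following. This is a minimal sketch: the prefix list mirrors the families named in the text, but the exact matching rule OpenSRE uses is not shown here, so treat `token_limit_kwargs` as a hypothetical helper.

```python
# Model-name prefixes taken from the text above; the precise matching rule is an assumption.
COMPLETION_TOKEN_PREFIXES = ("o1", "o3", "o4", "gpt-5")


def token_limit_kwargs(model: str, max_tokens: int) -> dict[str, int]:
    """Return the token-budget keyword argument appropriate for the model."""
    name = model.strip().lower()
    if name.startswith(COMPLETION_TOKEN_PREFIXES):
        return {"max_completion_tokens": max_tokens}
    return {"max_tokens": max_tokens}
```

The returned dict can be splatted into a request payload, so the call site does not need to know which family the model belongs to.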
OpenRouter
https://openrouter.ai/api/v1.
DeepSeek
https://api.deepseek.com.
Google Gemini
https://generativelanguage.googleapis.com/v1beta/openai/. Get an API key at aistudio.google.com.
NVIDIA NIM
https://integrate.api.nvidia.com/v1. Browse available models on build.nvidia.com.
MiniMax
https://api.minimax.io/v1. Temperature is fixed to 1.0 to match MiniMax recommendations.
Amazon Bedrock
InvokeModel / Converse access scoped to those resources in IAM).
Model routing:
- Anthropic Claude models on Bedrock (anthropic.claude-*, us.anthropic.claude-*, and foundation-model ARNs that contain anthropic.claude) use the existing AnthropicBedrock SDK path.
- Other Bedrock foundation models (for example Mistral, Meta Llama, Amazon Titan IDs you enable in your account) use the Bedrock Converse API via boto3, so you can set BEDROCK_REASONING_MODEL to a non-Claude model ID when your use case requires it.
- Application inference profile ARNs (…:application-inference-profile/…) do not encode the vendor in the ID; those are always sent through Converse, which works for any backing model in the profile.
app/config.py are US cross-region inference profile IDs for Anthropic Claude; override with IDs or ARNs that are inference-access enabled in your account and region.
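The routing rules for Bedrock model IDs can be sketched as a small classifier. This is an illustration only: `bedrock_route` and its return labels are hypothetical, and the real decision lives in app/services/llm_client.py.

```python
def bedrock_route(model_id: str) -> str:
    """Classify a Bedrock model ID or ARN as 'anthropic_sdk' or 'converse'."""
    mid = model_id.strip()
    # Application inference profiles do not encode the vendor, so they always use Converse.
    if ":application-inference-profile/" in mid:
        return "converse"
    # Plain Claude model IDs and US cross-region inference profile IDs.
    if mid.startswith(("anthropic.claude-", "us.anthropic.claude-")):
        return "anthropic_sdk"
    # Foundation-model ARNs that name a Claude model.
    if mid.startswith("arn:") and ":foundation-model/" in mid and "anthropic.claude" in mid:
        return "anthropic_sdk"
    # Everything else (Mistral, Meta Llama, Amazon Titan, ...) goes through Converse.
    return "converse"
```

Checking the inference-profile case first means a profile ARN is routed to Converse regardless of what else the string contains.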
Ollama (local)
${OLLAMA_HOST}/v1.
CLI providers (subprocess)
CLI-backed providers shell out to a vendor CLI instead of an HTTP API. They authenticate via the vendor’s own login command; OpenSRE detects the binary on PATH (or via an explicit env var) and reuses the existing session.
Investigation timeouts: Each ReAct turn runs one full CLI subprocess with the system prompt, tool schemas, and conversation history. The shared default subprocess budget is 300 seconds (Python adds a small buffer). Override per provider when needed, for example GEMINI_CLI_TIMEOUT_SECONDS, CLAUDE_CODE_TIMEOUT_SECONDS, or ANTIGRAVITY_CLI_TIMEOUT_SECONDS (clamped 30–600 where the adapter supports it).
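A per-provider timeout override with the 30–600 second clamp might be resolved along these lines. The helper name and the fallback on unparsable values are assumptions; only the 300-second default and the clamp range come from the text above.

```python
from typing import Mapping

DEFAULT_TIMEOUT_SECONDS = 300
MIN_TIMEOUT_SECONDS = 30
MAX_TIMEOUT_SECONDS = 600


def resolve_cli_timeout(env: Mapping[str, str], var_name: str) -> int:
    """Return the subprocess timeout for one CLI provider, clamped to 30-600 s."""
    raw = env.get(var_name, "").strip()
    if not raw:
        return DEFAULT_TIMEOUT_SECONDS
    try:
        value = int(raw)
    except ValueError:
        # Assumption: an unparsable override falls back to the shared default.
        return DEFAULT_TIMEOUT_SECONDS
    return max(MIN_TIMEOUT_SECONDS, min(MAX_TIMEOUT_SECONDS, value))
```

For example, `resolve_cli_timeout(os.environ, "CLAUDE_CODE_TIMEOUT_SECONDS")` would yield the budget for a Claude Code turn.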
OpenAI Codex
CODEX_MODEL is unset, OpenSRE omits -m so codex exec uses the CLI’s currently configured model. If CODEX_BIN is unset, the binary is resolved via PATH and known install locations.
Claude Code
npm i -g @anthropic-ai/claude-code). If CLAUDE_CODE_MODEL is unset, OpenSRE omits the --model flag and the CLI uses its configured default. If CLAUDE_CODE_BIN is unset, the binary is resolved via PATH and known install locations.
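Both Codex and Claude Code follow the same "omit the model flag when unset" rule, which can be sketched with one helper. `with_optional_model` is a hypothetical name, and where the flag sits relative to other arguments is an assumption.

```python
from typing import Optional, Sequence


def with_optional_model(argv: Sequence[str], flag: str, model: Optional[str]) -> list[str]:
    """Append `flag model` only when a model override is set; otherwise leave argv as-is."""
    cleaned = (model or "").strip()
    if not cleaned:
        # No override: the CLI falls back to its own configured default.
        return list(argv)
    return [*argv, flag, cleaned]
```

With CODEX_MODEL unset, `with_optional_model(["codex", "exec"], "-m", None)` leaves the command untouched, so the CLI default applies.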
GitHub Copilot
npm i -g @github/copilot). Login uses the interactive /login slash command or copilot login.
OpenSRE detects auth in this order: (1) COPILOT_GITHUB_TOKEN / GH_TOKEN / GITHUB_TOKEN env, (2) gh auth status when gh is on PATH (including ✓ Logged in to github.com account …, - Active account: true, or a supported - Token: prefix: gho_, github_pat_, ghu_ per Copilot docs, but not ghp_), with gh auth status --hostname … when COPILOT_GH_HOST or GH_HOST targets a non-github.com host. It does not read plaintext $COPILOT_HOME/config.json (keychain-backed installs may omit it; mis-parsing arbitrary JSON risks false positives). If nothing matches, detection reports logged_in=None and the runner verifies at invoke time.
If COPILOT_MODEL is unset, OpenSRE omits --model. Invocations run as copilot -p PROMPT --no-color --no-ask-user --silent so they never block on user input.
BYOK / COPILOT_OFFLINE: GitHub auth may be unnecessary; a None probe can still be fine if Copilot is configured for offline or external providers only.
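The detection order for Copilot auth can be approximated as follows. This is a sketch under stated assumptions: `detect_copilot_login` is a hypothetical function, the gh auth status output is passed in as text rather than obtained by running gh, and the env-var tokens are accepted without a prefix check.

```python
from typing import Mapping, Optional

TOKEN_ENV_VARS = ("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN")
SUPPORTED_TOKEN_PREFIXES = ("gho_", "github_pat_", "ghu_")  # ghp_ is deliberately absent


def detect_copilot_login(env: Mapping[str, str], gh_status: str = "") -> Optional[bool]:
    """Return True when auth is detected, or None when the probe is inconclusive."""
    # Step 1: explicit token environment variables.
    if any(env.get(name, "").strip() for name in TOKEN_ENV_VARS):
        return True
    # Step 2: scan `gh auth status` output for any of the accepted markers.
    for line in gh_status.splitlines():
        text = line.strip()
        if "Logged in to github.com account" in text or text == "- Active account: true":
            return True
        if text.startswith("- Token: "):
            token = text[len("- Token: "):]
            if token.startswith(SUPPORTED_TOKEN_PREFIXES):
                return True
    # Nothing matched: leave verification to invoke time.
    return None
```

Returning None rather than False keeps the "unknown" state distinct from a confirmed logout, which matters for offline or BYOK setups.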
Google Antigravity CLI
agy) is Google’s successor to Gemini CLI. Install via curl -fsSL https://antigravity.google/cli/install.sh | bash, then run agy install to configure your shell PATH. The minimum tested version is 1.0.1 — older builds log a warning via the probe and direct you to agy update.
Why two Google providers? Google’s transition announcement states that on 2026-06-18 Gemini CLI stops serving Pro/Ultra and free users. Paid Gemini Code Assist licences keep Gemini CLI indefinitely. OpenSRE keeps both gemini-cli (deprecated alias with a probe-time notice) and antigravity-cli so either group can run without surprises.
As a best-effort fallback, the probe treats explicit GEMINI_API_KEY / GOOGLE_API_KEY / GOOGLE_APPLICATION_CREDENTIALS env credentials as authenticated (mirroring the Gemini CLI adapter), so users migrating across the two CLIs can keep their existing env-var-based auth without re-running the browser flow.
Invocations run as agy -p PROMPT --print-timeout {N}s. The adapter never passes --continue / --conversation / --sandbox / --dangerously-skip-permissions, keeping every opensre call ephemeral.
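The invocation shape above could be assembled as in this sketch. `build_agy_argv` is a hypothetical helper; only the flags named in the text are used, and the session-related flags are listed purely so a caller can check they are never added.

```python
# Flags the adapter is described as never passing.
NEVER_PASSED = ("--continue", "--conversation", "--sandbox", "--dangerously-skip-permissions")


def build_agy_argv(prompt: str, timeout_seconds: int, binary: str = "agy") -> list[str]:
    """Build an ephemeral, single-prompt agy command line."""
    return [binary, "-p", prompt, "--print-timeout", f"{timeout_seconds}s"]
```

Passing the result to subprocess.run as a list (not a shell string) avoids quoting problems with multi-line prompts.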
xAI Grok Build CLI
grok). Install with
curl -fsSL https://x.ai/cli/install.sh | bash (macOS/Linux) or
irm https://x.ai/cli/install.ps1 | iex (Windows). If GROK_CLI_MODEL is unset, OpenSRE
omits -m and the CLI uses its configured default. The wizard populates the model list live
from grok models at onboarding time so newly released models appear without an OpenSRE update.
Invocations run as grok -p PROMPT --output-format plain, so each opensre call is a single
non-interactive turn. The adapter deliberately omits --always-approve: OpenSRE drives its own
tools, so Grok is used purely as a text responder and never auto-executes shell commands or file edits.
Auth detection: auth is probed via grok models (~0.5 s, no LLM call), which prints
“You are logged in” on success. XAI_API_KEY is treated as an authenticated fallback for
headless / CI runs even when the probe result is unclear. XAI_API_KEY is forwarded only
to the Grok subprocess (never via the shared CLI env allowlist), so it cannot leak into other
CLI adapters.
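Scoping XAI_API_KEY to the Grok subprocess can be sketched as below. The allowlist contents and the `build_cli_env` helper are hypothetical; the point is only that the key is added per provider rather than through the shared list.

```python
from typing import Mapping

# Hypothetical shared allowlist; the real one is not shown in this document.
SHARED_ENV_ALLOWLIST = ("PATH", "HOME", "LANG")


def build_cli_env(parent_env: Mapping[str, str], provider: str) -> dict[str, str]:
    """Build the environment for one CLI subprocess."""
    env = {k: v for k, v in parent_env.items() if k in SHARED_ENV_ALLOWLIST}
    # XAI_API_KEY is forwarded only to the Grok adapter, never via the shared allowlist.
    if provider == "grok-cli" and parent_env.get("XAI_API_KEY"):
        env["XAI_API_KEY"] = parent_env["XAI_API_KEY"]
    return env
```

Because the key is absent from the shared list, an adapter for any other CLI built with the same helper never sees it.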
Not to be confused with groq. The grok-cli provider is xAI’s Grok Build CLI. The separate groq provider is the Groq HTTP API (a different company); the two are unrelated.
See app/integrations/llm_cli/AGENTS.md for the adapter pattern used to add new CLI providers.
Reasoning effort (interactive shell)
In the TTY REPL (opensre with no subcommand), /effort stores a session preference for how strongly reasoning models should think before answering. It applies only when LLM_PROVIDER is openai (HTTP API) or codex (Codex CLI); other providers ignore the setting and the shell notes that.
| Input | Sent to the model |
|---|---|
| low, medium, high, xhigh | same string |
| max | xhigh |
Run /effort alone to show the current choice (or (default) when unset) and the usage line. /new starts a fresh session but keeps /effort (and trust mode), consistent with other session prefs.
Outside the REPL, optional defaults use the environment variable:
/effort overrides this for interactive runs. Implementation: app/llm_reasoning_effort.py.
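The input-to-model mapping in the table above amounts to a small normalization step, sketched here. The function names are hypothetical and do not reflect the actual contents of app/llm_reasoning_effort.py; rejecting unknown values with ValueError is an assumption.

```python
PASS_THROUGH_EFFORTS = ("low", "medium", "high", "xhigh")
EFFORT_PROVIDERS = ("openai", "codex")


def normalize_effort(value: str) -> str:
    """Map user input to the string sent to the model ('max' becomes 'xhigh')."""
    cleaned = value.strip().lower()
    if cleaned == "max":
        return "xhigh"
    if cleaned in PASS_THROUGH_EFFORTS:
        return cleaned
    raise ValueError(f"unsupported effort: {value!r}")


def effort_applies(provider: str) -> bool:
    """Only the OpenAI HTTP API and the Codex CLI honor the setting."""
    return provider in EFFORT_PROVIDERS
```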
Provider fallback and diagnostics
If the provider you set in LLM_PROVIDER is missing its API key, OpenSRE does not fail outright. Instead, it falls back to the next configured provider (by default it tries openai, then anthropic) so a partially configured machine still works. The trade-off is that calls can quietly go to a different provider than you intended, and you would otherwise only find out via a confusing error naming the fallback provider (for example “Anthropic credit balance too low” when you actually configured OpenAI).
To make this visible:
- A warning is logged the first time a fallback happens, naming the configured provider, the missing key, and the provider actually used:
- /status (in the interactive shell) shows the resolved provider and flags a fallback inline, instead of just echoing LLM_PROVIDER:
- Provider errors in the interactive shell are prefixed with which provider served the request and whether it was a fallback, so the message is actionable:
LLM_PROVIDER to a provider you have credentials for.
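The fallback behavior described in this section can be modeled roughly as follows. This is a sketch, not the real resolver: `resolve_provider`, the key-variable map (a subset of the quick-reference table), and the warning wording are all assumptions.

```python
import logging
from typing import Mapping

logger = logging.getLogger("llm_fallback_sketch")

# Subset of the quick-reference table; providers without an API key are omitted.
API_KEY_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
}
FALLBACK_ORDER = ("openai", "anthropic")


def resolve_provider(env: Mapping[str, str]) -> tuple[str, bool]:
    """Return (provider actually used, whether it is a fallback)."""
    configured = env.get("LLM_PROVIDER", "anthropic")
    key_var = API_KEY_VARS.get(configured)
    if key_var is None or env.get(key_var):
        return configured, False
    for candidate in FALLBACK_ORDER:
        if candidate != configured and env.get(API_KEY_VARS[candidate]):
            # Hypothetical message wording; the real log line may differ.
            logger.warning("%s is missing %s; falling back to %s", configured, key_var, candidate)
            return candidate, True
    # No usable fallback: keep the configured provider and let the call fail loudly.
    return configured, False
```

Returning the fallback flag alongside the provider is what lets a status display or error prefix report both facts.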
Switching providers at runtime
OpenSRE caches LLM clients on first use. To switch providers within a single process (tests, benchmarks), call reset_llm_singletons() from app.services.llm_client after updating the env vars; otherwise a fresh process picks up the new LLM_PROVIDER automatically.
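To see why a reset is needed, the sketch below reproduces the caching behavior with stand-in functions. `get_cached_provider` and `reset_cache` are hypothetical and only mimic the pattern; in OpenSRE itself the documented entry point is reset_llm_singletons().

```python
import os
from functools import lru_cache


@lru_cache(maxsize=None)
def get_cached_provider() -> str:
    """Stand-in for a cached client: the env var is read only on first use."""
    return os.environ.get("LLM_PROVIDER", "anthropic")


def reset_cache() -> None:
    """Stand-in for the reset hook: drop the cached value so the env is re-read."""
    get_cached_provider.cache_clear()
```

Until the cache is cleared, changing LLM_PROVIDER in the same process has no visible effect, which is the situation the reset call addresses.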
Where this lives in the code
- Provider literals and defaults: app/config.py (LLMProvider, LLMSettings).
- Runtime routing: app/services/llm_client.py (_create_llm_client).
- API-backed provider guide: app/services/AGENTS.md.
- CLI-backed provider guide: app/integrations/llm_cli/AGENTS.md.