

OpenSRE is provider-agnostic: bring your own model. Selection is controlled by the LLM_PROVIDER environment variable, with per-provider API key and model overrides. Defaults are tracked in app/config.py and routing lives in app/services/llm_client.py.
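
In code, that selection reduces to a lookup keyed on LLM_PROVIDER. A minimal sketch of the pattern, using only env-var names and defaults documented below (the authoritative table is in app/config.py and the real routing in app/services/llm_client.py):

```python
import os

# Illustrative subset only; app/config.py tracks the full table.
# (API key env var, reasoning default, toolcall default)
DEFAULTS = {
    "anthropic": ("ANTHROPIC_API_KEY", "claude-sonnet-4-6", "claude-haiku-4-5-20251001"),
    "openai": ("OPENAI_API_KEY", "gpt-5.4", "gpt-5.4-mini"),
    "ollama": (None, "llama3.2", "llama3.2"),  # local daemon, no key
}

def resolve_provider():
    provider = os.getenv("LLM_PROVIDER", "anthropic")
    if provider not in DEFAULTS:
        raise ValueError(f"Unsupported LLM_PROVIDER: {provider!r}")
    key_var, reasoning_default, toolcall_default = DEFAULTS[provider]
    if key_var and not os.getenv(key_var):
        raise RuntimeError(f"{provider} requires {key_var} to be set")
    return provider, reasoning_default, toolcall_default
```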

Quick reference

| Provider | LLM_PROVIDER | Auth | Reasoning model default | Toolcall model default |
| --- | --- | --- | --- | --- |
| Anthropic | anthropic | ANTHROPIC_API_KEY | claude-sonnet-4-6 | claude-haiku-4-5-20251001 |
| OpenAI | openai | OPENAI_API_KEY | gpt-5.4 | gpt-5.4-mini |
| OpenRouter | openrouter | OPENROUTER_API_KEY | openrouter/auto | openrouter/auto |
| Requesty | requesty | REQUESTY_API_KEY | anthropic/claude-sonnet-4-6 | anthropic/claude-sonnet-4-6 |
| Google Gemini | gemini | GEMINI_API_KEY | gemini-3.1-pro-preview | gemini-3.1-flash-lite-preview |
| NVIDIA NIM | nvidia | NVIDIA_API_KEY | meta/llama-3.1-405b-instruct | meta/llama-3.1-8b-instruct |
| MiniMax | minimax | MINIMAX_API_KEY | MiniMax-M2.7 | MiniMax-M2.7-highspeed |
| Amazon Bedrock | bedrock | AWS IAM (AWS_REGION) | us.anthropic.claude-sonnet-4-6 | us.anthropic.claude-haiku-4-5-20251001-v1:0 |
| Ollama (local) | ollama | None (local daemon) | llama3.2 | llama3.2 |
| OpenAI Codex CLI | codex | codex login (CLI) | Codex CLI default | Codex CLI default |
| Claude Code CLI | claude-code | claude login (CLI) | Claude Code CLI default | Claude Code CLI default |
OpenSRE distinguishes two model slots per provider:
  • Reasoning model — full-capability model used for diagnosis, claim validation, and multi-step analysis.
  • Toolcall model — lightweight, lower-cost model used for tool selection and routing.
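Per-slot overrides follow a simple precedence: an explicit per-slot env var wins, otherwise the default tracked in app/config.py applies. A hedged sketch of that lookup (the function name is illustrative; the real resolution lives in app/config.py):

```python
import os

def model_for_slot(provider: str, slot: str, default: str) -> str:
    """Resolve the model for the 'reasoning' or 'toolcall' slot.

    Sketch of the documented precedence: env override first,
    else the default from app/config.py.
    """
    env_var = f"{provider.upper()}_{slot.upper()}_MODEL"  # e.g. OPENAI_REASONING_MODEL
    return os.getenv(env_var) or default

# With OPENAI_REASONING_MODEL unset, this falls back to the table default:
print(model_for_slot("openai", "reasoning", "gpt-5.4"))  # -> "gpt-5.4"
```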

Selecting a provider

Set LLM_PROVIDER (default: anthropic) in your environment or .env file:
export LLM_PROVIDER=openai
export OPENAI_API_KEY=sk-...
Or run the onboarding wizard, which writes the same values to .env:
opensre onboard
Override the default model for a slot via env vars:
export OPENAI_REASONING_MODEL=gpt-5.4
export OPENAI_TOOLCALL_MODEL=gpt-5.4-mini
A shared LLM_MAX_TOKENS (default 4096) controls the response token budget for every provider.

API providers

Anthropic

export LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
# Optional overrides:
export ANTHROPIC_REASONING_MODEL=claude-sonnet-4-6
export ANTHROPIC_TOOLCALL_MODEL=claude-haiku-4-5-20251001
The default provider. Uses the Anthropic Python SDK directly. Get an API key at console.anthropic.com.

OpenAI

export LLM_PROVIDER=openai
export OPENAI_API_KEY=sk-...
# Optional overrides:
export OPENAI_REASONING_MODEL=gpt-5.4
export OPENAI_TOOLCALL_MODEL=gpt-5.4-mini
Uses the OpenAI SDK. Reasoning models (o1, o3, o4, gpt-5*) automatically use max_completion_tokens instead of max_tokens.
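
A hedged sketch of that parameter switch, assuming the standard openai Python SDK and the documented LLM_MAX_TOKENS budget (the prompt is placeholder content):

```python
import os
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
model = os.getenv("OPENAI_REASONING_MODEL", "gpt-5.4")
budget = int(os.getenv("LLM_MAX_TOKENS", "4096"))

# Reasoning-model families reject max_tokens and expect max_completion_tokens.
is_reasoning = model.startswith(("o1", "o3", "o4", "gpt-5"))
token_arg = {"max_completion_tokens": budget} if is_reasoning else {"max_tokens": budget}

resp = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Summarize the last deploy."}],
    **token_arg,
)
print(resp.choices[0].message.content)
```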

OpenRouter

export LLM_PROVIDER=openrouter
export OPENROUTER_API_KEY=sk-or-...
# Optional override (single value applies to both slots if set):
export OPENROUTER_MODEL=openrouter/auto
# Or per-slot:
export OPENROUTER_REASONING_MODEL=anthropic/claude-sonnet-4-6
export OPENROUTER_TOOLCALL_MODEL=openai/gpt-4o-mini
OpenAI-compatible proxy — pick any model on openrouter.ai/models. Base URL: https://openrouter.ai/api/v1.
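
Because the proxy is OpenAI-compatible, the standard openai SDK works unchanged once base_url points at it. A sketch, using the documented base URL and defaults:

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)
resp = client.chat.completions.create(
    model=os.getenv("OPENROUTER_REASONING_MODEL", "openrouter/auto"),
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```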

Requesty

export LLM_PROVIDER=requesty
export REQUESTY_API_KEY=...
# Optional override (single value applies to both slots if set):
export REQUESTY_MODEL=anthropic/claude-sonnet-4-6
# Or per-slot:
export REQUESTY_REASONING_MODEL=anthropic/claude-sonnet-4-6
export REQUESTY_TOOLCALL_MODEL=anthropic/claude-sonnet-4-6
OpenAI-compatible gateway at https://router.requesty.ai/v1. Sends an X-Title: OpenSRE header for usage attribution. Browse models on requesty.ai.
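
The attribution header can be attached once at client construction via the openai SDK's default_headers, so every request carries it. A sketch, assuming that construction style (the header name is the one documented above):

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.requesty.ai/v1",
    api_key=os.environ["REQUESTY_API_KEY"],
    # Sent with every request; shows up as usage attribution in Requesty.
    default_headers={"X-Title": "OpenSRE"},
)
```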

Google Gemini

export LLM_PROVIDER=gemini
export GEMINI_API_KEY=...
# Optional override:
export GEMINI_MODEL=gemini-3.1-pro-preview
# Or per-slot:
export GEMINI_REASONING_MODEL=gemini-3.1-pro-preview
export GEMINI_TOOLCALL_MODEL=gemini-3.1-flash-lite-preview
Uses Google’s OpenAI-compatible endpoint at https://generativelanguage.googleapis.com/v1beta/openai/. Get an API key at aistudio.google.com.

NVIDIA NIM

export LLM_PROVIDER=nvidia
export NVIDIA_API_KEY=nvapi-...
# Optional override:
export NVIDIA_MODEL=meta/llama-3.1-405b-instruct
# Or per-slot:
export NVIDIA_REASONING_MODEL=meta/llama-3.1-405b-instruct
export NVIDIA_TOOLCALL_MODEL=meta/llama-3.1-8b-instruct
Uses NVIDIA’s OpenAI-compatible API at https://integrate.api.nvidia.com/v1. Browse available models on build.nvidia.com.

MiniMax

export LLM_PROVIDER=minimax
export MINIMAX_API_KEY=...
# Optional override (single value applies to both slots if set):
export MINIMAX_MODEL=MiniMax-M2.7
# Or per-slot:
export MINIMAX_REASONING_MODEL=MiniMax-M2.7
export MINIMAX_TOOLCALL_MODEL=MiniMax-M2.7-highspeed
OpenAI-compatible endpoint at https://api.minimax.io/v1. Temperature is fixed to 1.0 to match MiniMax recommendations.

Amazon Bedrock

export LLM_PROVIDER=bedrock
export AWS_REGION=us-east-1
# Optional overrides:
export BEDROCK_REASONING_MODEL=us.anthropic.claude-sonnet-4-6
export BEDROCK_TOOLCALL_MODEL=us.anthropic.claude-haiku-4-5-20251001-v1:0
No API key — auth uses the AWS credential chain (environment variables, shared credentials file, or IAM role). Your principal needs permission to invoke the model IDs you configure (for example Bedrock InvokeModel / Converse access scoped to those resources in IAM). Model routing:
  • Anthropic Claude models on Bedrock (IDs matching anthropic.claude-*, us.anthropic.claude-*, and foundation-model ARNs that contain anthropic.claude) use the existing AnthropicBedrock SDK path.
  • Other Bedrock foundation models (for example Mistral, Meta Llama, Amazon Titan IDs you enable in your account) use the Bedrock Converse API via boto3, so you can set BEDROCK_REASONING_MODEL to a non-Claude model ID when your use case requires it.
  • Application inference profile ARNs (…:application-inference-profile/…) do not encode the vendor in the ID; those are always sent through Converse, which works for any backing model in the profile.
Defaults in app/config.py are US cross-region inference profile IDs for Anthropic Claude; override with IDs or ARNs that are inference-access enabled in your account and region.
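
A hedged sketch of the vendor detection described above (the real routing lives in app/services/llm_client.py; the anthropic and boto3 packages are assumed installed, and the function names here are illustrative):

```python
import os

def is_claude_model(model_id: str) -> bool:
    # Application inference profile ARNs do not encode the vendor,
    # so they always fall through to the Converse path.
    if ":application-inference-profile/" in model_id:
        return False
    return "anthropic.claude" in model_id

def invoke(model_id: str, prompt: str, max_tokens: int = 4096) -> str:
    region = os.getenv("AWS_REGION", "us-east-1")
    if is_claude_model(model_id):
        from anthropic import AnthropicBedrock  # uses the AWS credential chain
        client = AnthropicBedrock(aws_region=region)
        msg = client.messages.create(
            model=model_id,
            max_tokens=max_tokens,
            messages=[{"role": "user", "content": prompt}],
        )
        return msg.content[0].text
    import boto3
    bedrock = boto3.client("bedrock-runtime", region_name=region)
    out = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": max_tokens},
    )
    return out["output"]["message"]["content"][0]["text"]
```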

Ollama (local)

export LLM_PROVIDER=ollama
# Optional overrides:
export OLLAMA_HOST=http://localhost:11434
export OLLAMA_MODEL=llama3.2
Run any local model exposed by an Ollama daemon. No API key required — OpenSRE talks to Ollama’s OpenAI-compatible endpoint at ${OLLAMA_HOST}/v1.
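
Ollama's /v1 endpoint is unauthenticated, but the openai SDK still requires an api_key argument, so a placeholder value is conventional. A sketch using the documented host and model defaults:

```python
import os
from openai import OpenAI

host = os.getenv("OLLAMA_HOST", "http://localhost:11434")
client = OpenAI(base_url=f"{host}/v1", api_key="ollama")  # key is ignored by the daemon
resp = client.chat.completions.create(
    model=os.getenv("OLLAMA_MODEL", "llama3.2"),
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```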

CLI providers (subprocess)

CLI-backed providers shell out to a vendor CLI instead of an HTTP API. They authenticate via the vendor’s own login command; OpenSRE detects the binary on PATH (or via an explicit env var) and reuses the existing session.

OpenAI Codex

export LLM_PROVIDER=codex
# Authenticate the Codex CLI separately:
codex login
# Optional overrides (all blank-by-default):
export CODEX_MODEL=
export CODEX_BIN=
Requires the OpenAI Codex CLI. If CODEX_MODEL is unset, OpenSRE omits -m so codex exec uses the CLI’s currently configured model. If CODEX_BIN is unset, the binary is resolved via PATH and known install locations.
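
A hedged sketch of that subprocess pattern; the flag handling follows the description above, but the exact invocation in OpenSRE's adapter may differ:

```python
import os
import shutil
import subprocess

def codex_exec(prompt: str) -> str:
    # Explicit CODEX_BIN wins; otherwise resolve the binary via PATH.
    binary = os.getenv("CODEX_BIN") or shutil.which("codex")
    if not binary:
        raise RuntimeError("codex CLI not found; install it and run `codex login`")
    cmd = [binary, "exec"]
    model = os.getenv("CODEX_MODEL")
    if model:  # only pass -m when a model is explicitly configured
        cmd += ["-m", model]
    cmd.append(prompt)
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return out.stdout
```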

Claude Code

export LLM_PROVIDER=claude-code
# Authenticate the Claude Code CLI separately:
claude login
# Optional overrides (all blank-by-default):
export CLAUDE_CODE_MODEL=
export CLAUDE_CODE_BIN=
Requires the Claude Code CLI (npm i -g @anthropic-ai/claude-code). If CLAUDE_CODE_MODEL is unset, OpenSRE omits the --model flag and the CLI uses its configured default. If CLAUDE_CODE_BIN is unset, the binary is resolved via PATH and known install locations. See app/integrations/llm_cli/AGENTS.md for the adapter pattern used to add new CLI providers.

Switching providers at runtime

OpenSRE caches LLM clients on first use. To switch providers within a single process (tests, benchmarks), call reset_llm_singletons() from app.services.llm_client after updating the env vars; otherwise a fresh process picks up the new LLM_PROVIDER automatically.
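
For example, in a pytest test (the env-var names and import path are the documented ones; the tested helper is whatever code requests a client in your suite):

```python
from app.services.llm_client import reset_llm_singletons

def test_switch_to_openai(monkeypatch):
    monkeypatch.setenv("LLM_PROVIDER", "openai")
    monkeypatch.setenv("OPENAI_API_KEY", "sk-test")
    reset_llm_singletons()  # drop cached clients so the next call re-reads the env
    # ... exercise code that requests an LLM client here
```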

Where this lives in the code