Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.stagewise.io/llms.txt

Use this file to discover all available pages before exploring further.

Custom endpoints connect stagewise to any model, anywhere — including models running locally on your machine. stagewise supports multiple API styles: OpenAI-compatible chat completions, OpenAI Responses, Anthropic, Google, Azure OpenAI, Amazon Bedrock, and Google Vertex AI.

Creating a custom endpoint

1

Open Models & Providers

Go to Settings → Agent → Models & Providers.
2

Open custom providers

Click Custom Providers in the top-right corner of the built-in providers section.
3

Add a provider

Click Add Provider.
4

Configure the provider

Unless your provider states otherwise, use OpenAI (Chat Completions) — the de facto standard for self-hosted and proxy services.
Fill in:
  • Name — Display name
  • Provider type — The API specification your endpoint implements
  • Base URL — The endpoint URL

Supported API specifications

SpecExample services
openai-chat-completionsOllama, LM Studio, vLLM, LiteLLM, Together AI
anthropicSelf-hosted Claude, Anthropic-compatible proxies
openai-responsesOpenAI Responses API endpoints
googleSelf-hosted Gemini-compatible services
azureAzure OpenAI Service
amazon-bedrockAWS Bedrock
google-vertexGoogle Vertex AI

Cloud platform configuration

Azure OpenAI

Additional fields: Resource name and API version (e.g., 2024-02-01).

Amazon Bedrock

Choose an Authentication Method:
MethodWhat to enter
Access KeysAWS Region, Access Key ID, and Secret Access Key
Named ProfileOptional AWS Region override and an AWS Profile from ~/.aws/config or ~/.aws/credentials
Default Credential ChainOptional AWS Region override. stagewise resolves credentials from environment variables, shared credentials, ECS/EC2 metadata, and SSO
SSO profiles require an active session. If requests fail with an expired-token error, run aws sso login --profile <name>.

Google Vertex AI

Additional fields: Project ID, Location (e.g., us-central1), and Google credentials (service account JSON).

Local models via Ollama

A typical setup for running models locally:
  1. Run Ollama on your machine: ollama serve
  2. Pull a model: ollama pull llama3
  3. Create a custom endpoint in stagewise:
    • Provider type: openai-chat-completions
    • Base URL: http://localhost:11434/v1
  4. Add a custom model with the model ID Ollama expects (e.g., llama3)
  5. Select the model from the model picker

Model ID mapping

Some endpoints use different model identifiers. Use Model ID Mapping to remap:
{
  "claude-sonnet-4-6": "claude-v2"
}
This tells stagewise: “when Sonnet 4.6 is selected, request claude-v2 from this endpoint.”

Using your custom endpoint

After creating the endpoint, go to any built-in provider’s configuration and switch to Custom endpoint mode. Select your endpoint from the dropdown. All models for that provider route through your custom endpoint.