
Roxy AI API

Roxy AI exposes an OpenAI-compatible API under /api, with versioned aliases under /api/v1. The web app uses the same model catalog and shares the same owner-managed key system.

Endpoints

POST /api/chat/completions

Main OpenAI-style chat endpoint. Also available at /api/v1/chat/completions.

GET /api/models

OpenAI-style model list. Also available at /api/v1/models.
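
A minimal sketch of reading the model list from a client. It assumes the response follows the usual OpenAI list shape (an object with a "data" array of entries carrying an "id") and that the route accepts the same Bearer key as the chat endpoint. BASE_URL, build_models_request, extract_model_ids, and the sample payload are hypothetical names and data for illustration, not part of Roxy AI.

```python
import urllib.request

BASE_URL = "http://localhost:3000"  # assumption: replace with your deployment's origin


def build_models_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request for the model list."""
    return urllib.request.Request(
        f"{base_url}/api/models",
        headers={"Authorization": f"Bearer {api_key}"},  # assumption: same key scheme as chat
        method="GET",
    )


def extract_model_ids(payload: dict) -> list[str]:
    """Pull model IDs out of an OpenAI-style list payload (assumed shape)."""
    return [entry["id"] for entry in payload.get("data", [])]


# Hypothetical sample payload; a real response may carry more fields per entry.
sample = {
    "object": "list",
    "data": [
        {"id": "zai/glm-4.5", "object": "model"},
        {"id": "google/gemini-2.5-pro", "object": "model"},
    ],
}
print(extract_model_ids(sample))
```

To call the versioned alias instead, swap the path for /api/v1/models.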

GET /api/keys

Owner-only route that lists stored API keys.

POST /api/keys

Owner-only route for creating new stored client keys.

Authentication

Send API keys as Authorization: Bearer YOUR_KEY. The owner key is seeded through ROXY_ADMIN_API_KEY and bypasses rate limits. Keys created in the UI are stored locally in data/api-keys.json.

curl https://YOUR_HOST/api/chat/completions \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-pro",
    "messages": [
      { "role": "user", "content": "Build a pricing table in Tailwind." }
    ],
    "stream": false
  }'
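
The same request can be sketched with Python's standard library. This is an illustration under assumptions, not an official client: build_chat_request and first_message_text are hypothetical helpers, the base URL is a placeholder you supply, and the response is assumed to follow the OpenAI shape (choices[0].message.content).

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, prompt: str,
                       model: str = "google/gemini-2.5-pro") -> urllib.request.Request:
    """Build (but do not send) a non-streaming chat completion request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url}/api/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def first_message_text(response: dict) -> str:
    """Read the assistant text, assuming an OpenAI-style response body."""
    return response["choices"][0]["message"]["content"]


request = build_chat_request(
    "http://localhost:3000",  # assumption: your deployment's origin
    "YOUR_KEY",
    "Build a pricing table in Tailwind.",
)
# Sending is left commented out so the sketch runs without a live server:
# with urllib.request.urlopen(request) as resp:
#     print(first_message_text(json.load(resp)))
```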

Streaming

Set stream: true to receive server-sent events in the same shape most OpenAI-compatible clients expect.

{
  "model": "google/gemini-2.5-pro",
  "messages": [{ "role": "user", "content": "Write a login form." }],
  "stream": true,
  "stream_options": { "include_usage": true }
}
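
A client-side sketch of consuming such a stream. It assumes the events mirror OpenAI's chunk format: each event is a "data:" line holding JSON with choices[].delta.content, a final chunk carries "usage" when include_usage is set, and the stream ends with "data: [DONE]". The helper names and the sample events are made up for illustration.

```python
import json
from typing import Iterable, Iterator, Optional


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the payload of each `data:` line, stopping at the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield payload


def collect_stream(lines: Iterable[str]) -> tuple[str, Optional[dict]]:
    """Concatenate content deltas and return (text, usage or None)."""
    parts: list[str] = []
    usage = None
    for payload in iter_sse_data(lines):
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]  # assumed to arrive in a trailing chunk
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts), usage


# Hypothetical events in the assumed OpenAI-compatible shape.
sample_stream = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "<form>"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "</form>"}}]}',
    "",
    'data: {"choices": [], "usage": {"total_tokens": 12}}',
    "",
    "data: [DONE]",
]
text, usage = collect_stream(sample_stream)
print(text, usage)
```

In a real client, the lines would come from the decoded HTTP response body rather than a list.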

Model Catalog

GLM 4.5
zai/glm-4.5
128K context

Fast, roomy context for long edits and big pasted files.

Gemini 2.5 Pro
google/gemini-2.5-pro
1M context

Best fit for huge repositories, long specs, and image inputs.

Llama 3.3 70B
meta/llama-3.3-70b
131K context

Solid open-weight option for long-form drafting and coding.

Claude Sonnet 4
anthropic/claude-sonnet-4
200K context

Strong code reasoning with clean, structured markdown output.

Notes

Guest traffic hitting the web chat is rate-limited by IP. API keys use per-key limits unless they are the seeded owner key.
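
The policy above could be modeled roughly as follows. This is a sketch, not Roxy AI's actual limiter: the fixed-window algorithm, the class name FixedWindowLimiter, and the limit and window values are all assumptions chosen for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FixedWindowLimiter:
    """Fixed-window counter keyed by API key, or by IP for guests (illustrative only)."""

    def __init__(self, limit: int, window_seconds: float,
                 owner_key: Optional[str] = None,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window_seconds
        self.owner_key = owner_key
        self.clock = clock
        # identity -> (window start time, requests counted in that window)
        self._counters: Dict[str, Tuple[float, int]] = {}

    def allow(self, api_key: Optional[str], client_ip: str) -> bool:
        if api_key is not None and api_key == self.owner_key:
            return True  # the seeded owner key bypasses rate limits
        identity = f"key:{api_key}" if api_key else f"ip:{client_ip}"
        now = self.clock()
        start, count = self._counters.get(identity, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # window expired: start a fresh one
        if count >= self.limit:
            return False
        self._counters[identity] = (start, count + 1)
        return True


# Hypothetical numbers: 2 requests per 60 seconds.
limiter = FixedWindowLimiter(limit=2, window_seconds=60, owner_key="owner-secret")
print([limiter.allow(None, "203.0.113.7") for _ in range(3)])  # guest, limited by IP
print(limiter.allow("owner-secret", "203.0.113.7"))            # owner key, never limited
```

The counters live in a plain dictionary, so this version only works within a single process.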

Attachments are capped at 10MB per file in the web UI. Images render inline in chat; other files render as file cards.
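
A rough sketch of how a client might pre-check an attachment before upload. The helper classify_attachment is hypothetical, "10MB" is read here as 10 MiB (the exact byte count is an assumption), and image detection by file extension is a stand-in for whatever the web UI actually does.

```python
import mimetypes

MAX_ATTACHMENT_BYTES = 10 * 1024 * 1024  # assumption: "10MB" interpreted as 10 MiB


def classify_attachment(filename: str, size_bytes: int) -> str:
    """Return "image", "file", or "rejected" for a candidate upload."""
    if size_bytes > MAX_ATTACHMENT_BYTES:
        return "rejected"  # over the per-file cap
    mime, _ = mimetypes.guess_type(filename)
    if mime is not None and mime.startswith("image/"):
        return "image"  # would render inline in chat
    return "file"  # would render as a file card


print(classify_attachment("mockup.png", 2_000_000))
print(classify_attachment("spec.pdf", 2_000_000))
print(classify_attachment("demo.mp4", 50_000_000))
```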

The current key store and rate limiter keep their state local to a single process, which suits single-node deployments. For multi-node production deployments, move keys and counters into a shared database or Redis.