Gateway

Beta

Overview

Gateway provides OpenAI-compatible endpoints for chat completions, embeddings, and model listing. Point your existing OpenAI SDK, or Reactor’s client, at Gateway to route requests to OpenRouter, Amazon Bedrock, Azure Foundry, or any other OpenAI-compatible API.

A built-in model registry ships with sensible defaults. Define aliases such as reasoning/cheapest to let Gateway pick a model by strategy, for example lowest cost or first available. Gateway emits usage events to Analytics so token counts can be tracked per authenticated user.

Gateway is in beta: the core API is stable, but provider coverage, quota enforcement, and regional routing continue to expand.

Key features

OpenAI-compatible API — /ai/v1/chat/completions, /ai/v1/embeddings, /ai/v1/models.
Multi-provider dispatch — OpenRouter, Bedrock (SigV4), Azure Foundry, generic OpenAI-compatible APIs.
Model registry — Built-in defaults with per-project TOML overlays.
Routing aliases — Strategies: first, random, round_robin, cheapest, fastest.
Streaming — Server-Sent Events with data: {...} chunks and [DONE] terminator.
Usage metering — Token counts emitted per request for billing and analytics.
Extension seam — Cloud deployments can inject quota checks and regional routing without forking the OSS crate.

Quickstart

The examples below list models, send a chat completion, and stream a response, first with the CLI, then with the JavaScript client, then with curl.

reactor auth login user@example.com --password '...'
reactor ai models list
reactor ai test gpt-4o-mini --prompt "Explain Reactor.cloud in one sentence"
reactor ai chat claude-3-5-sonnet --prompt "Write a haiku about backends" --stream

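import { createClient } from '@reactor/client';

const reactor = createClient({ url: process.env.REACTOR_URL! });

// Placeholder credentials: REACTOR_EMAIL and REACTOR_PASSWORD are illustrative
// variable names, not ones Gateway defines. Substitute your own source.
const email = process.env.REACTOR_EMAIL!;
const password = process.env.REACTOR_PASSWORD!;
await reactor.auth.signInWithPassword({ email, password });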

// Non-streaming
const response = await reactor.ai.chatCompletion({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await reactor.ai.chatCompletionStream({
  model: 'reasoning/cheapest',
  messages: [{ role: 'user', content: 'Explain quantum computing' }],
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}

# List models
curl -s "$REACTOR_URL/ai/v1/models" \
  -H "Authorization: Bearer $REACTOR_TOKEN"

# Chat completion
curl -s -X POST "$REACTOR_URL/ai/v1/chat/completions" \
  -H "Authorization: Bearer $REACTOR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role":"user","content":"Hello!"}],
    "max_tokens": 256
  }'

# Streaming
curl -N -X POST "$REACTOR_URL/ai/v1/chat/completions" \
  -H "Authorization: Bearer $REACTOR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role":"user","content":"Hello!"}],
    "stream": true
  }'
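
The curl calls above translate directly to fetch. The following is a minimal sketch that assumes only what the curl examples show (a bearer token, a JSON body, and the /ai/v1/chat/completions path). The names buildChatRequest, ChatRequest, and BuiltRequest are hypothetical helpers for illustration, not part of any Reactor SDK.

```typescript
// Hypothetical helper types; field names mirror the curl request bodies above.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  max_tokens?: number;
  stream?: boolean;
}

interface BuiltRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Builds the same request the curl examples send, without performing any I/O.
function buildChatRequest(reactorUrl: string, token: string, req: ChatRequest): BuiltRequest {
  // Strip trailing slashes so the path never contains "//ai/v1".
  const base = reactorUrl.replace(/\/+$/, '');
  return {
    url: `${base}/ai/v1/chat/completions`,
    method: 'POST',
    headers: {
      Authorization: `Bearer ${token}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(req),
  };
}
```

To send it, pass the pieces to fetch: fetch(r.url, { method: r.method, headers: r.headers, body: r.body }), where r is the value returned by buildChatRequest.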

How-to guides

Use a routing alias

Aliases resolve to one or more models using a strategy defined in the registry.

# ai-models.toml (project overlay)
# Alias names contain "/", so the key must be quoted to be valid TOML.
[aliases."reasoning/cheapest"]
strategy = "cheapest"
targets = ["gpt-4o-mini", "claude-3-haiku"]

[aliases."reasoning/best"]
strategy = "first"
targets = ["gpt-4", "claude-3-5-sonnet"]

reactor ai aliases list
reactor ai chat reasoning/cheapest --prompt "Summarize this API design"

const response = await reactor.ai.chatCompletion({
  model: 'reasoning/cheapest',
  messages: [{ role: 'user', content: 'Draft a product announcement' }],
});

Generate embeddings

reactor ai embed text-embedding-3-small "Hello world"

const { data } = await reactor.ai.embed({
  model: 'text-embedding-3-small',
  input: ['Hello', 'World'],
});
console.log(data[0].embedding.length);

curl -s -X POST "$REACTOR_URL/ai/v1/embeddings" \
  -H "Authorization: Bearer $REACTOR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Hello world"
  }'

Drop-in OpenAI SDK replacement

Point the official OpenAI client at Gateway:

import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: `${process.env.REACTOR_URL}/ai/v1`,
  apiKey: process.env.REACTOR_TOKEN!, // JWT access token
});

const completion = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello!' }],
});

Configuration

[gateway]
# Optional registry overlay (merges with built-in defaults)
registry_overlay = "./ai-models.toml"
# Or fetch remotely:
# registry_url = "https://example.com/models.toml"

[gateway.providers.openrouter]
enabled = true
# api_key via OPENROUTER_API_KEY env

[gateway.providers.bedrock]
enabled = false
region = "us-east-1"
# AWS credentials via standard env vars

[gateway.providers.foundry]
enabled = false
endpoint = "https://my-resource.openai.azure.com"
# api_key via AZURE_FOUNDRY_API_KEY env

Environment variables:

Variable	Required	Description
`REACTOR_AI_BIND`	No	Bind address; default `0.0.0.0:8004` (may vary by deployment)
`REACTOR_AI_AUTH_URL`	Yes (microservices)	Identity service URL
`OPENROUTER_API_KEY`	If OpenRouter enabled	Provider API key
`AWS_ACCESS_KEY_ID`	If Bedrock enabled	AWS credentials
`AWS_SECRET_ACCESS_KEY`	If Bedrock enabled	AWS credentials
`AWS_REGION`	No	Default `us-east-1`
`AZURE_FOUNDRY_API_KEY`	If Foundry enabled	Azure API key
`AZURE_FOUNDRY_ENDPOINT`	If Foundry enabled	Azure resource URL

In a project’s reactor.toml, the section is named [ai]:

[ai]
registry_overlay = "./ai-models.toml"

Limits and quotas

Limit	Value	Notes
Authentication	Required	No anonymous inference
Streaming	SSE	OpenAI-compatible chunk format
Provider errors	Passed through	Mapped to `provider_error` with upstream detail
Quota enforcement	Cloud only	OSS ships no-op extensions
Credit ledger / billing	Cloud only	Consumes Analytics usage events
Prompt caching	Not in v0	Planned via reactor-cache integration
Function calling	Partial	Full tool support across providers in progress

Built-in alias strategies:

Strategy	Behavior
`first`	Use first available target
`random`	Random target per request
`round_robin`	Rotate across targets
`cheapest`	Lowest sum of `input_price_per_mtok` and `output_price_per_mtok`
`fastest`	Provider latency heuristic (beta)
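
To make the strategy table concrete, here is a rough sketch of how selection could work. It is not Gateway's actual implementation: the Target shape, the pickTarget function, the available flag, and the in-memory rotation counter are all assumptions for illustration. The fastest strategy is left out because its latency heuristic is not described here.

```typescript
type Strategy = 'first' | 'random' | 'round_robin' | 'cheapest';

// Hypothetical target record; the two price fields are named as in the table above.
interface Target {
  id: string;
  available: boolean;
  input_price_per_mtok: number;
  output_price_per_mtok: number;
}

// Per-alias rotation counters, kept in memory for this sketch only.
const rotation = new Map<string, number>();

function combinedPrice(t: Target): number {
  return t.input_price_per_mtok + t.output_price_per_mtok;
}

function pickTarget(
  alias: string,
  strategy: Strategy,
  targets: Target[],
  rand: () => number = Math.random, // injectable for deterministic tests
): Target {
  const candidates = targets.filter((t) => t.available);
  if (candidates.length === 0) {
    throw new Error(`no available target for alias ${alias}`);
  }
  switch (strategy) {
    case 'first':
      // First available target, in declaration order.
      return candidates[0]!;
    case 'random':
      return candidates[Math.floor(rand() * candidates.length)]!;
    case 'round_robin': {
      // Advance the counter on every call and wrap around the candidate list.
      const n = rotation.get(alias) ?? 0;
      rotation.set(alias, n + 1);
      return candidates[n % candidates.length]!;
    }
    case 'cheapest':
      // Lowest input + output price; ties keep the earlier target.
      return candidates.reduce((best, t) =>
        combinedPrice(t) < combinedPrice(best) ? t : best,
      );
  }
}
```

Filtering by availability first means every strategy skips targets that are down, which matches the "first available target" wording for first; whether the other strategies behave the same way in Gateway is an assumption.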

Usage event shape (sent to Analytics):

{
  "event": "ai.usage",
  "properties": {
    "model_id": "gpt-4o-mini",
    "user_id": "user_01HZ...",
    "tokens_in": 42,
    "tokens_out": 128
  }
}
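
As an illustration of consuming this shape, the sketch below types the event and sums token counts per user. The field names come from the JSON above; the UsageEvent and TokenTotals types and the totalsByUser function are hypothetical names, and how events are actually delivered by Analytics is not covered here.

```typescript
// Mirrors the usage event JSON shown above.
interface UsageEvent {
  event: 'ai.usage';
  properties: {
    model_id: string;
    user_id: string;
    tokens_in: number;
    tokens_out: number;
  };
}

interface TokenTotals {
  tokens_in: number;
  tokens_out: number;
}

// Sums input and output tokens for each user_id.
function totalsByUser(events: UsageEvent[]): Map<string, TokenTotals> {
  const totals = new Map<string, TokenTotals>();
  for (const { properties: p } of events) {
    const t = totals.get(p.user_id) ?? { tokens_in: 0, tokens_out: 0 };
    t.tokens_in += p.tokens_in;
    t.tokens_out += p.tokens_out;
    totals.set(p.user_id, t);
  }
  return totals;
}
```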

API and SDK links

HTTP base path: /ai/v1/
OpenAPI reference: Gateway API
JavaScript SDK: reactor.ai, @reactor/ai
Swift SDK: ReactorAI
CLI: reactor ai
Guide: AI chatbot backend

Method	Path	Description
`GET`	`/ai/v1/health`	Service health
`GET`	`/ai/v1/models`	List available models
`POST`	`/ai/v1/chat/completions`	Chat (streaming or not)
`POST`	`/ai/v1/embeddings`	Generate embeddings

Troubleshooting

`model_not_found` (404)

The model ID or alias is not in the registry. List models with GET /ai/v1/models or add an overlay in ai-models.toml.

`provider_error` (502)

The upstream provider rejected or failed the request. Check provider credentials, rate limits, and model availability in the provider’s dashboard. Error details may include the upstream message.

reactor ai doctor
reactor ai test gpt-4o-mini --prompt "ping"

`unauthorized` (401)

Missing or expired JWT. Refresh your access token via Identity before calling Gateway.

Streaming stalls or incomplete responses

Some proxies buffer Server-Sent Events, which delays or truncates streamed output. Connect directly to the API origin, or disable buffering for /ai/v1/chat/completions when the request sets stream: true. Make sure your client keeps reading until it receives the [DONE] line.
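
If you read the stream yourself rather than through an SDK, the client has to cope with chunks that split a line in two. The sketch below is one way to do that, assuming the OpenAI-compatible format described in this page (data: {...} lines and a [DONE] terminator). SseReader and StreamChunk are hypothetical names, not part of the Reactor client.

```typescript
// Minimal shape of a streamed chunk; only the fields this sketch reads.
interface StreamChunk {
  choices: { delta?: { content?: string } }[];
}

class SseReader {
  private buffer = '';
  done = false;

  // Feed raw text from the response body; returns content deltas
  // found in lines that are complete so far.
  push(text: string): string[] {
    const out: string[] = [];
    if (this.done) return out;
    this.buffer += text;
    const lines = this.buffer.split('\n');
    // The last element is an unfinished line (or ''); keep it for the next push.
    this.buffer = lines.pop() ?? '';
    for (const raw of lines) {
      const line = raw.trim(); // also drops a trailing \r
      if (!line.startsWith('data:')) continue; // skip blank lines and comments
      const payload = line.slice('data:'.length).trim();
      if (payload === '[DONE]') {
        this.done = true;
        break;
      }
      const chunk = JSON.parse(payload) as StreamChunk;
      const content = chunk.choices[0]?.delta?.content;
      if (content) out.push(content);
    }
    return out;
  }
}
```

With fetch, decode each piece of response.body using a TextDecoder in streaming mode and pass the text to push, stopping once done is true. If the connection closes while done is still false, the response was cut short.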

`quota_exceeded` (429, Cloud only)

Your organization’s credit balance or rate limit was hit. Check usage in Analytics or your Cloud dashboard. Self-hosted OSS deployments do not enforce quotas by default.
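
The error codes in this section can be handled in one place on the client. The sketch below maps each HTTP status listed above to its error code and a suggested action. The status-to-code pairs come from this page; the adviseOnStatus function and the retry choices are assumptions for illustration, not documented Gateway behavior.

```typescript
type GatewayErrorCode =
  | 'unauthorized'
  | 'model_not_found'
  | 'quota_exceeded'
  | 'provider_error'
  | 'unknown';

interface ErrorAdvice {
  code: GatewayErrorCode;
  retry: boolean; // assumption: whether an automatic retry is worth attempting
  action: string;
}

function adviseOnStatus(status: number): ErrorAdvice {
  switch (status) {
    case 401:
      return { code: 'unauthorized', retry: true, action: 'Refresh the access token via Identity, then retry once.' };
    case 404:
      return { code: 'model_not_found', retry: false, action: 'Check the model ID or alias against GET /ai/v1/models.' };
    case 429:
      return { code: 'quota_exceeded', retry: false, action: 'Check usage in Analytics or the Cloud dashboard.' };
    case 502:
      return { code: 'provider_error', retry: true, action: 'Retry with backoff; if it persists, check provider credentials and limits.' };
    default:
      return { code: 'unknown', retry: false, action: 'Inspect the response body for details.' };
  }
}
```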