Integration Guide
Document version: for gateway 4.1.0
Audience: engineering and platform teams integrating HiveTrace Gateway into a production environment.
This document describes the contract between the client application and the gateway: request and response formats, the error handling model, and streaming behavior. The guide is intended for practical implementation and operation of the integration.
1. Purpose and Place in the Architecture
HiveTrace Gateway is an OpenAI-compatible HTTP proxy placed between the client application and the LLM provider (OpenAI, Anthropic OpenAI-compatible endpoints, LiteLLM, vLLM, Ollama in OpenAI mode, and others). For each request, the gateway performs the following sequence:
- The gateway receives the request as a regular OpenAI call.
- The request is registered in HiveTrace before being sent to the LLM.
- The request is proxied to the upstream.
- The response is registered in HiveTrace after it is received.
With the appropriate policy, HiveTrace can reject a request or replace the model response with a predefined message. This is only possible in synchronous mode, when data is delivered to HiveTrace monitoring for analysis synchronously.

    ┌────────────┐     ┌──────────────────┐     ┌────────────┐
    │   Client   │────▶│    HiveTrace     │────▶│  Upstream  │
    │            │     │  Gateway :4100   │     │ (LiteLLM,  │
    │            │◀────│                  │◀────│  vLLM, …)  │
    └────────────┘     └────────┬─────────┘     └────────────┘
                                │
                                ▼
                        ┌──────────────┐
                        │  HiveTrace   │
                        │  Monitoring  │
                        └──────────────┘

1.1. OpenAI Compatibility
The API matches OpenAI Chat Completions, Embeddings, and Models: the request body format, response shape, SSE streaming, and error envelope are identical to OpenAI. The request body is not modified when forwarded to the upstream, with one exception: stream: true is forcibly rewritten to false when app_mode=sync; see section 6.3.2.
Additional values required for monitoring analytics (application_id, user_id, session_id) are passed to the gateway through HTTP headers. They are not forwarded to the upstream and do not affect the model response. See section 4 for the full list.
The X-HiveTrace-Api-Key header is required for all requests to the gateway. In the OpenAI SDK, it is set through default_headers; see the example in section 5.1.
1.2. OpenAPI Specification
The API contract is available in OpenAPI 3.1 format:
- Snapshot: openapi.json.
- On a running gateway instance: GET /openapi.json, Swagger UI at /docs, ReDoc at /redoc.
2. Endpoints
The gateway provides a single entry point (:4100 by default) and implements three OpenAI-compatible endpoints:
| Method | Path | Purpose | HiveTrace pipeline |
|---|---|---|---|
| POST | /v1/chat/completions | Chat completion. stream: true is supported for async monitoring mode. | Full (pre-call + post-call + error logging) |
| POST | /v1/embeddings | Embeddings. | Not applied - transparent proxying |
| GET | /v1/models | List of upstream models. | Not applied - transparent proxying |
The full HiveTrace policy check path is available only for /v1/chat/completions.
3. Authentication
The gateway uses a two-level authorization model:
3.1. Client -> Gateway
The client sends the X-HiveTrace-Api-Key: <gateway-key> header, which is the required access key for the gateway. Without it, any request to any endpoint (/v1/chat/completions, /v1/embeddings, /v1/models) is rejected with status 401 and error type invalid_request_error. The gateway uses the same key to authenticate to HiveTrace for outgoing calls, so changing the client key automatically changes the gateway identity in HiveTrace. The server-side token previously set through an environment variable is no longer used and has been removed.
3.2. Client -> Gateway -> Upstream
The client sends the Authorization: Bearer <key> header. The gateway does not validate or store this token; it is passed to the upstream unchanged. This means:
- <key> must be a valid upstream key, such as an OpenAI API key for direct calls to OpenAI or a LiteLLM key.
3.3. Applications
The application is identified by the X-Application-Id header; see section 4. This allows one gateway instance to serve multiple applications with different HiveTrace policies without changing infrastructure configuration.
4. Request Headers
The “Source” column indicates whether the header belongs to the OpenAI standard (proxied to the upstream without gateway interpretation) or is a gateway extension (processed by the gateway and not forwarded to the upstream).
| Header | Source | Required | Purpose |
|---|---|---|---|
| Authorization: Bearer <key> | OpenAI | for upstream | Upstream key. Passed to the upstream without modification or validation. |
| Content-Type: application/json | OpenAI | for non-stream requests | Standard for the OpenAI API. |
| Accept: text/event-stream | OpenAI | for stream requests | Standard for OpenAI streaming. |
| X-HiveTrace-Api-Key | Gateway | yes | Gateway access key. Also used as the Bearer token for outgoing gateway -> HiveTrace calls. Missing or empty value -> 401. |
| X-Application-Id | Gateway | no | Application UUID in HiveTrace. Determines the per-app policy. If missing, falls back to the HIVETRACE_APPLICATION_ID env variable. |
| X-User-Id | Gateway | no | End-user identifier for audit. |
| X-Session-Id | Gateway | no | User session identifier. |
| X-Attached-Files | Gateway | no | JSON array of attachment descriptors for a separate audit copy in HiveTrace. Format: section 5.3.2. |
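
The header table above can be sketched as a small client-side helper. This is a minimal illustration, not part of the gateway: the function name build_gateway_headers and its parameters are hypothetical, and only the header names come from the table.

```python
from typing import Dict, Optional


def build_gateway_headers(
    upstream_key: str,
    gateway_key: str,
    application_id: Optional[str] = None,
    user_id: Optional[str] = None,
    session_id: Optional[str] = None,
    stream: bool = False,
) -> Dict[str, str]:
    """Assemble request headers per the table above (hypothetical helper)."""
    if not gateway_key:
        # The gateway answers 401 when X-HiveTrace-Api-Key is missing or empty,
        # so fail early on the client side.
        raise ValueError("gateway_key must be a non-empty string")

    headers = {
        "Authorization": f"Bearer {upstream_key}",  # passed to the upstream as-is
        "X-HiveTrace-Api-Key": gateway_key,
        "Content-Type": "application/json",
    }
    if stream:
        headers["Accept"] = "text/event-stream"

    # Optional gateway extensions are added only when a value is provided.
    optional = {
        "X-Application-Id": application_id,
        "X-User-Id": user_id,
        "X-Session-Id": session_id,
    }
    headers.update({name: value for name, value in optional.items() if value})
    return headers
```

The resulting dictionary can be passed to any HTTP client, or split so that the gateway-specific entries go into the default headers of an OpenAI-compatible SDK.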
5. Request Format
The request body matches the OpenAI Chat Completions / Embeddings payload. There are no gateway-specific fields in the body.
5.1. HTTP Examples
Minimal request:

    curl -X POST http://gateway:4100/v1/chat/completions \
      -H "Authorization: Bearer <upstream-key>" \
      -H "X-HiveTrace-Api-Key: <gateway-key>" \
      -H "Content-Type: application/json" \
      -H "X-Application-Id: 11111111-2222-3333-4444-555555555555" \
      -d '{
        "model": "gpt-5",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'

With user and session identification:

    curl -X POST http://gateway:4100/v1/chat/completions \
      -H "Authorization: Bearer <upstream-key>" \
      -H "X-HiveTrace-Api-Key: <gateway-key>" \
      -H "Content-Type: application/json" \
      -H "X-Application-Id: 11111111-…" \
      -H "X-User-Id: alice@company.com" \
      -H "X-Session-Id: session-2025-04-28-abc" \
      -d '{
        "model": "gpt-5",
        "messages": [...]
      }'

Streaming:

    curl -X POST http://gateway:4100/v1/chat/completions \
      -H "Authorization: Bearer <upstream-key>" \
      -H "X-HiveTrace-Api-Key: <gateway-key>" \
      -H "Content-Type: application/json" \
      -H "Accept: text/event-stream" \
      -d '{
        "model": "gpt-5",
        "stream": true,
        "messages": [...]
      }'

5.3. File Transfer
File transfer is supported through two strategies. They solve different tasks and can be used separately or together.
| Strategy | File reaches the LLM | File reaches HiveTrace (audit) | OpenAI client changes |
|---|---|---|---|
| 5.3.1. OpenAI standard (multimodal) | Yes | No | Not required |
| 5.3.2. X-Attached-Files header | No | Yes | One additional header |
| 5.3.3. Both strategies in a single request | Yes | Yes | One additional header |
5.3.1. Strategy 1 - OpenAI Standard (Multimodal messages[].content)
Files are passed using the standard OpenAI method: an array of parts in messages[].content. The gateway does not interpret these parts and forwards the request body to the upstream unchanged.

    curl -X POST http://gateway:4100/v1/chat/completions \
      -H "Authorization: Bearer <upstream-key>" \
      -H "X-HiveTrace-Api-Key: <gateway-key>" \
      -H "Content-Type: application/json" \
      -H "X-Application-Id: <APPLICATION_ID>" \
      -d '{
        "model": "gpt-5",
        "messages": [{
          "role": "user",
          "content": [
            {"type": "text", "text": "What is in the image?"},
            {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBORw0KG..."}}
          ]
        }]
      }'

Behavior:
- LLM. Receives the request in the standard OpenAI multimodal format. Processing of image_url, file, and other non-text parts depends on the model and upstream.
- HiveTrace. Only the text part (type: "text") is written to the user_prompt record. The contents of image_url, file, and other non-text parts are not stored in HiveTrace.
- Client changes. Not required. Any OpenAI client or SDK with multimodal support works without changes.
Use this when the file must be processed by the model without saving an audit copy of the file in HiveTrace.
5.3.2. Strategy 2 - X-Attached-Files Extension
Files are passed in the X-Attached-Files header as a JSON array of descriptors. The gateway does not forward this header to the upstream; instead, it downloads or decodes the files and attaches them to the user_prompt audit record in HiveTrace.
Transfer through base64 (for local files):

    B64=$(base64 -i ./test111.txt | tr -d '\n')
    curl -X POST http://gateway:4100/v1/chat/completions \
      -H "Authorization: Bearer <upstream-key>" \
      -H "X-HiveTrace-Api-Key: <gateway-key>" \
      -H "Content-Type: application/json" \
      -H "X-Application-Id: <APPLICATION_ID>" \
      -H "X-User-Id: alice@company.com" \
      -H "X-Attached-Files: [{\"name\":\"test111.txt\",\"content_base64\":\"$B64\",\"type\":\"text/plain\"}]" \
      -d '{
        "model": "gpt-5",
        "messages": [{"role":"user","content":"User message"}]
      }'

Transfer through URL (the gateway downloads the file over HTTP/HTTPS):

    curl -X POST http://gateway:4100/v1/chat/completions \
      -H "Authorization: Bearer <upstream-key>" \
      -H "X-HiveTrace-Api-Key: <gateway-key>" \
      -H "Content-Type: application/json" \
      -H "X-Application-Id: <APPLICATION_ID>" \
      -H "X-User-Id: alice@company.com" \
      -H 'X-Attached-Files: [{"url":"https://files.company.example/contract.pdf","name":"contract.pdf","type":"application/pdf"}]' \
      -d '{
        "model": "gpt-5",
        "messages": [{"role":"user","content":"User message"}]
      }'

Descriptor format (X-Attached-Files is a JSON array; each item contains either url or content_base64):
| Field | Type | Description |
|---|---|---|
| url | string | HTTP/HTTPS URL of the file. data: URIs with inline base64 are also supported. |
| content_base64 | string | Base64-encoded file content. Alternative to url. |
| name | string | File name. Used when saving to HiveTrace; inferred from the URL if missing. |
| type | string | File MIME type. If missing, inferred from the extension or from the download response Content-Type. |
Header alias: X-AttachedFiles.
Behavior:
- LLM. The request body is sent to the upstream unchanged; X-Attached-Files is not forwarded to the upstream. If there is no inline file data in messages[].content, the model does not receive the file.
- HiveTrace. An audit copy of the file is additionally attached to the user_prompt record and linked to the request analysis_id.
- Client changes. One additional header. The request body remains a standard OpenAI payload.
Additional capabilities compared with strategy 1:
- An audit copy of the file is saved in HiveTrace regardless of whether the file is passed to the model.
- URL transfer is supported: the gateway downloads the file itself, so the client does not need to put the file content into the request payload.
- The full file content reaches HiveTrace, not only the text part of the request.
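
For clients that do not shell out to curl, the descriptor format from this section can be assembled programmatically. The sketch below is illustrative only: the helper names file_descriptor, url_descriptor, and attached_files_header are hypothetical, while the descriptor fields (url, content_base64, name, type) are the ones listed in the table above.

```python
import base64
import json
from typing import Dict, List, Optional


def file_descriptor(name: str, data: bytes, mime_type: Optional[str] = None) -> Dict[str, str]:
    """Descriptor carrying the file inline as base64 (hypothetical helper)."""
    descriptor = {
        "name": name,
        "content_base64": base64.b64encode(data).decode("ascii"),
    }
    if mime_type:
        descriptor["type"] = mime_type
    return descriptor


def url_descriptor(url: str, name: Optional[str] = None, mime_type: Optional[str] = None) -> Dict[str, str]:
    """Descriptor that asks the gateway to download the file itself."""
    descriptor = {"url": url}
    if name:
        descriptor["name"] = name
    if mime_type:
        descriptor["type"] = mime_type
    return descriptor


def attached_files_header(descriptors: List[Dict[str, str]]) -> str:
    """Serialize descriptors into a single-line value for X-Attached-Files."""
    # json.dumps escapes non-ASCII characters by default, which keeps the
    # value safe to place in an HTTP header; compact separators keep it short.
    return json.dumps(descriptors, separators=(",", ":"))
```

Usage: headers["X-Attached-Files"] = attached_files_header([file_descriptor("test111.txt", b"...", "text/plain")]). Large inline files can exceed header size limits of intermediate proxies, which is one reason to prefer the URL form where it is available.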
5.3.3. Combined Scenario
If you need both (a) to pass the file to the LLM and (b) to save its audit copy in HiveTrace, use both strategies in a single request. messages[].content delivers the file to the model, and X-Attached-Files provides the audit copy in HiveTrace.

    B64=$(base64 -i ./contract.pdf | tr -d '\n')
    curl -X POST http://gateway:4100/v1/chat/completions \
      -H "Authorization: Bearer <upstream-key>" \
      -H "X-HiveTrace-Api-Key: <gateway-key>" \
      -H "Content-Type: application/json" \
      -H "X-Application-Id: <APPLICATION_ID>" \
      -H "X-User-Id: alice@company.com" \
      -H "X-Attached-Files: [{\"name\":\"contract.pdf\",\"content_base64\":\"$B64\",\"type\":\"application/pdf\"}]" \
      -d "{
        \"model\": \"gpt-5\",
        \"messages\": [{
          \"role\": \"user\",
          \"content\": [
            {\"type\": \"text\", \"text\": \"Summarize the document.\"},
            {\"type\": \"file\", \"file\": {\"filename\": \"contract.pdf\", \"file_data\": \"data:application/pdf;base64,$B64\"}}
          ]
        }]
      }"

In this case:
- messages[].content is passed to the upstream unchanged; the model receives the file through the standard OpenAI protocol.
- X-Attached-Files is processed by the gateway in parallel: the file is decoded and attached to user_prompt in HiveTrace.
- The model response is registered in HiveTrace through the standard post-call request.
Use this for compliance scenarios where both model-side file processing and audit of the original content in HiveTrace are required.
5.3.4. Attachment Error Handling
A failure to download or decode a file (HTTP >=400, timeout, size exceeding HIVETRACE_FILES_MAX_BYTES, base64 decoding error) does not cause a client-facing failure. The gateway isolates such failures:
- The problematic file is not sent to HiveTrace. The gateway logs a warning with diagnostics (Attachment download failed: url=… status=404, and so on).
- Other files in X-Attached-Files are processed independently.
- The request body is passed to the upstream unchanged.
- The client receives a normal model response with the HTTP status returned by the upstream.
Attachment Limits
| Parameter | Default | ENV |
|---|---|---|
| Maximum size of one file | 20 MiB | HIVETRACE_FILES_MAX_BYTES |
| Parallel downloads | 4 | HIVETRACE_FILES_MAX_CONCURRENCY |
| Download timeout for one file | 60 s | HIVETRACE_FILES_TIMEOUT |
A file that exceeds the limit is excluded from processing and a warning is written to the log; other attachments continue to be processed.
5.4. Request Body Size Limits
The gateway does not impose its own request body size limit. Size control is handled by the upstream.
6. Response Format
6.1. Successful Non-Streaming Response
The response body and HTTP status fully match the upstream response. Content type: application/json. The structure matches the standard OpenAI format:

    {
      "id": "chatcmpl-…",
      "object": "chat.completion",
      "created": 1234567890,
      "model": "gpt-5",
      "choices": [
        {
          "index": 0,
          "message": { "role": "assistant", "content": "..." },
          "finish_reason": "stop"
        }
      ],
      "usage": { "prompt_tokens": 23, "completion_tokens": 45, "total_tokens": 68 }
    }

6.2. Successful Streaming Response
Content-Type: text/event-stream. The response body is a sequence of SSE frames:

    data: {"id":"…","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hel"}}]}
    data: {"id":"…","object":"chat.completion.chunk","choices":[{"delta":{"content":"lo!"},"finish_reason":"stop"}]}
    data: [DONE]

Comment frames may be sent between regular data: frames to keep the connection alive:

    : keepalive

A streaming response can be delivered to the user with monitoring enabled only in async mode.
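
A client that reads the stream by hand has to skip the keepalive comment frames and stop at the [DONE] sentinel. The following is a minimal sketch under stated assumptions: it expects an iterable of already-decoded text lines, assumes each data: frame fits on a single line (as in the frames shown above), and uses the hypothetical helper names iter_sse_chunks and collect_text.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed chat.completion.chunk objects from SSE text lines."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith(":"):
            # Blank separators and comment frames such as ": keepalive"
            # carry no payload.
            continue
        if not line.startswith("data:"):
            continue  # other SSE fields are ignored in this sketch
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate delta.content pieces into the full assistant message."""
    parts = []
    for chunk in iter_sse_chunks(lines):
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

In practice the lines would come from the line iterator of an HTTP client's streaming response; OpenAI-compatible SDKs perform equivalent parsing internally.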
6.3. Response Replacement by HiveTrace Policy
This is possible only when app_mode=sync is set in the per-app policy (Redis). If app_mode=async is set or the policy is not configured, the behavior described in this section does not apply to the request.
6.3.1. Pre-Call Block (HiveTrace Rejected Before the LLM Call)
The request was not sent to the upstream. The gateway returns a synthesized chat.completion with a diagnostic metadata.hivetrace block:

    {
      "id": "hivetrace-guardrails-<uuid>",
      "object": "chat.completion",
      "created": 1717171717,
      "model": "<requested or 'unknown'>",
      "choices": [
        {
          "index": 0,
          "message": { "role": "assistant", "content": "<phrase from the application policy>" },
          "finish_reason": "stop"
        }
      ],
      "usage": { "prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0 },
      "metadata": {
        "hivetrace": {
          "response_replaced": true,
          "response_replacement": {
            "stage": "pre_call",
            "type": "<guardrails|custom_policy|dataclean>",
            "request_id": "<metadata.request_id from the request body or a generated UUID>"
          },
          "prepared_response": { "...": "verdict context from HiveTrace" }
        }
      }
    }

The client can detect replacement by checking for metadata.hivetrace.response_replaced == true or by the id prefix (hivetrace-…).
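
The two detection rules can be combined in one client-side check. This is a sketch, assuming the response has already been parsed into a dictionary; the function names is_response_replaced and replacement_info are hypothetical.

```python
from typing import Any, Mapping, Optional


def is_response_replaced(completion: Mapping[str, Any]) -> bool:
    """Return True when the completion was synthesized by HiveTrace policy."""
    metadata = completion.get("metadata") or {}
    hivetrace = metadata.get("hivetrace") or {}
    if hivetrace.get("response_replaced") is True:
        return True
    # Fallback: synthesized responses carry an id with the "hivetrace-" prefix.
    return str(completion.get("id", "")).startswith("hivetrace-")


def replacement_info(completion: Mapping[str, Any]) -> Optional[Mapping[str, Any]]:
    """Return the response_replacement block (stage, type, request_id), if any."""
    metadata = completion.get("metadata") or {}
    hivetrace = metadata.get("hivetrace") or {}
    return hivetrace.get("response_replacement")
```

Checking the metadata flag first and the id prefix second keeps the check working for clients or SDKs that drop unknown top-level fields such as metadata.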
6.3.2. Streaming + Sync Mode
In sync mode, the gateway forcibly disables stream: true by replacing it with stream: false before calling HiveTrace. The reason is that replacing an SSE response after transmission to the client has started is technically impossible. Therefore, in sync mode, the client always receives a non-streaming response, even if body.stream was set to true.
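
Because a request with stream: true may come back as a regular JSON body in sync mode, a client can branch on the response Content-Type instead of on what it requested. A minimal sketch, with the hypothetical helper name response_kind:

```python
def response_kind(content_type: str) -> str:
    """Classify a response by its Content-Type header value.

    Returns "stream" for SSE, "json" for a non-streaming body,
    and "unknown" for anything else.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/event-stream":
        return "stream"
    if media_type == "application/json":
        return "json"
    return "unknown"
```

A caller would route "stream" responses to its SSE reader and "json" responses to its regular completion handler, regardless of the stream flag in the request body.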
7. Errors
All errors returned to the client have the following shape:

    {"error": {"message": "...", "type": "...", "code": <int>}}

If the upstream already returns an OpenAI-formatted error ({"error": {...}}), the gateway passes it through without an additional wrapper, preserving the provider’s original fields (type, param, code).
The type field is used to distinguish error sources:
- invalid_request_error - a client error rejected by the gateway or upstream.
- upstream_error - the upstream returned >=400, and the response body did not match the OpenAI-compatible format.
- gateway_error - the gateway could not successfully call the upstream. Always code: 502.
- rate_limit_error, authentication_error, and others - native values from OpenAI-compatible upstreams, passed through unchanged.
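
The list above maps naturally onto a small classifier on the client side. The sketch below is illustrative: the function name classify_error and the returned labels are hypothetical; only the error.type values come from this section.

```python
from typing import Any


def classify_error(body: Any) -> str:
    """Map a parsed error body to a coarse error source label."""
    error = body.get("error") if isinstance(body, dict) else None
    if not isinstance(error, dict):
        return "not_an_error"

    error_type = error.get("type")
    if error_type == "gateway_error":
        # The gateway could not reach the upstream (always code 502).
        return "gateway"
    if error_type == "upstream_error":
        # Upstream failure whose body was not in the OpenAI-compatible format.
        return "upstream_wrapped"
    if error_type == "invalid_request_error":
        # Client-side problem, rejected by the gateway or by the upstream.
        return "client"
    # rate_limit_error, authentication_error, and other native upstream values.
    return "upstream_passthrough"
```

Keeping the passthrough case as the default means new upstream error types are still treated as upstream errors without a code change.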
7.1. Complete Table
| Status | Source | Condition | Type in error.type |
|---|---|---|---|
| 400 | gateway | Invalid JSON in the request body | invalid_request_error |
| 400 | upstream | Business validation (missing messages, exceeded context window, and so on) | invalid_request_error (passthrough) |
| 401 | gateway | Missing or empty X-HiveTrace-Api-Key header | invalid_request_error |
| 401 | upstream | Authorization: Bearer rejected by the upstream | invalid_request_error (passthrough) |
| 403 | upstream | Authorization succeeded, but the action is forbidden (for example, organization without quota) | passthrough |
| 404 | upstream | The specified model does not exist | invalid_request_error (passthrough) |
| 413 | upstream / proxy | Body is too large | upstream_error |
| 422 | upstream | Payload failed validation (strict-mode schema) | passthrough |
| 429 | upstream | Rate limit | rate_limit_error (passthrough) |
| 500 | upstream | Internal upstream error | upstream_error |
| 502 | gateway | Network error between the gateway and upstream (DNS, refused, TLS). Also httpx.ReadTimeout before the first byte on non-streaming requests. | gateway_error |
| 503 | upstream | Upstream unavailable (maintenance, overload) | upstream_error |
| 504 | - (SSE error frames only) | Mid-stream httpx.TimeoutException | upstream_error |
8. Timeouts and Limits
8.1. Layer Timeouts
| Layer | Parameter | Default | Controlled by |
|---|---|---|---|
| Gateway -> upstream (single HTTP call) | GATEWAY_UPSTREAM_TIMEOUT | 120 s | gateway |
| Gateway -> HiveTrace (/process_request/, /process_response/) | HIVETRACE_TIMEOUT | 60 s | gateway |
| Gateway -> attachment URL (download) | HIVETRACE_FILES_TIMEOUT | 60 s | gateway |
| Gateway -> client: SSE keepalive | GATEWAY_STREAM_HEARTBEAT_SECONDS | 60 s | gateway |
9. Configuration
9.1. Minimum Environment Variables

    # Upstream
    UPSTREAM_URL=http://litellm:4000

    # HiveTrace
    HIVETRACE_URL=https://hivetrace.example.com/api
    HIVETRACE_APPLICATION_ID=<default app uuid>

    # Redis (for per-app policies; optional if blocking is not required)
    REDIS_HOST=redis
    REDIS_PORT=6379
    REDIS_DB=0
    REDIS_USER=""
    REDIS_PASSWORD="admin"
    REDIS_SSL=False

9.2. Behavior When HiveTrace Is Missing
If HIVETRACE_URL is not set, the gateway forwards requests to the upstream without sending telemetry to HiveTrace. When the module is imported, a warning is written to stderr: HiveTrace is effectively disabled.
The X-HiveTrace-Api-Key contract is preserved: the key remains required for client requests, although it is effectively unused when there is no HiveTrace connection. This mode is intended for debugging and resilient handling of configuration errors; in deployments, HIVETRACE_URL must be set explicitly.
9.3. Behavior When Redis Is Missing
If Redis is not configured, per-app blocking policies are unavailable; pre-call and post-call telemetry is sent to HiveTrace in async mode by default.
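
A deployment script can surface the degraded modes from sections 9.2 and 9.3 before the gateway starts. The sketch below is not gateway code: the function name describe_degraded_modes is hypothetical, and treating an unset REDIS_HOST as "Redis is not configured" is an assumption made for illustration.

```python
from typing import List, Mapping


def describe_degraded_modes(env: Mapping[str, str]) -> List[str]:
    """List the reduced-functionality modes implied by missing settings."""
    notes = []
    if not env.get("HIVETRACE_URL"):
        # Section 9.2: requests are proxied without HiveTrace telemetry.
        notes.append("HIVETRACE_URL is not set: HiveTrace is effectively disabled")
    if not env.get("REDIS_HOST"):
        # Section 9.3 (assumption: an unset REDIS_HOST means Redis is not configured).
        notes.append("REDIS_HOST is not set: per-app blocking policies are unavailable")
    return notes
```

Calling describe_degraded_modes(os.environ) in a pre-deploy check and failing on a non-empty result is one way to enforce that HIVETRACE_URL is set explicitly in deployments.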