Chat Completion API Documentation

Endpoint

POST /api/chat/completions

This endpoint generates chat completions using the specified model and conversation history. It supports both non-streaming and streaming responses.


Request Format

Body Parameters

The request body should be in JSON format with the following fields:

model (string): The model name to use (e.g., brogevity-mini).
messages (List<Dict>): The conversation history, including system, user, and assistant roles. Each entry must have a role and content field.
temperature (float, default: 1.0): Controls the randomness of the output. Lower values (e.g., 0.2) make output more deterministic.
top_p (float, default: 1.0): Nucleus sampling, an alternative to temperature. Only tokens within the top cumulative probability mass top_p are considered.
n (int, default: 1): Number of completions to generate.
stream (bool, default: false): If true, streams the response in chunks.
stop (string or List<string>, optional): Sequence(s) at which generation stops.
max_tokens (int, optional): Maximum number of tokens to generate in the completion.
presence_penalty (float, default: 0.0): Penalizes new tokens based on their presence in the text so far, encouraging new topics.
frequency_penalty (float, default: 0.0): Penalizes tokens based on their frequency in the text so far.
logit_bias (Dict<string, float>, optional): Modifies the likelihood of specific tokens appearing in the completion.
user (string, optional): A unique identifier for tracking the user making the request.
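As a minimal sketch, the parameters above can be assembled into a request body in Python. The build_chat_request helper below is hypothetical (not part of the API); it applies the documented defaults client-side for clarity and includes optional fields only when supplied.

```python
def build_chat_request(model, messages, **options):
    """Assemble a request body for POST /api/chat/completions.

    Hypothetical helper for illustration; the server applies these
    defaults itself if the fields are omitted.
    """
    payload = {
        "model": model,
        "messages": messages,
        # Defaults as documented in the parameter table above.
        "temperature": options.get("temperature", 1.0),
        "top_p": options.get("top_p", 1.0),
        "n": options.get("n", 1),
        "stream": options.get("stream", False),
        "presence_penalty": options.get("presence_penalty", 0.0),
        "frequency_penalty": options.get("frequency_penalty", 0.0),
    }
    # Optional fields have no documented default, so include them
    # only when the caller provides them.
    for key in ("stop", "max_tokens", "logit_bias", "user"):
        if key in options:
            payload[key] = options[key]
    return payload
```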

Example Request

{
    "model": "brogevity-mini",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What does Andrew Huberman think about sleep?"}
    ],
    "temperature": 0.7,
    "stream": false
}
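The example request above could be sent with Python's standard library. The base URL and bearer-token auth header below are placeholder assumptions, not documented behavior; substitute the values for your deployment.

```python
import json
import urllib.request

# Placeholder host; replace with your actual API base URL.
API_URL = "https://api.example.com/api/chat/completions"

def build_http_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Wrap the JSON body in a POST request with JSON and auth headers.

    The Authorization scheme here is an assumption for illustration.
    """
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

def send_chat_completion(payload: dict, api_key: str) -> dict:
    """POST the payload and decode the JSON response."""
    with urllib.request.urlopen(build_http_request(payload, api_key)) as resp:
        return json.load(resp)
```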


Response Format

Non-Streaming Response

When stream is set to false, the response is returned as a single JSON object:

id (string): Unique identifier for the completion.
object (string): The object type (e.g., chat.completion).
created (int): Unix timestamp (seconds) when the completion was created.
model (string): The model used for the completion.
choices (List<Dict>): The generated messages. Each choice includes an index, a message, and a finish_reason.
usage (Dict): Token usage statistics (prompt_tokens, completion_tokens, total_tokens).

Example Non-Streaming Response

{
    "id": "bro5e804520a51740f39bc321f0",
    "object": "chat.completion",
    "created": 1732906423,
    "model": "brogevity-mini",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Andrew Huberman emphasizes the critical role of sleep in cognitive and physical health..."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 5319,
        "completion_tokens": 492,
        "total_tokens": 5811
    }
}
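A non-streaming response is consumed by reading the first entry in choices; a short Python sketch using the example response above:

```python
import json

# The example response above, as it would arrive from the server.
raw = '''{
    "id": "bro5e804520a51740f39bc321f0",
    "object": "chat.completion",
    "created": 1732906423,
    "model": "brogevity-mini",
    "choices": [
        {"index": 0,
         "message": {"role": "assistant",
                     "content": "Andrew Huberman emphasizes the critical role of sleep in cognitive and physical health..."},
         "finish_reason": "stop"}
    ],
    "usage": {"prompt_tokens": 5319, "completion_tokens": 492, "total_tokens": 5811}
}'''

response = json.loads(raw)

# The generated text lives at choices[0].message.content.
answer = response["choices"][0]["message"]["content"]
finish = response["choices"][0]["finish_reason"]
tokens_used = response["usage"]["total_tokens"]
```

Checking finish_reason distinguishes a natural stop from a truncated completion (e.g., one cut off by max_tokens).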


Streaming Response