Given a prompt, the model returns one or more predicted completions, and can also return the probabilities of alternative tokens at each position. Creates a chat completion for the provided messages and parameters.
Request
Header Params
Content-Type
string
required
Example:
application/json
Accept
string
required
Example:
application/json
Authorization
string
optional
Example:
Bearer {{YOUR_API_KEY}}
Body Params application/json
model
string
required
The ID of the model to use. For more details on which models can be used with the Chat API, refer to the model endpoint compatibility table.
messages
array [object {2}]
required
A list of messages comprising the conversation so far; see the Python sketch below.
role
string
optional
content
string
optional
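A minimal Python sketch of the messages structure, using the openai client; the base URL is a placeholder for this service's endpoint (an assumption), and the model ID is taken from the example at the end of this page.
from openai import OpenAI

# Assumption: this service exposes an OpenAI-compatible endpoint; replace base_url with the real one.
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://your-gateway.example.com/v1")

# Each message is an object with a role ("system", "user", or "assistant") and its content.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20240620",
    messages=messages,
)
print(response.choices[0].message.content)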
temperature
number
optional
The sampling temperature to use, between 0 and 2. Higher values (e.g., 0.8) make the output more random, while lower values (e.g., 0.2) make it more focused and deterministic. We generally recommend altering this or top_p but not both.
top_p
number
optional
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
n
integer
optional
Default is 1. The number of chat completion choices to generate for each input message.
stream
boolean
optional
Default is false. If set, partial message deltas are sent, as in ChatGPT. Tokens are sent as data-only server-sent events (SSE) as they become available, and the stream terminates with a data: [DONE] message. See the Python sketch below.
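A hedged sketch of consuming the SSE stream with the openai Python client (the base URL is an assumed placeholder):
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://your-gateway.example.com/v1")  # assumed base URL

# With stream=True the client yields chunks as the server-sent events arrive.
stream = client.chat.completions.create(
    model="claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)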
stop
string
optional
Default is null. Up to 4 sequences where the API will stop generating further tokens.
max_tokens
integer
optional
Default is infinity. The maximum number of tokens to generate in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. See the token-counting sketch below.
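A rough token-budgeting sketch using tiktoken; note that the cl100k_base encoding is only an approximation of whatever tokenizer the serving model actually uses, and the context length below is an assumed placeholder.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # assumption: approximates the model's tokenizer

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
# Count prompt tokens (ignoring the small per-message formatting overhead).
prompt_tokens = sum(len(enc.encode(m["content"])) for m in messages)
context_length = 200_000  # assumption: substitute your model's real context length
print("room left for max_tokens:", context_length - prompt_tokens)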
presence_penalty
number
optional
A number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the likelihood of the model discussing new topics. See more on frequency and presence penalties.
frequency_penalty
number
optional
Default is 0. A number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text, reducing the likelihood of the model repeating the same line. See more on frequency and presence penalties.
logit_bias
object
optional
Default is null. Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID from the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model before sampling. The exact effect varies per model, but values between -1 and 1 should decrease or increase the likelihood of the associated token being selected, while values like -100 or 100 should ban or exclusively allow it. See the sketch below.
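A hedged sketch of biasing a single token. The token ID comes from tiktoken's cl100k_base encoding, which is an assumption and may not match the serving model's tokenizer, so treat this purely as an illustration of the request shape; the base URL is also a placeholder.
from openai import OpenAI
import tiktoken

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://your-gateway.example.com/v1")  # assumed base URL
enc = tiktoken.get_encoding("cl100k_base")  # assumption: real token IDs must come from the model's own tokenizer

banned_id = enc.encode(" hello")[0]  # ID of the first token in " hello"
response = client.chat.completions.create(
    model="claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "Greet me."}],
    logit_bias={str(banned_id): -100},  # -100 effectively bans the token
)
print(response.choices[0].message.content)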
user
string
optional
A unique identifier representing your end user, which can help OpenAI monitor and detect abuse. Learn more.
response_format
object
optional
An object specifying the format the model must output. Setting { "type": "json_object" } enables JSON mode, which ensures the model generates valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON through a system or user message. Without this, the model may generate an unending stream of whitespace until it hits the token limit, leading to increased latency and the appearance of a 'stuck' request. Also note that if finish_reason="length", the message content may be partially truncated, indicating the generation exceeded max_tokens or the conversation exceeded the maximum context length. See the Python sketch below.
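A minimal JSON-mode sketch, pairing response_format with an explicit instruction to emit JSON (base URL assumed):
import json
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://your-gateway.example.com/v1")  # assumed base URL

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20240620",
    messages=[
        {"role": "system", "content": "Reply only with a JSON object."},  # instruction required alongside JSON mode
        {"role": "user", "content": "Name three primary colors."},
    ],
    response_format={"type": "json_object"},
)
print(json.loads(response.choices[0].message.content))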
seed
integer
optional
This feature is experimental. If specified, our system will make a best effort to deterministically sample such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor backend changes.
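A sketch of best-effort reproducible sampling with seed, reading system_fingerprint to detect backend changes (base URL assumed):
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://your-gateway.example.com/v1")  # assumed base URL

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "Hello!"}],
    seed=12345,
    temperature=0,
)
# If system_fingerprint changes between runs, the backend changed and outputs may differ despite the same seed.
print(response.system_fingerprint, response.choices[0].message.content)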
tools
array [object]
optional
A list of tools the model can call. Currently, only functions are supported as tools. Use this feature to provide a list of functions the model can generate JSON inputs for.
tool_choice
object
optional
Controls which function (if any) the model calls. none means the model will not call a function and will generate a message instead. auto means the model can choose between generating a message and calling a function. You can force the model to call a specific function with {"type": "function", "function": {"name": "my_function"}}. none is the default when no functions are present; auto is the default when functions are present. See the Python sketch below.
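A function-calling sketch; get_weather is a hypothetical function used purely for illustration, and the base URL is an assumed placeholder.
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://your-gateway.example.com/v1")  # assumed base URL

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",  # or {"type": "function", "function": {"name": "get_weather"}} to force the call
)
print(response.choices[0].message.tool_calls)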
Example
{"model":"claude-3-5-sonnet-20240620","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello!"}],"stream":true}
{"id":"chatcmpl-123","object":"chat.completion","created":1677652288,"choices":[{"index":0,"message":{"role":"assistant","content":"\n\nHello there, how may I assist you today?"},"finish_reason":"stop"}],"usage":{"prompt_tokens":9,"completion_tokens":12,"total_tokens":21}}