POST /v1/responses
curl --request POST \
  --url https://api.minimax.io/v1/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "MiniMax-M3",
  "input": "Hello!"
}
'
{
  "id": "abc123",
  "object": "response",
  "created_at": 1764000000,
  "model": "MiniMax-M3",
  "status": "completed",
  "output": [
    {
      "id": "abc123_msg",
      "type": "message",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Hello! I'm MiniMax. How can I help you today?",
          "annotations": []
        }
      ]
    }
  ],
  "output_text": "Hello! I'm MiniMax. How can I help you today?",
  "usage": {
    "input_tokens": 8,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens": 14,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 22
  },
  "parallel_tool_calls": true,
  "store": false,
  "truncation": "disabled"
}
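As a minimal sketch of reading the example response above, the snippet below walks the `output` list, joins every `output_text` part, and compares the result with the `output_text` convenience field. The helper name `collect_text` is hypothetical, and the sample body is trimmed to the fields used here.

```python
import json

# Sample body taken from the example response above, trimmed to the fields used here.
SAMPLE_RESPONSE = """
{
  "id": "abc123",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "id": "abc123_msg",
      "type": "message",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Hello! I'm MiniMax. How can I help you today?",
          "annotations": []
        }
      ]
    }
  ],
  "output_text": "Hello! I'm MiniMax. How can I help you today?",
  "usage": {"input_tokens": 8, "output_tokens": 14, "total_tokens": 22}
}
"""


def collect_text(response: dict) -> str:
    """Concatenate every output_text part found in message items (hypothetical helper)."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip reasoning and function-call items
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)


response = json.loads(SAMPLE_RESPONSE)
text = collect_text(response)
usage = response["usage"]
```

In the example, `total_tokens` equals `input_tokens` plus `output_tokens`; whether that holds for every response is not stated here.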

Reasoning Control

For MiniMax-M3, the reasoning field controls whether the response can include reasoning output.
  • If reasoning is omitted, reasoning is disabled by default and the response does not include an output item with type: "reasoning".
  • reasoning: {"effort": "none"} is the default behavior and disables reasoning output for MiniMax-M3.
  • Values minimal, low, medium, and high are accepted for compatibility and enable reasoning output, but they do not tune MiniMax-M3’s reasoning depth.
  • For M2.x models, reasoning cannot be disabled; reasoning: {"effort": "none"} is accepted but reasoning remains on.
Request with reasoning omitted (reasoning disabled by default):
{
  "model": "MiniMax-M3",
  "input": "Which is larger, 9.11 or 9.9?"
}
Request with reasoning enabled:
{
  "model": "MiniMax-M3",
  "input": "Which is larger, 9.11 or 9.9?",
  "reasoning": {
    "effort": "minimal"
  }
}
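The two request bodies can be produced by one small builder. This is a sketch under stated assumptions: `build_payload` and `has_reasoning_item` are hypothetical helpers, and the only thing assumed about a reasoning output item is that its `type` is `"reasoning"`, as the list above describes.

```python
from typing import Optional

# Effort values named in the Reasoning Control section.
ALLOWED_EFFORTS = {"none", "minimal", "low", "medium", "high"}


def build_payload(model: str, user_input: str, effort: Optional[str] = None) -> dict:
    """Build a request body; omit `reasoning` entirely when no effort is given."""
    payload = {"model": model, "input": user_input}
    if effort is not None:
        if effort not in ALLOWED_EFFORTS:
            raise ValueError(f"unsupported effort: {effort!r}")
        payload["reasoning"] = {"effort": effort}
    return payload


def has_reasoning_item(response: dict) -> bool:
    """Return True if any output item has type "reasoning"."""
    return any(item.get("type") == "reasoning" for item in response.get("output", []))


default_body = build_payload("MiniMax-M3", "Which is larger, 9.11 or 9.9?")
reasoning_body = build_payload("MiniMax-M3", "Which is larger, 9.11 or 9.9?", "minimal")
```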

Authorizations

Authorization
string
header
required

HTTP: Bearer Auth

  • Security Scheme Type: http
  • HTTP Authorization Scheme: Bearer. Pass your API key as the token to authenticate your account. You can view the key under Account Management > API Keys

Headers

Content-Type
enum<string>
default:application/json
required

Media type of the request body. Must be set to application/json

Available options: application/json
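The authorization and header requirements can be sketched with the Python standard library. The snippet only builds the request object and does not send it. The environment variable name `MINIMAX_API_KEY` and the helper `build_request` are assumptions made for illustration.

```python
import json
import os
import urllib.request

API_URL = "https://api.minimax.io/v1/responses"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request with the Bearer token and the JSON content type."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# MINIMAX_API_KEY is a hypothetical variable name; "<token>" is a placeholder.
api_key = os.environ.get("MINIMAX_API_KEY", "<token>")
req = build_request(api_key, {"model": "MiniMax-M3", "input": "Hello!"})

# To actually send the request:
#     with urllib.request.urlopen(req) as resp:
#         body = json.loads(resp.read())
```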

Body

application/json
model
string
required

Model name to invoke, e.g. MiniMax-M3

Example:

"MiniMax-M3"

input
required

Conversation content. Accepts either a plain text string or an array containing the full conversation history
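The two accepted shapes of `input` can be illustrated as follows. The item shape used for the array form (`role` and `content` keys) is an assumption, since this page does not list the array item schema. The helper `make_input` is hypothetical.

```python
def make_input(turns):
    """Return a plain string for a single user turn, otherwise a history array.

    `turns` is a list of (role, text) tuples. The {"role", "content"} item
    shape is an assumption, not confirmed by this page.
    """
    if len(turns) == 1 and turns[0][0] == "user":
        return turns[0][1]
    return [{"role": role, "content": text} for role, text in turns]


simple = make_input([("user", "Hello!")])
history = make_input([
    ("user", "Hello!"),
    ("assistant", "Hello! How can I help you today?"),
    ("user", "Which is larger, 9.11 or 9.9?"),
])
```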

service_tier
enum<string>
default:standard

Service tier for request admission. Supported values are standard and priority; if omitted, the request uses the standard tier. The priority tier costs 1.5 times the standard price and gives the request priority admission, so it is processed ahead of other requests, which leads to faster responses and fewer failures.

Available options: standard, priority

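The pricing rule is simple arithmetic: a request that would cost some amount on the standard tier costs 1.5 times that on the priority tier. The sketch below assumes a hypothetical helper `tier_cost` and takes the standard-tier cost as an input, because actual prices are not given here.

```python
PRIORITY_MULTIPLIER = 1.5  # stated ratio of priority price to standard price
SERVICE_TIERS = ("standard", "priority")


def tier_cost(standard_cost: float, service_tier: str = "standard") -> float:
    """Scale a standard-tier cost by the multiplier of the chosen tier."""
    if service_tier not in SERVICE_TIERS:
        raise ValueError(f"unsupported service_tier: {service_tier!r}")
    if service_tier == "priority":
        return standard_cost * PRIORITY_MULTIPLIER
    return standard_cost


standard = tier_cost(2.0)              # 2.0
priority = tier_cost(2.0, "priority")  # 3.0
```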
instructions
string

System instructions

max_output_tokens
integer

Maximum output token count

temperature
number<float>
default:1

Sampling temperature, range (0, 1]

Required range: 0 <= x <= 1
top_p
number<float>
default:0.95

Nucleus sampling, range (0, 1]

Required range: 0 <= x <= 1
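A client-side check for `temperature` and `top_p` might look like the sketch below. The field descriptions give the range (0, 1] while the required-range lines give 0 <= x <= 1, so this sketch accepts only values in (0, 1], which satisfy both. The helper names are hypothetical.

```python
def check_unit_interval(name: str, value: float) -> float:
    """Accept values in (0, 1], which satisfy both ranges stated on this page."""
    if not (0 < value <= 1):
        raise ValueError(f"{name} must be in (0, 1], got {value!r}")
    return float(value)


def sampling_params(temperature: float = 1.0, top_p: float = 0.95) -> dict:
    """Return validated sampling fields, using the documented defaults."""
    return {
        "temperature": check_unit_interval("temperature", temperature),
        "top_p": check_unit_interval("top_p", top_p),
    }


params = sampling_params()
```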
stream
boolean
default:false

Set to true to enable SSE streaming response
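When `stream` is true, the body arrives as server-sent events. The sketch below is a generic SSE `data:` line parser, not a description of this API's event schema. The event payloads in the sample and the `[DONE]` sentinel are assumptions made for illustration only.

```python
import itertools
import json


def iter_sse_data(lines):
    """Yield JSON-decoded payloads from SSE `data:` lines.

    Events are separated by blank lines. Lines with other field names
    (for example `event:`) are ignored. Skipping a `[DONE]` sentinel is an
    assumption, not something this page documents.
    """
    buffer = []
    # The trailing "" flushes a final event that lacks a closing blank line.
    for raw in itertools.chain(lines, [""]):
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            data = "\n".join(buffer)
            buffer = []
            if data != "[DONE]":
                yield json.loads(data)


# Hypothetical stream; the "demo.delta" type and "delta" key are made up.
sample_lines = [
    'data: {"type": "demo.delta", "delta": "Hel"}\n',
    "\n",
    'data: {"type": "demo.delta", "delta": "lo"}\n',
    "\n",
    "data: [DONE]\n",
]
events = list(iter_sse_data(sample_lines))
streamed_text = "".join(event["delta"] for event in events)
```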

tools
object[]

Tool list

tool_choice
enum<string>

Tool selection strategy: none means no tool will be called; auto lets the model decide whether to call tools

Available options: none, auto

metadata
object

Request metadata. Both keys and values are strings

prompt_cache_key
string

Prompt cache routing identifier

text
object

Output format control

reasoning
object

Reasoning control. For MiniMax-M3, the default effort is none, which disables reasoning. Setting effort to any other value (minimal, low, medium, or high) enables Adaptive Thinking; these values do not tune MiniMax-M3's reasoning depth. For M2.x models, reasoning cannot be disabled.

Response

200 - application/json

Successful response

id
string
required

Response ID

Example:

"abc123"

object
enum<string>
required

Object type, always response

Available options: response

created_at
integer
required

Response creation time (Unix seconds)

model
string
required

Actual model that processed the request

status
enum<string>
required

Response status

Available options: completed, incomplete, failed

output
(Message · object | Reasoning · object | Function Call · object)[]
required

Model output list. Each item is a Message (the assistant reply), a Reasoning item, or a Function Call

output_text
string | null

Convenience field. Concatenation of all text outputs

usage
object

Token usage, including input_tokens, output_tokens, and total_tokens

error
object

Error info, only returned when status=failed

incomplete_details
object

Reason for incompletion, only returned when status=incomplete
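A caller would typically branch on `status` before reading the output. The sketch below assumes hypothetical exception classes and a hypothetical `unwrap` helper. It passes the `error` and `incomplete_details` objects through untouched, because their inner fields are not listed here.

```python
class ResponseFailed(Exception):
    """Raised when status is "failed"; carries the raw `error` object."""


class ResponseIncomplete(Exception):
    """Raised when status is "incomplete"; carries the raw `incomplete_details`."""


def unwrap(response: dict) -> str:
    """Return output_text for a completed response, otherwise raise."""
    status = response.get("status")
    if status == "completed":
        # output_text may be null, so fall back to an empty string.
        return response.get("output_text") or ""
    if status == "incomplete":
        raise ResponseIncomplete(response.get("incomplete_details"))
    if status == "failed":
        raise ResponseFailed(response.get("error"))
    raise ValueError(f"unexpected status: {status!r}")


ok_text = unwrap({"status": "completed", "output_text": "Hello!"})
```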

parallel_tool_calls
boolean

Whether parallel tool calls are supported

store
boolean

Whether the response is persisted

truncation
enum<string>

Context truncation strategy

Available options: disabled