
Create a transcript

POST
/v1/audio/transcriptions

Request

Body Params multipart/form-data
file
file 
required
The audio file object (not the file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
Example:
file://D:\Backup\Downloads\123.mp3
model
string 
required
The model ID to use. Currently only whisper-1 is available.
Example:
whisper-1
language
string 
optional
The language of the input audio. Supplying it in ISO-639-1 format improves accuracy and latency.
prompt
string 
optional
Optional text to guide the model's style or to continue a previous audio segment. The prompt should be in the same language as the audio.
temperature
number 
optional
Default is 0
Sampling temperature, between 0 and 1. Higher values such as 0.8 make the output more random, while lower values such as 0.2 make it more focused and deterministic. If set to 0, the model uses log probability to raise the temperature automatically until a specific threshold is reached.
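The body parameters above can be assembled into a multipart/form-data request as in the following sketch. It uses only the Python standard library. The helper names `build_transcription_form` and `make_request` are hypothetical, and the base URL and the `Authorization: Bearer` header are assumptions, since this page documents neither the host nor the authentication scheme.

```python
import io
import uuid
import urllib.request

# Audio formats listed for the `file` parameter.
ALLOWED_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}


def build_transcription_form(audio_bytes, filename, model="whisper-1",
                             language=None, prompt=None, temperature=None):
    """Encode the documented body params as multipart/form-data.

    Returns a (body_bytes, content_type_header) tuple.
    """
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {ext}")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    # `model` is required; optional fields are sent only when supplied.
    fields = {"model": model}
    if language is not None:
        fields["language"] = language
    if prompt is not None:
        fields["prompt"] = prompt
    if temperature is not None:
        fields["temperature"] = str(temperature)

    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        buf.write(f"{value}\r\n".encode())

    # The file part carries the raw audio bytes, not a path or file name.
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    buf.write(b"Content-Type: application/octet-stream\r\n\r\n")
    buf.write(audio_bytes)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"


def make_request(base_url, api_key, body, content_type):
    """Build (but do not send) the POST request.

    Assumption: the API key is passed as a Bearer token.
    """
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type, "Authorization": f"Bearer {api_key}"},
        method="POST",
    )
```

To send the request, pass the object returned by `make_request` to `urllib.request.urlopen` and read the response body.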

Request samples

curl --location --request POST '/v1/audio/transcriptions' \
--form 'file=@"D:\\Backup\\Downloads\\123.mp3"' \
--form 'model="whisper-1"'

Responses

200 Success
application/json
Body
text
string 
required
Example
{
    "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that."
}
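A minimal sketch of reading the transcript out of a 200 response body, assuming the body is the JSON object shown above with a single required `text` string. The function name `extract_transcript` is hypothetical.

```python
import json


def extract_transcript(raw_body):
    """Return the `text` field from a transcription response body (str or bytes)."""
    payload = json.loads(raw_body)
    if not isinstance(payload, dict):
        raise ValueError("response body is not a JSON object")
    text = payload.get("text")
    if not isinstance(text, str):
        raise ValueError("response has no string 'text' field")
    return text
```

Checking the type of `text` before returning it makes a malformed or error response fail loudly instead of propagating `None` to the caller.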