
Create a voice with gpt-4o-mini-tts

POST
/v1/audio/speech
GPT-4o mini TTS is a text-to-speech model built on GPT-4o mini, a fast and powerful language model. Use it to convert text to natural-sounding speech. The maximum number of input tokens is 2000.
For intelligent realtime applications, use the gpt-4o-mini-tts model, our newest and most reliable text-to-speech model. You can prompt the model, through the instructions field of the request body, to control aspects of speech, including:
Accent
Emotional range
Intonation
Impressions
Speed of speech
Tone
Whispering
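
The aspects listed above are passed as free-form text in the instructions field. As a minimal sketch, the Python helper below assembles a request body from a few of them. The function name, the keyword names, and the "Aspect: value." phrasing are illustrative assumptions, not part of the API; the model accepts any instructions text.

```python
import json

# Aspects this page lists as promptable. The keyword names are illustrative only.
STYLE_ASPECTS = ("accent", "emotional_range", "intonation", "impressions",
                 "speed_of_speech", "tone", "whispering")


def build_speech_payload(text, voice="coral", model="gpt-4o-mini-tts", **style):
    """Return a JSON-serializable body for POST /v1/audio/speech."""
    unknown = set(style) - set(STYLE_ASPECTS)
    if unknown:
        # Guard against typos in keyword names; this check is local to the sketch.
        raise ValueError(f"not in this page's aspect list: {sorted(unknown)}")
    payload = {"model": model, "input": text, "voice": voice}
    # Join the aspects in a fixed order so the instructions text is deterministic.
    parts = [f"{name.replace('_', ' ').capitalize()}: {style[name]}."
             for name in STYLE_ASPECTS if name in style]
    if parts:
        payload["instructions"] = " ".join(parts)
    return payload


if __name__ == "__main__":
    body = build_speech_payload("Việt nam có đẹp không!",
                                tone="cheerful and positive",
                                speed_of_speech="slightly slow")
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

Keeping the instructions in a single string mirrors the request example on this page; splitting them into keywords is only a convenience for callers.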

Request

Body Params application/json

Example
{
    "model": "gpt-4o-mini-tts",
    "input": "Việt nam có đẹp không!",
    "voice": "coral",
    "instructions": "Speak in a cheerful and positive tone."
}

Request Code Samples

Shell
curl --location --request POST '/v1/audio/speech' \
--header 'Content-Type: application/json' \
--data-raw '{
    "model": "gpt-4o-mini-tts",
    "input": "Việt nam có đẹp không!",
    "voice": "coral",
    "instructions": "Speak in a cheerful and positive tone."
}'
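
The same request can be sketched in Python with only the standard library. This page states neither the host nor the authentication scheme, so BASE_URL, API_KEY, and the Bearer Authorization header below are placeholders and assumptions; replace them with the values for your account.

```python
import json
import urllib.request

# Hypothetical values: this page does not state the host or the auth scheme.
BASE_URL = "https://api.example.com"
API_KEY = "YOUR_API_KEY"


def make_speech_request(payload, base_url=BASE_URL, api_key=API_KEY):
    """Build (but do not send) the POST request for /v1/audio/speech."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=data,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed scheme
        },
    )


def send(request, timeout=60):
    """Send the request and return (content_type, raw_body_bytes)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type", ""), response.read()


if __name__ == "__main__":
    req = make_speech_request({
        "model": "gpt-4o-mini-tts",
        "input": "Việt nam có đẹp không!",
        "voice": "coral",
        "instructions": "Speak in a cheerful and positive tone.",
    })
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any network call is made.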

Responses

🟢 200 Success
application/json
Body

Example
{}
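
The response example above is an empty JSON object. The upstream OpenAI speech endpoint returns raw audio bytes on success, so a client may want to handle both cases. The sketch below assumes the response Content-Type header distinguishes them; the function names and the speech.mp3 default file name are illustrative.

```python
import json


def interpret_speech_response(content_type, body):
    """Classify a response body as ("json", object) or ("audio", bytes)."""
    # Drop any parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        # An empty body is treated like the documented "{}" example.
        return "json", json.loads(body.decode("utf-8") or "{}")
    # Anything else is assumed to be audio data (an assumption of this sketch).
    return "audio", body


def save_audio(body, path="speech.mp3"):
    """Write audio bytes to disk and return the path."""
    with open(path, "wb") as f:
        f.write(body)
    return path


if __name__ == "__main__":
    kind, value = interpret_speech_response("application/json", b"{}")
    print(kind, value)
```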
Modified at 2025-06-12 14:10:12