Introduction to YEScale Media

Welcome to the YEScale Media API documentation. This document provides a technical overview for developers looking to integrate our unified media generation service.
YEScale Media is a powerful API layer designed to simplify the use of generative media models (Video, Audio, Images). Our primary goal is to abstract away the complexity of managing multiple endpoints and sources, such as official APIs (OpenAI, Stability AI), Fal-AI, and Replicate, providing you with a single, reliable point of integration.

Core Features

1. YEScale CDN

All media generated through our service is automatically stored on the YEScale Content Delivery Network (CDN). This provides a stable, long-term storage solution for your assets, eliminating issues with temporary signed URLs from underlying services that expire quickly.
- Persistent Links: Generated media links do not expire.
- High Availability: Benefit from a robust and globally distributed network.
- Simplified Asset Management: No need to build your own system for downloading and re-hosting files.

2. Auto Route Mode

Our intelligent routing system ensures high availability and successful task completion. If a request to a primary source fails for any reason (e.g., capacity issues, API errors), YEScale Media automatically reroutes the task to a capable alternative source to fulfill the request.
- Increased Reliability: Drastically reduces the rate of failed generations.
- Seamless Fallback: The entire process is transparent to the end user.
- Optimized Performance: We continuously monitor provider performance to route your request to the best available option.

Example Scenario: A request for an image using dall-e-3 via the official API fails. YEScale Media's Auto Route Mode detects the failure and seamlessly resubmits the same request to a provider such as Fal-AI, ensuring the user still receives their generated image.

3. Simplified & Unified Payload

We provide a consistent and intuitive payload structure across all supported models, regardless of the underlying source. This dramatically reduces integration time and code complexity.
The core structure is as follows:
{
  "model": "model-name",
  "prompt": "Your descriptive prompt here.",
  "config": {}
}
- model (string): The identifier for the desired model (e.g., gemini-2.5-flash-image[nano-banana], gpt-image).
- prompt (string): The text prompt for the generation.
- config (object): A flexible container for all model-specific parameters. We standardize common parameters for ease of use.

Standardized config Parameters

We standardize common parameters across similar model types. For example, image generation models often use:
- size: e.g., "1024x1024"
- aspect_ratio: e.g., "16:9"

Input Images

For image-to-image or video generation tasks, you can provide input media in the images list within the config object. We support both publicly accessible URLs and Base64 encoded strings for maximum flexibility.
{
  "model": "some-image-to-image-model",
  "prompt": "A cat wearing a wizard hat.",
  "config": {
    "images": [
      "https://example.com/source_image.jpg",
      "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAA..."
    ]
  }
}
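The Base64 data-URI form shown above can be produced from raw image bytes with a few lines of Python. This is a minimal sketch; the model name is the placeholder from the example above, not a real identifier.

```python
import base64

def to_data_uri(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a data URI usable in the config.images list."""
    return f"data:{mime};base64," + base64.b64encode(image_bytes).decode("ascii")

# In practice the bytes would come from open("source.png", "rb").read().
payload = {
    "model": "some-image-to-image-model",  # placeholder from the example above
    "prompt": "A cat wearing a wizard hat.",
    "config": {
        "images": [
            "https://example.com/source_image.jpg",  # publicly accessible URL
            to_data_uri(b"\x89PNG\r\n\x1a\n"),       # Base64 data URI
        ]
    },
}
```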

API Reference

The YEScale Media API is organized around the task resource. The workflow consists of submitting a task and then polling for its result.

1. Submit a Task

This endpoint creates a new media generation task. The task is processed asynchronously.
Endpoint: POST /task/submit
Request Payload:
{
    "model": "gpt-image",
    "prompt": "Edit Text to We Are Yescale",
    "config": {
        "background": "transparent",
        "quality": "low",
        "size": "1024x1024",
        "images": ["https://cdn.yescale.vip/yescale-gpt-image-e3cb4a854f7e.png"]
    }
}
Success Response (202 Accepted):
The response will contain a task_id which you will use to retrieve the result.
{
    "created_at": 1763103034,
    "progress": "0%",
    "results": {},
    "status": "SUBMITTED",
    "task_id": "yescale-gpt-image-3bff624161c0",
    "updated_at": 1763103034
}
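The submit step can be sketched in Python using only the standard library. The base URL and the Bearer Authorization header are assumptions for illustration; substitute the actual host and authentication scheme from your YEScale account.

```python
import json
import urllib.request

API_BASE = "https://api.example-yescale-host.com"  # assumption: replace with your base URL
API_KEY = "YOUR_API_KEY"                           # assumption: your YEScale API key

def submit_task(payload: dict) -> dict:
    """POST the unified payload to /task/submit and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{API_BASE}/task/submit",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # assumption: bearer-token auth
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = {
    "model": "gpt-image",
    "prompt": "Edit Text to We Are Yescale",
    "config": {
        "background": "transparent",
        "quality": "low",
        "size": "1024x1024",
        "images": ["https://cdn.yescale.vip/yescale-gpt-image-e3cb4a854f7e.png"],
    },
}
# task = submit_task(payload)
# task_id = task["task_id"]  # use this to poll for the result
```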

2. Get Task Result

Retrieve the status and result of a previously submitted task using its task_id.
Endpoint: GET /task/{task_id}
Path Parameters:
- task_id (string): The ID of the task returned from the POST /task/submit endpoint.
Example Request:
GET /task/yescale-gpt-image-3bff624161c0
Success Response (200 OK):
Once the task is complete, the status will be SUCCESS, and the task_result object will contain the output, including the YEScale CDN URL for your generated media.
{
    "finish_time": 1761576861,
    "message": "YEScale - YESCALE_MEDIA - sora-2 - Task Result",
    "progress": "100%",
    "status": "SUCCESS",
    "submit_time": 1761576676,
    "task_id": "yescale-sora-2-eefd4aad095d",
    "task_result": {
        "note": "Link will be removed at the specified time (GMT+7).",
        "url": "https://cdn.yescale.vip/yescale-sora-2-eefd4aad095d.mp4",
        "url_expires_at": "2025-11-11 21:54:21"
    }
}
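Because tasks are processed asynchronously, a client typically polls GET /task/{task_id} until the task leaves its pending state. The sketch below assumes SUBMITTED and IN_PROGRESS are the pending statuses (only SUBMITTED and SUCCESS appear in this document); the HTTP fetch is injected as a callable so the loop itself stays transport-agnostic.

```python
import time

def poll_task(fetch, task_id: str, interval: float = 5.0, timeout: float = 600.0) -> dict:
    """Poll until the task leaves a pending state or the timeout elapses.

    `fetch` is any callable mapping a task_id to the parsed JSON status dict;
    in production it would issue the GET /task/{task_id} request shown above.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        task = fetch(task_id)
        # Assumption: these are the only pending statuses.
        if task["status"] not in ("SUBMITTED", "IN_PROGRESS"):
            return task
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
```

On a SUCCESS status, the generated media URL is available at task["task_result"]["url"].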
Modified at 2025-11-14 07:02:42