Official Endpoint - v1beta

Text generation

POST /v1beta/models/{model_name}:generateContent

Request

Path Params

model_name (string, required): the Gemini model to call, e.g. gemini-2.5-flash. The full path then becomes /v1beta/models/gemini-2.5-flash:generateContent.

Header Params

x-goog-api-key (string, required): your API key.
Content-Type (string, required): application/json.

Body Params application/json

contents (array, required): the prompt, as a list of content objects; each object carries a parts array whose items hold a text field, as in the example below.

Example
{
    "contents": [
      {
        "parts": [
          {
            "text": "How does AI work?"
          }
        ]
      }
    ]
  }

Request Code Samples

Shell
curl --location --request POST '/v1beta/models/gemini-2.5-flash:generateContent' \
--header 'x-goog-api-key: YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
    "contents": [
      {
        "parts": [
          {
            "text": "How does AI work?"
          }
        ]
      }
    ]
  }'
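
For reference, the same call in Python: a minimal sketch, not an official sample. It assumes the requests library, plus YESCALE_BASE_URL and YESCALE_API_KEY environment variables as placeholders for your own gateway host and API key; gemini-2.5-flash is the model shown in the sample response below.

Python
# Minimal sketch of POST /v1beta/models/{model_name}:generateContent.
# YESCALE_BASE_URL and YESCALE_API_KEY are placeholders, not names defined by this page.
import os

import requests

BASE_URL = os.environ["YESCALE_BASE_URL"]   # your gateway host (placeholder)
API_KEY = os.environ["YESCALE_API_KEY"]     # your API key (placeholder)
MODEL_NAME = "gemini-2.5-flash"             # model used in the sample response below

url = f"{BASE_URL}/v1beta/models/{MODEL_NAME}:generateContent"
headers = {
    "x-goog-api-key": API_KEY,
    "Content-Type": "application/json",
}
body = {"contents": [{"parts": [{"text": "How does AI work?"}]}]}

resp = requests.post(url, headers=headers, json=body, timeout=60)
resp.raise_for_status()
print(resp.json())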

Responses

🟢 200 OK
application/json
Body

Example
{
    "candidates": [
        {
            "content": {
                "parts": [
                    {
                        "text": "AI, or Artificial Intelligence, fundamentally works by **enabling machines to learn from data, identify patterns, make decisions, and solve problems in ways that simulate human intelligence.** It's not magic, but a combination of complex algorithms, vast amounts of data, and powerful computational resources.\n\nHere's a breakdown of the core concepts and how it generally works:\n\n### The Core Idea: Learning from Data\n\nAt its heart, most modern AI (especially Machine Learning) operates on the principle of **learning from examples rather than being explicitly programmed for every possible scenario.**\n\nImagine teaching a child:\n*   You show them many pictures of cats and dogs, telling them \"This is a cat,\" \"This is a dog.\"\n*   Eventually, they learn to distinguish between the two on their own, even with new pictures they haven't seen before.\n\nAI works similarly.\n\n### Key Components of AI\n\n1.  **Data:** This is the fuel for AI. AI models learn by processing massive amounts of data. This data can be text, images, audio, video, numbers, etc., and often needs to be cleaned, labeled, and prepared.\n2.  **Algorithms (Models):** These are the sets of rules and statistical techniques that AI uses to learn from the data. Different tasks require different types of algorithms.\n3.  **Computational Power:** Training complex AI models requires immense processing power (CPUs and especially GPUs) to handle the vast amounts of data and calculations.\n\n### The General Process of How AI Works\n\n1.  **Data Collection & Preparation:**\n    *   Gathering relevant data (e.g., images for object recognition, text for language translation).\n    *   Cleaning the data (removing errors, inconsistencies).\n    *   Labeling the data (e.g., manually identifying objects in images, or categorizing text). This is crucial for *supervised learning*.\n\n2.  **Choosing a Model/Algorithm:**\n    *   Depending on the problem (e.g., prediction, classification, generation), an appropriate AI model (e.g., neural network, decision tree, support vector machine) is selected.\n\n3.  **Training the Model:**\n    *   The prepared data is fed into the algorithm.\n    *   The algorithm processes this data, looking for patterns and relationships.\n    *   During training, the model makes predictions and compares them to the actual correct answers (if available).\n    *   It then adjusts its internal parameters (often called \"weights\" and \"biases\" in neural networks) to minimize the difference between its predictions and the correct answers. This is an iterative process that can happen millions or billions of times.\n    *   **Analogy:** It's like a student practicing for an exam, getting feedback on wrong answers, and adjusting their understanding.\n\n4.  **Evaluation:**\n    *   After training, the model is tested with *new, unseen data* to assess how well it generalizes and performs on real-world inputs.\n    *   If it performs well, it's ready for deployment. If not, the process might involve further training, more data, or a different algorithm.\n\n5.  **Inference (Deployment/Prediction):**\n    *   Once trained and evaluated, the AI model can be used to make predictions or decisions on new, real-world data.\n    *   For example, a trained image recognition AI can now identify objects in new photos instantly.\n\n### Key Branches of AI\n\nWhile \"AI\" is an umbrella term, modern AI is largely driven by these subfields:\n\n1.  
**Machine Learning (ML):** This is the most common approach today. It focuses on building systems that learn from data.\n    *   **Supervised Learning:** Learns from labeled data (input-output pairs).\n        *   **Classification:** Categorizing data (e.g., spam/not-spam, cat/dog).\n        *   **Regression:** Predicting a continuous value (e.g., house prices, stock values).\n    *   **Unsupervised Learning:** Learns from unlabeled data, finding hidden patterns or structures.\n        *   **Clustering:** Grouping similar data points together (e.g., customer segmentation).\n        *   **Dimensionality Reduction:** Simplifying data while retaining important information.\n    *   **Reinforcement Learning (RL):** An agent learns by performing actions in an environment, receiving rewards for good actions and penalties for bad ones, gradually optimizing its behavior (e.g., AI playing games like Go or chess, robotics).\n\n2.  **Deep Learning (DL):** A *subset* of Machine Learning that uses artificial **neural networks** with many layers (hence \"deep\").\n    *   Inspired by the structure of the human brain (though a very simplified model).\n    *   Each layer in a deep neural network processes the input in a hierarchical way, extracting increasingly complex features.\n    *   Deep learning has revolutionized fields like:\n        *   **Image Recognition (CNNs - Convolutional Neural Networks):** Identifying faces, objects, medical conditions in scans.\n        *   **Natural Language Processing (RNNs, Transformers):** Understanding and generating human language (chatbots, translation, summarization).\n        *   **Speech Recognition:** Converting spoken words to text.\n\n### Examples of AI in Action\n\n*   **Image Recognition:** Your phone unlocking with your face, self-driving cars identifying pedestrians and traffic signs.\n*   **Natural Language Processing:** ChatGPT, Google Translate, spam filters, virtual assistants like Siri or Alexa.\n*   **Recommendation Systems:** Netflix suggesting movies, Amazon recommending products.\n*   **Medical Diagnosis:** AI helping doctors analyze medical images for diseases.\n*   **Financial Fraud Detection:** Identifying suspicious transactions.\n*   **Robotics:** Robots performing complex tasks in manufacturing or exploration.\n\nIn essence, AI works by using sophisticated computational methods to sift through vast amounts of information, learn patterns, and then apply that learned knowledge to new situations to make intelligent decisions or perform specific tasks."
                    }
                ],
                "role": "model"
            },
            "finishReason": "STOP",
            "index": 0
        }
    ],
    "usageMetadata": {
        "promptTokenCount": 6,
        "candidatesTokenCount": 1265,
        "totalTokenCount": 2472,
        "promptTokensDetails": [
            {
                "modality": "TEXT",
                "tokenCount": 6
            }
        ],
        "thoughtsTokenCount": 1201
    },
    "modelVersion": "gemini-2.5-flash",
    "responseId": "kuC6aPShNKWAqtsPgfKR6A8"
}
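
The generated text sits at candidates[0].content.parts[0].text, and token usage (including the thinking tokens reported as thoughtsTokenCount) is under usageMetadata. A short sketch of reading those fields, continuing from the resp object in the Python sketch above:

Python
# Pull the generated text and token usage out of the generateContent response.
data = resp.json()

candidate = data["candidates"][0]
text = "".join(part.get("text", "") for part in candidate["content"]["parts"])
print(text)
print("finish reason:", candidate["finishReason"])

usage = data["usageMetadata"]
print("prompt tokens:  ", usage["promptTokenCount"])
print("output tokens:  ", usage["candidatesTokenCount"])
print("thinking tokens:", usage.get("thoughtsTokenCount", 0))
print("total tokens:   ", usage["totalTokenCount"])
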
Modified at 2025-09-05 13:11:49