Baidu ERNIE (ERNIE 4.0 / Speed / Lite)

LLM

by Baidu

Baidu's ERNIE models are deeply integrated with the Qianfan platform. ERNIE 4.0 is the flagship model, while the Speed and Lite models are free to use. The family is particularly strong at Chinese language understanding.


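API Endpoint

https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/ (with the model name appended, as in the code example below)

Console

https://console.bce.baidu.com/qianfan/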


Registration & API Key Steps

Step 1

Visit Baidu Intelligent Cloud and register an account.


Step 2

Complete real-name verification (Chinese ID required for full access).

Step 3

Navigate to the Qianfan Console.


Step 4

Go to "Model Services" > "App Access" and create a new application.

Step 5

Record your APP ID, API Key, and Secret Key.

Step 6

Free models (ERNIE-Speed, ERNIE-Lite) are immediately available.
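The credentials recorded in Step 5 are usually better kept out of source code. The following is a minimal sketch of one way to load them from environment variables. The variable names (QIANFAN_APP_ID, QIANFAN_API_KEY, QIANFAN_SECRET_KEY) and the QianfanCredentials class are hypothetical choices for illustration, not names defined by Baidu.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class QianfanCredentials:
    """Holds the three values recorded in Step 5 (hypothetical helper type)."""
    app_id: str
    api_key: str
    secret_key: str


# Hypothetical environment variable names; use whatever your deployment prefers.
ENV_NAMES = {
    "app_id": "QIANFAN_APP_ID",
    "api_key": "QIANFAN_API_KEY",
    "secret_key": "QIANFAN_SECRET_KEY",
}


def load_credentials(env=None):
    """Read the credentials from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [var for var in ENV_NAMES.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return QianfanCredentials(**{field: env[var] for field, var in ENV_NAMES.items()})
```

Calling load_credentials() at startup fails fast with a list of the missing variables, which tends to be easier to debug than an authentication error later on.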

Pricing

Model	Price (input / output, per 1M tokens)	Notes
ERNIE 4.0	¥120 / ¥120 (~$16.50)	Most capable model.
ERNIE 3.5	¥12 / ¥12 (~$1.65)	Balanced option.
ERNIE-Speed-8K	Free	Free tier with limited RPM.
ERNIE-Lite-8K	Free	Free tier, lightweight model.
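To make the per-token prices concrete, here is a small sketch that estimates the cost of a request from the figures in the table above. The function name estimate_cost_cny and the dictionary layout are illustrative assumptions, and the prices are simply copied from this page, so check them against current Baidu pricing before relying on the result.

```python
# Prices in CNY per 1M tokens as (input, output), copied from the pricing table.
PRICES_CNY_PER_1M = {
    "ERNIE 4.0": (120.0, 120.0),
    "ERNIE 3.5": (12.0, 12.0),
    "ERNIE-Speed-8K": (0.0, 0.0),
    "ERNIE-Lite-8K": (0.0, 0.0),
}


def estimate_cost_cny(model, input_tokens, output_tokens):
    """Return the estimated cost in CNY for one request."""
    input_price, output_price = PRICES_CNY_PER_1M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

For example, a request with 2,000 input tokens and 500 output tokens on ERNIE 4.0 works out to ¥0.30, and the same request on ERNIE 3.5 to ¥0.03.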

Application Tips

Tip 1

ERNIE-Speed-8K and ERNIE-Lite-8K are free — great for testing and low-volume usage.

Tip 2

Prepaid billing mode offers lower prices than pay-as-you-go.

Tip 3

The Qianfan platform also hosts third-party models such as Llama and ChatGLM.

Tip 4

Baidu offers enterprise discounts and custom model training services.

Tip 5

Web search augmentation is available as an add-on but incurs additional per-query fees.

Tip 6

Use a company email address for faster application approval.

China Access Solutions

Access Solution

ERNIE is directly accessible in China, with native support for Chinese users. It requires a Baidu Cloud account with real-name verification. Payment by Alipay, WeChat, and bank payment is supported.

Code Example

Python

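import requests

TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"
CHAT_URL = "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/ernie-speed-8k"


def get_access_token(api_key, secret_key):
    """Exchange the API Key and Secret Key for an access token."""
    params = {
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    }
    response = requests.post(TOKEN_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json().get("access_token")


# Get an access token first
access_token = get_access_token("your-api-key", "your-secret-key")

# Call the ERNIE model; the access token is passed as a query parameter
payload = {
    # The prompt means "Hello, please introduce yourself"
    "messages": [{"role": "user", "content": "你好，请介绍一下你自己"}]
}
response = requests.post(CHAT_URL, params={"access_token": access_token}, json=payload, timeout=30)
response.raise_for_status()
data = response.json()
# Fall back to printing the whole body if there is no "result" field (for example, on an API error)
print(data.get("result", data))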
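The example above requests a new access token on every run. The sketch below shows one possible way to cache the token and refresh it shortly before it expires. It assumes the token response includes an expires_in field in seconds, as is standard for OAuth 2.0 token responses; the TokenCache class, its fetch callable, and the 60-second safety margin are hypothetical choices rather than part of Baidu's API.

```python
import time


class TokenCache:
    """Caches an access token and refetches it shortly before expiry."""

    def __init__(self, fetch, clock=time.monotonic, margin_seconds=60):
        self._fetch = fetch            # callable returning the token endpoint's JSON as a dict
        self._clock = clock            # injectable so the cache can be tested without waiting
        self._margin = margin_seconds  # refresh this many seconds before the stated expiry
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            data = self._fetch()
            self._token = data["access_token"]
            # Assumption: "expires_in" is present; if not, treat the token as
            # already expired so the next call fetches a fresh one.
            lifetime = max(0, data.get("expires_in", 0) - self._margin)
            self._expires_at = now + lifetime
        return self._token
```

In practice, fetch would be a small function that posts to the token URL with the API Key and Secret Key and returns response.json(), and each chat request would call cache.get() instead of requesting a token directly.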

Rate Limits

Model	Limit
ERNIE-Speed-8K (free)	300 RPM
ERNIE 4.0	Varies by billing plan. Pay-as-you-go has lower limits than prepaid.
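A client that may exceed the free-tier limit can throttle itself before sending requests. The following is a minimal sliding-window sketch sized for the 300 RPM figure listed above. The SlidingWindowLimiter class is a hypothetical helper, not something provided by Baidu, and it only limits calls made from a single process.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allows at most max_calls calls in any window of period seconds."""

    def __init__(self, max_calls=300, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self._max_calls = max_calls
        self._period = period
        self._clock = clock   # injectable for testing
        self._sleep = sleep   # injectable for testing
        self._stamps = deque()  # timestamps of calls still inside the window

    def acquire(self):
        """Block until one more call is allowed, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self._period:
                self._stamps.popleft()
            if len(self._stamps) < self._max_calls:
                self._stamps.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self._period - (now - self._stamps[0]))
```

Calling limiter.acquire() immediately before each chat request keeps the client under the configured rate; the server may still reject requests if other clients share the same quota.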

Recommended Use Cases

Chinese content generation, enterprise chatbots, knowledge Q&A, search augmentation, document processing

Last Updated: 2026-02-10

