Baidu ERNIE (ERNIE 4.0 / Speed / Lite)
LLM by Baidu
Baidu's ERNIE models are deeply integrated with the Qianfan platform. ERNIE 4.0 is the flagship model, while the Speed and Lite models offer free tiers. The family is strong in Chinese-language understanding.
Registration & API Key Steps
Step 2
Complete real-name verification (Chinese ID required for full access).
Step 4
Go to "Model Services" > "App Access" and create a new application.
Step 5
Record your APP ID, API Key, and Secret Key.
Step 6
Free models (ERNIE-Speed, ERNIE-Lite) are immediately available.
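Once the API Key and Secret Key are recorded, one common way to keep them out of source code is to read them from environment variables. The following is a minimal sketch, not part of Baidu's tooling: the variable names `QIANFAN_API_KEY` and `QIANFAN_SECRET_KEY` and the helper `load_credentials` are hypothetical names chosen for illustration.

```python
import os


def load_credentials(env=None):
    """Return (api_key, secret_key) read from environment variables.

    The variable names below are placeholders chosen for this sketch;
    use whatever names fit your deployment.
    """
    env = os.environ if env is None else env
    names = ("QIANFAN_API_KEY", "QIANFAN_SECRET_KEY")
    # Collect every missing or empty variable so the error names all of them.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]
```

The returned pair can be passed wherever the API Key and Secret Key are needed, in place of hard-coded strings.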
Pricing
| Model | Price (input / output, per 1M tokens) | Notes |
|---|---|---|
| ERNIE 4.0 | ¥120 / ¥120 (~$16.50) | Most capable model. |
| ERNIE 3.5 | ¥12 / ¥12 (~$1.65) | Balanced option. |
| ERNIE-Speed-8K | Free | Free tier with limited RPM. |
| ERNIE-Lite-8K | Free | Free tier, lightweight model. |
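To make the per-token pricing concrete, here is a small sketch that estimates the cost of a single request from the prices in the table. The function name `estimate_cost_cny` and the dictionary layout are illustrative assumptions. The figures are copied from the table and may not reflect current or prepaid pricing.

```python
# Prices in CNY per 1M tokens as (input, output), copied from the pricing table.
PRICES_CNY_PER_1M = {
    "ERNIE 4.0": (120.0, 120.0),
    "ERNIE 3.5": (12.0, 12.0),
    "ERNIE-Speed-8K": (0.0, 0.0),
    "ERNIE-Lite-8K": (0.0, 0.0),
}


def estimate_cost_cny(model, input_tokens, output_tokens):
    """Estimate the cost in CNY of one request for the given token counts."""
    input_price, output_price = PRICES_CNY_PER_1M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 2,000 input tokens and 500 output tokens on ERNIE 4.0
print(estimate_cost_cny("ERNIE 4.0", 2_000, 500))  # 0.3
```

At these rates, the same request on ERNIE 3.5 would cost one tenth as much.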
Application Tips
Tip 1
ERNIE-Speed-8K and ERNIE-Lite-8K are free, which makes them well suited to testing and low-volume usage.
Tip 2
Prepaid billing mode offers lower prices than pay-as-you-go.
Tip 3
The Qianfan platform also hosts third-party models such as Llama and ChatGLM.
Tip 4
Baidu offers enterprise discounts and custom model training services.
Tip 5
Web search augmentation is available as an add-on but incurs additional per-query fees.
Tip 6
Use a company email address for faster application approval.
China Access Solutions
Access Solution
Directly accessible in China, with native support for Chinese users. Requires a Baidu Cloud account with real-name verification. Supports payment via Alipay, WeChat, or bank payment.
Code Example
import requests

# Get an access token first by exchanging the API Key and Secret Key
def get_access_token(api_key, secret_key):
    url = "https://aip.baidubce.com/oauth/2.0/token"
    params = {
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    }
    response = requests.post(url, params=params, timeout=30)
    response.raise_for_status()  # fail early on HTTP errors
    return response.json().get("access_token")

access_token = get_access_token("your-api-key", "your-secret-key")

# Call the ERNIE model
url = f"https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/ernie-speed-8k?access_token={access_token}"
payload = {
    # Prompt in Chinese: "Hello, please introduce yourself"
    "messages": [{"role": "user", "content": "你好,请介绍一下你自己"}]
}
response = requests.post(url, json=payload, timeout=30)
print(response.json()["result"])
Rate Limits
| Tier | Limits |
|---|---|
| Default | ERNIE-Speed-8K (free): 300 RPM. ERNIE 4.0: varies by billing plan. Pay-as-you-go has lower limits than prepaid. |
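To stay under a requests-per-minute cap such as the 300 RPM listed for ERNIE-Speed-8K, a client can throttle itself before sending. The sketch below shows one possible sliding-window limiter. The class `RpmLimiter` is a hypothetical helper rather than anything Baidu provides, and the server's own accounting may differ from this client-side approximation.

```python
import time
from collections import deque


class RpmLimiter:
    """Block until a request slot is free within a sliding 60-second window."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.max_per_minute = max_per_minute
        self.clock = clock  # injectable so the limiter can be tested without waiting
        self.sleep = sleep
        self.sent = deque()  # timestamps of requests still inside the window

    def acquire(self):
        now = self.clock()
        # Forget requests that are at least 60 seconds old.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) >= self.max_per_minute:
            # Window is full: wait until the oldest request leaves it.
            self.sleep(60 - (now - self.sent[0]))
            now = self.clock()
            self.sent.popleft()
        self.sent.append(now)


limiter = RpmLimiter(300)  # 300 RPM, the ERNIE-Speed-8K figure from the table
```

Calling `limiter.acquire()` immediately before each API request keeps the client at or below the configured rate. Paid tiers would need the limit that applies to their billing plan.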
Recommended Use Cases
Related API Guides
OpenAI GPT-4o / GPT-4.1 / o3
OpenAI
OpenAI's flagship LLM family including GPT-4o for multimodal tasks, GPT-4.1 for long-context coding, and o3 for advanced reasoning. Industry-leading models with the largest developer ecosystem.
Anthropic Claude (Sonnet 4.5 / Opus 4.5)
Anthropic
Anthropic's Claude model family excels in nuanced reasoning, safety, and long-context tasks. Claude Sonnet 4.5 offers the best balance of cost and performance, while Opus 4.5 delivers frontier intelligence.
Google Gemini (2.5 Pro / 2.5 Flash)
Google
Google's Gemini models offer a generous free tier, 1M token context window, and strong multimodal capabilities. Gemini 2.5 Pro leads in reasoning, while Flash models provide cost-effective alternatives.
Meta Llama 4 (Scout / Maverick)
Meta
Meta's open-source Llama 4 models are free to use and available through multiple cloud providers. Llama 4 Scout and Maverick offer competitive performance at extremely low cost through partner APIs.