Alibaba Qwen (Qwen-Max / Qwen-Plus)
LLM by Alibaba Cloud
Alibaba's Qwen series excels in Chinese and multilingual tasks. Available through Alibaba Cloud's Bailian (DashScope) platform. Qwen 2.5 is also open-source for self-hosting.
Registration & API Key Steps
Step 1
Sign up for an Alibaba Cloud account and log in to the Bailian console.
Step 2
Complete real-name verification (required for Chinese cloud services).
Step 3
Agree to the Bailian service terms to activate the service.
Step 4
Navigate to API Key management and create a new key.
Step 5
Install the DashScope SDK or use the OpenAI-compatible endpoint.
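The SDK installation step can be sketched as a short shell setup. The package names below are the published PyPI packages; `DASHSCOPE_API_KEY` is the environment variable the DashScope SDK reads, while with the OpenAI SDK you pass the key explicitly when constructing the client (as in the code example further down).

```shell
# Either SDK works: dashscope is the native client, openai targets
# Bailian's OpenAI-compatible endpoint
pip install dashscope
pip install openai

# The DashScope SDK reads the key from this environment variable;
# with the OpenAI SDK, pass the key explicitly to the client constructor
export DASHSCOPE_API_KEY="sk-xxx"
```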
Pricing
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Notes |
|---|---|---|---|
| Qwen-Turbo | ¥2 (~$0.28) | ¥6 (~$0.82) | Fast and affordable. |
| Qwen-Plus | ¥4 (~$0.55) | ¥12 (~$1.65) | Balanced performance. |
| Qwen-Max | ¥40 (~$5.50) | ¥120 (~$16.50) | Most capable model. |
| Qwen-Long | ¥0.5 | ¥2 | Optimized for long documents. |
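As a quick sanity check on the table above, here is a minimal cost estimator with the CNY prices hard-coded from the table. The function name and structure are illustrative, not part of any SDK.

```python
# Per-1M-token prices in CNY, taken from the pricing table: (input, output)
PRICES_CNY = {
    "qwen-turbo": (2.0, 6.0),
    "qwen-plus": (4.0, 12.0),
    "qwen-max": (40.0, 120.0),
    "qwen-long": (0.5, 2.0),
}

def estimate_cost_cny(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in CNY for one request with the given token counts."""
    price_in, price_out = PRICES_CNY[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# A 100K-token document summarized into 1K tokens costs about ¥0.052 on
# Qwen-Long versus ¥4.12 on Qwen-Max; this is why Qwen-Long is the
# long-document choice.
print(f"qwen-long: ¥{estimate_cost_cny('qwen-long', 100_000, 1_000):.3f}")
print(f"qwen-max:  ¥{estimate_cost_cny('qwen-max', 100_000, 1_000):.3f}")
```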
Application Tips
Tip 1
Qwen models are also available as open-source on Hugging Face for self-hosting.
Tip 2
The API supports the OpenAI-compatible format, so migrating from OpenAI is straightforward.
Tip 3
Qwen-Long is extremely cheap for long document processing (¥0.5/1M input tokens).
Tip 4
New users get generous free token allocations for each model.
Tip 5
Alibaba Cloud's international platform is also available at alibabacloud.com for global users.
Tip 6
Qwen 2.5 72B is one of the best open-source models and can be self-hosted.
China Access Solutions
Access Solution
Directly accessible from mainland China, with native support for Chinese phone-number registration and Alipay payment; no VPN needed. The best choice for Chinese-language applications.
Code Example
```python
from openai import OpenAI

# Bailian exposes an OpenAI-compatible endpoint
client = OpenAI(
    api_key="sk-xxx",  # Bailian API key
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen-plus",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        # "Explain the basic principles of quantum computing in Chinese"
        {"role": "user", "content": "用中文解释量子计算的基本原理"},
    ],
)
print(response.choices[0].message.content)
```

```shell
# --- cURL example ---
curl https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
  -H "Authorization: Bearer sk-xxx" \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen-plus","messages":[{"role":"user","content":"你好"}]}'
```

Rate Limits
| Model | Requests/min (default) | Tokens/min (default) |
|---|---|---|
| Qwen-Turbo | 500 | 500K |
| Qwen-Plus | 200 | 200K |
| Qwen-Max | 120 | 120K |

Higher limits can be requested.
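When a request trips these limits, the endpoint responds with HTTP 429, and the usual remedy is client-side retry with exponential backoff. A minimal sketch follows; the "429 in the message" check is an assumption for illustration, and a real client should inspect the SDK's specific rate-limit exception type instead.

```python
import random
import time

def call_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying on rate-limit errors with exponential backoff.

    Treats any exception whose message mentions 429 as a rate-limit hit;
    a real client should check the SDK's specific exception type.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as exc:
            if "429" not in str(exc) or attempt == max_retries - 1:
                raise
            # Exponential backoff plus a little jitter to avoid thundering herds
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))

# Usage sketch with a stand-in that fails twice before succeeding:
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("HTTP 429: Too Many Requests")
    return "ok"

print(call_with_backoff(flaky, base_delay=0.01))  # ok
```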
Related API Guides
OpenAI GPT-4o / GPT-4.1 / o3
OpenAI
OpenAI's flagship LLM family including GPT-4o for multimodal tasks, GPT-4.1 for long-context coding, and o3 for advanced reasoning. Industry-leading models with the largest developer ecosystem.
Anthropic Claude (Sonnet 4.5 / Opus 4.5)
Anthropic
Anthropic's Claude model family excels in nuanced reasoning, safety, and long-context tasks. Claude Sonnet 4.5 offers the best balance of cost and performance, while Opus 4.5 delivers frontier intelligence.
Google Gemini (2.5 Pro / 2.5 Flash)
Google
Google's Gemini models offer a generous free tier, a 1M-token context window, and strong multimodal capabilities. Gemini 2.5 Pro leads in reasoning, while the Flash models provide cost-effective alternatives.
Meta Llama 4 (Scout / Maverick)
Meta
Meta's open-source Llama 4 models are free to use and available through multiple cloud providers. Llama 4 Scout and Maverick offer competitive performance at extremely low cost through partner APIs.