Google Gemini (2.5 Pro / 2.5 Flash)

LLM

by Google

Google's Gemini models offer a generous free tier, a 1M-token context window, and strong multimodal capabilities. Gemini 2.5 Pro is the strongest model in the family for reasoning, while the Flash models are lower-cost alternatives.

Registration & API Key Steps

Step 1

Visit aistudio.google.com and sign in with your Google account.

Step 2

Click "Get API Key" in the left sidebar.

Step 3

Click "Create API Key" and select or create a Google Cloud project.

Step 4

Copy your API key. No billing setup is required for the free tier.

Step 5

Optionally, enable billing in Google Cloud Console for higher rate limits.
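
Once the key is copied, it is usually safer to keep it out of source code. The sketch below reads it from an environment variable; the variable name `GEMINI_API_KEY` and the helper `load_api_key` are assumptions chosen for illustration, not something this guide prescribes.

```python
import os


def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is an assumption; use whatever name your deployment sets.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to the key copied from AI Studio")
    return key


# Typical use (after `export GEMINI_API_KEY=...` in your shell):
#   api_key = load_api_key()
```

The returned string can then be passed wherever `"YOUR_API_KEY"` appears in the code example further down.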

Pricing

| Tier | Price | Features |
| --- | --- | --- |
| Gemini 2.5 Flash | $0.15 / $0.60 per 1M tokens | Input / Output. Fast and affordable. Thinking tokens: $0.50/1M. |
| Gemini 2.5 Flash-Lite | $0.10 / $0.40 per 1M tokens | Input / Output. Most affordable option. |
| Gemini 2.5 Pro | $1.25 / $10.00 per 1M tokens | Input / Output (<=200K context). >200K context: $2.50/$20.00. |
| Free Tier | Free | 5-15 RPM, up to 1,000 RPD. No credit card required. Access to all models. |
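
To see how these per-token prices translate into a bill, here is a minimal estimator that hard-codes the figures from the pricing table above. The function name `estimate_cost` and the dictionary keys are illustrative assumptions, thinking tokens are ignored, and the prices should be checked against the current pricing page before relying on the result.

```python
# USD per 1M tokens as (input, output), copied from the pricing table above.
PRICES = {
    "gemini-2.5-flash": (0.15, 0.60),
    "gemini-2.5-flash-lite": (0.10, 0.40),
    "gemini-2.5-pro": (1.25, 10.00),  # prompts of 200K tokens or fewer
}
PRO_LONG_CONTEXT = (2.50, 20.00)  # Pro prompts above 200K tokens
LONG_CONTEXT_THRESHOLD = 200_000


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (thinking tokens not included)."""
    in_price, out_price = PRICES[model]
    if model == "gemini-2.5-pro" and input_tokens > LONG_CONTEXT_THRESHOLD:
        in_price, out_price = PRO_LONG_CONTEXT
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# A 50K-token prompt with a 2K-token answer on Flash:
print(f"${estimate_cost('gemini-2.5-flash', 50_000, 2_000):.4f}")
```

The same 300K-token document sent to Pro is billed at the long-context rate for the whole request, which is why the threshold check looks at the prompt size rather than splitting the tokens.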

Application Tips

Tip 1

The free tier requires no credit card and covers all models, so it is a convenient place to start experimenting.

Tip 2

The 1M-token context window is roughly 8x the 128K-token window of ChatGPT's GPT-4o, which makes it well suited to large document analysis.

Tip 3

Use Gemini 2.5 Flash for most tasks — excellent performance at very low cost.

Tip 4

Context caching saves 75% on repeated content (cache reads cost 25% of input).
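
The 75% figure applies only to the cached portion of a prompt, so the overall saving depends on how much of each request is repeated. The sketch below works through that arithmetic; `cached_input_cost` is a hypothetical helper, the 0.25 ratio comes from the tip above, and any charges other than input tokens are ignored.

```python
def cached_input_cost(
    total_tokens: int,
    cached_tokens: int,
    input_price_per_m: float,
    cache_read_ratio: float = 0.25,  # cache reads cost 25% of the input price
) -> float:
    """Input cost in USD when part of the prompt is served from the cache."""
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    fresh_tokens = total_tokens - cached_tokens
    fresh_cost = fresh_tokens * input_price_per_m
    cached_cost = cached_tokens * input_price_per_m * cache_read_ratio
    return (fresh_cost + cached_cost) / 1_000_000


# 100K-token prompt on Gemini 2.5 Pro ($1.25 per 1M input), 90K of it cached.
with_cache = cached_input_cost(100_000, 90_000, 1.25)
without_cache = cached_input_cost(100_000, 0, 1.25)
print(f"{with_cache:.6f} vs {without_cache:.6f}")
```

In this example the input cost drops from $0.125 to about $0.0406, a 67.5% saving rather than the full 75%, because the 10K fresh tokens are still billed at the normal rate.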

Tip 5

The Gemini API is also available through Vertex AI for enterprise use, with SLA guarantees.

Tip 6

Google Search grounding is available: the first 1,500 queries per day are free, then $35 per 1,000 queries.

China Access Solutions

Google services are blocked in mainland China, so direct access requires a VPN or proxy. Alternatives include using Vertex AI through a supported cloud region or going through a third-party relay service.

Code Example

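Python
# Requires the google-generativeai package: pip install google-generativeai
import google.generativeai as genai

# Authenticate with the key created in AI Studio.
genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-2.5-flash")

# Send a single text prompt and print the generated text.
response = model.generate_content("Explain how AI works")
print(response.text)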

# --- cURL example ---
# curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=YOUR_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d '{
#     "contents": [{"parts": [{"text": "Explain how AI works"}]}]
#   }'
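
For environments where installing the SDK is not practical, the cURL call above can be mirrored with the Python standard library. This is a sketch under assumptions: `build_request` and `extract_text` are hypothetical helpers, and `extract_text` assumes the usual `candidates[0].content.parts` response layout without handling error responses.

```python
import json
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt: str, api_key: str,
                  model: str = "gemini-2.5-flash") -> urllib.request.Request:
    """Build the same POST request as the cURL example, without sending it."""
    url = f"{API_ROOT}/models/{model}:generateContent?key={api_key}"
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(response_json: dict) -> str:
    """Join the text parts of the first candidate in a parsed response."""
    parts = response_json["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)


# Sending the request needs network access and a real key:
#   with urllib.request.urlopen(build_request("Explain how AI works", key)) as resp:
#       print(extract_text(json.load(resp)))
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test before any quota is spent.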

Rate Limits

| Tier | Limits |
| --- | --- |
| Default | Free: 5-15 RPM, 250K-1M TPM, 1,000 RPD. Pay-as-you-go: 2,000 RPM, 4M TPM. Varies by model. |
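
With a free-tier budget as low as 5 requests per minute, a batch job can stay under the limit by pacing itself on the client side. The following is a minimal sliding-window sketch; the class `RpmThrottle` is hypothetical, it tracks only requests per minute (not TPM or RPD), and the server remains the authority on the real limits.

```python
import time
from collections import deque


class RpmThrottle:
    """Allow at most `rpm` calls in any 60-second window (client-side only)."""

    WINDOW_SECONDS = 60.0

    def __init__(self, rpm: int, clock=time.monotonic, sleep=time.sleep):
        self.rpm = rpm
        self.clock = clock      # injectable for testing
        self.sleep = sleep      # injectable for testing
        self.calls = deque()    # timestamps of recent calls, oldest first

    def wait(self) -> float:
        """Block until a call is allowed; return the number of seconds slept."""
        now = self.clock()
        # Forget calls that have left the 60-second window.
        while self.calls and now - self.calls[0] >= self.WINDOW_SECONDS:
            self.calls.popleft()
        slept = 0.0
        if len(self.calls) >= self.rpm:
            # Sleep until the oldest recorded call leaves the window.
            slept = self.WINDOW_SECONDS - (now - self.calls[0])
            self.sleep(slept)
            now = self.clock()
            self.calls.popleft()
        self.calls.append(now)
        return slept


# Usage: call throttle.wait() immediately before each API request.
throttle = RpmThrottle(rpm=5)
```

Calling `throttle.wait()` before each request keeps a loop under the chosen RPM; a production client would typically also retry with backoff when the server still rejects a request.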

Recommended Use Cases

Multimodal understanding
Long document analysis
Code generation
Research & summarization
Grounded search

Last Updated: 2026-02-10
