Replicate

Model Deployment

by Replicate

Run open-source AI models with a simple API. Pay-per-use pricing for thousands of models including image, video, audio, and text generation.

Registration & API Key Steps

1. Create Replicate Account

Sign up at replicate.com with a GitHub account.

2. Get API Token

Generate an API token from your account settings.
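
One way to handle the token from this step is to keep it out of source code and read it from an environment variable. The sketch below assumes the variable is named REPLICATE_API_TOKEN, which is the name the code example in this guide reads; the helper name `requireToken` is hypothetical and not part of any Replicate library.

```typescript
// Hypothetical helper: reads the API token from an environment-like object
// and fails early with a clear message if it is missing or blank.
type Env = Record<string, string | undefined>;

function requireToken(env: Env, name = "REPLICATE_API_TOKEN"): string {
  const token = env[name]?.trim();
  if (!token) {
    throw new Error(`${name} is not set; generate a token in your account settings first`);
  }
  return token;
}

// Usage in Node.js (not executed here):
//   const token = requireToken(process.env);
```

Failing at startup rather than on the first API call tends to make a missing token easier to diagnose.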

3. Install Client Library

Install the Python or Node.js client: pip install replicate or npm install replicate.

4. Run Your First Model

Browse models at replicate.com/explore and run one with a single API call. Pay only for compute used.
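
As a rough illustration of what that single API call looks like on the wire, the sketch below builds (but does not send) an HTTP request for a model reference of the form owner/name. The endpoint path and the Bearer authorization header are assumptions based on Replicate's public HTTP API for official models and should be checked against the current API reference; the helper name `buildPredictionRequest` is hypothetical.

```typescript
interface PredictionRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: turns "owner/name" plus model inputs into a request description.
function buildPredictionRequest(
  model: string,
  input: Record<string, unknown>,
  token: string,
): PredictionRequest {
  const parts = model.split("/");
  if (parts.length !== 2 || !parts[0] || !parts[1]) {
    throw new Error(`expected "owner/name", got "${model}"`);
  }
  const [owner, name] = parts;
  return {
    // Assumed endpoint shape for official models; verify before relying on it.
    url: `https://api.replicate.com/v1/models/${owner}/${name}/predictions`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    // Model-specific inputs are wrapped in an "input" object.
    body: JSON.stringify({ input }),
  };
}
```

In practice the client library from step 3 hides these details; the sketch is only meant to show which pieces (model reference, inputs, token) go into a call.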

Pricing

Tier | Price | Features
Pay-as-you-go | Varies by model | No subscription required, Billed per prediction, GPU-time based, No minimum
SDXL Example | ~$0.003/image (~4 sec) | ~30 images per $0.10, Nvidia L40S GPU, Fast generation
Flux Schnell Example | ~$0.003/image | Latest Flux model, High quality, Fast inference
Custom Models | Based on GPU time | Deploy your own models, Auto-scaling, Cold start optimization

Application Tips

Extremely Affordable for Images

Models like SDXL and Flux cost ~$0.003/image. You can generate ~300 images for $1.
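
The arithmetic behind that estimate can be sketched as follows. The ~$0.003/image price comes from the pricing table above; the function name `imagesForBudget` is hypothetical, and real prices vary by model and over time.

```typescript
// Hypothetical helper: how many whole images a budget covers at a flat per-image price.
// Amounts are converted to whole micro-dollars first to avoid floating-point drift.
function imagesForBudget(budgetUsd: number, pricePerImageUsd: number): number {
  const budgetMicros = Math.round(budgetUsd * 1e6);
  const priceMicros = Math.round(pricePerImageUsd * 1e6);
  if (priceMicros <= 0) {
    throw new Error("price per image must be positive");
  }
  return Math.floor(budgetMicros / priceMicros);
}

console.log(imagesForBudget(1, 0.003));    // 333
console.log(imagesForBudget(0.1, 0.003));  // 33
```

Both results are in line with the rounded figures quoted in this guide (~300 images per $1, ~30 per $0.10).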

Official Models Have Predictable Pricing

Official models are priced by output (per image, per second of video, per token) rather than raw GPU time.
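
To make the difference between the two billing styles concrete, here is a minimal sketch that estimates cost under each. The type and function names are hypothetical, and the per-second GPU rate in the usage lines is a made-up placeholder, not a published Replicate price.

```typescript
// Two billing styles described in this guide:
//  - per_output: a fixed price per unit of output (image, second of video, token)
//  - gpu_time:   a price per second that the hardware runs
type Pricing =
  | { kind: "per_output"; usdPerUnit: number }
  | { kind: "gpu_time"; usdPerSecond: number };

// amount = number of output units for per_output, or total run time in seconds for gpu_time
function estimateCostUsd(pricing: Pricing, amount: number): number {
  switch (pricing.kind) {
    case "per_output":
      return pricing.usdPerUnit * amount;
    case "gpu_time":
      return pricing.usdPerSecond * amount;
  }
}

// 100 images at the ~$0.003/image figure quoted above: about $0.30 however long each run takes.
const officialEstimate = estimateCostUsd({ kind: "per_output", usdPerUnit: 0.003 }, 100);

// 100 runs at ~4 s each on a placeholder rate of $0.001/s: about $0.40,
// and the total grows if individual runs take longer.
const gpuTimeEstimate = estimateCostUsd({ kind: "gpu_time", usdPerSecond: 0.001 }, 100 * 4);
```

The per-output style is predictable because run time drops out of the calculation entirely.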

Deploy Custom Models

Use Cog (Replicate's open-source tool) to package and deploy your own models on Replicate.

No Upfront Cost

No subscription or minimum spend. Pay only for what you use. Great for experimentation.

China Access Solutions

API Proxy

Use an overseas proxy server to access the Replicate API from China.
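
A minimal sketch of pointing the client at such a proxy is shown below. It assumes the official JavaScript client accepts a `baseUrl` constructor option (defaulting to https://api.replicate.com/v1) and that the proxy forwards /v1 paths unchanged; both should be verified. The host proxy.example.com is a placeholder and `resolveBaseUrl` is a hypothetical helper.

```typescript
const DEFAULT_BASE_URL = "https://api.replicate.com/v1";

// Hypothetical helper: picks the API base URL, optionally routed through a proxy origin.
function resolveBaseUrl(proxyOrigin?: string): string {
  if (!proxyOrigin) return DEFAULT_BASE_URL;
  const origin = proxyOrigin.replace(/\/+$/, ""); // drop trailing slashes
  // Require https so the API token is not sent in clear text.
  if (!/^https:\/\/[^/\s]+$/.test(origin)) {
    throw new Error(`proxy origin must look like "https://host", got "${proxyOrigin}"`);
  }
  return `${origin}/v1`;
}

// Usage (not executed here; assumes the client accepts a baseUrl option):
//   const replicate = new Replicate({
//     auth: process.env.REPLICATE_API_TOKEN,
//     baseUrl: resolveBaseUrl("https://proxy.example.com"),
//   });
```

Keeping the proxy origin in configuration makes it easy to fall back to direct access where that works.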

Direct Access

Replicate may be accessible directly in some regions.

Code Example

JavaScript / TypeScript
import { writeFile } from 'node:fs/promises';
import Replicate from 'replicate';

// The client reads the API token from the environment
const replicate = new Replicate({ auth: process.env.REPLICATE_API_TOKEN });

// Image Generation with Flux
const output = await replicate.run(
  'black-forest-labs/flux-schnell',
  { input: { prompt: 'An iguana on the beach, pointillism' } }
);
// output is an array of FileOutput objects; write each one to its own file
// so that later images do not overwrite earlier ones
for (const [index, item] of output.entries()) {
  const blob = await item.blob();
  await writeFile(`output-${index}.png`, Buffer.from(await blob.arrayBuffer()));
}

// Video Generation with Minimax
const video = await replicate.run(
  'minimax/video-01',
  {
    input: {
      prompt: 'A cat playing piano in a jazz club',
      prompt_optimizer: true,
    }
  }
);
// video is a FileOutput; url() returns a URL object pointing at the generated file
console.log('Video URL:', video.url().href);

// Run any model by name
const result = await replicate.run('owner/model-name', {
  input: { /* model-specific inputs */ }
});

Rate Limits

Tier | Limits
Default | No hard limits, billed per use
Cold Start | First request may be slower (model loading)
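
Individual requests can still fail transiently or take longer while a model loads, so a common client-side pattern is to retry with exponential backoff. The sketch below is generic and not specific to Replicate; the function names and the default delays are assumptions chosen for illustration.

```typescript
// Delay before each retry: baseMs, 2*baseMs, 4*baseMs, ... capped at capMs.
function backoffDelaysMs(retries: number, baseMs = 1000, capMs = 30000): number[] {
  return Array.from({ length: retries }, (_, i) => Math.min(capMs, baseMs * 2 ** i));
}

// Runs fn, retrying up to `retries` times with the delays above if it rejects.
async function withRetry<T>(fn: () => Promise<T>, retries = 3): Promise<T> {
  const delays = backoffDelaysMs(retries);
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= delays.length) throw err; // out of retries: surface the last error
      await new Promise<void>((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}

// Usage (not executed here):
//   const output = await withRetry(() => replicate.run('owner/model-name', { input: {} }));
```

Retrying a prediction may incur additional charges, so it is worth limiting retries to errors that are plausibly transient.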

Recommended Use Cases

Rapid prototyping with various models
Image/video/audio generation
Model comparison & evaluation
Custom model deployment
Building AI-powered products

Last Updated: 2025-02
