papaiapi

OpenAI & Gemini compatible API • 12x cheaper

Getting Started

1. Get an API key from our Telegram Bot
2. Use OpenAI SDK, Gemini SDK, or point gemini-cli at /v1direct — all three work!
3. Pricing: Gemini Chat $0.50/1k requests • Direct Gemini $0.50/1k requests • Grok Image $0.005/image • Gemini Image $0.02/image • Flow Image $0.03/image • Gemini Voice $0.03/generation • Gemini Files $0.01/request • Video Artifact Removal $0.01/source-second
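
As a rough worked example of the per-unit prices above, the sketch below estimates a bill from a usage tally. The `PRICES` keys and the `estimate_cost` helper are names invented for this illustration, not API identifiers; only the dollar figures come from the pricing list.

```python
from decimal import Decimal
from typing import Mapping

# USD price per single unit, taken from the pricing list above.
# The key names are hypothetical labels chosen for this sketch.
PRICES = {
    "gemini_chat_request": Decimal("0.50") / 1000,
    "direct_gemini_request": Decimal("0.50") / 1000,
    "grok_image": Decimal("0.005"),
    "gemini_image": Decimal("0.02"),
    "flow_image": Decimal("0.03"),
    "gemini_voice_generation": Decimal("0.03"),
    "gemini_files_request": Decimal("0.01"),
    "video_source_second": Decimal("0.01"),
}


def estimate_cost(usage: Mapping[str, float]) -> Decimal:
    """Sum price * quantity over a usage tally; unknown keys raise KeyError."""
    total = Decimal("0")
    for item, quantity in usage.items():
        total += PRICES[item] * Decimal(str(quantity))
    return total


if __name__ == "__main__":
    bill = estimate_cost({"direct_gemini_request": 10_000, "flow_image": 20})
    print(f"${bill:.2f}")
```

Decimal arithmetic is used here so that fractional-cent prices such as $0.0005 per request add up without floating-point drift.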

Available Models

Model	Type	Description
`gemini-flash`	Chat	Fast text generation via Gemini
`grok`	Image	Grok image generation
`gemini-image`	Image	Gemini image generation
`imagen4`	Image	Flow Image - Imagen 4
`nanobanana`	Image	Flow Image - NanoBanana
`nanobananapro1k`	Image	Flow Image - NanoBanana Pro 1K
`nanobananapro2k`	Image	Flow Image - NanoBanana Pro 2K
`gemini-voice`	Audio	Gemini Voice Generation (Beta) - $0.03/generation
`gemini-files`	File Analysis	Gemini Files (Beta) - File + Prompt Analysis - $0.01/request
`any Gemini model`	Direct Proxy	Use the official model name (e.g. `gemini-2.5-flash`, `gemini-2.5-pro`) via `/v1direct` — see below

Base URLs

SDK / Style	Base URL
OpenAI	`https://papaiapi.com/v1`
Gemini (managed)	`https://papaiapi.com/v1beta`
Gemini (direct proxy)	`https://papaiapi.com/v1direct`

Authentication

OpenAI style:

Authorization: Bearer sk_live_your_api_key_here

Gemini style:

?key=sk_live_your_api_key_here
# or header:
x-goog-api-key: sk_live_your_api_key_here
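
The two authentication styles above can be sketched as a pair of small helpers. The function names `auth_headers` and `with_key_param` are made up for this example; the header and parameter names are the ones listed above.

```python
from urllib.parse import urlencode


def auth_headers(api_key: str, style: str = "openai") -> dict:
    """Return the auth header for the given style ("openai" or "gemini")."""
    if style == "openai":
        return {"Authorization": f"Bearer {api_key}"}
    if style == "gemini":
        return {"x-goog-api-key": api_key}
    raise ValueError(f"unknown auth style: {style!r}")


def with_key_param(url: str, api_key: str) -> str:
    """Gemini-style alternative: append the key as a ?key= query parameter."""
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}{urlencode({'key': api_key})}"
```

Header-based auth is usually preferable to the query parameter, since URLs tend to end up in logs and shell history.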

Endpoints

POST /v1/chat/completions

Create a chat completion (OpenAI-compatible)

Request Body

Parameter	Type	Description
`model`	string	required Model to use: `gemini-flash`
`messages`	array	required Array of message objects with `role` and `content`
`stream`	boolean	optional Enable streaming responses (recommended)

POST /v1/images/generations

Generate images (OpenAI-compatible). Supports Grok, Gemini Image, and Flow Image models.

Request Body

Parameter	Type	Description
`prompt`	string	required Text description of the image to generate
`model`	string	optional Model to use: `grok` (default), `gemini-image`, `imagen4`, `nanobanana`, `nanobananapro1k`, `nanobananapro2k`
`n`	integer	optional Number of images (default: 1, currently only 1 supported)
`size`	string	optional Aspect ratio for Flow Image models: `1792x1024` (landscape) or `1024x1792` (portrait). Ignored for Grok/Gemini.

POST /v1/audio/speech

Generate voice audio from text (OpenAI-compatible TTS). Beta: ~40% success rate.

Request Body

Parameter	Type	Description
`model`	string	optional Model to use: `gemini-voice` (default)
`input`	string	required Text to generate audio from

Response

{
  "url": "https://papaiapi.com/temp/abc123.wav"
}

Note: Audio URLs expire after 1 hour. Download files promptly if you need to keep them.
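
Given the roughly 40% beta success rate, callers will probably want a bounded retry loop around this endpoint. The sketch below is one way to do that; `first_success` and `success_probability` are hypothetical helpers, and it assumes the caller wraps the HTTP request in a function that returns the audio URL on success and `None` on failure. The probability estimate further assumes attempts fail independently, which the documentation does not state.

```python
import time
from typing import Callable, Optional


def first_success(
    call: Callable[[], Optional[str]],
    max_attempts: int = 5,
    delay_s: float = 0.0,
) -> Optional[str]:
    """Invoke `call` until it returns a URL or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        url = call()
        if url:
            return url
        if attempt < max_attempts:
            time.sleep(delay_s)  # optional pause between attempts
    return None


def success_probability(per_attempt: float, attempts: int) -> float:
    """Chance of at least one success, assuming independent attempts."""
    return 1.0 - (1.0 - per_attempt) ** attempts
```

Under the independence assumption, five attempts at 40% each give `success_probability(0.4, 5)`, about 0.92.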

POST /v1/files/chat

Analyze files (documents, images, videos) with a prompt using Gemini AI (Beta).

Request (Multipart)

Field	Type	Description
`file`	file	required* File to analyze (PDF, image, video, etc.). Max 50MB.
`prompt`	string	required Prompt/question about the file

Request (JSON)

Parameter	Type	Description
`file_url`	string	required* Publicly accessible URL of the file
`prompt`	string	required Prompt/question about the file

* Provide either file (multipart) or file_url (JSON).

Response

Returns an OpenAI-compatible chat completion response.

GET /v1/models

List available models (OpenAI)

Gemini API Endpoints

POST /v1beta/models/{model}:generateContent

Generate content (Gemini-compatible)

Request Body

Parameter	Type	Description
`contents`	array	required Array of content objects with `parts`
`generationConfig`	object	optional Generation config (temperature, maxOutputTokens, etc.)

POST /v1beta/models/{model}:streamGenerateContent

Stream generate content (Gemini-compatible SSE)

GET /v1beta/models

List available models (Gemini)

Direct Gemini API Proxy `/v1direct`

A transparent proxy in front of Google's generativelanguage.googleapis.com. Anything you'd send to the Google endpoint, send to /v1direct instead and we forward it through a rotated pool of Gemini API keys + US residential proxies. You get every official Gemini model and feature (text, multimodal, function calling, streaming, long context, thinking, etc.) with no quota of your own — billed per request through your papaiapi balance.

Why use it

• Drop-in compatible with the official @google/genai / google-genai SDKs and Google's own gemini-cli agent — just change the base URL.
• Access every Gemini model by its real name (gemini-2.5-flash, gemini-2.5-pro, etc.) — no need for our managed model aliases.
• Auto-retry and key rotation on quota / 429 / 403 errors, totally transparent to you.

Base URL & Auth

BASE_URL  https://papaiapi.com/v1direct
HEADER    x-goog-api-key: sk_live_your_api_key_here
# or:     Authorization: Bearer sk_live_your_api_key_here
# or:     ?key=sk_live_your_api_key_here

The path you send is forwarded verbatim to Google. So:

POST  https://papaiapi.com/v1direct/v1beta/models/gemini-2.5-flash:generateContent
   →  https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
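
The path mapping shown above can be illustrated with a small function. This is a sketch of the documented behaviour, not the service's code: `upstream_url` is a hypothetical name, and the sketch assumes the query string (for example `alt=sse`) is passed through unchanged. Credentials are swapped by the proxy and are not modelled here.

```python
from urllib.parse import urlsplit, urlunsplit

PROXY_PREFIX = "/v1direct"
UPSTREAM_HOST = "generativelanguage.googleapis.com"


def upstream_url(proxy_url: str) -> str:
    """Map a papaiapi /v1direct URL to the Google URL it is forwarded to."""
    parts = urlsplit(proxy_url)
    if not parts.path.startswith(PROXY_PREFIX + "/"):
        raise ValueError("not a /v1direct URL")
    forwarded_path = parts.path[len(PROXY_PREFIX):]  # everything after /v1direct
    return urlunsplit(("https", UPSTREAM_HOST, forwarded_path, parts.query, ""))
```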

Pricing

$0.50 per 1,000 requests, regardless of model or response size. (No token-based billing — flat rate.)

Limitations

• OAuth / Vertex AI auth flows are not supported; you must authenticate with x-goog-api-key or a Bearer token.
• Multipart file upload (upload/v1beta/files) is not yet supported. Inline base64 in the request body works fine.
• 5-minute soft timeout per request to match Railway's proxy ceiling.

Quickstart

The examples below appear in this order: gemini-cli, Node.js (@google/genai), Python (google-genai), cURL.

# Install Google's official agent CLI
npm install -g @google/gemini-cli

# Point it at papaiapi instead of Google's endpoint
export GOOGLE_GEMINI_BASE_URL="https://papaiapi.com/v1direct"
export GEMINI_API_KEY="sk_live_your_api_key_here"

# Done — same CLI, same agent, your papaiapi balance pays
gemini -m gemini-2.5-flash --yolo -p "List the .ts files here and tell me how many"
gemini -m gemini-2.5-pro -i "Refactor this codebase"

import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({
  apiKey: 'sk_live_your_api_key_here',
  httpOptions: { baseUrl: 'https://papaiapi.com/v1direct' },
});

// Generate
const result = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is 17 times 23?',
});
console.log(result.text);

// Stream
const stream = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: 'Write a haiku about proxies.',
});
for await (const chunk of stream) {
  process.stdout.write(chunk.text ?? ''); // chunk.text can be undefined for non-text chunks
}

// Function calling
const tools = [{ functionDeclarations: [{
  name: 'get_weather',
  description: 'Get the weather for a city',
  parameters: { type: 'object', properties: { city: { type: 'string' } } },
}]}];
const fc = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather in Tokyo?',
  config: { tools },
});
console.log(fc.functionCalls);

from google import genai
from google.genai import types

client = genai.Client(
    api_key="sk_live_your_api_key_here",
    http_options=types.HttpOptions(base_url="https://papaiapi.com/v1direct"),
)

# Generate
result = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="What is 17 times 23?",
)
print(result.text)

# Stream
for chunk in client.models.generate_content_stream(
    model="gemini-2.5-flash",
    contents="Write a haiku about proxies.",
):
    print(chunk.text or "", end="")  # chunk.text can be None for non-text chunks

# generateContent
curl -X POST "https://papaiapi.com/v1direct/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "x-goog-api-key: sk_live_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "What is 17 times 23?"}]}]
  }'

# streamGenerateContent (SSE)
curl -N -X POST "https://papaiapi.com/v1direct/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse" \
  -H "x-goog-api-key: sk_live_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "Count from 1 to 5."}]}]
  }'

# List models
curl "https://papaiapi.com/v1direct/v1beta/models" \
  -H "x-goog-api-key: sk_live_your_api_key_here"

How rotation works

Every request picks a random (key, US residential proxy) pair. On any of:

• 429 / quota exhausted → key cooled off 5 min, retried with a fresh pair
• 403 "denied access" / API_KEY_INVALID → key permanently invalidated, retried with a fresh pair
• 5xx upstream error → retried with a fresh pair (no key penalty)
• Network / proxy fetch failure → retried with a fresh pair

Up to 8 retries per request. After that, Google's actual error response is forwarded back to you verbatim. You're charged only on a final 2xx — failures cost you nothing.
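
To make the retry policy concrete, here is a simplified simulation of it. It is an illustration under stated assumptions, not the service's implementation: the `Pool` class and `forward` function are invented for this sketch, proxies are left out, "8 retries" is read as eight attempts after the first one, and the caller-supplied `send` function returns an HTTP status or `None` for a network failure.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Optional, Tuple

COOLDOWN_S = 300   # a key that hits 429 is cooled off for 5 minutes
MAX_RETRIES = 8    # assumed to mean retries after the first attempt


@dataclass
class Pool:
    keys: list
    cooling: dict = field(default_factory=dict)  # key -> time it is usable again
    dead: set = field(default_factory=set)       # permanently invalidated keys

    def usable(self, now: float) -> list:
        return [
            k for k in self.keys
            if k not in self.dead and self.cooling.get(k, 0.0) <= now
        ]


def forward(
    pool: Pool,
    send: Callable[[str], Optional[int]],
    now: float = 0.0,
    rng=random,
) -> Tuple[Optional[int], bool]:
    """Return (final_status, charged). Only a final 2xx is charged."""
    status: Optional[int] = None
    for _ in range(1 + MAX_RETRIES):
        candidates = pool.usable(now)
        if not candidates:
            break
        key = rng.choice(candidates)       # random key per attempt
        status = send(key)                 # None models a network failure
        if status is not None and 200 <= status < 300:
            return status, True
        if status == 429:
            pool.cooling[key] = now + COOLDOWN_S
        elif status == 403:
            pool.dead.add(key)
        # 5xx and network failures carry no key penalty; just try again
    return status, False
```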

Video Artifact Removal `/v1/videos/artifact-removal`

Per-frame alpha-blend inversion for videos. You supply a source video, a reference PNG of the overlay you want to remove (logo, subtitle burn-in, timestamp, network bug, etc.), and the pixel position. We extract raw frames with ffmpeg, reverse the alpha blend on the artifact region of each frame, and re-encode with the original audio.

The math is the same alpha-blend inverse used in the existing image watermark remover — generalised for arbitrary reference assets and applied per-frame: original = (output − α·logo) / (1 − α), with α taken from the reference image.
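
The formula can be checked on a single channel value. The sketch below applies the forward blend and then the inverse from the paragraph above; `blend`, `unblend`, and the `max_alpha` guard are names and choices made for this illustration, not parameters of the endpoint.

```python
def blend(original: float, alpha: float, logo: float = 255.0) -> float:
    """Forward alpha blend: output = (1 - alpha) * original + alpha * logo."""
    return (1.0 - alpha) * original + alpha * logo


def unblend(
    output: float,
    alpha: float,
    logo: float = 255.0,
    max_alpha: float = 0.99,
) -> float:
    """Inverse blend: original = (output - alpha * logo) / (1 - alpha).

    alpha is capped at max_alpha (a hypothetical guard) so the division
    stays finite, and the result is clamped to the 0..255 range.
    """
    a = min(alpha, max_alpha)
    value = (output - a * logo) / (1.0 - a)
    return min(255.0, max(0.0, value))
```

For example, a pixel of value 120 under a white overlay with α = 0.7 blends to 214.5, and `unblend(214.5, 0.7)` recovers approximately 120. As α approaches 1 the denominator approaches 0, which is why fully opaque pixels cannot be recovered.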

Endpoint

POST https://papaiapi.com/v1/videos/artifact-removal
Authorization: Bearer sk_live_your_api_key_here
Content-Type: multipart/form-data

Multipart fields

Field	Type	Description
`video`	file	required* Source video (mp4/mov/webm), ≤100 MB.
`reference`	file (PNG)	required* Reference asset. Either an RGBA PNG with the artifact drawn in white and its opacity in the alpha channel, or a brightness-encoded PNG (white = full opacity on a black background). The format is auto-detected.
`position`	JSON string	required Top-left corner of the artifact in the frame: `{"x":1100,"y":640}`.
`ref_width`	int	optional Resize the reference to this width before applying. Must be set together with `ref_height`.
`ref_height`	int	optional See above.
`logo_value`	int 0-255	optional Override the logo color. Defaults to 255 (white). Use 0 for a black overlay.

* Alternatively send a JSON body with video_url and reference_url as publicly-accessible HTTPS URLs.

Response

{
  "id": "vid_1700000000_abc123",
  "url": "https://papaiapi.com/temp/<outputfile>.mp4",
  "duration_seconds": 12.5,
  "frames_processed": 300,
  "processing_ms": 5400,
  "output_bytes": 2456789,
  "cost": 0.125
}

Output URLs expire after 1 hour. Download promptly if you need to keep them.

Pricing

$0.01 per second of source video. A 30-second clip costs $0.30. Failed requests (bad input, processing error) cost $0.

Limits

Max source duration: 5 minutes (configurable per-account on request).
Max file size: 100 MB per file (source video + reference).
Sync endpoint — request returns when processing completes (typical 0.5–3× real-time depending on resolution).
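
A client-side preflight check can catch over-limit inputs and estimate the charge before uploading. The sketch below is an assumption-laden example: `preflight_cost` is a hypothetical helper, "100 MB" is taken to mean 100 × 1024 × 1024 bytes, and the price is assumed to scale linearly with duration without rounding, which matches the sample response (12.5 s → 0.125).

```python
from decimal import Decimal

MAX_DURATION_S = 5 * 60              # default limit: 5 minutes
MAX_FILE_BYTES = 100 * 1024 * 1024   # assumes 1 MB = 1024 * 1024 bytes
PRICE_PER_SOURCE_SECOND = Decimal("0.01")


def preflight_cost(duration_s: float, video_bytes: int, reference_bytes: int) -> Decimal:
    """Validate the documented limits and return the expected cost in USD."""
    if duration_s <= 0 or duration_s > MAX_DURATION_S:
        raise ValueError("source duration must be within 0-300 seconds")
    if max(video_bytes, reference_bytes) > MAX_FILE_BYTES:
        raise ValueError("each file must be at most 100 MB")
    return PRICE_PER_SOURCE_SECOND * Decimal(str(duration_s))
```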

Preparing the reference asset

You need one PNG showing what to remove. Two formats work:

RGBA (most editors): logo on transparent background, alpha = opacity. White (255,255,255) RGB inside the logo, alpha channel carries the gradient.
Brightness-encoded: opaque PNG, black background, logo drawn in white. The brightness value at each pixel = α × 255.

If you don't have a clean reference, the simplest capture: render a black-background sample of whatever produces the overlay, crop the artifact region, save as PNG.
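
The two reference formats differ only in where α is stored. As a minimal sketch, assuming 8-bit channels and a grey-scale logo in the brightness-encoded case, α for one pixel could be read like this (both function names are hypothetical, and the server's auto-detection logic is not documented here):

```python
def alpha_from_rgba(pixel: tuple) -> float:
    """RGBA reference: alpha is carried by the fourth channel."""
    _r, _g, _b, a = pixel
    return a / 255.0


def alpha_from_brightness(pixel: tuple) -> float:
    """Brightness-encoded reference: brightness = alpha * 255.

    Brightness is taken as the mean of R, G, B, which is an assumption;
    for a pure grey pixel all three channels are equal anyway.
    """
    r, g, b = pixel[:3]
    return (r + g + b) / 3.0 / 255.0
```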

Quick test

# Remove a 130x40 logo at corner (1100, 640) of a 1280x720 video
curl -X POST "https://papaiapi.com/v1/videos/artifact-removal" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -F "video=@input.mp4" \
  -F "reference=@logo.png" \
  -F 'position={"x":1100,"y":640}'

Caveats

• Pixels where α ≈ 1 (fully opaque points) are mathematically unrecoverable; they get capped near the logo color. Most real-world artifacts are partially transparent (α ≈ 0.6–0.85 in the strokes) and recover cleanly.
• Re-encoding to H.264 reintroduces small chroma compression artifacts, so the cleaned region will be slightly softer than the rest. Use a high reference resolution and the lowest crf you can afford if quality matters.
• Position is fixed across the whole video. If the artifact moves or animates, this technique doesn't apply; you'd want per-frame detection (out of scope for this endpoint).

Code Examples

The examples below appear in this order: Python, cURL, Node.js.

from openai import OpenAI

client = OpenAI(
    base_url="https://papaiapi.com/v1",
    api_key="sk_live_your_api_key_here"
)

# Simple request
response = client.chat.completions.create(
    model="gemini-flash",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="gemini-flash",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

# Simple request
curl -X POST "https://papaiapi.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -d '{
    "model": "gemini-flash",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Streaming
curl -X POST "https://papaiapi.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -d '{
    "model": "gemini-flash",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://papaiapi.com/v1',
  apiKey: 'sk_live_your_api_key_here'
});

// Simple request
const response = await client.chat.completions.create({
  model: 'gemini-flash',
  messages: [{ role: 'user', content: 'Hello!' }]
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: 'gemini-flash',
  messages: [{ role: 'user', content: 'Write a poem' }],
  stream: true
});
for await (const chunk of stream) {
  if (chunk.choices[0].delta.content) {
    process.stdout.write(chunk.choices[0].delta.content);
  }
}

Response Format

Chat Completion

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "gemini-flash",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}

Streaming Response (OpenAI)

data: {"id":"chatcmpl-abc","choices":[{"delta":{"role":"assistant"},"index":0}]}

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Hello"},"index":0}]}

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"!"},"index":0}]}

data: {"id":"chatcmpl-abc","choices":[{"delta":{},"finish_reason":"stop","index":0}]}

data: [DONE]
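
If you consume the stream without an SDK, the events above can be reassembled by hand. The following is a minimal sketch that takes already-decoded lines and joins the `delta.content` pieces; `collect_stream_text` is a name chosen for this example.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Join the delta.content pieces of an OpenAI-style SSE stream."""
    pieces = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue                      # skip blank lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break                         # end-of-stream sentinel
        event = json.loads(payload)
        for choice in event.get("choices", []):
            pieces.append(choice.get("delta", {}).get("content") or "")
    return "".join(pieces)
```

Fed the five sample events above, the function returns the string "Hello!".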

Image Generation

{
  "created": 1234567890,
  "data": [{
    "url": "https://papaiapi.com/temp/abc123.jpg",
    "revised_prompt": "your original prompt"
  }]
}

Note: Image URLs expire after 1 hour. Download images promptly if you need to keep them.

Image Generation Examples

The examples below appear in this order: Python, cURL, Node.js.

from openai import OpenAI

client = OpenAI(
    base_url="https://papaiapi.com/v1",
    api_key="sk_live_your_api_key_here"
)

# Grok (default)
response = client.images.generate(
    prompt="A cat wearing a tiny hat",
    n=1
)
print(response.data[0].url)

# Gemini Image
response = client.images.generate(
    model="gemini-image",
    prompt="A sunset over mountains"
)
print(response.data[0].url)

# Flow Image (NanoBanana Pro 2K, landscape)
response = client.images.generate(
    model="nanobananapro2k",
    prompt="A futuristic cityscape",
    size="1792x1024"
)
print(response.data[0].url)

# Grok (default)
curl -X POST "https://papaiapi.com/v1/images/generations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -d '{
    "prompt": "A cat wearing a tiny hat"
  }'

# Gemini Image
curl -X POST "https://papaiapi.com/v1/images/generations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -d '{
    "model": "gemini-image",
    "prompt": "A sunset over mountains"
  }'

# Flow Image (NanoBanana Pro 2K, portrait)
curl -X POST "https://papaiapi.com/v1/images/generations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -d '{
    "model": "nanobananapro2k",
    "prompt": "A futuristic cityscape",
    "size": "1024x1792"
  }'

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://papaiapi.com/v1',
  apiKey: 'sk_live_your_api_key_here'
});

// Grok (default)
const response = await client.images.generate({
  prompt: 'A cat wearing a tiny hat'
});
console.log(response.data[0].url);

// Gemini Image
const geminiResponse = await client.images.generate({
  model: 'gemini-image',
  prompt: 'A sunset over mountains'
});
console.log(geminiResponse.data[0].url);

// Flow Image (NanoBanana Pro 2K, landscape)
const flowResponse = await client.images.generate({
  model: 'nanobananapro2k',
  prompt: 'A futuristic cityscape',
  size: '1792x1024'
});
console.log(flowResponse.data[0].url);

Voice Generation Examples

The examples below appear in this order: Python, cURL, Node.js.

import requests

response = requests.post(
    "https://papaiapi.com/v1/audio/speech",
    headers={
        "Authorization": "Bearer sk_live_your_api_key_here",
        "Content-Type": "application/json"
    },
    json={
        "model": "gemini-voice",
        "input": "Hello, welcome to our podcast!"
    }
)
data = response.json()
print(data["url"])  # WAV file URL

# Generate voice audio
curl -X POST "https://papaiapi.com/v1/audio/speech" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -d '{
    "model": "gemini-voice",
    "input": "Hello, welcome to our podcast!"
  }'

const response = await fetch('https://papaiapi.com/v1/audio/speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk_live_your_api_key_here',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'gemini-voice',
    input: 'Hello, welcome to our podcast!'
  })
});
const data = await response.json();
console.log(data.url); // WAV file URL

File Analysis (Gemini Files)

Analyze documents, images, and videos with Gemini AI. Upload a file and provide a prompt.

Endpoint

POST /v1/files/chat

Request (Multipart Upload)

Send a file via multipart/form-data with a file field and prompt field.

Request (URL)

Or send JSON with file_url (publicly accessible URL) and prompt.

Response

Returns an OpenAI-compatible chat completion response with the analysis in choices[0].message.content.

File Analysis Examples

The examples below appear in this order: cURL, Python, Node.js.

# Upload a file
curl -X POST "https://papaiapi.com/v1/files/chat" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -F "file=@document.pdf" \
  -F "prompt=Summarize this document"

# Or use a file URL
curl -X POST "https://papaiapi.com/v1/files/chat" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "file_url": "https://example.com/document.pdf",
    "prompt": "Summarize this document"
  }'

import requests

# Upload a file
with open('document.pdf', 'rb') as f:
    response = requests.post(
        'https://papaiapi.com/v1/files/chat',
        headers={'Authorization': 'Bearer sk_live_your_api_key_here'},
        files={'file': f},
        data={'prompt': 'Summarize this document'}
    )
print(response.json()['choices'][0]['message']['content'])

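import { readFile } from 'node:fs/promises';

// Node 18+: the built-in FormData takes a Blob, not a file stream
const form = new FormData();
form.append('file', new Blob([await readFile('document.pdf')]), 'document.pdf');
form.append('prompt', 'Summarize this document');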

const response = await fetch('https://papaiapi.com/v1/files/chat', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk_live_your_api_key_here',
  },
  body: form
});
const data = await response.json();
console.log(data.choices[0].message.content);

Gemini SDK Examples

The examples below appear in this order: Python, cURL, Node.js.

import google.generativeai as genai

genai.configure(
    api_key="sk_live_your_api_key_here",
    transport="rest",
    client_options={"api_endpoint": "papaiapi.com"}
)

model = genai.GenerativeModel('gemini-flash')
response = model.generate_content("Hello!")
print(response.text)

# Generate content
curl -X POST "https://papaiapi.com/v1beta/models/gemini-flash:generateContent?key=sk_live_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [{"text": "Hello!"}]
    }]
  }'

# Streaming
curl -X POST "https://papaiapi.com/v1beta/models/gemini-flash:streamGenerateContent?key=sk_live_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [{"text": "Write a poem"}]
    }]
  }'

import { GoogleGenerativeAI } from '@google/generative-ai';

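const genAI = new GoogleGenerativeAI('sk_live_your_api_key_here');

// Request options such as baseUrl are passed to getGenerativeModel, not the constructor.
// The SDK appends /v1beta itself, so baseUrl is the bare host.
const model = genAI.getGenerativeModel(
  { model: 'gemini-flash' },
  { baseUrl: 'https://papaiapi.com' }
);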
const result = await model.generateContent('Hello!');
console.log(result.response.text());

Gemini Response Format

{
  "candidates": [{
    "content": {
      "parts": [{"text": "Hello! How can I help you?"}],
      "role": "model"
    },
    "finishReason": "STOP",
    "index": 0
  }],
  "usageMetadata": {
    "promptTokenCount": 0,
    "candidatesTokenCount": 0,
    "totalTokenCount": 0
  }
}

Gemini Streaming Response

data: {"candidates":[{"content":{"parts":[{"text":"Hello"}],"role":"model"},"index":0}]}

data: {"candidates":[{"content":{"parts":[{"text":"!"}],"role":"model"},"index":0}]}

data: {"candidates":[{"content":{"parts":[{"text":""}],"role":"model"},"finishReason":"STOP","index":0}]}
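
The Gemini-style stream can be reassembled in the same spirit as the OpenAI one, except that the text lives under `candidates[].content.parts[].text` and there is no `[DONE]` sentinel in the sample above. A minimal sketch, with `collect_gemini_text` as a hypothetical helper name:

```python
import json
from typing import Iterable


def collect_gemini_text(lines: Iterable[str]) -> str:
    """Join the text parts of a Gemini-style SSE stream."""
    pieces = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue                      # skip blank separator lines
        event = json.loads(line[len("data:"):].strip())
        for candidate in event.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                pieces.append(part.get("text", ""))
    return "".join(pieces)
```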

Rate Limits

Limit	Value
Concurrent requests per user	3 (configurable)
Request timeout	3 minutes (chat), 5 minutes (images)
Max prompt length	~95,000 characters
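
To stay under the concurrency limit on the client side, parallel calls can be gated with a semaphore. The sketch below uses asyncio and assumes the default limit of 3; `run_limited` is a name invented for this example, and each job is any zero-argument coroutine function that performs one API call.

```python
import asyncio
from typing import Awaitable, Callable, Sequence

MAX_CONCURRENT = 3  # default per-user limit from the table above


async def run_limited(
    jobs: Sequence[Callable[[], Awaitable]],
    limit: int = MAX_CONCURRENT,
) -> list:
    """Run all jobs, keeping at most `limit` of them in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        async with semaphore:
            return await job()

    # gather preserves the order of the input jobs in its result list
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Call it with `asyncio.run(run_limited(jobs))`. Keeping the gate in the client avoids burning requests on 429 responses.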

Error Codes

HTTP Code	OpenAI Error	Gemini Error	Description
`401`	invalid_api_key	UNAUTHENTICATED	Invalid or missing API key
`402`	insufficient_balance	FAILED_PRECONDITION	Insufficient balance - top up required
`429`	rate_limit_exceeded	RESOURCE_EXHAUSTED	Too many concurrent requests
`500`	internal_error	INTERNAL	Server error - try again
`504`	timeout	DEADLINE_EXCEEDED	Request timeout

OpenAI Error Format

{
  "error": {
    "message": "Insufficient balance",
    "type": "insufficient_balance",
    "code": 402
  }
}

Gemini Error Format

{
  "error": {
    "code": 402,
    "message": "Insufficient balance",
    "status": "FAILED_PRECONDITION"
  }
}
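
Since the two error bodies differ only in field names, a client that talks to both API styles can normalise them. The sketch below is one possible shape; `parse_error` is a hypothetical helper, and treating 429, 500 and 504 as retryable is an assumption drawn from the descriptions in the error table, not a documented guarantee.

```python
RETRYABLE_STATUSES = {429, 500, 504}  # assumption based on the error table


def parse_error(http_status: int, body: dict) -> dict:
    """Normalise an OpenAI-style or Gemini-style error body."""
    error = body.get("error", {})
    # OpenAI style carries the kind in "type", Gemini style in "status"
    kind = error.get("type") or error.get("status") or "unknown"
    return {
        "http_status": http_status,
        "kind": kind,
        "message": error.get("message", ""),
        "retryable": http_status in RETRYABLE_STATUSES,
    }
```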

Important Notes

• Timeouts: Chat requests may take up to 3 minutes. Use stream: true for better reliability.
• Images: Generation may take up to 5 minutes. URLs expire after 1 hour - download promptly.
• Voice: Generation may take up to 5 minutes (beta: ~40% success rate). WAV URLs expire after 1 hour.
• Retries: Failed requests are automatically retried up to 3 times before returning an error.

Use papaiapi with Google's `gemini-cli`

The official Gemini CLI is Google's open-source agent (file edits, shell, browser, etc.). Point it at papaiapi instead of Google's endpoint and your papaiapi balance pays for everything — no Google Cloud account, no project setup, no quota of your own to manage.

Setup takes about 2 minutes.

Get a papaiapi API key

Open Telegram and message @papaiapibot:

1. Send /start
2. Tap API Keys → Generate Key
3. Copy the key (starts with sk_live_…)
4. Tap Top up balance and add at least a few dollars (each request costs $0.0005, so $5 buys ~10,000 calls)

Install gemini-cli

Requires Node.js 20+ (check with node -v).

npm install -g @google/gemini-cli

# verify
gemini --version

If npm reports permission errors, prefix the command with sudo or use a Node version manager (nvm, fnm, volta).

Point gemini-cli at papaiapi

Just two environment variables. Open a terminal and:

export GOOGLE_GEMINI_BASE_URL="https://papaiapi.com/v1direct"
export GEMINI_API_KEY="sk_live_your_papaiapi_key_here"

To make it permanent, add those two lines to your shell profile (~/.zshrc on macOS, ~/.bashrc on Linux), then reload it with source ~/.zshrc or source ~/.bashrc.

How it works

GOOGLE_GEMINI_BASE_URL tells the CLI to send all requests to papaiapi instead of generativelanguage.googleapis.com. GEMINI_API_KEY is your papaiapi key, which we use to authenticate and charge your balance. We then forward each request through a rotated pool of 468 free Google AI Studio keys + 47 US residential proxies, transparently retrying on quota errors.

Pick a model that won't 429

Our pool runs on free-tier keys, so Google's Pro models hit quota almost immediately. Stick to Flash models — they're fast, cheap, and have generous free quota.

Recommended models, in order:

Model	When to use	Status
`gemini-2.5-flash`	Default pick — battle-tested, fastest, GA	✅ Stable
`gemini-3-flash-preview`	When you want the newest model	⚠️ Preview (may change)
`gemini-flash-latest`	Set-and-forget — auto-tracks the latest stable Flash	✅ Alias

Avoid: any *-pro* or "Auto" picker — they'll escalate to Pro and 429.

Run it

One-shot prompt:

gemini -m gemini-2.5-flash -p "What is 17 times 23?"
# → 391

Interactive REPL:

gemini -m gemini-2.5-flash

Full agent mode (auto-approve all tool calls, file edits, shell commands):

cd ~/your-project
gemini -m gemini-2.5-flash --yolo -p "Add a README explaining what this repo does"

Streaming a long task (the CLI streams by default; just use -i to start a session):

gemini -m gemini-2.5-flash -i "Refactor all .ts files to use async/await"

Switching models

You have three ways to choose / change the model:

A. Per command (one-off):

gemini -m gemini-3-flash-preview -p "Hello"

B. Inside an interactive session — open the model picker:

1. Start gemini
2. Press Esc to enter command mode
3. Type /model and press Enter
4. Pick option 3 (Manual)
5. Type gemini-2.5-flash (or gemini-3-flash-preview)
6. Press Tab to toggle "Remember model for future sessions" to true
7. Press Enter

After step 6, the CLI saves your choice to ~/.gemini/settings.json and you won't see the picker again.

C. Set a default in your shell profile:

# add to ~/.zshrc or ~/.bashrc
alias gemini='gemini -m gemini-2.5-flash'

Verify everything works

Quick sanity check:

gemini -m gemini-2.5-flash -p "Reply with the word OK only."
# expect: OK

If you see OK — you're done. ✅

Troubleshooting

What you see	What it means	Fix
`401 invalid api key`	Your `GEMINI_API_KEY` isn't a papaiapi key, or it's revoked	Re-generate via @papaiapibot
`402 Insufficient balance`	Your papaiapi balance is below $0.0005	Top up via Telegram bot
`429 quota exhausted`	You picked a Pro model — free-tier keys can't fund it	Switch to `gemini-2.5-flash` or `gemini-3-flash-preview`
`404 model not found`	You typed a model name Google doesn't recognize (e.g. `gemini-flash`)	Use the full ID like `gemini-2.5-flash`
`429 Rate limit exceeded. Maximum N concurrent`	This is papaiapi's per-account thread limit — your client is firing too many parallel requests	Reduce parallelism, or DM us to bump your limit
CLI prompts for Google login / OAuth flow	The CLI is in OAuth mode; we only support API-key auth	Make sure `GEMINI_API_KEY` is set; in the auth picker, choose "Use Gemini API key"
Slow/hanging responses	One of our proxies is flaky; we'll retry up to 3× automatically	Wait. Most resolve in under 30s; the 1% tail can take about a minute

FAQ

Does this work with the agentic / yolo mode?

Yes — function calling, multi-turn tool use, file edits, shell commands all work. We forward Google's API verbatim.

What about file uploads?

Inline base64 (in the request body) works. Multipart upload to /upload/v1beta/files isn't supported yet — workaround is to base64-encode the file and pass it as inlineData.
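
The inline workaround amounts to building a generateContent body by hand. A minimal sketch follows, assuming the standard Gemini REST shape with an `inlineData` part (`mimeType` plus base64 `data`); the helper name `inline_file_request` is made up for this example. Keep in mind that base64 inflates the payload by about a third.

```python
import base64


def inline_file_request(data: bytes, mime_type: str, prompt: str) -> dict:
    """Build a generateContent body carrying a file as inline base64."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "contents": [{
            "parts": [
                {"inlineData": {"mimeType": mime_type, "data": encoded}},
                {"text": prompt},
            ]
        }]
    }
```

The returned dict can be serialised with `json.dumps` and posted to a `/v1direct/v1beta/models/{model}:generateContent` URL with the usual `x-goog-api-key` header.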

Does my conversation history leak to other users?

No. Each request is stateless from our side; we only forward it. The CLI itself maintains your conversation context locally.

How is this different from `/v1/chat/completions` and `/v1beta/...`?

Those endpoints use browser automation behind the scenes: slower (3-30s), but they work without API keys. /v1direct uses the real Gemini API with rotated free keys: much faster (1-5s typical) with support for every Gemini feature, but it only works because we have a key pool. For agentic work, /v1direct is the right choice.

Can I use any other Gemini-compatible tool?

Yes. Anything that accepts a custom base URL and uses x-goog-api-key auth works. Examples that have been tested: @google/genai (Node), google-genai (Python), gemini-cli, custom apps using the REST API.
