papaiapi

OpenAI & Gemini compatible API • 12x cheaper

Getting Started

1. Get an API key from our Telegram Bot
2. Use the OpenAI SDK, the Gemini SDK, or point gemini-cli at /v1direct. All three work!
3. Pricing: Gemini Chat $0.50/1k requests • Direct Gemini $0.50/1k requests • Grok Image $0.005/image • Gemini Image $0.02/image • Flow Image $0.03/image • Gemini Voice $0.03/generation • Gemini Files $0.01/request • Video Artifact Removal $0.01/source-second
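To budget a workload against the prices above, a small estimator can help. This is a minimal sketch: the unit names and the `estimate_cost` helper are made up for illustration, and only the dollar amounts come from the pricing line.

```python
from decimal import Decimal

# Unit prices in USD, copied from the pricing line above.
# The dictionary keys are illustrative names, not API identifiers.
PRICES = {
    "gemini_chat_request": Decimal("0.50") / 1000,
    "direct_gemini_request": Decimal("0.50") / 1000,
    "grok_image": Decimal("0.005"),
    "gemini_image": Decimal("0.02"),
    "flow_image": Decimal("0.03"),
    "gemini_voice_generation": Decimal("0.03"),
    "gemini_files_request": Decimal("0.01"),
    "video_source_second": Decimal("0.01"),
}


def estimate_cost(usage):
    """Return the estimated USD cost for a {unit_name: quantity} mapping."""
    total = Decimal("0")
    for unit, quantity in usage.items():
        total += PRICES[unit] * Decimal(str(quantity))
    return total


print(estimate_cost({"direct_gemini_request": 10_000}))  # 10,000 direct calls
print(estimate_cost({"video_source_second": 12.5}))      # a 12.5-second clip
```

Decimal arithmetic is used so that small per-request prices do not pick up floating-point rounding noise.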

Available Models

Model | Type | Description
gemini-flash | Chat | Fast text generation via Gemini
grok | Image | Grok image generation
gemini-image | Image | Gemini image generation
imagen4 | Image | Flow Image - Imagen 4
nanobanana | Image | Flow Image - NanoBanana
nanobananapro1k | Image | Flow Image - NanoBanana Pro 1K
nanobananapro2k | Image | Flow Image - NanoBanana Pro 2K
gemini-voice | Audio | Gemini Voice Generation (Beta) - $0.03/generation
gemini-files | File Analysis | Gemini Files (Beta) - File + Prompt Analysis - $0.01/request
any Gemini model | Direct Proxy | Use the official model name (e.g. gemini-2.5-flash, gemini-2.5-pro) via /v1direct (see below)

Base URLs

SDK / Style | Base URL
OpenAI | https://papaiapi.com/v1
Gemini (managed) | https://papaiapi.com/v1beta
Gemini (direct proxy) | https://papaiapi.com/v1direct

Authentication

OpenAI style:

Authorization: Bearer sk_live_your_api_key_here

Gemini style:

?key=sk_live_your_api_key_here
# or header:
x-goog-api-key: sk_live_your_api_key_here
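The three authentication styles above can be captured in one helper. This is a sketch only; the function name and the style labels are hypothetical, and the header and query names are the ones listed above.

```python
def auth_request_parts(api_key, style="openai"):
    """Return (headers, query_params) for the chosen authentication style."""
    if style == "openai":
        return {"Authorization": f"Bearer {api_key}"}, {}
    if style == "gemini-header":
        return {"x-goog-api-key": api_key}, {}
    if style == "gemini-query":
        return {}, {"key": api_key}
    raise ValueError(f"unknown auth style: {style!r}")


headers, params = auth_request_parts("sk_live_your_api_key_here", "gemini-header")
print(headers, params)
```

The returned pair can be passed straight to an HTTP client, for example as the `headers=` and `params=` arguments of `requests.post`.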

Endpoints

POST /v1/chat/completions

Create a chat completion (OpenAI-compatible)

Request Body

Parameter | Type | Description
model | string, required | Model to use: gemini-flash
messages | array, required | Array of message objects with role and content
stream | boolean, optional | Enable streaming responses (recommended)

POST /v1/images/generations

Generate images (OpenAI-compatible). Supports Grok, Gemini Image, and Flow Image models.

Request Body

Parameter | Type | Description
prompt | string, required | Text description of the image to generate
model | string, optional | Model to use: grok (default), gemini-image, imagen4, nanobanana, nanobananapro1k, nanobananapro2k
n | integer, optional | Number of images (default: 1, currently only 1 supported)
size | string, optional | Aspect ratio for Flow Image models: 1792x1024 (landscape) or 1024x1792 (portrait). Ignored for Grok/Gemini.

POST /v1/audio/speech

Generate voice audio from text (OpenAI-compatible TTS). Beta: ~40% success rate.

Request Body

Parameter | Type | Description
model | string, optional | Model to use: gemini-voice (default)
input | string, required | Text to generate audio from

Response

{
  "url": "https://papaiapi.com/temp/abc123.wav"
}

Note: Audio URLs expire after 1 hour. Download files promptly if you need to keep them.
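Because the beta voice endpoint fails more often than it succeeds, callers may want a client-side retry loop. The sketch below is one possible approach, not part of the API: `call_with_retries`, its defaults, and the stand-in request function are all assumptions for illustration.

```python
import time


def call_with_retries(operation, max_attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as error:  # narrow this to your HTTP client's errors
            last_error = error
            if attempt < max_attempts:
                sleep(delay_seconds)
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


# Demo with a stand-in operation that fails twice, then succeeds.
attempts = []


def fake_speech_request():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("simulated failure")
    return {"url": "https://papaiapi.com/temp/abc123.wav"}


result = call_with_retries(fake_speech_request, sleep=lambda _seconds: None)
print(result["url"], "after", len(attempts), "attempts")
```

If attempts were independent and each succeeded about 40% of the time, five attempts would all fail roughly 8% of the time (0.6^5 ≈ 0.078). Download the WAV as soon as a call succeeds, since the URL expires after 1 hour.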

POST /v1/files/chat

Analyze files (documents, images, videos) with a prompt using Gemini AI (Beta).

Request (Multipart)

Field | Type | Description
file | file, required* | File to analyze (PDF, image, video, etc.). Max 50MB.
prompt | string, required | Prompt/question about the file

Request (JSON)

Parameter | Type | Description
file_url | string, required* | Publicly accessible URL of the file
prompt | string, required | Prompt/question about the file

* Provide either file (multipart) or file_url (JSON).
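The either/or rule above can be enforced before sending anything. The helper below is a hypothetical sketch that returns keyword arguments suitable for `requests.post`; the 50MB limit is interpreted as 50 × 1024 × 1024 bytes, which is an assumption.

```python
import os

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumption: "50MB" read as 50 MiB


def files_chat_request(prompt, file_path=None, file_url=None):
    """Return keyword arguments for requests.post(), choosing multipart or JSON."""
    if (file_path is None) == (file_url is None):
        raise ValueError("provide exactly one of file_path or file_url")
    if file_url is not None:
        # JSON mode: the service fetches the file from a public URL.
        return {"json": {"file_url": file_url, "prompt": prompt}}
    if os.path.getsize(file_path) > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 50MB upload limit")
    with open(file_path, "rb") as handle:
        content = handle.read()
    # Multipart mode: the file goes in "file", the prompt in a form field.
    return {
        "files": {"file": (os.path.basename(file_path), content)},
        "data": {"prompt": prompt},
    }


kwargs = files_chat_request("Summarize this document",
                            file_url="https://example.com/document.pdf")
print(kwargs)
```

With the `requests` library installed, the result would be used as `requests.post(url, headers=..., **kwargs)`.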

Response

Returns an OpenAI-compatible chat completion response.

GET /v1/models

List available models (OpenAI)

Gemini API Endpoints

POST /v1beta/models/{model}:generateContent

Generate content (Gemini-compatible)

Request Body

Parameter | Type | Description
contents | array, required | Array of content objects with parts
generationConfig | object, optional | Generation config (temperature, maxOutputTokens, etc.)

POST /v1beta/models/{model}:streamGenerateContent

Stream generate content (Gemini-compatible SSE)

GET /v1beta/models

List available models (Gemini)

Direct Gemini API Proxy /v1direct

A transparent proxy in front of Google's generativelanguage.googleapis.com. Anything you'd send to the Google endpoint, send to /v1direct instead and we forward it through a rotated pool of Gemini API keys + US residential proxies. You get every official Gemini model and feature (text, multimodal, function calling, streaming, long context, thinking, etc.) with no quota of your own, billed per request through your papaiapi balance.

Why use it

• Drop-in compatible with the official @google/genai / google-genai SDKs and Google's own gemini-cli agent: just change the base URL.
• Access every Gemini model by its real name (gemini-2.5-flash, gemini-2.5-pro, etc.), with no need for our managed model aliases.
• Auto-retry and key rotation on quota / 429 / 403 errors, totally transparent to you.

Base URL & Auth

BASE_URL  https://papaiapi.com/v1direct
HEADER    x-goog-api-key: sk_live_your_api_key_here
# or:     Authorization: Bearer sk_live_your_api_key_here
# or:     ?key=sk_live_your_api_key_here

The path you send is forwarded verbatim to Google. So:

POST  https://papaiapi.com/v1direct/v1beta/models/gemini-2.5-flash:generateContent
   →  https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
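Since the path is forwarded verbatim, converting an existing Google URL is a prefix swap. A minimal sketch, with a hypothetical helper name:

```python
GOOGLE_BASE = "https://generativelanguage.googleapis.com"
PROXY_BASE = "https://papaiapi.com/v1direct"


def to_proxy_url(google_url):
    """Swap the Google host for the /v1direct base, keeping the path unchanged."""
    if not google_url.startswith(GOOGLE_BASE + "/"):
        raise ValueError("not a generativelanguage.googleapis.com URL")
    return PROXY_BASE + google_url[len(GOOGLE_BASE):]


print(to_proxy_url(
    "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent"
))
```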

Pricing

$0.50 per 1,000 requests, regardless of model or response size. (Flat rate; no token-based billing.)

Limitations

• OAuth / Vertex AI auth flows are not supported; you must use x-goog-api-key / Bearer mode.
• Multipart file upload (upload/v1beta/files) is not yet supported. Inline base64 in the request body works fine.
• 5-minute soft timeout per request to match Railway's proxy ceiling.

Quickstart

The same setup in gemini-cli, Node.js (@google/genai), Python (google-genai), and cURL, in that order:

# Install Google's official agent CLI
npm install -g @google/gemini-cli

# Point it at papaiapi instead of Google's endpoint
export GOOGLE_GEMINI_BASE_URL="https://papaiapi.com/v1direct"
export GEMINI_API_KEY="sk_live_your_api_key_here"

# Done: same CLI, same agent, billed to your papaiapi balance
gemini -m gemini-2.5-flash --yolo -p "List the .ts files here and tell me how many"
gemini -m gemini-2.5-pro -i "Refactor this codebase"

Node.js (@google/genai):

import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({
  apiKey: 'sk_live_your_api_key_here',
  httpOptions: { baseUrl: 'https://papaiapi.com/v1direct' },
});

// Generate
const result = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is 17 times 23?',
});
console.log(result.text);

// Stream
const stream = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: 'Write a haiku about proxies.',
});
for await (const chunk of stream) {
  process.stdout.write(chunk.text ?? ''); // chunk.text can be undefined for non-text chunks
}

// Function calling
const tools = [{ functionDeclarations: [{
  name: 'get_weather',
  description: 'Get the weather for a city',
  parameters: { type: 'object', properties: { city: { type: 'string' } } },
}]}];
const fc = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather in Tokyo?',
  config: { tools },
});
console.log(fc.functionCalls);

Python (google-genai):

from google import genai
from google.genai import types

client = genai.Client(
    api_key="sk_live_your_api_key_here",
    http_options=types.HttpOptions(base_url="https://papaiapi.com/v1direct"),
)

# Generate
result = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="What is 17 times 23?",
)
print(result.text)

# Stream
for chunk in client.models.generate_content_stream(
    model="gemini-2.5-flash",
    contents="Write a haiku about proxies.",
):
    print(chunk.text or "", end="")  # chunk.text can be None for non-text chunks

cURL:

# generateContent
curl -X POST "https://papaiapi.com/v1direct/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "x-goog-api-key: sk_live_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "What is 17 times 23?"}]}]
  }'

# streamGenerateContent (SSE)
curl -N -X POST "https://papaiapi.com/v1direct/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse" \
  -H "x-goog-api-key: sk_live_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "Count from 1 to 5."}]}]
  }'

# List models
curl "https://papaiapi.com/v1direct/v1beta/models" \
  -H "x-goog-api-key: sk_live_your_api_key_here"

How rotation works

Every request picks a random (key, US residential proxy) pair. On a quota, 429, or 403 error (the cases named under "Why use it"), the request is retried, up to 8 retries per request. After that, Google's actual error response is forwarded back to you verbatim. You're charged only on a final 2xx; failures cost you nothing.
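The rotation behaviour described above can be pictured with a small simulation. This is an illustrative sketch, not the service's actual implementation: the set of retryable statuses is an assumption taken from the "Why use it" list, "up to 8 retries" is read as one initial attempt plus eight retries, and `send` is a stand-in for the upstream call.

```python
import random

RETRYABLE_STATUSES = {403, 429}  # assumption, based on the "Why use it" list
MAX_RETRIES = 8


def forward_with_rotation(send, keys, proxies, rng=random):
    """Return (status, body, charged); send(key, proxy) returns (status, body)."""
    status, body = None, None
    for _attempt in range(1 + MAX_RETRIES):  # first try plus up to 8 retries
        key, proxy = rng.choice(keys), rng.choice(proxies)
        status, body = send(key, proxy)
        if status not in RETRYABLE_STATUSES:
            break
    charged = 200 <= status < 300  # billed only on a final 2xx
    return status, body, charged


# Demo: two simulated upstream failures followed by a success.
responses = iter([(429, "quota"), (403, "forbidden"), (200, "ok")])
calls = []


def fake_send(key, proxy):
    calls.append((key, proxy))
    return next(responses)


outcome = forward_with_rotation(fake_send, ["key-a", "key-b"], ["proxy-1", "proxy-2"])
print(outcome, "after", len(calls), "attempts")
```

When every attempt returns a retryable status, the last upstream response is returned with `charged` set to False, matching the "failures cost you nothing" rule.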

Video Artifact Removal /v1/videos/artifact-removal

Per-frame alpha-blend inversion for videos. You supply a source video, a reference PNG of the overlay you want to remove (logo, subtitle burn-in, timestamp, network bug, etc.), and the pixel position. We extract raw frames with ffmpeg, reverse the alpha blend on the artifact region of each frame, and re-encode with the original audio.

The math is the same alpha-blend inverse used in the existing image watermark remover, generalised for arbitrary reference assets and applied per frame: original = (output − α·logo) / (1 − α), with α taken from the reference image.
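For a single channel value, the inverse above can be sketched as follows. The forward blend output = α·logo + (1 − α)·original is the model that the inverse formula implies; the function names and the `max_alpha` clamp (to avoid dividing by zero where the overlay is fully opaque) are assumptions added for illustration.

```python
def blend(original, alpha, logo=255):
    """Forward alpha blend of one channel value (0-255) with the overlay."""
    return alpha * logo + (1 - alpha) * original


def unblend(output, alpha, logo=255, max_alpha=0.99):
    """Invert the blend: original = (output - alpha*logo) / (1 - alpha)."""
    alpha = min(alpha, max_alpha)  # avoid dividing by zero at full opacity
    value = (output - alpha * logo) / (1 - alpha)
    return max(0.0, min(255.0, value))  # clamp to the valid channel range


pixel = 120
seen = blend(pixel, alpha=0.4)          # what the watermarked frame shows
print(round(unblend(seen, alpha=0.4)))  # recovered channel value
```

The default `logo=255` mirrors the documented default of logo_value (white). Where α approaches 1 the original pixel carries almost no information, so recovery there is inherently approximate.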

Endpoint

POST https://papaiapi.com/v1/videos/artifact-removal
Authorization: Bearer sk_live_your_api_key_here
Content-Type: multipart/form-data

Multipart fields

Field | Type | Description
video | file, required* | Source video (mp4/mov/webm), ≤100 MB.
reference | file (PNG), required* | Reference asset. Either an RGBA PNG with the artifact drawn in white and its opacity carried in the alpha channel, or a brightness-encoded PNG (white = full opacity on a black background). The format is auto-detected.
position | JSON string, required | Top-left corner of the artifact in the frame: {"x":1100,"y":640}.
ref_width | int, optional | Resize the reference to this width before applying. Must be set together with ref_height.
ref_height | int, optional | Resize the reference to this height before applying. Must be set together with ref_width.
logo_value | int 0-255, optional | Override the logo color. Defaults to 255 (white). Use 0 for a black overlay.

* Alternatively send a JSON body with video_url and reference_url as publicly-accessible HTTPS URLs.

Response

{
  "id": "vid_1700000000_abc123",
  "url": "https://papaiapi.com/temp/<outputfile>.mp4",
  "duration_seconds": 12.5,
  "frames_processed": 300,
  "processing_ms": 5400,
  "output_bytes": 2456789,
  "cost": 0.125
}

Output URLs expire after 1 hour. Download promptly if you need to keep them.

Pricing

$0.01 per second of source video. A 30-second clip costs $0.30. Failed requests (bad input, processing error) cost $0.

Limits

Preparing the reference asset

You need one PNG showing what to remove. Two formats work:

  1. RGBA (most editors): logo on transparent background, alpha = opacity. White (255,255,255) RGB inside the logo, alpha channel carries the gradient.
  2. Brightness-encoded: opaque PNG, black background, logo drawn in white. The brightness value at each pixel = α × 255.

If you don't have a clean reference, the simplest capture: render a black-background sample of whatever produces the overlay, crop the artifact region, save as PNG.

Quick test

# Remove a 130x40 logo at corner (1100, 640) of a 1280x720 video
curl -X POST "https://papaiapi.com/v1/videos/artifact-removal" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -F "video=@input.mp4" \
  -F "reference=@logo.png" \
  -F 'position={"x":1100,"y":640}'
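Before uploading, it can be worth checking that the reference actually fits inside the frame at the chosen position. The sketch below builds the position field for the example above; the bounds check is a client-side assumption (the documentation does not say how out-of-frame positions are handled), and the helper name is hypothetical.

```python
import json


def position_field(x, y, ref_width, ref_height, frame_width, frame_height):
    """Return the JSON string for the position field, or raise if out of frame."""
    if x < 0 or y < 0:
        raise ValueError("position must not be negative")
    if x + ref_width > frame_width or y + ref_height > frame_height:
        raise ValueError("reference does not fit inside the frame at this position")
    return json.dumps({"x": x, "y": y}, separators=(",", ":"))


# The 130x40 logo at (1100, 640) in a 1280x720 video from the example above.
print(position_field(1100, 640, 130, 40, 1280, 720))
```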

Caveats

Code Examples

The same chat requests in Python, cURL, and Node.js, in that order:

from openai import OpenAI

client = OpenAI(
    base_url="https://papaiapi.com/v1",
    api_key="sk_live_your_api_key_here"
)

# Simple request
response = client.chat.completions.create(
    model="gemini-flash",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="gemini-flash",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

cURL:

# Simple request
curl -X POST "https://papaiapi.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -d '{
    "model": "gemini-flash",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Streaming
curl -X POST "https://papaiapi.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -d '{
    "model": "gemini-flash",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'

Node.js:

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://papaiapi.com/v1',
  apiKey: 'sk_live_your_api_key_here'
});

// Simple request
const response = await client.chat.completions.create({
  model: 'gemini-flash',
  messages: [{ role: 'user', content: 'Hello!' }]
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: 'gemini-flash',
  messages: [{ role: 'user', content: 'Write a poem' }],
  stream: true
});
for await (const chunk of stream) {
  if (chunk.choices[0].delta.content) {
    process.stdout.write(chunk.choices[0].delta.content);
  }
}

Response Format

Chat Completion

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "gemini-flash",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}

Streaming Response (OpenAI)

data: {"id":"chatcmpl-abc","choices":[{"delta":{"role":"assistant"},"index":0}]}

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Hello"},"index":0}]}

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"!"},"index":0}]}

data: {"id":"chatcmpl-abc","choices":[{"delta":{},"finish_reason":"stop","index":0}]}

data: [DONE]
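If you consume the stream without an SDK, the events above can be reassembled by hand. A minimal sketch, assuming each event arrives as one `data:` line as shown; the function name is hypothetical.

```python
import json


def collect_openai_stream(lines):
    """Concatenate delta.content from OpenAI-style SSE lines until [DONE]."""
    pieces = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                pieces.append(content)
    return "".join(pieces)


sample = [
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{"role":"assistant"},"index":0}]}',
    '',
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Hello"},"index":0}]}',
    '',
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"!"},"index":0}]}',
    '',
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{},"finish_reason":"stop","index":0}]}',
    '',
    'data: [DONE]',
]
print(collect_openai_stream(sample))
```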

Image Generation

{
  "created": 1234567890,
  "data": [{
    "url": "https://papaiapi.com/temp/abc123.jpg",
    "revised_prompt": "your original prompt"
  }]
}

Note: Image URLs expire after 1 hour. Download images promptly if you need to keep them.

Image Generation Examples

The same image requests in Python, cURL, and Node.js, in that order:

from openai import OpenAI

client = OpenAI(
    base_url="https://papaiapi.com/v1",
    api_key="sk_live_your_api_key_here"
)

# Grok (default)
response = client.images.generate(
    prompt="A cat wearing a tiny hat",
    n=1
)
print(response.data[0].url)

# Gemini Image
response = client.images.generate(
    model="gemini-image",
    prompt="A sunset over mountains"
)
print(response.data[0].url)

# Flow Image (NanoBanana Pro 2K, landscape)
response = client.images.generate(
    model="nanobananapro2k",
    prompt="A futuristic cityscape",
    size="1792x1024"
)
print(response.data[0].url)

cURL:

# Grok (default)
curl -X POST "https://papaiapi.com/v1/images/generations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -d '{
    "prompt": "A cat wearing a tiny hat"
  }'

# Gemini Image
curl -X POST "https://papaiapi.com/v1/images/generations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -d '{
    "model": "gemini-image",
    "prompt": "A sunset over mountains"
  }'

# Flow Image (NanoBanana Pro 2K, portrait)
curl -X POST "https://papaiapi.com/v1/images/generations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -d '{
    "model": "nanobananapro2k",
    "prompt": "A futuristic cityscape",
    "size": "1024x1792"
  }'

Node.js:

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://papaiapi.com/v1',
  apiKey: 'sk_live_your_api_key_here'
});

// Grok (default)
const response = await client.images.generate({
  prompt: 'A cat wearing a tiny hat'
});
console.log(response.data[0].url);

// Gemini Image
const geminiResponse = await client.images.generate({
  model: 'gemini-image',
  prompt: 'A sunset over mountains'
});
console.log(geminiResponse.data[0].url);

// Flow Image (NanoBanana Pro 2K, landscape)
const flowResponse = await client.images.generate({
  model: 'nanobananapro2k',
  prompt: 'A futuristic cityscape',
  size: '1792x1024'
});
console.log(flowResponse.data[0].url);
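The parameter rules for /v1/images/generations (a default model of grok, a single image per request, and size honoured only by Flow Image models) can be checked client-side before sending. This is a hedged sketch; `image_request` is a hypothetical helper, and dropping `size` for Grok/Gemini simply mirrors the documented "ignored" behaviour.

```python
FLOW_MODELS = {"imagen4", "nanobanana", "nanobananapro1k", "nanobananapro2k"}
OTHER_MODELS = {"grok", "gemini-image"}
FLOW_SIZES = {"1792x1024", "1024x1792"}  # landscape, portrait


def image_request(prompt, model="grok", size=None):
    """Build a JSON body for POST /v1/images/generations."""
    if not prompt:
        raise ValueError("prompt is required")
    if model not in FLOW_MODELS | OTHER_MODELS:
        raise ValueError(f"unknown image model: {model!r}")
    body = {"model": model, "prompt": prompt, "n": 1}  # only n=1 is supported
    if model in FLOW_MODELS and size is not None:
        if size not in FLOW_SIZES:
            raise ValueError(f"unsupported size: {size!r}")
        body["size"] = size
    return body  # size is left out for Grok/Gemini, which ignore it


print(image_request("A futuristic cityscape", model="nanobananapro2k", size="1792x1024"))
print(image_request("A cat wearing a tiny hat", size="1792x1024"))
```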

Voice Generation Examples

The same voice request in Python, cURL, and Node.js, in that order:

import requests

response = requests.post(
    "https://papaiapi.com/v1/audio/speech",
    headers={
        "Authorization": "Bearer sk_live_your_api_key_here",
        "Content-Type": "application/json"
    },
    json={
        "model": "gemini-voice",
        "input": "Hello, welcome to our podcast!"
    }
)
data = response.json()
print(data["url"])  # WAV file URL

cURL:

# Generate voice audio
curl -X POST "https://papaiapi.com/v1/audio/speech" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -d '{
    "model": "gemini-voice",
    "input": "Hello, welcome to our podcast!"
  }'

Node.js:

const response = await fetch('https://papaiapi.com/v1/audio/speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk_live_your_api_key_here',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'gemini-voice',
    input: 'Hello, welcome to our podcast!'
  })
});
const data = await response.json();
console.log(data.url); // WAV file URL

File Analysis (Gemini Files)

Analyze documents, images, and videos with Gemini AI. Upload a file and provide a prompt.

Endpoint

POST /v1/files/chat

Request (Multipart Upload)

Send a file via multipart/form-data with a file field and prompt field.

Request (URL)

Or send JSON with file_url (publicly accessible URL) and prompt.

Response

Returns an OpenAI-compatible chat completion response with the analysis in choices[0].message.content.

File Analysis Examples

The same file-analysis request in cURL, Python, and Node.js, in that order:

# Upload a file
curl -X POST "https://papaiapi.com/v1/files/chat" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -F "file=@document.pdf" \
  -F "prompt=Summarize this document"

# Or use a file URL
curl -X POST "https://papaiapi.com/v1/files/chat" \
  -H "Authorization: Bearer sk_live_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "file_url": "https://example.com/document.pdf",
    "prompt": "Summarize this document"
  }'

Python:

import requests

# Upload a file
with open('document.pdf', 'rb') as f:
    response = requests.post(
        'https://papaiapi.com/v1/files/chat',
        headers={'Authorization': 'Bearer sk_live_your_api_key_here'},
        files={'file': f},
        data={'prompt': 'Summarize this document'}
    )
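print(response.json()['choices'][0]['message']['content'])

Node.js:

import { readFile } from 'node:fs/promises';

// Node 18+: the built-in FormData takes a Blob, not a read stream
const form = new FormData();
form.append('file', new Blob([await readFile('document.pdf')]), 'document.pdf');
form.append('prompt', 'Summarize this document');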

const response = await fetch('https://papaiapi.com/v1/files/chat', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk_live_your_api_key_here',
  },
  body: form
});
const data = await response.json();
console.log(data.choices[0].message.content);

Gemini SDK Examples

The same Gemini-style request in Python, cURL, and Node.js, in that order:

import google.generativeai as genai

genai.configure(
    api_key="sk_live_your_api_key_here",
    transport="rest",
    client_options={"api_endpoint": "papaiapi.com"}
)

model = genai.GenerativeModel('gemini-flash')
response = model.generate_content("Hello!")
print(response.text)

cURL:

# Generate content
curl -X POST "https://papaiapi.com/v1beta/models/gemini-flash:generateContent?key=sk_live_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [{"text": "Hello!"}]
    }]
  }'

# Streaming
curl -X POST "https://papaiapi.com/v1beta/models/gemini-flash:streamGenerateContent?key=sk_live_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [{"text": "Write a poem"}]
    }]
  }'

Node.js:

import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI('sk_live_your_api_key_here');

// baseUrl is a per-model request option; the SDK appends /v1beta itself
const model = genAI.getGenerativeModel(
  { model: 'gemini-flash' },
  { baseUrl: 'https://papaiapi.com' }
);
const result = await model.generateContent('Hello!');
console.log(result.response.text());

Gemini Response Format

{
  "candidates": [{
    "content": {
      "parts": [{"text": "Hello! How can I help you?"}],
      "role": "model"
    },
    "finishReason": "STOP",
    "index": 0
  }],
  "usageMetadata": {
    "promptTokenCount": 0,
    "candidatesTokenCount": 0,
    "totalTokenCount": 0
  }
}

Gemini Streaming Response

data: {"candidates":[{"content":{"parts":[{"text":"Hello"}],"role":"model"},"index":0}]}

data: {"candidates":[{"content":{"parts":[{"text":"!"}],"role":"model"},"index":0}]}

data: {"candidates":[{"content":{"parts":[{"text":""}],"role":"model"},"finishReason":"STOP","index":0}]}
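Without an SDK, the Gemini-style stream above can be reassembled in a similar way. A minimal sketch, assuming one `data:` line per event as shown; the function name is hypothetical. Unlike the OpenAI-style stream there is no [DONE] sentinel in the sample, so the loop simply runs until the input ends and reports the last finishReason it saw.

```python
import json


def collect_gemini_stream(lines):
    """Return (text, finish_reason) assembled from Gemini-style SSE lines."""
    pieces, finish_reason = [], None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines
        chunk = json.loads(line[len("data:"):])
        for candidate in chunk.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                pieces.append(part.get("text", ""))
            finish_reason = candidate.get("finishReason", finish_reason)
    return "".join(pieces), finish_reason


sample = [
    'data: {"candidates":[{"content":{"parts":[{"text":"Hello"}],"role":"model"},"index":0}]}',
    '',
    'data: {"candidates":[{"content":{"parts":[{"text":"!"}],"role":"model"},"index":0}]}',
    '',
    'data: {"candidates":[{"content":{"parts":[{"text":""}],"role":"model"},"finishReason":"STOP","index":0}]}',
]
print(collect_gemini_stream(sample))
```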

Rate Limits

Limit | Value
Concurrent requests per user | 3 (configurable)
Request timeout | 3 minutes (chat), 5 minutes (images)
Max prompt length | ~95,000 characters
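To stay under the concurrency limit when sending a batch, a bounded thread pool is one simple option. This is a sketch with a stand-in request function; `run_limited` and `fake_request` are hypothetical names, and a real task would call the API instead of sleeping.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 3  # the default per-user limit from the table above


def run_limited(task, items, max_workers=MAX_CONCURRENT):
    """Run task(item) for each item with at most max_workers calls in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, items))


# Demo with a stand-in for a real API call that records peak concurrency.
state = {"active": 0, "peak": 0}
lock = threading.Lock()


def fake_request(prompt):
    with lock:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
    time.sleep(0.01)  # pretend to wait on the network
    with lock:
        state["active"] -= 1
    return prompt.upper()


results = run_limited(fake_request, ["a", "b", "c", "d", "e", "f"])
print(results, "peak concurrency:", state["peak"])
```

`pool.map` returns results in input order, so responses line up with their prompts even though the calls overlap.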

Error Codes

HTTP Code | OpenAI Error | Gemini Error | Description
401 | invalid_api_key | UNAUTHENTICATED | Invalid or missing API key
402 | insufficient_balance | FAILED_PRECONDITION | Insufficient balance - top up required
429 | rate_limit_exceeded | RESOURCE_EXHAUSTED | Too many concurrent requests
500 | internal_error | INTERNAL | Server error - try again
504 | timeout | DEADLINE_EXCEEDED | Request timeout

OpenAI Error Format

{
  "error": {
    "message": "Insufficient balance",
    "type": "insufficient_balance",
    "code": 402
  }
}

Gemini Error Format

{
  "error": {
    "code": 402,
    "message": "Insufficient balance",
    "status": "FAILED_PRECONDITION"
  }
}
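Client code that talks to both API styles can normalise the two error formats into one shape. The sketch below is illustrative: the helper names are hypothetical, and treating 429, 500, and 504 as worth retrying is an assumption drawn from the descriptions in the error table, not a documented policy.

```python
RETRY_LATER = {429, 500, 504}  # assumption: transient per the error table


def normalize_error(payload):
    """Map either error format to (http_code, label, message)."""
    error = payload["error"]
    # The OpenAI format carries the label in "type", the Gemini format in "status".
    label = error.get("type") or error.get("status")
    return error["code"], label, error["message"]


def should_retry(payload):
    """Return True when the error looks transient."""
    code, _label, _message = normalize_error(payload)
    return code in RETRY_LATER


openai_error = {"error": {"message": "Insufficient balance",
                          "type": "insufficient_balance", "code": 402}}
gemini_error = {"error": {"code": 402, "message": "Insufficient balance",
                          "status": "FAILED_PRECONDITION"}}
print(normalize_error(openai_error))
print(normalize_error(gemini_error), should_retry(gemini_error))
```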
Important Notes

Timeouts: Chat requests may take up to 3 minutes. Use stream: true for better reliability.
Images: Generation may take up to 5 minutes. URLs expire after 1 hour - download promptly.
Voice: Generation may take up to 5 minutes (beta: ~40% success rate). WAV URLs expire after 1 hour.
Retries: Failed requests are automatically retried up to 3 times before returning an error.

Use papaiapi with Google's gemini-cli

The official Gemini CLI is Google's open-source agent (file edits, shell, browser, etc.). Point it at papaiapi instead of Google's endpoint and your papaiapi balance pays for everything: no Google Cloud account, no project setup, no quota of your own to manage.

Setup takes about 2 minutes.

Step 1: Get a papaiapi API key

Open Telegram and message @papaiapibot:

  1. Send /start
  2. Tap API Keys → Generate Key
  3. Copy the key (starts with sk_live_…)
  4. Tap Top up balance and add at least a few dollars (each request costs $0.0005, so $5 buys ~10,000 calls)

Step 2: Install gemini-cli

Requires Node.js 20+ (check with node -v).

npm install -g @google/gemini-cli

# verify
gemini --version

If npm reports permission errors, prefix the command with sudo or use a Node version manager (nvm, fnm, volta).

Step 3: Point gemini-cli at papaiapi

Just two environment variables. Open a terminal and:

export GOOGLE_GEMINI_BASE_URL="https://papaiapi.com/v1direct"
export GEMINI_API_KEY="sk_live_your_papaiapi_key_here"

To make it permanent, add those two lines to your shell profile (~/.zshrc on macOS, ~/.bashrc on Linux). Then run source ~/.zshrc (or source ~/.bashrc) to load them into the current shell.

How it works

GOOGLE_GEMINI_BASE_URL tells the CLI to send all requests to papaiapi instead of generativelanguage.googleapis.com. GEMINI_API_KEY is your papaiapi key, which we use to authenticate and charge your balance. We then forward each request through a rotated pool of 468 free Google AI Studio keys + 47 US residential proxies, transparently retrying on quota errors.

Step 4: Pick a model that won't 429

Our pool runs on free-tier keys, so Google's Pro models hit quota almost immediately. Stick to Flash models: they're fast, cheap, and have generous free quota.

Recommended models, in order:

Model | When to use | Status
gemini-2.5-flash | Default pick: battle-tested, fastest, GA | ✅ Stable
gemini-3-flash-preview | When you want the newest model | ⚠️ Preview (may change)
gemini-flash-latest | Set-and-forget: auto-tracks the latest stable Flash | ✅ Alias

Avoid: any *-pro* model and the "Auto" picker. Both escalate to Pro and return 429.

Step 5: Run it

One-shot prompt:

gemini -m gemini-2.5-flash -p "What is 17 times 23?"
# → 391

Interactive REPL:

gemini -m gemini-2.5-flash

Full agent mode (auto-approve all tool calls, file edits, shell commands):

cd ~/your-project
gemini -m gemini-2.5-flash --yolo -p "Add a README explaining what this repo does"

Streaming a long task (the CLI streams by default; just use -i to start a session):

gemini -m gemini-2.5-flash -i "Refactor all .ts files to use async/await"

Step 6: Switching models

You have three ways to choose / change the model:

A. Per command (one-off):

gemini -m gemini-3-flash-preview -p "Hello"

B. Inside an interactive session, via the model picker:

  1. Start gemini
  2. Press Esc to enter command mode
  3. Type /model and press Enter
  4. Pick option 3. Manual
  5. Type gemini-2.5-flash (or gemini-3-flash-preview)
  6. Press Tab to toggle "Remember model for future sessions" to true
  7. Press Enter

Once you confirm with Enter, the CLI saves your choice to ~/.gemini/settings.json and you won't see the picker again.

C. Set a default in your shell profile:

# add to ~/.zshrc or ~/.bashrc
alias gemini='gemini -m gemini-2.5-flash'

Step 7: Verify everything works

Quick sanity check:

gemini -m gemini-2.5-flash -p "Reply with the word OK only."
# expect: OK

If you see OK, you're done. ✅

Troubleshooting

What you see | What it means | Fix
401 invalid api key | Your GEMINI_API_KEY isn't a papaiapi key, or it's revoked | Re-generate via @papaiapibot
402 Insufficient balance | Your papaiapi balance is below $0.0005 | Top up via the Telegram bot
429 quota exhausted | You picked a Pro model; free-tier keys can't fund it | Switch to gemini-2.5-flash or gemini-3-flash-preview
404 model not found | You typed a model name Google doesn't recognize (e.g. gemini-flash) | Use the full ID, like gemini-2.5-flash
429 Rate limit exceeded. Maximum N concurrent | This is papaiapi's per-account thread limit; your client is firing too many parallel requests | Reduce parallelism, or DM us to bump your limit
CLI prompts for Google login / OAuth flow | The CLI is in OAuth mode; we only support API-key auth | Make sure GEMINI_API_KEY is set; in the auth picker, choose "Use Gemini API key"
Slow/hanging responses | One of our proxies is flaky; we'll retry up to 3× automatically | Wait. Most resolve in under 30s; the 1% tail can take about a minute

FAQ

Does this work with the agentic / yolo mode?

Yes. Function calling, multi-turn tool use, file edits, and shell commands all work. We forward Google's API verbatim.

What about file uploads?

Inline base64 (in the request body) works. Multipart upload to /upload/v1beta/files isn't supported yet. The workaround is to base64-encode the file and pass it as inlineData.
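The inlineData workaround can be sketched as a request-body builder. The helper name and the sample bytes are hypothetical; the `inlineData` / `mimeType` / `data` field names follow the Gemini REST request format.

```python
import base64
import json


def inline_data_body(prompt, file_bytes, mime_type):
    """Build a generateContent body that carries a file as inline base64."""
    encoded = base64.b64encode(file_bytes).decode("ascii")
    return {
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inlineData": {"mimeType": mime_type, "data": encoded}},
            ]
        }]
    }


# Placeholder bytes stand in for a real file read with open(path, "rb").
body = inline_data_body("Describe this file.", b"%PDF-1.4 example bytes", "application/pdf")
print(json.dumps(body)[:80])
```

The resulting dictionary is what you would send as the JSON body of a :generateContent call through /v1direct. Base64 inflates the payload by about a third, so large files produce correspondingly large requests.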

Does my conversation history leak to other users?

No. Each request is stateless from our side; we only forward it. The CLI itself maintains your conversation context locally.

How is this different from /v1/chat/completions and /v1beta/...?

Those endpoints use browser automation behind the scenes: slower (3-30s), but they work without API keys. /v1direct uses the real Gemini API with rotated free keys: much faster (1-5s typical) and with support for every Gemini feature, but it only works because we have a key pool. For agentic work, /v1direct is the right choice.

Can I use any other gemini-compatible tool?

Yes. Anything that accepts a custom base URL and uses x-goog-api-key auth works. Examples that have been tested: @google/genai (Node), google-genai (Python), gemini-cli, custom apps using the REST API.