Gemini API
Use the Google Gemini API via REST for text generation, multimodal analysis, image generation, and more.
Prerequisites
- Environment variable GOOGLE_API_KEY must be set
- API endpoint: https://generativelanguage.googleapis.com/v1beta
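A small guard can catch a missing key before any request is sent. The following is only a sketch: the function name `require_api_key` and the variable `GEMINI_BASE_URL` are hypothetical conveniences, not part of the API.

```bash
# Hypothetical helper: fail early when GOOGLE_API_KEY is missing or empty.
require_api_key() {
  if [ -z "${GOOGLE_API_KEY:-}" ]; then
    echo "GOOGLE_API_KEY is not set" >&2
    return 1
  fi
}

# Base URL from the prerequisites above (variable name is an assumption).
GEMINI_BASE_URL="https://generativelanguage.googleapis.com/v1beta"

# Example use before any curl call:
#   require_api_key || exit 1
```

Running the guard once at the top of a script keeps a missing key from surfacing later as an API error.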
Available Models
| Model | Use Case |
| --- | --- |
| gemini-2.5-flash | Fast text generation (default) |
| gemini-2.5-pro | High quality text generation |
| gemini-3-flash-preview | Latest flash model |
| gemini-3-pro-preview | Latest pro model |
| gemini-2.5-flash-image | Image generation (Nano Banana) |
| gemini-3-pro-image-preview | Advanced image generation with thinking & search |
Workflow
Phase 1: Determine Task Type
Based on the user's request, identify which capability to use:
- Text Generation: Basic prompts, chat, Q&A
- Multimodal Analysis: Analyze images, videos, or audio
- Image Generation: Create or edit images (Nano Banana)
- Function Calling: Execute custom functions
- Search Grounding: Real-time web search integration
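The task-to-model decision can be sketched as a small lookup. The helper names `default_model_for` and `generate_url` and the task-type keywords are hypothetical; the model choices simply mirror the defaults used by the curl examples in this document.

```bash
# Hypothetical helper: map a task type to the default model used in this guide.
default_model_for() {
  case "$1" in
    text|multimodal|function-calling|search) echo "gemini-2.5-flash" ;;
    image) echo "gemini-2.5-flash-image" ;;
    *) echo "unknown task type: $1" >&2; return 1 ;;
  esac
}

# Build the generateContent URL for a model (the caller appends ?key=...).
generate_url() {
  echo "https://generativelanguage.googleapis.com/v1beta/models/${1}:generateContent"
}

# Example:
#   url="$(generate_url "$(default_model_for image)")?key=$GOOGLE_API_KEY"
```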
Phase 2: Execute API Call
Use the appropriate curl command based on task type.
- Text Generation
Basic Prompt
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [{"text": "Your prompt here"}]
}]
}'
With Configuration
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [{"text": "Your prompt here"}]
}],
"generationConfig": {
"temperature": 0.9,
"maxOutputTokens": 2000,
"stopSequences": ["END"]
}
}'
Multi-turn Chat
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [
{"role": "user", "parts": [{"text": "First message"}]},
{"role": "model", "parts": [{"text": "Model response"}]},
{"role": "user", "parts": [{"text": "Follow-up question"}]}
]
}'
System Instructions
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"system_instruction": {
"parts": [{"text": "You are a helpful assistant that speaks like a pirate."}]
},
"contents": [{
"parts": [{"text": "Hello!"}]
}]
}'
JSON Mode Output
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [{"text": "List 3 colors as JSON array"}]
}],
"generationConfig": {
"response_mime_type": "application/json"
}
}'
Streaming Response
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse&key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [{"text": "Write a long story"}]
}]
}'
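With `alt=sse`, the response arrives as server-sent events, where each chunk is a line of the form `data: {json}` followed by a blank line. A minimal sketch for stripping that framing is shown here; the helper name `sse_payloads` is hypothetical.

```bash
# Hypothetical helper: keep only the JSON payload of each SSE "data:" line.
# Carriage returns are removed first in case the stream uses CRLF line endings.
sse_payloads() {
  tr -d '\r' | sed -n 's/^data: //p'
}

# Example (curl's -N flag disables output buffering so chunks appear as they arrive):
#   curl -sN "...:streamGenerateContent?alt=sse&key=$GOOGLE_API_KEY" \
#     -H 'Content-Type: application/json' -d @request.json | sse_payloads
```

Each output line is then a standalone JSON object with the same structure as a non-streaming response.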
Safety Settings
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [{"text": "Your prompt"}]
}],
"safetySettings": [
{"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_ONLY_HIGH"},
{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"}
]
}'
- Multimodal Analysis
Image Analysis (Base64 Inline)
First, encode the image to base64 (the -w0 flag, which disables line wrapping, is specific to GNU base64):
BASE64_IMAGE=$(base64 -w0 image.jpg)
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [
{"text": "Describe this image in detail"},
{"inline_data": {"mime_type": "image/jpeg", "data": "'$BASE64_IMAGE'"}}
]
}]
}'
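Inlining a large image on the command line can exceed the shell's argument-length limit, and `base64` flags differ between GNU and BSD/macOS. The sketch below works around both by encoding from stdin and writing the request body to a file. The helper names are hypothetical, and the prompt is assumed to contain no double quotes or backslashes because no JSON escaping is performed.

```bash
# Encode a file as single-line base64 on both GNU and BSD/macOS systems.
# Reading from stdin avoids the flag differences between implementations.
encode_base64() {
  base64 < "$1" | tr -d '\n'
}

# Hypothetical helper: write the request body to a file.
# Arguments: prompt, MIME type, image file, output payload file.
write_image_payload() {
  local prompt=$1 mime_type=$2 image_file=$3 payload_file=$4
  {
    printf '{"contents":[{"parts":[{"text":"%s"},' "$prompt"
    printf '{"inline_data":{"mime_type":"%s","data":"' "$mime_type"
    encode_base64 "$image_file"
    printf '"}}]}]}'
  } > "$payload_file"
}

# Example:
#   write_image_payload "Describe this image in detail" image/jpeg image.jpg payload.json
#   curl "...:generateContent?key=$GOOGLE_API_KEY" \
#     -H 'Content-Type: application/json' -d @payload.json
```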
Video Analysis (File API)
Step 1: Upload Video
Get upload URL
UPLOAD_URL=$(curl -s "https://generativelanguage.googleapis.com/upload/v1beta/files?key=$GOOGLE_API_KEY" \
-H "X-Goog-Upload-Protocol: resumable" \
-H "X-Goog-Upload-Command: start" \
-H "X-Goog-Upload-Header-Content-Length: $(wc -c < video.mp4 | tr -d ' ')" \
-H "X-Goog-Upload-Header-Content-Type: video/mp4" \
-H "Content-Type: application/json" \
-d '{"file": {"display_name": "video.mp4"}}' \
-D - | grep -i "x-goog-upload-url" | cut -d' ' -f2 | tr -d '\r')
Upload file
curl "$UPLOAD_URL" \
-H "X-Goog-Upload-Offset: 0" \
-H "X-Goog-Upload-Command: upload, finalize" \
-H "Content-Type: video/mp4" \
--data-binary @video.mp4
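The finalize request returns JSON describing the uploaded file, and its URI is what the query step expects as `file_uri`. A minimal sketch for pulling it out, assuming `jq` is installed and the response has the shape `{"file": {"uri": ...}}`; the helper name `file_uri_from_upload` is hypothetical.

```bash
# Hypothetical helper: read the upload response on stdin and print file.uri.
# Assumes jq is available and the response looks like {"file": {"uri": "..."}}.
file_uri_from_upload() {
  jq -r '.file.uri'
}

# Example: capture the URI directly from the upload call.
#   FILE_URI=$(curl -s "$UPLOAD_URL" \
#     -H "X-Goog-Upload-Offset: 0" \
#     -H "X-Goog-Upload-Command: upload, finalize" \
#     -H "Content-Type: video/mp4" \
#     --data-binary @video.mp4 | file_uri_from_upload)
```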
Step 2: Query with Video
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [
{"text": "Describe what happens in this video"},
{"file_data": {"mime_type": "video/mp4", "file_uri": "FILE_URI_FROM_UPLOAD"}}
]
}]
}'
Audio Analysis
As with video, upload the file via the File API, then query:
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [
{"text": "Transcribe and summarize this audio"},
{"file_data": {"mime_type": "audio/mp3", "file_uri": "FILE_URI_FROM_UPLOAD"}}
]
}]
}'
- Image Generation (Nano Banana)
Basic Image Generation
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"parts": [{"text": "Create a photorealistic image of a cat wearing a hat"}]}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"]
}
}'
With Aspect Ratio Control
Supported ratios: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"parts": [{"text": "Create a landscape scene"}]}],
"generationConfig": {
"responseModalities": ["IMAGE"],
"imageConfig": {
"aspectRatio": "16:9"
}
}
}'
Image Editing (Character Consistency)
BASE64_IMAGE=$(base64 -w0 original.jpg)
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [
{"text": "Put this character in a tropical forest"},
{"inline_data": {"mime_type": "image/jpeg", "data": "'$BASE64_IMAGE'"}}
]
}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"]
}
}'
High Resolution (Pro Model - 2K/4K)
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"parts": [{"text": "A photo of an oak tree in all four seasons"}]}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"],
"imageConfig": {
"aspectRatio": "1:1",
"imageSize": "4K"
}
}
}'
Image Generation with Search Grounding (Pro)
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"parts": [{"text": "Visualize the current weather forecast for Tokyo as a chart"}]}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"],
"imageConfig": {
"aspectRatio": "16:9"
}
},
"tools": [{"google_search": {}}]
}'
Multi-Image Fusion
BASE64_IMG1=$(base64 -w0 image1.jpg)
BASE64_IMG2=$(base64 -w0 image2.jpg)
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [
{"text": "Combine these two characters in a fantasy world"},
{"inline_data": {"mime_type": "image/jpeg", "data": "'$BASE64_IMG1'"}},
{"inline_data": {"mime_type": "image/jpeg", "data": "'$BASE64_IMG2'"}}
]
}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"]
}
}'
- Function Calling
Define and Call Functions
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"role": "user",
"parts": [{"text": "What movies are playing in Mountain View?"}]
}],
"tools": [{
"function_declarations": [{
"name": "find_movies",
"description": "Find movies playing in theaters",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string", "description": "City and state"},
"genre": {"type": "string", "description": "Movie genre"}
},
"required": ["location"]
}
}]
}]
}'
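When the model decides to call a function, its reply carries a `functionCall` part with the function name and arguments instead of plain text. The sketch below extracts both so the caller can run the real function. It assumes `jq` is installed; the helper names are hypothetical.

```bash
# Hypothetical helpers: read a generateContent response on stdin.
# Both look for the first part that contains a functionCall.
function_call_name() {
  jq -r 'first(.candidates[0].content.parts[] | select(.functionCall) | .functionCall.name)'
}

# Prints the arguments as compact JSON, ready to pass to the real function.
function_call_args() {
  jq -c 'first(.candidates[0].content.parts[] | select(.functionCall) | .functionCall.args)'
}

# Example:
#   name=$(function_call_name < response.json)
#   args=$(function_call_args < response.json)
```

If the model answers with text instead of a function call, both helpers print nothing.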
Provide Function Response
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [
{"role": "user", "parts": [{"text": "What movies are playing in Mountain View?"}]},
{"role": "model", "parts": [{"functionCall": {"name": "find_movies", "args": {"location": "Mountain View, CA"}}}]},
{"role": "user", "parts": [{"functionResponse": {"name": "find_movies", "response": {"movies": ["Barbie", "Oppenheimer"]}}}]}
],
"tools": [{
"function_declarations": [{
"name": "find_movies",
"description": "Find movies playing in theaters",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string"},
"genre": {"type": "string"}
},
"required": ["location"]
}
}]
}]
}'
- Search Grounding
Real-time web search integration:
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"parts": [{"text": "What is the current Google stock price?"}]}],
"tools": [{"google_search": {}}]
}'
The response includes groundingMetadata with the sources used.
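To list the cited sources, the grounding metadata can be filtered with `jq`. This sketch assumes the sources appear under `groundingMetadata.groundingChunks[].web.uri`; treat that field layout as an assumption to verify against an actual response. The helper name `grounding_source_uris` is hypothetical.

```bash
# Hypothetical helper: read a response on stdin and print one source URI per line.
# Assumes sources live at candidates[0].groundingMetadata.groundingChunks[].web.uri.
# Prints nothing when the response carries no grounding metadata.
grounding_source_uris() {
  jq -r '.candidates[0].groundingMetadata.groundingChunks[]?.web.uri | select(. != null)'
}

# Example:
#   grounding_source_uris < response.json
```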
- Context Caching
For repeated queries on the same large content:
Create Cache
curl "https://generativelanguage.googleapis.com/v1beta/cachedContents?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"model": "models/gemini-2.5-flash",
"contents": [{"parts": [{"text": "LARGE_DOCUMENT_TEXT_HERE"}]}],
"ttl": "3600s"
}'
Use Cache
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"cachedContent": "cachedContents/CACHE_ID",
"contents": [{"parts": [{"text": "Summarize the document"}]}]
}'
- Model Information
List All Models
curl "https://generativelanguage.googleapis.com/v1beta/models?key=$GOOGLE_API_KEY"
Get Specific Model
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash?key=$GOOGLE_API_KEY"
Response Handling
Text Response Structure
{
  "candidates": [{
    "content": {
      "parts": [{"text": "Response text here"}],
      "role": "model"
    },
    "finishReason": "STOP"
  }],
  "usageMetadata": {
    "promptTokenCount": 10,
    "candidatesTokenCount": 50,
    "totalTokenCount": 60
  }
}
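Given that structure, the generated text and token usage can be pulled out with `jq`. This is a minimal sketch that assumes `jq` is installed; the helper names are hypothetical.

```bash
# Hypothetical helper: read a response on stdin and print the text of the
# first candidate, concatenating all text parts.
response_text() {
  jq -r '[.candidates[0].content.parts[].text // empty] | join("")'
}

# Hypothetical helper: print the total token count reported by the API.
response_total_tokens() {
  jq -r '.usageMetadata.totalTokenCount'
}

# Example:
#   curl -s "...:generateContent?key=$GOOGLE_API_KEY" \
#     -H 'Content-Type: application/json' -d @request.json > response.json
#   response_text < response.json
```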
Image Response Structure
When using image generation, the response includes base64-encoded images:
{
  "candidates": [{
    "content": {
      "parts": [
        {"text": "Here is your image:"},
        {"inlineData": {"mimeType": "image/png", "data": "BASE64_IMAGE_DATA"}}
      ]
    }
  }]
}
To save the image:
Extract and decode image from response
echo "BASE64_DATA" | base64 -d > output.png
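Extraction and decoding can be combined into one step. The sketch below assumes `jq` is installed and that the image sits in an `inlineData` part as in the structure above; the helper name `save_first_image` is hypothetical.

```bash
# Hypothetical helper: read a response on stdin, find the first part that
# carries inlineData, and write the decoded bytes to the given file.
save_first_image() {
  jq -r 'first(.candidates[0].content.parts[] | select(.inlineData) | .inlineData.data)' \
    | base64 -d > "$1"
}

# Example:
#   save_first_image output.png < response.json
```

Text parts are skipped, so the accompanying caption does not end up in the image file.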
Error Handling
| Error | Cause | Solution |
| --- | --- | --- |
| 400 | Invalid request | Check JSON syntax |
| 401 | Invalid API key | Verify GOOGLE_API_KEY |
| 429 | Rate limit | Wait and retry |
| 500 | Server error | Retry with exponential backoff |
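The retry advice for 429 and 500 can be sketched as a generic wrapper that doubles the wait after each failed attempt. The function names are hypothetical, and the wrapped command is assumed to print the HTTP status code on stdout (for example via curl's `-w '%{http_code}'`).

```bash
# Returns success for the status codes the table marks as retryable.
is_retryable_status() {
  case "$1" in
    429|500) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND...
# COMMAND must print an HTTP status code. The final status is printed; the
# return code is 1 if the last attempt still produced a retryable status.
retry_with_backoff() {
  local max_attempts=$1 delay=$2 attempt=1 status
  shift 2
  while :; do
    status=$("$@")
    if ! is_retryable_status "$status"; then
      echo "$status"
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "$status"
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))   # exponential backoff: 1s, 2s, 4s, ...
    attempt=$((attempt + 1))
  done
}

# Example: the body goes to response.json, the status code to stdout.
#   call_gemini() {
#     curl -s -o response.json -w '%{http_code}' \
#       "...:generateContent?key=$GOOGLE_API_KEY" \
#       -H 'Content-Type: application/json' -d @request.json
#   }
#   retry_with_backoff 5 1 call_gemini
```

Non-retryable errors such as 400 are returned immediately, since repeating an invalid request will not help.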
Best Practices
- Use appropriate model: Flash for speed, Pro for quality
- Set temperature: Lower (0.1-0.3) for factual, higher (0.7-1.0) for creative
- Limit output tokens: Set maxOutputTokens to avoid excessive responses
- Use caching: For repeated queries on large documents
- Handle streaming: For long responses, use streamGenerateContent
- Image generation tips: Use detailed, descriptive prompts for best results