Gemini Guide

A development assistant for the Google Gemini API, covering Gemini Pro/Flash, multimodal input, function calling, and context caching.

Install skill "Gemini Guide" with this command: npx skills add zhangifonly/gemini-guide

Gemini API - Google AI Model Integration Guide

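Introduction

Gemini is Google's multimodal large language model, available as an API through AI Studio or Vertex AI. Its core strengths are a very long context window (up to 2 million tokens) and native multimodality (text, images, video, audio).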

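Model Matrix

| Model | Context window | Characteristics | Use cases |
|---|---|---|---|
| gemini-2.5-pro | 1M tokens | Strongest reasoning, chain-of-thought | Complex analysis, code generation |
| gemini-2.0-flash | 1M tokens | Fast, good value for money | Everyday chat, batch processing |
| gemini-2.0-flash-lite | 1M tokens | Fastest and cheapest | Simple tasks, high concurrency |
| gemini-1.5-pro | 2M tokens | Very long context | Long-document analysis, codebase understanding |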
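The matrix can also be read as a simple decision rule. The sketch below is a hypothetical helper (`pick_model` is not part of any SDK) that hard-codes the context windows from the table and applies one possible ordering of the "use case" column; the flags `needs_reasoning` and `simple_task` are assumptions made for illustration.

```python
# Context windows in tokens, copied from the model matrix above.
CONTEXT_WINDOW = {
    "gemini-2.5-pro": 1_000_000,
    "gemini-2.0-flash": 1_000_000,
    "gemini-2.0-flash-lite": 1_000_000,
    "gemini-1.5-pro": 2_000_000,
}


def pick_model(prompt_tokens, needs_reasoning=False, simple_task=False):
    """Return a model name from the matrix that fits the prompt size."""
    if prompt_tokens > CONTEXT_WINDOW["gemini-1.5-pro"]:
        raise ValueError("prompt exceeds the largest context window in the table")
    # Only the 2M-token model can hold prompts above 1M tokens.
    if prompt_tokens > CONTEXT_WINDOW["gemini-2.0-flash"]:
        return "gemini-1.5-pro"
    if needs_reasoning:
        return "gemini-2.5-pro"
    if simple_task:
        return "gemini-2.0-flash-lite"
    # Default choice for everyday work.
    return "gemini-2.0-flash"
```

A call such as `pick_model(1_500_000)` falls through to the long-context model, while small prompts default to `gemini-2.0-flash`.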

SDK Installation and Basic Usage

pip install google-genai   # official SDK

from google import genai
# Create a client with an API key from AI Studio
client = genai.Client(api_key="YOUR_API_KEY")
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Implement a quicksort algorithm in Python"
)
print(response.text)
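Hard-coding the key as in the snippet above is fine for a quick test, but a key in source code is easy to leak. A minimal sketch of reading it from the environment instead; the helper name `load_api_key` and the variable name `GEMINI_API_KEY` are assumptions chosen for this example.

```python
import os


def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before creating the Gemini client")
    return key


# Usage (requires the google-genai package):
#   from google import genai
#   client = genai.Client(api_key=load_api_key())
```

Failing at startup with a clear message is usually easier to debug than an authentication error returned by the first request.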

Multimodal Capabilities

from google.genai import types
import pathlib
# Image understanding: send the image bytes inline
image = types.Part.from_bytes(data=pathlib.Path("photo.jpg").read_bytes(), mime_type="image/jpeg")
response = client.models.generate_content(model="gemini-2.0-flash", contents=["Describe the image", image])
# Video understanding: upload the file directly
# (an uploaded video may need a moment to finish processing before it can be used)
video_file = client.files.upload(file="video.mp4")
response = client.models.generate_content(model="gemini-2.0-flash", contents=["Summarize the video", video_file])
# Audio understanding
audio_file = client.files.upload(file="audio.mp3")
response = client.models.generate_content(model="gemini-2.0-flash", contents=["Transcribe and translate", audio_file])
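The examples above send the image inline and upload the video and audio files. A small sketch of how that choice could be automated, using only the standard library; `plan_media` is a hypothetical helper, and the 20 MB inline cut-off is an assumption that should be checked against the current Gemini documentation.

```python
import mimetypes

# Assumed cut-off for sending bytes inline; verify against the Gemini docs.
INLINE_LIMIT_BYTES = 20 * 1024 * 1024


def plan_media(path: str, size_bytes: int) -> dict:
    """Guess the MIME type from the file name and pick inline vs. upload."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        raise ValueError(f"cannot infer MIME type for {path!r}")
    kind = mime_type.split("/")[0]  # "image", "video", "audio", ...
    # Mirror the examples above: small images go inline, everything else is uploaded.
    inline = kind == "image" and size_bytes <= INLINE_LIMIT_BYTES
    return {"mime_type": mime_type, "method": "inline" if inline else "upload"}
```

The returned `mime_type` can be passed to `types.Part.from_bytes`, and an `"upload"` result points to `client.files.upload`, as used in the snippets above.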

Function Calling and JSON Mode

# Function calling: declare a function the model may ask you to run
get_weather = types.FunctionDeclaration(
    name="get_weather", description="Get the weather for a city",
    parameters=types.Schema(type="OBJECT",
        properties={"city": types.Schema(type="STRING", description="City name")},
        required=["city"]))
tool = types.Tool(function_declarations=[get_weather])
response = client.models.generate_content(
    model="gemini-2.0-flash", contents="What's the weather in Beijing?",
    config=types.GenerateContentConfig(tools=[tool]))
# JSON mode: ask for a JSON response instead of free text
response = client.models.generate_content(
    model="gemini-2.0-flash", contents="List 3 programming languages",
    config=types.GenerateContentConfig(response_mime_type="application/json"))
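Declaring a function only tells the model what it may request; running it is the caller's job. Below is a minimal sketch of the local side, assuming the function name and arguments have already been extracted from the model's function-call response (check the SDK reference for the exact attribute names). The local `get_weather` body and its placeholder return value are hypothetical.

```python
def get_weather(city: str) -> dict:
    """Hypothetical local implementation; returns placeholder data."""
    return {"city": city, "forecast": "placeholder"}


# Map declared function names to local Python callables.
REGISTRY = {"get_weather": get_weather}


def dispatch(name: str, args: dict) -> dict:
    """Run the local function the model asked for, with its arguments."""
    if name not in REGISTRY:
        return {"error": f"unknown function: {name}"}
    try:
        return {"result": REGISTRY[name](**args)}
    except TypeError as exc:  # missing or unexpected arguments
        return {"error": str(exc)}
```

Returning errors as data rather than raising lets the caller send the failure back to the model, which can then correct its arguments or explain the problem to the user.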

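Context Caching

Caching can substantially reduce cost when the same large document is queried repeatedly:

# large_document is a placeholder for the content to cache (e.g. an uploaded file or long text)
# Note: the cached contents are passed inside the config object
cache = client.caches.create(model="gemini-2.0-flash",
    config=types.CreateCachedContentConfig(contents=[large_document], display_name="my-cache", ttl="3600s"))
response = client.models.generate_content(model="gemini-2.0-flash", contents="What does chapter 3 cover?",
    config=types.GenerateContentConfig(cached_content=cache.name))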
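To see when caching pays off, the arithmetic can be sketched as follows. This is a rough model, not official billing logic: it assumes the $0.10 per million input tokens listed for Gemini 2.0 Flash in this guide, a 75% discount on cached tokens (the figure quoted under best practices), full-price billing for creating the cache, and it ignores any cache storage fee. The function name `input_cost_usd` is hypothetical.

```python
def input_cost_usd(doc_tokens, question_tokens, n_queries,
                   price_per_million=0.10, cache_discount=0.75, use_cache=True):
    """Estimate input-token cost for n_queries over the same document."""
    per_token = price_per_million / 1_000_000
    # The question itself is always billed at the normal rate.
    questions = n_queries * question_tokens * per_token
    if not use_cache:
        # Without caching, the whole document is re-sent on every query.
        return n_queries * doc_tokens * per_token + questions
    create = doc_tokens * per_token  # assumed: one full-price pass to build the cache
    reads = n_queries * doc_tokens * per_token * (1 - cache_discount)
    return create + reads + questions
```

Under these assumptions, 100 queries against a 1M-token document cost about $10.00 in input tokens without caching and about $2.60 with it; with only one query, the cached path is more expensive because of the creation pass.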

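Pricing Comparison (USD per million tokens)

| Model | Input price | Output price |
|---|---|---|
| Gemini 2.0 Flash | $0.10 | $0.40 |
| Gemini 2.5 Pro | $1.25 | $10.00 |
| Claude Sonnet 4 | $3.00 | $15.00 |
| GPT-4o | $2.50 | $10.00 |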
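The table translates directly into a per-request estimate. The sketch below copies the prices as listed here (they may be out of date) and uses hypothetical dictionary keys that are labels for this example, not necessarily the providers' API model IDs.

```python
# USD per million tokens as (input, output), copied from the table above.
PRICES = {
    "gemini-2.0-flash": (0.10, 0.40),
    "gemini-2.5-pro": (1.25, 10.00),
    "claude-sonnet-4": (3.00, 15.00),
    "gpt-4o": (2.50, 10.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000
```

For example, one million input tokens plus one million output tokens on `gemini-2.0-flash` comes to roughly $0.50 at these rates.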

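Differences from the OpenAI and Claude APIs

| Feature | Gemini API | OpenAI API | Claude API |
|---|---|---|---|
| Maximum context | 2M tokens | 128K tokens | 200K tokens |
| Native multimodality | Text/image/video/audio | Text/image/audio | Text/image |
| Free tier | Yes (AI Studio) | | |
| Context caching | Native support | Prompt Caching | Prompt Caching |
| SDK style | Own SDK + OpenAI-compatible | Own SDK | Own SDK |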

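Best Practices

  • Default to gemini-2.0-flash, which offers the best value for money
  • Use context caching for long documents to cut costs by 75% or more
  • Video and audio understanding are distinctive Gemini strengths
  • API Key: https://aistudio.google.com/apikey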
