image-understanding

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "image-understanding" with this command: `npx skills add IsabellaZhangYM/image-understanding`

---
name: glm-4.6v-connector
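description: "Professional integration plugin for Zhipu's GLM-4.6V multimodal vision model. Supports image understanding, 128K-context document parsing, and automated tool calling."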
version: "1.0.0"
homepage: "https://github.com/zai-org/GLM-V"
repository: "https://github.com/zai-org/GLM-V.git"
authors: ["IsabellaZhangYM"]
license: "MIT"

# 🛠️ Key fix: declare the metadata the Registry requires
requirements:
  environment_variables:
    - ZHIPUAI_API_KEY
  dependencies:
    python:
      - "zhipuai>=2.1.0"
  install_command: "pip install zhipuai"

credentials:
  ZHIPUAI_API_KEY:
    description: "API key for the Zhipu AI open platform (bigmodel.cn)"
    required: true
    source: "environment_variable"
---

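# 👁️ GLM-4.6V Vision Model Integration Guide

This Skill gives developers a secure, efficient way to connect to Zhipu's multimodal models. It is suited to automated document processing, UI replication, and intelligent visual understanding.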

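## 🛡️ Security and compliance guidelines

1. **Credential security**: This plugin requires credentials to be supplied through the `ZHIPUAI_API_KEY` environment variable. Never hardcode a key in source code.
2. **Privacy protection**: Before uploading corporate financial reports, identity documents, or sensitive screenshots, mask the sensitive regions or otherwise de-identify the data.
3. **Call auditing**: Enable logging when you initialize the `client` so that tool calls (Function Call) made by the model can be traced.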
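Guidelines 1 and 3 can be sketched as a small startup helper. This is a minimal sketch rather than part of the SDK: `mask_key`, `load_api_key`, `log_tool_call`, and the logger name `glm_connector` are all hypothetical names, and the only thing taken from this guide is the `ZHIPUAI_API_KEY` variable.

```python
import logging
import os

logger = logging.getLogger("glm_connector")  # hypothetical logger name


def mask_key(key):
    """Return a log-safe form of the key that reveals at most its last 4 characters."""
    return "****" + key[-4:] if len(key) > 8 else "****"


def load_api_key(env=None):
    """Read ZHIPUAI_API_KEY from the environment and fail fast if it is missing."""
    env = os.environ if env is None else env
    key = env.get("ZHIPUAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ZHIPUAI_API_KEY is not set")
    # Only the masked form is logged, so the secret never reaches the log output.
    logger.info("Using API key %s", mask_key(key))
    return key


def log_tool_call(name, arguments):
    """Record which tool the model asked for; argument values are not logged."""
    logger.info("tool call: %s (argument names: %s)", name, sorted(arguments))
```

A caller would typically run `logging.basicConfig(level=logging.INFO)` once at startup and then pass `load_api_key()` to the client constructor. Logging argument names but not values keeps the audit trail useful without copying document contents into logs.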

---

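## ⚡ Quick start

### 1. Environment setup
Make sure Python 3.8+ and the official SDK are installed:
```bash
pip install zhipuai
```

### 2. Basic call example

```python
import base64
import os
from zhipuai import ZhipuAI

# Read the key from the environment instead of hardcoding it
client = ZhipuAI(api_key=os.environ.get("ZHIPUAI_API_KEY"))

def analyze_vision(image_path):
    # Encode the local image as base64 (the data URL below assumes a JPEG file)
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")

    response = client.chat.completions.create(
        model="glm-4.6v",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract the key information from the image and output it as JSON"},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}}
            ]
        }]
    )
    return response.choices[0].message.content
```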
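The example above sends the image as a base64 data URL with a fixed `image/jpeg` type and asks for JSON output. The two helpers below are a hedged sketch of the surrounding plumbing: `image_to_data_url` guesses the MIME type from the file name, and `parse_json_reply` parses the reply text, in case it arrives wrapped in a Markdown code fence. Both function names are hypothetical and use only the Python standard library; neither is part of the `zhipuai` SDK.

```python
import base64
import json
import mimetypes

FENCE = "`" * 3  # Markdown code-fence marker


def image_to_data_url(image_path):
    """Encode a local image as a data URL, guessing the MIME type from the file name."""
    mime, _ = mimetypes.guess_type(image_path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image file: {image_path}")
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def parse_json_reply(text):
    """Parse a model reply as JSON, tolerating a surrounding Markdown code fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line, then everything from the closing fence on.
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    return json.loads(cleaned)
```

With these in place, the `url` field could be set to `image_to_data_url(image_path)` and the returned content passed through `parse_json_reply`. A reply that is not valid JSON still raises `json.JSONDecodeError`, which the caller should handle.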

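## 🏗️ Core features and scenarios

| Scenario | Recommended model | Key capabilities |
| --- | --- | --- |
| High-accuracy OCR | glm-4.6v | Complex layouts, handwriting, formula parsing |
| Very long documents / PPT | glm-4.6v | 128K context; supports in-depth summaries of 200-page files |
| Cost-sensitive tasks | glm-4.6v-flash | Basic image recognition, free of charge |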
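The scenario-to-model recommendations above can be kept in one place in code. The lookup below is a small sketch: the model names come from the scenarios listed above, while the scenario keys and the `pick_model` function are hypothetical names chosen for illustration.

```python
# Model names are taken from the scenarios above; the dictionary keys are hypothetical.
MODEL_BY_SCENARIO = {
    "ocr": "glm-4.6v",
    "long_document": "glm-4.6v",
    "cost_sensitive": "glm-4.6v-flash",
}


def pick_model(scenario, default="glm-4.6v"):
    """Return the recommended model for a scenario, falling back to the default."""
    return MODEL_BY_SCENARIO.get(scenario, default)
```

For example, `pick_model("cost_sensitive")` returns `"glm-4.6v-flash"`, and an unrecognized scenario falls back to `"glm-4.6v"`.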

## 🔗 Developer resources
