Multimodal Analyzer

Multimodal content understanding (images, videos, documents) using the GLM-4.6V model

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.


Install skill "Multimodal Analyzer" with this command: npx skills add tridefender/multimodal

Multimodal Understanding Skill

Uses the Zhipu GLM-4.6V model to understand the content of images, videos, and documents.

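Features

  • Image understanding: OCR, scene analysis, object detection, attribute recognition
  • Video understanding: content summarization, key-frame analysis
  • Document understanding: PDF and complex table parsing
  • Deep thinking mode: when enabled, the model performs deeper reasoning before answering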

Usage

Understand this image: [image URL or local path]
Analyze this video: [video URL]
What is this PDF about: [PDF URL]

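Technical Details

Limitations

  • Images, videos, and files cannot be processed together in one request (choose exactly one modality)
  • Video URLs must be reachable from the public internet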
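The two limits above can be checked before a request is sent. The following is a minimal sketch, not the skill's actual code: the helper names `is_http_url` and `check_request` are hypothetical, and the URL check only confirms an http(s) scheme and host. It cannot confirm that the URL is really reachable from the public internet.

```python
from typing import Dict, Tuple
from urllib.parse import urlparse

# The three modalities named in the listing.
ALLOWED_TYPES = {"image", "video", "file"}


def is_http_url(value: str) -> bool:
    """Return True if value looks like an http(s) URL with a host."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_request(inputs: Dict[str, str]) -> Tuple[str, str]:
    """Validate a mapping of modality -> URL or path (hypothetical helper).

    Enforces the documented limits: exactly one modality per request,
    and a URL (not a local path) for video input.
    """
    unknown = set(inputs) - ALLOWED_TYPES
    if unknown:
        raise ValueError(f"unknown input type(s): {sorted(unknown)}")
    if len(inputs) != 1:
        raise ValueError("exactly one of image, video, file must be given")
    (kind, value), = inputs.items()
    if kind == "video" and not is_http_url(value):
        raise ValueError("video input must be an http(s) URL")
    return kind, value
```

For example, `check_request({"image": "photo.png"})` returns the pair unchanged, while passing both an image and a video raises `ValueError`.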

Invocation Script

Call scripts/analyze.py to run an analysis:

python scripts/analyze.py --type image|video|file --input <url_or_path> --prompt "your question"
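When the script is driven from other Python code, the command line can be assembled as an argument list, which avoids shell quoting problems with the prompt text. This is a sketch under assumptions: `build_command` is a hypothetical helper, and the script path is taken from the command shown above. Note that `image|video|file` in that command is a choice notation, so exactly one of the three values is passed.

```python
import sys
from typing import List


def build_command(input_type: str, input_ref: str, prompt: str,
                  thinking: bool = False, stream: bool = False) -> List[str]:
    """Build the argv list for scripts/analyze.py (hypothetical helper)."""
    if input_type not in ("image", "video", "file"):
        raise ValueError("input_type must be image, video, or file")
    cmd = [sys.executable, "scripts/analyze.py",
           "--type", input_type,
           "--input", input_ref,
           "--prompt", prompt]
    # Optional switches documented by the skill.
    if thinking:
        cmd.append("--thinking")
    if stream:
        cmd.append("--stream")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the skill's root directory.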

Parameters:

  • --type: input type (image/video/file)
  • --input: URL or local file path
  • --prompt: the analysis prompt
  • --thinking: enable deep thinking mode
  • --stream: stream the output
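The flags listed above map naturally onto Python's `argparse`. The sketch below shows one way such a parser could be declared. It is not the contents of scripts/analyze.py, which this page does not show. Treating `--type`, `--input`, and `--prompt` as required, and storing `--type` under the name `input_type`, are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags (a sketch, not the real script)."""
    parser = argparse.ArgumentParser(
        prog="analyze.py",
        description="Analyze an image, video, or file with a multimodal model.",
    )
    # Assumption: the three value-taking flags are all required.
    parser.add_argument("--type", dest="input_type", required=True,
                        choices=["image", "video", "file"],
                        help="input type")
    parser.add_argument("--input", required=True,
                        help="URL or local file path")
    parser.add_argument("--prompt", required=True,
                        help="analysis prompt")
    # Boolean switches default to False when omitted.
    parser.add_argument("--thinking", action="store_true",
                        help="enable deep thinking mode")
    parser.add_argument("--stream", action="store_true",
                        help="stream the output")
    return parser
```

Using `choices` makes the parser reject any `--type` value outside the three documented modalities before the model is called.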

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
