Multimodal Memory
Store and retrieve visual content — user images, charts, diagrams, website UIs — across conversations.
Important: Image Analysis
The primary model may not support vision. Always use analyze.py to analyze images — it calls GPT-4o directly via API and does not rely on your own vision capability.
Storage Location
All data lives in ~/.multimodal-memory/:
- images/: saved copies of captured images
- metadata.db: SQLite database (auto-created)
- memory.md: human-readable summary (auto-updated)
Read ~/.multimodal-memory/memory.md at session start for a quick overview.
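A minimal sketch of that session-start check, assuming nothing about the storage beyond the three entries listed above. The helper name `memory_overview` is hypothetical. The schema of metadata.db is not documented here, so the sketch only lists table names from SQLite's built-in catalog rather than querying specific columns.

```python
import sqlite3
from pathlib import Path


def memory_overview(root: Path) -> dict:
    """Return the memory.md text and the table names found under `root`.

    Hypothetical helper: it never creates files and assumes no schema.
    """
    summary_file = root / "memory.md"
    db_file = root / "metadata.db"

    summary = ""
    if summary_file.exists():
        summary = summary_file.read_text(encoding="utf-8")

    tables = []
    if db_file.exists():
        # sqlite_master is SQLite's own catalog, so this works for any schema.
        conn = sqlite3.connect(db_file)
        try:
            rows = conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            ).fetchall()
            tables = [name for (name,) in rows]
        finally:
            conn.close()

    return {"summary": summary, "tables": tables}


if __name__ == "__main__":
    info = memory_overview(Path.home() / ".multimodal-memory")
    print(info["summary"] or "(no memory.md yet)")
    print("tables:", info["tables"])
```

The existence checks matter because `sqlite3.connect` would otherwise create an empty database file on a fresh machine.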
Scenarios & Actions
1. User Sends an Image / Chart / Diagram
When a user sends an image, OpenClaw saves it locally and provides the file path in the message context (look for a path like /tmp/... or ~/.openclaw/...).
Run analyze.py with that path — it calls GPT-4o to analyze and stores the result automatically:
python {baseDir}/scripts/analyze.py \
--image-path "/absolute/path/to/image.jpg" \
--source "image"
For charts use --source "chart", for diagrams use --source "image".
If you cannot find the file path in the message context, ask the user:
"Where is this image saved? You can also paste the file path directly."
2. User Asks to Capture / Remember a Website
Step 1 — take the screenshot:
python {baseDir}/scripts/capture_url.py --url "https://example.com"
The script prints the saved screenshot path.
Step 2 — analyze and store it:
python {baseDir}/scripts/analyze.py \
--image-path "/path/printed/above.png" \
--source "website" \
--url "https://example.com"
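The two steps above can be chained from Python, sketched here under one loud assumption: that capture_url.py prints the saved screenshot path as the last non-empty line of its output. The text only says the script prints the path, so check the real output before relying on this. The helper names `extract_saved_path` and `capture_and_store` are hypothetical; the flags are the ones shown in the commands above.

```python
import subprocess
import sys
from pathlib import Path


def extract_saved_path(stdout: str) -> str:
    """Pick the screenshot path out of capture_url.py's output.

    ASSUMPTION: the path is the last non-empty line that was printed.
    """
    lines = [line.strip() for line in stdout.splitlines() if line.strip()]
    if not lines:
        raise ValueError("capture_url.py printed no output")
    return lines[-1]


def capture_and_store(base_dir: str, url: str) -> str:
    """Run step 1 (screenshot) and step 2 (analyze and store) for one URL."""
    scripts = Path(base_dir) / "scripts"

    # Step 1: take the screenshot and capture what the script prints.
    capture = subprocess.run(
        [sys.executable, str(scripts / "capture_url.py"), "--url", url],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the capture fails
    )
    screenshot = extract_saved_path(capture.stdout)

    # Step 2: analyze the screenshot and store it with the website source tag.
    subprocess.run(
        [
            sys.executable, str(scripts / "analyze.py"),
            "--image-path", screenshot,
            "--source", "website",
            "--url", url,
        ],
        check=True,
    )
    return screenshot
```

Passing arguments as a list instead of a shell string avoids quoting problems when a URL contains `&` or `?`.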
3. User Searches by Text
python {baseDir}/scripts/search.py --query "login screen dark theme"
Show results with descriptions and image paths.
4. User Sends an Image to Search (find similar memories)
Step 1 — analyze the query image to get its description:
python {baseDir}/scripts/analyze.py \
--image-path "/path/to/query/image.jpg" \
--source "image"
Step 2 — the analysis is stored; also search for similar past content using the description keywords:
python {baseDir}/scripts/search.py --query "key concepts from the analysis output"
Step 3 — present matching memories and explain why they're relevant.
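One way to turn an analysis description into a search query for Step 2 is a simple stopword filter, sketched below. This is an illustration only: `keywords_from_description` and its stopword list are hypothetical, and how search.py actually matches text is not described here, so treat the choice of terms as a starting point.

```python
import re

# Small, illustrative stopword list; extend it as needed.
STOPWORDS = {
    "a", "an", "and", "the", "of", "with", "on", "in",
    "to", "is", "for", "at", "this", "that",
}


def keywords_from_description(description: str, max_terms: int = 6) -> str:
    """Return up to `max_terms` distinct content words, in order of appearance."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    kept = []
    for word in words:
        # Skip very short words, stopwords, and repeats.
        if len(word) > 2 and word not in STOPWORDS and word not in kept:
            kept.append(word)
    return " ".join(kept[:max_terms])


query = keywords_from_description(
    "A login screen with a dark theme and a blue sign-in button", max_terms=4
)
print(query)  # login screen dark theme
```

The resulting string can be passed as the `--query` value to search.py.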
5. List Recent Memories
python {baseDir}/scripts/list.py --limit 20
Core Rules
- Never try to analyze images yourself; always delegate to analyze.py.
- After storing, confirm to user: description + tags saved.
- Image paths must be absolute.
- The --extra-tags arg accepts comma-separated additional tags.
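The path and tag rules above can be enforced when building the analyze.py command line. The sketch below is a hypothetical helper (`build_analyze_command` is not part of the scripts); it uses only the flags listed in this document, resolves the image path to an absolute one, and joins extra tags with commas.

```python
import sys
from pathlib import Path


def build_analyze_command(base_dir, image_path, source, url=None, extra_tags=None):
    """Build the argument list for analyze.py without running it."""
    # Rule: image paths must be absolute. Expand "~" and resolve relative paths.
    image = Path(image_path).expanduser().resolve()

    cmd = [
        sys.executable, str(Path(base_dir) / "scripts" / "analyze.py"),
        "--image-path", str(image),
        "--source", source,
    ]
    if url:
        cmd += ["--url", url]

    # Rule: --extra-tags takes comma-separated tags. Drop blanks, trim spaces.
    tags = [tag.strip() for tag in (extra_tags or []) if tag.strip()]
    if tags:
        cmd += ["--extra-tags", ",".join(tags)]
    return cmd


if __name__ == "__main__":
    # "/opt/skill" stands in for {baseDir}; the image path is an example.
    print(build_analyze_command(
        "/opt/skill", "~/Pictures/ui.png", "image", extra_tags=["ui", "dark mode"]
    ))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once the path and tags look right.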
One-Time Setup for URL Capture
If capture_url.py fails:
pip install playwright && python -m playwright install chromium
Script Reference
| Script | Purpose | Key args |
|---|---|---|
| analyze.py | Analyze image with GPT-4o + store | --image-path, --source, --url, --extra-tags |
| store.py | Store pre-analyzed result | --image-path, --description, --tags, --source, --url |
| search.py | Text search | --query, [--limit N] |
| list.py | List memories | [--limit N] |
| capture_url.py | Screenshot a URL | --url |