Photo Dedup — Find & Select Unique Photos from Duplicates

Use this skill when the user wants to deduplicate photos, find unique images from a large set, remove similar/duplicate photos, or organize photos by uniqueness. Trigger phrases: "dedup photos", "find duplicate photos", "unique photos", "remove duplicate images", "photo dedup", "/photo-dedup".

Overview

This skill processes a folder of photos (typically hundreds from events like school photography), identifies duplicates and near-duplicates using perceptual hashing, and organizes them into unique vs duplicate folders. It's designed for the workflow where ~500 photos come in but only ~100 are truly unique.

How It Works

Perceptual Hashing — Each image is converted to a perceptual hash (pHash) that represents its visual content. Similar-looking images produce similar hashes, even if they differ in resolution, compression, or minor edits.
Clustering — Images are grouped by hash similarity. Each cluster represents one "scene" or "shot". The largest file in each cluster is selected as its representative, with file size used as a proxy for image quality.
Output — Unique photos are copied to a unique/ folder. A report is generated showing how many duplicates were found and the cluster groupings.
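
The clustering and selection steps above can be sketched in a few lines. This is a minimal illustration, not the code in dedup.py: it assumes each photo already has a 64-bit perceptual hash stored as an integer, uses a simple greedy grouping strategy, and the `Photo` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    path: str
    phash: int  # 64-bit perceptual hash as an integer (hypothetical field)
    size: int   # file size in bytes


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def cluster(photos: list[Photo], threshold: int = 6) -> list[list[Photo]]:
    """Greedy clustering: a photo joins the first cluster whose first
    member is within `threshold` bits, otherwise it starts a new cluster."""
    clusters: list[list[Photo]] = []
    for photo in photos:
        for group in clusters:
            if hamming(photo.phash, group[0].phash) <= threshold:
                group.append(photo)
                break
        else:
            clusters.append([photo])
    return clusters


def pick_representative(group: list[Photo]) -> Photo:
    """Largest file wins, as a rough proxy for quality."""
    return max(group, key=lambda p: p.size)


photos = [
    Photo("a.jpg", 0b1111_0000, 2_000_000),
    Photo("a_small.jpg", 0b1111_0001, 500_000),  # 1 bit away from a.jpg
    Photo("b.jpg", 0xFFFF_FFFF_0000_0000, 1_500_000),
]
groups = cluster(photos)
unique = [pick_representative(g) for g in groups]
print(f"total={len(photos)} unique={len(unique)} duplicates={len(photos) - len(unique)}")
```

Greedy grouping depends on input order and compares only against the first member of each cluster. That keeps the sketch short; a real implementation may well compare against every member or use a different strategy.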

Usage

Basic — Dedup a folder:

/photo-dedup ~/Photos/school-event/

With custom threshold:

/photo-dedup ~/Photos/school-event/ --threshold 8

The threshold controls similarity sensitivity (default: 6, range 0-20). Lower values match more strictly; higher values group more aggressively.
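
To make the effect of the threshold concrete, here is a small sketch. It assumes the threshold is the maximum Hamming distance (number of differing bits) allowed between two 64-bit hashes; the source does not state this, so treat it as an assumption. The hash strings and function names are made up for illustration.

```python
def hamming_hex(h1: str, h2: str) -> int:
    """Count differing bits between two hex-encoded hashes of equal length."""
    return bin(int(h1, 16) ^ int(h2, 16)).count("1")


def same_shot(h1: str, h2: str, threshold: int = 6) -> bool:
    """Treat two photos as one shot when their hashes differ by at most `threshold` bits."""
    return hamming_hex(h1, h2) <= threshold


# Illustrative hashes that differ only in the last byte (0x80 vs 0xff, 7 bits).
original = "ffd8c0a0b0908880"
recompressed = "ffd8c0a0b09088ff"

print(hamming_hex(original, recompressed))             # distance in bits
print(same_shot(original, recompressed))               # default threshold 6
print(same_shot(original, recompressed, threshold=8))  # looser threshold
```

Under this assumption, a pair 7 bits apart is kept separate at the default threshold of 6 and grouped together at 8.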

Preview mode (no file copying):

/photo-dedup ~/Photos/school-event/ --preview

Workflow

When the user invokes this skill:

Validate input — Confirm the source folder exists and contains images
Install dependencies if needed — pip3 install Pillow imagehash pillow-heif

Run the dedup scan:

python3 ~/.claude/skills/photo-dedup/scripts/dedup.py <source_folder> --preview [--threshold N]

Report results to the user (total, unique, duplicates)
Launch the review server (runs locally, opens browser automatically):
```
python3 ~/.claude/skills/photo-dedup/scripts/review_server.py /tmp/dedup_report_<folder_name>.json --output ~/Desktop/selected_photos &
```
The user can then:
- See all duplicate groups side by side (Notion-style UI)
- Click to select which photo to keep from each group
- Use "Auto-select best" for one-click defaults
- Hit "Save selected" — photos are copied instantly, no Terminal needed
After the user saves, tell them where the selected photos are
Kill the server when done: kill $(pgrep -f review_server)

The review server is a lightweight local HTTP server — no install, no config, just Python. Non-technical users only interact with the browser. One click to save.
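
The "Validate input" step at the start of the workflow could look roughly like the sketch below. It is an assumption about how validation might be done, not the skill's actual code: it checks only the top level of the folder (no recursion), matches extensions case-insensitively against the formats listed under Important Notes, and `find_images` is a hypothetical name.

```python
from pathlib import Path

# Extensions taken from the "Supported formats" note; matching is case-insensitive.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic", ".webp", ".tiff", ".bmp"}


def find_images(folder: str) -> list[Path]:
    """Return image files directly inside `folder`, sorted by path.

    Raises ValueError if the folder is missing or holds no supported images.
    """
    root = Path(folder).expanduser()
    if not root.is_dir():
        raise ValueError(f"Not a folder: {root}")
    images = sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    if not images:
        raise ValueError(f"No supported images in {root}")
    return images
```

Failing early with a clear message lets the assistant tell the user what is wrong before any dependencies are installed or scans are started.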

Output Structure

~/Desktop/selected_photos/     ← User's selected photos (copies, originals untouched)
/tmp/dedup_report_*.json       ← Clustering report
(original photos are NEVER modified or deleted)
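
A copy step that keeps the originals untouched might be sketched as follows. The function name and the collision-renaming rule (`_1`, `_2`, ...) are assumptions made for illustration; the source does not describe how the review server handles duplicate file names.

```python
import shutil
from pathlib import Path


def copy_selected(selected: list[str], output_dir: str) -> list[Path]:
    """Copy each selected photo into `output_dir`; sources are only read."""
    out = Path(output_dir).expanduser()
    out.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in map(Path, selected):
        dest = out / src.name
        counter = 1
        # Avoid overwriting when a file with the same name is already there.
        while dest.exists():
            dest = out / f"{src.stem}_{counter}{src.suffix}"
            counter += 1
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        copied.append(dest)
    return copied
```

Because the sources are only ever opened for reading, a failed or repeated run cannot damage the original photos; at worst it leaves extra copies in the output folder.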

Important Notes

Non-destructive — Original photos are NEVER moved or deleted. Unique photos are copied to a subfolder.
Supported formats — JPG, JPEG, PNG, HEIC, WEBP, TIFF, BMP
Performance — Handles 500+ photos in under a minute on modern hardware
Selection criteria — When duplicates are found, the largest file is picked as the representative, with file size used as a proxy for quality
