pf-clone-gitlab

Batch clone GitLab group projects with branch checkout and Excel indexing. Use when user needs to clone all projects from a GitLab group, organize by group structure, checkout multiple branches (master/default + latest + release/prod), and generate an Excel index file with project metadata and branch information.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "pf-clone-gitlab" with this command: npx skills add gitlab-batch-cloner

Clone GitLab Skill

Batch clone all projects from GitLab group(s), maintain folder hierarchy, checkout key branches, and generate an Excel index.

When to Use

  • User wants to batch clone all projects from one or more GitLab top-level groups
  • Need to maintain original group/subgroup folder hierarchy locally
  • Need to checkout multiple branches (default + latest active + release/prod)
  • Need an Excel index file (01.Index.xlsx) with per-group sheets

Requirements

Before starting, collect from user:

ParameterRequiredDefaultNotes
GitLab URLe.g. https://gitlab.company.com
Personal Access TokenNeeds read_api + read_repository scopes
Target Group(s)Group/sub-group/project paths, comma-separated. Supports: top-level group (myGroup), sub-group path (myGroup/mySubGroup), or direct project path (myGroup/mySubGroup/my-project)
Local Storage Path~/Desktop/CodeWhere repos are stored
Auth MethodHTTPS+TokenOr SSH if key is configured
Modecloneclone (default), update (pull only), or sync (clone+pull+cleanup stale Excel rows)
Workers4Parallel clone workers (GITLAB_WORKERS)
Total Timeout0 (none)Global timeout in seconds (GITLAB_TOTAL_TIMEOUT). 0=no limit

Workflow

Step 1: Collect Input

Ask user for the 4 parameters above. Pass them as environment variables to the script:

GITLAB_URL, GITLAB_TOKEN, GITLAB_GROUPS (comma-separated), GITLAB_BASE_DIR

Step 2: Run the Script

cd <skill-dir>/scripts
python3 clone_and_index.py

The script handles everything:

  1. Resolves each input path — auto-detects if it's a top-level group, sub-group, or direct project
  2. For top-level groups: fetches all subgroups recursively (skips if access denied)
  3. For sub-group paths (e.g. myGroup/mySubGroup): directly resolves and syncs that sub-group and its descendants only
  4. For direct project paths: syncs only that specific project
  5. Clones new projects / pulls existing ones (with 5-min timeout per project)
  6. Uses multiprocessing for parallel clone/pull (process-group kill on timeout to prevent orphan git processes)
  7. Checks out branches: default + latest active + release/prod
  8. Incrementally updates 01.Index.xlsx every 50 projects (so partial results survive crashes)
  9. On SIGTERM/SIGINT, emergency-flushes pending results to Excel before exit
  10. Prints per-group project counts during discovery and real-time progress with elapsed time
  11. In sync mode, the final Excel write removes rows for deleted/archived projects and handles cross-group migrations

Step 3: Report Results

After the script finishes, report:

  • Total projects cloned/updated
  • Any failed/timed-out projects (the script prints a summary table)

Modes

Update Only

If projects already exist and user just wants to update:

GITLAB_MODE=update python3 clone_and_index.py

This skips clone, only does git fetch --all && git pull on existing repos, re-checkouts branches, and refreshes the Excel.

Sync (Full Sync with Cleanup)

For a complete sync that also cleans up stale data in the Excel:

GITLAB_MODE=sync python3 clone_and_index.py

This behaves like clone mode (new repos are cloned, existing ones are pulled), but additionally:

  • Removes Excel rows for projects that no longer exist on GitLab (deleted/archived)
  • Handles projects that moved between groups (updates path, removes old row)
  • Only cleans up sheets belonging to the groups specified in GITLAB_GROUPS (won't touch other groups' data)

Excel Specification

File: 01.Index.xlsx inside <local-path>/ (e.g. ~/Desktop/Code/01.Index.xlsx)

Sheets: One sheet per top-level group (sheet name = group name, e.g. "myGroup")

Columns:

ColFieldContent
A主Group名称Top-level group (e.g. myGroup)
B子Group路径Full group path without project name
CProject路径Full path (e.g. myGroup/mySubGroup/myProject)
DProject名称Project name
EProject描述GitLab description
F已checkout分支All local branches, one per line
G分支最新提交时间Corresponding commit times, one per line
HSSH Git链接ssh_url_to_repo
I下载时间Clone/update timestamp
JProject IDGitLab project ID (hidden column, used for matching)

Sort: A (asc) → B (asc) → D (asc)

Formatting: Frozen header row, thin borders on all cells, F/G columns left-aligned with wrap, other columns center-aligned, UTF-8 encoding.

Security

  • Token is passed via environment variable, never logged
  • After clone, remote URL is reset to remove embedded token
  • If clone times out or crashes, a cleanup step removes token from .git/config

Output Structure

~/Desktop/Code/
├── 01.Index.xlsx
├── myGroup/
│   ├── SubGroup1/
│   │   ├── project-a/
│   │   └── project-b/
│   └── SubGroup2/
│       └── project-c/
└── AnotherGroup/
    └── ...

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

claw2ui

Generate interactive web pages (dashboards, charts, tables, reports) and serve them via public URL. Use this skill when the user explicitly asks for data vis...

Registry SourceRecently Updated
General

WeChat Article Summarize

Read one or more WeChat public account article links from mp.weixin.qq.com, extract cleaned full text and optional image links, summarize each article in Chi...

Registry SourceRecently Updated
General

Openfinance

Connect bank accounts to AI models using openfinance.sh

Registry SourceRecently Updated
General

---

合同审查清单AI助手 - 5类合同+3大特殊条款,风险识别与修改建议

Registry SourceRecently Updated