chinese-ebook-downloader

Download Chinese-language ebooks from multiple sources with automatic A→B→C fallback. Primary source: online book library with ~100% coverage, no daily limit. Fallback sources: secondary library, Anna's Archive. Supports EPUB→PDF auto-conversion via weasyprint. Handles file host decryption, countdown wait, JS API extraction, GBK-encoded ZIP. Formats: PDF, EPUB, MOBI, AZW3. Use when: "下载电子书", "下载这本书", "找一下某本书的电子版", "帮我下个epub/mobi/azw3/pdf".

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "chinese-ebook-downloader" with this command: npx skills add lb1121/chinese-ebook-downloader

Chinese Ebook Downloader

Download Chinese ebooks from multiple sources with automatic fallback and format conversion.

Quick Start

# Single book download (multi-source fallback)
python scripts/download_book.py --title "超越百岁" --author "彼得·阿提亚"

# Multi-source batch download (A→B→C fallback + EPUB→PDF conversion)
python scripts/multi_source_download.py ~/Books/

# Search Anna's Archive directly
python scripts/search_source_c.py "书名" "作者"

# Convert EPUB to PDF
python scripts/epub_to_pdf.py book.epub book.pdf

Download Sources (Priority Order)

SourceCoverageLimitNotes
Source A (online book library)~100%NonePrimary — high coverage for popular Chinese books
Source B (secondary library)~8%NoneFallback for missing titles
Source C (Anna's Archive)WideRate-limitedLast resort — uses libgen.li mirrors

Note: Z-Library has been deprecated due to 10/day download limit.

Multi-Source Fallback

The multi_source_download.py script automatically tries sources in order:

Source A → Source B → Source C → EPUB→PDF Conversion

Workflow per book:

  1. Try Source A (ZIP → extract PDF/EPUB)
  2. If failed, try Source B (file host download)
  3. If failed, try Source C (Anna's Archive via libgen.li)
  4. If only EPUB found, auto-convert to PDF using weasyprint

Usage:

# Edit BOOKS list in script, then run:
python scripts/multi_source_download.py ~/Books/

EPUB → PDF Conversion

When only EPUB format is available, auto-convert using weasyprint:

# Single file
python scripts/epub_to_pdf.py input.epub output.pdf

# Batch convert directory
python scripts/epub_to_pdf.py --batch ~/Books/

Requirements: ebooklib, weasyprint, CJK fonts installed.

Scripts Reference

ScriptPurpose
download_book.pyPrimary download from Source A
search_secondary_source.pySource B search & download
search_source_c.pyAnna's Archive search & download
batch_download.pyBatch download from JSON list
multi_source_download.pyMulti-source A→B→C fallback
epub_to_pdf.pyEPUB/MOBI to PDF conversion
anna_iso_batch.shAnna's Archive isolated batch (one process per book)

Source A Workflow (Primary)

Search → Get file host link → Decrypt → Wait countdown → API fetch → curl download → Extract ZIP

Step 1: Search

Search the primary library for the book title. Navigate to download page, extract file host URL and password.

Step 2: Decrypt

Navigate to file host URL, enter password, click decrypt.

Step 3: Wait for countdown

File hosting service requires countdown before download. Do not skip.

Step 4: Fetch real download URL

Get page variables:

JSON.stringify({api_server, userid, file_id, share_id, file_chk, start_time, wait_seconds, verifycode})

Call API:

(async () => {
  var url = api_server + '/get_file_url.php?uid=' + userid
    + '&fid=' + file_id + '&folder_id=0&share_id=' + share_id
    + '&file_chk=' + file_chk + '&start_time=' + start_time
    + '&wait_seconds=' + wait_seconds + '&mb=0&app=0&acheck=0'
    + '&verifycode=' + verifycode + '&rd=' + Math.random();
  var headers = typeof getAjaxHeaders === 'function' ? getAjaxHeaders() : {};
  var resp = await fetch(url, {headers: headers});
  return JSON.stringify(await resp.json());
})()

Response code: 200downurl is real URL.

Step 5: Download

curl -L -o "book.zip" "DOWNURL" \
  -H "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)" \
  --max-time 1200

Step 6: Extract ZIP (GBK encoding)

import zipfile
with zipfile.ZipFile('book.zip', 'r') as z:
    for info in z.infolist():
        try:
            name = info.filename.encode('cp437').decode('gbk')
        except:
            name = info.filename
        ext = os.path.splitext(name)[1].lower()
        if ext in ('.epub', '.azw3', '.mobi', '.pdf', '.txt'):
            data = z.read(info.filename)
            with open(os.path.basename(name), 'wb') as f:
                f.write(data)

Book Name Matching Strategy

When a book title is long or contains multiple names (e.g. box sets):

  • Removes subtitles (after ":" or ":")
  • Removes parenthetical content ("(...)", "(...)")
  • Removes "套装共X册" bundle descriptions
  • Splits "+"-connected titles into individual books
  • Tries each keyword until match found
  • Falls back to full title + author

Examples:

  • "杨定一全部生命系列:真原医+静坐+好睡(套装3册)" → tries "真原医", "静坐", "好睡"
  • "超越百岁:长寿的科学与艺术" → tries "超越百岁", then "超越百岁 彼得·阿提亚"

Format Selection

FlagDescription
--format pdfPDF only (default, preferred for NotebookLM)
--format epubEPUB only
--format mobiMOBI only
--format azw3AZW3 only
--format anyAccept any available format

Batch Download

python scripts/batch_download.py --book-list books.json --output-dir ~/Books/

JSON format:

[
  {"title": "超越百岁", "file_url": "<file_host_url>", "password": "<password>"}
]

Features: resume via _progress.json, skip existing, rate limiting.

Troubleshooting

ProblemSolution
IP blockingUse browser tool, not web_fetch
Link 404Link expired, re-search
API non-200Re-navigate and re-decrypt
Download is HTMLURL expired, fresh API call needed
ZIP filenames garbledUse Python cp437→gbk, not unzip
Timeout on large filesIncrease --max-time to 1200
Anna's Archive blockedTry different mirror, use anna_iso_batch.sh

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

通义晓蜜 - 智能外呼

触发阿里云晓蜜外呼机器人任务,自动批量拨打电话。适用于批量外呼、客户回访、满意度调查、简历筛查约面试等场景。可从前置工具或节点获取外呼名单。

Registry SourceRecently Updated
General

Letterboxd Watchlist

Scrape a public Letterboxd user's watchlist into a CSV/JSONL list of titles and film URLs without logging in. Use when a user asks to export, scrape, or mirror a Letterboxd watchlist, or to build watch-next queues.

Registry SourceRecently Updated
General

Seedance Video Generation

Generate AI videos using ByteDance Seedance. Use when the user wants to: (1) generate videos from text prompts, (2) generate videos from images (first frame, first+last frame, reference images), or (3) query/manage video generation tasks. Supports Seedance 1.5 Pro (with audio), 1.0 Pro, 1.0 Pro Fast, and 1.0 Lite models.

Registry SourceRecently Updated
4.2K17jackycser
General

Universal Skills Manager

The master coordinator for AI skills. Discovers skills from multiple sources (SkillsMP.com, SkillHub, and ClawHub), manages installation, and synchronization...

Registry SourceRecently Updated