weibo-hot-search

Weibo multi-channel hot search data collection and visualization. Collects the overall hot search, social, entertainment, and life channels in one run, persists the data, and generates HTML visualization reports.

Install skill "weibo-hot-search" with this command: npx skills add noah-1106/weibo-trending

Weibo Hot Search Data Collection

A multi-channel Weibo hot search data collection tool with persistent storage and visualization.


Features

  • Multi-Channel Collection - Fetches the overall hot search, social, entertainment, and life channels in one run
  • Data Persistence - Saves automatically to a SQLite database, with support for historical queries
  • HTML Visualization - Generates an interactive report with date, channel, and keyword filters
  • Channel Tags - Recognizes tags such as 热 (Hot), 新 (New), 商 (Commercial), and 官宣 (Official)

Quick Start

1. Initialize Database

cd scripts
python3 init_db.py

2. Collect Data

# Fetch all channels (30 items per channel)
python3 save_to_db.py

# Specify the number of items per channel
python3 save_to_db.py 50

3. Query Data

# View today's hot search
python3 query.py today

# View a specific channel
python3 query.py today hot

# View a specific date
python3 query.py date 2026-03-15

# View statistics
python3 query.py stats 7

4. Generate HTML Report

python3 generate_html.py
open ../data/index.html  # macOS; on Linux, use xdg-open

File Structure

weibo-fresh-posts-0/
├── SKILL.md                    # This document
├── data/
│   ├── weibo.db                # SQLite database
│   └── index.html              # HTML visualization report
└── scripts/
    ├── init_db.py              # Database initialization
    ├── db.py                   # Database operations module
    ├── fetch-hot-search.py     # Core collection script
    ├── save_to_db.py           # Collect and save to the database
    ├── query.py                # Data query tool
    └── generate_html.py        # HTML report generation

Database Schema

hot_items Table

CREATE TABLE hot_items (
    id TEXT PRIMARY KEY,              -- Hash of URL + date + channel
    platform TEXT DEFAULT 'weibo',    -- Platform identifier
    channel_id TEXT,                  -- hot/social/entertainment/life
    channel_name TEXT,                -- Channel name
    rank INTEGER,                     -- Ranking
    title TEXT NOT NULL,              -- Title
    url TEXT NOT NULL,                -- Link
    heat INTEGER,                     -- Heat value
    tag TEXT,                         -- 热 (Hot), 新 (New), 商 (Commercial), 官宣 (Official), etc.
    fetched_at INTEGER,               -- Fetch time
    fetch_date TEXT                   -- Fetch date, YYYY-MM-DD
);
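
The id column is described as a hash of URL, date, and channel. The sketch below shows one way such a key could be derived in Python. The helper name make_item_id, the choice of SHA-256, the "|" separator, and the 32-character truncation are all assumptions for illustration; the actual logic in db.py may differ.

```python
import hashlib


def make_item_id(url: str, fetch_date: str, channel_id: str) -> str:
    """Hypothetical helper: build a stable id from URL + date + channel.

    The same (url, fetch_date, channel_id) triple always yields the same id,
    which is what lets a PRIMARY KEY on id act as a per-day, per-channel
    deduplication key.
    """
    key = f"{url}|{fetch_date}|{channel_id}"
    # SHA-256 and the truncation length are illustrative assumptions.
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:32]
```

With a key like this, the same topic URL fetched twice on one day in one channel maps to a single row, while the same URL on another day or in another channel gets its own row.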

topic_posts Table

CREATE TABLE topic_posts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    hot_item_id TEXT,                 -- Related hot item (hot_items.id)
    author TEXT,                      -- Author
    author_type TEXT,                 -- media/user
    content TEXT,                     -- Content
    url TEXT,                         -- Link
    is_media BOOLEAN                  -- Whether the author is a media account
);
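
Since topic_posts.hot_item_id points at hot_items.id, the two tables can be joined to list the posts collected for a topic. The following is a minimal sketch using Python's standard sqlite3 module; the function name posts_for_title is hypothetical and is not part of the shipped scripts.

```python
import sqlite3


def posts_for_title(conn: sqlite3.Connection, keyword: str):
    """Hypothetical helper: list (title, author, content) for matching topics.

    Joins topic_posts to hot_items through hot_item_id and filters on the
    hot item title with a parameterized LIKE pattern.
    """
    sql = """
        SELECT h.title, p.author, p.content
        FROM topic_posts AS p
        JOIN hot_items AS h ON h.id = p.hot_item_id
        WHERE h.title LIKE ?
        ORDER BY h.rank, p.id
    """
    return conn.execute(sql, (f"%{keyword}%",)).fetchall()
```

A typical call would be posts_for_title(sqlite3.connect("data/weibo.db"), "315"), assuming the database has been populated with the content-fetching option.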

Usage Examples

Collect Data

# Basic collection
python3 scripts/save_to_db.py

# Sample output:
# ============================================================
# 📱 微博热搜采集 → 数据库 / Weibo Hot Search → Database
#    每频道采集数量 / Per channel count: 30
# ============================================================
# 🔥 开始采集微博热搜 / Starting Weibo hot search collection...
# 
# 📡 [热搜总榜 / Hot Search]
#    ✅ 30 条热搜 / 30 hot items
#    新增 / New: 30 条
# 
# 📡 [社会榜 / Social]
#    ✅ 30 条热搜 / 30 hot items
#    新增 / New: 30 条
# ...

Query Data

# Today's hot search
$ python3 query.py today

📱 微博热搜 - 2026-03-16 / Weibo Hot Search
================================================================================

【热搜总榜 / Hot Search】
  1. 315晚会曝光... [热] / 315 Gala exposure... [Hot]
      🔥 5,000,000
  2. 明星离婚... [爆] / Celebrity divorce... [Viral]
      🔥 3,200,000
...

【社会榜 / Social】
  1. 交通事故... / Traffic accident...
  2. 天气预报... / Weather forecast...

Generate Report

$ python3 generate_html.py

✅ HTML报告已生成 / HTML report generated: data/index.html
   共 / Total: 120 条记录
   日期范围 / Date range: 2026-03-15 ~ 2026-03-16
   频道 / Channels: 热搜总榜, 社会榜, 文娱榜, 生活榜

   打开方式 / Open methods:
   - Mac: open data/index.html

HTML Report Features

  • 📅 Date Filter - Select a specific date
  • 📺 Channel Filter - Click a channel tag to filter
  • 🔍 Keyword Search - Search titles in real time
  • 🔥 Heat Display - Shows heat values
  • 🏷️ Tag Display - Shows tags such as 热 (Hot), 新 (New), 商 (Commercial), and 官宣 (Official)
  • 🏆 Ranking Display - The top 3 items are highlighted in a special color

Supported Channels

Channel ID      Channel Name               Description
hot             Hot Search (热搜总榜)      Overall hot search list
social          Social (社会榜)            Social news
entertainment   Entertainment (文娱榜)     Entertainment and culture
life            Life (生活榜)              Lifestyle

Original Collection Script

For direct JSON output:

# Output to a file
python3 fetch-hot-search.py -o weibo-hot.json

# Output to stdout (quiet mode)
python3 fetch-hot-search.py -q

# Fetch detailed content (posts for the top 10 topics)
python3 fetch-hot-search.py -c --content-limit 2 -o weibo-hot.json

SQL Query Examples

# Open the database
sqlite3 data/weibo.db

-- Today's top 10 on the overall hot search list
-- (date('now') is UTC; use date('now', 'localtime') if fetch_date is stored in local time)
SELECT rank, title, heat, tag FROM hot_items
WHERE fetch_date = date('now') AND channel_id = 'hot'
ORDER BY rank LIMIT 10;

-- Daily item counts per channel for the last 7 days
SELECT fetch_date, channel_name, COUNT(*)
FROM hot_items
WHERE fetch_date >= date('now', '-7 days')
GROUP BY fetch_date, channel_id, channel_name;

-- Hot search items whose title contains "315"
SELECT * FROM hot_items
WHERE title LIKE '%315%'
ORDER BY fetch_date DESC, heat DESC;
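
The same kind of aggregate can be run from Python when the results feed another script. The sketch below uses only the standard sqlite3 module and the hot_items columns listed in the schema; the function name daily_channel_counts is hypothetical, and it is not claimed to match what query.py stats does internally.

```python
import sqlite3


def daily_channel_counts(conn: sqlite3.Connection, days: int = 7):
    """Hypothetical helper: count hot items per day and channel.

    Returns (fetch_date, channel_id, count) tuples for rows whose fetch_date
    falls within the last `days` days, using SQLite's date() function.
    """
    sql = """
        SELECT fetch_date, channel_id, COUNT(*) AS n
        FROM hot_items
        WHERE fetch_date >= date('now', ?)
        GROUP BY fetch_date, channel_id
        ORDER BY fetch_date, channel_id
    """
    # The date modifier (for example '-7 days') is passed as a bound parameter.
    return conn.execute(sql, (f"-{int(days)} days",)).fetchall()
```

Note that date('now') is evaluated in UTC, so rows fetched close to midnight local time may land on a neighboring day if fetch_date is stored in local time.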

Notes

  1. Login Required - Log in to Weibo first with browser open https://weibo.com
  2. Rate Limiting - A brief delay is inserted between fetches to avoid triggering anti-crawling measures
  3. Deduplication - The same URL on the same day and channel is saved only once
  4. Heat Update - Refetching updates the heat value of an existing item
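
Notes 3 and 4 together describe insert-or-update behavior: a new id is inserted, and an id that already exists only has its heat refreshed. The following is a minimal sketch of that pattern with Python's sqlite3 module. The function name save_item and the dictionary keys are assumptions for illustration, and the real save_to_db.py may implement this differently.

```python
import sqlite3


def save_item(conn: sqlite3.Connection, item: dict) -> bool:
    """Hypothetical helper: insert a hot item or refresh its heat.

    Returns True when a new row was inserted and False when a row with the
    same id already existed and only its heat value was updated.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO hot_items "
        "(id, channel_id, rank, title, url, heat, fetch_date) "
        "VALUES (:id, :channel_id, :rank, :title, :url, :heat, :fetch_date)",
        item,
    )
    if cur.rowcount == 1:
        return True  # the PRIMARY KEY was new, so the row was inserted
    # The id already exists: keep the row and update only the heat value.
    conn.execute("UPDATE hot_items SET heat = :heat WHERE id = :id", item)
    return False
```

Counting the True results per channel would give a "new items" figure like the one shown in the sample collection output.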

Changelog

  • 2026-03-16: Added database persistence and HTML visualization
