ai-tech-rss-fetch

Subscribe to AI and tech RSS feeds and persist normalized metadata into SQLite using mature Python tooling (feedparser + sqlite3). Use when adding feed URLs/OPML sources, running incremental sync with deduplication, and storing entry metadata without full-text extraction or summarization.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "ai-tech-rss-fetch" with this command: npx skills add fadeloo/skills/fadeloo-skills-ai-tech-rss-fetch

AI Tech RSS Fetch

Core Goal

  • Subscribe to RSS/Atom sources.
  • Persist feed and entry metadata to SQLite.
  • Deduplicate entries with layered identity keys plus content fingerprints.
  • Keep only metadata; do not fetch full article bodies and do not summarize.

Triggering Conditions

  • Receive a request to subscribe RSS feeds from URLs or OPML.
  • Receive a request to run incremental RSS sync reliably.
  • Need stable metadata persistence for downstream processing.
  • Need dedupe-safe storage of feed items over repeated runs.

Workflow

  1. Prepare runtime and database.
  • Ensure dependency is installed: python3 -m pip install feedparser.
  • In multi-agent runtimes, pin DB to an absolute path before any command:
export AI_RSS_DB_PATH="/absolute/path/to/workspace-rss-bot/ai_rss.db"
  • Initialize SQLite schema once:
python3 scripts/rss_subscribe.py init-db --db "$AI_RSS_DB_PATH"
  1. Add feed subscriptions.
  • Add one feed URL:
python3 scripts/rss_subscribe.py add-feed --db "$AI_RSS_DB_PATH" --url "https://example.com/feed.xml"
  • Import feeds from OPML:
python3 scripts/rss_subscribe.py import-opml --db "$AI_RSS_DB_PATH" --opml assets/hn-popular-blogs-2025.opml
  1. Run incremental sync.
  • Fetch active feeds and store metadata:
python3 scripts/rss_subscribe.py sync --db "$AI_RSS_DB_PATH" --max-feeds 20 --max-items-per-feed 100
  • Optional one-feed sync:
python3 scripts/rss_subscribe.py sync --db "$AI_RSS_DB_PATH" --feed-url "https://example.com/feed.xml"
  1. Query persisted metadata.
  • List feeds:
python3 scripts/rss_subscribe.py list-feeds --db "$AI_RSS_DB_PATH" --limit 50
  • List recent entries:
python3 scripts/rss_subscribe.py list-entries --db "$AI_RSS_DB_PATH" --limit 100

Input Requirements

  • Supported inputs:
    • RSS XML feed URLs.
    • OPML feed list files.

Output Contract (Metadata Only)

  • Persist feeds metadata to SQLite:
    • feed_url, feed_title, site_url, etag, last_modified, status fields.
  • Persist entries metadata to SQLite:
    • id, dedupe_key (compat primary identity snapshot), guid, url, canonical_url, title, author, published_at, updated_at, summary, categories, content_hash, match_confidence, timestamps.
  • Persist entry_identities mapping table to SQLite:
    • entry_id, key_type, key_value, created_at.
    • Supported key types: guid, canonical_url, legacy_guid, fallback_hash.
  • Do not store generated summaries and do not create archive markdown files.

Configurable Parameters

  • db_path
  • AI_RSS_DB_PATH (recommended absolute path in multi-agent runtime)
  • opml_path
  • feed_urls
  • max_feeds_per_run
  • max_items_per_feed
  • user_agent
  • seen_ttl_days
  • enable_conditional_get
  • Example config: assets/config.example.json

Error and Boundary Handling

  • Feed HTTP/network failure: keep syncing other feeds and record last_error.
  • Feed 304 Not Modified: skip entry parsing and keep state.
  • Missing guid and link: use hashed fallback identity and set match_confidence=low.
  • Dependency missing (feedparser): return install guidance.

Final Output Checklist (Required)

  • core goal
  • trigger conditions
  • input requirements
  • metadata schema
  • dedupe and sync rules
  • command workflow
  • configurable parameters
  • error handling

Use the following simplified checklist verbatim when the user requests it:

核心目标
输入需求
触发条件
元数据模型
去重与同步规则
命令流程
可配置参数
错误处理

References

  • references/input-model.md
  • references/output-rules.md
  • references/time-range-rules.md

Assets

  • assets/hn-popular-blogs-2025.opml (candidate feed pool)
  • assets/config.example.json

Scripts

  • scripts/rss_subscribe.py

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

ai-tech-rss-fetch

No summary provided by upstream source.

Repository SourceNeeds Review
General

email-imap-fetch

No summary provided by upstream source.

Repository SourceNeeds Review
General

ai-tech-summary

No summary provided by upstream source.

Repository SourceNeeds Review