hn-podcast-transcriber

Automatically fetch, transcribe, and archive Hacker News podcast episodes (Hacker News Morning Brief). Use when the user wants to set up a podcast transcription pipeline, archive HN podcast episodes as searchable text, transcribe podcast audio to markdown, or schedule periodic HN podcast ingestion. Also use for any podcast RSS feed transcription workflow.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy the command below and send it to your AI assistant to install this skill:

Install skill "hn-podcast-transcriber" with this command: npx skills add terrycarter1985/hn-podcast-transcriber

HN Podcast Transcriber

Fetch new episodes from the Hacker News Morning Brief podcast RSS feed, transcribe with Whisper, and archive as searchable markdown.

Prerequisites

  • whisper CLI installed (pip install openai-whisper)
  • ffmpeg on PATH (required by whisper; download from https://ffmpeg.org)
  • python3 with standard library (no extra deps for the fetch script)
  • Disk space for audio files (~5-10 MB per episode)
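As a quick sanity check, the two CLI prerequisites can be verified with Python's standard library. This is an illustrative sketch, not part of the skill's own scripts:

```python
import shutil

def missing_tools(tools):
    """Return the subset of tools that cannot be found on PATH."""
    return [t for t in tools if shutil.which(t) is None]

missing = missing_tools(["whisper", "ffmpeg"])
if missing:
    print("Install before running:", ", ".join(missing))
```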

Quick Start

Run the main script to fetch and transcribe all new episodes:

bash scripts/fetch_and_transcribe.sh --archive ~/hn-podcast-archive

First run processes all episodes. Subsequent runs only process new ones (tracked via state.json).

Options

  Flag            Default                Description
  --feed URL      HN Morning Brief RSS   Podcast RSS feed URL
  --archive DIR   ./hn-podcast-archive   Archive root directory
  --model MODEL   turbo                  Whisper model (tiny/base/small/medium/large/turbo)
  --limit N       0 (all)                Max new episodes to process per run

Custom Feeds

Point at any podcast RSS feed:

bash scripts/fetch_and_transcribe.sh --feed "https://example.com/podcast/feed.xml" --archive ./my-podcast-archive

Scheduling

Set up an OpenClaw cron job for daily checks:

  1. Create an isolated cron job that runs the script
  2. Or add a heartbeat check in HEARTBEAT.md
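Outside OpenClaw, a plain crontab entry works just as well. The script path below is an assumption that depends on where the skill is installed:

```
# m h dom mon dow  command
0 7 * * * /path/to/hn-podcast-transcriber/scripts/fetch_and_transcribe.sh --archive "$HOME/hn-podcast-archive" >> "$HOME/hn-podcast-archive/cron.log" 2>&1
```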

Archive Structure

See references/archive-layout.md for directory layout and state.json schema.

Workflow Summary

  1. Download RSS feed → parse <item> entries
  2. Skip already-processed episodes (state.json lookup)
  3. Download audio (mp3/m4a) to episode directory
  4. Run whisper to produce .txt transcript
  5. Generate cleaned transcript.md with title + date header
  6. Update state.json with processed episode ID
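Steps 1, 2, and 6 above can be sketched with Python's standard library. The state.json schema shown here (a "processed" key holding a list of episode GUIDs) is an assumption for illustration; see references/archive-layout.md for the actual schema:

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path

def parse_items(rss_xml):
    """Step 1: extract GUID, title, and audio URL from each <item>."""
    episodes = []
    for item in ET.fromstring(rss_xml).iter("item"):
        enc = item.find("enclosure")
        audio_url = enc.get("url") if enc is not None else None
        episodes.append({
            "guid": item.findtext("guid") or audio_url or "",
            "title": item.findtext("title", default=""),
            "audio_url": audio_url,
        })
    return episodes

def new_episodes(episodes, state_path):
    """Step 2: drop episodes whose GUID already appears in state.json."""
    path = Path(state_path)
    state = json.loads(path.read_text()) if path.exists() else {"processed": []}
    seen = set(state["processed"])
    return [e for e in episodes if e["guid"] not in seen]

def mark_processed(guid, state_path):
    """Step 6: record a finished episode so later runs skip it."""
    path = Path(state_path)
    state = json.loads(path.read_text()) if path.exists() else {"processed": []}
    if guid not in state["processed"]:
        state["processed"].append(guid)
    path.write_text(json.dumps(state, indent=2))
```

Steps 3-5 (downloading audio and invoking whisper) are I/O-bound wrappers around urllib and a `whisper <file> --model turbo` subprocess call, so they are omitted from the sketch.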

Notes

  • Whisper models cache to ~/.cache/whisper after first download
  • Use --model tiny for speed, --model large for best accuracy
  • Average episode (~6 min) takes ~1-2 min with turbo model on CPU
  • For GPU acceleration, install a CUDA-enabled build of PyTorch; whisper uses the GPU automatically when one is available (ffmpeg only handles audio decoding)

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
