KB Collector
Knowledge Base Collector - Save YouTube, URLs, and text to Obsidian with automatic transcription and summarization.
Features
- YouTube Collection - Download audio, transcribe with Whisper, auto-summarize
- URL Collection - Fetch and summarize web pages
- Plain Text - Direct save with tags
- Digest - Weekly/Monthly/Yearly review emails
- Nightly Research - Automated AI/LLM/tech trend tracking
Installation
# Install dependencies
pip install yt-dlp faster-whisper requests beautifulsoup4
# For AI summarization (optional)
pip install openai anthropic
Usage (Python Version - Recommended)
# Collect YouTube video
python3 scripts/collect.py youtube "https://youtu.be/xxxxx" "stock,investing"
# Collect URL
python3 scripts/collect.py url "https://example.com/article" "python,api"
# Collect plain text
python3 scripts/collect.py text "My note content" "tag1,tag2"
Usage (Bash Version - Legacy)
# Collect YouTube
./scripts/collect.sh "https://youtu.be/xxxxx" "stock,investing" youtube
# Collect URL
./scripts/collect.sh "https://example.com/article" "python,api" url
# Collect plain text
./scripts/collect.sh "My note" "tag1,tag2" text
Nightly Research (New!)
Automated AI/LLM/tech trend tracking - runs daily and saves to Obsidian.
# Save to Obsidian only
./scripts/nightly-research.sh --save
# Save to Obsidian AND send email
./scripts/nightly-research.sh --save --send
# Send email only
./scripts/nightly-research.sh --send
Features
- Searches multiple sources (Hacker News, Reddit, Twitter)
- LLM summarization (optional)
- Saves to Obsidian with tags
- Optional email digest
Cron Setup (optional)
# Run every night at 10 PM
0 22 * * * /path/to/nightly-research.sh --save --send
Configuration
Edit the script to customize:
VAULT_PATH = os.path.expanduser("~/Documents/YourVault")
NOTE_AUTHOR = "YourName"
Output Format
Notes saved to: {VAULT_PATH}/yyyy-mm-dd-title.md
---
created: 2026-03-03T12:00:00
source: https://...
tags: [stock, investing]
author: George
---
# Title
> **TLDR:** Summary here...
---
Content...
---
*Saved: 2026-03-03*
Dependencies
- yt-dlp
- faster-whisper (for transcription)
- requests + beautifulsoup4 (for URL fetching)
- Optional: openai/anthropic (for AI summarization)
Credits
Automated note-taking workflow for Obsidian.