knowledge-base-setup

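Quickly set up a local knowledge base with RAG natural-language search on a Mac Mini (M4).

Use cases:

  • Setting up a knowledge base on a new Mac: install and configure Ollama, the embedding model, scheduled tasks, and OCR document analysis from scratch
  • Troubleshooting garbled PDF extraction, scheduled-task timeouts, skill load failures, and similar problems
  • Building a workflow that analyzes documents automatically every day and sends a summary to Feishu at 08:00
  • Migrating or reproducing a knowledge base: package the entire knowledge directory and its configuration for a new computer

This skill walks through directory structure creation, dependency installation, script deployment, scheduled-task registration, and OpenClaw configuration.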

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "knowledge-base-setup" with this command: npx skills add seairteng/macmini-knowledge-base

Knowledge Base Setup

Quickly set up a local knowledge base with RAG search on a Mac Mini.

Quick Start

Method 1: One-command install (recommended)

cd ~/.openclaw/workspace/skills/knowledge-base-setup/scripts
bash setup.sh <feishu_user_id>

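setup.sh automatically:

  1. Creates the directory structure
  2. Installs tesseract and the Python dependencies
  3. Downloads the nomic-embed-text model
  4. Deploys the analysis scripts to knowledge/.analysis/
  5. Updates the OpenClaw configuration (ollama provider + memorySearch)
  6. Restarts the gateway

After installation, register the scheduled tasks manually (setup.sh prints the exact commands).

Method 2: Manual step-by-step installation

Run the following steps in order:

Step 1: Prepare the environment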

brew install tesseract
pip3 install pytesseract pymupdf pdfplumber
# Install Ollama: https://ollama.com/download
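
Before moving on, it can help to confirm that the tools from this step are actually reachable. The following is a minimal sketch, not part of the skill's scripts: the helper name `missing_tools` is hypothetical, and it only checks that the `tesseract` and `ollama` executables are on `PATH` and that the three Python packages can be located.

```python
import importlib.util
import shutil

# Executables and Python modules that Step 1 is expected to provide.
# Note: the pymupdf package is imported as "fitz".
REQUIRED_COMMANDS = ["tesseract", "ollama"]
REQUIRED_MODULES = ["pytesseract", "fitz", "pdfplumber"]


def missing_tools(commands=REQUIRED_COMMANDS, modules=REQUIRED_MODULES):
    """Return (missing_commands, missing_modules) without importing anything."""
    missing_cmds = [c for c in commands if shutil.which(c) is None]
    missing_mods = [m for m in modules if importlib.util.find_spec(m) is None]
    return missing_cmds, missing_mods
```

Calling `missing_tools()` with no arguments returns two lists; both are empty when everything from Step 1 is installed.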

Step 2: Download the embedding model

ollama pull nomic-embed-text
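
To confirm the model responds, one option is to request an embedding from the local Ollama server. This sketch assumes Ollama is listening on its default address `http://127.0.0.1:11434` and uses Ollama's `/api/embeddings` endpoint; the function names are illustrative and not part of the skill.

```python
import json
import math
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # assumed default local Ollama address
MODEL = "nomic-embed-text"


def build_embed_request(text, base_url=OLLAMA_URL, model=MODEL):
    """Build the HTTP request for one embedding (does not send it)."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text):
    """Send the request and return the embedding as a list of floats."""
    with urllib.request.urlopen(build_embed_request(text), timeout=30) as resp:
        return json.loads(resp.read())["embedding"]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

With the server running, comparing `embed(...)` results for two related phrases via `cosine_similarity` should typically give a higher score than for unrelated phrases, which is the property RAG search relies on.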

Step 3: Create the directory structure

mkdir -p ~/.openclaw/workspace/knowledge/.analysis/summaries/archives
mkdir -p ~/.openclaw/workspace/knowledge/temp_docs
mkdir -p ~/.openclaw/workspace/knowledge/"Macro Financials"
touch ~/.openclaw/workspace/knowledge/文章目录.md
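
The same layout can be created from Python, which is handy when scripting a reinstall. A minimal sketch: `create_layout` is a hypothetical helper, and the `root` parameter exists only so the function can be pointed at a scratch directory.

```python
from pathlib import Path

DEFAULT_ROOT = Path.home() / ".openclaw" / "workspace" / "knowledge"

# Directories and files taken from the mkdir/touch commands in this step.
SUBDIRS = [".analysis/summaries/archives", "temp_docs", "Macro Financials"]
CATALOG_FILE = "文章目录.md"


def create_layout(root=DEFAULT_ROOT):
    """Create the knowledge base tree; safe to run more than once."""
    root = Path(root)
    for sub in SUBDIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    (root / CATALOG_FILE).touch(exist_ok=True)
    return root
```

Because every call uses `exist_ok=True`, rerunning `create_layout()` on an existing knowledge base leaves its contents untouched.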

Step 4: Deploy the scripts (copy from the skill directory)

cp ~/.openclaw/workspace/skills/knowledge-base-setup/scripts/run_analysis.py \
   ~/.openclaw/workspace/knowledge/.analysis/
cp ~/.openclaw/workspace/skills/knowledge-base-setup/scripts/generate_catalog.js \
   ~/.openclaw/workspace/knowledge/.analysis/
chmod +x ~/.openclaw/workspace/knowledge/.analysis/*.py
chmod +x ~/.openclaw/workspace/knowledge/.analysis/*.js

Step 5: Configure OpenClaw

Edit ~/.openclaw/openclaw.json and add:

{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://127.0.0.1:11434",
        "api": "ollama",
        "models": [
          {"id": "nomic-embed-text", "name": "Nomic Embed Text"}
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "ollama",
        "model": "nomic-embed-text"
      }
    }
  }
}

Make sure the tools block contains:

"tools": {
    "alsoAllow": ["exec", "process"]
}
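
Hand-editing JSON makes it easy to overwrite existing providers or tool settings. As an alternative, the two fragments above can be merged into the existing file programmatically. This is a sketch under assumptions: `deep_merge` and `apply_settings` are hypothetical helpers, the file is assumed to be strict JSON, and list values such as `alsoAllow` are unioned rather than replaced.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".openclaw" / "openclaw.json"

# The settings from this step, expressed as one nested dict.
KB_SETTINGS = {
    "models": {"providers": {"ollama": {
        "baseUrl": "http://127.0.0.1:11434",
        "api": "ollama",
        "models": [{"id": "nomic-embed-text", "name": "Nomic Embed Text"}],
    }}},
    "agents": {"defaults": {"memorySearch": {
        "provider": "ollama", "model": "nomic-embed-text"}}},
    "tools": {"alsoAllow": ["exec", "process"]},
}


def deep_merge(base, extra):
    """Return a copy of base with extra merged in, keeping unrelated keys."""
    merged = dict(base)
    for key, value in extra.items():
        current = merged.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            merged[key] = deep_merge(current, value)
        elif isinstance(current, list) and isinstance(value, list):
            # Union of lists: append only items that are not already present.
            merged[key] = current + [v for v in value if v not in current]
        else:
            merged[key] = value
    return merged


def apply_settings(path=CONFIG_PATH, settings=KB_SETTINGS):
    """Merge settings into the JSON file at path and write it back."""
    path = Path(path)
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = deep_merge(existing, settings)
    path.write_text(json.dumps(merged, indent=2, ensure_ascii=False), encoding="utf-8")
    return merged
```

Backing up `openclaw.json` before calling `apply_settings()` is a sensible precaution, since the file is rewritten in place.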

Then restart: openclaw gateway restart

Step 6: Register the scheduled tasks

# 22:00 analysis job
openclaw cron add \
  --name "22:00 analyze new documents" \
  --cron "0 22 * * *" \
  --tz "Asia/Shanghai" \
  --session isolated \
  --timeout-seconds 300 \
  --message "Run run_analysis.py and generate_catalog.js" \
  --announce --channel feishu --to "user:<feishu_user_id>"

# 08:00 delivery job
openclaw cron add \
  --name "08:00 send document summaries" \
  --cron "0 8 * * *" \
  --tz "Asia/Shanghai" \
  --session isolated \
  --timeout-seconds 120 \
  --message "Read the summaries/ directory, send the summaries to Feishu, then move them to archives/" \
  --announce --channel feishu --to "user:<feishu_user_id>"
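
When registering both jobs from a script, it is less error-prone to build the argument list once and vary only what differs. The sketch below reuses only the flags shown above; `cron_add_argv` and `register` are hypothetical helpers, and the Feishu user ID passed in is a placeholder.

```python
import subprocess


def cron_add_argv(name, cron, timeout_seconds, message, feishu_user_id,
                  tz="Asia/Shanghai"):
    """Build the `openclaw cron add` argument list for one job."""
    return [
        "openclaw", "cron", "add",
        "--name", name,
        "--cron", cron,
        "--tz", tz,
        "--session", "isolated",
        "--timeout-seconds", str(timeout_seconds),
        "--message", message,
        "--announce", "--channel", "feishu",
        "--to", f"user:{feishu_user_id}",
    ]


def register(argv):
    """Run the command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(argv, check=True)
```

Passing a list to `subprocess.run` bypasses the shell, so spaces and non-ASCII characters in job names or messages need no extra quoting.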

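Migrating to a new computer

  1. Copy the entire directory:
    scp -r ~/.openclaw/workspace/knowledge user@new-mac:~/.openclaw/workspace/
    
  2. On the new computer, run setup.sh or follow the manual step-by-step installation
  3. Re-register the scheduled tasks (the job IDs will change)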
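
If `scp -r` is slow, or the copy should double as a backup, the directory can be packed into a single archive first. A minimal sketch using only the standard library; `pack_knowledge` is a hypothetical helper and the archive name is arbitrary.

```python
import tarfile
from pathlib import Path

KNOWLEDGE_DIR = Path.home() / ".openclaw" / "workspace" / "knowledge"


def pack_knowledge(src=KNOWLEDGE_DIR, dest="knowledge-backup.tar.gz"):
    """Pack src (including the hidden .analysis/ directory) into a gzip tarball."""
    src = Path(src)
    with tarfile.open(dest, "w:gz") as tar:
        # arcname keeps paths relative, so the archive unpacks as knowledge/...
        tar.add(src, arcname=src.name)
    return Path(dest)
```

On the new Mac the archive can then be unpacked with `tar -xzf knowledge-backup.tar.gz -C ~/.openclaw/workspace/`. Note that this covers only the knowledge directory, not `~/.openclaw/openclaw.json`.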

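Pitfalls

| Problem | Cause | Fix |
| --- | --- | --- |
| Garbled PDF text extraction | Custom fonts without ToUnicode mapping | pymupdf + tesseract OCR |
| Scheduled task times out | The default of 120 seconds is too short | --timeout-seconds 300 |
| No exec tool in Feishu | Restricted by the tools policy | Add alsoAllow: [exec, process] |
| Skill fails to load | Wrong export name | CodeChunker → FileChunker |
| BGE-M3 is sluggish | 16 GB of RAM is not enough | Keep using nomic-embed-text |
| brew install ollama is slow | Network issues | Download the dmg and install directly |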
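
The first row of the table can be sketched in code: try normal text extraction, and fall back to OCR when the result looks garbled. This is an illustration, not the contents of `run_analysis.py`. The garble heuristic (share of replacement and private-use characters), the 0.2 threshold, the 300 dpi setting, and the `chi_sim+eng` language string are all assumptions; the last one requires the matching tesseract language data to be installed.

```python
import io


def garbled_ratio(text):
    """Fraction of non-space characters that are U+FFFD or private-use."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 1.0  # no text at all is treated as a failed extraction
    bad = sum(1 for c in chars if c == "\ufffd" or "\ue000" <= c <= "\uf8ff")
    return bad / len(chars)


def extract_page_text(page, threshold=0.2, ocr_lang="chi_sim+eng"):
    """Return the text layer of a pymupdf page, or OCR text if it is garbled."""
    text = page.get_text()
    if garbled_ratio(text) <= threshold:
        return text
    # Imported lazily so the heuristic above works without OCR installed.
    import pytesseract
    from PIL import Image
    pix = page.get_pixmap(dpi=300)  # render the page to an image
    image = Image.open(io.BytesIO(pix.tobytes("png")))
    return pytesseract.image_to_string(image, lang=ocr_lang)


def extract_pdf_text(path):
    """Extract text from every page of a PDF, using OCR where needed."""
    import fitz  # provided by the pymupdf package
    with fitz.open(path) as doc:
        return "\n".join(extract_page_text(page) for page in doc)
```

The heuristic is deliberately crude: a font without ToUnicode data can also produce wrong but ordinary-looking characters, which this check would miss, so the threshold may need tuning against real documents.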

Key paths

  • Skill directory: ~/.openclaw/workspace/skills/knowledge-base-setup/
  • Knowledge base: ~/.openclaw/workspace/knowledge/
  • Analysis scripts: ~/.openclaw/workspace/knowledge/.analysis/
  • Summary output: ~/.openclaw/workspace/knowledge/.analysis/summaries/

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
