gradient-knowledge-base

Community skill (unofficial) for DigitalOcean Gradient Knowledge Bases. Build RAG pipelines: store documents in DO Spaces, configure data sources, manage indexing, and run semantic or hybrid search queries.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "gradient-knowledge-base" with this command: npx skills add Rogue Iteration/gradient-knowledge-base

🦞 Gradient AI — Knowledge Bases & RAG

⚠️ This is an unofficial community skill, not maintained by DigitalOcean. Use at your own risk.

"A lobster never forgets. Neither should your agent." — the KB lobster

Build a Retrieval-Augmented Generation pipeline using DigitalOcean's Gradient Knowledge Bases. Store your documents in DO Spaces, index them into a managed Knowledge Base (backed by OpenSearch), and query them with semantic or hybrid search.

Architecture

Your Agent                   DigitalOcean
┌─────────────┐     upload    ┌──────────────┐
│  Documents  │ ──────────▶  │  DO Spaces   │
└─────────────┘              │  (S3-compat) │
                              └──────┬───────┘
                                     │ auto-index
                              ┌──────▼───────┐
                              │ Knowledge    │
                              │ Base (KBaaS) │
                              │ ┌──────────┐ │
                              │ │OpenSearch│ │
                              │ └──────────┘ │
                              └──────┬───────┘
                                     │ retrieve
┌─────────────┐     answer    ┌──────▼───────┐
│  Your Agent │ ◀──────────  │  RAG Results │
│  + LLM      │              │  + Citations │
└─────────────┘              └──────────────┘

📖 Knowledge Base docs

API Endpoints

This skill connects to four official DigitalOcean service endpoints:

| Hostname | Purpose | Docs |
|---|---|---|
| api.digitalocean.com | KB management (create, list, delete, data sources) | DO API Reference |
| kbaas.do-ai.run | KB retrieval: semantic/hybrid search queries | KB Retrieval docs |
| inference.do-ai.run | LLM chat completions for RAG synthesis | Inference docs |
| <region>.digitaloceanspaces.com | S3-compatible object storage | Spaces docs |

All endpoints are owned and operated by DigitalOcean. The *.do-ai.run hostnames are the Gradient AI Platform's service domains.

Authentication

This skill uses three sets of credentials, each for a different service:

| Credential | Used For | Env Var |
|---|---|---|
| DO API Token | KB management, indexing, queries | DO_API_TOKEN |
| Gradient API Key | LLM inference for RAG synthesis | GRADIENT_API_KEY |
| Spaces Keys | S3-compatible uploads | DO_SPACES_ACCESS_KEY + DO_SPACES_SECRET_KEY |

Credential scoping: Use minimally-scoped tokens. Create a dedicated Model Access Key for GRADIENT_API_KEY. For DO_API_TOKEN, use a scoped API token with only Knowledge Base and Spaces permissions. Avoid using your account-root token.

Optional but recommended:

export GRADIENT_KB_UUID="your-kb-uuid"     # Default KB for queries
export DO_SPACES_BUCKET="your-bucket"      # Default bucket for uploads
export DO_SPACES_ENDPOINT="https://nyc3.digitaloceanspaces.com"
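
To make the split between required and optional settings concrete, here is a minimal sketch of reading these variables in Python. The function name `load_config` and the returned dictionary keys are hypothetical, and treating the DO token and Spaces keys as always required is an assumption; the skill's own scripts may load their configuration differently.

```python
import os

# Assumption: these three are needed for any upload/management workflow.
REQUIRED = ("DO_API_TOKEN", "DO_SPACES_ACCESS_KEY", "DO_SPACES_SECRET_KEY")
DEFAULT_SPACES_ENDPOINT = "https://nyc3.digitaloceanspaces.com"


def load_config(env=None):
    """Collect the skill's settings from an environment mapping (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    return {
        "do_api_token": env["DO_API_TOKEN"],
        "spaces_access_key": env["DO_SPACES_ACCESS_KEY"],
        "spaces_secret_key": env["DO_SPACES_SECRET_KEY"],
        "spaces_endpoint": env.get("DO_SPACES_ENDPOINT", DEFAULT_SPACES_ENDPOINT),
        "spaces_bucket": env.get("DO_SPACES_BUCKET"),      # optional default bucket
        "kb_uuid": env.get("GRADIENT_KB_UUID"),            # optional default KB
        "gradient_api_key": env.get("GRADIENT_API_KEY"),   # only needed for --rag
    }
```

Failing early with the full list of missing names tends to be friendlier than discovering one missing credential per run.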

Tools

📦 Store Documents in Spaces

Upload files to DO Spaces for Knowledge Base indexing. This is the storage layer — documents land here before being indexed.

# Upload a file
python3 gradient_spaces.py --upload /path/to/report.md --bucket my-kb-data

# Upload with a key prefix (folder structure)
python3 gradient_spaces.py --upload report.md --bucket my-kb-data --prefix "research/2026-02-15/"

# List files in a bucket
python3 gradient_spaces.py --list --bucket my-kb-data

# List files with a prefix filter
python3 gradient_spaces.py --list --bucket my-kb-data --prefix "research/"

# Delete a file
python3 gradient_spaces.py --delete "research/old_report.md" --bucket my-kb-data

📖 DO Spaces docs
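
The `--prefix` examples above suggest that the object key is the prefix joined with the file name. A small sketch of that mapping, assuming the script names each object after the file's base name (the helper `build_object_key` is hypothetical and not part of the skill):

```python
import posixpath


def build_object_key(local_path, prefix=""):
    """Join an optional key prefix with the file's base name (assumed behavior)."""
    # Normalize Windows separators so the base name is found on any platform.
    filename = posixpath.basename(local_path.replace("\\", "/"))
    if not prefix:
        return filename
    # Object keys use forward slashes regardless of the local OS.
    return prefix.rstrip("/") + "/" + filename
```

Under this assumption, `build_object_key("report.md", "research/2026-02-15/")` gives `research/2026-02-15/report.md`, which is the key you would later pass to `--delete`.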


🏗️ Create and Manage Knowledge Bases

Full CRUD for Knowledge Bases. Create them programmatically instead of clicking through the console like a land-dweller.

# List all Knowledge Bases
python3 gradient_kb_manage.py --list

# Create a new KB
python3 gradient_kb_manage.py --create --name "My Research KB" --region nyc3

# Show details for a specific KB
python3 gradient_kb_manage.py --show --kb-uuid "your-kb-uuid"

# Delete a KB (⚠️ permanent!)
python3 gradient_kb_manage.py --delete --kb-uuid "your-kb-uuid"

📖 Create KBs via API


📁 Manage Data Sources

Connect your Spaces bucket (or web URLs) to a Knowledge Base. This is what tells the KB "index these documents."

# Add a DO Spaces data source
python3 gradient_kb_manage.py --add-source \
  --kb-uuid "your-kb-uuid" \
  --bucket my-kb-data \
  --prefix "research/"

# List data sources for a KB
python3 gradient_kb_manage.py --list-sources --kb-uuid "your-kb-uuid"

# Trigger re-indexing (auto-detects the data source)
python3 gradient_kb_manage.py --reindex --kb-uuid "your-kb-uuid"

# Trigger re-indexing for a specific source
python3 gradient_kb_manage.py --reindex --kb-uuid "your-kb-uuid" --source-uuid "ds-uuid"

🦞 Pro tip: Auto-indexing. If your KB has auto-indexing enabled, you can skip manual re-index triggers. The KB will detect changes in your Spaces bucket automatically. Configure it in the DigitalOcean Console → Knowledge Base → Settings.


🔍 Query the Knowledge Base

Search your indexed documents with semantic or hybrid queries. This is where the magic happens — your documents become answers.

# Basic query
python3 gradient_kb_query.py --query "What happened with the Q4 earnings?"

# Control number of results
python3 gradient_kb_query.py --query "Revenue trends" --num-results 20

# Tune hybrid search balance (see below); single quotes stop the shell from expanding $CAKE
python3 gradient_kb_query.py --query '$CAKE price movement' --alpha 0.5

# JSON output (for piping to other tools)
python3 gradient_kb_query.py --query "SEC filings summary" --json

Direct API call:

curl -s "https://kbaas.do-ai.run/v1/$GRADIENT_KB_UUID/retrieve" \
  -H "Authorization: Bearer $DO_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What happened with Q4 earnings?",
    "num_results": 10,
    "alpha": 0.5
  }'
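
The same request can be sketched in Python with only the standard library. The URL, headers, and body fields mirror the curl call above; the function name `build_retrieve_request` is hypothetical, and the response format is not described here, so the sketch stops at building the request.

```python
import json
import urllib.request


def build_retrieve_request(kb_uuid, token, query, num_results=10, alpha=0.5):
    """Build (but do not send) a POST request to the KB retrieval endpoint."""
    url = f"https://kbaas.do-ai.run/v1/{kb_uuid}/retrieve"
    body = json.dumps(
        {"query": query, "num_results": num_results, "alpha": alpha}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(...)` and decode the JSON body of the response.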

📖 KB retrieval API


🎛️ The alpha Parameter — Hybrid Search Tuning

This is the secret sauce. The alpha parameter controls the balance between lexical (keyword) and semantic (meaning) search:

| Alpha | Behavior | Best For |
|---|---|---|
| 0.0 | Pure lexical (keyword matching) | Exact terms: ticker symbols, filing numbers, dates |
| 0.5 | Balanced hybrid | General research queries |
| 1.0 | Pure semantic (meaning-based) | Open-ended: "what happened with...", "summarize..." |

🦞 Rule of claw: Start at 0.5. Go lower when searching for specific things ($CAKE, 10-K, 2026-02-15). Go higher when exploring ideas ("What's the market sentiment?").
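
The rule of claw can be turned into a rough heuristic for picking a starting alpha. This is only a sketch: the patterns, the function name `suggest_alpha`, and the values 0.2 and 0.8 are illustrative assumptions, not anything the retrieval API prescribes.

```python
import re

# Patterns for the "specific things" named above; the list is illustrative.
EXACT_TERM_PATTERNS = [
    re.compile(r"\$[A-Z]{1,6}\b"),         # ticker symbols such as $CAKE
    re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),  # ISO dates such as 2026-02-15
    re.compile(r"\b\d{1,2}-[A-Z]\b"),      # filing forms such as 10-K
]
OPEN_ENDED_PREFIXES = ("what", "why", "how", "summarize", "explain")


def suggest_alpha(query):
    """Return a starting alpha: lower for exact terms, higher for open questions."""
    if any(pattern.search(query) for pattern in EXACT_TERM_PATTERNS):
        return 0.2  # lean lexical so the exact token must match
    if query.strip().lower().startswith(OPEN_ENDED_PREFIXES):
        return 0.8  # lean semantic for exploratory questions
    return 0.5      # balanced default


print(suggest_alpha("$CAKE price movement"))
print(suggest_alpha("What's the market sentiment?"))
```

Exact-term patterns are checked first, so a question that also names a ticker still leans lexical. Treat the result as a first guess and adjust after looking at the returned chunks.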


🧠 RAG-Enhanced Queries

The full pipeline: query the KB → build a context prompt → call an LLM to synthesize. One command, complete answers with citations.

python3 gradient_kb_query.py \
  --query 'Summarize all research on $CAKE' \
  --rag \
  --model "openai-gpt-oss-120b"

This automatically:

  1. 🔍 Queries the Knowledge Base for relevant documents
  2. 📝 Builds a prompt with the retrieved context
  3. 🤖 Calls the LLM to synthesize an answer

Note: RAG queries call the Gradient Inference API under the hood, so you'll need GRADIENT_API_KEY set. If you have the gradient-inference skill loaded too, you're all set.
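
Step 2 of the pipeline, building the prompt, might look roughly like the sketch below. It assumes each retrieved chunk is a dictionary with `text` and `source` keys; that shape and the function name `build_rag_prompt` are hypothetical, and the prompt the script actually uses may differ.

```python
def build_rag_prompt(question, chunks):
    """Format retrieved chunks as numbered context, then append the question."""
    context_lines = []
    for number, chunk in enumerate(chunks, start=1):
        # Numbering each chunk gives the model something concrete to cite.
        context_lines.append(f"[{number}] ({chunk['source']}) {chunk['text']}")
    context = "\n".join(context_lines)
    return (
        "Answer the question using only the context below. "
        "Cite sources by their bracketed number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Numbered context is one simple way to get citations back: the model can refer to `[1]` or `[2]`, and the caller maps those numbers to the source files.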


Advanced Configuration

Embedding Models & Chunking

When creating a Knowledge Base, you can choose how documents are split into searchable chunks:

| Strategy | How It Works | Best For |
|---|---|---|
| Section-based | Splits on document structure (headings, paragraphs) | Structured reports |
| Semantic | Splits on meaning boundaries | Narrative content |
| Hierarchical | Preserves document hierarchy in chunks | Technical docs |
| Fixed-length | Equal-sized chunks | Uniform data |

Configure these in the DigitalOcean Console when creating the KB, or via the API's embedding_model and chunking parameters.
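
To give a feel for the simplest strategy, here is a character-based sketch of fixed-length chunking with overlap. It is purely illustrative: the managed service does its own chunking (most likely on tokens rather than characters), and the function name and default sizes are assumptions.

```python
def fixed_length_chunks(text, size=200, overlap=20):
    """Split text into chunks of at most `size` characters, sharing `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of storing some text twice.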

📖 KB configuration options


CLI Reference

All scripts accept --json for machine-readable output.

gradient_spaces.py      --upload FILE | --list | --delete KEY
                        [--bucket NAME] [--prefix PATH] [--key KEY] [--json]

gradient_kb_manage.py   --list | --create | --show | --delete
                        | --list-sources | --add-source | --reindex
                        [--kb-uuid UUID] [--source-uuid UUID]
                        [--name NAME] [--region REGION] [--bucket NAME]
                        [--prefix PATH] [--json]

gradient_kb_query.py    --query TEXT [--kb-uuid UUID] [--num-results N]
                        [--alpha F] [--rag] [--model ID] [--json]
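
When calling the query script from another program, passing arguments as a list avoids shell quoting problems (for example, `$CAKE` being expanded). The sketch below uses only the flags listed above; the helper names are hypothetical, and it assumes `gradient_kb_query.py` sits in the working directory.

```python
import subprocess
import sys


def build_query_argv(query, kb_uuid=None, num_results=None, alpha=None,
                     rag=False, model=None, json_output=True):
    """Assemble the argument list for gradient_kb_query.py."""
    argv = [sys.executable, "gradient_kb_query.py", "--query", query]
    if kb_uuid is not None:
        argv += ["--kb-uuid", kb_uuid]
    if num_results is not None:
        argv += ["--num-results", str(num_results)]
    if alpha is not None:
        argv += ["--alpha", str(alpha)]
    if rag:
        argv.append("--rag")
    if model is not None:
        argv += ["--model", model]
    if json_output:
        argv.append("--json")
    return argv


def run_query(**kwargs):
    """Run the script without a shell and return its raw stdout."""
    result = subprocess.run(
        build_query_argv(**kwargs), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Because no shell is involved, the query text reaches the script exactly as written.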

Environment Variables

| Variable | Required | Description |
|---|---|---|
| DO_API_TOKEN | Yes | DO API token (scopes: GenAI + Spaces) |
| DO_SPACES_ACCESS_KEY | Yes | Spaces access key |
| DO_SPACES_SECRET_KEY | Yes | Spaces secret key |
| DO_SPACES_ENDPOINT | Optional | Spaces endpoint (default: https://nyc3.digitaloceanspaces.com) |
| DO_SPACES_BUCKET | Optional | Default bucket name |
| GRADIENT_KB_UUID | Optional | Default KB UUID (saves typing --kb-uuid every time) |
| GRADIENT_API_KEY | For RAG | Needed when using --rag for LLM synthesis |

External Endpoints

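| Endpoint | Purpose |
|---|---|
| https://kbaas.do-ai.run/v1/{uuid}/retrieve | KB retrieval API |
| https://api.digitalocean.com/v2/gen-ai/knowledge_bases/ | KB management API |
| https://inference.do-ai.run | LLM chat completions for RAG synthesis (only with --rag) |
| https://{region}.digitaloceanspaces.com | DO Spaces (S3-compatible) |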

Security & Privacy

  • Your DO_API_TOKEN is sent as a Bearer token to api.digitalocean.com and kbaas.do-ai.run
  • Spaces credentials are used for S3-compatible uploads to {region}.digitaloceanspaces.com
  • Documents you upload to your Spaces bucket are private by default
  • KB queries are scoped to your account — no cross-tenant access
  • No credentials or data are sent to any third-party endpoints

Trust Statement

By using this skill, documents and queries are sent to DigitalOcean's Knowledge Base and Spaces APIs. Only install if you trust DigitalOcean with the documents you index.

Important Notes

  • Documents uploaded to Spaces are private by default
  • Re-indexing is best-effort: if the API call fails and auto-indexing is enabled, the KB still picks up changes on its own schedule
  • The retrieval API returns document chunks, not full documents
  • Deleting a KB is permanent — the indexed data is gone. The source files in Spaces are not affected.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
