
Rag Evaluator

AI-powered RAG (Retrieval-Augmented Generation) evaluation toolkit. Configure, benchmark, compare, and optimize your RAG pipelines from the command line. Track prompts, evaluations, fine-tuning experiments, costs, and usage, all with persistent local logging and full export capabilities.

Installation

The skill is listed on the public ClawHub registry under the name "Ragaai Catalyst". Review SKILL.md and any referenced scripts before running, then install with:

npx skills add bytesagain1/rag-evaluator

Commands

Run rag-evaluator <command> [args] to use.

Command          Description
configure        Configure RAG evaluation settings and parameters
benchmark        Run benchmarks against your RAG pipeline
compare          Compare results across different RAG configurations
prompt           Log and manage prompt templates and variations
evaluate         Evaluate RAG output quality and relevance
fine-tune        Track fine-tuning experiments and parameters
analyze          Analyze evaluation results and identify patterns
cost             Track and log API/inference costs
usage            Monitor token usage and API call volumes
optimize         Log optimization strategies and results
test             Run test cases against RAG configurations
report           Generate evaluation reports
stats            Show summary statistics across all categories
export <fmt>     Export data in json, csv, or txt format
search <term>    Search across all logged entries
recent           Show recent activity from history log
status           Health check: version, data dir, disk usage
help             Show help and available commands
version          Show version (v2.0.0)

Each domain command (configure, benchmark, compare, etc.) works in two modes, as shown in the example below this list:

  • Without arguments: displays the most recent 20 entries from that category
  • With arguments: logs the input with a timestamp and saves to the category log file
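A minimal illustration of both modes (the exact output format may differ):

# With an argument: logs a timestamped entry to benchmark.log
rag-evaluator benchmark "latency=180ms recall@5=0.84"

# Without arguments: prints up to the 20 most recent benchmark entries
rag-evaluator benchmark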

Data Storage

All data is stored locally in ~/.local/share/rag-evaluator/:

  • Each command creates its own log file (e.g., configure.log, benchmark.log)
  • A unified history.log tracks all activity across commands
  • Entries are stored in timestamp|value pipe-delimited format
  • Export supports JSON, CSV, and plain text formats
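For example, a category log is plain text with one entry per line; the timestamp shown here is illustrative, since the exact format is not documented above:

# One timestamp|value pair per line
cat ~/.local/share/rag-evaluator/benchmark.log
# 2026-02-01 11:05:43|latency=180ms recall@5=0.84

# The pipe delimiter splits cleanly with standard tools
cut -d'|' -f2- ~/.local/share/rag-evaluator/benchmark.log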

Requirements

  • Bash 4+ with set -euo pipefail strict mode
  • Standard Unix utilities: date, wc, du, tail, grep, sed, cat
  • No external dependencies or API keys required
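Since the script targets Bash 4+, a quick preflight check can save a confusing failure later (this check is a suggestion, not part of the tool itself):

# Confirm the running Bash is version 4 or newer
bash -c '(( BASH_VERSINFO[0] >= 4 )) && echo "Bash OK" || echo "Bash too old"'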

When to Use

  1. Evaluating RAG pipeline quality — log evaluation scores, compare retrieval strategies, and track improvements over time
  2. Benchmarking different configurations — run benchmarks across embedding models, chunk sizes, or retrieval methods and compare results side by side
  3. Tracking costs and usage — monitor API costs and token usage across experiments to stay within budget
  4. Managing prompt engineering — log prompt variations, test them against your pipeline, and analyze which templates perform best
  5. Generating reports for stakeholders — export evaluation data as JSON/CSV for dashboards, or generate text reports summarizing RAG performance

Examples

# Configure a new evaluation run
rag-evaluator configure "model=gpt-4 chunks=512 overlap=50 top_k=5"

# Run a benchmark and log results
rag-evaluator benchmark "latency=230ms recall@5=0.82 precision@5=0.71"

# Compare two retrieval strategies
rag-evaluator compare "bm25 vs dense: bm25 recall=0.78, dense recall=0.85"

# Track evaluation scores
rag-evaluator evaluate "faithfulness=0.91 relevance=0.87 coherence=0.93"

# Log API cost for a run (single quotes keep $0.23 from being expanded by the shell)
rag-evaluator cost 'run-042: $0.23 (1.2k tokens input, 800 tokens output)'

# View summary statistics
rag-evaluator stats

# Export all data as CSV
rag-evaluator export csv

# Search for specific entries
rag-evaluator search "gpt-4"

# Check recent activity
rag-evaluator recent

# Health check
rag-evaluator status

Output

Most commands write to stdout; redirect to a file if needed. export also saves a file in the data directory:

rag-evaluator report "weekly summary" > report.txt
rag-evaluator export json  # saves to ~/.local/share/rag-evaluator/export.json
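If you have jq installed (it is not a dependency of this tool), you can pretty-print the exported JSON without assuming anything about its schema:

rag-evaluator export json
jq . ~/.local/share/rag-evaluator/export.json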

Configuration

The data directory defaults to ~/.local/share/rag-evaluator/; to change it, edit the DATA_DIR variable in the script.
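A sketch of that edit, assuming DATA_DIR is assigned once near the top of the script (the exact line may differ):

# Default in the rag-evaluator script:
DATA_DIR="$HOME/.local/share/rag-evaluator"
# Point it at a project-local directory instead, e.g.:
DATA_DIR="$HOME/projects/rag-experiments/data"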


Powered by BytesAgain | bytesagain.com | hello@bytesagain.com

