metrillm

Find the best local LLM for your machine. It tests speed, quality, and RAM fit, then tells you whether a model is worth running on your hardware.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy the command below and send it to your AI assistant to install the skill.

Install skill "metrillm" with this command: npx skills add metrillm

MetriLLM — Find the Best LLM for Your Hardware

Test any local model and get a clear verdict: is it worth running on your machine?

Prerequisites

  1. Node.js 20+ — check with node -v
  2. Ollama or LM Studio installed and running
  3. MetriLLM CLI — install globally:
npm install -g metrillm
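A quick way to confirm the Node.js requirement is to compare the major version number. The helper below is an illustrative sketch, not part of the MetriLLM CLI; it uses a hard-coded sample string, so substitute "$(node -v)" to check your real install:

```shell
# Extract the major version from a "node -v" style string (e.g. v20.11.1 -> 20)
node_major() {
  v=${1#v}          # drop the leading "v"
  echo "${v%%.*}"   # keep everything before the first dot
}

# Sample string for illustration; use "$(node -v)" on a real machine
major=$(node_major "v20.11.1")
if [ "$major" -ge 20 ]; then echo "Node.js OK"; else echo "Node.js 20+ required"; fi
```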

Usage

List available models

ollama list

Run a full benchmark

metrillm bench --model $ARGUMENTS --json

This measures:

  • Performance: tokens/second, time to first token, memory usage
  • Quality: reasoning, math, coding, instruction following, structured output, multilingual
  • Fitness verdict: EXCELLENT / GOOD / MARGINAL / NOT RECOMMENDED
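With --json, the result is machine-readable. The exact schema is not documented on this page, so the shape below is a hypothetical sketch assembled from the metric names mentioned here (tokensPerSecond, ttft, memoryUsedGB, score, verdict); check real output before scripting against it:

```shell
# Hypothetical result shape -- field names are assumptions, verify against real output
cat > /tmp/metrillm-sample.json <<'EOF'
{
  "model": "llama3:8b",
  "tokensPerSecond": 42.5,
  "ttft": 310,
  "memoryUsedGB": 5.2,
  "score": 71,
  "verdict": "GOOD"
}
EOF

# Pull the verdict out with sed, no extra tooling required
sed -n 's/.*"verdict": *"\([^"]*\)".*/\1/p' /tmp/metrillm-sample.json
```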

Performance-only benchmark (faster)

metrillm bench --model $ARGUMENTS --perf-only --json

Skips quality evaluation — measures speed and memory only.

View previous results

ls ~/.metrillm/results/

Read any JSON file to see full benchmark details.
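Since each result is a plain JSON file, a short shell loop can summarize past runs. The sketch below writes two fake result files into a temp directory so it runs anywhere; the "model" and "score" field names are assumptions about the result schema, and for real data you would loop over ~/.metrillm/results/ instead:

```shell
# Create two fake result files to demonstrate the summary loop
dir=$(mktemp -d)
printf '{ "model": "llama3:8b", "score": 71 }\n' > "$dir/run1.json"
printf '{ "model": "qwen2:7b", "score": 64 }\n'  > "$dir/run2.json"

# Print "model<TAB>score" for every saved run (swap $dir for ~/.metrillm/results)
for f in "$dir"/*.json; do
  model=$(sed -n 's/.*"model": *"\([^"]*\)".*/\1/p' "$f")
  score=$(sed -n 's/.*"score": *\([0-9]*\).*/\1/p' "$f")
  printf '%s\t%s\n' "$model" "$score"
done
```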

Share to the public leaderboard

metrillm bench --model $ARGUMENTS --share

Uploads your result to the MetriLLM community leaderboard, an open ranking of local LLM performance across real hardware. Compare your results with others and help the community find the best models for every setup. Shared data includes the model name, scores, and hardware specs (CPU, RAM, GPU). No personal data is sent.

Interpreting Results

Verdict          | Score  | Meaning
-----------------|--------|--------------------------------
EXCELLENT        | >= 80  | Fast and accurate — great fit
GOOD             | >= 60  | Solid — suitable for most tasks
MARGINAL         | >= 40  | Usable but with tradeoffs
NOT RECOMMENDED  | < 40   | Too slow or inaccurate
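The score bands above translate to a simple threshold cascade. As an illustrative sketch (this verdict helper is not a MetriLLM command):

```shell
# Map a numeric score to the verdict bands from the table above
verdict() {
  if   [ "$1" -ge 80 ]; then echo "EXCELLENT"
  elif [ "$1" -ge 60 ]; then echo "GOOD"
  elif [ "$1" -ge 40 ]; then echo "MARGINAL"
  else                       echo "NOT RECOMMENDED"
  fi
}

verdict 71
```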

Key metrics to highlight:

  • tokensPerSecond > 30 = good for interactive use
  • ttft < 500ms = responsive
  • memoryUsedGB vs available RAM = will it fit?
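Those rules of thumb can be checked mechanically. The readings below are made-up sample values for illustration, not real benchmark output:

```shell
# Sample readings (assumed values, not real MetriLLM output)
tps=42.5; ttft=310; mem=5.2; ram=16

# Apply the three rules of thumb listed above
awk -v tps="$tps" -v ttft="$ttft" -v mem="$mem" -v ram="$ram" 'BEGIN {
  print ((tps  > 30)  ? "tokens/s: good for interactive use" : "tokens/s: sluggish")
  print ((ttft < 500) ? "ttft: responsive"                   : "ttft: laggy")
  print ((mem  < ram) ? "memory: fits in available RAM"      : "memory: too large")
}'
```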

Tips

  • Use --perf-only for quick tests
  • Close GPU-intensive apps before benchmarking
  • Benchmark duration varies depending on model speed and response length

Open Source

MetriLLM is free and open source (Apache 2.0). Contributions, issues, and feedback are welcome: github.com/MetriLLM/metrillm

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
