ubuntu-ollama

Ubuntu Ollama — run Ollama on Ubuntu with fleet routing across multiple Ubuntu machines. Set up with apt, systemd, and NVIDIA CUDA, then route inference across Ubuntu servers and desktops with load balancing, auto-discovery, and health monitoring. Local inference and AI request routing for Ubuntu fleets.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy the command below and send it to your AI assistant to install this skill

Install skill "ubuntu-ollama" with this command: npx skills add twinsgeeks/ubuntu-ollama

Ubuntu Ollama — Fleet Routing for Ollama on Ubuntu

Run Ollama on Ubuntu with multi-machine load balancing. Ubuntu Ollama Herd turns your Ubuntu servers and desktops into one smart Ollama endpoint. Install with apt + pip, manage with systemd, monitor with the web dashboard.

Ubuntu Ollama setup

Step 1: Install Ollama on Ubuntu

# Install Ollama on Ubuntu
curl -fsSL https://ollama.ai/install.sh | sh

# Verify Ollama is running on Ubuntu
ollama --version
systemctl status ollama

Step 2: Install Ubuntu Ollama Herd

# Ubuntu prerequisites
sudo apt update && sudo apt install python3-pip curl -y

# Install Ubuntu Ollama fleet router
# (on Ubuntu 23.04+ the system Python is externally managed; use pipx
#  or a virtual environment if pip refuses to install system-wide)
pip install ollama-herd

Step 3: Start Ubuntu Ollama router

On one Ubuntu machine (the router):

herd          # start Ubuntu Ollama router on port 11435
herd-node     # register this Ubuntu Ollama node

On every other Ubuntu machine:

herd-node     # auto-discovers the Ubuntu Ollama router via mDNS

No mDNS? Connect Ubuntu Ollama nodes directly: herd-node --router-url http://router-ip:11435

Step 4: Verify Ubuntu Ollama fleet

curl -s http://localhost:11435/fleet/status | python3 -m json.tool
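The same status endpoint can be checked from Python using only the standard library. This is a minimal sketch: the `nodes`, `name`, and `healthy` field names are assumptions about the `/fleet/status` response shape, so adjust them to match what your ollama-herd version actually returns.

```python
import json
import urllib.request

ROUTER = "http://localhost:11435"

def summarize_fleet(status: dict) -> str:
    # Count healthy vs. total nodes. The "nodes"/"healthy" keys are
    # assumed field names -- adapt to the real /fleet/status schema.
    nodes = status.get("nodes", [])
    healthy = sum(1 for n in nodes if n.get("healthy"))
    return f"{healthy}/{len(nodes)} nodes healthy"

if __name__ == "__main__":
    with urllib.request.urlopen(f"{ROUTER}/fleet/status") as resp:
        print(summarize_fleet(json.load(resp)))
```

Run it on the router machine; a non-zero healthy count confirms nodes have registered.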

Ubuntu Ollama systemd services

Run Ubuntu Ollama as systemd services for automatic startup:

# Ubuntu Ollama router service
sudo tee /etc/systemd/system/herd-router.service << 'EOF'
[Unit]
Description=Ubuntu Ollama Router
After=network.target ollama.service

[Service]
Type=simple
ExecStart=/usr/local/bin/herd
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

# Ubuntu Ollama node service
sudo tee /etc/systemd/system/herd-node.service << 'EOF'
[Unit]
Description=Ubuntu Ollama Node
After=network.target ollama.service

[Service]
Type=simple
ExecStart=/usr/local/bin/herd-node
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl enable --now herd-router
sudo systemctl enable --now herd-node

Use Ubuntu Ollama

OpenAI SDK

from openai import OpenAI

# Your Ubuntu Ollama fleet
client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama3.3:70b",
    messages=[{"role": "user", "content": "Write an Ubuntu cron job for log rotation"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")

curl (Ollama format)

# Ubuntu Ollama inference
curl http://localhost:11435/api/chat -d '{
  "model": "qwen3.5:32b",
  "messages": [{"role": "user", "content": "Explain Ubuntu apt package management"}],
  "stream": false
}'
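The same Ollama-format request can be made from Python with only the standard library. A sketch, assuming the standard Ollama `/api/chat` request and response shape:

```python
import json
import urllib.request

def chat_payload(model: str, prompt: str, stream: bool = False) -> dict:
    # Build a request body in Ollama's /api/chat format.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

if __name__ == "__main__":
    body = json.dumps(
        chat_payload("qwen3.5:32b", "Explain Ubuntu apt package management")
    ).encode()
    req = urllib.request.Request(
        "http://localhost:11435/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Non-streaming responses carry the reply under message.content.
        print(json.load(resp)["message"]["content"])
```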

curl (OpenAI format)

curl http://localhost:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "phi4", "messages": [{"role": "user", "content": "Hello from Ubuntu Ollama"}]}'

Ubuntu Ollama NVIDIA CUDA setup

# Install NVIDIA drivers on Ubuntu for Ollama CUDA
sudo apt install nvidia-driver-550 -y
sudo reboot

# Verify Ubuntu NVIDIA CUDA
nvidia-smi

# Ubuntu Ollama automatically uses CUDA when NVIDIA drivers are installed
ollama ps    # should show GPU acceleration

Ubuntu Ollama environment

# Optimize Ollama on Ubuntu via systemd
sudo systemctl edit ollama
# Add under [Service]:
#   Environment="OLLAMA_KEEP_ALIVE=-1"
#   Environment="OLLAMA_MAX_LOADED_MODELS=-1"
#   Environment="OLLAMA_NUM_PARALLEL=2"
sudo systemctl restart ollama

# Verify Ubuntu Ollama settings
systemctl show ollama | grep Environment

Ubuntu Ollama model recommendations

Ubuntu machine               GPU    Best Ubuntu Ollama models
Ubuntu desktop (RTX 4090)    24GB   llama3.3:70b, qwen3.5:32b, deepseek-r1:32b
Ubuntu desktop (RTX 4080)    16GB   phi4, codestral, qwen3.5:14b
Ubuntu server (A100)         80GB   deepseek-v3, qwen3.5:72b
Ubuntu server (no GPU)       CPU    phi4-mini, gemma3:4b
Ubuntu on Raspberry Pi 5     CPU    gemma3:1b, phi4-mini
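The table above can be encoded as a small helper that maps a VRAM budget to suggested models. This is an illustrative sketch, not part of ollama-herd; the CPU-only rows are simplified into a single 0 GB bucket.

```python
# VRAM thresholds (GB) mapped to suggested models, following the table above.
# CPU-only machines are treated as 0 GB; very constrained hardware such as a
# Raspberry Pi 5 should prefer the smallest models (gemma3:1b, phi4-mini).
RECOMMENDATIONS = [
    (80, ["deepseek-v3", "qwen3.5:72b"]),
    (24, ["llama3.3:70b", "qwen3.5:32b", "deepseek-r1:32b"]),
    (16, ["phi4", "codestral", "qwen3.5:14b"]),
    (0, ["phi4-mini", "gemma3:4b"]),
]

def suggest_models(vram_gb: float) -> list[str]:
    # Return the model list for the largest threshold the budget meets.
    for threshold, models in RECOMMENDATIONS:
        if vram_gb >= threshold:
            return models
    return []
```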

Ubuntu Ollama firewall

# Ubuntu UFW
sudo ufw allow 11435/tcp
sudo ufw reload

Monitor Ubuntu Ollama

# Ubuntu Ollama fleet status
curl -s http://localhost:11435/fleet/status | python3 -m json.tool

# Ubuntu Ollama health — 15 automated checks
curl -s http://localhost:11435/dashboard/api/health | python3 -m json.tool

# Ubuntu Ollama models loaded
curl -s http://localhost:11435/api/ps | python3 -m json.tool

# Ubuntu Ollama logs
journalctl -u herd-router -f
tail -f ~/.fleet-manager/logs/herd.jsonl.$(date +%Y-%m-%d)
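The herd log file is JSON Lines, so it can be summarized with a few lines of Python. A sketch, assuming each record carries a `level` field (the actual log schema may differ):

```python
import json

def count_levels(lines) -> dict:
    # Tally log records by level from a JSON Lines stream.
    # The "level" field name is an assumption about the log schema.
    counts: dict = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        lvl = rec.get("level", "unknown")
        counts[lvl] = counts.get(lvl, 0) + 1
    return counts

if __name__ == "__main__":
    import datetime
    import pathlib

    path = (pathlib.Path.home() / ".fleet-manager" / "logs"
            / f"herd.jsonl.{datetime.date.today():%Y-%m-%d}")
    with open(path) as f:
        print(count_levels(f))
```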

Dashboard at http://localhost:11435/dashboard — live Ubuntu Ollama monitoring.
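For unattended monitoring, the health endpoint can be polled from a script. A minimal sketch: the `checks`, `name`, and `ok` field names are assumptions about the `/dashboard/api/health` response shape and should be adjusted to the real schema.

```python
import json
import urllib.request

def failed_checks(health: dict) -> list:
    # Return names of health checks that did not pass. The
    # "checks"/"name"/"ok" keys are assumed field names.
    return [c.get("name", "?") for c in health.get("checks", []) if not c.get("ok")]

if __name__ == "__main__":
    url = "http://localhost:11435/dashboard/api/health"
    with urllib.request.urlopen(url) as resp:
        bad = failed_checks(json.load(resp))
    print("all checks passing" if not bad else "failing: " + ", ".join(bad))
```

Wire it into cron or a systemd timer to get alerted when any of the automated checks fail.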

Also available on Ubuntu Ollama

Image generation

curl http://localhost:11435/api/generate-image \
  -d '{"model": "z-image-turbo", "prompt": "Ubuntu penguin in space", "width": 1024, "height": 1024}'

Embeddings

curl http://localhost:11435/api/embed \
  -d '{"model": "nomic-embed-text", "input": "Ubuntu Ollama local inference routing"}'
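A common use of the embeddings endpoint is semantic similarity. This sketch assumes the Ollama-style `/api/embed` response shape (`{"embeddings": [[...]]}`); the cosine helper itself is plain math and independent of that assumption.

```python
import json
import math
import urllib.request

def cosine(a: list, b: list) -> float:
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def embed(text: str, model: str = "nomic-embed-text") -> list:
    # Call the fleet's /api/embed endpoint; assumes the Ollama-style
    # response shape {"embeddings": [[...]]}.
    body = json.dumps({"model": model, "input": text}).encode()
    req = urllib.request.Request(
        "http://localhost:11435/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embeddings"][0]

if __name__ == "__main__":
    a = embed("Ubuntu Ollama local inference routing")
    b = embed("route LLM requests across Ubuntu machines")
    print(f"similarity: {cosine(a, b):.3f}")
```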

Full documentation

Contribute

Ollama Herd is open source (MIT). Contributions from Ubuntu Ollama users are welcome.

Guardrails

  • Ubuntu Ollama model downloads require explicit user confirmation.
  • Ubuntu Ollama model deletion requires explicit user confirmation.
  • Never delete or modify files in ~/.fleet-manager/.
  • No models are downloaded automatically — all pulls are user-initiated or require opt-in.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

GigaChat (Sber AI) Proxy

Integrate GigaChat (Sber AI) with OpenClaw via gpt2giga proxy

General

TencentCloud Video Face Fusion

Extracts the core features of two faces and blends them naturally, with multiple style adaptations, to boost creative engagement and content reach; widely used in creative marketing, entertainment interaction, and social sharing.

General

TencentCloud Image Face Fusion

Image face fusion (Professional Edition) is a synchronous API supporting custom beautification, face enhancement, teeth enhancement, and face-slimming parameters, with resolutions up to 8K and a choice of several model types.

General

YoudaoNote News

Youdao Cloud Notes news push: analyzes your saved notes to identify topics of interest and pushes the latest related news. Supports conversational triggers and scheduled daily delivery (e.g. 9 a.m.). Trigger phrases: news push, set up news push, generate news push.
