windows-ai

Windows AI — run local AI on Windows with LLM inference, image generation, and embeddings. A Windows AI server for Llama, Qwen, DeepSeek, Phi, and Mistral. Turn Windows PCs into a Windows AI cluster. No cloud APIs, no subscriptions — Windows AI runs entirely on your hardware. Local Windows AI inference with no cloud dependencies.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install the "windows-ai" skill with this command: npx skills add twinsgeeks/windows-ai

Windows AI — Local AI on Your Windows PCs

Run AI entirely on Windows. No cloud APIs, no subscriptions, no data leaving your network. Windows AI via Ollama Herd routes LLM requests across your Windows machines — your gaming PC, your work desktop, your laptop. One Windows AI endpoint serves them all.

Why Windows AI locally

  • Zero cost — no per-token charges. Your Windows PC runs unlimited AI inference.
  • Privacy — prompts and responses never leave your Windows network.
  • No rate limits — cloud APIs throttle. Your Windows AI hardware doesn't.
  • NVIDIA GPU support — Windows AI uses your RTX GPU via CUDA for fast inference.
  • Fleet routing — multiple Windows PCs share the AI workload automatically.

Windows AI quick start

# Install Windows AI router
pip install ollama-herd

# Start Windows AI on your main PC
herd          # Windows AI router on port 11435
herd-node     # register this Windows AI node

# On other Windows PCs
herd-node     # joins the Windows AI cluster automatically

Windows Firewall: allow inbound TCP on port 11435:

netsh advfirewall firewall add rule name="Windows AI" dir=in action=allow protocol=tcp localport=11435
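Once the router is running, you can sanity-check it from Python. This is a minimal sketch using only the standard library; it assumes the Herd router exposes an OpenAI-compatible /v1/models listing (an assumption based on the OpenAI-compatible endpoint used below, not a documented Herd API).

```python
import json
import urllib.request

HERD_URL = "http://localhost:11435"  # default Herd router port from the quick start


def models_endpoint(base_url: str) -> str:
    """Build the OpenAI-compatible model-list URL for a Herd router."""
    return base_url.rstrip("/") + "/v1/models"


def list_models(base_url: str = HERD_URL) -> list[str]:
    """Fetch the model IDs visible through the router (requires a running cluster)."""
    with urllib.request.urlopen(models_endpoint(base_url), timeout=5) as resp:
        return [m["id"] for m in json.load(resp)["data"]]

# With the router up, list_models() should return the models available on your nodes:
# list_models()
```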

Use Windows AI

OpenAI SDK

from openai import OpenAI

# Your Windows AI endpoint
client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

# Windows AI routes to the best available GPU
response = client.chat.completions.create(
    model="qwen3.5:32b",
    messages=[{"role": "user", "content": "Explain local AI vs cloud AI for Windows users"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
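The streaming loop above can be wrapped in a small helper when you want the full response as one string. This sketch assumes the OpenAI SDK chunk shape (`choices[0].delta.content`), which is what the example above iterates over.

```python
def collect_stream(chunks) -> str:
    """Accumulate streamed chat-completion deltas into one string.

    Works with OpenAI SDK chunk objects, or anything with the same
    choices[0].delta.content shape. Skips empty/None deltas (e.g. the
    final chunk of a stream usually carries no content).
    """
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)

# full_text = collect_stream(response)
```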

Windows AI for coding

# Windows AI code generation
response = client.chat.completions.create(
    model="codestral",
    messages=[{"role": "user", "content": "Write a C# Windows service that monitors GPU temperature"}],
)
print(response.choices[0].message.content)

curl (PowerShell)

# Windows AI chat
curl.exe http://localhost:11435/api/chat -d '{
  "model": "llama3.3:70b",
  "messages": [{"role": "user", "content": "Hello from Windows AI"}],
  "stream": false
}'
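The same /api/chat call can be made from Python without the OpenAI SDK. A standard-library sketch, assuming the Ollama-style response shape (`{"message": {"content": ...}}`) that the non-streaming API returns:

```python
import json
import urllib.request


def chat_payload(model: str, prompt: str, stream: bool = False) -> bytes:
    """Build the JSON body for an Ollama-style /api/chat request."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }).encode()


def chat(base_url: str, model: str, prompt: str) -> str:
    """POST a single-turn chat request (requires a running router)."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/chat",
        data=chat_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)["message"]["content"]

# chat("http://localhost:11435", "llama3.3:70b", "Hello from Windows AI")
```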

Windows AI hardware guide

| Windows PC | GPU | RAM | Best Windows AI models |
| --- | --- | --- | --- |
| Gaming desktop | RTX 4090 (24GB) | 32GB+ | llama3.3:70b, qwen3.5:32b — full quality Windows AI |
| Gaming desktop | RTX 4080 (16GB) | 16GB+ | phi4, codestral, qwen3.5:14b |
| Work laptop | RTX 4060 (8GB) | 16GB | phi4-mini, gemma3:4b — fast Windows AI |
| Office desktop | Intel/AMD (no GPU) | 16GB | phi4-mini, gemma3:1b — CPU Windows AI |

Windows AI works with or without a GPU. NVIDIA GPUs dramatically accelerate inference.

Windows AI environment setup

# Optimize Windows AI performance
[System.Environment]::SetEnvironmentVariable("OLLAMA_KEEP_ALIVE", "-1", "User")
[System.Environment]::SetEnvironmentVariable("OLLAMA_MAX_LOADED_MODELS", "-1", "User")
# Restart Ollama from the Windows system tray

Windows AI features

  • 7-signal scoring — picks the best Windows PC for every AI request
  • 15 health checks — monitors all Windows AI nodes in real-time
  • Auto-retry — transparent failover between Windows AI machines
  • vRAM-aware routing — knows which Windows GPU has room for the model
  • Request tagging — track per-project Windows AI usage
  • Web dashboard — http://localhost:11435/dashboard

Windows AI integrations

Works with any OpenAI-compatible tool on Windows:

  • Continue.dev (VS Code) — set endpoint to http://localhost:11435/v1
  • Cursor — Windows AI as local backend
  • LangChain — drop-in OpenAI replacement
  • CrewAI — multi-agent workflows on Windows AI
  • Open WebUI — chat interface for Windows AI
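As a concrete example of the Continue.dev integration, a minimal config sketch pointing at the Herd endpoint. The field names follow Continue's config.json conventions, and the model name is just an example — adjust both to your setup:

```json
{
  "models": [
    {
      "title": "Windows AI (Herd)",
      "provider": "openai",
      "model": "qwen3.5:32b",
      "apiBase": "http://localhost:11435/v1",
      "apiKey": "not-needed"
    }
  ]
}
```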

Also available on Windows AI

Image generation

curl.exe http://localhost:11435/api/generate-image `
  -d '{"model": "z-image-turbo", "prompt": "futuristic Windows desktop", "width": 1024, "height": 1024}'

Embeddings

curl.exe http://localhost:11435/api/embed `
  -d '{"model": "nomic-embed-text", "input": "Windows AI local inference embeddings"}'
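Embeddings are mainly useful for similarity search, so a cosine-similarity helper usually accompanies them. The helper below is plain standard-library Python; the commented-out client call assumes the router also exposes OpenAI-style /v1/embeddings (an assumption — the documented route is /api/embed above):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Assuming the router also serves OpenAI-style embeddings:
# from openai import OpenAI
# client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")
# resp = client.embeddings.create(model="nomic-embed-text",
#                                 input=["local AI", "cloud AI"])
# vecs = [d.embedding for d in resp.data]
# similarity = cosine(vecs[0], vecs[1])
```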

Full documentation

Contribute

Ollama Herd is open source (MIT). Windows AI enthusiasts are welcome to contribute.

Guardrails

  • Windows AI model downloads require explicit user confirmation.
  • Windows AI model deletion requires explicit user confirmation.
  • Never delete or modify files in ~/.fleet-manager/.
  • No models are downloaded automatically — all pulls are user-initiated or require opt-in.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Img2img

Generate images from text descriptions using DALL-E 3 while adhering to usage policies and avoiding realistic human faces.

General

Habitat-GS-Navigator

Navigate and interact with photo-realistic 3DGS environments via the Habitat-GS Bridge. Use when: user asks to explore a 3D scene, perform embodied navigatio...

General

Memory Palace

Persistent memory management. Use when: the user shares personal information, preferences, or habits; you need to remember project state or technical decisions; a completed task yields reusable experience; the user says "remember this", "don't forget", or "watch out for this next time"; or you need to recall earlier conversation content. Supports semantic search and temporal reasoning.

General

Podcast Transcript Mining Authority Positioning

Extract guest appearances, speaking topics, and soundbites from podcast transcripts to build authority portfolios and generate podcast pitch templates. Use w...
