musa-torch-coding


Install skill "musa-torch-coding" with this command: `npx skills add lipeidcc/musa-torch-coding`

MUSA Torch Coding

Guide for generating PyTorch code that runs on Moore Threads (摩尔线程) MUSA GPUs using torch_musa.

Overview

MUSA (Metaverse Unified System Architecture) is Moore Threads' GPU computing platform. This skill helps generate code that:

  • Runs on Moore Threads GPUs via torch_musa
  • Converts CUDA code to MUSA-compatible code
  • Sets up proper environments (conda v1.2/v1.3)
  • Follows MUSA best practices

Key Differences: CUDA vs MUSA

| CUDA | MUSA |
| --- | --- |
| `torch.cuda` | `torch.musa` |
| `torch.device("cuda")` | `torch.device("musa")` |
| `torch.cuda.is_available()` | `torch.musa.is_available()` |
| `backend='nccl'` | `backend='mccl'` |
| `torch.cuda.device_count()` | `torch.musa.device_count()` |
| `torch.cuda.get_device_name()` | `torch.musa.get_device_name()` |

Environment Setup

⚠️ Important: MUSA Uses Pre-configured Conda Environments

DO NOT install PyTorch, vLLM, or related packages manually. MUSA environments are custom-built and include:

  • MUSA-specific PyTorch builds (not compatible with standard PyTorch)
  • MUSA-customized vLLM versions
  • MUSA drivers and SDK integration

Installing standard packages from PyPI will break the environment.

Conda Environment (v1.2/v1.3)

MUSA provides pre-configured conda environments. Common environment names:

  • v1.2 - MUSA SDK v1.2 environment
  • v1.3 - MUSA SDK v1.3 environment (newer)

```bash
# List available MUSA environments
conda env list | grep -E "(v1\.2|v1\.3|musa)"

# Activate the appropriate environment
conda activate v1.2  # or v1.3

# Verify MUSA availability
python -c "import torch_musa; import torch; print(torch.musa.is_available())"
```

Environment Detection & Setup

If no MUSA conda environment is detected:

  1. Check if MUSA is installed:

    which musaInfo  # Should show musaInfo path
    ls /usr/local/musa/  # MUSA SDK location
    
  2. If MUSA is not set up:

    • Use the musa-env-setup skill for complete environment installation
    • The skill covers SDK installation, conda setup, and vLLM-MUSA configuration
  3. Common conda environment locations:

    • /opt/conda/envs/
    • ~/conda/envs/
    • /usr/local/conda/envs/
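
The detection steps above can be sketched as a small helper. This is illustrative only: the environment names (`v1.2`, `v1.3`) and candidate roots are the conventions listed in this guide and may differ on your host.

```python
from pathlib import Path

# Conventions from this guide; adjust for your host.
CANDIDATE_ROOTS = ["/opt/conda/envs", "~/conda/envs", "/usr/local/conda/envs"]
PREFERRED = ["v1.3", "v1.2"]  # prefer the newer SDK environment


def find_musa_env(roots=CANDIDATE_ROOTS):
    """Return the path of the first pre-built MUSA conda env found, or None."""
    for name in PREFERRED:          # newer SDK wins across all roots
        for root in roots:
            env = Path(root).expanduser() / name
            if env.is_dir():
                return env
    return None
```

If this returns `None`, fall back to the `musa-env-setup` skill as described above.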

Key Environment Variables

| Variable | Purpose |
| --- | --- |
| `MUSA_VISIBLE_DEVICES=0,1,2,3` | Control visible GPU IDs |
| `MUSA_LAUNCH_BLOCKING=1` | Synchronous kernel launch |
| `MUDNN_LOG_LEVEL=INFO` | Enable MUDNN logging |
| `TORCH_SHOW_CPP_STACKTRACES=1` | Show C++ stack traces |
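
A typical debugging setup combines these variables before launching the workload. The values below are examples; pick GPU IDs for your machine:

```shell
# Example debug configuration (GPU ID is illustrative)
export MUSA_VISIBLE_DEVICES=0        # expose only GPU 0
export MUSA_LAUNCH_BLOCKING=1        # launch kernels synchronously so errors surface at the faulty op
export TORCH_SHOW_CPP_STACKTRACES=1  # print C++ stack traces on failure
# ...then launch the workload as usual, e.g.: python train.py
```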

Code Generation Rules

When generating PyTorch code for MUSA:

  1. Always import torch_musa

    import torch_musa  # Must import before using torch.musa
    
  2. Use torch.device("musa")

    device = torch.device("musa") if torch.musa.is_available() else torch.device("cpu")
    tensor = torch.tensor([1.0, 2.0], device=device)
    
  3. Use 'mccl' for distributed training

    dist.init_process_group(backend='mccl', ...)
    
  4. Mixed precision (AMP) is supported

    from torch.cuda.amp import autocast, GradScaler  # Same API
    
  5. TensorCore optimization available

    • Set torch.backends.musa.matmul.allow_tf32 = True for TensorFloat32

Model Templates

For model templates and the full API reference, see the references/ directory:

  • reference.md - Complete MUSA API reference

Common Tasks

Check GPU Availability

```python
import torch
import torch_musa

print(f"MUSA available: {torch.musa.is_available()}")
print(f"Device count: {torch.musa.device_count()}")
print(f"Device name: {torch.musa.get_device_name(0)}")
```

Training Loop Pattern

```python
import torch
import torch_musa

# Device setup
device = torch.device("musa") if torch.musa.is_available() else torch.device("cpu")

# Move the model and batch to the device
# (model, optimizer, criterion, inputs, targets are defined elsewhere)
model = model.to(device)
inputs, targets = inputs.to(device), targets.to(device)

# Training step (same as CUDA)
optimizer.zero_grad()
outputs = model(inputs)
loss = criterion(outputs, targets)
loss.backward()
optimizer.step()
```

Distributed Training (DDP)

```python
import torch
import torch.distributed as dist
import torch_musa

# Initialize the process group with the mccl backend
# (rank, world_size, local_rank come from the launcher)
dist.init_process_group(backend='mccl', rank=rank, world_size=world_size)

# Bind this process to its local GPU (torch_musa extends the torch.cuda API)
torch.cuda.set_device(local_rank)
```

Code Conversion

When converting existing CUDA code to MUSA:

  1. Add import torch_musa at the top
  2. Replace cuda with musa in device strings
  3. Replace nccl with mccl for distributed backend
  4. Keep all other PyTorch API calls unchanged
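
The four steps above can be applied mechanically. Below is a minimal sketch; the `cuda_to_musa` helper and its rule list are illustrative, not part of torch_musa. It deliberately skips `torch.cuda.amp` (which this guide keeps unchanged), and its output should still be reviewed by hand:

```python
import re

# Illustrative substitution rules mirroring the CUDA-vs-MUSA table above.
_RULES = [
    (r"\btorch\.cuda\b(?!\.amp)", "torch.musa"),  # torch.cuda.* -> torch.musa.* (AMP kept as-is)
    (r'(["\'])cuda(:\d+)?\1', r"\1musa\2\1"),     # "cuda" / "cuda:0" device strings
    (r"\bnccl\b", "mccl"),                        # distributed backend
]


def cuda_to_musa(source: str) -> str:
    """Mechanically port CUDA-flavoured PyTorch source text to MUSA."""
    for pattern, repl in _RULES:
        source = re.sub(pattern, repl, source)
    if "import torch_musa" not in source:
        source = "import torch_musa\n" + source   # step 1: add the import
    return source
```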

Troubleshooting

  • Device not found: Ensure user is in render group: sudo usermod -aG render $(whoami)
  • Library not found: Check LD_LIBRARY_PATH includes /usr/local/musa/lib/
  • Build issues: Clean and rebuild: python setup.py clean && bash build.sh
  • Docker issues: Use --env MTHREADS_VISIBLE_DEVICES=all
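
The first two checks can be scripted; the library path and group name are those given above:

```shell
# Sanity checks for the "Device not found" and "Library not found" items above
musa_lib_on_path() {            # usage: musa_lib_on_path "$LD_LIBRARY_PATH"
  case ":$1:" in
    *:/usr/local/musa/lib:*|*:/usr/local/musa/lib/:*) return 0 ;;
    *) return 1 ;;
  esac
}

in_render_group() {             # does the current user belong to 'render'?
  id -nG | grep -qw render
}

musa_lib_on_path "$LD_LIBRARY_PATH" || echo "WARNING: add /usr/local/musa/lib/ to LD_LIBRARY_PATH"
in_render_group || echo "WARNING: run 'sudo usermod -aG render \$(whoami)' and re-login"
```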

Reference

For detailed API reference and examples, see references/reference.md.
