npu-smi

Huawei Ascend NPU npu-smi command reference. Use for device queries (health, temperature, power, memory, processes, ECC), configuration (thresholds, modes, fan), firmware upgrades (MCU, bootloader, VRD), virtualization (vNPU), and certificate management.

Install skill "npu-smi" with this command: npx skills add ascend-ai-coding/awesome-ascend-skills/ascend-ai-coding-awesome-ascend-skills-npu-smi

npu-smi Command Reference

Quick reference for Huawei Ascend NPU device management commands.

Quick Start

npu-smi info -l                           # List all devices
npu-smi info -t health -i 0               # Check device health
npu-smi info -t temp -i 0 -c 0            # Check temperature
npu-smi info -t power -i 0 -c 0           # Check power
npu-smi info -t memory -i 0 -c 0          # Check memory
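
The Quick Start checks can be wrapped in a small helper that repeats them for several devices. This is a minimal sketch, not part of the upstream skill: `quick_check` and the `NPU_SMI` override variable are hypothetical names, chip 0 is assumed for every device, and only the commands listed above are used.

```shell
#!/bin/sh
# NPU_SMI is an assumed override: set NPU_SMI=echo to preview the commands
# without touching any hardware.
NPU_SMI="${NPU_SMI:-npu-smi}"

quick_check() {
    # Arguments: device IDs as reported by `npu-smi info -l`.
    for id in "$@"; do
        "$NPU_SMI" info -t health -i "$id"
        # Chip 0 is assumed here; adjust if `npu-smi info -m` shows others.
        for metric in temp power memory; do
            "$NPU_SMI" info -t "$metric" -i "$id" -c 0
        done
    done
}
```

For example, `NPU_SMI=echo; quick_check 0 1` prints the eight commands that would run for devices 0 and 1.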

Device Queries

Basic Information

npu-smi info -l                           # List devices
npu-smi info -t health -i <id>            # Health status (OK/Warning/Error)
npu-smi info -t board -i <id>             # Board details (firmware, software version)
npu-smi info -t npu -i <id> -c <chip>     # Chip details (name, health, usage)
npu-smi info -m                           # List all chips

Real-time Metrics

npu-smi info -t temp -i <id> -c <chip>    # Temperature (NPU, AI Core)
npu-smi info -t power -i <id> -c <chip>   # Power usage and limit
npu-smi info -t memory -i <id> -c <chip>  # Memory usage, total, rate
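
For simple monitoring, one of the metric queries above can be sampled at a fixed interval. The sketch below makes no assumption about the output format; it only prints a timestamp before each raw query result. `poll_metric` and the `NPU_SMI` override are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# NPU_SMI is an assumed override (e.g. NPU_SMI=echo for a dry run).
NPU_SMI="${NPU_SMI:-npu-smi}"

poll_metric() {
    # Usage: poll_metric <temp|power|memory> <id> <chip> <interval_s> <count>
    metric=$1 id=$2 chip=$3 interval=$4 count=$5
    i=0
    while [ "$i" -lt "$count" ]; do
        date '+%Y-%m-%dT%H:%M:%S'
        "$NPU_SMI" info -t "$metric" -i "$id" -c "$chip"
        i=$((i + 1))
        # Sleep only between samples, not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

A call such as `poll_metric temp 0 0 5 12` would sample the temperature of device 0, chip 0 every 5 seconds for about a minute.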

Advanced Queries

npu-smi info proc -i <id> -c <chip>       # Running processes (PID, memory, AI Core usage)
npu-smi info -t ecc -i <id> -c <chip>     # ECC errors and mode
npu-smi info -t usages -i <id> -c <chip>  # Utilization (AI Core, memory, bandwidth)
npu-smi info -t pcie-info -i <id> -c <chip>  # PCIe speed and width
npu-smi info -t p2p -i <id> -c <chip>     # P2P status and mode
npu-smi info -t product -i <id> -c <chip> # Product name and serial

See: references/device-queries.md for output formats, examples, monitoring scripts, and platform identification (A2 vs A3).

Configuration

Temperature and Power Thresholds

npu-smi set -t temperature -i <id> -c <chip> -d <value>   # Temperature threshold (°C)
npu-smi set -t power-limit -i <id> -c <chip> -d <value>   # Power limit (W)

Mode Configuration

npu-smi set -t ecc-mode -i <id> -c <chip> -d <0|1>        # 0=Disable, 1=Enable
npu-smi set -t compute-mode -i <id> -c <chip> -d <mode>   # 0=Default, 1=Exclusive, 2=Prohibited
npu-smi set -t persistence-mode -i <id> -d <0|1>          # Persistence mode
npu-smi set -t p2p-mem-cfg -i <id> -c <chip> -d <0|1>     # P2P configuration

Fan Control

npu-smi set -t pwm-mode -d <0|1>                          # 0=Manual, 1=Automatic
npu-smi set -t pwm-duty-ratio -d <0-100>                  # Fan speed (percent)
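
A wrapper can reject out-of-range values before they reach the device. The sketch below validates the 0-100 range and then issues the two fan commands in sequence. Switching to manual mode first is an assumption about the intended order, not something the reference states; `set_fan_duty` and `NPU_SMI` are hypothetical names.

```shell
#!/bin/sh
# NPU_SMI is an assumed override (e.g. NPU_SMI=echo for a dry run).
NPU_SMI="${NPU_SMI:-npu-smi}"

set_fan_duty() {
    # Usage: set_fan_duty <percent>
    case "$1" in
        ''|*[!0-9]*)
            echo "duty ratio must be an integer from 0 to 100" >&2
            return 1 ;;
    esac
    if [ "$1" -gt 100 ]; then
        echo "duty ratio must be an integer from 0 to 100" >&2
        return 1
    fi
    # Assumption: manual mode (0) is selected before a fixed duty ratio is set.
    "$NPU_SMI" set -t pwm-mode -d 0 || return 1
    "$NPU_SMI" set -t pwm-duty-ratio -d "$1"
}
```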

System Settings

npu-smi set -t mac-addr -i <id> -c <chip> -d <mac_id> -s "XX:XX:XX:XX:XX:XX"
npu-smi set -t boot-select -i <id> -c <chip> -d <3|4>     # 3=M.2 SSD, 4=eMMC
npu-smi set -t cpu-freq-up -i <id> -d <0|1>               # 0=1.9GHz/800MHz, 1=1.0GHz/800MHz
npu-smi set -t sys-log-enable -d <0|1>                    # System logging

Clear Commands

npu-smi clear -t ecc-info -i <id> -c <chip>               # Clear ECC errors
npu-smi clear -t tls-cert-period -i <id> -c <chip>        # Restore cert threshold

See: references/configuration.md for parameter tables and examples.

Firmware Management

Upgrade Workflow

Query → Upgrade → Check Status → Activate → Restart
npu-smi upgrade -b <item> -i <id>                         # Query current version
npu-smi upgrade -t <item> -i <id> -f <file.hpm>           # Upload firmware
npu-smi upgrade -q <item> -i <id>                         # Check upgrade status
npu-smi upgrade -a <item> -i <id>                         # Activate firmware
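
The workflow above can be sketched as a single function that runs the four steps in order and stops at the first failure. Several things here are assumptions: that `npu-smi` returns a non-zero exit status on failure, and that one status check is enough (a real script would likely need to repeat the `-q` step until the upload completes, which depends on output not documented here). `upgrade_firmware` and `NPU_SMI` are hypothetical names.

```shell
#!/bin/sh
# NPU_SMI is an assumed override (e.g. NPU_SMI=echo for a dry run).
NPU_SMI="${NPU_SMI:-npu-smi}"

upgrade_firmware() {
    # Usage: upgrade_firmware <item> <id> <file.hpm>
    item=$1 id=$2 file=$3
    "$NPU_SMI" upgrade -b "$item" -i "$id" || return 1             # query current version
    "$NPU_SMI" upgrade -t "$item" -i "$id" -f "$file" || return 1  # upload firmware
    "$NPU_SMI" upgrade -q "$item" -i "$id" || return 1             # check upgrade status
    "$NPU_SMI" upgrade -a "$item" -i "$id" || return 1             # activate firmware
    echo "Activated $item on device $id; restart or power cycle as required."
}
```

Running it with `NPU_SMI=echo` first prints the exact command sequence for review before anything is written to the device.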

Components and Restart Requirements

| Component  | Item Name  | Restart Required      |
|------------|------------|-----------------------|
| MCU        | mcu        | Yes (restart)         |
| Bootloader | bootloader | Yes (restart)         |
| VRD        | vrd        | Yes (power cycle 30s) |

See: references/firmware-upgrade.md for complete procedures.

Virtualization (vNPU)

Queries

npu-smi info -t vnpu-mode                                 # Query AVI mode (0=Container, 1=VM)
npu-smi info -t template-info                             # List all templates
npu-smi info -t template-info -i <id>                     # Templates for specific device
npu-smi info -t info-vnpu -i <id> -c <chip>               # View vNPU info

Management

npu-smi set -t vnpu-mode -d <0|1>                         # Set AVI mode
npu-smi set -t create-vnpu -i <id> -c <chip> -f <template> [-v <vnpu_id>] [-g <vgroup_id>]
npu-smi set -t destroy-vnpu -i <id> -c <chip> -v <vnpu_id>

vNPU ID Range: [phy_id*16+100, phy_id*16+115]
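
As a worked example of the range formula, the helpers below compute the valid vNPU IDs for a given `phy_id` and check a candidate ID against them. The function names are hypothetical; the arithmetic is taken directly from the formula above.

```shell
#!/bin/sh
vnpu_id_range() {
    # Usage: vnpu_id_range <phy_id>; prints "<first> <last>".
    echo "$(( $1 * 16 + 100 )) $(( $1 * 16 + 115 ))"
}

vnpu_id_valid() {
    # Usage: vnpu_id_valid <phy_id> <vnpu_id>; exit status 0 if in range.
    [ "$2" -ge $(( $1 * 16 + 100 )) ] && [ "$2" -le $(( $1 * 16 + 115 )) ]
}

vnpu_id_range 0   # prints "100 115"
vnpu_id_range 1   # prints "116 131"
```

Each physical ID therefore owns a block of 16 consecutive vNPU IDs, and the blocks do not overlap.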

See: references/virtualization.md for vNPU creation and management.

Certificate Management

Queries

npu-smi info -t tls-csr-get -i <id> -c <chip>             # Generate CSR (PEM format)
npu-smi info -t tls-cert -i <id> -c <chip>                # View certificate details
npu-smi info -t tls-cert-period -i <id> -c <chip>         # Check expiration threshold
npu-smi info -t rootkey -i <id> -c <chip>                 # Rootkey status

Management

npu-smi set -t tls-cert -i <id> -c <chip> -f "<tls.pem> <ca.pem> <subca.pem>"
npu-smi set -t tls-cert-period -i <id> -c <chip> -s <days>  # Set threshold (7-180 days)
npu-smi clear -t tls-cert-period -i <id> -c <chip>        # Restore default (90 days)
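
Because the threshold only accepts 7 to 180 days, a wrapper can check the value before calling `npu-smi`. This is a minimal sketch; `set_cert_period` and the `NPU_SMI` override are hypothetical names, and the command itself is the one listed above.

```shell
#!/bin/sh
# NPU_SMI is an assumed override (e.g. NPU_SMI=echo for a dry run).
NPU_SMI="${NPU_SMI:-npu-smi}"

set_cert_period() {
    # Usage: set_cert_period <id> <chip> <days>
    case "$3" in
        ''|*[!0-9]*)
            echo "days must be an integer from 7 to 180" >&2
            return 1 ;;
    esac
    if [ "$3" -lt 7 ] || [ "$3" -gt 180 ]; then
        echo "days must be an integer from 7 to 180" >&2
        return 1
    fi
    "$NPU_SMI" set -t tls-cert-period -i "$1" -c "$2" -s "$3"
}
```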

See: references/certificate-management.md for certificate lifecycle management.

Parameters Reference

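| Parameter | Description        | How to Get                          |
|-----------|--------------------|-------------------------------------|
| id        | Device ID (NPU ID) | npu-smi info -l                     |
| chip_id   | Chip ID            | npu-smi info -m                     |
| vnpu_id   | vNPU ID            | Auto-assigned or specified in range |
| mac_id    | MAC interface      | 0=eth0, 1=eth1, 2=eth2, 3=eth3      |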

Supported Platforms

  • Atlas 200I DK A2 Developer Kit
  • Atlas 500 A2 Smart Station
  • Atlas 200I A2 Acceleration Module (RC/EP scenarios)
  • Atlas A2/A3 Training Series
  • Atlas Training Series

Note: Chip name (e.g., 910B3) does not indicate server platform (A2 vs A3). Use dmidecode -t system | grep Product or npu-smi info -t product to identify the server model. See references/device-queries.md for details.

Important Notes

  • Most configuration commands require root permissions
  • Get device IDs from npu-smi info -l
  • Get chip IDs from npu-smi info -m
  • MCU/bootloader upgrades require restart after activation
  • VRD upgrades require power cycle (30+ seconds off)
  • MAC/boot changes require restart
  • Command availability varies by hardware platform
