mhc-algorithm

Implement mHC (Manifold-Constrained Hyper-Connections) to stabilize the training of deep networks. Use this skill when implementing residual-connection improvements that rely on doubly stochastic mixing matrices computed with the Sinkhorn-Knopp algorithm. Based on DeepSeek's 2025 paper (arXiv:2512.24880).

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "mhc-algorithm" with this command: npx skills add lnj22/mhc-layer-impl-mhc-algorithm

mHC: Manifold-Constrained Hyper-Connections

Overview

mHC (Manifold-Constrained Hyper-Connections) stabilizes deep network training by constraining residual mixing matrices to be doubly stochastic. It provides:

  • Stable Training: Lower gradient norm variance via doubly stochastic constraints
  • Multiple Streams: Hyper-Connections with learnable mixing across residual streams
  • Sinkhorn Projection: Log-space Sinkhorn-Knopp algorithm for doubly stochastic projection
  • GPT Integration: Pattern for wrapping attention and MLP layers

Two components:

  • HyperConnections Module: Core PyTorch module with H_res, H_pre, H_post matrices
  • Sinkhorn-Knopp: Log-space projection to doubly stochastic manifold
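
The doubly stochastic constraint described above can be checked numerically. The sketch below is an illustration rather than part of the skill: the function name `sinkhorn_log`, the sizes, and the noisy near-identity logits are all assumptions chosen for the demo. It projects a small matrix with log-space Sinkhorn-Knopp and then inspects three properties that plausibly explain the stabilizing effect: rows and columns sum to 1, mixing leaves the total across streams unchanged, and the spectral norm stays at 1, so repeated mixing cannot amplify the signal.

```python
import torch

def sinkhorn_log(logits, num_iters=20, tau=0.05):
    """Log-space Sinkhorn-Knopp: alternately normalize rows and columns."""
    log_alpha = logits / tau
    for _ in range(num_iters):
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=-1, keepdim=True)  # rows
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=-2, keepdim=True)  # columns
    return log_alpha.exp()

torch.manual_seed(0)
num_streams, dim = 4, 8  # illustrative sizes (assumption)

# Near-identity logits plus a little noise (assumed values for the demo)
logits = torch.full((num_streams, num_streams), -0.1)
logits.fill_diagonal_(0.0)
logits = logits + 0.01 * torch.randn(num_streams, num_streams)

h = sinkhorn_log(logits)
row_sums = h.sum(dim=-1)
col_sums = h.sum(dim=-2)

# Mix the streams: mixed[t] = sum_s h[s, t] * x[s]
x = torch.randn(num_streams, dim)
mixed = h.T @ x

# Because every row of h sums to 1, the total over streams is preserved
stream_total_before = x.sum(dim=0)
stream_total_after = mixed.sum(dim=0)

# Largest singular value of the mixing matrix
spectral_norm = torch.linalg.matrix_norm(h, ord=2)

print("row sums:", row_sums)
print("column sums:", col_sums)
print("max change in stream total:", (stream_total_after - stream_total_before).abs().max().item())
print("spectral norm:", spectral_norm.item())
```

The column sums are exact up to floating-point error because the last normalization step acts on columns; the row sums are only approximately 1, with an error that shrinks as `num_iters` grows.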

Quick Reference

Topic | Reference
Core Concepts & Math | Core Concepts
Sinkhorn Algorithm | Sinkhorn-Knopp
HyperConnections Module | Module Implementation
GPT Integration | GPT Integration
Common Pitfalls | Pitfalls

Installation

# Required packages
pip install torch einops numpy

Minimal Example

import torch
import torch.nn as nn
from einops import rearrange, einsum

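def sinkhorn_knopp(logits, num_iters=20, tau=0.05):
    """Project logits onto (approximately) doubly stochastic matrices in log space."""
    # A smaller tau sharpens the result; a larger tau pushes it toward uniform
    log_alpha = logits / tau
    for _ in range(num_iters):
        # Normalize rows, then columns; logsumexp keeps the update numerically stable
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=-1, keepdim=True)
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=-2, keepdim=True)
    return torch.exp(log_alpha)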

class HyperConnections(nn.Module):
    def __init__(self, num_streams, dim, branch=None, layer_idx=0):
        super().__init__()
        self.num_streams = num_streams
        self.branch = branch

        # Initialize H_res near identity: off-diagonal logits are slightly negative
        # (not -inf) so the off-diagonal entries still receive gradient
        init_h_res = torch.full((num_streams, num_streams), -0.1)
        init_h_res.fill_diagonal_(0.0)
        self.H_res_logits = nn.Parameter(init_h_res)

        # H_pre/H_post for depth connections
        init_h_pre = torch.full((1, num_streams), -0.1)
        init_h_pre[0, layer_idx % num_streams] = 0.0
        self.H_pre_logits = nn.Parameter(init_h_pre)
        self.H_post_logits = nn.Parameter(torch.zeros(1, num_streams))

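    def forward(self, x):
        # x arrives with the streams folded into the batch axis: (batch * streams, tokens, dim)
        s = self.num_streams
        x = rearrange(x, "(b s) t d -> b t s d", s=s)

        # Width connection: mix the residual streams with a doubly stochastic matrix
        h_res = sinkhorn_knopp(self.H_res_logits)
        x_mixed = einsum(h_res, x, "s t, b n s d -> b n t d")

        # Depth connection, input side: collapse the streams into a single branch input
        h_pre = self.H_pre_logits.softmax(dim=-1)
        branch_in = einsum(h_pre, x, "v s, b n s d -> b n v d").squeeze(-2)

        # Compare against None: container modules such as an empty nn.Sequential are falsy
        branch_out = self.branch(branch_in) if self.branch is not None else branch_in

        # Depth connection, output side: distribute the branch output back over the streams
        h_post = self.H_post_logits.softmax(dim=-1)
        depth_out = einsum(branch_out, h_post, "b t d, v s -> b t s d")

        output = x_mixed + depth_out
        return rearrange(output, "b t s d -> (b s) t d")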

Common Imports

import torch
import torch.nn as nn
import torch.nn.functional as F
from einops import rearrange, einsum, repeat, reduce

When to Use What

Scenario | Approach
Standard residual connection | No mHC needed
Deep networks (>12 layers) with stability issues | Use mHC with num_streams=4
GPT/Transformer training | Wrap both attention and MLP with HyperConnections
Custom Sinkhorn iterations | Adjust num_iters (20 default) and tau (0.05 default)
Memory-constrained training | Reduce num_streams or batch size
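
To get a feel for the `num_iters` and `tau` settings in the table above, the following sketch measures two quantities on a small example: how far the row sums are from 1 after a given number of iterations, and how much weight stays on the diagonal for a given temperature. It is an exploratory aid under assumed values, not a tuning recipe: `sinkhorn_log`, `max_row_error`, the 4x4 size and the noisy near-identity logits are all hypothetical choices.

```python
import torch

def sinkhorn_log(logits, num_iters, tau):
    """Log-space Sinkhorn-Knopp: alternately normalize rows and columns."""
    log_alpha = logits / tau
    for _ in range(num_iters):
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=-1, keepdim=True)
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=-2, keepdim=True)
    return log_alpha.exp()

def max_row_error(h):
    """Largest deviation of any row sum from 1 (columns are normalized last)."""
    return (h.sum(dim=-1) - 1.0).abs().max().item()

torch.manual_seed(0)
n = 4  # number of streams (assumption)
logits = torch.full((n, n), -0.1)
logits.fill_diagonal_(0.0)
logits = logits + 0.01 * torch.randn(n, n)  # small noise so the projection has work to do

# Effect of num_iters: residual row-sum error at a fixed temperature
row_error = {
    k: max_row_error(sinkhorn_log(logits, num_iters=k, tau=0.05))
    for k in (1, 5, 20)
}

# Effect of tau: average diagonal weight at a fixed iteration count
diag_mass = {
    tau: sinkhorn_log(logits, num_iters=20, tau=tau).diagonal().mean().item()
    for tau in (0.05, 0.5)
}

for k, err in row_error.items():
    print(f"num_iters={k:2d}  max row-sum error={err:.2e}")
for tau, mass in diag_mass.items():
    print(f"tau={tau}  mean diagonal weight={mass:.3f}")
```

In this setup a larger `tau` flattens the logits and moves the matrix toward uniform mixing, while a smaller `tau` keeps it closer to the identity. Logits with a wider spread than the ones assumed here may need more iterations to reach the same row-sum error.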

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
