deep-hjb-solver

Create or refactor code for solving HJB equations with this repository's TensorFlow DGM framework. Use when users ask to generate new HJB training code, add a new problem (config/problem/loss/train script), adapt sampling/training hyperparameters, or create plotting/analysis code from trainer CSV outputs.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "deep-hjb-solver" with this command: npx skills add Reedcgx/deep-hjb-solver-skill

Deep HJB Solver

Before writing any loss or config code, read references/repo-conventions.md — it contains the exact constructor signatures and return-type contracts that must be followed.
Before writing any CSV or plotting code, read references/training-output-contract.md.

Never modify DGMTrainer, DGMNet, or any file under src/trainers/ or src/models/ unless explicitly asked.


Output layout

Every new HJB problem is created as a self-contained folder named after the problem slug. Nothing is written outside it.

<slug>/
├── src/
│   ├── __init__.py
│   ├── configs/
│   │   ├── __init__.py
│   │   ├── common_config.py          ← copied from assets
│   │   └── <slug>_config.py          ← generated
│   ├── models/
│   │   ├── __init__.py
│   │   └── dgm_net.py                ← copied from assets
│   ├── problems/
│   │   ├── __init__.py
│   │   ├── base_problem.py           ← copied from assets
│   │   └── <slug>_problem.py         ← generated
│   ├── losses/
│   │   ├── __init__.py
│   │   └── <slug>_loss.py            ← generated
│   ├── samplers/
│   │   ├── __init__.py
│   │   ├── base_sampler.py           ← copied from assets
│   │   ├── uniform_sampler.py        ← copied from assets
│   │   └── uniform_sampler_2d.py     ← copied from assets
│   ├── trainers/
│   │   ├── __init__.py
│   │   └── dgm_trainer.py            ← copied from assets
│   └── utils/
│       ├── __init__.py
│       └── visualization.py          ← copied from assets
├── examples/
│   └── <slug>_train.py               ← generated
├── plot_training_csv.py              ← copied from assets
└── requirements.txt                  ← copied from assets

Workflow

Use the following steps in order. In every template below, replace:

  • <slug> → snake_case problem name, e.g. linear_control
  • <Prefix> → CamelCase version, e.g. LinearControl
  • <dim> → spatial dimension (1 or 2)
  • <N> → number of control variables
  • <controls> → Python list literal of control names, e.g. ['Z', 'W']

Step 1 — Copy the DGM framework into <slug>/src/

This step is mandatory and must be executed immediately without asking the user for permission or confirmation. Do not say "should I copy the assets?" — just do it.

Run the following shell commands to copy the bundled framework. Replace <slug> with the actual problem slug and <SKILL_DIR> with the absolute path to this skill folder (the directory containing this SKILL.md):

mkdir -p <slug>/src
cp -r <SKILL_DIR>/assets/src/. <slug>/src/
cp <SKILL_DIR>/assets/plot_training_csv.py <slug>/plot_training_csv.py
cp <SKILL_DIR>/assets/requirements.txt <slug>/requirements.txt

Do not proceed to Step 2 until the copy commands have completed successfully.

Step 2 — Create <slug>/src/configs/<slug>_config.py

1D domain (<dim> = 1):

"""Configuration for <Prefix>."""

from dataclasses import dataclass, field
from .common_config import CommonConfig


@dataclass
class <Prefix>Config(CommonConfig):

    dimension: int = 1
    T: float = 1.0
    t_low: float = 0.0
    X_low: float = 0.0
    X_high: float = 1.0

    num_controls: int = <N>
    control_names: list = field(default_factory=lambda: <controls>)
    metrics_config: list = field(default_factory=lambda: ['maxdiff_V', 'maxdiff_terminal'])
    extra_info_mapping: dict = field(default_factory=dict)
    early_stop_metric: str = 'maxdiff_V'
    early_stop_threshold: float = 1e-4
    problem_params_keys: list = field(default_factory=list)

    saveName: str = '<slug>'

2D domain (<dim> = 2) — replace the bounds lines with:

    dimension: int = 2
    X_low: list = field(default_factory=lambda: [0.0, 0.0])
    X_high: list = field(default_factory=lambda: [1.0, 1.0])

Step 3 — Create <slug>/src/problems/<slug>_problem.py

"""Problem definition for <Prefix>."""

from .base_problem import BaseProblem


def terminal_utility_<slug>(x):
    """TODO: implement terminal payoff g(x). x shape: (batch, dim)."""
    return -x[:, :1]


class <Prefix>Problem(BaseProblem):

    def get_terminal_condition(self, x):
        return terminal_utility_<slug>(x)

Step 4 — Create <slug>/src/losses/<slug>_loss.py

"""Loss functions for <Prefix>."""

import tensorflow as tf


class <Prefix>Loss:

    def __init__(self, problem):
        self.problem = problem

    def compute_value_loss(self, model, control, t_interior, X_interior, t_terminal, X_terminal):
        # TODO: replace with real HJB PDE residual.
        # IMPORTANT: if you need more than one gradient from the same tape (e.g.
        # both V_t and V_x), you MUST use persistent=True and delete the tape
        # afterward. A non-persistent tape raises RuntimeError on the second call.
        with tf.GradientTape(persistent=True, watch_accessed_variables=False) as gt:
            gt.watch(t_interior)
            gt.watch(X_interior)
            V = model(t_interior, X_interior)
        V_t = gt.gradient(V, t_interior)   # ∂V/∂t
        V_x = gt.gradient(V, X_interior)   # ∂V/∂x  (use if needed by the HJB)
        del gt  # release persistent tape immediately after use

        ctrl = control(t_interior, X_interior)  # u from control network — substitute into HJB
        residual = V_t  # TODO: replace with actual HJB residual, e.g. V_t + ctrl * V_x + ...
        L1 = tf.reduce_mean(tf.square(residual))

        target_terminal = self.problem.get_terminal_condition(X_terminal)
        fitted_terminal = model(t_terminal, X_terminal)
        diff_terminal = fitted_terminal - target_terminal
        L3 = tf.reduce_mean(tf.square(diff_terminal))

        # diff_V can be either:
        #   (a) a plain tensor — the HJB residual (e.g. diff_V = residual), OR
        #   (b) a dict of debug tensors that MUST contain a 'residual' key
        #       (e.g. {'residual': residual, 'V': V, 'V_t': V_t, 'V_x': V_x})
        # base_problem.extract_metrics handles both forms automatically.
        # MUST return exactly this 4-tuple — DGMTrainer unpacks it directly.
        diff_V = residual          # option (a): simplest form
        # diff_V = {'residual': residual, 'V': V, 'V_t': V_t, 'V_x': V_x}  # option (b)
        return L1, L3, diff_V, diff_terminal

    def compute_control_loss(self, model, control, t_interior, X_interior, t_terminal, X_terminal):
        # TODO: implement FOC / control objective.
        # Always use persistent=True here: you need at least V_x, and some problems
        # also need V_xx (second-order), which requires a nested tape inside this one.
        # persistent=True lets you reuse the outer tape multiple times safely.
        with tf.GradientTape(persistent=True, watch_accessed_variables=False) as gt:
            gt.watch(X_interior)
            V = model(t_interior, X_interior)
        V_x = gt.gradient(V, X_interior)          # ∂V/∂x
        del gt
        ctrl = control(t_interior, X_interior)    # u from control network
        # TODO: compute L2 from the inf{...} terms of the HJB using ctrl and V_x.
        # See "How to translate an HJB equation into loss functions" below for the rule.
        # Example (LQ, inf_u { u*V_x + ½u² }): L2 = tf.reduce_mean(ctrl * V_x + 0.5 * tf.square(ctrl))
        L2 = tf.reduce_mean(ctrl)      # placeholder — replace with real Hamiltonian terms
        # MUST return exactly this 2-tuple — DGMTrainer unpacks it directly
        return L2, {'V_x': V_x, 'ctrl': ctrl}

Step 5 — Register new classes in <slug>/src/configs/__init__.py, <slug>/src/problems/__init__.py, <slug>/src/losses/__init__.py

Append one import line to each file:

# src/configs/__init__.py
from .<slug>_config import <Prefix>Config

# src/problems/__init__.py
from .<slug>_problem import <Prefix>Problem

# src/losses/__init__.py
from .<slug>_loss import <Prefix>Loss

Step 6 — Create <slug>/examples/<slug>_train.py

All imports are relative to <slug>/ (the script is run from inside that folder).

"""Training script for <slug>."""

import os
import sys
import tensorflow as tf

# Ensure the problem folder is on the path when run directly
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from src.configs.<slug>_config import <Prefix>Config
from src.problems.<slug>_problem import <Prefix>Problem
from src.losses.<slug>_loss import <Prefix>Loss
from src.models import DGMNet
from src.samplers import UniformSampler, UniformSampler2D
from src.trainers import DGMTrainer
from plot_training_csv import plot_training_history


def main():
    config = <Prefix>Config()
    config.save_dir = './results'
    os.makedirs(config.save_dir, exist_ok=True)

    problem = <Prefix>Problem(config)
    # UniformSampler: scalar 1D bounds only.
    # UniformSampler2D: supports any dimension >= 2 (array bounds).
    sampler = UniformSampler(config, problem) if config.dimension == 1 else UniformSampler2D(config, problem)

    model = DGMNet(
        layer_width=config.nodes_per_layer,
        n_layers=config.num_layers,
        input_dim=config.dimension,
        output_dim=1,
        control_output=False,
        activation=config.activation,
    )
    control = DGMNet(
        layer_width=config.nodes_per_layer,
        n_layers=config.num_layers,
        input_dim=config.dimension,
        output_dim=config.num_controls,
        control_output=True,
        activation=config.activation,
    )

    loss_fn = <Prefix>Loss(problem)
    lr_schedule = config.get_optimizer_config()

    trainer = DGMTrainer(
        model=model,
        control=control,
        loss_fn=loss_fn,
        sampler=sampler,
        optimizer_value=tf.keras.optimizers.Adam(lr_schedule),
        optimizer_control=tf.keras.optimizers.Adam(lr_schedule),
        config=config,
        problem=problem,
    )

    history = trainer.train()
    if config.saveOutput:
        trainer.save_models(config.save_dir)

    csv_path = f"{config.save_dir}/{config.saveName}_training_history.csv"
    if os.path.exists(csv_path):
        plot_training_history(csv_path, config.save_dir)

    print('final maxdiff_V:', history['maxdiff_V'][-1])


if __name__ == '__main__':
    main()

Step 7 — Fill the real PDE

In <slug>/src/losses/<slug>_loss.py, replace the placeholder body of compute_value_loss with the actual HJB residual. The return signature must not change: return L1, L3, diff_V, diff_terminal.

In <slug>/src/problems/<slug>_problem.py, implement terminal_utility_<slug> with the correct payoff.

See "How to translate an HJB equation into loss functions" below for the exact decomposition rule.


How to translate an HJB equation into loss functions

Given an HJB equation with an inf (or sup) operator, split it into two losses following this fixed rule.

The rule

LossWhat to put in it
compute_value_lossL1Drop the inf/sup symbol; substitute the control network output for u; sum all remaining terms into a residual; squared mean.
compute_control_lossL2Keep only the terms inside inf{…}; evaluate them with the control network; take the mean directly (no square). For sup{…}, negate first.

The intuition: L1 trains the value network so the PDE residual → 0 (equation is satisfied). L2 trains the control network to actually minimise (or maximise) the Hamiltonian — it IS a gradient-descent step on the inf objective, so no squaring.


Running the training

cd <slug>
python examples/<slug>_train.py

Results are saved to <slug>/results/.


Common pitfall: GradientTape reuse

Rule

Always use persistent=True in both compute_value_loss and compute_control_loss. Even if you think you only need one gradient today, problems that include V_xx (second-order / diffusion terms) require a nested tape inside compute_control_loss, which only works when the outer tape is persistent. Using persistent=True consistently prevents hard-to-diagnose errors.

MethodMinimum gradientsMust use persistent=True?
compute_value_lossV_t + V_x (2 gradients)Yes
compute_control_lossV_x + possibly V_xx via nested tapeYes

compute_value_loss — always persistent=True

with tf.GradientTape(persistent=True, watch_accessed_variables=False) as gt:
    gt.watch(t_interior)
    gt.watch(X_interior)
    V = model(t_interior, X_interior)
V_t = gt.gradient(V, t_interior)
V_x = gt.gradient(V, X_interior)
del gt  # required — frees memory held by persistent tape

compute_control_loss — always persistent=True

with tf.GradientTape(persistent=True, watch_accessed_variables=False) as gt:
    gt.watch(X_interior)
    V = model(t_interior, X_interior)
V_x = gt.gradient(V, X_interior)
del gt  # always delete, even if only one gradient was taken

Omitting persistent=True when computing two or more gradients raises:

RuntimeError: A non-persistent GradientTape can only be used to compute one set of gradients

Guardrails

  • Never modify DGMTrainer, DGMNet, or sampler internals.
  • Keep all tensors as tf.float32.
  • Never hardcode CSV column names in plotting code — detect control columns dynamically.

Environment setup (before running)

If the user wants to run the training script, make sure the Python environment is correctly configured first.

1. Install dependencies

cd <slug>
pip install -r requirements.txt

requirements.txt includes: tensorflow, numpy, matplotlib, tqdm, pandas.

2. Verify TensorFlow can see the GPU (optional but recommended)

import tensorflow as tf
print(tf.__version__)                        # should be >= 2.10
print(tf.config.list_physical_devices('GPU'))  # empty list = CPU-only mode

If GPU is available but not listed, install the matching tensorflow-gpu or cuda/cudnn version.

3. Run from the correct directory

The training script uses relative imports rooted at <slug>/. Always cd into the problem folder first:

cd <slug>
python examples/<slug>_train.py

Running from the workspace root will cause ModuleNotFoundError for src.*.

4. Results location

Output is written to <slug>/results/ (configurable via CommonConfig.save_dir).
CSV training history: <slug>/results/<saveName>_training_history.csv
Saved models: <slug>/results/<saveName>_value_model/ and <slug>/results/<saveName>_control_model/

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

A Python CLI skill for Cutout.Pro visual APIs — background removal, face cutout, and photo enhancement. Supports file upload & image URL input.

Call Cutout.Pro visual processing APIs to perform background removal, face cutout, and photo enhancement. Supports both file upload and image URL input, retu...

Registry SourceRecently Updated
Coding

client-onboarding-agent

Client onboarding and business diagnostic framework for AI agent deployments. Covers 4-round diagnostic process, 6 constraint categories, deployment SOP with...

Registry SourceRecently Updated
Coding

Ai Tools

AI Tools Box - Search and invoke 100+ AI tools. Categories: Writing, Image, Video, Coding, Office, Search, Chat, Audio, Design, Agent, Translation, Dev Platf...

Registry SourceRecently Updated