Docker-Based ROS2 Development Skill

When to Use This Skill

Writing Dockerfiles for ROS2 workspaces with colcon builds
Setting up docker-compose for multi-container robotic systems
Debugging DDS discovery failures between containers (CycloneDDS, FastDDS)
Configuring GPU passthrough with NVIDIA Container Toolkit for perception nodes
Forwarding X11 or Wayland displays for rviz2 and rqt tools
Managing USB device passthrough for cameras, LiDARs, and serial devices
Building CI/CD pipelines with Docker-based ROS2 builds and test runners
Creating devcontainer configurations for VS Code with ROS2 extensions
Optimizing Docker layer caching for colcon workspace builds
Designing dev-vs-deploy container strategies with multi-stage builds

ROS2 Docker Image Hierarchy

Official OSRF images follow a layered hierarchy. Always choose the smallest base that satisfies dependencies.

┌──────────────────────────────────────────────────────────────────┐
│  ros:<distro>-desktop-full  (~3.5 GB)                            │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  ros:<distro>-desktop     (~2.8 GB)                        │  │
│  │  ┌──────────────────────────────────────────────────────┐  │  │
│  │  │  ros:<distro>-perception (~2.2 GB)                    │  │  │
│  │  │  ┌────────────────────────────────────────────────┐   │  │  │
│  │  │  │  ros:<distro>-ros-base  (~1.1 GB)              │   │  │  │
│  │  │  │  ┌──────────────────────────────────────────┐  │   │  │  │
│  │  │  │  │  ros:<distro>-ros-core (~700 MB)         │  │   │  │  │
│  │  │  │  └──────────────────────────────────────────┘  │   │  │  │
│  │  │  └────────────────────────────────────────────────┘   │  │  │
│  │  └──────────────────────────────────────────────────────┘  │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘

Image Tag	Base OS	Size	Contents	Use Case
`ros:humble-ros-core`	Ubuntu 22.04	~700 MB	rclcpp, rclpy, rosout, launch	Minimal runtime for single nodes
`ros:humble-ros-base`	Ubuntu 22.04	~1.1 GB	ros-core + common_interfaces, rosbag2	Most production deployments
`ros:humble-perception`	Ubuntu 22.04	~2.2 GB	ros-base + image_transport, cv_bridge, PCL	Camera/lidar perception pipelines
`ros:humble-desktop`	Ubuntu 22.04	~2.8 GB	perception + rviz2, rqt, demos	Development with GUI tools
`ros:jazzy-ros-core`	Ubuntu 24.04	~750 MB	rclcpp, rclpy, rosout, launch	Minimal runtime (Jazzy/Noble)
`ros:jazzy-ros-base`	Ubuntu 24.04	~1.2 GB	ros-core + common_interfaces, rosbag2	Production deployments (Jazzy)

Multi-Stage Dockerfiles for ROS2

Dev Stage

The development stage includes build tools, debuggers, and editor support for interactive use.

FROM ros:humble-desktop AS dev
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential cmake gdb python3-pip \
    python3-colcon-common-extensions python3-rosdep \
    ros-humble-ament-lint-auto ros-humble-ament-cmake-pytest \
    ccache \
    && rm -rf /var/lib/apt/lists/*
ENV CCACHE_DIR=/ccache
ENV CC="ccache gcc"
ENV CXX="ccache g++"

Build Stage

Copies only src/ and package.xml files to maximize cache hits during dependency resolution.

FROM ros:humble-ros-base AS build
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3-colcon-common-extensions python3-rosdep \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /ros2_ws
# Copy package manifests first for dependency caching
COPY src/my_pkg/package.xml src/my_pkg/package.xml
RUN . /opt/ros/humble/setup.sh && apt-get update && \
    rosdep install --from-paths src --ignore-src -r -y && \
    rm -rf /var/lib/apt/lists/*
# Source changes invalidate only this layer and below
COPY src/ src/
RUN . /opt/ros/humble/setup.sh && \
    colcon build --cmake-args -DCMAKE_BUILD_TYPE=Release \
      --event-handlers console_direct+

Runtime Stage

Contains only the built install space and runtime dependencies. No compilers, no source code.

FROM ros:humble-ros-core AS runtime
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3-yaml ros-humble-rmw-cyclonedds-cpp \
    && rm -rf /var/lib/apt/lists/*
COPY --from=build /ros2_ws/install /ros2_ws/install
RUN groupadd -r rosuser && useradd -r -g rosuser -m rosuser
USER rosuser
COPY ros_entrypoint.sh /ros_entrypoint.sh
ENTRYPOINT ["/ros_entrypoint.sh"]
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]

Full Multi-Stage Dockerfile

# syntax=docker/dockerfile:1
# Usage:
#   docker build --target dev -t my_robot:dev .
#   docker build --target runtime -t my_robot:latest .
ARG ROS_DISTRO=humble
ARG BASE_IMAGE=ros:${ROS_DISTRO}-ros-base

# Stage 1: Dependency base — install apt and rosdep deps
FROM ${BASE_IMAGE} AS deps
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3-colcon-common-extensions python3-rosdep \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /ros2_ws
# Copy only package.xml files for rosdep resolution (maximizes cache reuse)
COPY src/my_robot_bringup/package.xml src/my_robot_bringup/package.xml
COPY src/my_robot_perception/package.xml src/my_robot_perception/package.xml
COPY src/my_robot_msgs/package.xml src/my_robot_msgs/package.xml
COPY src/my_robot_navigation/package.xml src/my_robot_navigation/package.xml
RUN . /opt/ros/${ROS_DISTRO}/setup.sh && \
    apt-get update && \
    rosdep install --from-paths src --ignore-src -r -y && \
    rm -rf /var/lib/apt/lists/*

# Stage 2: Development — full dev environment
FROM deps AS dev
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential gdb valgrind ccache python3-pip python3-pytest \
    ros-${ROS_DISTRO}-ament-lint-auto \
    ros-${ROS_DISTRO}-launch-testing-ament-cmake \
    ros-${ROS_DISTRO}-rviz2 ros-${ROS_DISTRO}-rqt-graph \
    && rm -rf /var/lib/apt/lists/*
ENV CCACHE_DIR=/ccache CC="ccache gcc" CXX="ccache g++"
COPY src/ src/
COPY ros_entrypoint.sh /ros_entrypoint.sh
ENTRYPOINT ["/ros_entrypoint.sh"]
CMD ["bash"]

# Stage 3: Build — compile workspace
FROM deps AS build
COPY src/ src/
RUN . /opt/ros/${ROS_DISTRO}/setup.sh && \
    colcon build \
      --cmake-args -DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=OFF \
      --event-handlers console_direct+ \
      --parallel-workers $(nproc)

# Stage 4: Runtime — minimal production image
FROM ros:${ROS_DISTRO}-ros-core AS runtime
ARG ROS_DISTRO=humble
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3-yaml ros-${ROS_DISTRO}-rmw-cyclonedds-cpp \
    && rm -rf /var/lib/apt/lists/*
COPY --from=build /ros2_ws/install /ros2_ws/install
RUN groupadd -r rosuser && useradd -r -g rosuser -m -s /bin/bash rosuser
USER rosuser
ENV RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
COPY ros_entrypoint.sh /ros_entrypoint.sh
ENTRYPOINT ["/ros_entrypoint.sh"]
CMD ["ros2", "launch", "my_robot_bringup", "robot.launch.py"]

The entrypoint script both dev and runtime stages use:

#!/bin/bash
set -e
source /opt/ros/${ROS_DISTRO}/setup.bash
if [ -f /ros2_ws/install/setup.bash ]; then
    source /ros2_ws/install/setup.bash
fi
exec "$@"

Docker Compose for Multi-Container ROS2 Systems

Basic Multi-Container Setup

Each ROS2 subsystem runs in its own container with process isolation, independent scaling, and per-service resource limits.

# docker-compose.yml
version: "3.8"

x-ros-common: &ros-common
  environment:
    - ROS_DOMAIN_ID=${ROS_DOMAIN_ID:-0}
    - RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
    - CYCLONEDDS_URI=file:///cyclonedds.xml
  volumes:
    - ./config/cyclonedds.xml:/cyclonedds.xml:ro
    - /dev/shm:/dev/shm
  network_mode: host
  restart: unless-stopped

services:
  rosbridge:
    <<: *ros-common
    image: my_robot:latest
    command: ros2 launch rosbridge_server rosbridge_websocket_launch.xml port:=9090

  perception:
    <<: *ros-common
    image: my_robot_perception:latest
    command: ros2 launch my_robot_perception perception.launch.py
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    devices:
      - /dev/video0:/dev/video0            # USB camera passthrough

  navigation:
    <<: *ros-common
    image: my_robot_navigation:latest
    command: >
      ros2 launch my_robot_navigation navigation.launch.py
        use_sim_time:=false map:=/maps/warehouse.yaml
    volumes:
      - ./maps:/maps:ro

  driver:
    <<: *ros-common
    image: my_robot_driver:latest
    command: ros2 launch my_robot_driver driver.launch.py
    devices:
      - /dev/ttyUSB0:/dev/ttyUSB0          # Serial motor controller
      - /dev/ttyACM0:/dev/ttyACM0          # IMU over USB-serial
    group_add:
      - dialout

Service Dependencies with Health Checks

services:
  driver:
    <<: *ros-common
    image: my_robot_driver:latest
    healthcheck:
      test: ["CMD", "bash", "-c",
             "source /opt/ros/humble/setup.bash && ros2 topic list | grep -q /joint_states"]
      interval: 5s
      timeout: 10s
      retries: 5
      start_period: 15s

  navigation:
    <<: *ros-common
    image: my_robot_navigation:latest
    depends_on:
      driver:
        condition: service_healthy         # Wait for driver topics

  perception:
    <<: *ros-common
    image: my_robot_perception:latest
    depends_on:
      driver:
        condition: service_healthy         # Camera driver must be ready

Profiles for Dev vs Deploy

services:
  driver:
    <<: *ros-common
    image: my_robot_driver:latest
    command: ros2 launch my_robot_driver driver.launch.py

  rviz:
    <<: *ros-common
    profiles: ["dev"]
    image: my_robot:dev
    command: ros2 run rviz2 rviz2 -d /rviz/config.rviz
    environment:
      - DISPLAY=${DISPLAY}
      - QT_X11_NO_MITSHM=1
    volumes:
      - /tmp/.X11-unix:/tmp/.X11-unix:rw

  rosbag_record:
    <<: *ros-common
    profiles: ["dev"]
    image: my_robot:dev
    command: ros2 bag record -a --storage sqlite3 --max-bag-duration 300 -o /bags/session
    volumes:
      - ./bags:/bags

  watchdog:
    <<: *ros-common
    profiles: ["deploy"]
    image: my_robot:latest
    command: ros2 launch my_robot_bringup watchdog.launch.py
    restart: always

docker compose --profile dev up          # Dev tools (rviz, rosbag)
docker compose --profile deploy up -d    # Production (watchdog, no GUI)

DDS Discovery Across Containers

CycloneDDS XML Config for Unicast Across Containers

When containers use bridge networking (no multicast), configure explicit unicast peer lists.

<!-- cyclonedds.xml -->
<?xml version="1.0" encoding="UTF-8"?>
<CycloneDDS xmlns="https://cdds.io/config">
  <Domain>
    <General>
      <Interfaces>
        <NetworkInterface autodetermine="true" priority="default"/>
      </Interfaces>
      <AllowMulticast>false</AllowMulticast>
    </General>
    <Discovery>
      <!-- Peer list uses docker-compose service names as hostnames -->
      <Peers>
        <Peer address="perception"/>
        <Peer address="navigation"/>
        <Peer address="driver"/>
        <Peer address="rosbridge"/>
      </Peers>
      <ParticipantIndex>auto</ParticipantIndex>
      <MaxAutoParticipantIndex>120</MaxAutoParticipantIndex>
    </Discovery>
    <Internal>
      <SocketReceiveBufferSize min="10MB"/>
    </Internal>
  </Domain>
</CycloneDDS>

FastDDS XML Config

<!-- fastdds.xml -->
<?xml version="1.0" encoding="UTF-8"?>
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
  <profiles>
    <participant profile_name="docker_participant" is_default_profile="true">
      <rtps>
        <builtin>
          <discovery_config>
            <discoveryProtocol>SIMPLE</discoveryProtocol>
            <leaseDuration><sec>10</sec></leaseDuration>
          </discovery_config>
          <initialPeersList>
            <locator>
              <udpv4><address>perception</address><port>7412</port></udpv4>
            </locator>
            <locator>
              <udpv4><address>navigation</address><port>7412</port></udpv4>
            </locator>
            <locator>
              <udpv4><address>driver</address><port>7412</port></udpv4>
            </locator>
          </initialPeersList>
        </builtin>
      </rtps>
    </participant>
  </profiles>
</dds>

Mount and activate in compose:

# CycloneDDS
environment:
  - RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
  - CYCLONEDDS_URI=file:///cyclonedds.xml
volumes:
  - ./config/cyclonedds.xml:/cyclonedds.xml:ro

# FastDDS
environment:
  - RMW_IMPLEMENTATION=rmw_fastrtps_cpp
  - FASTRTPS_DEFAULT_PROFILES_FILE=/fastdds.xml
volumes:
  - ./config/fastdds.xml:/fastdds.xml:ro

Shared Memory Transport in Docker

DDS shared memory (zero-copy) requires /dev/shm sharing between containers. This provides highest throughput for large messages (images, point clouds).

services:
  perception:
    shm_size: "512m"                    # Default 64 MB is too small for image topics
    volumes:
      - /dev/shm:/dev/shm              # Share host shm for inter-container zero-copy

<!-- Enable shared memory in CycloneDDS -->
<CycloneDDS xmlns="https://cdds.io/config">
  <Domain>
    <SharedMemory>
      <Enable>true</Enable>
    </SharedMemory>
  </Domain>
</CycloneDDS>

Constraints: all communicating containers must share /dev/shm or use ipc: host. Use --ipc=shareable on one container and --ipc=container:<name> on others for scoped sharing.

Networking Modes and ROS2 Implications

Host Networking

services:
  my_node:
    network_mode: host      # Shares host network namespace; DDS multicast works natively

Bridge Networking (Default)

services:
  my_node:
    networks: [ros_net]
networks:
  ros_net:
    driver: bridge          # DDS multicast blocked; requires unicast peer config

Macvlan Networking

networks:
  ros_macvlan:
    driver: macvlan
    driver_opts:
      parent: eth0
    ipam:
      config:
        - subnet: 192.168.1.0/24
          gateway: 192.168.1.1
services:
  my_node:
    networks:
      ros_macvlan:
        ipv4_address: 192.168.1.50   # Real LAN IP; DDS multicast works natively

Decision Table

Factor	Host	Bridge	Macvlan
DDS discovery	Works natively	Needs unicast peers	Works natively
Network isolation	None	Full isolation	LAN-level isolation
Port conflicts	Yes (host ports)	No (mapped ports)	No (unique IPs)
Performance	Native	Slight overhead	Near-native
Multi-host support	No	With overlay networks	Yes (same LAN)
When to use	Dev, single host	CI/CD, multi-tenant	Multi-robot on LAN

GPU Passthrough for Perception

NVIDIA Container Toolkit Setup

# Install NVIDIA Container Toolkit on the host
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

Compose Config with deploy.resources

services:
  perception:
    image: my_robot_perception:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1                    # Number of GPUs (or "all")
              capabilities: [gpu]
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility,video
    shm_size: "1g"                        # Large shm for GPU<->CPU transfers

For Dockerfiles that need CUDA, start from NVIDIA base and install ROS2 on top:

FROM nvidia/cuda:12.2.0-cudnn8-runtime-ubuntu22.04 AS perception-base
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl gnupg2 lsb-release \
    && curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key \
       -o /usr/share/keyrings/ros-archive-keyring.gpg \
    && echo "deb [arch=$(dpkg --print-architecture) \
       signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] \
       http://packages.ros.org/ros2/ubuntu $(lsb_release -cs) main" \
       > /etc/apt/sources.list.d/ros2.list \
    && apt-get update && apt-get install -y --no-install-recommends \
       ros-humble-ros-base ros-humble-cv-bridge ros-humble-image-transport \
    && rm -rf /var/lib/apt/lists/*

Verification

docker compose exec perception bash -c '
  nvidia-smi
  python3 -c "import torch; print(f\"CUDA available: {torch.cuda.is_available()}\")"
'

Display Forwarding

X11 Forwarding

services:
  rviz:
    image: my_robot:dev
    command: ros2 run rviz2 rviz2
    environment:
      - DISPLAY=${DISPLAY:-:0}                   # Forward host display
      - QT_X11_NO_MITSHM=1                       # Disable MIT-SHM (crashes in Docker)
    volumes:
      - /tmp/.X11-unix:/tmp/.X11-unix:rw          # X11 socket
      - ${HOME}/.Xauthority:/root/.Xauthority:ro  # Auth cookie
    network_mode: host

# Allow local Docker containers to access the X server
xhost +local:docker
# More secure variant:
xhost +SI:localuser:$(whoami)

Wayland Forwarding

services:
  rviz:
    image: my_robot:dev
    command: ros2 run rviz2 rviz2
    environment:
      - WAYLAND_DISPLAY=${WAYLAND_DISPLAY:-wayland-0}
      - XDG_RUNTIME_DIR=/run/user/1000
      - QT_QPA_PLATFORM=wayland
    volumes:
      - ${XDG_RUNTIME_DIR}/${WAYLAND_DISPLAY}:/run/user/1000/${WAYLAND_DISPLAY}:rw

Headless Rendering

For CI/CD or remote machines without a physical display:

# Run rviz2 headless with Xvfb for screenshot capture or testing
docker run --rm my_robot:dev bash -c '
  apt-get update && apt-get install -y xvfb mesa-utils &&
  Xvfb :99 -screen 0 1920x1080x24 &
  export DISPLAY=:99
  source /opt/ros/humble/setup.bash
  ros2 run rviz2 rviz2 -d /config/test.rviz --screenshot /output/frame.png
'

Volume Mounts and Workspace Overlays

Source Mounts for Dev

Mount only src/ during development. Let colcon write build/, install/, and log/ inside named volumes to avoid bind mount performance issues.

# BAD: mounting entire workspace — build artifacts on bind mount are slow
# volumes:
#   - ./my_ros2_ws:/ros2_ws

# GOOD: mount only source, use named volumes for build artifacts
services:
  dev:
    image: my_robot:dev
    volumes:
      - ./src:/ros2_ws/src:rw                     # Source code (bind mount)
      - build_vol:/ros2_ws/build                   # Build artifacts (named volume)
      - install_vol:/ros2_ws/install               # Install space (named volume)
      - log_vol:/ros2_ws/log                       # Log output (named volume)
    working_dir: /ros2_ws

volumes:
  build_vol:
  install_vol:
  log_vol:

ccache Caching

Persist ccache across container rebuilds for faster C++ compilation:

services:
  dev:
    volumes:
      - ccache_vol:/ccache
    environment:
      - CCACHE_DIR=/ccache
      - CCACHE_MAXSIZE=2G
      - CC=ccache gcc
      - CXX=ccache g++
volumes:
  ccache_vol:

ROS Workspace Overlay in Docker

Keep upstream packages cached and only rebuild custom packages:

# Stage 1: upstream dependencies (rarely changes)
FROM ros:humble-ros-base AS upstream
RUN apt-get update && apt-get install -y --no-install-recommends \
    ros-humble-nav2-bringup ros-humble-slam-toolbox \
    ros-humble-robot-localization \
    && rm -rf /var/lib/apt/lists/*

# Stage 2: custom packages overlay on top
FROM upstream AS workspace
WORKDIR /ros2_ws
COPY src/ src/
RUN . /opt/ros/humble/setup.sh && colcon build --symlink-install
# install/setup.bash automatically sources /opt/ros/humble as underlay

USB Device Passthrough

Cameras and Serial Devices

services:
  camera_driver:
    image: my_robot_driver:latest
    devices:
      - /dev/video0:/dev/video0        # USB camera (V4L2)
      - /dev/video1:/dev/video1
    group_add:
      - video                          # Access /dev/videoN without root

  motor_driver:
    image: my_robot_driver:latest
    devices:
      - /dev/ttyUSB0:/dev/ttyUSB0      # USB-serial motor controller
      - /dev/ttyACM0:/dev/ttyACM0      # Arduino/Teensy
    group_add:
      - dialout                        # Access serial ports without root

Udev Rules Inside Containers

Create stable device symlinks on the host so container paths remain consistent regardless of USB enumeration order.

# /etc/udev/rules.d/99-robot-devices.rules (host-side)
SUBSYSTEM=="tty", ATTRS{idVendor}=="0403", ATTRS{idProduct}=="6001", SYMLINK+="robot/motor_controller"
SUBSYSTEM=="tty", ATTRS{idVendor}=="10c4", ATTRS{idProduct}=="ea60", SYMLINK+="robot/lidar"
SUBSYSTEM=="video4linux", ATTRS{idVendor}=="046d", ATTRS{idProduct}=="0825", SYMLINK+="robot/camera"

sudo udevadm control --reload-rules && sudo udevadm trigger

services:
  driver:
    devices:
      - /dev/robot/motor_controller:/dev/ttyMOTOR   # Stable symlink
      - /dev/robot/lidar:/dev/ttyLIDAR
      - /dev/robot/camera:/dev/video0

Dynamic Device Attachment

For devices plugged in after the container starts:

services:
  driver:
    # Option 1: privileged (use only when necessary)
    privileged: true
    volumes:
      - /dev:/dev

    # Option 2: cgroup device rules (more secure)
    # device_cgroup_rules:
    #   - 'c 188:* rmw'                  # USB-serial (major 188)
    #   - 'c 81:* rmw'                   # Video devices (major 81)

Dev Container Configuration

// .devcontainer/devcontainer.json
{
  "name": "ROS2 Humble Dev",
  "build": {
    "dockerfile": "../Dockerfile",
    "target": "dev",
    "args": { "ROS_DISTRO": "humble" }
  },
  "runArgs": [
    "--network=host", "--ipc=host", "--pid=host",
    "--privileged", "--gpus", "all",
    "-e", "DISPLAY=${localEnv:DISPLAY}",
    "-e", "QT_X11_NO_MITSHM=1",
    "-v", "/tmp/.X11-unix:/tmp/.X11-unix:rw",
    "-v", "/dev:/dev"
  ],
  "workspaceMount": "source=${localWorkspaceFolder},target=/ros2_ws/src,type=bind",
  "workspaceFolder": "/ros2_ws",
  "mounts": [
    "source=ros2-build-vol,target=/ros2_ws/build,type=volume",
    "source=ros2-install-vol,target=/ros2_ws/install,type=volume",
    "source=ros2-log-vol,target=/ros2_ws/log,type=volume",
    "source=ros2-ccache-vol,target=/ccache,type=volume"
  ],
  "containerEnv": {
    "ROS_DISTRO": "humble",
    "RMW_IMPLEMENTATION": "rmw_cyclonedds_cpp",
    "CCACHE_DIR": "/ccache",
    "RCUTILS_COLORIZED_OUTPUT": "1"
  },
  "customizations": {
    "vscode": {
      "extensions": [
        "ms-iot.vscode-ros",
        "ms-vscode.cpptools",
        "ms-python.python",
        "ms-vscode.cmake-tools",
        "smilerobotics.urdf",
        "redhat.vscode-xml",
        "redhat.vscode-yaml"
      ],
      "settings": {
        "ros.distro": "humble",
        "python.defaultInterpreterPath": "/usr/bin/python3",
        "C_Cpp.default.compileCommands": "/ros2_ws/build/compile_commands.json",
        "cmake.configureOnOpen": false
      }
    }
  },
  "postCreateCommand": "sudo apt-get update && rosdep update && rosdep install --from-paths src --ignore-src -r -y",
  "remoteUser": "rosuser"
}

CI/CD with Docker

GitHub Actions Workflow

# .github/workflows/ros2-docker-ci.yml
name: ROS2 Docker CI
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build-and-test:
    runs-on: ubuntu-22.04
    strategy:
      fail-fast: false
      matrix:
        ros_distro: [humble, jazzy]
        include:
          - ros_distro: humble
            ubuntu: "22.04"
          - ros_distro: jazzy
            ubuntu: "24.04"
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Cache Docker layers
        uses: actions/cache@v4
        with:
          path: /tmp/.buildx-cache
          key: ${{ runner.os }}-buildx-${{ matrix.ros_distro }}-${{ hashFiles('src/**/package.xml') }}
          restore-keys: ${{ runner.os }}-buildx-${{ matrix.ros_distro }}-

      - name: Build and test
        run: |
          docker build --target dev \
            --build-arg ROS_DISTRO=${{ matrix.ros_distro }} \
            -t test-image:${{ matrix.ros_distro }} .
          docker run --rm test-image:${{ matrix.ros_distro }} bash -c '
            source /opt/ros/${{ matrix.ros_distro }}/setup.bash &&
            cd /ros2_ws &&
            colcon build --cmake-args -DBUILD_TESTING=ON &&
            colcon test --event-handlers console_direct+ &&
            colcon test-result --verbose'

      - name: Push runtime image
        if: github.ref == 'refs/heads/main'
        uses: docker/build-push-action@v5
        with:
          context: .
          target: runtime
          build-args: ROS_DISTRO=${{ matrix.ros_distro }}
          tags: |
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ matrix.ros_distro }}-latest
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ matrix.ros_distro }}-${{ github.sha }}
          push: true
          cache-from: type=local,src=/tmp/.buildx-cache
          cache-to: type=local,dest=/tmp/.buildx-cache-new,mode=max

      - name: Rotate cache
        run: rm -rf /tmp/.buildx-cache && mv /tmp/.buildx-cache-new /tmp/.buildx-cache

Layer Caching

Order Dockerfile instructions from least-frequently-changed to most-frequently-changed:

1. Base image         (ros:humble-ros-base)     — changes on distro upgrade
2. System apt packages                           — changes on new dependency
3. rosdep install     (from package.xml)         — changes on new ROS dep
4. COPY src/ src/                                — changes on every code edit
5. colcon build                                  — rebuilds on source change

Build Matrix Across Distros

strategy:
  matrix:
    ros_distro: [humble, iron, jazzy, rolling]
    rmw: [rmw_cyclonedds_cpp, rmw_fastrtps_cpp]
    exclude:
      - ros_distro: iron           # Iron EOL — skip
        rmw: rmw_fastrtps_cpp

Docker ROS2 Anti-Patterns

1. Running Everything in One Container

Problem: Putting perception, navigation, planning, and drivers in a single container defeats the purpose of containerization. A crash in one subsystem takes down everything.

Fix: Split into one service per subsystem. Use docker-compose to orchestrate.

# BAD: monolithic container
services:
  robot:
    image: my_robot:latest
    command: ros2 launch my_robot everything.launch.py

# GOOD: one container per subsystem
services:
  perception:
    image: my_robot_perception:latest
    command: ros2 launch my_robot_perception perception.launch.py
  navigation:
    image: my_robot_navigation:latest
    command: ros2 launch my_robot_navigation navigation.launch.py
  driver:
    image: my_robot_driver:latest
    command: ros2 launch my_robot_driver driver.launch.py

2. Using Bridge Networking Without DDS Config

Problem: DDS uses multicast for discovery by default. Docker bridge networks do not forward multicast. Nodes in different containers will not discover each other.

Fix: Use network_mode: host or configure DDS unicast peers explicitly.

# BAD: bridge network with no DDS config
services:
  node_a:
    networks: [ros_net]
  node_b:
    networks: [ros_net]

# GOOD: host networking (simplest)
services:
  node_a:
    network_mode: host
  node_b:
    network_mode: host

# GOOD: bridge with CycloneDDS unicast peers
services:
  node_a:
    networks: [ros_net]
    environment:
      - RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
      - CYCLONEDDS_URI=file:///cyclonedds.xml
    volumes:
      - ./cyclonedds.xml:/cyclonedds.xml:ro

3. Building Packages in the Runtime Image

Problem: Installing compilers and build tools in the runtime image bloats it by 1-2 GB and increases attack surface.

Fix: Use multi-stage builds. Compile in a build stage, copy only the install space to runtime.

# BAD: build tools in runtime image (2.5 GB)
FROM ros:humble-ros-base
RUN apt-get update && apt-get install -y build-essential python3-colcon-common-extensions
COPY src/ /ros2_ws/src/
RUN cd /ros2_ws && colcon build
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]

# GOOD: multi-stage build (800 MB)
FROM ros:humble-ros-base AS build
RUN apt-get update && apt-get install -y python3-colcon-common-extensions
COPY src/ /ros2_ws/src/
RUN cd /ros2_ws && . /opt/ros/humble/setup.sh && colcon build

FROM ros:humble-ros-core AS runtime
COPY --from=build /ros2_ws/install /ros2_ws/install
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]

4. Mounting the Entire Workspace as a Volume

Problem: Mounting the full workspace means colcon writes build/, install/, and log/ to a bind mount. On macOS/Windows Docker Desktop, bind mount I/O is 10-50x slower. Builds that take 2 minutes take 20+ minutes.

Fix: Mount only src/ as a bind mount. Use named volumes for build artifacts.

# BAD:
volumes:
  - ./my_ros2_ws:/ros2_ws

# GOOD:
volumes:
  - ./my_ros2_ws/src:/ros2_ws/src
  - build_vol:/ros2_ws/build
  - install_vol:/ros2_ws/install
  - log_vol:/ros2_ws/log

5. Running Containers as Root

Problem: Running as root inside containers is a security risk. If compromised, the attacker has root access to mounted volumes and devices.

Fix: Create a non-root user with appropriate group membership.

# BAD:
FROM ros:humble-ros-base
COPY --from=build /ros2_ws/install /ros2_ws/install
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]

# GOOD:
FROM ros:humble-ros-base
RUN groupadd -r rosuser && \
    useradd -r -g rosuser -G video,dialout -m -s /bin/bash rosuser
COPY --from=build --chown=rosuser:rosuser /ros2_ws/install /ros2_ws/install
USER rosuser
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]

6. Ignoring Layer Cache Order in Dockerfile

Problem: Placing COPY src/ . before rosdep install means every source change invalidates the dependency cache. All apt packages are re-downloaded on every build.

Fix: Copy only package.xml files first, install dependencies, then copy source.

# BAD: source copy before rosdep
COPY src/ /ros2_ws/src/
RUN rosdep install --from-paths src --ignore-src -r -y
RUN colcon build

# GOOD: package.xml first, then rosdep, then source
COPY src/my_pkg/package.xml /ros2_ws/src/my_pkg/package.xml
RUN . /opt/ros/humble/setup.sh && rosdep install --from-paths src --ignore-src -r -y
COPY src/ /ros2_ws/src/
RUN . /opt/ros/humble/setup.sh && colcon build

7. Hardcoding ROS_DOMAIN_ID

Problem: Hardcoding ROS_DOMAIN_ID=42 causes conflicts when multiple robots or developers share a network. Two robots with the same domain ID will cross-talk.

Fix: Use environment variables with defaults. Set domain ID at deploy time.

# BAD:
environment:
  - ROS_DOMAIN_ID=42

# GOOD:
environment:
  - ROS_DOMAIN_ID=${ROS_DOMAIN_ID:-0}

ROS_DOMAIN_ID=1 docker compose up -d    # Robot 1
ROS_DOMAIN_ID=2 docker compose up -d    # Robot 2

8. Forgetting to Source setup.bash in Entrypoint

Problem: The CMD runs ros2 launch ... but the shell has not sourced the ROS2 setup files. Fails with ros2: command not found.

Fix: Use an entrypoint script that sources the underlay and overlay before executing the command.

# BAD: no sourcing
FROM ros:humble-ros-core
COPY --from=build /ros2_ws/install /ros2_ws/install
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]

# GOOD: entrypoint script handles sourcing
FROM ros:humble-ros-core
COPY --from=build /ros2_ws/install /ros2_ws/install
COPY ros_entrypoint.sh /ros_entrypoint.sh
RUN chmod +x /ros_entrypoint.sh
ENTRYPOINT ["/ros_entrypoint.sh"]
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]

#!/bin/bash
# ros_entrypoint.sh
set -e
source /opt/ros/${ROS_DISTRO}/setup.bash
if [ -f /ros2_ws/install/setup.bash ]; then
    source /ros2_ws/install/setup.bash
fi
exec "$@"

Docker ROS2 Deployment Checklist

Multi-stage build -- Separate dev, build, and runtime stages so production images contain no compilers or build tools
Non-root user -- Create a dedicated rosuser with only the group memberships needed (video, dialout) instead of privileged mode
DDS configuration -- Ship a cyclonedds.xml or fastdds.xml with explicit peer lists if not using host networking
Health checks -- Every service has a health check that verifies the node is running and publishing on expected topics
Resource limits -- Set mem_limit, cpus, and GPU reservations for each service to prevent resource starvation
Restart policy -- Use restart: unless-stopped for all services and restart: always for critical drivers and watchdogs
Log management -- Configure Docker logging driver (json-file with max-size/max-file) to prevent disk exhaustion from ROS2 log output
Environment injection -- Use .env files or orchestrator secrets for ROS_DOMAIN_ID, DDS config paths, and device mappings rather than hardcoding
Shared memory -- Set shm_size to at least 256 MB (512 MB for image topics) and configure DDS shared memory transport for high-bandwidth topics
Device stability -- Use udev rules on the host to create stable /dev/robot/* symlinks and reference those in compose device mappings
Image versioning -- Tag images with both latest and a commit SHA or semantic version; never deploy unversioned latest to production
Backup and rollback -- Keep the previous image version available so a failed deployment can be rolled back by reverting the image tag