BVR Firmware Code Review Skill
This skill provides comprehensive code review for the BVR (Base Vectoring Rover) firmware, focusing on safety-critical systems, real-time control, and CAN bus communication.
Overview
The BVR firmware runs on a Jetson Orin NX and controls a four-wheel differential drive rover using VESC motor controllers over CAN bus. The system is safety-critical with multiple layers of protection including watchdogs, e-stop mechanisms, and rate limiting.
Key Architecture:
-
Platform: Jetson Orin NX (aarch64-unknown-linux-gnu)
-
Language: Rust Edition 2021
-
Runtime: Tokio async multi-threaded
-
Communication: SocketCAN (can0 interface at 500kHz)
-
Workspace: 18 crates in bvr/firmware/
-
Main daemon: bvrd (1191 lines in bins/bvrd/src/main.rs )
Critical Files:
-
State machine: crates/state/src/lib.rs (130 lines)
-
Control logic: crates/control/src/lib.rs (243 lines)
-
CAN abstraction: crates/can/src/lib.rs
-
VESC protocol: crates/can/src/vesc.rs
-
LED control: crates/can/src/leds.rs
-
Main daemon: bins/bvrd/src/main.rs (1191 lines)
-
Configuration: config/bvr.toml (152 lines)
Safety-Critical Review Checklist
- Watchdog Implementation
Purpose: Detect communication timeouts and transition to safe state (Idle) when commands stop arriving.
Location: bvr/firmware/crates/control/src/lib.rs:162-193
Key Points to Review:
-
Watchdog initialized with appropriate timeout (default: 500ms)
-
feed() called on every valid command reception
-
is_timed_out() checked in main control loop
-
Timeout triggers Event::CommandTimeout in state machine
-
Watchdog reset on e-stop or mode transitions
Example Pattern:
// Good: Watchdog properly integrated let mut watchdog = Watchdog::new(Duration::from_millis(500)); loop { if let Some(cmd) = recv_command() { watchdog.feed(); // Reset timer on command } if watchdog.is_timed_out() { state_machine.handle(Event::CommandTimeout); // Motors automatically stop in Idle mode } }
Red Flags:
-
Watchdog created but never checked
-
Timeout value too long (>1 second for teleop)
-
feed() called unconditionally (bypasses timeout detection)
-
No transition to safe state on timeout
See: safety-checklist.md for detailed verification steps.
- E-Stop State Machine
Purpose: Immediate safety override from any operational mode, requiring explicit release to resume.
Location: bvr/firmware/crates/state/src/lib.rs:7-130
State Machine Modes:
-
Disabled
-
Initial state, motors off
-
Idle
-
Ready state, motors enabled but no movement
-
Teleop
-
Human control via gamepad/keyboard
-
Autonomous
-
Autonomous navigation
-
EStop
-
Emergency stop, requires release
-
Fault
-
Error state, requires fault clear
Valid E-Stop Transitions:
-
ANY mode → EStop (via Event::EStop )
-
EStop → Idle (via Event::EStopRelease only)
Key Points to Review:
-
Event::EStop handled from all active modes (Idle, Teleop, Autonomous)
-
Cannot exit e-stop without explicit Event::EStopRelease
-
force_estop() method available for critical overrides
-
LED feedback tied to e-stop state (red pulsing at 200ms)
-
Motor commands blocked in e-stop mode
-
State logged at each transition with old/new mode
Example Pattern:
// Good: Proper e-stop handling match (self.mode, event) { (_, Event::EStop) => { info!(?self.mode, "E-stop triggered"); self.mode = Mode::EStop; // Motors automatically stopped by is_driving() check } (Mode::EStop, Event::EStopRelease) => { info!("E-stop released, transitioning to Idle"); self.mode = Mode::Idle; } (Mode::EStop, Event::Enable) => { warn!("Cannot enable directly from e-stop, must release first"); // No state change - requires explicit release } // ... }
Red Flags:
-
Direct transition from e-stop to driving modes without release
-
E-stop event not handled in all modes
-
State machine allows bypassing e-stop
-
Missing logging on e-stop transitions
See: state-machine.md for complete transition diagram.
- Rate Limiting and Acceleration Control
Purpose: Progressive acceleration/deceleration to prevent wheel slip, mechanical stress, and loss of control.
Location: bvr/firmware/crates/control/src/lib.rs:101-160
Rate Limiter Configuration:
-
Max acceleration: 50.0 m/s² (effectively unlimited for teleop responsiveness)
-
Max deceleration: 15.0 m/s² (higher for quicker stops)
-
Max linear velocity: 5.0 m/s (from chassis config)
-
Max angular velocity: 2.5 rad/s (from chassis config)
Key Points to Review:
-
Rate limiter applied before motor commands sent
-
Direction change detection (applies deceleration rate)
-
Velocity clamped to absolute limits
-
Rate limiter reset on e-stop or mode change
-
Delta time (dt) used for rate calculations
-
No divide-by-zero when dt is very small
Example Pattern:
// Good: Rate limiting with direction change detection impl RateLimiter { pub fn limit(&mut self, target: Twist, dt: f32) -> Twist { // Detect direction change (requires deceleration) let accel = if self.prev.linear * target.linear < 0.0 { self.max_decel // Braking } else { self.max_accel // Accelerating };
// Apply rate limit
let delta = target.linear - self.prev.linear;
let max_delta = accel * dt;
let limited = self.prev.linear + delta.clamp(-max_delta, max_delta);
// Clamp to absolute limits
Twist {
linear: limited.clamp(-self.max_linear, self.max_linear),
angular: /* similar for angular */,
boost: target.boost,
}
}
}
Red Flags:
-
Raw target velocity sent directly to motors
-
No rate limiting on angular velocity
-
Acceleration limits too high (>100 m/s²)
-
Rate limiter bypassed in autonomous mode
-
Missing dt parameter or using fixed dt
See: safety-checklist.md for validation tests.
- State Machine Transitions
Purpose: Enforce valid operational mode transitions and prevent unsafe state combinations.
Key Points to Review:
-
All state transitions explicitly defined
-
Invalid transitions logged and rejected
-
is_driving() check used to gate motor commands
-
is_safe() check used for configuration changes
-
Unit tests cover all transition paths
-
Logging includes both old and new modes
Critical Checks:
// is_driving() - only Teleop and Autonomous should drive pub fn is_driving(&self) -> bool { matches!(self.mode, Mode::Teleop | Mode::Autonomous) }
// is_safe() - only allow config changes in safe modes pub fn is_safe(&self) -> bool { matches!(self.mode, Mode::Disabled | Mode::Idle | Mode::EStop) }
Example Pattern:
// Good: Explicit transition handling with logging match (self.mode, event) { (Mode::Idle, Event::Enable) => { self.mode = Mode::Teleop; info!(old = ?Mode::Idle, new = ?Mode::Teleop, "Mode transition"); } (Mode::Disabled, Event::Enable) => { warn!("Cannot enable from Disabled, transition to Idle first"); // Reject transition } _ => { debug!(?self.mode, ?event, "Unhandled event"); } }
See: state-machine.md for full transition table and unit tests.
CAN Bus Protocol Review
- VESC Motor Controller Communication
Purpose: Send velocity/duty commands to 4 VESC motor controllers (FL, FR, RL, RR) and receive status.
Location: bvr/firmware/crates/can/src/vesc.rs
VESC CAN IDs: 0, 1, 2, 3 (configured in bvr.toml )
-
ID 0: Front Left (FL)
-
ID 1: Front Right (FR)
-
ID 2: Rear Left (RL)
-
ID 3: Rear Right (RR)
Command Frame Format (Extended CAN):
-
Frame ID: (command_id << 8) | vesc_id
-
Data Length: 4 bytes (duty), 4 bytes (RPM), 4 bytes (current)
-
Byte Order: Big-endian
Command Types:
-
SetDuty (0) : Duty cycle control, -1.0 to 1.0 scaled to ±100,000
-
SetRpm (3) : Electrical RPM (divide by pole pairs for mechanical)
-
SetCurrent (1) : Current control in milliamps
Key Points to Review:
-
VESC IDs match configuration (0-3 for 4 motors)
-
Command ID correctly shifted to bits [8:15]
-
Big-endian byte order used (VESC protocol requirement)
-
Duty cycle clamped to ±1.0 before scaling
-
Status parsing includes bounds checking (data.len() >= 8 )
-
Heartbeat timeout configured (100ms default)
-
Status frames parsed correctly (STATUS1, STATUS4, STATUS5)
Example Pattern:
// Good: VESC command with proper encoding pub fn set_duty(&mut self, duty: f32) -> Result<Frame, CanError> { let clamped_duty = duty.clamp(-1.0, 1.0); let duty_value = (clamped_duty * 100_000.0) as i32;
let frame_id = (CMD_SET_DUTY << 8) | self.id; // Extended ID
let data = duty_value.to_be_bytes(); // Big-endian
Frame::new(Id::Extended(frame_id), &data)
}
// Good: Safe status parsing with bounds check fn parse_status1(&mut self, data: &[u8]) { if data.len() < 8 { warn!("STATUS1 frame too short"); return; } self.state.status.erpm = i32::from_be_bytes([data[0], data[1], data[2], data[3]]); self.state.status.current = i16::from_be_bytes([data[4], data[5]]) as f32 / 10.0; self.state.status.duty = i16::from_be_bytes([data[6], data[7]]) as f32 / 1000.0; }
Red Flags:
-
Little-endian byte order used (VESC expects big-endian)
-
No bounds checking on status frame parsing (array panic risk)
-
Duty cycle not clamped (can send invalid values >1.0)
-
VESC IDs hardcoded instead of using configuration
-
Missing timeout detection for lost VESCs
See: can-protocol.md for complete VESC protocol reference.
- MCU LED Peripheral Communication
Purpose: Send LED mode commands to RP2350/ESP32-S3 MCU for rover status indication.
Location: bvr/firmware/crates/can/src/leds.rs
LED CAN ID Range: 0x0B00-0x0BFF (extended IDs for peripherals)
-
LED_CMD: 0x0B00 (Jetson → MCU)
-
LED_STATUS: 0x0B01 (MCU → Jetson)
LED Modes:
-
StateLinked (0x10) : MCU automatically sets color based on rover mode
-
Solid : Fixed color
-
Pulse : Breathing effect (configurable period)
-
Flash : Strobe effect (configurable period)
-
Chase : Moving pattern
Rover State Colors:
-
Disabled: Red solid
-
Idle: Blue solid
-
Teleop: Green pulse (2s period)
-
Autonomous: Cyan pulse (1.5s period)
-
EStop: Red flash (200ms, full brightness)
-
Fault: Orange flash (500ms)
Key Points to Review:
-
LED commands use 0x0B00 base address
-
StateLinked mode used for automatic state indication
-
LED state synchronized with state machine transitions
-
Color choices distinct and recognizable
-
MCU communication timeout handled gracefully
Example Pattern:
// Good: State-linked LED control pub fn update_leds(&mut self, mode: Mode) { let led_mode = match mode { Mode::Idle => LedMode::idle(), // Blue solid Mode::Teleop => LedMode::teleop(), // Green pulse 2s Mode::Autonomous => LedMode::autonomous(), // Cyan pulse 1.5s Mode::EStop => LedMode::estop(), // Red flash 200ms Mode::Fault => LedMode::fault(), // Orange flash 500ms _ => LedMode::off(), }; self.send_led_command(led_mode)?; }
See: can-protocol.md for LED protocol details.
- CAN Bus Error Handling
Key Points to Review:
-
Timeout handling for lost CAN frames (10ms read/write timeout)
-
Invalid frame ID detection and rejection
-
Proper use of Result types with CanError
-
Mock CAN bus for non-Linux development builds
-
Error logging includes context (frame ID, data length, timeout)
Example Pattern:
// Good: Comprehensive error handling #[derive(Error, Debug)] pub enum CanError { #[error("Socket error: {0}")] Socket(String), #[error("Invalid CAN ID: {0}")] InvalidId(u32), #[error("Timeout waiting for response")] Timeout, #[error("Invalid frame data")] InvalidFrame, }
// Good: Non-blocking receive with timeout match self.socket.read_frame_timeout(Duration::from_millis(10)) { Ok(frame) => { /* process / } Err(e) if e.kind() == ErrorKind::WouldBlock => { / no data */ } Err(e) => { error!(?e, "CAN socket read error"); return Err(CanError::Socket(e.to_string())); } }
Code Pattern Review
- Error Handling with thiserror
Convention: Use thiserror for library crates, anyhow for binary crates.
Key Points to Review:
-
Library crates define custom error enums with thiserror
-
Error variants include context with #[error("...")]
-
#[from] attribute used for error conversion
-
Binary crates use anyhow::Result and .context()
-
Errors logged with ?e (Debug) not %e (Display)
Example Pattern:
// Library crate (crates/teleop/src/lib.rs) use thiserror::Error;
#[derive(Error, Debug)] pub enum TeleopError { #[error("Network error: {0}")] Network(#[from] std::io::Error), #[error("Serialization error: {0}")] Serialization(String), #[error("Connection timeout")] Timeout, }
// Binary crate (bins/bvrd/src/main.rs) use anyhow::{Context, Result};
fn load_config() -> Result<Config> { let contents = fs::read_to_string("bvr.toml") .context("Failed to read bvr.toml")?; toml::from_str(&contents) .context("Failed to parse bvr.toml") }
Red Flags:
-
String-based errors in libraries (Result<T, String> )
-
Missing error context in anyhow chains
-
Panic on recoverable errors (unwrap() , expect() in libraries)
-
Display format used in logs (should use Debug: ?e )
- Async/Tokio Patterns
Runtime: Multi-threaded Tokio (#[tokio::main] )
Key Points to Review:
-
#[tokio::main] on main function
-
Channels for inter-task communication (mpsc, watch)
-
Tasks spawned with tokio::spawn for concurrent subsystems
-
Proper error handling in spawned tasks (log errors, don't panic)
-
Graceful shutdown with signal handling
-
No blocking operations in async context
Example Pattern:
// Good: Async main with spawned tasks #[tokio::main] async fn main() -> Result<()> { let (cmd_tx, cmd_rx) = mpsc::channel::<Command>(100); let (telem_tx, telem_rx) = watch::channel(Telemetry::default());
// Spawn teleop server task
tokio::spawn(async move {
if let Err(e) = teleop_server.run(cmd_tx, telem_rx).await {
error!(?e, "Teleop server error");
}
});
// Spawn metrics task
tokio::spawn(async move {
if let Err(e) = metrics.run(telem_rx).await {
error!(?e, "Metrics task error");
}
});
// Main control loop
loop {
tokio::select! {
Some(cmd) = cmd_rx.recv() => { /* handle */ }
_ = tokio::signal::ctrl_c() => {
info!("Shutdown signal received");
break;
}
}
}
Ok(())
}
Red Flags:
-
Blocking calls (std::thread::sleep, blocking I/O) in async functions
-
Missing error handling in spawned tasks
-
Tasks not joined/awaited on shutdown (resource leaks)
-
Unbounded channels (prefer bounded with capacity)
- Module Documentation
Convention: Module-level //! docs at top of lib.rs files, function-level /// docs.
Key Points to Review:
-
//! module-level docs describe crate purpose
-
/// function docs include examples for public API
-
Struct fields documented with ///
-
No missing_docs warnings in library crates
Example Pattern:
//! State machine and mode management for bvr. //! //! This crate implements the rover's operational modes and state //! transitions, including safety mechanisms like e-stop handling.
/// Represents the rover's current operational mode. #[derive(Debug, Clone, Copy, PartialEq, Eq)] pub enum Mode { /// Motors disabled, awaiting initialization Disabled, /// Ready state, motors enabled but stationary Idle, /// Under human control via teleop Teleop, // ... }
/// Handles a state machine event and transitions to a new mode if valid.
///
/// # Example
/// /// let mut sm = StateMachine::new(); /// sm.handle(Event::Enable); /// assert_eq!(sm.mode(), Mode::Idle); ///
pub fn handle(&mut self, event: Event) {
// ...
}
- Workspace Dependencies
Convention: Use version.workspace = true for shared dependencies.
Key Points to Review:
-
Workspace dependencies defined in root Cargo.toml
-
Crate Cargo.toml uses version.workspace = true
-
Consistent versions across all crates (tokio, serde, etc.)
-
Optional features documented
Example Pattern:
Root bvr/firmware/Cargo.toml
[workspace.dependencies] tokio = { version = "1.41", features = ["full"] } serde = { version = "1.0", features = ["derive"] } thiserror = "2.0"
Crate Cargo.toml
[dependencies] tokio = { workspace = true } serde = { workspace = true } thiserror = { workspace = true }
Testing Requirements
- Unit Tests
Convention: Tests in #[cfg(test)] mod tests block at bottom of file.
Key Points to Review:
-
Safety-critical functions have unit tests
-
State machine transitions tested
-
Edge cases covered (zero dt, negative values, etc.)
-
Test names describe behavior (test_watchdog_timeout_triggers )
-
Use assert! with helpful messages
Example Pattern:
#[cfg(test)] mod tests { use super::*;
#[test]
fn test_watchdog_timeout_triggers() {
let mut wd = Watchdog::new(Duration::from_millis(100));
assert!(wd.is_timed_out(), "Watchdog should timeout initially");
wd.feed();
assert!(!wd.is_timed_out(), "Watchdog should not timeout after feed");
std::thread::sleep(Duration::from_millis(150));
assert!(wd.is_timed_out(), "Watchdog should timeout after duration");
}
#[test]
fn test_estop_requires_explicit_release() {
let mut sm = StateMachine::new();
sm.handle(Event::Enable);
assert_eq!(sm.mode(), Mode::Teleop);
sm.handle(Event::EStop);
assert_eq!(sm.mode(), Mode::EStop);
sm.handle(Event::Enable);
assert_eq!(sm.mode(), Mode::EStop, "Cannot enable directly from e-stop");
sm.handle(Event::EStopRelease);
assert_eq!(sm.mode(), Mode::Idle, "Release transitions to Idle");
}
}
- Integration Tests
Key Points to Review:
-
CAN bus simulation/mocking for development
-
Teleop protocol tests with mock clients
-
Configuration loading tested with fixtures
See: Run tests with cargo test (native) or cargo test --target aarch64-unknown-linux-gnu (cross).
References and Additional Resources
For more detailed information, see:
-
safety-checklist.md - Line-by-line safety verification
-
can-protocol.md - Complete VESC and LED protocol reference
-
state-machine.md - State transition diagram and LED feedback
-
CLAUDE.md - Project-wide conventions and architecture
Quick Review Commands
Run all tests
cargo test
Check compilation without building
cargo check
Build for Jetson (requires cross)
cargo build --release --target aarch64-unknown-linux-gnu
Deploy to rover
./deploy.sh <rover-hostname>