Drone CV Expert
Expert in robotics, drone systems, and computer vision for autonomous aerial platforms.
Decision Tree: When to Use This Skill
User mentions drones or UAVs?
├─ YES → Is it about inspection/detection of specific things (fire, roof damage, thermal)?
│   ├─ YES → Use drone-inspection-specialist
│   └─ NO → Is it about flight control, navigation, or general CV?
│       ├─ YES → Use THIS SKILL (drone-cv-expert)
│       └─ NO → Is it about GPU rendering/shaders?
│           ├─ YES → Use metal-shader-expert
│           └─ NO → Use THIS SKILL as default drone skill
└─ NO → Is it general object detection without drone context?
    ├─ YES → Use clip-aware-embeddings or other CV skill
    └─ NO → Probably not a drone question
Core Competencies
Flight Control & Navigation
- PID Tuning: Position, velocity, attitude control loops
- SLAM: ORB-SLAM, LSD-SLAM, visual-inertial odometry (VIO)
- Path Planning: A*, RRT, RRT*, Dijkstra, potential fields
- Sensor Fusion: EKF, UKF, complementary filters
- GPS-Denied Navigation: AprilTags, visual odometry, LiDAR SLAM
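
As an illustration of the control loops listed above, here is a minimal sketch of a single PID loop with output clamping and a simple anti-windup rule. The class name, gains, and limits are hypothetical and chosen only for the example; a real flight stack would normally run several such loops at different rates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Single-axis PID loop (illustrative; gains and limits are placeholders)."""
    kp: float
    ki: float
    kd: float
    out_limit: float                 # symmetric clamp on the output
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, setpoint: float, measurement: float, dt: float) -> float:
        error = setpoint - measurement
        # No derivative term on the first call: there is no previous error yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error

        candidate = self.integral + error * dt
        unclamped = self.kp * error + self.ki * candidate + self.kd * derivative
        output = max(-self.out_limit, min(self.out_limit, unclamped))

        # Anti-windup: only accumulate the integral while the output is not saturated.
        if output == unclamped:
            self.integral = candidate
        return output
```

A call such as `PID(kp=2.0, ki=0.0, kd=0.0, out_limit=1.0).update(1.0, 0.0, 0.01)` returns the clamped value `1.0` rather than the raw proportional term `2.0`.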
Computer Vision
- Object Detection: YOLO (v5/v8/v10), EfficientDet, SSD
- Tracking: ByteTrack, DeepSORT, SORT, optical flow
- Edge Deployment: TensorRT, ONNX, OpenVINO optimization
- 3D Vision: Stereo depth, point clouds, structure-from-motion
Hardware Integration
- Flight Controllers: Pixhawk, ArduPilot, PX4, DJI
- Protocols and SDKs: MAVLink, DroneKit, MAVSDK
- Edge Compute: Jetson (Nano/Xavier/Orin), Coral TPU
- Sensors: IMU, GPS, barometer, LiDAR, depth cameras
Anti-Patterns to Avoid
- "Simulation-Only Syndrome"
Wrong: Testing only in Gazebo/AirSim, then deploying directly to a real drone.
Right: Simulation → Bench test → Tethered flight → Controlled environment → Field.
- "EKF Overkill"
Wrong: Using an Extended Kalman Filter when a complementary filter suffices.
Right: Match filter complexity to requirements:
- Complementary filter: Basic stabilization, attitude only
- EKF: Multi-sensor fusion, GPS+IMU+baro
- UKF: Highly nonlinear systems, aggressive maneuvers
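
To show how little code the simplest option needs, here is a minimal sketch of a complementary filter for pitch. It assumes a sensor frame with x forward and z up, so a level, stationary accelerometer reads about +1 g on z. The function name and the blend factor `alpha=0.98` are illustrative choices, not values from this document.

```python
import math


def complementary_pitch(pitch_prev: float, gyro_rate: float,
                        accel_x: float, accel_z: float,
                        dt: float, alpha: float = 0.98) -> float:
    """Blend gyro integration (smooth, drifts) with accel tilt (noisy, drift-free).

    Angles are in radians, gyro_rate in rad/s. Assumes x-forward, z-up sensor
    axes, so nose-up pitch gives a positive accel_x while stationary.
    """
    pitch_gyro = pitch_prev + gyro_rate * dt      # short-term: integrate rate
    pitch_accel = math.atan2(accel_x, accel_z)    # long-term: gravity direction
    return alpha * pitch_gyro + (1.0 - alpha) * pitch_accel
```

With a silent gyro and an accelerometer that consistently indicates 0.1 rad of pitch, repeated calls pull the estimate toward 0.1 rad. The accelerometer term is only meaningful when the vehicle is not accelerating hard, which is one reason aggressive maneuvers push toward the heavier filters in the list.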
- "Max Resolution Assumption"
Wrong: Processing 4K frames at 30fps expecting real-time performance.
Right: Resolution trade-offs by altitude/speed:

| Altitude | Speed | Resolution | FPS | Rationale |
|---|---|---|---|---|
| <30m | Slow | 1920x1080 | 30 | Detail needed |
| 30-100m | Medium | 1280x720 | 30 | Balance |
| >100m | Fast | 640x480 | 60 | Speed priority |

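
The altitude bands above can be encoded as a small lookup. This is a sketch only: the function and type names are hypothetical, it ignores the speed column, and the choice of which band owns exactly 30 m and 100 m is an assumption.

```python
from typing import NamedTuple


class CaptureMode(NamedTuple):
    width: int
    height: int
    fps: int


def select_capture_mode(altitude_m: float) -> CaptureMode:
    """Pick a capture mode from altitude alone (band edges are an assumption)."""
    if altitude_m < 30:
        return CaptureMode(1920, 1080, 30)   # low altitude: detail needed
    if altitude_m <= 100:
        return CaptureMode(1280, 720, 30)    # mid altitude: balance
    return CaptureMode(640, 480, 60)         # high altitude: speed priority


def pixels_per_second(mode: CaptureMode) -> int:
    """Raw pixel throughput the vision pipeline must keep up with."""
    return mode.width * mode.height * mode.fps
```

For scale, 4K at 30 fps is `3840 * 2160 * 30 = 248,832,000` pixels per second, which is 13.5 times the 18,432,000 pixels per second of the 640x480 at 60 fps mode.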
- "Single-Thread Processing"
Wrong: Sequential detect → track → control in one loop.
Right: Pipeline parallelism:
Thread 1: Camera capture (async)
Thread 2: Object detection (GPU)
Thread 3: Tracking + state estimation
Thread 4: Control commands
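
One possible shape for this pipeline, sketched with the standard library only. The stage functions `detect`, `track`, and `control` are hypothetical stubs standing in for real work, and the single-slot "keep only the latest" queues are a design assumption: a slow stage then drops stale frames instead of building up latency.

```python
import queue
import threading

STOP = object()  # sentinel passed down the pipeline to shut it down


def put_latest(q: queue.Queue, item) -> None:
    """Single-producer helper: replace a stale queued item with the fresh one."""
    try:
        q.put_nowait(item)
    except queue.Full:
        try:
            q.get_nowait()           # discard the stale item
        except queue.Empty:
            pass                     # consumer took it in the meantime
        q.put_nowait(item)


def stage(fn, inbox: queue.Queue, outbox) -> None:
    """Run fn on every item from inbox, forwarding results until STOP arrives."""
    while True:
        item = inbox.get()
        if item is STOP:
            if outbox is not None:
                outbox.put(STOP)     # blocking put, so STOP is never dropped
            return
        result = fn(item)
        if outbox is not None:
            put_latest(outbox, result)


# Hypothetical stand-ins for the real stages.
def detect(frame):
    return {"frame": frame, "boxes": []}


def track(detections):
    return {"frame": detections["frame"], "tracks": []}


def control(state):
    return state["frame"]


def run_pipeline(frames):
    q_frames, q_dets, q_tracks = (queue.Queue(maxsize=1) for _ in range(3))
    commands = []

    def capture():
        for frame in frames:
            put_latest(q_frames, frame)
        q_frames.put(STOP)

    workers = [
        threading.Thread(target=capture),
        threading.Thread(target=stage, args=(detect, q_frames, q_dets)),
        threading.Thread(target=stage, args=(track, q_dets, q_tracks)),
        threading.Thread(target=stage,
                         args=(lambda s: commands.append(control(s)), q_tracks, None)),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return commands
```

Calling `run_pipeline(range(10))` returns the frame ids that reached the control stage. Intermediate frames may be dropped when a downstream stage is busy, but the most recent frame always gets through. In CPython, threads mainly help when stages spend their time in camera I/O or GPU inference rather than in pure Python code.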
- "GPS Trust"
Wrong: Assuming GPS is always accurate and available.
Right: Multi-source position estimation:
- GPS: 2-5m accuracy outdoors, unavailable indoors
- Visual odometry: 0.1-1% drift of distance traveled, lighting dependent
- AprilTags: cm-level accuracy where deployed
- IMU: Short-term only, drift accumulates
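
One simple way to combine whichever sources are currently valid is inverse-variance weighting, sketched here in one dimension. This is an illustration, not a replacement for a full estimator: it assumes independent, unbiased errors, and the function name and the sigma values in the usage note are hypothetical.

```python
import math
from typing import List, Optional, Tuple


def fuse_positions(estimates: List[Tuple[float, float]]) -> Optional[Tuple[float, float]]:
    """Fuse (position_m, sigma_m) pairs by inverse-variance weighting.

    Returns (fused_position_m, fused_sigma_m), or None when no source is
    available so the caller can trigger its fallback behaviour.
    """
    if not estimates:
        return None
    weights = [1.0 / (sigma ** 2) for _, sigma in estimates]
    total = sum(weights)
    position = sum(w * pos for w, (pos, _) in zip(weights, estimates)) / total
    return position, math.sqrt(1.0 / total)
```

For example, `fuse_positions([(10.0, 3.0), (12.0, 0.05)])` combines a GPS-like fix (3 m sigma) with a tag-like fix (5 cm sigma) and lands very close to 12.0, because the precise source dominates. Passing an empty list models the case where every source has dropped out.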
- "One Model Fits All"
Wrong: Using the same YOLO model for all scenarios.
Right: Model selection by constraint (latencies are indicative and depend on hardware, input size, and precision):

| Constraint | Model | Notes |
|---|---|---|
| Latency critical | YOLOv8n | 6ms inference |
| Balanced | YOLOv8s | 15ms, better accuracy |
| Accuracy first | YOLOv8x | 50ms, highest mAP |
| Edge device | YOLOv8n + TensorRT | 3ms on Jetson |

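
A sketch of how the table could drive selection from a frame-rate target. The latencies are the indicative figures from the table and should be re-measured on the target hardware; the function names and the per-frame overhead parameter are hypothetical.

```python
from typing import Optional

# (model name, indicative latency in ms), ordered from least to most accurate.
MODELS = [("YOLOv8n", 6.0), ("YOLOv8s", 15.0), ("YOLOv8x", 50.0)]


def latency_budget_ms(target_fps: float, overhead_ms: float) -> float:
    """Time left for inference per frame after capture, tracking, and control."""
    return 1000.0 / target_fps - overhead_ms


def pick_model(budget_ms: float) -> Optional[str]:
    """Return the most accurate model that fits the budget, or None if none fits."""
    best = None
    for name, latency in MODELS:
        if latency <= budget_ms:
            best = name
    return best
```

At 30 fps with 10 ms of other per-frame work the budget is about 23.3 ms, so `pick_model(latency_budget_ms(30, 10))` selects `YOLOv8s`. A `None` result signals that the frame rate, the resolution, or the hardware has to change.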
Problem-Solving Framework
- Constraint Analysis
- Compute: What hardware? (Jetson Nano ≈ 0.5 TFLOPS FP16, AGX Xavier ≈ 32 TOPS INT8)
- Power: Battery capacity? Flight time impact?
- Latency: Control loop rate? Detection response time?
- Weight: Payload capacity? Center of gravity?
- Environment: Indoor/outdoor? GPS available? Lighting conditions?
- Algorithm Selection Matrix

| Problem | Classical Approach | Deep Learning | When to Use Each |
|---|---|---|---|
| Feature tracking | KLT optical flow | FlowNet | Classical: Real-time, limited compute. DL: Robust, more compute |
| Object detection | HOG+SVM | YOLO/SSD | Classical: Simple objects, no GPU. DL: Complex, GPU available |
| SLAM | ORB-SLAM | DROID-SLAM | Classical: Mature, debuggable. DL: Better in challenging scenes |
| Path planning | A*, RRT | RL-based | Classical: Known environments. DL: Complex, dynamic |

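
As a concrete instance of the classical path-planning option, here is a minimal A* sketch on a 4-connected occupancy grid with unit step cost and a Manhattan-distance heuristic. The grid encoding (0 = free, 1 = obstacle) and the function name are assumptions made for the example.

```python
import heapq


def astar(grid, start, goal):
    """Shortest path on a 4-connected grid; cells are (row, col) tuples.

    grid is a list of rows with 0 = free and 1 = obstacle. Returns the path
    as a list of cells from start to goal, or None if the goal is unreachable.
    """
    def heuristic(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]   # (f = g + h, g, cell)
    came_from = {start: None}
    g_cost = {start: 0}

    while open_heap:
        _, g, current = heapq.heappop(open_heap)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if g > g_cost[current]:
            continue                              # stale heap entry
        row, col = current
        for nxt in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            r, c = nxt
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0:
                new_g = g + 1
                if new_g < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = new_g
                    came_from[nxt] = current
                    heapq.heappush(open_heap, (new_g + heuristic(nxt), new_g, nxt))
    return None
```

This fits the "known environments" case in the table: the map must exist before planning starts, and any change to the map requires replanning.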
- Safety Checklist
- Kill switch tested and accessible
- Geofence configured
- Return-to-home altitude set
- Low battery action defined
- Signal loss action defined
- Propeller guards (if applicable)
- Pre-flight sensor calibration
- Weather conditions checked
Quick Reference Tables
MAVLink Message Types
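
| Message | Purpose | Typical rate |
|---|---|---|
| HEARTBEAT | Connection alive | 1 Hz |
| ATTITUDE | Roll/pitch/yaw | 10-100 Hz |
| LOCAL_POSITION_NED | Position | 10-50 Hz |
| GPS_RAW_INT | Raw GPS | 1-10 Hz |
| SET_POSITION_TARGET_LOCAL_NED | Position/velocity setpoint commands | As needed |
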
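
Because HEARTBEAT arrives at a nominal 1 Hz, a missing heartbeat is a natural trigger for a signal-loss action. Below is a library-independent sketch of such a watchdog. The class name and the 3-second timeout are assumptions, and timestamps are passed in explicitly (for example from `time.monotonic()`) so the logic is easy to test.

```python
from typing import Optional


class HeartbeatWatchdog:
    """Tracks the last HEARTBEAT time and reports whether the link looks alive."""

    def __init__(self, timeout_s: float = 3.0):
        self.timeout_s = timeout_s            # assumed: about three missed beats
        self.last_seen_s: Optional[float] = None

    def on_heartbeat(self, now_s: float) -> None:
        """Call whenever a HEARTBEAT message is received."""
        self.last_seen_s = now_s

    def link_ok(self, now_s: float) -> bool:
        """False until the first heartbeat, and again once the timeout elapses."""
        if self.last_seen_s is None:
            return False
        return (now_s - self.last_seen_s) <= self.timeout_s
```

The receive loop of whatever MAVLink library is in use would call `on_heartbeat`, while the control loop polls `link_ok` and switches to the configured failsafe when it returns `False`.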
Kalman Filter Tuning

| Matrix | High Values | Low Values |
|---|---|---|
| Q (process noise) | Trust measurements more | Trust model more |
| R (measurement noise) | Trust model more | Trust measurements more |
| P (initial covariance) | Uncertain initial state | Confident initial state |

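
The effect of Q and R is easiest to see in a scalar filter. The sketch below assumes a constant-state model (the prediction step only inflates the covariance by `q`); the function name and the numbers in the usage note are illustrative.

```python
def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a constant-state model.

    q: process noise variance, r: measurement noise variance,
    x0/p0: initial state estimate and its variance.
    Returns the state estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q                 # predict: state unchanged, uncertainty grows
        k = p / (p + r)           # Kalman gain in [0, 1]
        x = x + k * (z - x)       # update: move toward the measurement by k
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates
```

Feeding the sequence `[0, 0, 0, 0, 10]` with `q=10, r=0.1` makes the final estimate jump almost all the way to 10 (measurements trusted), while `q=1e-4, r=10` leaves it below 1 (model trusted), matching the table.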
Common Coordinate Frames

| Frame | Origin | Axes | Use |
|---|---|---|---|
| NED | Takeoff point | North-East-Down | Navigation |
| ENU | Takeoff point | East-North-Up | ROS standard |
| Body | Drone CG | Forward-Right-Down | Control |
| Camera | Lens center | Right-Down-Forward | Vision |

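
Mixing NED and ENU is a common source of sign errors, so a small conversion helper is worth having. This sketch covers position vectors only, assuming both frames share the same origin; the function names are hypothetical, and converting orientations needs more than this axis swap.

```python
def ned_to_enu(north: float, east: float, down: float):
    """Convert a position from North-East-Down to East-North-Up."""
    return (east, north, -down)


def enu_to_ned(east: float, north: float, up: float):
    """Convert a position from East-North-Up to North-East-Down."""
    return (north, east, -up)
```

A drone 10 m above the takeoff point has `down = -10` in NED and `up = 10` in ENU, so `ned_to_enu(1.0, 2.0, -10.0)` gives `(2.0, 1.0, 10.0)`.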
Reference Files
Detailed implementations in references/:
- navigation-algorithms.md: SLAM, path planning, localization
- sensor-fusion-ekf.md: Kalman filters, multi-sensor fusion
- object-detection-tracking.md: YOLO, ByteTrack, optical flow
Simulation Tools

| Tool | Strengths | Weaknesses | Best For |
|---|---|---|---|
| Gazebo | ROS integration, physics | Graphics quality | ROS development |
| AirSim | Photorealistic, CV-focused | Windows-centric | Vision algorithms |
| Webots | Multi-robot, accessible | Less drone-specific | Swarm simulations |
| MATLAB/Simulink | Control design | Not real-time | Controller tuning |

Emerging Technologies (2024-2025)
- Event cameras: 1μs temporal resolution, no motion blur
- Neuromorphic computing: Loihi 2 for ultra-low-power inference
- 4D Radar: Velocity + 3D position, works in all weather
- Swarm autonomy: Decentralized coordination, emergent behavior
- Foundation models: SAM, CLIP for zero-shot detection
Integration Points
- drone-inspection-specialist: Domain-specific detection (fire, damage, thermal)
- metal-shader-expert: GPU-accelerated vision processing, custom shaders
- collage-layout-expert: Report generation, visual composition
Key Principle: In drone systems, reliability trumps performance. A 95% accurate system that never crashes is better than a 99% accurate one that fails unpredictably. Always have fallbacks.