Abstract
Deep Reinforcement Learning (DRL) offers significant potential for real-world robotics applications, yet sim-to-real transfer remains a major challenge. In this work, we propose using the Twin-Delayed Deep Deterministic Policy Gradient (TD3) algorithm to train a docking policy efficiently. We tailor the reward function to encourage smooth docking. Additionally, we employ a 'training wheels' approach that initially fills the replay buffer with PID controller demonstrations to accelerate learning. The resulting model is applied to visual servoing and docking tasks, using AprilTag markers for localization. Real-world experiments validate our approach.
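The 'training wheels' idea of seeding the replay buffer with PID demonstrations can be illustrated with a minimal sketch. It is not the authors' implementation. The 1-D vehicle model, the PID gains, the reward shape, and all names (`PID`, `step`, `prefill_buffer`) are assumptions chosen only to show the mechanism: a PID controller drives a toy vehicle toward a target, and each resulting transition is stored in the buffer before an off-policy learner such as TD3 would begin sampling from it.

```python
import random
from collections import deque


class PID:
    """Textbook PID controller with output clipped to [-1, 1] (hypothetical gains)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def act(self, error):
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0  # no previous sample on the first call
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        u = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(-1.0, min(1.0, u))


def step(pos, vel, action, dt=0.1, drag=0.5):
    """Toy 1-D vehicle (assumed): thrust minus linear drag, Euler integration."""
    vel = vel + (action - drag * vel) * dt
    pos = pos + vel * dt
    return pos, vel


def prefill_buffer(buffer, episodes=20, horizon=100, dt=0.1, seed=0):
    """Fill `buffer` with (state, action, reward, next_state, done) tuples from PID rollouts."""
    rng = random.Random(seed)
    for _ in range(episodes):
        pid = PID(kp=1.0, ki=0.0, kd=0.5, dt=dt)
        pos, vel = rng.uniform(-2.0, 2.0), 0.0
        for t in range(horizon):
            state = (pos, vel)
            action = pid.act(-pos)  # the docking target sits at position 0
            pos, vel = step(pos, vel, action, dt)
            # Assumed reward: penalize distance to target and control effort.
            reward = -abs(pos) - 0.1 * abs(action)
            done = t == horizon - 1
            buffer.append((state, action, reward, (pos, vel), done))
    return buffer


buffer = prefill_buffer(deque(maxlen=100_000))
print(len(buffer))  # 20 episodes x 100 steps
```

In an actual training setup, the learner would draw its first minibatches from this pre-filled buffer instead of from purely random exploration. Whether the demonstrations are later mixed with or replaced by the agent's own experience is a design choice that the abstract does not specify.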
| Original language | English |
|---|---|
| Title of host publication | OCEANS 2025 Brest |
| Publisher | IEEE |
| ISBN (Electronic) | 9798331537470 |
| ISBN (Print) | 9798331537487 |
| DOIs | |
| Publication status | Published - 11 Aug 2025 |
| Event | OCEANS 2025 Brest Conference, Le Quartz, Brest, France. Duration: 16 Jun 2025 → 19 Jun 2025. https://brest25.oceansconference.org/ |
Conference
| Conference | OCEANS 2025 Brest Conference |
|---|---|
| Country/Territory | France |
| City | Brest |
| Period | 16/06/25 → 19/06/25 |
| Internet address | https://brest25.oceansconference.org/ |
Keywords
- Training
- Location awareness
- Robust control
- Autonomous underwater vehicles
- PI control
- Wheels
- Deep reinforcement learning
- Visual servoing
- Robustness
- Vehicle dynamics
ASJC Scopus subject areas
- Oceanography
- Ocean Engineering