Abstract
Recent work has shown that even superhuman reinforcement learning (RL) policies can be vulnerable to adversarial agents. Most existing approaches for generating such adversaries rely on RL-based methods similar to those used to train the original policy under attack, potentially limiting the diversity of discovered exploits. We present a proof of concept showing that genetic programming (GP) can evolve symbolic adversarial agents that expose flaws in trained RL policies. By framing adversarial discovery as a program synthesis task, our approach enables broader and more interpretable search than conventional methods. We evaluate this approach in two competitive game environments against agents trained by OpenAI, showing that GP-evolved agents can outperform RL-based adversaries. These early results suggest that GP is not only effective for discovering unconventional exploits, but may serve as a useful stress-testing tool for RL systems more generally.
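The abstract describes evolving symbolic adversarial policies via genetic programming rather than RL. As a purely illustrative sketch of that framing (not the paper's method or environments), the toy example below evolves expression trees that map an observation to an adversary action against a fixed stand-in "victim" policy; the victim function, payoff, primitive set, and evolutionary loop are all hypothetical choices made for the sketch.

```python
import math
import random

random.seed(0)  # deterministic for reproducibility

# Hypothetical stand-in for a trained policy under attack: maps an
# observation x to an action in [-1, 1]. (Not one of the paper's agents.)
def victim_policy(x):
    return math.sin(x)

# Toy payoff (assumed for illustration): the adversary is rewarded for
# matching the sign of the victim's action, averaged over sampled observations.
OBS = [i * 0.25 - 3.0 for i in range(25)]

def fitness(program):
    return sum(max(-1.0, min(1.0, eval_tree(program, x))) * victim_policy(x)
               for x in OBS) / len(OBS)

# Programs are expression trees: ('x',), ('const', c),
# ('sin', child), or (binary_op, left, right).
PRIMS = ['add', 'sub', 'mul', 'sin']

def eval_tree(t, x):
    op = t[0]
    if op == 'x':
        return x
    if op == 'const':
        return t[1]
    if op == 'sin':
        return math.sin(eval_tree(t[1], x))
    a, b = eval_tree(t[1], x), eval_tree(t[2], x)
    if op == 'add':
        return a + b
    if op == 'sub':
        return a - b
    return a * b  # 'mul'

def random_tree(depth=3):
    # Grow a random tree, biased toward terminals near the depth limit.
    if depth <= 0 or random.random() < 0.3:
        return ('x',) if random.random() < 0.5 else ('const', random.uniform(-2, 2))
    op = random.choice(PRIMS)
    if op == 'sin':
        return (op, random_tree(depth - 1))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def mutate(t, depth=3):
    # Subtree mutation: replace a random subtree with a fresh one.
    if random.random() < 0.3 or t[0] in ('x', 'const'):
        return random_tree(depth)
    return (t[0],) + tuple(mutate(c, depth - 1) for c in t[1:])

# Simple truncation-selection evolutionary loop (a (mu + lambda)-style scheme).
pop = [random_tree() for _ in range(30)]
for gen in range(20):
    pop.sort(key=fitness, reverse=True)
    pop = pop[:10] + [mutate(random.choice(pop[:10])) for _ in range(20)]

best = max(pop, key=fitness)
print(best, round(fitness(best), 3))
```

A symbolic result like this is directly inspectable, which is the interpretability advantage the abstract alludes to: the evolved tree can be read as a closed-form exploit rather than an opaque network.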
| Original language | English |
|---|---|
| Publication status | Published - 5 Sept 2025 |
| Event | 24th UK Workshop in Computational Intelligence 2025 - Edinburgh, United Kingdom (3 Sept 2025 → 5 Sept 2025) |
Conference
| Conference | 24th UK Workshop in Computational Intelligence 2025 |
|---|---|
| Abbreviated title | UKCI 2025 |
| Country/Territory | United Kingdom |
| City | Edinburgh |
| Period | 3/09/25 → 5/09/25 |