Optimistic Exploration for Risk-Averse Constrained Reinforcement Learning
Reinforcement Learning (RL) has revolutionized how agents learn to act in complex environments. But what happens when an agent can’t afford to make mistakes, because a mistake means a car crash, a system failure, or a violated energy limit? In such cases, we turn to Constrained Reinforcement Learning (CRL), where agents aim to maximize reward while staying within safety or cost constraints. Unfortunately, current CRL methods often become too cautious, which leads to poor performance. ...