Off-Policy Safe Reinforcement Learning with Constrained Optimistic Exploration

G. Li, M. T. J. Spaan, J. F. P. Kooij, International Conference on Learning Representations (ICLR), 2026.



Cite this work

@inproceedings{coxq,
    title={{Off-Policy Safe Reinforcement Learning with Constrained Optimistic Exploration}},
    author={G. Li and M. T. J. Spaan and J. F. P. Kooij},
    booktitle={International Conference on Learning Representations (ICLR)},
    pages={},
    year={2026},
    doi={},
}
