Off-Policy Safe Reinforcement Learning with Cost-Constrained Optimistic Exploration
G. Li, M. T. J. Spaan, J. F. P. Kooij, International Conference on Learning Representations (ICLR), 2026. *to appear*
Cite this work
@inproceedings{coxq,
title={{Off-Policy Safe Reinforcement Learning with Cost-Constrained Optimistic Exploration}},
author={G. Li and M. T. J. Spaan and J. F. P. Kooij},
booktitle={International Conference on Learning Representations (ICLR)},
pages={},
year={2026},
doi={},
}