Trust Region Optimization for Large Language Models (TROLL)

Project page and codebase for the paper "TROLL: Trust Regions improve Reinforcement Learning for Large Language Models". Code coming soon!