PRIME-RL

P1 Public

P1: Mastering Physics Olympiads with Reinforcement Learning

SimpleVLA-RL Public

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Python 1.2k 70

Entropy-Mechanism-of-RL Public

The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.

Python 408 15

RL-Compositionality Public

From $f(x)$ and $g(x)$ to $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones

Python 56 5

TTRL Public

[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning

Python 953 65

PRIME Public

Scalable RL solution for advanced reasoning of language models

Python 1.8k 101
