Welcome to RAGEN's Tutorial!¶
🚀 Introduction¶
RAGEN (Reinforcement learning AGENt) is a reproduction of the DeepSeek-R1(-Zero) framework for training agentic models, built on top of verl.
Key Features¶
- Feature 1: Supports multiple RL algorithms, including PPO and GRPO.
- Feature 2: Supports multi-turn online RL training for agentic models.
- Feature 3: Easily extensible to other Gym environments (see the sketch after this list).
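To illustrate what "extending to a Gym environment" involves, here is a minimal sketch of a custom environment exposing the standard Gymnasium reset/step interface. The environment name and its task are hypothetical and for illustration only; this is not RAGEN's actual environment-registration API.

```python
# Minimal sketch of a Gym-style environment (hypothetical toy task;
# not RAGEN's actual registration API). It shows only the standard
# reset/step interface a new environment would need to expose.
import gymnasium as gym


class CountingEnv(gym.Env):
    """Toy task: the agent must output the next digit in a cyclic sequence."""

    def __init__(self, max_steps: int = 5):
        super().__init__()
        self.max_steps = max_steps
        self.action_space = gym.spaces.Discrete(10)       # digits 0-9
        self.observation_space = gym.spaces.Discrete(10)  # current digit

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.t = 0
        self.state = int(self.np_random.integers(0, 10))
        return self.state, {}

    def step(self, action):
        # Reward 1 if the agent predicted the next digit, else 0.
        reward = 1.0 if action == (self.state + 1) % 10 else 0.0
        self.state = (self.state + 1) % 10
        self.t += 1
        terminated = self.t >= self.max_steps
        return self.state, reward, terminated, False, {}
```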
📚 Documentation Structure¶
Updates¶
- Updates: Our latest updates and changelog
Quick Start¶
- Installation: Get RAGEN up and running
- Quick Start Guide: Your first steps with RAGEN
Configurations¶
- Config Explanation: Understanding RAGEN's configuration system
Examples¶
- Sokoban: Complex puzzle environment
- Bi-arm Bandit: A classic exploration-vs-exploitation task
- FrozenLake: Grid-world environment example (a standalone rollout sketch follows below)
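For context on the FrozenLake example, the snippet below runs a random-agent rollout on the standard Gymnasium `FrozenLake-v1` environment. It only demonstrates the underlying grid-world interaction loop; the wrapped environment used inside RAGEN may expose a different interface.

```python
# Random-agent rollout on the standard Gymnasium FrozenLake-v1 environment.
# Illustrative only; RAGEN's wrapped FrozenLake environment may differ.
import gymnasium as gym

env = gym.make("FrozenLake-v1", is_slippery=True)
obs, info = env.reset(seed=0)

done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()  # random policy for illustration
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated

print(f"Episode finished with return {total_reward}")
```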
🤝 Contributing¶
We welcome contributions! Whether you're fixing bugs, adding new features, or improving documentation, please feel free to make a pull request.
📖 Citation¶
If you find RAGEN helpful in your research or project, please cite our work:
@misc{ragen,
      title={RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning},
      author={Zihan Wang and Kangrui Wang and Qineng Wang and Pingyue Zhang and Linjie Li and Zhengyuan Yang and Kefan Yu and Minh Nhat Nguyen and Licheng Liu and Eli Gottlieb and Monica Lam and Yiping Lu and Kyunghyun Cho and Jiajun Wu and Li Fei-Fei and Lijuan Wang and Yejin Choi and Manling Li},
      year={2025},
      eprint={2504.20073},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2504.20073},
}
📝 License¶
This project is released under the Apache-2.0 license.
Ready to get started? Head over to our Installation Guide!