PD Disaggregation

slime supports Prefill and Decode disaggregation (PD Disaggregation).

Set the number of servers dedicated to Prefill with the --prefill-num-servers argument.
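
As a rough illustration, the sketch below appends the flag to a launch command built in Python. Only --prefill-num-servers comes from this page. The helper name with_pd_disaggregation, the train.py entry point, and the value 2 are hypothetical placeholders, so substitute whatever your own launch script uses.

```python
import shlex


def with_pd_disaggregation(base_args, prefill_num_servers):
    """Return a copy of base_args with the Prefill server count appended.

    Hypothetical helper for illustration; it is not part of slime.
    """
    return [*base_args, "--prefill-num-servers", str(prefill_num_servers)]


# Placeholder launch command (assumption): replace with your real entry point and arguments.
base_args = ["python", "train.py"]

# Request 2 Prefill servers; the value is an arbitrary example.
command = with_pd_disaggregation(base_args, 2)

# Print the command as a shell-safe string for inspection.
print(shlex.join(command))
```

Building the command as a list and leaving base_args untouched makes it easy to compare runs with and without the flag.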

We recommend using PD Disaggregation for multi-turn/agentic RL training.