PD Disaggregation
slime supports Prefill and Decode disaggregation (PD Disaggregation).
Use the --prefill-num-servers argument to set the number of servers used for Prefill.
We recommend using PD Disaggregation for multi-turn/agentic RL training.
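
To make the effect of --prefill-num-servers concrete, here is a minimal sketch of how a server pool might be divided between Prefill and Decode. This is not slime's implementation: only the flag name comes from the text above. The integer type, the default of None, the helper split_servers, and the assumption that every server not used for Prefill serves Decode are all hypothetical and chosen for illustration.

```python
import argparse
from typing import Optional, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser exposing only the flag discussed above."""
    parser = argparse.ArgumentParser()
    # Flag name is from the documentation; type and default are assumptions.
    parser.add_argument("--prefill-num-servers", type=int, default=None)
    return parser


def split_servers(total_servers: int, prefill_num_servers: Optional[int]) -> Tuple[int, int]:
    """Return (prefill, decode) server counts.

    Hypothetical helper: assumes servers not assigned to Prefill serve Decode,
    and that leaving the flag unset means no disaggregation.
    """
    if prefill_num_servers is None:
        return 0, total_servers
    if not 0 < prefill_num_servers < total_servers:
        raise ValueError("prefill_num_servers must be between 1 and total_servers - 1")
    return prefill_num_servers, total_servers - prefill_num_servers


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--prefill-num-servers", "2"])
prefill, decode = split_servers(8, args.prefill_num_servers)
print(f"prefill servers: {prefill}, decode servers: {decode}")
```

Under these assumptions, the sketch rejects a Prefill count that would leave no servers for Decode, since a disaggregated setup needs at least one server in each role.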