CI (Continuous Integration)
slime uses GitHub Actions for CI. Tests are triggered by PR labels — adding a specific label to a PR will run the corresponding test suite.
How It Works
The workflow is defined in .github/workflows/pr-test.yml (auto-generated from pr-test.yml.j2). Each CI job:
1. Runs on a self-hosted GPU runner inside a Docker container (slimerl/slime:latest).
2. Installs slime with pip install -e . --no-deps.
3. Acquires the required GPUs via tests/ci/gpu_lock_exec.py --count <num_gpus>.
4. Executes the test file with python <test_path>.py or python tests/<test_file>.py, depending on whether the test lives under tests/ or a subdirectory such as tests/plugin_contracts/.
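The choice between the two command forms in the last step can be sketched as follows. This is an illustrative helper, not code taken from the workflow: the name build_test_command is hypothetical, and it assumes that a bare file name refers to a test directly under tests/ while an entry with a directory component is already a full path.

```python
from pathlib import PurePosixPath


def build_test_command(test_entry: str) -> str:
    """Return the shell command for a test entry (hypothetical helper).

    Assumption: a bare file name such as "test_foo.py" lives directly under
    tests/, while an entry that already contains a directory is used as given.
    """
    path = PurePosixPath(test_entry)
    if len(path.parts) == 1:
        # No directory component: prepend tests/.
        path = PurePosixPath("tests") / path
    return f"python {path}"


if __name__ == "__main__":
    print(build_test_command("test_example.py"))
    print(build_test_command("tests/plugin_contracts/test_example.py"))
```

The actual template may resolve paths differently; the point is only that both forms end up invoking the test file directly with python.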
Each test file follows a standard pattern: a prepare() function downloads models/datasets, and an execute() function builds CLI arguments and calls U.execute_train(...).
CI Labels
Add a label to your PR to trigger the corresponding test suite:
| Label | Job | Description |
|---|---|---|
|  |  | Lightweight smoke tests with Qwen2.5-0.5B (4 GPUs). Fast feedback loop. |
|  |  | FSDP backend tests (true on-policy, VL, megatron-fsdp alignment). |
| run-ci-megatron |  | Core Megatron training tests covering dense, MoE, PPO, MTP, OPD, etc. |
|  |  | Numerical precision validation (parallel check, megatron-fsdp alignment). |
|  |  | Checkpoint save/load correctness (sync and async-save). |
| run-ci-image |  | Full test suite run on slimerl/slime-test:latest. |
| run-ci-changed |  | Dynamically detects new/modified test files in the PR and runs only those. |
All of these test suites also run when the workflow is triggered via workflow_dispatch (a manual run from the Actions tab).
Key Labels Explained
run-ci-changed: Run Only New or Modified Tests
This is the most useful label for development. When you add a new test file or modify an existing one, just add run-ci-changed to your PR and CI will:
1. Detect which tests/test_*.py or tests/plugin_contracts/test_*.py files are added or modified relative to origin/main (via git diff --diff-filter=AM).
2. Extract the NUM_GPUS value from each detected test file automatically.
3. Build a dynamic GitHub Actions matrix and run each test in parallel.
This means you don’t need to manually register your new test in the workflow — just make sure your test file has a top-level NUM_GPUS = <N> constant and run-ci-changed will pick it up.
Example: If your PR adds tests/test_qwen3_8B_opd_sglang.py with NUM_GPUS = 8, adding the run-ci-changed label will automatically run that test on 8 GPUs.
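The three steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the workflow's actual script: the function names, the matrix keys test_file and num_gpus, and the choice to skip files without a NUM_GPUS constant are all hypothetical. The input is assumed to be a mapping from changed paths (as would be listed by something like git diff --name-only --diff-filter=AM origin/main) to file contents.

```python
import json
import re

# Matches tests/test_*.py and tests/plugin_contracts/test_*.py only.
TEST_PATH_RE = re.compile(r"tests/(?:plugin_contracts/)?test_[^/]*\.py")
# Matches a top-level (unindented) integer assignment to NUM_GPUS.
NUM_GPUS_RE = re.compile(r"^NUM_GPUS\s*=\s*(\d+)", re.MULTILINE)


def extract_num_gpus(source: str):
    """Return the top-level NUM_GPUS value, or None if it is not declared."""
    match = NUM_GPUS_RE.search(source)
    return int(match.group(1)) if match else None


def build_matrix(changed: dict) -> dict:
    """Build a matrix dict from {path: file contents} of added/modified files."""
    include = []
    for path in sorted(changed):
        if not TEST_PATH_RE.fullmatch(path):
            continue  # not a test file in one of the two watched locations
        num_gpus = extract_num_gpus(changed[path])
        if num_gpus is None:
            continue  # assumption: files without NUM_GPUS are skipped
        include.append({"test_file": path, "num_gpus": num_gpus})
    return {"include": include}


if __name__ == "__main__":
    changed = {
        "tests/test_qwen3_8B_opd_sglang.py": "import os\nNUM_GPUS = 8\n",
        "README.md": "docs only\n",
    }
    print(json.dumps(build_matrix(changed)))
```

Reading NUM_GPUS with a regular expression rather than importing the test file keeps detection cheap and avoids executing test code on the runner that only builds the matrix.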
run-ci-image: Full Suite on Test Image
This runs all registered tests on the slimerl/slime-test:latest Docker image. Use this label to:
Validate a newly built Docker image before release.
Run the entire test suite for a comprehensive pre-merge check.
Since this includes every test, it consumes significant GPU time — use it sparingly and prefer more targeted labels for routine development.
run-ci-megatron: Core Megatron Tests
This is the primary label for validating Megatron-backend changes. It covers:
Dense models: GLM4-9B, Qwen3-4B (PPO)
MoE models: Qwen3-30B-A3B (with/without DeepEP + FP8), Moonlight-16B-A3B
Specialized: MiMo-7B MTP, Qwen2.5-0.5B debug rollout-then-train, OPD with sglang teacher
All tests use 8 GPUs. If you are modifying Megatron training logic, loss computation, or checkpoint conversion, this is the label to use.
Writing a New Test
Create tests/test_<your_test_name>.py following the standard pattern:
import os

import slime.utils.external_utils.command_utils as U

MODEL_NAME = "Qwen2.5-0.5B-Instruct"
MODEL_TYPE = "qwen2.5-0.5B"
NUM_GPUS = 4  # This constant is used by run-ci-changed


def prepare():
    U.exec_command("mkdir -p /root/models /root/datasets")
    U.exec_command(f"huggingface-cli download Qwen/{MODEL_NAME} --local-dir /root/models/{MODEL_NAME}")
    # Download datasets as needed ...


def execute():
    # Build argument strings and call U.execute_train(...)
    ...


if __name__ == "__main__":
    prepare()
    # Clear proxy settings before launching training.
    for proxy_var in ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY"):
        os.environ.pop(proxy_var, None)
    execute()
For quick validation: push your test file and add run-ci-changed to the PR. It will be auto-detected.
To register in a permanent label group: edit .github/workflows/pr-test.yml.j2, add an entry to the desired job’s tests list, then regenerate:
cd .github/workflows && python generate_github_workflows.py
Remember to commit both the .j2 and the generated .yml file.
Workflow Generation
The workflow file pr-test.yml is auto-generated from the Jinja2 template pr-test.yml.j2. Do not edit pr-test.yml directly. To make changes:
1. Edit .github/workflows/pr-test.yml.j2.
2. Run python .github/workflows/generate_github_workflows.py.
3. Commit both files.
Customization Contract Tests
For CPU-only contract tests that validate hooks loaded from function paths, run:
python -m pytest \
tests/plugin_contracts/test_plugin_rollout_contracts.py \
tests/plugin_contracts/test_plugin_generate_contracts.py \
tests/plugin_contracts/test_plugin_path_loading_contracts.py \
tests/plugin_contracts/test_plugin_runtime_hook_contracts.py
These files also support direct execution as python tests/plugin_contracts/<file>.py. They declare NUM_GPUS = 0, so run-ci-changed can pick them up without treating them as GPU-heavy end-to-end tests.
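To give a feel for what a CPU-only contract test of this kind might check, here is a minimal sketch. It does not reproduce slime's loader or its test files: load_function is a hypothetical stand-in that resolves a dotted function path with importlib, and json.dumps stands in for a user-provided hook.

```python
import importlib

NUM_GPUS = 0  # CPU-only, matching what the contract tests declare


def load_function(path: str):
    """Resolve a dotted path such as "package.module.func" to a callable.

    Hypothetical stand-in for a function-path loader; the real one may differ.
    """
    module_name, _, attr = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.function', got {path!r}")
    func = getattr(importlib.import_module(module_name), attr)
    if not callable(func):
        raise TypeError(f"{path} is not callable")
    return func


def test_load_function_resolves_callable():
    # json.dumps stands in for a hook supplied by the user.
    dumps = load_function("json.dumps")
    assert dumps({"ok": True}) == '{"ok": true}'


if __name__ == "__main__":
    # Allows direct execution with python, in addition to pytest discovery.
    test_load_function_resolves_callable()
    print("contract checks passed")
```

Keeping such checks free of GPU work is what lets them run both under pytest and as plain scripts.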