Commit 57745af

fix: incorrect paths in document and bugs in dynamic scheduling (RLinf#303)
Signed-off-by: Lin-xs <1833080950@qq.com>
1 parent 3945e2d commit 57745af

File tree

20 files changed: +36 -114 lines changed

.github/CODEOWNERS

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 /docker @andylin-hao

 /examples/embodiment @guozhen1997
-/examples/math @guozhen1997 @andylin-hao @Lin-xs
+/examples/reasoning @guozhen1997 @andylin-hao @Lin-xs

 /ray_utils @Lin-xs
 /requirements @andylin-hao

docs/source-en/rst_source/examples/reasoning.rst

Lines changed: 3 additions & 3 deletions
@@ -102,8 +102,8 @@ Before launching, check the configuration file. Key fields include:

 Recommended configurations can be found in:

-- ``examples/math/config/qwen2.5-1.5b-grpo-megatron.yaml``
-- ``examples/math/config/qwen2.5-7b-grpo-megatron.yaml``
+- ``examples/reasoning/config/math/qwen2.5-1.5b-grpo-megatron.yaml``
+- ``examples/reasoning/config/math/qwen2.5-7b-grpo-megatron.yaml``

 **3. Launch Command**

@@ -118,7 +118,7 @@ Run the following commands to start the Ray cluster and begin training:
     if [ "$RANK" -eq 0 ]; then
         bash check_ray.sh 128; # set to number of accelerators/GPUs in the cluster
         cd /path_to_RLinf;
-        bash examples/math/qwen2.5/run_main_math_grpo_megatron.sh grpo-1.5b-megatron # change config file
+        bash examples/reasoning/run_main_grpo_math.sh qwen2.5-1.5b-grpo-megatron # change config file
     else
         if [ "$RANK" -eq 1 ]; then
             sleep 3m

docs/source-en/rst_source/start/distribute.rst

Lines changed: 5 additions & 5 deletions
@@ -84,7 +84,7 @@ Edit the sample YAML:

 .. code-block:: yaml

-    # examples/math/config/qwen2.5-1.5b-grpo-megatron.yaml
+    # examples/reasoning/config/math/qwen2.5-1.5b-grpo-megatron.yaml
     cluster:
       num_nodes: 4 # adapt to your cluster
       component_placement:

@@ -94,19 +94,19 @@ Launch from the head node:

 .. code-block:: bash

-    bash examples/math/run_main_math_grpo_megatron.sh \
+    bash examples/reasoning/run_main_grpo_math.sh \
         qwen2.5-1.5b-grpo-megatron


 Disaggregated
 ^^^^^^^^^^^^^^^^^^

 Different stages receive disjoint GPU ranges,
-allowing fine-grained pipeliningng. Edit the pipeline YAML:
+allowing fine-grained pipelining. Edit the pipeline YAML:

 .. code-block:: yaml

-    # examples/math/config/qwen2.5-1.5b-grpo-megatron-pipeline.yaml
+    # examples/reasoning/config/math/qwen2.5-1.5b-grpo-megatron-pipeline.yaml
     cluster:
       num_nodes: 4
       component_placement:

@@ -122,5 +122,5 @@ Start the job:

 .. code-block:: bash

-    bash examples/math/run_main_math_pipeline_grpo_megatron.sh \
+    bash examples/reasoning/run_main_grpo_math.sh \
         qwen2.5-1.5b-grpo-megatron-pipeline

docs/source-en/rst_source/start/llm.rst

Lines changed: 2 additions & 2 deletions
@@ -49,7 +49,7 @@ Launch Training
 For user convenience, our configuration file is set up to run with a single GPU by default.
 However, if you have multiple GPUs and wish to accelerate the quickstart process,
 we highly recommend updating the following configuration option in
-``./examples/math/config/qwen2.5-1.5b-single-gpu.yaml``:
+``./examples/reasoning/config/math/qwen2.5-1.5b-single-gpu.yaml``:
 ``cluster.component_placement``.


@@ -75,7 +75,7 @@ After these modifications, launch the following script to start training!

 .. code-block:: bash

-    bash examples/math/run_main_math_grpo_megatron.sh qwen2.5-1.5b-single-gpu
+    bash examples/reasoning/run_main_grpo_math.sh qwen2.5-1.5b-single-gpu

 **Step 3: View the results:**


docs/source-en/rst_source/tutorials/advance/version.rst

Lines changed: 2 additions & 42 deletions
@@ -6,7 +6,7 @@ reinforcement-learning pipeline. For the current release **SGLang and vLLM** is

 .. note::

-   RLinf is compatible with **SGLang 0.4.4 → 0.4.9**, **vLLM 0.8.5 → 0.8.5.post1**.
+   RLinf is compatible with **SGLang 0.4.4 → 0.5.2**, **vLLM 0.8.5 → 0.8.5.post1**.
    No manual patching is required – the framework detects the installed
    version and loads the matching shim automatically.

@@ -35,7 +35,7 @@ Install via pip
    pip install sglang==0.4.8

    # Latest supported
-   pip install sglang==0.4.9
+   pip install sglang==0.5.2

    # Install vLLM
    pip install vllm==0.8.5

@@ -106,43 +106,3 @@ Install from Source
    cuda_graph_max_bs: 128 # the maximum batch size for cuda graph. If the batch size is larger than this, cuda graph will not be used.

    ...
-
-
-Internal Version Routing
-------------------------
-
-Directory layout::
-
-    rlinf/hybrid_engines/sglang/
-    ├── __init__.py               # Version detection and routing
-    ├── sglang_worker.py          # Main worker implementation
-    ├── sglang_0_4_4/             # SGLang 0.4.4 specific implementation
-    │   ├── __init__.py
-    │   ├── io_struct.py          # I/O structures for 0.4.4
-    │   ├── sgl_engine.py         # Engine implementation for 0.4.4
-    │   ├── sgl_scheduler.py      # Scheduler for 0.4.4
-    │   └── tokenizer_manager.py  # Tokenizer management for 0.4.4
-    └── sglang_0_4_x/             # Future version implementations
-        └── ...
-
-The loader in ``__init__.py`` resolves the installed package:
-
-.. code-block:: python
-
-    from importlib.metadata import PackageNotFoundError, version
-
-    def get_version(pkg):
-        try:
-            return version(pkg)
-        except PackageNotFoundError:
-            return None
-
-    package_name = "sglang"
-    package_version = get_version(package_name)
-
-    if package_version == "0.4.4":
-        sglang_version = "0.4.4"
-        from .sglang_0_4_4 import io_struct
-        from .sglang_0_4_4.sgl_engine import Engine
-    else:
-        raise ValueError(f"sglang version {package_version} not supported")

docs/source-en/rst_source/tutorials/extend/new_model_megatron.rst

Lines changed: 1 addition & 1 deletion
@@ -468,7 +468,7 @@ Below is an example YAML configuration file for the qwen2.5 model family.

 After adapting your new model, you can refer to this YAML configuration file and make appropriate modifications.

-**File:** ``examples/math/config/qwen2.5-1.5b-grpo-megatron.yaml``
+**File:** ``examples/reasoning/config/math/qwen2.5-1.5b-grpo-megatron.yaml``

 Set Megatron parameters used by RLinf.


docs/source-en/rst_source/tutorials/mode/collocated.rst

Lines changed: 1 addition & 1 deletion
@@ -59,4 +59,4 @@ Given the above placement configuration, users can use proper `ComponentPlacemen
     )

 `ModelParallelComponentPlacement` supports two types of placement: collocated and disaggregated. More importantly, it deals with rank arrangement that allows efficient model weight update from training to rollout. It parses the configuration and generates placements for different components. The generated placement is then enforced during worker launching.
-Refer to `Math RL training python script <https://github.com/RLinf/RLinf/blob/main/examples/math/main_math.py>`_ for the complete code.
+Refer to `Math RL training python script <https://github.com/RLinf/RLinf/blob/main/examples/reasoning/main_grpo.py>`_ for the complete code.
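The parse-then-enforce idea in collocated.rst (the YAML placement block is parsed into per-component GPU assignments before workers launch) can be illustrated with a toy example. `parse_placement` and the "start-end" range strings are hypothetical stand-ins, not the actual behavior of `ModelParallelComponentPlacement`.

```python
def parse_placement(component_placement: dict[str, str]) -> dict[str, list[int]]:
    """Expand "start-end" GPU range strings into explicit rank lists.

    A bare rank like "3" is treated as the one-element range 3-3.
    """
    placements: dict[str, list[int]] = {}
    for component, spec in component_placement.items():
        start, _, end = spec.partition("-")
        end = end or start
        placements[component] = list(range(int(start), int(end) + 1))
    return placements


# Disaggregated-style layout: rollout and actor get disjoint GPU ranges.
layout = parse_placement({"rollout": "0-3", "actor": "4-7"})
```

The useful property to check after parsing is exactly what the docs describe: collocated components share ranks, disaggregated ones must not overlap.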

docs/source-en/rst_source/tutorials/mode/disaggregated.rst

Lines changed: 1 addition & 1 deletion
@@ -35,4 +35,4 @@ Currently, whether the execution is pipelined is decided by the underlying code

 **ComponentPlacement programming**

-As described in :doc:`collocated`, the placement configuration in the yaml file can be parsed by `ComponentPlacement` and enforced on workers. Refer to `Math RL training with pipelining <https://github.com/RLinf/RLinf/blob/main/examples/math/main_math_pipeline.py>`_ for the complete code.
+As described in :doc:`collocated`, the placement configuration in the yaml file can be parsed by `ComponentPlacement` and enforced on workers. Refer to `Math RL training with pipelining <https://github.com/RLinf/RLinf/blob/main/examples/reasoning/main_grpo.py>`_ for the complete code.

docs/source-en/rst_source/tutorials/scheduler/auto-placement.rst

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ Use the provided shell script to run the auto placement tool:

 .. code-block:: bash

-    cd examples/math
+    cd examples/reasoning
     ./run_placement_autotune.sh [config_name]

 Where ``config_name`` is the name of your configuration file.

docs/source-en/rst_source/tutorials/user/flow.rst

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ For example:
 - Configs for training a **VLA** agent in embodied tasks live under
   ``examples/embodiment/config``.
 - Configs for training an **LLM** on math reasoning live under
-  ``examples/math/config``.
+  ``examples/reasoning/config/math``.

 As a starting point, we recommend getting familiar with the YAML structure of
 these examples, then iterating toward your custom task. Key options include

0 commit comments