Hello everyone! Thank you for your interest in ROLL.
ROLL has shipped a large batch of new features recently; below is a summary of the recent updates. We will keep iterating on ROLL, and you are welcome to join the ROLL community.
🚀 Highlights:
- (feat): support Qwen3VL, mcore_adapter and examples.
- (feat): Add optimization for computing ref_logprobs and old_logprobs.
- (feat): support vllm beam_search.
- (feat): Add support for Qwen-3-next on AMD GPUs.
- (feat): support sglang==0.5.4, vllm==0.11.1, torch==2.8.0.
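One of the highlights above is the optimization for computing ref_logprobs and old_logprobs. A common way to realize this (and the idea behind the enable_old_logprobs cache mentioned further down) is to compute the rollout policy's log-probs once per batch and reuse them across PPO epochs instead of rerunning a forward pass each epoch. The sketch below is illustrative only; `OldLogprobCache` and `compute_logprobs` are hypothetical names, not ROLL's actual API.

```python
# Minimal sketch of old-logprob caching: the expensive forward pass runs once
# per rollout batch, and later PPO epochs reuse the cached per-token log-probs.
# All names here (OldLogprobCache, compute_logprobs) are illustrative.

def compute_logprobs(model, batch_id):
    # Stand-in for an expensive forward pass; returns per-token log-probs.
    return [model(tok) for tok in range(3)]

class OldLogprobCache:
    def __init__(self):
        self._cache = {}
        self.forward_calls = 0

    def get(self, model, batch_id):
        if batch_id not in self._cache:
            self.forward_calls += 1
            self._cache[batch_id] = compute_logprobs(model, batch_id)
        return self._cache[batch_id]

cache = OldLogprobCache()
model = lambda tok: -0.1 * (tok + 1)   # toy "policy" producing fake log-probs
for _epoch in range(4):                # 4 PPO epochs over the same rollout batch
    old_logprobs = cache.get(model, batch_id=0)
print(cache.forward_calls)             # the forward pass ran only once
```

The same pattern applies to ref_logprobs when the reference model is frozen: its outputs for a given batch never change, so they only need to be computed once.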
🚀 Major new features:
- Agentic
- (fix): fix agentic validation get_batch state in redundant envs.
- (feat): agentic-spec actor worker.
- (feat): add infer_log_probs in agentic.
- (feat): refactor agentic norm like LitePPO.
- (feat): add agentic profile metrics.
- Models & backends
- (feat): support vllm beam_search.
- (feat): Add support for Qwen-3-next on AMD GPUs.
- (feat): support offloading NCCL to save GPU memory. Thanks to slime.
- (feat): support sglang 0.5.4.
- (feat): sglang support dp-attention.
- (feat): add enable_reference option (#250: about the RLVR pipeline's Reference Model).
- (feat): add enable_old_logprobs; optimize old log-prob computation via caching.
- (feat): support Qwen3VL, mcore_adapter, and example YAMLs (#190: are there plans to support Qwen3-VL?).
- (feat): add sequence packing for sft pipeline and distill pipeline, optimize memory usage during top-k logits computation.
- Bug fixes & refactoring
- (fix): update math rule reward worker to handle thinking output (#281: confused about the extract function in rlvr math_rule_reward_worker).
- (feat): set RAY_CGRAPH_get_timeout=600.
- (fix): fix train/infer ratio-diff mean, add train/infer ratio-diff token/sequence masks, and add rollout importance sampling (#242: on train/infer mismatch and old log-probs; #273: add train-infer-mismatch fix feature).
- (fix): ensure compatibility with transformers version check for causal mask update.
- (fix): fix vllm 0.11.0 import for torch 2.8.0.
- (fix): fix tokenizer mismatch between policy and reward model in the LLM judge reward worker (#91: issue with the LLM judge tokenizer).
- (fix): fix bugs in data fetching for face embeddings for wan_module.
- (fix): vllm _generate_standard missing prompt_token_ids input arg in vllm >0.11.0 (#189: in llm.generate, the 'prompt_token_ids' param became part of 'prompts').
- (fix): vllm: add missing is_lora argument in update_parameter (#233: TypeError: Llm084.update_parameter() takes 4 positional arguments but 5 were given).
- (fix): fix bugs with metrics recording in the DPO pipeline.
- (fix): update image loading logic for byte data in rlvr_vlm_pipeline.py.
- (fix): add alive check (#253: with the latest ROLL, on agentic tasks with vLLM and async_generation_ratio, vLLM sleep goes missing).
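The rollout importance-sampling addition referenced in the train/infer mismatch items (#242, #273) can be sketched as follows: when the inference engine's log-probs differ from the trainer's forward pass, per-token importance ratios exp(train − infer) reweight the loss, with padding masked out and ratios clipped for stability. This is a hedged illustration only; the function name, masking scheme, and clipping default are assumptions, not ROLL's actual implementation.

```python
import math

# Illustrative per-token importance-sampling correction for train/infer mismatch.
# rollout_is_ratios, the mask convention, and clip=2.0 are hypothetical choices.

def rollout_is_ratios(train_logprobs, infer_logprobs, mask, clip=2.0):
    """Per-token ratios exp(train - infer), clipped, with padded tokens zeroed."""
    ratios = []
    for t, i, m in zip(train_logprobs, infer_logprobs, mask):
        r = math.exp(t - i) if m else 0.0
        ratios.append(min(r, clip))
    return ratios

train = [-1.0, -0.5, -2.0]
infer = [-1.1, -0.5, -0.3]   # large mismatch on the last, padded token
mask  = [1, 1, 0]
print(rollout_is_ratios(train, infer, mask))
```

Masking matters here: without it, the padded third token's large mismatch would dominate any mean over the sequence, which is exactly the kind of skew the ratio/diff mean fix addresses.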