Using auxiliary models in experience pipeline & OpenAI API supports stream mode by pan-x-c · Pull Request #513 · agentscope-ai/Trinity-RFT

pan-x-c · 2026-03-02T07:56:02Z

Description

As the title says

Checklist

Please check the following items before code is ready to be reviewed.

Code has passed all tests
Docstrings have been added/updated in Google Style
Documentation has been updated
Code is ready for review

gemini-code-assist · 2026-03-02T07:56:06Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

Copilot

Pull request overview

This PR wires auxiliary (judge) models into the experience processing pipeline by making operators async-capable and providing them access to auxiliary model OpenAI clients, while also refactoring ModelWrapper to infer engine type from the underlying model actor.

Changes:

Add an async operator interface (ExperienceOperatorV1) and update ExperiencePipeline to prepare operators asynchronously and await operator processing/cleanup.
Add auxiliary model wrapper discovery (get_auxiliary_model_wrappers) and pass auxiliary model OpenAI clients into experience operators.
Make ModelWrapper fetch engine type from the model actor (get_engine_type) and remove config-passed engine_type.
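A minimal sketch of the async operator pattern described in the changes above, assuming nothing about the real signatures in the PR. The class names `SyncOperator`, `AsyncOperator`, `LegacyOperatorAdapter`, `AddBonus` and the function `run_pipeline` are hypothetical illustrations; the actual `ExperienceOperatorV1` interface and `ExperiencePipeline` may look different.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Tuple

Batch = List[Dict[str, Any]]
Metrics = Dict[str, Any]


class SyncOperator(ABC):
    """Hypothetical legacy interface with a blocking process() call."""

    @abstractmethod
    def process(self, exps: Batch) -> Tuple[Batch, Metrics]:
        ...

    def close(self) -> None:
        pass


class AsyncOperator(ABC):
    """Hypothetical async interface (not the PR's ExperienceOperatorV1)."""

    async def prepare(self, auxiliary_clients: List[Any]) -> None:
        # Assumption: operators receive auxiliary (judge) model clients here.
        self.auxiliary_clients = auxiliary_clients

    @abstractmethod
    async def process(self, exps: Batch) -> Tuple[Batch, Metrics]:
        ...

    async def close(self) -> None:
        pass


class LegacyOperatorAdapter(AsyncOperator):
    """Wraps a blocking operator so the pipeline can await it."""

    def __init__(self, inner: SyncOperator) -> None:
        self.inner = inner

    async def process(self, exps: Batch) -> Tuple[Batch, Metrics]:
        # Run the blocking call in a worker thread to keep the event loop free.
        return await asyncio.to_thread(self.inner.process, exps)

    async def close(self) -> None:
        self.inner.close()


class AddBonus(SyncOperator):
    """Toy legacy operator: adds 1.0 to every reward."""

    def process(self, exps: Batch) -> Tuple[Batch, Metrics]:
        out = [{**e, "reward": e["reward"] + 1.0} for e in exps]
        return out, {"count": len(out)}


async def run_pipeline(
    operators: List[AsyncOperator], exps: Batch
) -> Tuple[Batch, Metrics]:
    # Prepare every operator, then process in order, then clean up.
    for op in operators:
        await op.prepare([])  # no auxiliary clients in this toy run
    metrics: Metrics = {}
    for op in operators:
        exps, op_metrics = await op.process(exps)
        metrics.update(op_metrics)
    for op in operators:
        await op.close()
    return exps, metrics


if __name__ == "__main__":
    ops = [LegacyOperatorAdapter(AddBonus())]
    print(asyncio.run(run_pipeline(ops, [{"reward": 0.5}])))
```

The adapter is one plausible way to keep existing synchronous operators working next to new async ones; whether the PR's wrapper uses a thread or calls the legacy method directly is not stated here.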

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.

File	Description
trinity/explorer/workflow_runner.py	Stop passing `engine_type` into `ModelWrapper`; rely on model-reported engine type.
trinity/explorer/explorer.py	Reorder preparation so models are prepared before the experience pipeline; add node-affinity comment for the pipeline actor.
trinity/common/models/vllm_model.py	Implement `get_engine_type()` for vLLM-backed inference models.
trinity/common/models/tinker_model.py	Implement `get_engine_type()` for Tinker-backed inference models.
trinity/common/models/model.py	Add `InferenceModel.get_engine_type()` abstractmethod and fetch it in `ModelWrapper.prepare()`.
trinity/common/models/__init__.py	Rename auxiliary actor names to optionally include config `name`; add `get_auxiliary_model_wrappers()` helper.
trinity/common/config.py	Make `DataProcessorConfig.experience_pipeline` non-optional with a default factory.
trinity/buffer/pipelines/experience_pipeline.py	Defer operator creation to `prepare()`, inject auxiliary model clients, and make operator execution/close async.
trinity/buffer/operators/experience_operator.py	Introduce `ExperienceOperatorV1` async interface + wrapper for legacy operators; add `create_operators()` helper.
trinity/buffer/operators/__init__.py	Export the new operator interface and factory.
tests/explorer/workflow_test.py	Update `ModelWrapper` construction after removing `engine_type` parameter.
tests/explorer/scheduler_test.py	Update dummy inference models to implement `get_engine_type()`.
tests/explorer/explorer_test.py	Add a test operator that uses auxiliary models via OpenAI async clients; configure an auxiliary model name.
tests/common/vllm_test.py	Update `ModelWrapper` construction after removing `engine_type` parameter.
tests/buffer/reward_shaping_mapper_test.py	Switch to async test and use `create_operators()` with `await op.process(...)`.
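The `engine_type` change listed in the table (the wrapper asks the model for its engine type during `prepare()` instead of receiving it as a constructor argument) can be sketched as follows. This is an illustration under stated assumptions: `InferenceBackend`, `FakeVLLMBackend` and `Wrapper` are hypothetical stand-ins, `get_engine_type()` is assumed to be awaitable, and the real `ModelWrapper` talks to a model actor rather than a local object.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Optional


class InferenceBackend(ABC):
    """Hypothetical stand-in for an inference model that reports its engine."""

    @abstractmethod
    async def get_engine_type(self) -> str:
        ...


class FakeVLLMBackend(InferenceBackend):
    async def get_engine_type(self) -> str:
        return "vllm"


class Wrapper:
    """Hypothetical wrapper: no engine_type argument in the constructor."""

    def __init__(self, model: InferenceBackend) -> None:
        self.model = model
        self.engine_type: Optional[str] = None  # unknown until prepare()

    async def prepare(self) -> None:
        # The model is the single source of truth for its engine type.
        self.engine_type = await self.model.get_engine_type()


if __name__ == "__main__":
    wrapper = Wrapper(FakeVLLMBackend())
    asyncio.run(wrapper.prepare())
    print(wrapper.engine_type)
```

Asking the model removes the risk of a config value disagreeing with the backend that is actually running, at the cost of requiring `prepare()` before the engine type is available.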

trinity/buffer/pipelines/experience_pipeline.py

trinity/common/models/model.py

trinity/common/models/__init__.py

tests/explorer/explorer_test.py

pan-x-c · 2026-03-02T13:26:52Z

/unittest-diff

github-actions · 2026-03-02T13:55:41Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
158	157	0	1	0	0	26m 19s

Skipped

Tests	Status
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async	skipped ⏭️

Tests

Test Name	Status	Duration
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_experience_pipeline	✅	10.7s
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_pass_rate_calculation	✅	6.3s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_experience_buffer	✅	2.6s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_0_sft	✅	4.3s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_1_dpo	✅	4.8s
tests/buffer/file_test.py::TestFileBuffer::test_file_reader	✅	147ms
tests/buffer/file_test.py::TestFileBuffer::test_file_writer	✅	1.5s
tests/buffer/formatter_test.py::TestFormatter::test_dpo_messages_formatter	✅	552ms
tests/buffer/formatter_test.py::TestFormatter::test_dpo_plaintext_formatter	✅	476ms
tests/buffer/formatter_test.py::TestFormatter::test_multi_modal_sft_formatter	✅	841ms
tests/buffer/formatter_test.py::TestFormatter::test_sft_messages_formatter	✅	978ms
tests/buffer/formatter_test.py::TestFormatter::test_sft_plaintext_formatter	✅	725ms
tests/buffer/formatter_test.py::TestFormatter::test_task_formatter	✅	228ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_buffer_reuse	✅	6.3s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_capacity	✅	2.1s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_reuse_count_control	✅	4.0s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_0_queue	✅	3.3s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_1_priority_queue	✅	3.1s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_capacity	✅	3.6s
tests/buffer/reader_test.py::TestBufferReader::test_buffer_reader_registration	✅	760ms
tests/buffer/reward_shaping_mapper_test.py::TestRewardShapingMapper::test_basic_usage	✅	7ms
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_default_sample_strategy	✅	2.0s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_staleness_control_sample_strategy	✅	1.7s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_default_sample_strategy	✅	1.6s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_staleness_control_sample_strategy	✅	1.8s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_sql_staleness_control_sample_strategy	✅	4.3s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_default_sample_strategy	✅	1.8s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_staleness_control_sample_strategy	✅	1.8s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_default_sample_strategy	✅	1.6s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_staleness_control_sample_strategy	✅	1.8s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_sql_staleness_control_sample_strategy	✅	3.2s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_0	✅	5.5s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_1	✅	2.1s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_task_buffer_read_write	✅	2.6s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_0	✅	71ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_1	✅	57ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_2	✅	90ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_3	✅	89ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_4	✅	89ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_5	✅	92ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_6	✅	106ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_simple	✅	46ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_0_file	✅	272ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_1_sql	✅	2.5s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_2_file	✅	40ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_3_sql	✅	2.5s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_4_file	✅	41ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_5_sql	✅	3.0s
tests/cli/launcher_test.py::TestLauncherMain::test_debug_mode	✅	45.5s
tests/cli/launcher_test.py::TestLauncherMain::test_log_mode	✅	154ms
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_command	✅	5.5s
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_in_dlc	✅	1.2s
tests/cli/launcher_test.py::TestLauncherMain::test_main_studio_command	✅	688ms
tests/cli/launcher_test.py::TestLauncherMain::test_multi_stage_run	✅	13.9s
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	21.6s
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	77ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	32ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	161ms
tests/common/config_test.py::TestConfig::test_default_workflow	✅	77ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	11.4s
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	78ms
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	77ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	143ms
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	14ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/sudoku_test.py::test_9x9_generator_produces_valid_solution	✅	1ms
tests/common/sudoku_test.py::test_9x9_generator_creates_holes	✅	1ms
tests/common/sudoku_test.py::test_9x9_solution_is_fully_filled	✅	1ms
tests/common/sudoku_test.py::test_judge_allows_incomplete_board	✅	1ms
tests/common/sudoku_test.py::test_judge_detects_row_violation	✅	1ms
tests/common/sudoku_test.py::test_judge_detects_column_violation	✅	1ms
tests/common/sudoku_test.py::test_judge_detects_block_violation	✅	1ms
tests/common/sudoku_test.py::test_4x4_generator_produces_valid_solution	✅	1ms
tests/common/sudoku_test.py::test_4x4_solution_is_fully_filled	✅	1ms
tests/common/sudoku_test.py::test_4x4_judge_detects_row_violation	✅	1ms
tests/common/sudoku_test.py::test_4x4_judge_detects_block_violation	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	57.4s
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	39.6s
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	38.6s
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	32.3s
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	27.0s
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	28.1s
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	28.2s
tests/common/vllm_test.py::TestMessageProcess::test_no_prompt_truncation	✅	26.5s
tests/common/vllm_test.py::TestMessageProcess::test_truncation_status	✅	26.6s
tests/common/vllm_test.py::TestAPIServer::test_api	✅	29.1s
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	26.5s
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	28.7s
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async	⏭️	1ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	255ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	236ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	32.1s
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	32.3s
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	✅	1m 25s
tests/common/vllm_test.py::TestTinkerAPI::test_tinker_api	✅	40.9s
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer	✅	1m 39s
tests/explorer/explorer_test.py::TestExplorerEvalDetailedStats::test_explorer	✅	1m 15s
tests/explorer/explorer_test.py::TestExplorerGSM8KRULERNoEval::test_explorer	✅	1m 1s
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer	✅	3m 1s
tests/explorer/explorer_test.py::ServeTest::test_serve	✅	1m 2s
tests/explorer/proxy_test.py::RecorderTest::test_recorder	✅	85ms
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow	✅	4.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations	✅	4.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_dynamic_timeout	✅	12.7s
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results	✅	28.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_0	✅	4.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_1	✅	4.7s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_0	✅	4.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_1	✅	4.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution	✅	5.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow	✅	4.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_over_rollout_min_wait	✅	12.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods	✅	14.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop	✅	9.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks	✅	7.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid	✅	25.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all	✅	7.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch	✅	13.5s
tests/explorer/scheduler_test.py::TestRunnerStateCollection::test_runner_state_collection	✅	9.8s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_0	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_1	✅	1.1s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_0	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_1	✅	1.0s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps	✅	1.0s
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow	✅	11ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow	✅	16ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow	✅	127ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow	✅	3ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow	✅	11ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow	✅	7ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_1	✅	100ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_1	✅	201ms
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_0::test_multi_turn_workflow	✅	23.2s
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_1::test_multi_turn_workflow	✅	23.0s
tests/explorer/workflow_test.py::TestWorkflowStateRecording::test_workflow_state_recording	✅	4.0s
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v0	✅	729ms
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v1	✅	14ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner	✅	138ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner_get_state	✅	8.1s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_with_openai	✅	26.8s
tests/explorer/workflow_test.py::TestConcurrentWorkflowRunner::test_concurrent_workflow_runner	✅	45.6s

Github Test Reporter by CTRF 💚

pan-x-c · 2026-03-02T14:17:30Z

/unittest-module-trainer

github-actions · 2026-03-02T16:35:04Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
27	24	0	3	0	0	47m 42s

Skipped

Tests	Status
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer	skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class	skipped ⏭️

Tests

Test Name	Status	Duration
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer	✅	4m 8s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer	✅	5m 9s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer	✅	1m 40s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer	✅	1m 8s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer	✅	1m 3s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer	✅	1m 8s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer	✅	1m 11s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	⏭️	1ms
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer	✅	33.3s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer	✅	32.0s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools	✅	32.0s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode	✅	1m 38s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode	✅	1m 38s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode	✅	2m 28s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer	✅	2m 54s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer	✅	5m 53s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer	✅	1m 57s
tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer	✅	1m 46s
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	✅	2m 33s
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	✅	1m 7s
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer	✅	3m 19s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer	✅	1m 10s
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer	✅	45.5s
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer	⏭️	1ms
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class	⏭️	1ms
tests/trainer/trainer_test.py::AgentScopeTunerTest::test_agentscope_tuner	✅	1m 18s
tests/trainer/trainer_test.py::ColocateModeTest::test_trainer	✅	1m 59s

pan-x-c · 2026-03-03T02:04:24Z

/gemini review

gemini-code-assist · 2026-03-03T02:04:27Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

pan-x-c · 2026-03-03T02:34:15Z

/unittest-module-trainer

chenyushuo · 2026-03-03T02:57:00Z

/unittest-module-trainer

github-actions · 2026-03-03T03:49:20Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
27	24	0	3	0	0	49m 43s

Skipped

Tests	Status
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer	skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class	skipped ⏭️

Tests

Test Name	Status	Duration
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer	✅	4m 2s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer	✅	5m 17s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer	✅	1m 47s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer	✅	1m 14s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer	✅	1m 6s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer	✅	1m 17s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer	✅	1m 22s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	⏭️	1ms
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer	✅	39.2s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer	✅	35.4s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools	✅	35.0s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode	✅	1m 45s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode	✅	1m 45s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode	✅	2m 30s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer	✅	2m 54s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer	✅	5m 53s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer	✅	2m 6s
tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer	✅	1m 54s
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	✅	2m 41s
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	✅	1m 6s
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer	✅	3m 25s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer	✅	1m 14s
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer	✅	48.5s
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer	⏭️	1ms
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class	⏭️	1ms
tests/trainer/trainer_test.py::AgentScopeTunerTest::test_agentscope_tuner	✅	1m 29s
tests/trainer/trainer_test.py::ColocateModeTest::test_trainer	✅	2m 7s

pan-x-c · 2026-03-03T03:50:18Z

/unittest-module-common

github-actions · 2026-03-03T04:05:24Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
55	54	0	1	0	0	12m 33s

Skipped

Tests	Status
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async	skipped ⏭️

Tests

Test Name	Status	Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	23.0s
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	77ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	31ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	161ms
tests/common/config_test.py::TestConfig::test_default_workflow	✅	353ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	14.4s
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	79ms
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	77ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	1.7s
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	14ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/sudoku_test.py::test_9x9_generator_produces_valid_solution	✅	1ms
tests/common/sudoku_test.py::test_9x9_generator_creates_holes	✅	1ms
tests/common/sudoku_test.py::test_9x9_solution_is_fully_filled	✅	1ms
tests/common/sudoku_test.py::test_judge_allows_incomplete_board	✅	1ms
tests/common/sudoku_test.py::test_judge_detects_row_violation	✅	1ms
tests/common/sudoku_test.py::test_judge_detects_column_violation	✅	1ms
tests/common/sudoku_test.py::test_judge_detects_block_violation	✅	1ms
tests/common/sudoku_test.py::test_4x4_generator_produces_valid_solution	✅	1ms
tests/common/sudoku_test.py::test_4x4_solution_is_fully_filled	✅	1ms
tests/common/sudoku_test.py::test_4x4_judge_detects_row_violation	✅	1ms
tests/common/sudoku_test.py::test_4x4_judge_detects_block_violation	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	1m 1s
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	44.7s
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	50.1s
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	28.5s
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	27.3s
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	27.3s
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	27.3s
tests/common/vllm_test.py::TestMessageProcess::test_no_prompt_truncation	✅	38.0s
tests/common/vllm_test.py::TestMessageProcess::test_truncation_status	✅	27.2s
tests/common/vllm_test.py::TestAPIServer::test_api	✅	30.0s
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	27.0s
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	29.7s
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async	⏭️	1ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	554ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	829ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	32.2s
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	31.7s
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	✅	3m
tests/common/vllm_test.py::TestTinkerAPI::test_tinker_api	✅	44.3s

pan-x-c added 2 commits March 2, 2026 15:12

experience pipeline using auxiliary models

86e386e

add tests

ea1c695

pan-x-c requested a review from Copilot March 2, 2026 07:58

Copilot started reviewing on behalf of pan-x-c March 2, 2026 07:59

Copilot AI reviewed Mar 2, 2026

pan-x-c added 7 commits March 2, 2026 16:28

gracefully shutdown explorer

914099d

fix tests

91f4711

fix tests

a188dc8

update doc

a80e9ed

fix logger

7e11953

enhance serve mode with streaming support

8691d15

support stream in openai client

2d03874

pan-x-c changed the title ~~Using auxiliary models in experience pipeline~~ Using auxiliary models in experience pipeline & OpenAI API supports stream mode Mar 2, 2026

remove redundant code

52bdd89

Merge branch 'main' into feature/exp_pipeline_auxiliary_model

05980a9

fix cli test

5fc23b9

chenyushuo approved these changes Mar 3, 2026

pan-x-c merged commit 31f9b79 into agentscope-ai:main Mar 3, 2026
2 checks passed

3 participants