Description
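During the RFT stage, I followed the instructions in the author's GitHub repository and set NODES=("10.112.2.106" "10.112.2.40"), then configured the virtual environment in the launch command:
pdsh -R ssh -w "$NODE" bash -lc "
source ~/anaconda3/bin/activate arl
cd '$FILE_DIR'
export TOKENIZERS_PARALLELISM=false
With this setup, the run hangs while loading the model and optimizer for distributed training. Both machines have the same arl virtual environment and the same code.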
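For reference, the launch configuration described above can be sketched as a small dry-run script. The real fsdp.sh is not reproduced in this report, so the loop shape, the `build_remote_cmd` helper, the `NODE_RANK` variable, and the `train.sh` entry point are assumptions made purely for illustration. The sketch keeps the remote commands on one line joined with `&&` and escapes them for the remote shell. ssh hands the command string to the remote shell for re-parsing, so a line break directly after `bash -lc` would leave `-c` without an argument. That may account for the `bash: -c: option requires an argument` lines in the output below, although it is probably not the cause of the hang itself.

```shell
#!/usr/bin/env bash
# Dry-run sketch of a per-node launch loop. build_remote_cmd, NODE_RANK and
# train.sh are hypothetical names; the real fsdp.sh may look different.
set -euo pipefail

NODES=("10.112.2.106" "10.112.2.40")
FILE_DIR="/home/xiang.xiao/AgentCPM-GUI-main/rft"

# Build the remote command as ONE line, so the remote shell sees
# `bash -lc <script>` with the whole script as a single argument.
build_remote_cmd() {
  local rank="$1"
  local script="source ~/anaconda3/bin/activate arl && cd '$FILE_DIR' && export TOKENIZERS_PARALLELISM=false && NODE_RANK=$rank bash train.sh"
  # printf %q escapes the script so it survives re-parsing by a remote bash.
  printf 'bash -lc %q' "$script"
}

for rank in "${!NODES[@]}"; do
  node="${NODES[$rank]}"
  echo "-> Launching on $node (rank $rank)..."
  # Dry run: print the command instead of executing it. With pdsh installed,
  # the real call would be:
  #   pdsh -R ssh -w "$node" "$(build_remote_cmd "$rank")" &
  echo "pdsh -R ssh -w $node $(build_remote_cmd "$rank")"
done
```

Running the sketch only prints the commands, which makes it easy to check that each node would receive `bash -lc` followed by exactly one argument before anything is launched.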
bash fsdp.sh
Training directory: /home/xiang.xiao/AgentCPM-GUI-main/rft
Launching on nodes: 10.112.2.106 10.112.2.40
-> Launching on 10.112.2.106 (rank 0)...
-> Launching on 10.112.2.40 (rank 1)...
10.112.2.40: Authorized users only. All activity may be monitored and reported.
10.112.2.40: bash: -c: option requires an argument
10.112.2.106: bash: -c: option requires an argument
10.112.2.106: INFO 08-06 09:38:11 [init.py:239] Automatically detected platform cuda.
10.112.2.106: reward_funcs: [<function action_type_check at 0x7f2999860cc0>, <function action_args_check at 0x7f2999860e00>]
10.112.2.106: You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with model.to('cuda').
10.112.2.106: You are attempting to use Flash Attention 2.0 without specifying a torch dtype. This might lead to unexpected behaviour
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 29.59it/s]
10.112.2.106: /home/xiang.xiao/anaconda3/envs/arl/lib/python3.11/site-packages/transformers/models/auto/image_processing_auto.py:625: FutureWarning: The image_processor_class argument is deprecated and will be removed in v4.42. Please use slow_image_processor_class, or fast_image_processor_class instead
10.112.2.106: warnings.warn(
10.112.2.106: Using a slow image processor as use_fast is unset and a slow processor was saved with this model. use_fast=True will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with use_fast=False.
Loading dataset: 100%|██████████| 15/15 [00:00<00:00, 106274.59it/s]
Loading dataset: 100%|██████████| 15/15 [00:00<00:00, 154581.23it/s]
10.112.2.106: [2025-08-06 09:38:13,519] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
10.112.2.40: INFO 08-06 09:38:17 [init.py:239] Automatically detected platform cuda.
10.112.2.40: reward_funcs: [<function action_type_check at 0x7f7a39938cc0>, <function action_args_check at 0x7f7a39938e00>]
10.112.2.40: You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with model.to('cuda').
10.112.2.40: You are attempting to use Flash Attention 2.0 without specifying a torch dtype. This might lead to unexpected behaviour
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 16.14it/s]
10.112.2.40: /home/xiang.xiao/anaconda3/envs/arl/lib/python3.11/site-packages/transformers/models/auto/image_processing_auto.py:625: FutureWarning: The image_processor_class argument is deprecated and will be removed in v4.42. Please use slow_image_processor_class, or fast_image_processor_class instead
10.112.2.40: warnings.warn(
10.112.2.40: Using a slow image processor as use_fast is unset and a slow processor was saved with this model. use_fast=True will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with use_fast=False.
10.112.2.106: self.global_sync_address is tcp://10.112.2.106:15000
10.112.2.106: collect_address is tcp://10.112.2.106:15001
10.112.2.106: num_generation is 4
10.112.2.106: num_to_sync is 2 num_to_sync is 2 num_to_sync is 2 num_to_sync is 2
10.112.2.106: gradient_accumulation_steps is 2
10.112.2.106: accelerator.num_processes is 2
10.112.2.106: args.per_device_train_batch_size is 2
10.112.2.106: accelerator num_processes 2 // device_count 1
10.112.2.106: tp_size is %d 1
10.112.2.106: we run into _sync_node_queue
10.112.2.106: _sync_node_queue--------------
10.112.2.106: [rank0]:[W806 09:38:20.084839864 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 0] using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
10.112.2.106: NCCL version 2.21.5+cuda12.4
Loading dataset: 100%|██████████| 15/15 [00:00<00:00, 72149.72it/s]
Loading dataset: 100%|██████████| 15/15 [00:00<00:00, 85831.60it/s]
10.112.2.40: [2025-08-06 09:38:21,003] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.40: [rank1]:[W806 09:38:32.905694726 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 1] using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
10.112.2.106: 2025-08-06 09:38:33,208 - ARL - INFO - Listen for stealing at tcp://10.112.2.106:15003
10.112.2.106: max_cache_size is 32
10.112.2.106: 2025-08-06 09:38:33,289 - ARL - INFO - Worker 0 is running on 0 and setup 0MQ.
10.112.2.40: 2025-08-06 09:38:33,290 - ARL - INFO - Worker 1 is running on 1 and setup 0MQ.
10.112.2.40: 2025-08-06 09:38:33,292 - ARL - INFO - Listen for stealing at tcp://10.112.2.40:15003
10.112.2.106: we are returning the GlobalDistributed0MQDataLoader
10.112.2.106: _load_data require data
10.112.2.106: _load_data require data
10.112.2.106: _load_data require data
10.112.2.106: _load_data require data
10.112.2.106: self.chunk_size is 2
10.112.2.106: we run into work_stealing
10.112.2.106: sync_handler----------
10.112.2.106: --------------------------------------------------------11111
10.112.2.106: _master_loop------------------
10.112.2.106: --------------------------------------------------------222
10.112.2.106: --------------------------------------------------------3333
10.112.2.106: --------------------------------------------------------4444444
10.112.2.106: --------------------------------------------------------555555
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04}\x94.']
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: 2025-08-06 09:38:50,636 - ARL - INFO - [ Global GID: 0 | SyncPool Size: 0 | 0 acked / 0 total ] Current 0 sent, 0 ack. Speed 0.00/s.
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: 2025-08-06 09:39:20,637 - ARL - INFO - [ Global GID: 0 | SyncPool Size: 0 | 0 acked / 0 total ] Current 0 sent, 0 ack. Speed 0.00/s.
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: 2025-08-06 09:40:20,638 - ARL - INFO - [ Global GID: 0 | SyncPool Size: 0 | 0 acked / 0 total ] Current 0 sent, 0 ack. Speed 0.00/s.
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: 2025-08-06 09:40:50,638 - ARL - INFO - [ Global GID: 0 | SyncPool Size: 0 | 0 acked / 0 total ] Current 0 sent, 0 ack. Speed 0.00/s.
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: 2025-08-06 09:41:20,638 - ARL - INFO - [ Global GID: 0 | SyncPool Size: 0 | 0 acked / 0 total ] Current 0 sent, 0 ack. Speed 0.00/s.
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: 2025-08-06 09:45:50,642 - ARL - INFO - [ Global GID: 0 | SyncPool Size: 0 | 0 acked / 0 total ] Current 0 sent, 0 ack. Speed 0.00/s.
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: 2025-08-06 09:46:20,642 - ARL - INFO - [ Global GID: 0 | SyncPool Size: 0 | 0 acked / 0 total ] Current 0 sent, 0 ack. Speed 0.00/s.
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: 2025-08-06 09:46:50,642 - ARL - INFO - [ Global GID: 0 | SyncPool Size: 0 | 0 acked / 0 total ] Current 0 sent, 0 ack. Speed 0.00/s.
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: 2025-08-06 09:47:20,643 - ARL - INFO - [ Global GID: 0 | SyncPool Size: 0 | 0 acked / 0 total ] Current 0 sent, 0 ack. Speed 0.00/s.
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: 2025-08-06 09:47:50,643 - ARL - INFO - [ Global GID: 0 | SyncPool Size: 0 | 0 acked / 0 total ] Current 0 sent, 0 ack. Speed 0.00/s.
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: message received is {'tcp://10.112.2.40:15003': 0}
10.112.2.106: sync_sender.send_multipart
10.112.2.106: _sync_node_queue--------------
10.112.2.106: work_stealing parts are [b'SYNC_NODE_QUEUE_LENGTHS', b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x18tcp://10.112.2.106:15003\x94K\x00\x8c\x17tcp://10.112.2.40:15003\x94K\x00u.']
10.112.2.106: mean_queue_length is 0.0
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
10.112.2.106: we run into work_stealing
10.112.2.106: message received is {'tcp://10.112.2.106:15003': 0}
^Cpdsh@a800-01: interrupt (one more within 1 sec to abort)
pdsh@a800-01: interrupt (one more within 1 sec to abort)
pdsh@a800-01: (^Z within 1 sec to cancel pending threads)
pdsh@a800-01: (^Z within 1 sec to cancel pending threads)
pdsh@a800-01: 10.112.2.40: command in progress
pdsh@a800-01: 10.112.2.106: command in progress
Cleanup: killing local pdsh and remote training processes...
10.112.2.40: Authorized users only. All activity may be monitored and reported.
pdsh@a800-01: 10.112.2.40: ssh exited with exit code 255
^Cpdsh@a800-01: interrupt (one more within 1 sec to abort)
pdsh@a800-01: (^Z within 1 sec to cancel pending threads)
pdsh@a800-01: 10.112.2.106: command in progress
Cleanup: killing local pdsh and remote training processes...
10.112.2.40: Authorized users only. All activity may be monitored and reported.
pdsh@a800-01: 10.112.2.40: ssh exited with exit code 255
Cleanup: killing local pdsh and remote training processes...
10.112.2.40: Authorized users only. All activity may be monitored and reported.
pdsh@a800-01: 10.112.2.40: ssh exited with exit code 255
^Cpdsh@a800-01: interrupt (one more within 1 sec to abort)
pdsh@a800-01: (^Z within 1 sec to cancel pending threads)
pdsh@a800-01: 10.112.2.106: command in progress
Cleanup: killing local pdsh and remote training processes...
10.112.2.40: Authorized users only. All activity may be monitored and reported.
pdsh@a800-01: 10.112.2.40: ssh exited with exit code 255
