Hi, thanks for your great work!
I’m trying to reproduce the RL training and I’m consistently seeing errors like:
[ERROR code] Failed to decode image 0: Incorrect padding
[ERROR code] Failed to decode image 0: Invalid base64-encoded string: number of data characters (336877) cannot be 1 more than a multiple of 4
My main question is whether these errors correspond to the issue described in the README (i.e., caused by network pressure):
During RL training, transmitting a large number of high-resolution images to a single node in a short period may saturate the bandwidth, leading to retransmissions and timeouts.
I tried launching 3 code servers, each with 4 processes, on a single machine with 8× A800 (80GB), but it didn’t work out. I used commands like:
IMAGE=chenshawn6915/multimodal-ipython-sandbox:oss-v2
docker run -p 18901:18901 -p 18902:18902 -p 18903:18903 -p 18904:18904 $IMAGE
docker run -p 28901:18901 -p 28902:18902 -p 28903:18903 -p 28904:18904 $IMAGE
docker run -p 38901:18901 -p 38902:18902 -p 38903:18903 -p 38904:18904 $IMAGE