## Documentation

* Links to the relevant documentation/comment: https://github.com/facebookresearch/detectron2/blob/master/projects/DensePose/doc/GETTING_STARTED.md

According to the instructions linked above, training with 1 GPU uses a batch size of 2 and a base learning rate of 0.0025. How was the linear learning rate scaling rule applied here? Which reference batch size and learning rate produce 0.0025 for a batch size of 2?
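
For context, the linear scaling rule is usually stated as: multiply the learning rate by the same factor as the batch size. Below is a minimal sketch of that arithmetic. The reference values (16 images per batch at a learning rate of 0.02) are an assumption chosen for illustration, not values confirmed from the DensePose config, and `scaled_lr` is a hypothetical helper, not a detectron2 function.

```python
def scaled_lr(ref_lr: float, ref_batch: int, new_batch: int) -> float:
    """Linear scaling rule: the learning rate scales in proportion to the batch size."""
    if ref_batch <= 0 or new_batch <= 0:
        raise ValueError("batch sizes must be positive")
    return ref_lr * (new_batch / ref_batch)


# ASSUMED reference schedule (illustrative only, check the config you train with):
REF_BATCH = 16  # images per batch in the reference schedule
REF_LR = 0.02   # base learning rate in the reference schedule

lr_for_batch_2 = scaled_lr(REF_LR, REF_BATCH, new_batch=2)
print(f"batch size 2 -> base LR {lr_for_batch_2:.4f}")
```

Under these assumed reference values the rule gives 0.0025 for a batch size of 2. If the reference config uses a different base learning rate or batch size, the same formula yields a different value, which is what this question is trying to pin down.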