
Make Elastic Training Flexible to GPU Memory  #40

@ZeyaWang

Description

Currently, the user must manually choose a local batch-size upper bound in adaptdl so that too large a batch size does not cause out-of-memory errors on the available GPU. AdaptDL should automate this choice, so that elastic training can never exceed the GPU memory limit.
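One way this could be automated is to probe for the largest local batch size that fits in GPU memory, e.g. with a binary search over candidate sizes. A minimal sketch, not adaptdl's actual API: the `fits_in_memory` probe is a hypothetical callback that in practice would run a trial forward/backward pass and catch CUDA out-of-memory errors.

```python
def max_batch_size(fits_in_memory, lo=1, hi=65536):
    """Binary-search the largest batch size for which fits_in_memory() is True.

    Assumes the probe is monotone: if batch size b fits, every b' < b fits.
    Returns 0 if even `lo` does not fit.
    """
    if not fits_in_memory(lo):
        return 0
    # Invariant: `lo` always fits; shrink [lo, hi] until they meet.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if fits_in_memory(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo

# Hypothetical probe: pretend the GPU fits at most 384 samples per batch.
print(max_batch_size(lambda b: b <= 384))  # prints 384
```

This requires only O(log hi) trial runs at startup, after which the discovered bound can cap the batch sizes that elastic scaling is allowed to propose.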

Metadata

Labels

enhancement (New feature or request)
