Description
Hi! I'm working on a value function decomposition model, and I'm trying to get overcooked (v1) results for some baselines, including QMIX.
I understand that this library mainly provides the environments, and that developing the baselines may not be your primary concern. Still, I have tried taking the QMIX model from qmix_rnn.py and combining it with the CNN model for overcooked, and it just won't learn anything. I'm hoping someone here has some insight.
Notably, the losses skyrocket in the early stages of training. I've tried many ways to stabilize learning, with little success: reducing learning rates, decreasing the Polyak tau, increasing target update intervals, adding normalization layers, and so on.
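For anyone unfamiliar with the two target-network knobs mentioned above, here is a minimal sketch in plain Python of what "Polyak tau" and "target update interval" usually mean. It is not the code from qmix_rnn.py: the function names `polyak_update` and `maybe_update_target`, and the flat dict-of-lists parameter layout, are assumptions made up for illustration.

```python
def polyak_update(target_params, online_params, tau):
    """Soft update: target <- (1 - tau) * target + tau * online.

    Parameters are assumed (for illustration only) to be dicts mapping
    a name to a flat list of floats.
    """
    return {
        name: [
            (1.0 - tau) * t + tau * o
            for t, o in zip(target_params[name], online_params[name])
        ]
        for name in target_params
    }


def maybe_update_target(step, interval, target_params, online_params, tau):
    """Apply the soft update only every `interval` steps; otherwise keep the target."""
    if step % interval == 0:
        return polyak_update(target_params, online_params, tau)
    return target_params


if __name__ == "__main__":
    target = {"w": [0.0, 2.0]}
    online = {"w": [1.0, 4.0]}
    # A small tau moves the target only slightly toward the online network.
    print(maybe_update_target(200, 200, target, online, tau=0.1))
    # Off-interval steps leave the target untouched.
    print(maybe_update_target(201, 200, target, online, tau=0.1))
```

With `tau=1.0` this reduces to a hard copy of the online parameters. A smaller tau or a larger interval both make the bootstrapped targets change more slowly, which is why they are common first things to try when the loss diverges.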
Has anybody experienced similar issues when training with the QMIX model?