Description
Hi!
First of all, thanks for sharing your code. RL is a super interesting topic, but I'm still new to it, so my apologies in advance if these questions are too obvious.
I'm trying to train the DQN on 5 landmarks, using a data loader similar to the filesListBrainMRLandmark that you already have implemented. I changed some parameters to fit my requirements, but in general it follows the same structure. I also changed the BatchSize to 40.
My questions are:
- During training I get a success ratio that goes above 1. In fact, the last evaluation gave me about 6.01. Is this normal? If not, do you have any idea how I can fix it?
- I loaded the last saved model to see how it behaves, but I get the following error:
  File "DQN.py", line 311, in <module>
    pred, num_files)
  File "E:\Code\DQN\common.py", line 82, in play_n_episodes
    render=render,agents=agents)
  File "E:\Code\DQN\common.py", line 54, in play_one_episode
    acts, q_values = predict(obs,agents)
  File "E:\Code\DQN\common.py", line 40, in predict
    q_values = func(*s)
  File "D:\Users\Marcos\Anaconda3\lib\site-packages\tensorpack\predict\base.py", line 39, in __call__
    output = self._do_call(dp)
  File "D:\Users\Marcos\Anaconda3\lib\site-packages\tensorpack\predict\base.py", line 131, in _do_call
    return self._callable(*dp)
  File "D:\Users\Marcos\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1226, in _generic_run
    return self.run(fetches, feed_dict=feed_dict, **kwargs)
  File "D:\Users\Marcos\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
    run_metadata_ptr)
  File "D:\Users\Marcos\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1149, in _run
    str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (45, 45, 45, 4) for Tensor 'state_2:0', which has shape '(?, 45, 45, 45, 4)'
Again, do you have any idea how I can fix this, or which part of the code I should look at?
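To make the mismatch concrete, here is a minimal NumPy sketch of how I read the error message. It is not taken from the repository: `obs`, `expected` and `matches` are names I made up for illustration, and the idea that the missing leading axis is simply a batch dimension of size 1 is my assumption, not something I have confirmed in the code.

```python
import numpy as np

# Hypothetical stand-in for the value being fed: a rank-4 array,
# matching the shape (45, 45, 45, 4) reported in the error.
obs = np.zeros((45, 45, 45, 4), dtype=np.float32)

# Shape reported for the placeholder; None stands in for the '?' dimension.
expected = (None, 45, 45, 45, 4)


def matches(shape, expected):
    """Return True if `shape` has the same rank as `expected` and agrees
    on every dimension that is not None."""
    return len(shape) == len(expected) and all(
        e is None or s == e for s, e in zip(shape, expected)
    )


# The fed value has rank 4, the placeholder has rank 5, so this is False.
print(matches(obs.shape, expected))

# Assumption: adding a leading axis of size 1 gives a compatible shape.
batched = obs[np.newaxis, ...]
print(batched.shape, matches(batched.shape, expected))
```

If this reading is right, the value passed to `func(*s)` in `predict` would be missing one leading axis, but I don't know whether that axis is dropped in my data loader or somewhere in the multi-agent handling.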
Thanks in advance for your time and patience.