Description
I've set `use_gpu = True`, but GPU usage stays close to zero while the code is running. TensorBoard shows that all operations are assigned to the CPU. When I disable `sess_config = tf.ConfigProto(allow_soft_placement=True)` to force everything to run on the GPU, the console shows the following error:
`INFO:tensorflow:Start a new run and write summaries and checkpoints to E:\Code\PythonScripts\DeepRL\BatchPPO\20180308T091941-pendulum.
WARNING:tensorflow:Number of agents should divide episodes per update.
2018-03-08 09:19:41.315004: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2018-03-08 09:19:41.595863: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 960 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:01:00.0
totalMemory: 2.00GiB freeMemory: 1.64GiB
2018-03-08 09:19:41.596493: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 960, pci bus id: 0000:01:00.0, compute capability: 5.2)
INFO:tensorflow:Graph contains 42003 trainable variables.
2018-03-08 09:19:57.811479: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 960, pci bus id: 0000:01:00.0, compute capability: 5.2)
Traceback (most recent call last):
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\client\session.py", line 1323, in _do_call
return fn(*args)
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\client\session.py", line 1293, in _run_fn
self._extend_graph()
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\client\session.py", line 1354, in _extend_graph
self._session, graph_def.SerializeToString(), status)
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation 'ppo_temporary/episodes/Variable': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
Colocation Debug Info:
Colocation group had the following types and devices:
Switch: GPU CPU
VariableV2: CPU
Identity: CPU
Assign: CPU
RefSwitch: GPU CPU
ScatterUpdate: CPU
AssignAdd: CPU
[[Node: ppo_temporary/episodes/Variable = VariableV2[container="", dtype=DT_INT32, shape=[10], shared_name="", _device="/device:GPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "E:/Code/PythonScripts/DeepRL/BatchPPO/agents/scripts/train.py", line 163, in <module>
tf.app.run()
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "E:/Code/PythonScripts/DeepRL/BatchPPO/agents/scripts/train.py", line 145, in main
for score in train(config, FLAGS.env_processes):
File "E:/Code/PythonScripts/DeepRL/BatchPPO/agents/scripts/train.py", line 127, in train
utility.initialize_variables(sess, saver, config.logdir)
File "E:\Code\PythonScripts\DeepRL\BatchPPO\agents\scripts\utility.py", line 116, in initialize_variables
tf.global_variables_initializer()))
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\client\session.py", line 889, in run
run_metadata_ptr)
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\client\session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\client\session.py", line 1317, in _do_run
options, run_metadata)
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\client\session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation 'ppo_temporary/episodes/Variable': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
Colocation Debug Info:
Colocation group had the following types and devices:
Switch: GPU CPU
VariableV2: CPU
Identity: CPU
Assign: CPU
RefSwitch: GPU CPU
ScatterUpdate: CPU
AssignAdd: CPU
[[Node: ppo_temporary/episodes/Variable = VariableV2[container="", dtype=DT_INT32, shape=[10], shared_name="", _device="/device:GPU:0"]()]]
Caused by op 'ppo_temporary/episodes/Variable', defined at:
File "E:/Code/PythonScripts/DeepRL/BatchPPO/agents/scripts/train.py", line 163, in <module>
tf.app.run()
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "E:/Code/PythonScripts/DeepRL/BatchPPO/agents/scripts/train.py", line 145, in main
for score in train(config, FLAGS.env_processes):
File "E:/Code/PythonScripts/DeepRL/BatchPPO/agents/scripts/train.py", line 113, in train
batch_env, config.algorithm, config)
File "E:\Code\PythonScripts\DeepRL\BatchPPO\agents\scripts\utility.py", line 48, in define_simulation_graph
algo = algo_cls(batch_env, step, is_training, should_log, config)
File "E:\Code\PythonScripts\DeepRL\BatchPPO\agents\ppo\algorithm.py", line 78, in __init__
template, len(batch_env), config.max_length, 'episodes')
File "E:\Code\PythonScripts\DeepRL\BatchPPO\agents\ppo\memory.py", line 44, in __init__
self._length = tf.Variable(tf.zeros(capacity, tf.int32), False)
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\ops\variables.py", line 213, in __init__
constraint=constraint)
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\ops\variables.py", line 331, in _init_from_args
name=name)
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\ops\state_ops.py", line 133, in variable_op_v2
shared_name=shared_name)
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 926, in _variable_v2
shared_name=shared_name, name=name)
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\framework\ops.py", line 2956, in create_op
op_def=op_def)
File "D:\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\framework\ops.py", line 1470, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Cannot assign a device for operation 'ppo_temporary/episodes/Variable': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
Colocation Debug Info:
Colocation group had the following types and devices:
Switch: GPU CPU
VariableV2: CPU
Identity: CPU
Assign: CPU
RefSwitch: GPU CPU
ScatterUpdate: CPU
AssignAdd: CPU
[[Node: ppo_temporary/episodes/Variable = VariableV2[container="", dtype=DT_INT32, shape=[10], shared_name="", _device="/device:GPU:0"]()]]`
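It seems that TensorFlow cannot place an int32 variable on the GPU: for this `DT_INT32` variable, the colocation debug info lists only CPU for `VariableV2`, `Assign`, `ScatterUpdate`, and `AssignAdd`.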
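To make the "Colocation Debug Info" section easier to read, here is a minimal sketch in plain Python that parses the lines pasted above and lists the op types that have no GPU entry. It does not call TensorFlow at all; the helper names `parse_colocation_info` and `ops_without_device` are made up for this illustration, and the input is simply the text copied from the error message.

```python
# Colocation debug lines copied verbatim from the error message above.
COLOCATION_INFO = """\
Switch: GPU CPU
VariableV2: CPU
Identity: CPU
Assign: CPU
RefSwitch: GPU CPU
ScatterUpdate: CPU
AssignAdd: CPU
"""


def parse_colocation_info(text):
    """Map each op type to the set of device types listed for it.

    Hypothetical helper: it only splits the pasted log text.
    """
    kernels = {}
    for line in text.splitlines():
        op_type, sep, devices = line.partition(":")
        if not sep:
            continue  # skip lines that are not of the form "OpType: DEVICES"
        kernels[op_type.strip()] = set(devices.split())
    return kernels


def ops_without_device(kernels, device="GPU"):
    """Return the op types whose device list does not include `device`."""
    return sorted(op for op, devices in kernels.items() if device not in devices)


kernels = parse_colocation_info(COLOCATION_INFO)
cpu_only_ops = ops_without_device(kernels)
print(cpu_only_ops)
```

All ops in a colocation group have to share one device, so a single CPU-only op type in the list is enough to make an explicit `/device:GPU:0` request for `ppo_temporary/episodes/Variable` fail. With `allow_soft_placement=True`, TensorFlow is allowed to fall back to the CPU for such a group instead of raising this error.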