Description
I used the code to train on my own dataset, but the error below is raised at sess.run(). The full log follows. I changed some arguments, such as net_input_height and batch_p. My TensorFlow version is 1.7. I don't know what is wrong here.
Instructions for updating:
Use the retry module or similar alternatives.
2018-09-27 11:12:06,474 [INFO] train: Training using the following parameters:
2018-09-27 11:12:06,474 [INFO] train: batch_k: 4
2018-09-27 11:12:06,474 [INFO] train: batch_p: 8
2018-09-27 11:12:06,474 [INFO] train: checkpoint_frequency: 1000
2018-09-27 11:12:06,474 [INFO] train: crop_augment: False
2018-09-27 11:12:06,474 [INFO] train: decay_start_iteration: 100000
2018-09-27 11:12:06,474 [INFO] train: detailed_logs: False
2018-09-27 11:12:06,474 [INFO] train: embedding_dim: 128
2018-09-27 11:12:06,475 [INFO] train: experiment_root: F:/projector/GestureClassification/TripletBasedGestureRecognition/experiment_root/20180926/
2018-09-27 11:12:06,475 [INFO] train: flip_augment: False
2018-09-27 11:12:06,475 [INFO] train: head_name: fc1024
2018-09-27 11:12:06,475 [INFO] train: image_root: F:/projector/GestureClassification/data/img/20180919/triplet_data/img/
2018-09-27 11:12:06,475 [INFO] train: initial_checkpoint: None
2018-09-27 11:12:06,475 [INFO] train: learning_rate: 0.0003
2018-09-27 11:12:06,475 [INFO] train: loading_threads: 4
2018-09-27 11:12:06,475 [INFO] train: loss: batch_hard
2018-09-27 11:12:06,476 [INFO] train: margin: soft
2018-09-27 11:12:06,476 [INFO] train: metric: euclidean
2018-09-27 11:12:06,476 [INFO] train: model_name: resnet_v1_50
2018-09-27 11:12:06,476 [INFO] train: net_input_height: 64
2018-09-27 11:12:06,476 [INFO] train: net_input_width: 64
2018-09-27 11:12:06,476 [INFO] train: pre_crop_height: 64
2018-09-27 11:12:06,476 [INFO] train: pre_crop_width: 64
2018-09-27 11:12:06,476 [INFO] train: resume: False
2018-09-27 11:12:06,476 [INFO] train: train_iterations: 250000
2018-09-27 11:12:06,476 [INFO] train: train_set: F:/projector/GestureClassification/data/img/20180919/triplet_data/gesture_train.csv
2018-09-27 11:12:07,403 [INFO] tensorflow: Scale of 0 disables regularizer.
2018-09-27 11:12:07,403 [INFO] tensorflow: Scale of 0 disables regularizer.
2018-09-27 11:12:08,569 [WARNING] tensorflow: From F:\projector\GestureClassification\TripletBasedGestureRecognition\triplet-reid\nets\resnet_v1.py:219: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
2018-09-27 11:12:08,569 [WARNING] tensorflow: From F:\projector\GestureClassification\TripletBasedGestureRecognition\triplet-reid\nets\resnet_v1.py:219: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\ops\gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2018-09-27 11:12:11.533610: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-09-27 11:12:11.936193: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1344] Found device 0 with properties:
name: GeForce GTX 1060 5GB major: 6 minor: 1 memoryClockRate(GHz): 1.7085
pciBusID: 0000:01:00.0
totalMemory: 5.00GiB freeMemory: 4.12GiB
2018-09-27 11:12:11.936710: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1423] Adding visible gpu devices: 0
2018-09-27 11:12:14.388590: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-09-27 11:12:14.388811: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:917]      0
2018-09-27 11:12:14.388948: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:930] 0:   N
2018-09-27 11:12:14.415769: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3871 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 5GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-09-27 11:12:16.275624: I T:\src\github\tensorflow\tensorflow\core\kernels\cuda_solvers.cc:159] Creating CudaSolver handles for stream 000001A50E54E080
2018-09-27 11:12:20,572 [INFO] tensorflow: F:/projector/GestureClassification/TripletBasedGestureRecognition/experiment_root/20180926/checkpoint-0 is not in all_model_checkpoint_paths. Manually adding it.
2018-09-27 11:12:20,572 [INFO] tensorflow: F:/projector/GestureClassification/TripletBasedGestureRecognition/experiment_root/20180926/checkpoint-0 is not in all_model_checkpoint_paths. Manually adding it.
2018-09-27 11:12:23,207 [INFO] train: Starting training from iteration 0.
Traceback (most recent call last):
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 1327, in _do_call
return fn(*args)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 1312, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 1420, in _call_tf_sessionrun
status, run_metadata)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,64,64,3], [?], [?]], output_types=[DT_FLOAT, DT_STRING, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "F:/projector/GestureClassification/TripletBasedGestureRecognition/triplet-reid/train.py", line 439, in <module>
main()
File "F:/projector/GestureClassification/TripletBasedGestureRecognition/triplet-reid/train.py", line 393, in main
prec_at_k, endpoints['emb'], losses, fids])
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 905, in run
run_metadata_ptr)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 1140, in _run
feed_dict_tensor, options, run_metadata)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 1321, in _do_run
run_metadata)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,64,64,3], [?], [?]], output_types=[DT_FLOAT, DT_STRING, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]]
Caused by op 'IteratorGetNext', defined at:
File "F:/projector/GestureClassification/TripletBasedGestureRecognition/triplet-reid/train.py", line 439, in <module>
main()
File "F:/projector/GestureClassification/TripletBasedGestureRecognition/triplet-reid/train.py", line 280, in main
images, fids, pids = dataset.make_one_shot_iterator().get_next()
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\data\ops\iterator_ops.py", line 366, in get_next
name=name)), self._output_types,
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\ops\gen_dataset_ops.py", line 1484, in iterator_get_next
output_shapes=output_shapes, name=name)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\framework\ops.py", line 3290, in create_op
op_def=op_def)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\framework\ops.py", line 1654, in __init__
self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access
OutOfRangeError (see above for traceback): End of sequence
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,64,64,3], [?], [?]], output_types=[DT_FLOAT, DT_STRING, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]]
Process finished with exit code 1
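
One possible check, not confirmed by the log above: since "End of sequence" is raised on the very first IteratorGetNext, the input dataset may be empty before training starts. The sketch below assumes two things that are not shown in this post: that the train_set CSV has one `identity,filename` pair per line, and that the training script keeps only a whole number of batch_p-sized groups of identities (so fewer than batch_p distinct identities would leave nothing to iterate over). The function name `count_usable_pids` and the sample data are hypothetical.

```python
import csv
import io


def count_usable_pids(csv_file, batch_p):
    """Return (unique identities, identities left after truncating to a multiple of batch_p).

    Assumes each non-empty row is 'identity,filename'; this layout is an assumption.
    """
    pids = {row[0] for row in csv.reader(csv_file) if row}
    usable = (len(pids) // batch_p) * batch_p
    return len(pids), usable


# Hypothetical sample with 3 identities and 2 images each.
sample = io.StringIO(
    "a,a1.jpg\na,a2.jpg\n"
    "b,b1.jpg\nb,b2.jpg\n"
    "c,c1.jpg\nc,c2.jpg\n"
)
total, usable = count_usable_pids(sample, batch_p=8)
print("unique identities:", total, "usable with batch_p=8:", usable)
```

To try it on the real file, open the train_set path from the log with `open(path, newline="")` and pass the file object in place of `sample`. If the second number is 0 under these assumptions, lowering batch_p or adding more identities would be the first thing to try.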