You may use the compressed model for better memory efficiency.
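For anyone unsure how to produce one: DeePMD-kit ships a `dp compress` command that takes a frozen model and writes a compressed one. Below is a minimal sketch that wraps the call from Python. The file names `graph.pb` and `graph-compress.pb` are placeholders (they do not come from this thread), and the sketch assumes `dp` is on `PATH` in the same environment as `lmp`.

```python
import shutil
import subprocess


def build_compress_command(frozen_model, compressed_model):
    """Return the `dp compress` argument list for a frozen model file."""
    return ["dp", "compress", "-i", frozen_model, "-o", compressed_model]


def compress_model(frozen_model, compressed_model):
    """Run `dp compress` if the DeePMD-kit CLI is available.

    Returns True when the command exits with status 0, False otherwise
    (including when `dp` cannot be found on PATH).
    """
    if shutil.which("dp") is None:
        return False
    cmd = build_compress_command(frozen_model, compressed_model)
    return subprocess.run(cmd, check=False).returncode == 0


# Placeholder file names; replace them with your own frozen model.
command = build_compress_command("graph.pb", "graph-compress.pb")
print(" ".join(command))
```

After compression, the LAMMPS input would point `pair_style deepmd` at the compressed file instead of the original one. Whether that is enough to avoid the OOM for this system size is not something the thread confirms.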
The error is "OP_REQUIRES failed at concat_op.cc:158 : RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[1,20678400] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
RESOURCE_EXHAUSTED: 2 root error(s) found.
(0) RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[1,20678400] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node concat}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
(1) RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[1,20678400] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node concat}}]] "
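To put the failing request in perspective, here is a small worked calculation of how much memory a tensor of shape `[1,20678400]` and type double needs. The helper name `tensor_nbytes` is only illustrative. The one assumption is that a double takes 8 bytes.

```python
import math

FLOAT64_BYTES = 8  # size of one double-precision value


def tensor_nbytes(shape, itemsize):
    """Bytes needed for a dense tensor with the given shape and element size."""
    return math.prod(shape) * itemsize


nbytes = tensor_nbytes((1, 20678400), FLOAT64_BYTES)
mib = nbytes / 2**20
print(nbytes, f"{mib:.2f} MiB")  # 165427200 157.76 MiB
```

This matches the size the allocator reports in the log ("157.76MiB (rounded to 165427200)"), so the failed request is a single buffer of roughly 158 MiB rather than a multi-gigabyte one.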
The GPU information during the test is as follows:
Limit: 30680698060 bytes
InUse: 4784566016 bytes
MaxInUse: 4784566016 bytes
NumAllocs: 99
MaxAllocSize: 846987264 bytes
Reserved: 0
PeakReserved: 0
LargestFreeBlock: 0
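These byte counts are easier to read in GiB. The sketch below converts the numbers above and compares usage against the limit. The dictionary and variable names are illustrative only; the values are copied from the stats.

```python
GIB = 2**30

# Values copied from the allocator stats above.
stats = {
    "Limit": 30680698060,
    "InUse": 4784566016,
    "MaxAllocSize": 846987264,
}

limit_gib = stats["Limit"] / GIB
in_use_gib = stats["InUse"] / GIB
used_fraction = stats["InUse"] / stats["Limit"]

print(f"limit:  {limit_gib:.2f} GiB")
print(f"in use: {in_use_gib:.2f} GiB")
print(f"used:   {used_fraction:.1%} of the allocator limit")
```

So the TensorFlow allocator reports about 4.46 GiB in use against a limit of roughly 28.6 GiB, meaning the OOM happens well below the limit TensorFlow believes it has.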
The log is as follows:
2022-12-05 22:32:21.860346: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.861897: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 3.60G (3865470464 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.863532: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 3.24G (3478923264 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.865157: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 2.92G (3131030784 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.887481: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.888920: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 3.60G (3865470464 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.890322: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 3.24G (3478923264 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.891723: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 2.92G (3131030784 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.893115: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 2.62G (2817927680 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.894514: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 2.36G (2536134912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.895913: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 2.12G (2282521344 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.897309: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 1.91G (2054269184 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.898704: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 1.72G (1848842240 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.900096: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 1.55G (1663958016 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.901486: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 1.39G (1497562112 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.902880: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 1.25G (1347805952 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.904276: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 1.13G (1213025280 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.905669: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 1.02G (1091722752 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.907066: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 937.03M (982550528 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.908459: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 843.33M (884295424 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.909851: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 759.00M (795865856 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.911246: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 683.10M (716279296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.912639: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 614.79M (644651520 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.914032: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 553.31M (580186368 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.915440: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 497.98M (522167808 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.916840: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 448.18M (469951232 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.918237: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 403.36M (422956288 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.919639: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 363.03M (380660736 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.921035: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 326.72M (342594816 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.922435: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 294.05M (308335360 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.923837: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 264.65M (277501952 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.925240: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 238.18M (249751808 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.926644: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 214.36M (224776704 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.928053: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 192.93M (202299136 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.929461: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 173.63M (182069248 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:21.930860: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:31.932387: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:31.933794: I tensorflow/stream_executor/cuda/cuda_driver.cc:739] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2022-12-05 22:32:31.933808: W tensorflow/core/common_runtime/bfc_allocator.cc:479] Allocator (GPU_0_bfc) ran out of memory trying to allocate 157.76MiB (rounded to 165427200)requested by op concat
If the cause is memory fragmentation maybe the environment variable 'TF_GPU_ALLOCATOR=cuda_malloc_async' will improve the situation.
Current allocation summary follows.
Current allocation summary follows.
2022-12-05 22:32:31.933818: I tensorflow/core/common_runtime/bfc_allocator.cc:1027] BFCAllocator dump for GPU_0_bfc
2022-12-05 22:32:31.933825: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (256): Total Chunks: 13, Chunks in use: 13. 3.2KiB allocated for chunks. 3.2KiB in use in bin. 1.6KiB client-requested in use in bin.
2022-12-05 22:32:31.933830: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (512): Total Chunks: 4, Chunks in use: 4. 2.0KiB allocated for chunks. 2.0KiB in use in bin. 1.6KiB client-requested in use in bin.
2022-12-05 22:32:31.933834: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (1024): Total Chunks: 5, Chunks in use: 5. 5.2KiB allocated for chunks. 5.2KiB in use in bin. 4.1KiB client-requested in use in bin.
2022-12-05 22:32:31.933839: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (2048): Total Chunks: 12, Chunks in use: 12. 24.0KiB allocated for chunks. 24.0KiB in use in bin. 22.5KiB client-requested in use in bin.
2022-12-05 22:32:31.933843: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (4096): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-12-05 22:32:31.933849: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (8192): Total Chunks: 6, Chunks in use: 6. 57.5KiB allocated for chunks. 57.5KiB in use in bin. 56.3KiB client-requested in use in bin.
2022-12-05 22:32:31.933853: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (16384): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-12-05 22:32:31.933857: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (32768): Total Chunks: 4, Chunks in use: 4. 157.0KiB allocated for chunks. 157.0KiB in use in bin. 156.2KiB client-requested in use in bin.
2022-12-05 22:32:31.933862: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (65536): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-12-05 22:32:31.933866: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (131072): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-12-05 22:32:31.933869: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (262144): Total Chunks: 3, Chunks in use: 3. 1.32MiB allocated for chunks. 1.32MiB in use in bin. 1.32MiB client-requested in use in bin.
2022-12-05 22:32:31.933874: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (524288): Total Chunks: 1, Chunks in use: 1. 671.5KiB allocated for chunks. 671.5KiB in use in bin. 450.0KiB client-requested in use in bin.
2022-12-05 22:32:31.933879: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (1048576): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-12-05 22:32:31.933883: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (2097152): Total Chunks: 3, Chunks in use: 2. 7.78MiB allocated for chunks. 5.03MiB in use in bin. 5.03MiB client-requested in use in bin.
2022-12-05 22:32:31.933888: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (4194304): Total Chunks: 4, Chunks in use: 4. 21.36MiB allocated for chunks. 21.36MiB in use in bin. 19.98MiB client-requested in use in bin.
2022-12-05 22:32:31.933892: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (8388608): Total Chunks: 4, Chunks in use: 3. 46.86MiB allocated for chunks. 37.29MiB in use in bin. 37.29MiB client-requested in use in bin.
2022-12-05 22:32:31.933896: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (16777216): Total Chunks: 2, Chunks in use: 2. 50.55MiB allocated for chunks. 50.55MiB in use in bin. 50.55MiB client-requested in use in bin.
2022-12-05 22:32:31.933902: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (33554432): Total Chunks: 5, Chunks in use: 3. 241.73MiB allocated for chunks. 130.98MiB in use in bin. 130.98MiB client-requested in use in bin.
2022-12-05 22:32:31.933907: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (67108864): Total Chunks: 3, Chunks in use: 3. 256.40MiB allocated for chunks. 256.40MiB in use in bin. 256.40MiB client-requested in use in bin.
2022-12-05 22:32:31.933913: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (134217728): Total Chunks: 6, Chunks in use: 6. 962.50MiB allocated for chunks. 962.50MiB in use in bin. 830.58MiB client-requested in use in bin.
2022-12-05 22:32:31.933917: I tensorflow/core/common_runtime/bfc_allocator.cc:1034] Bin (268435456): Total Chunks: 8, Chunks in use: 8. 3.02GiB allocated for chunks. 3.02GiB in use in bin. 2.82GiB client-requested in use in bin.
2022-12-05 22:32:31.933921: I tensorflow/core/common_runtime/bfc_allocator.cc:1050] Bin for 157.76MiB was 128.00MiB, Chunk State:
2022-12-05 22:32:31.933926: I tensorflow/core/common_runtime/bfc_allocator.cc:1063] Next region of size 2147483648
2022-12-05 22:32:31.933932: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a1fc000000 of size 317473792 next 74
2022-12-05 22:32:31.933936: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a20eec4400 of size 634947328 next 75
2022-12-05 22:32:31.933938: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a234c4cb00 of size 634947328 next 76
2022-12-05 22:32:31.933941: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a25a9d5200 of size 316259328 next 77
2022-12-05 22:32:31.933944: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a26d770e00 of size 13750528 next 79
2022-12-05 22:32:31.933947: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a26e48df00 of size 110425600 next 81
2022-12-05 22:32:31.933950: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a274ddd500 of size 55001600 next 82
2022-12-05 22:32:31.933953: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] Free at 14a278251700 of size 64678144 next 18446744073709551615
2022-12-05 22:32:31.933957: I tensorflow/core/common_runtime/bfc_allocator.cc:1063] Next region of size 1073741824
2022-12-05 22:32:31.933960: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a27c000000 of size 317473792 next 69
2022-12-05 22:32:31.933963: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a28eec4400 of size 158129664 next 70
2022-12-05 22:32:31.933966: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a298592200 of size 27606528 next 78
2022-12-05 22:32:31.933969: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] Free at 14a299fe6000 of size 51458304 next 71
2022-12-05 22:32:31.933972: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a29d0f9100 of size 79064832 next 72
2022-12-05 22:32:31.933975: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a2a1c60000 of size 440008704 next 18446744073709551615
2022-12-05 22:32:31.933978: I tensorflow/core/common_runtime/bfc_allocator.cc:1063] Next region of size 1073741824
2022-12-05 22:32:31.933981: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a2e8000000 of size 158736896 next 64
2022-12-05 22:32:31.933984: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a2f1762200 of size 39532544 next 65
2022-12-05 22:32:31.933987: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a2f3d15a00 of size 158736896 next 52
2022-12-05 22:32:31.933989: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a2fd477c00 of size 158736896 next 66
2022-12-05 22:32:31.933992: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a306bd9e00 of size 317473792 next 67
2022-12-05 22:32:31.933995: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a319a9e200 of size 240524800 next 18446744073709551615
2022-12-05 22:32:31.933998: I tensorflow/core/common_runtime/bfc_allocator.cc:1063] Next region of size 268435456
2022-12-05 22:32:31.934001: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a32c000000 of size 42804480 next 58
2022-12-05 22:32:31.934004: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a32e8d2500 of size 7134208 next 59
2022-12-05 22:32:31.934006: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a32efa0100 of size 4743936 next 53
2022-12-05 22:32:31.934009: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a32f426400 of size 79368448 next 63
2022-12-05 22:32:31.934014: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a333fd7500 of size 134384384 next 18446744073709551615
2022-12-05 22:32:31.934017: I tensorflow/core/common_runtime/bfc_allocator.cc:1063] Next region of size 268435456
2022-12-05 22:32:31.934020: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a33c000000 of size 268435456 next 18446744073709551615
2022-12-05 22:32:31.934023: I tensorflow/core/common_runtime/bfc_allocator.cc:1063] Next region of size 67108864
2022-12-05 22:32:31.934027: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a34c000000 of size 12699136 next 54
2022-12-05 22:32:31.934030: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a34cc1c600 of size 25398016 next 51
2022-12-05 22:32:31.934034: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a34e455100 of size 12650496 next 60
2022-12-05 22:32:31.934037: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a34f065900 of size 6325248 next 62
2022-12-05 22:32:31.934040: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] Free at 14a34f66dd00 of size 10035968 next 18446744073709551615
2022-12-05 22:32:31.934042: I tensorflow/core/common_runtime/bfc_allocator.cc:1063] Next region of size 2097152
2022-12-05 22:32:31.934045: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356200000 of size 256 next 1
2022-12-05 22:32:31.934048: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356200100 of size 1280 next 2
2022-12-05 22:32:31.934051: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356200600 of size 256 next 3
2022-12-05 22:32:31.934054: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356200700 of size 256 next 4
2022-12-05 22:32:31.934057: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356200800 of size 2048 next 6
2022-12-05 22:32:31.934060: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356201000 of size 460800 next 7
2022-12-05 22:32:31.934063: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356271800 of size 2048 next 8
2022-12-05 22:32:31.934065: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356272000 of size 2048 next 9
2022-12-05 22:32:31.934069: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356272800 of size 460800 next 10
2022-12-05 22:32:31.934071: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a3562e3000 of size 2048 next 11
2022-12-05 22:32:31.934074: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a3562e3800 of size 2048 next 12
2022-12-05 22:32:31.934077: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a3562e4000 of size 2048 next 13
2022-12-05 22:32:31.934080: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a3562e4800 of size 256 next 14
2022-12-05 22:32:31.934083: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a3562e4900 of size 256 next 15
2022-12-05 22:32:31.934085: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a3562e4a00 of size 2048 next 16
2022-12-05 22:32:31.934088: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a3562e5200 of size 2048 next 17
2022-12-05 22:32:31.934091: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a3562e5a00 of size 2048 next 18
2022-12-05 22:32:31.934094: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a3562e6200 of size 460800 next 19
2022-12-05 22:32:31.934096: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356356a00 of size 2048 next 20
2022-12-05 22:32:31.934099: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356357200 of size 2048 next 21
2022-12-05 22:32:31.934101: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356357a00 of size 2048 next 22
2022-12-05 22:32:31.934104: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356358200 of size 687616 next 18446744073709551615
2022-12-05 22:32:31.934107: I tensorflow/core/common_runtime/bfc_allocator.cc:1063] Next region of size 4194304
2022-12-05 22:32:31.934110: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356800000 of size 4194304 next 18446744073709551615
2022-12-05 22:32:31.934112: I tensorflow/core/common_runtime/bfc_allocator.cc:1063] Next region of size 8388608
2022-12-05 22:32:31.934115: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356c00000 of size 3072000 next 24
2022-12-05 22:32:31.934120: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356eee000 of size 40192 next 25
2022-12-05 22:32:31.934123: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356ef7d00 of size 1024 next 26
2022-12-05 22:32:31.934126: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356ef8100 of size 10240 next 27
2022-12-05 22:32:31.934129: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356efa900 of size 512 next 28
2022-12-05 22:32:31.934132: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356efab00 of size 256 next 29
2022-12-05 22:32:31.934134: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356efac00 of size 256 next 30
2022-12-05 22:32:31.934137: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356efad00 of size 40192 next 31
2022-12-05 22:32:31.934140: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f04a00 of size 1024 next 32
2022-12-05 22:32:31.934143: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f04e00 of size 10240 next 33
2022-12-05 22:32:31.934146: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f07600 of size 512 next 34
2022-12-05 22:32:31.934148: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f07800 of size 256 next 35
2022-12-05 22:32:31.934152: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f07900 of size 256 next 36
2022-12-05 22:32:31.934155: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f07a00 of size 40192 next 37
2022-12-05 22:32:31.934158: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f11700 of size 1024 next 38
2022-12-05 22:32:31.934161: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f11b00 of size 10240 next 39
2022-12-05 22:32:31.934163: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f14300 of size 512 next 40
2022-12-05 22:32:31.934166: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f14500 of size 256 next 41
2022-12-05 22:32:31.934169: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f14600 of size 256 next 42
2022-12-05 22:32:31.934172: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f14700 of size 40192 next 43
2022-12-05 22:32:31.934174: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f1e400 of size 1024 next 44
2022-12-05 22:32:31.934177: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f1e800 of size 256 next 45
2022-12-05 22:32:31.934198: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f1e900 of size 10240 next 46
2022-12-05 22:32:31.934202: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f21100 of size 512 next 47
2022-12-05 22:32:31.934204: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f21300 of size 256 next 48
2022-12-05 22:32:31.934207: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f21400 of size 8960 next 49
2022-12-05 22:32:31.934210: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f23700 of size 8960 next 50
2022-12-05 22:32:31.934212: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a356f25a00 of size 2200064 next 80
2022-12-05 22:32:31.934215: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] Free at 14a35713ec00 of size 2888704 next 18446744073709551615
2022-12-05 22:32:31.934218: I tensorflow/core/common_runtime/bfc_allocator.cc:1088] Summary of in-use Chunks by size:
2022-12-05 22:32:31.934222: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 13 Chunks of size 256 totalling 3.2KiB
2022-12-05 22:32:31.934227: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 4 Chunks of size 512 totalling 2.0KiB
2022-12-05 22:32:31.934230: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 4 Chunks of size 1024 totalling 4.0KiB
2022-12-05 22:32:31.934233: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 1280 totalling 1.2KiB
2022-12-05 22:32:31.934237: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 12 Chunks of size 2048 totalling 24.0KiB
2022-12-05 22:32:31.934240: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 2 Chunks of size 8960 totalling 17.5KiB
2022-12-05 22:32:31.934243: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 4 Chunks of size 10240 totalling 40.0KiB
2022-12-05 22:32:31.934247: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 4 Chunks of size 40192 totalling 157.0KiB
2022-12-05 22:32:31.934251: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 3 Chunks of size 460800 totalling 1.32MiB
2022-12-05 22:32:31.934255: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 687616 totalling 671.5KiB
2022-12-05 22:32:31.934258: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 2200064 totalling 2.10MiB
2022-12-05 22:32:31.934261: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 3072000 totalling 2.93MiB
2022-12-05 22:32:31.934264: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 4194304 totalling 4.00MiB
2022-12-05 22:32:31.934267: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 4743936 totalling 4.52MiB
2022-12-05 22:32:31.934270: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 6325248 totalling 6.03MiB
2022-12-05 22:32:31.934273: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 7134208 totalling 6.80MiB
2022-12-05 22:32:31.934276: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 12650496 totalling 12.06MiB
2022-12-05 22:32:31.934280: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 12699136 totalling 12.11MiB
2022-12-05 22:32:31.934284: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 13750528 totalling 13.11MiB
2022-12-05 22:32:31.934288: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 25398016 totalling 24.22MiB
2022-12-05 22:32:31.934291: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 27606528 totalling 26.33MiB
2022-12-05 22:32:31.934294: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 39532544 totalling 37.70MiB
2022-12-05 22:32:31.934298: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 42804480 totalling 40.82MiB
2022-12-05 22:32:31.934301: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 55001600 totalling 52.45MiB
2022-12-05 22:32:31.934304: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 79064832 totalling 75.40MiB
2022-12-05 22:32:31.934308: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 79368448 totalling 75.69MiB
2022-12-05 22:32:31.934311: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 110425600 totalling 105.31MiB
2022-12-05 22:32:31.934314: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 134384384 totalling 128.16MiB
2022-12-05 22:32:31.934318: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 158129664 totalling 150.80MiB
2022-12-05 22:32:31.934321: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 3 Chunks of size 158736896 totalling 454.15MiB
2022-12-05 22:32:31.934325: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 240524800 totalling 229.38MiB
2022-12-05 22:32:31.934328: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 268435456 totalling 256.00MiB
2022-12-05 22:32:31.934332: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 316259328 totalling 301.61MiB
2022-12-05 22:32:31.934335: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 3 Chunks of size 317473792 totalling 908.30MiB
2022-12-05 22:32:31.934339: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 1 Chunks of size 440008704 totalling 419.62MiB
2022-12-05 22:32:31.934342: I tensorflow/core/common_runtime/bfc_allocator.cc:1091] 2 Chunks of size 634947328 totalling 1.18GiB
2022-12-05 22:32:31.934345: I tensorflow/core/common_runtime/bfc_allocator.cc:1095] Sum Total of in-use chunks: 4.46GiB
2022-12-05 22:32:31.934348: I tensorflow/core/common_runtime/bfc_allocator.cc:1097] total_region_allocated_bytes_: 4913627136 memory_limit_: 30680698060 available bytes: 25767070924 curr_region_allocation_bytes_: 4294967296
2022-12-05 22:32:31.934354: I tensorflow/core/common_runtime/bfc_allocator.cc:1103] Stats:
Limit: 30680698060
InUse: 4784566016
MaxInUse: 4784566016
NumAllocs: 99
MaxAllocSize: 846987264
Reserved: 0
PeakReserved: 0
LargestFreeBlock: 0
2022-12-05 22:32:31.934364: W tensorflow/core/common_runtime/bfc_allocator.cc:491] *******************************xxx
2022-12-05 22:32:31.934388: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at concat_op.cc:158 : RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[1,20678400] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
RESOURCE_EXHAUSTED: 2 root error(s) found.
(0) RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[1,20678400] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node concat}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
(1) RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[1,20678400] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node concat}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
0 successful operations.
0 derived errors ignored.
/root/anaconda3/envs/deepmdgpu2.1.1/bin/lmp: line 11: 1126870 Aborted (core dumped) /root/anaconda3/envs/deepmdgpu2.1.1/bin/_lmp "$@"
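One way to read the allocator dump is to compare the free chunks it lists with the size of the failed request. The sketch below does that on a few lines copied from the log above. The regular expressions and function names are my own and are not part of TensorFlow.

```python
import re

# Lines copied verbatim from the log above (one InUse line kept for contrast).
LOG_EXCERPT = """\
2022-12-05 22:32:31.933808: W tensorflow/core/common_runtime/bfc_allocator.cc:479] Allocator (GPU_0_bfc) ran out of memory trying to allocate 157.76MiB (rounded to 165427200)requested by op concat
2022-12-05 22:32:31.933932: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] InUse at 14a1fc000000 of size 317473792 next 74
2022-12-05 22:32:31.933953: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] Free at 14a278251700 of size 64678144 next 18446744073709551615
2022-12-05 22:32:31.933969: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] Free at 14a299fe6000 of size 51458304 next 71
2022-12-05 22:32:31.934040: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] Free at 14a34f66dd00 of size 10035968 next 18446744073709551615
2022-12-05 22:32:31.934215: I tensorflow/core/common_runtime/bfc_allocator.cc:1083] Free at 14a35713ec00 of size 2888704 next 18446744073709551615
"""

FREE_RE = re.compile(r"\bFree at [0-9a-f]+ of size (\d+)")
REQUEST_RE = re.compile(r"rounded to (\d+)\)")


def free_chunk_sizes(log_text):
    """Sizes in bytes of every chunk the dump marks as Free."""
    return [int(m.group(1)) for m in FREE_RE.finditer(log_text)]


def requested_bytes(log_text):
    """Size in bytes of the failed request, or None if it is not in the text."""
    m = REQUEST_RE.search(log_text)
    return int(m.group(1)) if m else None


free_sizes = free_chunk_sizes(LOG_EXCERPT)
request = requested_bytes(LOG_EXCERPT)
largest_free = max(free_sizes)
total_free = sum(free_sizes)

print("request:     ", request)
print("largest free:", largest_free)
print("total free:  ", total_free)
print("fits in one free chunk:", largest_free >= request)
```

The four free chunks add up to 129061120 bytes, which equals `total_region_allocated_bytes_` minus `InUse` in the dump, and the largest one (64678144 bytes) is smaller than the 165427200-byte request. The allocator therefore had to grow its pool, and the repeated `CUDA_ERROR_OUT_OF_MEMORY` lines show the driver refusing even requests of a few hundred MiB. One possible explanation, which the log alone cannot confirm, is that something else (another process, or several MPI ranks sharing the same GPU) is holding most of the device memory; checking `nvidia-smi` while the job runs would tell.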