1. Problem

When running inference on the PaddleOCR models with pdserving, GPU memory gets exhausted. Specifically: the first call to the model drives GPU memory usage up to 15 GB, and the second inference call fails immediately with out of memory. Which settings are wrong?
Question 1: When the service is first started, it already occupies 5 GB of GPU memory. Which parameters control how much GPU memory is taken at startup?
Question 2: The first inference request pushes GPU memory usage to 15 GB; the next request reports out of memory, and from then on the GPU memory stays occupied and is never released. Which parameters should be set to support high concurrency without running out of memory? (See the flag sketch below.)
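For reference, these are the memory-related environment flags I have been looking at (the flag names are standard PaddlePaddle flags, but the values below are only my guesses, not a verified fix):
export FLAGS_allocator_strategy=auto_growth        # allocate GPU memory on demand instead of pre-reserving a large pool
export FLAGS_fraction_of_gpu_memory_to_use=0.3     # cap the fraction of GPU memory the initial allocation may take
python web_service.py --config=config.yml          # then start the service as in step 4 below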

2. System environment: Ubuntu 22.04, NVIDIA A5000 (16 GB)
paddle-bfloat 0.1.7
paddle-serving-app 0.8.3
paddle-serving-client 0.8.3
paddle-serving-server-gpu 0.8.3.post112
paddleocr 2.7.0.0
paddlepaddle-gpu 2.3.2.post112
Deployment followed the tutorial at https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/pdserving/README_CN.md#%E9%83%A8%E7%BD%B2.
Deployment steps:
# 1. Model format conversion
python -m paddle_serving_client.convert --dirname ./ch_PP-OCRv4_det_server_infer --model_filename inference.pdmodel --params_filename inference.pdiparams --serving_server ./ppocr_det_v4_serving/ --serving_client ./ppocr_det_v4_client/
python -m paddle_serving_client.convert --dirname ./ch_PP-OCRv4_rec_server_infer --model_filename inference.pdmodel --params_filename inference.pdiparams --serving_server ./ppocr_rec_v4_serving/ --serving_client ./ppocr_rec_v4_client/
# 2. Move the ppocr_det_v4_client, ppocr_det_v4_serving, ppocr_rec_v4_client, and ppocr_rec_v4_serving folders into ocr-PaddleOCR-2.7/deploy/pdserving
# 3. Modify the config file
# RPC port. rpc_port and http_port must not both be empty. When rpc_port is empty and http_port is not, rpc_port is automatically set to http_port + 1.
rpc_port: 18091
# HTTP port. rpc_port and http_port must not both be empty. When rpc_port is set and http_port is empty, no http_port is generated automatically.
http_port: 9998
# worker_num: maximum concurrency. When build_dag_each_worker=True, the framework creates worker_num processes, each building its own gRPC server and DAG.
## When build_dag_each_worker=False, the framework sets max_workers=worker_num for the main thread's gRPC thread pool.
worker_num: 10
# build_dag_each_worker: False, the framework creates a single DAG within the process; True, the framework creates independent DAGs in each worker process.
build_dag_each_worker: False
dag:
    # Op resource type: True for the thread model, False for the process model
    is_thread_op: False
op:
    det:
        # Concurrency: with is_thread_op=True this is thread-level concurrency, otherwise process-level concurrency
        concurrency: 8
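For reference, the GPU settings for each op live under its local_service_conf block in the same config.yml. A minimal sketch of that section (field names follow the PaddleOCR pdserving sample config; the concurrency values are only an illustration, not something I have verified to fix the problem):
op:
    det:
        concurrency: 2
        local_service_conf:
            client_type: local_predictor
            model_config: ./ppocr_det_v4_serving
            device_type: 1        # 1 = GPU
            devices: "0"          # GPU card id
    rec:
        concurrency: 2
        local_service_conf:
            client_type: local_predictor
            model_config: ./ppocr_rec_v4_serving
            device_type: 1
            devices: "0"
With is_thread_op: False, each concurrency instance is a separate process that loads its own predictor onto the GPU, so det concurrency: 8 plus the rec workers can plausibly fill a 16 GB card on its own; lowering concurrency and restarting the service is usually the first thing to try.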
# 4. Start the service: python web_service.py --config=config.yml
The service starts successfully.

# 5. Run python pipeline_http_client.py
The first image is processed successfully; from the second image onward the GPU memory is reported as full.
The output is recorded below:
../../doc/imgs/00006737.jpg
erro_no:0, err_msg:
('www.997788.c0m中国收藏热线', 0.93801534), [[2.0, 7.0], [329.0, 4.0], [329.0, 27.0], [2.0, 29.0]]
('BOARDING', 0.9911648), [[422.0, 23.0], [657.0, 18.0], [658.0, 58.0], [423.0, 62.0]]
('登机牌', 0.9993414), [[153.0, 24.0], [355.0, 21.0], [356.0, 70.0], [154.0, 74.0]]
('PASS', 0.9975725), [[703.0, 15.0], [820.0, 13.0], [821.0, 56.0], [704.0, 58.0]]
('座位号', 0.99840087), [[676.0, 99.0], [738.0, 97.0], [739.0, 118.0], [677.0, 120.0]]
('序号', 0.99921644), [[490.0, 103.0], [534.0, 103.0], [534.0, 122.0], [490.0, 122.0]]
('SERIALNO', 0.9913897), [[545.0, 103.0], [647.0, 101.0], [647.0, 118.0], [545.0, 121.0]]
('舱位', 0.9949764), [[340.0, 105.0], [385.0, 103.0], [385.0, 126.0], [341.0, 127.0]]
('CLASS', 0.98982906), [[399.0, 105.0], [456.0, 105.0], [456.0, 123.0], [399.0, 123.0]]
('日期DATE', 0.99608606), [[214.0, 107.0], [317.0, 107.0], [317.0, 130.0], [214.0, 130.0]]
('SEATNO', 0.9924663), [[753.0, 99.0], [833.0, 96.0], [833.0, 114.0], [754.0, 117.0]]
('航班', 0.99966276), [[63.0, 111.0], [108.0, 111.0], [108.0, 132.0], [63.0, 132.0]]
('FLIGHT', 0.9781428), [[119.0, 111.0], [189.0, 109.0], [189.0, 127.0], [119.0, 129.0]]
('W', 0.7744087), [[406.0, 133.0], [430.0, 133.0], [430.0, 155.0], [406.0, 155.0]]
('03DEC', 0.98559016), [[234.0, 136.0], [327.0, 135.0], [327.0, 157.0], [234.0, 159.0]]
('MU2379', 0.99838823), [[81.0, 140.0], [210.0, 137.0], [210.0, 160.0], [81.0, 162.0]]
('035', 0.99874926), [[509.0, 131.0], [567.0, 129.0], [567.0, 153.0], [510.0, 155.0]]
('始发地', 0.9990115), [[343.0, 174.0], [406.0, 173.0], [406.0, 194.0], [343.0, 196.0]]
('FROM', 0.98515224), [[420.0, 174.0], [468.0, 174.0], [468.0, 193.0], [420.0, 193.0]]
('登机口', 0.99886125), [[490.0, 174.0], [556.0, 174.0], [556.0, 194.0], [490.0, 194.0]]
('GATE', 0.9974543), [[566.0, 174.0], [613.0, 172.0], [614.0, 190.0], [567.0, 192.0]]
('目的地T0', 0.918258), [[66.0, 179.0], [169.0, 178.0], [169.0, 201.0], [66.0, 202.0]]
('登机时间BDT', 0.99218565), [[677.0, 170.0], [812.0, 167.0], [812.0, 188.0], [677.0, 191.0]]
('福州', 0.9998597), [[98.0, 206.0], [169.0, 206.0], [169.0, 229.0], [98.0, 229.0]]
('TAIYUAN', 0.9712927), [[336.0, 218.0], [474.0, 216.0], [474.0, 236.0], [336.0, 239.0]]
('C11', 0.8256312), [[508.0, 215.0], [553.0, 215.0], [553.0, 234.0], [508.0, 234.0]]
('FUZHOU', 0.9928904), [[89.0, 228.0], [203.0, 226.0], [203.0, 248.0], [89.0, 250.0]]
('身份识别IDNO.', 0.92804295), [[344.0, 238.0], [483.0, 235.0], [483.0, 255.0], [344.0, 258.0]]
('姓名NAME', 0.99572617), [[65.0, 249.0], [172.0, 247.0], [172.0, 270.0], [65.0, 272.0]]
('ZHANGQIWET', 0.9452583), [[77.0, 276.0], [263.0, 272.0], [263.0, 294.0], [77.0, 298.0]]
('票号TKTNO', 0.9782185), [[462.0, 297.0], [578.0, 294.0], [578.0, 314.0], [462.0, 316.0]]
('张祺伟', 0.95761776), [[103.0, 312.0], [209.0, 311.0], [209.0, 334.0], [103.0, 335.0]]
('票价FARE', 0.99265593), [[70.0, 344.0], [163.0, 342.0], [163.0, 362.0], [70.0, 363.0]]
('ETKT7813699238489/1', 0.9963156), [[346.0, 348.0], [661.0, 345.0], [661.0, 365.0], [346.0, 368.0]]
('登机口于起飞前10分钟关闭', 0.994543), [[100.0, 457.0], [343.0, 453.0], [343.0, 473.0], [100.0, 477.0]]
('GATES CLOSE 1O MINUTES BEFORE DEPARTURE TIME', 0.9434192), [[360.0, 452.0], [830.0, 443.0], [830.0, 462.0], [360.0, 471.0]]
../../doc/imgs/00009282.jpg
erro_no:10000, err_msg:[det] failed to predict. (data_id=1 log_id=1) [det|3] Failed to process(batch: [1]): ResourceExhaustedError:
Out of memory error on GPU 0. Cannot allocate 5.062500GB memory on GPU 0, 13.277405GB memory has been allocated and available memory is only 2.459412GB.
Please check whether there is any other process using GPU 0.
If the above ways do not solve the out of memory problem, you can try to use CUDA managed memory. The command is
export FLAGS_use_cuda_managed_memory=false
.(at /paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:87)
. Please check the input dict and checkout PipelineServingLogs/pipeline.log for more details.
For details about error message, see PipelineServingLogs/pipeline.log
../../doc/imgs/00015504.jpg
erro_no:10000, err_msg:[rec] failed to predict. (data_id=2 log_id=2) [rec|1] Failed to process(batch: [2]): ResourceExhaustedError:
Out of memory error on GPU 0. Cannot allocate 429.468750MB memory on GPU 0, 15.416077GB memory has been allocated and available memory is only 328.437500MB.
Please check whether there is any other process using GPU 0.
If the above ways do not solve the out of memory problem, you can try to use CUDA managed memory. The command is
export FLAGS_use_cuda_managed_memory=false
.(at /paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:87)
. Please check the input dict and checkout PipelineServingLogs/pipeline.log for more details.
For details about error message, see PipelineServingLogs/pipeline.log
../../doc/imgs/00018069.jpg
erro_no:10000, err_msg:[det] failed to predict. (data_id=3 log_id=3) [det|7] Failed to process(batch: [3]): (External) CUDNN error(8), CUDNN_STATUS_EXECUTION_FAILED.
[Hint: 'CUDNN_STATUS_EXECUTION_FAILED'. The GPU program failed to execute. This is usually caused by a failure to launch some cuDNN kernel on the GPU, which can occur for multiple reasons. To correct, check that the hardware, an appropriate version of the driver, and the cuDNN library are correctly installed. Otherwise, this may indicate an internal error/bug in the library. ] (at /paddle/paddle/fluid/operators/fused/conv_fusion_op.cu:393)
[operator < conv2d_fusion > error]. Please check the input dict and checkout PipelineServingLogs/pipeline.log for more details.
For details about error message, see PipelineServingLogs/pipeline.log
../../doc/imgs/00056221.jpg
erro_no:10000, err_msg:[det] failed to predict. (data_id=4 log_id=4) [det|1] Failed to process(batch: [4]): ResourceExhaustedError:
Out of memory error on GPU 0. Cannot allocate 9.000000MB memory on GPU 0, 15.707092GB memory has been allocated and available memory is only 30.437500MB.
Please check whether there is any other process using GPU 0.
If the above ways do not solve the out of memory problem, you can try to use CUDA managed memory. The command is
export FLAGS_use_cuda_managed_memory=false
.(at /paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:87)
. Please check the input dict and checkout PipelineServingLogs/pipeline.log for more details.
For details about error message, see PipelineServingLogs/pipeline.log
../../doc/imgs/00057937.jpg
erro_no:10000, err_msg:[det] failed to predict. (data_id=5 log_id=5) [det|4] Failed to process(batch: [5]): ResourceExhaustedError:
Out of memory error on GPU 0. Cannot allocate 9.000000MB memory on GPU 0, 15.707092GB memory has been allocated and available memory is only 30.437500MB.
Please check whether there is any other process using GPU 0.
If the above ways do not solve the out of memory problem, you can try to use CUDA managed memory. The command is
export FLAGS_use_cuda_managed_memory=false
.(at /paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:87)
. Please check the input dict and checkout PipelineServingLogs/pipeline.log for more details.
For details about error message, see PipelineServingLogs/pipeline.log
../../doc/imgs/00059985.jpg
erro_no:10000, err_msg:[det] failed to predict. (data_id=6 log_id=6) [det|2] Failed to process(batch: [6]): ResourceExhaustedError:
Out of memory error on GPU 0. Cannot allocate 10.500000MB memory on GPU 0, 15.707092GB memory has been allocated and available memory is only 30.437500MB.
Please check whether there is any other process using GPU 0.
If the above ways do not solve the out of memory problem, you can try to use CUDA managed memory. The command is
export FLAGS_use_cuda_managed_memory=false
.(at /paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:87)
. Please check the input dict and checkout PipelineServingLogs/pipeline.log for more details.
For details about error message, see PipelineServingLogs/pipeline.log
../../doc/imgs/00077949.jpg
erro_no:10000, err_msg:[det] failed to predict. (data_id=7 log_id=7) [det|2] Failed to process(batch: [7]): ResourceExhaustedError:
Out of memory error on GPU 0. Cannot allocate 9.000000MB memory on GPU 0, 15.707092GB memory has been allocated and available memory is only 30.437500MB.
Please check whether there is any other process using GPU 0.
If the above ways do not solve the out of memory problem, you can try to use CUDA managed memory. The command is
export FLAGS_use_cuda_managed_memory=false
.(at /paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:87)
. Please check the input dict and checkout PipelineServingLogs/pipeline.log for more details.
For details about error message, see PipelineServingLogs/pipeline.log
../../doc/imgs/00111002.jpg
erro_no:10000, err_msg:[det] failed to predict. (data_id=8 log_id=8) [det|4] Failed to process(batch: [8]): ResourceExhaustedError:
Out of memory error on GPU 0. Cannot allocate 9.000000MB memory on GPU 0, 15.707092GB memory has been allocated and available memory is only 30.437500MB.
Please check whether there is any other process using GPU 0.
If the above ways do not solve the out of memory problem, you can try to use CUDA managed memory. The command is
export FLAGS_use_cuda_managed_memory=false
.(at /paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:87)
. Please check the input dict and checkout PipelineServingLogs/pipeline.log for more details.
For details about error message, see PipelineServingLogs/pipeline.log
../../doc/imgs/00207393.jpg
erro_no:10000, err_msg:[det] failed to predict. (data_id=9 log_id=9) [det|3] Failed to process(batch: [9]): ResourceExhaustedError:
Out of memory error on GPU 0. Cannot allocate 60.750000MB memory on GPU 0, 15.707092GB memory has been allocated and available memory is only 30.437500MB.
Please check whether there is any other process using GPU 0.
If the above ways do not solve the out of memory problem, you can try to use CUDA managed memory. The command is
export FLAGS_use_cuda_managed_memory=false
.(at /paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:87)
. Please check the input dict and checkout PipelineServingLogs/pipeline.log for more details.
For details about error message, see PipelineServingLogs/pipeline.log
../../doc/imgs/1.jpg
erro_no:10000, err_msg:[det] failed to predict. (data_id=10 log_id=10) [det|1] Failed to process(batch: [10]): ResourceExhaustedError:
Out of memory error on GPU 0. Cannot allocate 7.500000MB memory on GPU 0, 15.707092GB memory has been allocated and available memory is only 30.437500MB.
Please check whether there is any other process using GPU 0.
If the above ways do not solve the out of memory problem, you can try to use CUDA managed memory. The command is
export FLAGS_use_cuda_managed_memory=false
.(at /paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:87)
. Please check the input dict and checkout PipelineServingLogs/pipeline.log for more details.
For details about error message, see PipelineServingLogs/pipeline.log