Replies: 1 comment
The "Out of memory" (OOM) error on GPU 0 during training or inference in PaddleOCR is a common issue when the allocated memory exceeds the available capacity of the GPU. Based on your description and the environment setup, there are several approaches to address this problem, along with methods to estimate memory requirements before training.
1. Memory Estimation Before Training
To estimate the memory required for training or inference, follow these steps:
If possible, run with a small batch size (e.g., 1) to measure the memory needed, then extrapolate for larger batches.
2. Solutions to Prevent OOM Errors
(A) Reduce Batch Size
The easiest and most effective way to manage GPU memory is to decrease the batch size:
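As a rough illustration of the "measure at a small batch size, then extrapolate" idea, here is a minimal sketch that fits a linear model, mem(b) = fixed + per_sample * b, from two peak-memory readings and then picks the largest batch size that stays under a budget. The linear model is an assumption (allocator caching, variable image sizes, and workspace buffers can all deviate from it), the function names are hypothetical, and the numbers are made up for illustration, not taken from this thread. The readings themselves could come from something like `nvidia-smi` while a short run is in progress.

```python
def fit_linear_memory_model(b1, mem1_mib, b2, mem2_mib):
    """Fit mem(b) = fixed + per_sample * b from two (batch size, peak MiB) readings."""
    if b1 == b2:
        raise ValueError("need readings at two different batch sizes")
    per_sample = (mem2_mib - mem1_mib) / (b2 - b1)
    fixed = mem1_mib - per_sample * b1
    return fixed, per_sample


def estimate_memory_mib(fixed, per_sample, batch_size):
    """Extrapolate peak memory (MiB) for a given batch size under the linear model."""
    return fixed + per_sample * batch_size


def largest_batch_that_fits(fixed, per_sample, capacity_mib, headroom=0.9):
    """Largest batch size whose estimate stays within headroom * capacity (0 if none)."""
    budget = capacity_mib * headroom - fixed
    if per_sample <= 0 or budget < per_sample:
        return 0
    return int(budget // per_sample)


# Hypothetical readings: batch size 1 peaked at 2100 MiB, batch size 2 at 2700 MiB.
fixed, per_sample = fit_linear_memory_model(1, 2100.0, 2, 2700.0)
print("estimated MiB at batch size 8:", estimate_memory_mib(fixed, per_sample, 8))
print("largest batch on an 8192 MiB card:", largest_batch_that_fits(fixed, per_sample, 8192))
```

The resulting batch size would then go into the training config (in PaddleOCR configs this is, as far as I know, the `batch_size_per_card` entry under `Train.loader`). Treat the estimate as a starting point and keep some headroom, since the real peak can be higher than a two-point fit suggests.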
(B) Reduce Image Size
(C) Enable Mixed Precision Training
Mixed precision uses
(D) Use
🐛 Bug (Problem description)
Hi.
I am getting an out-of-memory error while training PP-OCRv3_det.
#14304 (comment)
I also checked the above issue.
After applying the advised fix, the error happens less often than before, but the problem still occurs.
Also, training became very slow because of use_shared_memory=False...
I know that I can reduce the batch size or the image size to solve this problem.
But the problem is that all the programs intermittently terminate due to the out-of-memory error.
When that happens, VS Code closes without giving me a chance to check the error.
But I am sure that the cause is a lack of memory.
Is there a way to estimate the memory size required for training before starting training?
🏃♂️ Environment (Runtime environment)
Windows
Package Version
albucore 0.0.19
albumentations 1.4.20
annotated-types 0.7.0
anyio 4.6.2.post1
astor 0.8.1
attrdict 2.0.1
babel 2.16.0
bce-python-sdk 0.9.23
blinker 1.8.2
cachetools 5.5.0
certifi 2024.8.30
charset-normalizer 3.4.0
click 8.1.7
colorama 0.4.6
contourpy 1.3.0
cssselect 1.2.0
cssutils 2.11.1
cycler 0.12.1
Cython 3.0.11
decorator 5.1.1
et_xmlfile 2.0.0
eval_type_backport 0.2.0
fastapi 0.115.4
filelock 3.16.1
Flask 3.0.3
flask-babel 4.0.0
fonttools 4.54.1
fsspec 2024.10.0
future 1.0.0
h11 0.14.0
httpcore 1.0.6
httpx 0.27.2
idna 3.10
imageio 2.36.0
imgaug 0.4.0
itsdangerous 2.2.0
Jinja2 3.1.4
joblib 1.4.2
kiwisolver 1.4.7
lazy_loader 0.4
lmdb 1.5.1
lxml 5.3.0
MarkupSafe 3.0.2
matplotlib 3.9.2
more-itertools 10.5.0
mpmath 1.3.0
networkx 3.4.2
numpy 2.1.2
nvidia-ml-py 12.560.30
opencv-python 4.10.0.84
opencv-python-headless 4.10.0.84
openpyxl 3.1.5
opt-einsum 3.3.0
packaging 24.1
paddlepaddle-gpu 2.6.2
pandas 2.2.3
pillow 11.0.0
pip 24.2
premailer 3.10.0
protobuf 3.20.2
psutil 6.1.0
pyclipper 1.3.0.post6
pycryptodome 3.21.0
pydantic 2.9.2
pydantic_core 2.23.4
PyMuPDF 1.21.1
pynvml 12.0.0
pyparsing 3.2.0
python-dateutil 2.9.0.post0
python-multipart 0.0.16
pytz 2024.2
PyYAML 6.0.2
RapidFuzz 3.10.1
rarfile 4.2
requests 2.32.3
scikit-image 0.24.0
scikit-learn 1.5.2
scipy 1.14.1
setuptools 75.1.0
shapely 2.0.6
six 1.16.0
sniffio 1.3.1
starlette 0.41.2
stringzilla 3.10.6
sympy 1.13.1
threadpoolctl 3.5.0
tifffile 2024.9.20
torch 2.5.1
tqdm 4.66.6
typing_extensions 4.12.2
tzdata 2024.2
urllib3 2.2.3
uvicorn 0.32.0
visualdl 2.5.3
Werkzeug 3.0.6
wheel 0.44.0
🌰 Minimal Reproducible Example (Demo that minimally reproduces the problem)
.