Replies: 12 comments
-
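This depends on how well the model was trained. A model can indeed assign very high confidence to images it predicts incorrectly; it may have been skewed on such images during training. Improving the model's evaluation accuracy reduces how often this happens, so you can try retraining with adjusted training parameters to get a higher-accuracy model, and add more images of this kind to the dataset.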
-
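Skewed during training? Is there any way to filter or detect such cases and avoid this? Also, are there V4 models for Japanese, Korean, and Traditional Chinese?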
-
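We plan to support a badcase analysis feature later. Adjusting the training set proportions based on that analysis can reduce how often this happens.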
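Until such a feature exists, a rough manual version of badcase analysis can be sketched as follows. This is only an illustration, not PaddleOCR's implementation: the record layout (`pred`, `label`, `score`, `source`) and both helper functions are hypothetical, and the sample records are made up for the demo. It lists high-confidence errors and reports accuracy per confidence bin, which hints at which kinds of images may need a larger share of the training set.

```python
from collections import defaultdict


def find_badcases(records, min_conf=0.9):
    """Return wrong predictions whose confidence is at least min_conf."""
    return [
        r for r in records
        if r["pred"] != r["label"] and r["score"] >= min_conf
    ]


def accuracy_by_confidence(records, n_bins=5):
    """Group records into equal-width confidence bins; report accuracy per bin."""
    bins = defaultdict(lambda: [0, 0])  # bin index -> [correct, total]
    for r in records:
        # A score of exactly 1.0 is placed in the last bin.
        idx = min(int(r["score"] * n_bins), n_bins - 1)
        bins[idx][1] += 1
        if r["pred"] == r["label"]:
            bins[idx][0] += 1
    return {
        (i / n_bins, (i + 1) / n_bins): correct / total
        for i, (correct, total) in sorted(bins.items())
    }


# Made-up evaluation records; "source" is a hypothetical tag for the image type.
records = [
    {"pred": "東京", "label": "東京", "score": 0.62, "source": "subtitle"},
    {"pred": "大阪", "label": "大坂", "score": 0.97, "source": "synthetic"},
    {"pred": "京都", "label": "京都", "score": 0.99, "source": "print"},
]

badcases = find_badcases(records)
per_bin = accuracy_by_confidence(records)

for case in badcases:
    print(case["source"], case["pred"], case["label"], case["score"])
for (low, high), acc in per_bin.items():
    print(f"confidence {low:.1f}-{high:.1f}: accuracy {acc:.2f}")
```

If most high-confidence errors share one `source`, that image type is a natural candidate for more training samples.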
-
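For example, for evaluation, are there any requirements on the proportions of printed documents, subtitle images, and randomly generated images with heavy noise?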
-
Can this badcase analysis feature run without occupying GPU memory, analyzing each image before it is loaded?
-
There is no specific rule of thumb for this; it needs to be adjusted case by case.
-
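Could you post the CUDA error? A GPU out-of-memory failure is usually reported as a C++ error, and the log will explicitly show that the memory allocation failed.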
-
File "C:\F\pycharm2020.2\PaddleOCR-2.7.5\paddleocr.py", line 712, in ocr
rec_res, elapse = self.text_recognizer(img)
File "C:\F\pycharm2020.2\PaddleOCR-2.7.5\tools\infer\predict_rec.py", line 669, in __call__
self.input_tensor.copy_from_cpu(norm_img_batch)
File "C:\Program Files\Python38\lib\site-packages\paddle\fluid\inference\wrapper.py", line 36, in tensor_copy_from_cpu
self.copy_from_cpu_bind(data)
OSError: (External) CUDA error(719), unspecified launch failure.
[Hint: 'cudaErrorLaunchFailure'. An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointerand accessing out of bounds shared memory. Less common cases can be system specific - more information about these cases canbe found in the system specific user guide. This leaves the process in an inconsistent state and any further CUDA work willreturn the same error. To continue using CUDA, the process must be terminated and relaunched.] (at ..\paddle\phi\backends\gpu\cuda\cuda_info.cc:251)
…--------------------------------------------------------------------------------
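------------------ Original message ------------------
From: changdazhou ***@***.***>
Sent: 2024-04-12 19:57:46
To: PaddlePaddle/PaddleOCR ***@***.***>
Cc: nissanjp ***@***.***>, Author ***@***.***>
Subject: Re: [PaddlePaddle/PaddleOCR] The confidence returned by rec inference is sometimes not very accurate: 0.6 may be correct, while 0.97 is not necessarily correct. Is there any way to make the returned confidence more accurate? (Issue #11921)
Can this badcase analysis feature run without occupying GPU memory, analyzing each image before it is loaded? During training I tried running recognition once with paddleocr.py first, and it raised a CUDA error, possibly because GPU memory was not released. I don't know how to fix it.
Could you post the CUDA error? A GPU out-of-memory failure is usually reported as a C++ error, and the log will explicitly show that the memory allocation failed.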
-
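For rec model training on single-line text, is the following design OK, or are there problems with it? There are 40,000 characters in total. Each line is made of 5 randomly drawn characters plus one short phrase, so 8,000 lines (40,000 / 5) cover every character I want to train, and 8,000 * 500 = 4 million lines would balance the character distribution. For evaluation, how many lines are appropriate? Are 8,000 lines enough?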
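The line layout described above can be sketched like this. It is a minimal illustration under assumptions, not a recommended recipe: `build_lines` is a hypothetical helper, the character set is shrunk to 40 stand-in characters so the demo runs instantly, and the phrases are placeholders. Each pass shuffles the character set and cuts it into groups of 5, so every character appears exactly once per pass among the random characters. With 40,000 characters and 500 passes this yields the 8,000 * 500 = 4 million lines from the question.

```python
import random
from collections import Counter


def build_lines(charset, phrases, chars_per_line=5, passes=1, seed=0):
    """Build text lines: chars_per_line random characters plus one phrase.

    Each pass shuffles the whole character set and slices it into groups,
    so every character is used exactly once per pass.
    """
    rng = random.Random(seed)
    chars = list(charset)
    lines = []
    for _ in range(passes):
        rng.shuffle(chars)
        for i in range(0, len(chars), chars_per_line):
            group = "".join(chars[i:i + chars_per_line])
            lines.append(group + rng.choice(phrases))
    return lines


# Stand-in data: 40 characters instead of 40,000, 3 passes instead of 500.
charset = [chr(code) for code in range(0x4E00, 0x4E00 + 40)]
charset_set = set(charset)
phrases = ["abc", "xyz"]  # placeholder phrases that do not overlap the charset

lines = build_lines(charset, phrases, passes=3)

# Count only the random characters, ignoring the appended phrases.
counts = Counter(ch for line in lines for ch in line if ch in charset_set)
print(len(lines), min(counts.values()), max(counts.values()))
```

Note that the appended phrases add their own characters on top of this, so the overall distribution is only balanced if the phrase characters are counted as well.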
-
This will have to wait until next Monday, when the colleague responsible for this area can confirm.
-
With the v4 Chinese model, some results also show very low confidence even though the characters are correct.
-
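What the model predicts is the probability that the output is this character, so a correct prediction with low confidence can occur. This usually means the character is hard to recognize: other characters take up part of the probability, so the model is less certain about this character, which shows up as a lower confidence.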
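A small numeric illustration of this point, assuming the per-character confidence is the largest softmax probability over the candidate characters. The logits and the three look-alike candidates below are invented for the demo and do not come from any real model. In both cases the top candidate is the same, but in the second case the competing characters take a large share of the probability, so the confidence is much lower.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_candidate(candidates, probs):
    """Return the most probable candidate and its probability."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return candidates[best], probs[best]


candidates = ["未", "末", "木"]  # visually similar characters (illustrative)

# Clear image: one candidate dominates.
easy_char, easy_conf = top_candidate(candidates, softmax([8.0, 1.0, 0.5]))

# Hard image: similar characters compete, yet the top candidate is unchanged.
hard_char, hard_conf = top_candidate(candidates, softmax([2.0, 1.7, 1.2]))

print(easy_char, round(easy_conf, 3))
print(hard_char, round(hard_conf, 3))
```

The same mechanism works in the other direction: a model can put nearly all of its probability on a wrong character, which is why a high score alone does not guarantee a correct result.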
-
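Please provide the following information to quickly locate the problem
The confidence returned by rec inference is sometimes not very accurate: a result with 0.6 may be correct, while one with 0.97 is not necessarily correct. Is there any way to make the returned confidence more accurate?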