optimise training.iob_utils.offsets_to_biluo_tags method #11441
Replies: 3 comments
Converting this to a discussion, as I don't think this is a concrete issue and a discussion might be a better format.
The problem is with the large loop in line 130:
I would recommend converting your data once, storing it in a An example of a conversion script for NER data is here:
A PR that improves the speed would always be welcome!
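As a starting point for that kind of change, here is a minimal sketch of an offsets-to-BILUO conversion that builds its lookup tables once and then does a constant-time lookup per entity. It is plain Python and is not spaCy's implementation: the function name `biluo_from_offsets` and the `(start_char, end_char)` input format are assumptions made for illustration, and overlapping entities are not handled.

```python
from typing import List, Sequence, Tuple


def biluo_from_offsets(
    token_spans: Sequence[Tuple[int, int]],
    entities: Sequence[Tuple[int, int, str]],
) -> List[str]:
    """Hypothetical helper: tag tokens with BILUO labels.

    token_spans: (start_char, end_char) for each token, in document order.
    entities: (start_char, end_char, label), assumed not to overlap.
    Tokens touched by an entity that does not align with token
    boundaries are tagged "-".
    """
    # Build both lookups once, so each aligned entity needs two dict hits.
    start_to_token = {start: i for i, (start, _) in enumerate(token_spans)}
    end_to_token = {end: i for i, (_, end) in enumerate(token_spans)}

    tags = ["O"] * len(token_spans)
    for start_char, end_char, label in entities:
        first = start_to_token.get(start_char)
        last = end_to_token.get(end_char)
        if first is None or last is None or first > last:
            # Misaligned entity: fall back to a linear scan over the tokens
            # and mark every token that overlaps the character range.
            for i, (tok_start, tok_end) in enumerate(token_spans):
                if tok_start < end_char and tok_end > start_char:
                    tags[i] = "-"
            continue
        if first == last:
            tags[first] = f"U-{label}"
        else:
            tags[first] = f"B-{label}"
            for i in range(first + 1, last):
                tags[i] = f"I-{label}"
            tags[last] = f"L-{label}"
    return tags


# Token offsets for "New York City is big"
spans = [(0, 3), (4, 8), (9, 13), (14, 16), (17, 20)]
print(biluo_from_offsets(spans, [(0, 13, "GPE")]))
```

Only the misaligned case scans the whole token list; aligned entities cost two dictionary lookups plus the length of the entity itself.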
How to reproduce the behaviour
Run spacy.training.Example.from_dict.
The method offsets_to_biluo_tags in training.iob_utils is not optimised. It takes a significant share of the time spent in Example.from_dict: about 24 s cumulative over 11,180 calls in the profile below.
Here is a profile snapshot:
   ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
    14601   24.677    0.002   90.797    0.006  {built-in method from_dict}
112839144   19.502    0.000   33.350    0.000  doc.pyx:484(__iter__)
        2   15.860    7.930   15.860    7.930  {built-in method _pickle.load}
112702047   13.855    0.000   13.855    0.000  token.pxd:21(__cinit__)
    15276    9.896    0.001   22.981    0.002  doc.pyx:1327(to_dict)
 18106173    7.784    0.000   11.133    0.000  vocab.pyx:142(get)
        1    7.507    7.507   41.008   41.008  {built-in method _pickle.dump}
   537703    7.312    0.000    7.312    0.000  decoder.py:343(raw_decode)
    11180    6.963    0.001   24.228    0.002  iob_utils.py:63(offsets_to_biluo_tags)
    31407    6.523    0.000   15.889    0.001  doc.pyx:180(__init__)
        1    5.164    5.164  181.599  181.599  Ned_process_deep.py:1(<module>)
  7968792    4.563    0.000    7.044    0.000  doc.pyx:694(genexpr)
    20025    4.461    0.000    7.149    0.000  __init__.py:100(get_all)
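For anyone who wants to collect a comparable table on their own data, a report in this format can be produced with the standard library's cProfile and pstats modules. The sketch below is only an illustration: `profile_call` and `build_squares` are hypothetical names, and the profiled function stands in for whatever code calls Example.from_dict.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Hypothetical helper: run func under cProfile.

    Returns the function's result and a text report of the `top`
    entries sorted by total time spent inside each function.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("tottime").print_stats(top)
    return result, stream.getvalue()


def build_squares(n):
    # Stand-in workload; replace with the conversion code being measured.
    return [i * i for i in range(n)]


result, report = profile_call(build_squares, 1000)
print(report)
```

Sorting by "tottime" matches the ordering of the snapshot above; sorting by "cumtime" instead makes it easier to see which callers are responsible for the time.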
Your Environment