Open
Labels: enhancement (New feature or request)
Description
While running some benchmarks, I noticed that the frame I/O is done inside the main loop, which increases the total runtime.
As an example, we have:
time python -m data_morph --seed 42 --start-shape panda --target-shape star --iterations 10000
13.81s user 0.79s system 109% cpu 13.310 total
but if we remove this block (which is in the loop):
data-morph/src/data_morph/morpher.py, lines 486 to 490 in c16e3dc:

    frame_number = record_frames(
        data=morphed_data,
        count=frame_numbers.count(i),
        frame_number=frame_number,
    )
we achieve a major speed-up, from about 13.3 s to 4.1 s total (roughly 3.2x faster):
time python -m data_morph --seed 42 --start-shape panda --target-shape star --iterations 10000
5.41s user 0.28s system 137% cpu 4.149 total
I'd propose that instead of doing the I/O in the main loop, the frames that would be written to disk are simply stored in some internal list, and then I/O is done after all the computations. This has several benefits:
- if `keep_frames=False`, we don't output anything other than the final GIF, so we spare the disk from unnecessary writes
- if `keep_frames=True`, since the task is probably I/O bound, we can take advantage of the `concurrent.futures` module to do the writes concurrently, so the speed-up is probably still significant
- less error-prone, since we don't need to find the files afterwards. There's also no guarantee that the files won't change on disk while the sim is running
- the most obvious one, speed! If using `keep_frames=False`, which is the default, we just output one file instead of potentially hundreds.
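
To make the proposal concrete, here is a minimal sketch of the "buffer first, write later" idea. It is not the data_morph implementation: `run_simulation`, `write_frame`, and `flush_frames` are hypothetical names, the frame payloads are placeholder bytes rather than rendered plots, and the output goes to a temporary directory. Only the standard library is used.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import tempfile


def write_frame(path: Path, payload: bytes) -> Path:
    """Hypothetical stand-in for saving one frame to disk."""
    path.write_bytes(payload)
    return path


def run_simulation(iterations: int, frame_every: int) -> list:
    """Toy main loop: keep frame payloads in a list instead of writing them."""
    frames = []
    for i in range(iterations):
        # ... the morphing computation would happen here ...
        if i % frame_every == 0:
            # Assumption: a frame can be captured as an in-memory payload.
            frames.append(f"frame data for iteration {i}".encode())
    return frames


def flush_frames(frames: list, output_dir: Path, max_workers: int = 4) -> list:
    """Write all buffered frames after the loop, using a thread pool."""
    paths = [output_dir / f"frame-{n:04d}.bin" for n in range(len(frames))]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so frame order is preserved.
        return list(pool.map(write_frame, paths, frames))


with tempfile.TemporaryDirectory() as tmp:
    frames = run_simulation(iterations=1000, frame_every=100)
    written = flush_frames(frames, Path(tmp))
    written_names = [p.name for p in written]
    print(f"wrote {len(written)} frames")
```

With `keep_frames=False`, the `flush_frames` step could be skipped entirely and the GIF built from the in-memory list. Whether the thread pool helps in practice depends on how much of the per-frame cost is the disk write rather than the rendering, which would need to be measured.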