This repository was archived by the owner on May 27, 2024. It is now read-only.
feat: add multiprocess event serialization #82
Open
rkokkelk wants to merge 3 commits into certat:master
When an `Event` is initialized, it automatically loads the harmonization config file. As a result, the entire harmonization config file is loaded and parsed for every event uploaded, which dramatically increases the time needed to parse a single event. This PR ensures that the harmonization file is loaded only once and then reused for generating all `Event` objects. Fixes: certat#79
Ensure that the harmonization config file is loaded directly from the file, instead of by each `Event` object.
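The load-once idea described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the `Event` class is a hypothetical stand-in, `load_harmonization` and `build_events` are names invented for the example, and the harmonization file is assumed to be JSON.

```python
import json
from functools import lru_cache


@lru_cache(maxsize=None)
def load_harmonization(path):
    """Read and parse the harmonization file once per path.

    Later calls with the same path return the cached, already parsed object.
    """
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


class Event:
    """Hypothetical stand-in for the real Event class (assumption)."""

    def __init__(self, data, harmonization):
        # The parsed config is passed in, so nothing is read from disk here.
        self.harmonization = harmonization
        self.data = dict(data)


def build_events(records, path):
    """Create one Event per record while parsing the config only once."""
    harmonization = load_harmonization(path)
    return [Event(record, harmonization) for record in records]
```

All events built this way share one parsed config object, so the cost of reading and parsing the file no longer grows with the number of events.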
Events were previously serialized serially. However, by using Python's concurrent features, event serialization can be spread over multiple cores, greatly decreasing overall execution time. The actual sending of the serialized events is done without parallelization due to threading lock issues in the Python Redis implementation. Relies on PR: certat#80
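A rough sketch of this split, parallel serialization followed by sequential sending, is shown below. It is an illustration under assumptions rather than the PR's actual code: `serialize_event`, `serialize_all`, and `send_all` are hypothetical names, events are assumed to be plain dicts serialized to JSON, and `send` stands for whatever callable pushes one payload to Redis.

```python
import json
from concurrent.futures import ProcessPoolExecutor


def serialize_event(event):
    """Hypothetical CPU-bound serialization step (assumption).

    Defined at module level so that worker processes can import it.
    """
    return json.dumps(event, sort_keys=True)


def serialize_all(events, executor_cls=ProcessPoolExecutor,
                  max_workers=None, chunksize=256):
    """Serialize events in parallel; results keep the input order."""
    with executor_cls(max_workers=max_workers) as pool:
        # chunksize batches events per task to reduce inter-process overhead.
        return list(pool.map(serialize_event, events, chunksize=chunksize))


def send_all(serialized, send):
    """Send the payloads one at a time from the calling process."""
    for payload in serialized:
        send(payload)
```

When run as a script, the call to `serialize_all` should sit under an `if __name__ == "__main__":` guard, since process pools may re-import the main module on some platforms. The `executor_cls` parameter is a design choice of this sketch: it lets a caller swap in `concurrent.futures.ThreadPoolExecutor` where worker processes are not available.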
Contributor
Author
This worked in development. However, when deployed in a production environment (RH7, Apache, WSGI), the processes created by concurrent.futures were killed, so something in the WSGI configuration was probably not set up correctly for this kind of implementation. Further testing should be done on other platforms to determine whether the issue originates from the code base or from the deployment configuration. If it depends on the deployment, additional documentation should be written.
Events were previously serialized serially. However, by using Python's
concurrent features, it is possible to execute event serialization over
multiple cores, greatly decreasing overall execution time.
The actual sending of the serialized event is done without
parallelization due to threading lock issues in the Python Redis
implementation.
Relies on PR: #80
This implementation reduced the running time for parsing 20k records from 120s to 20s on a quad-core laptop.