Add (optional at compile-time) G3Frame JSON output#69
Draft
I envision it being useful to access data in .g3 format from web applications for monitoring purposes. For that reason, it would be useful to have a way to convert any data to JSON (one possible alternative would be a .g3 binary file reader in JavaScript, but that is much more work).

Fortunately, cereal supports JSON as an archive format, so adding JSON output is nearly trivial. We mostly just need to make sure the JSONOutputArchive version of each serialization is compiled, and write a saveJSON() method for G3Frame. An asJSON() (as_json() in Python) method is also provided that returns a string.

There are a few small differences from binary output:

- Don't bother emitting CRC sums
- Don't FLAC encode, ever
- Output the character instead of the number for the frame type

Currently, this is only enabled by a new compile-time option (the cmake variable ENABLE_JSON_OUTPUT), though in the future it will probably be enabled by default if it doesn't break anything. It does add moderately to binary size and compile time, but hopefully that's not a huge deal. The asJSON/as_json methods still exist without JSON support, but return an error in JSON format.

Also included is a new script, spt3g-jsonify, that will read in a .g3[.gz] file and output a JSON stream as a proof of concept.

There are a few places where some other code had to be modified, because cereal's text and binary archive formats expose different APIs for binary data. The affected code will never actually be run, but it gets generated and must compile.
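To make the listed differences concrete, here is a minimal pure-Python mock of the behaviour described above. It is only a sketch and not the spt3g implementation: `frame_as_json`, the `json_enabled` flag (standing in for a build with or without ENABLE_JSON_OUTPUT), and the `"type"`, `"items"`, and `"error"` keys are all hypothetical names chosen for illustration, and it assumes the numeric frame type is a character code.

```python
import json


def frame_as_json(frame_type_code, items, json_enabled=True):
    """Hypothetical mock of a frame-to-JSON conversion (not the spt3g API)."""
    if not json_enabled:
        # Without JSON support the method still exists, but the string it
        # returns is a JSON-formatted error. The key name is an assumption.
        return json.dumps({"error": "JSON output was not enabled at compile time"})
    doc = {
        # Emit the frame type as a character rather than its number
        # (assumes the number is a character code).
        "type": chr(frame_type_code),
        # Items are written as plain values: no CRC sums, no FLAC encoding.
        "items": items,
    }
    return json.dumps(doc, sort_keys=True)


if __name__ == "__main__":
    print(frame_as_json(ord("S"), {"Timestreams": [1.0, 2.5, 3.0]}))
    print(frame_as_json(ord("S"), {}, json_enabled=False))
```

Either way the caller gets back a string that parses as JSON, so a web client would only need to check for the error key rather than handle a missing method.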
arahlin
requested changes
Dec 18, 2021
Using the PUBLIC target affects both the compilation of core and anything compiled against core.
By the way, I cherry-picked your cmake action fix onto master. Thanks for fixing that!
I think it probably makes some sense to move the python GIL / threading context machinery to a separate PR, since it's used in a few different places (G3PipelineInfo, G3Reader, G3Writer, G3EventBuilder...) and is not specific to this particular feature.
This PR creates a new class that simplifies initialization of python threads, as well as acquiring / releasing the Python global interpreter lock (GIL) in various contexts. Use cases include:

1. Ensuring that Py_Initialize() is properly called at the beginning of a program that is expected to interact with the python interpreter, and also that Py_Finalize() is called when the program is finished.
2. Ensuring that the current thread state is saved and the GIL released as necessary, e.g. for IO operations, and then the thread state is restored on completion.
3. Ensuring that the GIL is acquired for one-off interaction with the python interpreter, and released when complete.

A G3PythonContext object is used throughout the library code for cases 2 and 3. If the python interpreter has not been initialized (i.e. the compiled program is expected to be purely in C++), then these context objects are essentially no-ops. If the python interpreter is initialized (e.g. inside a python program or command-line interface), then these context objects will handle the GIL appropriately. See the examples/cppexample.cxx C++ program for a simple implementation of the above behavior.

This PR also adds logic throughout the G3PipelineInfo and G3ModuleConfig class definitions to enable them to serialize appropriately in a pure-C++ program.
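The scoped behaviour in cases 2 and 3 can be illustrated with a small Python analogy. This is a sketch only, not the G3PythonContext implementation: `PythonContextSketch`, its arguments, and the use of a plain `threading.Lock` as a stand-in for the GIL are all assumptions made for illustration.

```python
import threading


class PythonContextSketch:
    """Hypothetical scoped guard; a threading.Lock stands in for the GIL."""

    def __init__(self, lock, initialized, hold_gil):
        self.lock = lock                # stand-in for the GIL
        self.initialized = initialized  # stand-in for "interpreter is initialized"
        self.hold_gil = hold_gil        # True: case 3, False: case 2

    def __enter__(self):
        if not self.initialized:
            return self                 # pure-C++ style program: do nothing
        if self.hold_gil:
            self.lock.acquire()         # case 3: take the lock for a one-off call
        else:
            self.lock.release()         # case 2: give up the lock, e.g. around IO
        return self

    def __exit__(self, exc_type, exc, tb):
        if self.initialized:
            if self.hold_gil:
                self.lock.release()     # case 3: hand the lock back
            else:
                self.lock.acquire()     # case 2: restore the held state
        return False                    # never swallow exceptions


if __name__ == "__main__":
    gil = threading.Lock()
    with PythonContextSketch(gil, initialized=True, hold_gil=True):
        print("held inside case 3 scope:", gil.locked())
    print("held after case 3 scope:", gil.locked())
```

The point of the pattern is that the caller never checks whether the interpreter is running; the guard decides on entry and undoes exactly what it did on exit.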
These are python objects, and if we allow them to be deleted otherwise, bad things happen. This fixes at least most of the concurrency problems I have with reading files that have G3PipelineInfo in them?
Use a G3MapFrameObject storage structure for the module arguments, rather than a map of python objects. Since the serialization process requires a call to repr() for non-G3FrameObjects anyway, do this step in the python shim that creates the config in the first place. Also ensure that simple scalar values are serialized as frame objects. Adds a new ``spt3g.core.to_g3frameobject`` function for converting python objects to G3FrameObjects.
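As a rough illustration of the conversion step described in this commit, the sketch below wraps simple scalars in typed stand-ins and falls back to repr() for everything else. It does not use or reproduce ``spt3g.core.to_g3frameobject``: `FrameObjectStandIn`, `to_frame_object_sketch`, and `make_config_args` are hypothetical names, and the real function's type mapping may differ.

```python
from dataclasses import dataclass


@dataclass
class FrameObjectStandIn:
    """Hypothetical stand-in for a serializable frame object."""
    kind: str
    value: object


def to_frame_object_sketch(value):
    """Wrap a Python value so that it can be stored in a frame-object map."""
    if isinstance(value, FrameObjectStandIn):
        return value  # already a frame object: store as-is
    # bool must be checked before int, since bool is a subclass of int.
    if isinstance(value, bool):
        return FrameObjectStandIn("bool", value)
    if isinstance(value, int):
        return FrameObjectStandIn("int", value)
    if isinstance(value, float):
        return FrameObjectStandIn("double", value)
    if isinstance(value, str):
        return FrameObjectStandIn("string", value)
    # Anything else is stored by its repr(), done up front in the shim.
    return FrameObjectStandIn("repr", repr(value))


def make_config_args(**kwargs):
    """Build the module-argument map at configuration time."""
    return {name: to_frame_object_sketch(v) for name, v in kwargs.items()}


if __name__ == "__main__":
    for name, obj in make_config_args(n=3, flag=True, scale=1.5, fn=len).items():
        print(name, obj)
```

Doing the conversion when the config is created means the stored map never holds raw python objects, which is what lets it serialize without calling back into the interpreter.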
Avoid specialization warning with G3Frame class, close the loop.
Still TODO: