-
Notifications
You must be signed in to change notification settings - Fork 110
Refactoring TiledWriter #1916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactoring TiledWriter #1916
Conversation
|
A new PR due to the branch renaming; moved here from #1911. |
danielballan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The separation of legacy is a huge improvement to the readability of this code.
Just a couple comments.
| data_desc = descriptor["data_keys"][self.data_key] | ||
| self.datum_shape: tuple[int, ...] = tuple(data_desc["shape"]) | ||
| if None in data_desc["shape"]: | ||
| raise NotImplementedError(f"Consolidator for {self.mimetype} does not support variable-sized data") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Support for this might be coming soon. I wondering if it's worth putting this into a separate method (validate_shape) so it is easy to override. Maybe, maybe not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you thinking the validate_shape method should just raise the error (or not, if there are no issue), or should it be more like a setter for the datum_shape attribute? I've only added that exception after I noticed that the EvenModel schema has been updated recently to allow for Nones in the shape, and mypy was complaining.
I personally prefer setting the attributes directly in __init__ instead of in separate methods, so it's more visible what attributes the class has (we can do self.datum_shape = self.validate_shape(...) here of course). I also see that there's quite a bit of other back-compat things going on while we setting this shape attribute (notably checking for the multiplier in StreamResource); ideally I think, it should be done in _RunNormalizer, but it means delaying the emittance of Descriptors even further...
I need to think more about this, but currently I'm thinking about just leaving it as is.
Co-authored-by: Dan Allan <[email protected]>
Co-authored-by: Dan Allan <[email protected]>
Some refactoring changes to TiledWriter.
Description
Split TiledWriter into two callbacks,
_RunNormalizerand_RunWriterwith the former intended to convert any encountered Bluesky documents that follow legacy schema conventions to the most recentEventModelschema._RunWriteris subscribed to the_RunNormalizerin chain and can only accept documents that adhere to the new schemas. Furthermore, any non-stream documents are converted toStreamResourceandStreamDatums._RunNormalizercreates deep copies of the documents to prevent modifying them for other callbacks subscribed to the same RE.Furthermore, a backup capability has been added to TiledWriter: if writing to Tiled fails at any point, the entire BlueskyRun (all previous and all future documents) would be written to a local JSONLines file.
Motivation and Context
The introduced separation of functionality between schema conversion and document writing would help apply any beamline-specific patches and transforms, if needed, before ingesting the documents to Tiled.
How Has This Been Tested?
external_assets_legacy.json, containing a detailed example of legacy external asset documents to validate backward compatibility.test_bad_document_order, to ensureTiledWritercan handle documents emitted in an unexpected order, such asdatumdocuments appearing late.Related: EventModel PR-223