Calibration model no longer allows arbitrary model injection (#1543)
Replies: 5 comments
-
@bruno-f-cruz can you provide an example of the missing metadata that you would like to be able to track, which the Calibration object no longer supports? We can definitely expand the Calibration definition if needed. I thought the current definition was sufficient after looking through the examples I was able to pull from existing records.
-
I think it depends on what you consider "Calibration". The generic way it was before made it possible to use it as both calibration and/or configuration. You can take a look at the classes in the aind-behavior-services calibration modules; a few examples:

**Manipulator**
For the input calibration, we have several settings on top of a simple calibration of steps -> cm. For instance, the simple steps -> cm conversion may be different per axis (…).

**Load cells**
The load cell device can have up to 8 load cells. Each load cell must be calibrated with a baseline + slope. However, it is also necessary to keep track of a resistor value (…).

**Treadmill**
Additional information is necessary to calibrate the treadmill, since it has one portion concerned with the encoder and a second one related to the brake.

**Water valve**
There is a new subclass implemented in v2 that is used to calibrate water valves. We have already been using a stable model for >1 year to keep track of this information, where we additionally record the time between valve openings and the individual measurements in case multiple ones are taken. The model also keeps track of the association between delays and water amounts by simply using a dictionary instead of relying on separate lists.
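To make the water-valve case concrete, here is a minimal sketch of the kind of model described above. The class and field names are illustrative only (they are not the actual aind-behavior-services classes), and stdlib dataclasses stand in for the real pydantic models:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class WaterValveCalibration:
    """Hypothetical shape of the water-valve model described above.

    Names are illustrative, not the actual aind-behavior-services classes.
    """
    inter_open_interval_s: float  # time between consecutive valve openings
    # valve-open duration (s) -> individual water weight measurements (g),
    # a dictionary rather than two parallel lists
    measurements: Dict[float, List[float]] = field(default_factory=dict)

cal = WaterValveCalibration(
    inter_open_interval_s=0.5,
    measurements={0.02: [1.10, 1.05, 1.12], 0.04: [2.20, 2.18]},
)

# Multiple raw measurements per delay are preserved; summaries are derived
mean_002 = sum(cal.measurements[0.02]) / len(cal.measurements[0.02])
```

The point of the dictionary is that each delay keeps its own list of repeated measurements, so nothing is lost when several readings are taken per setting.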
-
I think adding a "settings" dictionary and separating calibrations into their component parts captures most of the issues you've outlined. We could add a MultiCalibration class which is just a dictionary of calibrations to keep it from turning into a mess if we want.
For the water valve, I think you're saying you liked your previous system and want to continue dumping the dictionaries? I don't think we can support this anymore: the goal of standardizing calibrations is so that we can more easily display this information in downstream tools across projects. The upgrader already handles something like six different variations of your calibration objects, and I don't want the downstream tools to have to handle additional future changes. I think we're doing a reasonable job here of capturing the core of what a "calibration" is, and we just need a few more fields to properly represent everything about the calibration process. I'm happy to help write mappers, btw, to go from your internal type to the data schema, but if you plan to change the format in the future it would need to go through an extractor that you maintain into an intermediate format that is static.
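The "settings dictionary plus MultiCalibration" proposal could look something like the sketch below. Neither class exists in aind-data-schema as written; all names and fields are hypothetical, and stdlib dataclasses stand in for the real pydantic models:

```python
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class Calibration:
    """Hypothetical stand-in: input/output plus a free-form settings dict."""
    device_name: str
    input: Dict[str, Any]
    output: Dict[str, Any]
    settings: Dict[str, Any] = field(default_factory=dict)

@dataclass
class MultiCalibration:
    """Hypothetical wrapper: a named collection of component calibrations
    for one composite device (e.g. a treadmill's encoder and brake)."""
    calibrations: Dict[str, Calibration] = field(default_factory=dict)

treadmill = MultiCalibration(calibrations={
    "encoder": Calibration("treadmill", {"counts": [0, 1024]}, {"cm": [0.0, 50.0]}),
    "brake": Calibration("treadmill", {"pwm": [0, 255]}, {"torque_nm": [0.0, 0.8]},
                         settings={"resistor_ohms": 46000.0}),
})
```

Grouping the component calibrations under one key-addressable wrapper would keep multi-part devices from flattening into unrelated top-level entries.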
-
I admit I don’t see how this approach is any better when we keep in mind your point: “I don’t want the downstream tools to have to handle additional future changes.” Choosing an arbitrary model for calibration is just as problematic as (if not worse than) arbitrarily storing calibration values in different places, especially when it breaks device hierarchies (e.g., multiple load cells belonging to the same device). If aind-data-schema does not have enough expressivity to represent how hardware is organized at the rig, I actually think reverting to more generic fields, as in the past, is a valid approach. You can still enforce the presence of certain fields (e.g., input/output) for generic plotting/QC.
Sure, being able to use a simple DTO mapping was a nice and scalable approach. Yet, what I don’t understand is: why adopt this new model for water calibration, which lacks the expressivity we need, instead of modelling it in a way that was informed by our current model? The “requirements” were already there...
It’s not the act of writing mappers that worries me (though I do think that, if you want what you claimed in #1531 (reply in thread), the current approach is hard to scale). What I’m asking for is clarity around the stability of this library. The structure of the models in aind-behavior-services will be familiar to you because they were deliberately designed for DTO mapping—it wasn’t a coincidence. I would have implemented something much simpler, and easier to maintain, otherwise. Now I find myself having to change my code yet again because aind-data-schema was upgraded. Unlike a “standard” library, where users can opt in or out of upgrades (or out of using it altogether), aind-data-schema is currently a mandatory waypoint for data collection at AIND. How can a library be considered “core” if it breaks more often than its dependencies? For example, as we move toward common definitions for instruments, devices, and calibrations across tasks—stored in shared databases—can we trust this library to model them reliably, or should we just avoid doing so for fear that aind-data-schema will break again in the future?
Can you point me toward the example you mentioned? I don’t recall making changes that resulted in six distinct calibration schemas. If you mean that there are multiple devices that your mapper has to "remap" to upgrade, I don't see how you could have done it without losing information. Additionally, if this code already exists, then the mappers for calibrations are done; as I said before, the calibrations were previously done via simple DTO mapping.
-
There's a mix of schema design and actionable issues here, so I'm going to convert this to a discussion. I've opened two issues, #1541 and #1542, where I'm trying to turn this into requirements that we can take action on. I'll respond again to the discussion when I have more time.
-
https://github.com/AllenNeuralDynamics/aind-data-schema/blob/79d38078df40e36a21b7597428511f0804778d6c/src/aind_data_schema/components/measurements.py#L58C1-L59C1
In aind-data-schema 2.0 the model for Calibration no longer allows users to inject arbitrary models. This is unfortunate given the previously flagged use case (#692).
We have now been using a calibration model in our own experiments (https://github.com/AllenNeuralDynamics/Aind.Behavior.Services/blob/main/src/aind_behavior_services/calibration/load_cells.py) that was heavily influenced by the previous version of this library (where a simple DTO mapping is sufficient to map to aind-data-schema). We need to keep track of additional information about the calibration performed that is not trackable in a simple `List[float, str]` signature. I can serialize my model into a string and simply dump it in the list, but that seems like a clunky solution.
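The string-serialization workaround mentioned above would look roughly like this. The `LoadCellChannel` fields are illustrative (not the actual aind-behavior-services model), and stdlib dataclasses plus `json` stand in for pydantic serialization:

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Union

@dataclass
class LoadCellChannel:
    """Hypothetical richer internal model for one load cell channel."""
    channel: int
    baseline: float
    slope: float
    resistor_ohms: float

# Flatten the richer model into a List[Union[float, str]]-shaped slot by
# dumping each record as a JSON string -- the "clunky solution"
channels = [LoadCellChannel(0, 0.012, 1.98, 46000.0)]
flat: List[Union[float, str]] = [json.dumps(asdict(c)) for c in channels]

# The structure survives, but only as an opaque string a consumer must re-parse
round_trip = json.loads(flat[0])
```

The data round-trips, but downstream tools see an opaque string rather than typed, validated fields, which is exactly why it feels clunky.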
Is there an alternative option to capture this use-case in the present library?
Ps: feel free to transfer this to Discussions if you think it is more suitable.