Description
The extrapolator classes in AFL.double_agent.Extrapolator squash the multi-class probabilities down to a single value per grid point:
AFL-agent/AFL/double_agent/Extrapolator.py
Line 363 in 61765b0
y_prob.sum(axis=-1), dims=self.grid_dim
This behavior is limiting when we want to access the full multi-class probabilities for downstream computations or visualization. To return these probabilities in an xr.DataArray, we need to introduce dummy dimensions for the multiple classes. For example:
self.output[self._prefix_output("y_prob")] = xr.DataArray(
    probabilities.detach().numpy(),
    dims=(self.grid_dim, self._prefix_output("n_classes")),
)
However, this approach breaks when the number of classes changes between iterations: once the labeler discovers a new class, the size of the class dimension no longer matches what is already stored in the dataset. Because the PipelineOp runs the following after .calculate,
AFL-agent/AFL/double_agent/Pipeline.py
Line 408 in 61765b0
dataset1 = op.add_to_dataset(dataset1, copy_dataset=False)
it throws an error:
ValueError: cannot reindex or align along dimension 'phase_n_classes' because of conflicting dimension sizes:
This happens because, upon discovering a new phase, the dimension phase_n_classes gains an extra entry, while the existing dataset still has fewer entries.
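The failure can be reproduced outside of AFL with plain xarray. The sketch below is only an illustration: the variable names `phase_y_prob_old` and `phase_y_prob_new`, the dimension name `grid`, and the array sizes are made up for the example and are not taken from the AFL code.

```python
import numpy as np
import xarray as xr

# Dataset as it might look after an earlier iteration:
# 5 grid points, 2 known classes (hypothetical sizes and names).
dataset = xr.Dataset()
dataset["phase_y_prob_old"] = xr.DataArray(
    np.full((5, 2), 0.5), dims=("grid", "phase_n_classes")
)

# Next iteration: the labeler has found a third class, so the
# class dimension now has length 3 instead of 2.
new_probs = xr.DataArray(
    np.full((5, 3), 1.0 / 3.0), dims=("grid", "phase_n_classes")
)

error_message = None
try:
    # Adding the variable forces xarray to align 'phase_n_classes',
    # which has no coordinate labels and two different sizes.
    dataset["phase_y_prob_new"] = new_probs
except ValueError as err:
    error_message = str(err)

print(error_message)
```

Because the class dimension carries no coordinate labels, xarray has no way to reconcile the two sizes and refuses the assignment.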
This seems like a general issue that hasn’t shown up in earlier AFL pipelines because they use clustering as the labeler, where the number of phases is indirectly fixed in advance. Supporting the more general case, where phases can be discovered dynamically, would make the framework more robust. Pipelines that know all possible phases at the start would naturally fit into this more general solution.
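One possible direction, offered here only as a sketch and not as the intended fix, is to pad the class axis to a fixed upper bound so that the dimension size never changes between iterations. The helper `pad_class_axis` and the `max_classes` parameter are hypothetical and do not exist in AFL; the dimension and variable names are likewise illustrative.

```python
import numpy as np
import xarray as xr


def pad_class_axis(probs, max_classes, fill_value=np.nan):
    """Pad an (n_points, n_classes) array to (n_points, max_classes).

    Hypothetical helper: columns for classes that have not been
    discovered yet are filled with `fill_value`.
    """
    probs = np.asarray(probs, dtype=float)
    n_points, n_classes = probs.shape
    if n_classes > max_classes:
        raise ValueError(
            f"got {n_classes} classes, but max_classes is {max_classes}"
        )
    padded = np.full((n_points, max_classes), fill_value, dtype=float)
    padded[:, :n_classes] = probs
    return padded


dims = ("grid", "phase_n_classes")
dataset = xr.Dataset()

# Iteration 1: two classes known, padded to four columns.
two_class = np.full((5, 2), 0.5)
dataset["y_prob_iter1"] = xr.DataArray(pad_class_axis(two_class, 4), dims=dims)

# Iteration 2: a third class was discovered; the padded shape is
# unchanged, so adding it to the same dataset does not conflict.
three_class = np.full((5, 3), 1.0 / 3.0)
dataset["y_prob_iter2"] = xr.DataArray(pad_class_axis(three_class, 4), dims=dims)
```

The trade-off is that an upper bound must be chosen up front, and downstream code has to ignore the NaN columns (for example with `np.nansum`). A pipeline that knows all possible phases at the start would simply set the bound to that number.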