
Commit cccec70

refactor: allow passing strings to default_grouping, also build docs (#1648)
* refactor: allow un-tupled string to make life easier
* docs: update
* fix: actually allow str |
* tests: fix tests, and properly handle input data that looks like new data
* docs: improve how allow_tag_failures is explained
* chore: lint
* fix: overenthusiastic defensiveness
1 parent d59d48f commit cccec70


9 files changed, +127 -31 lines changed


docs/base/core/quality_control.md

Lines changed: 3 additions & 3 deletions
@@ -39,15 +39,15 @@ tags = {
 }
 ```

-Use the `QualityControl.default_grouping` list to define how users should organize a visualization by default. In almost all cases *modality should be the top-level grouping*. For example, building on the example above you might group by: `[["modality"], ["probe", "video"], ["shank"]]` to get a tree split by modality first (which naturally splits ephys and behavior-videos tags into two groups), then by which probe or video a metric belongs to, and finally only for probes the individual shanks are split into groups.
+Use the `QualityControl.default_grouping` list to define how users should organize a visualization by default. In almost all cases *modality should be the top-level grouping*. For example, building on the example above you might group by: `["modality", ("probe", "video"), "shank"]` to get a tree split by modality first (which naturally splits ephys and behavior-videos tags into two groups), then by which probe or video a metric belongs to, and finally only for probes the individual shanks are split into groups.

 ### QualityControl.evaluate_status()

 You can evaluate the state of a set of metrics filtered by any combination of modalities, stages, and tags on a specific date (by default, today). When evaluating the [Status](#status) of a group of metrics the following rules apply:

-First, any metric that is failing and also has a matching tag *value* in the `QualityControl.allow_tag_failures` list is set to pass. This allows you to specify that certain metrics are not critical to a data asset.
+First, any metric that has a tag *value* in the `QualityControl.allow_tag_failures` list is ignored. This allows you to specify that certain metrics are not critical to a data asset.

-Then, given the status of all the metrics in the group:
+Then, given the status of all the remaining metrics in the group:

 1. If any metric is still failing, the evaluation fails
 2. If any metric is pending and the rest pass the evaluation is pending
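For orientation, here is a minimal sketch of the new calling convention described above. It is hedged: the metric name and tag values are hypothetical, and the dict-based construction mirrors the tests added in this commit rather than any documented constructor pattern.

```python
from aind_data_schema.core.quality_control import QualityControl

# Hypothetical single-metric payload, shaped like the dicts used in
# tests/test_quality_control.py further down this diff.
qc = QualityControl.model_validate(
    {
        "metrics": [
            {
                "object_type": "QC metric",
                "name": "Example drift metric",  # hypothetical name
                "modality": {"name": "Extracellular electrophysiology", "abbreviation": "ecephys"},
                "stage": "Processing",
                "value": 42,
                "status_history": [{"evaluator": "Test", "timestamp": "2020-10-10", "status": "Pass"}],
                "tags": {"probe": "A", "shank": "0"},
            }
        ],
        # plain strings and tuples of tag keys can now be mixed freely
        "default_grouping": ["modality", ("probe", "video"), "shank"],
        # tag *values* listed here are ignored when the overall status is evaluated
        "allow_tag_failures": ["Video 2"],
    }
)
print(qc.evaluate_status())  # expected: Status.PASS for this single passing metric
```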

docs/source/components/identifiers.md

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ Code or script identifier
 | `name` | `Optional[str]` | Name |
 | `version` | `Optional[str]` | Code version |
 | `container` | Optional[[Container](#container)] | Container |
-| `run_script` | `Optional[pathlib.Path]` | Run script (Path to run script) |
+| `run_script` | `Optional[pathlib._local.Path]` | Run script (Path to run script) |
 | `language` | `Optional[str]` | Programming language (Programming language used) |
 | `language_version` | `Optional[str]` | Programming language version |
 | `input_data` | Optional[List[[DataAsset](#dataasset) or [CombinedData](#combineddata)]] | Input data (Input data used in the code or script) |

docs/source/components/subjects.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ Description of breeding info for subject

 | Field | Type | Title (Description) |
 |-------|------|-------------|
-| <del>`breeding_group`</del> | `str` | **[DEPRECATED]** Field will be removed in future releases. Breeding Group |
+| <del>`breeding_group`</del> | `Optional[str]` | **[DEPRECATED]** Field will be removed in future releases. Breeding Group |
 | `maternal_id` | `str` | Maternal specimen ID |
 | `maternal_genotype` | `str` | Maternal genotype |
 | `paternal_id` | `str` | Paternal specimen ID |

docs/source/quality_control.md

Lines changed: 22 additions & 7 deletions
@@ -24,15 +24,30 @@ If you find yourself computing a value for something smaller than an entire moda

 ### Tags

-`tags` are any string that naturally groups sets of metrics together. Good tags are things like: "Probe A", "Motion correction", and "Pose tracking". The stage and modality are automatically treated as tags, you do not need to include them in the tags list.
+`tags` are groups of descriptors that define how metrics are organized hierarchically, making it easier to visualize metrics. Good tag keys (groups) are things like "probe" and good tag values are things like "Probe A" or just "A".
+
+```{python}
+# For an electrophysiology metric
+tags = {
+    "probe": "A",
+    "shank": "0",
+}
+
+# For a behavioral video metric
+tags = {
+    "video": "left body",
+}
+```
+
+Use the `QualityControl.default_grouping` list to define how users should organize a visualization by default. In almost all cases *modality should be the top-level grouping*. For example, building on the example above you might group by: `["modality", ("probe", "video"), "shank"]` to get a tree split by modality first (which naturally splits ephys and behavior-videos tags into two groups), then by which probe or video a metric belongs to, and finally only for probes the individual shanks are split into groups.

 ### QualityControl.evaluate_status()

 You can evaluate the state of a set of metrics filtered by any combination of modalities, stages, and tags on a specific date (by default, today). When evaluating the [Status](#status) of a group of metrics the following rules apply:

-First, any metric that is failing and also has a matching tag (or tuple of tags) in the `QualityControl.allow_tag_failures` list is set to pass. This allows you to specify that certain metrics are not critical to a data asset.
+First, any metric that has a tag *value* in the `QualityControl.allow_tag_failures` list is ignored. This allows you to specify that certain metrics are not critical to a data asset.

-Then, given the status of all the metrics in the group:
+Then, given the status of all the remaining metrics in the group:

 1. If any metric is still failing, the evaluation fails
 2. If any metric is pending and the rest pass the evaluation is pending

@@ -72,8 +87,8 @@ Collection of quality control metrics evaluated on a data asset to determine pas
 | `metrics` | List[[QCMetric](quality_control.md#qcmetric) or [CurationMetric](quality_control.md#curationmetric)] | Evaluations |
 | `key_experimenters` | `Optional[List[str]]` | Key experimenters (Experimenters who are responsible for quality control of this data asset) |
 | `notes` | `Optional[str]` | Notes |
-| `default_grouping` | `List[str]` | Default grouping (Default tag grouping for this QualityControl object, used in visualizations) |
-| `allow_tag_failures` | `List[str or tuple]` | Allow tag failures (List of tags that are allowed to fail without failing the overall QC) |
+| `default_grouping` | `List[str or tuple[str, ...]]` | Default grouping (Tag *keys* that should be used to group metrics hierarchically for visualization) |
+| `allow_tag_failures` | `List[str]` | Allow tag failures (List of tag *values* that are allowed to fail without failing the overall QC) |
 | `status` | `Optional[dict]` | Status mapping (Mapping of tags, modalities, and stages to their evaluated status, automatically computed) |

@@ -104,7 +119,7 @@ Description of a curation metric
 | `status_history` | List[[QCStatus](quality_control.md#qcstatus)] | Metric status history |
 | `description` | `Optional[str]` | Metric description |
 | `reference` | `Optional[str]` | Metric reference image URL or plot type |
-| `tags` | `List[str]` | Tags (Tags group QCMetric objects to allow for grouping and filtering) |
+| `tags` | `Dict[str, str]` | Tags (Tags group QCMetric objects. Unique keys define groups of tags, for example {'probe': 'probeA'}.) |
 | `evaluated_assets` | `Optional[List[str]]` | List of asset names that this metric depends on (Set to None except when a metric's calculation required data coming from a different data asset.) |

@@ -121,7 +136,7 @@ Description of a single quality control metric
 | `status_history` | List[[QCStatus](quality_control.md#qcstatus)] | Metric status history |
 | `description` | `Optional[str]` | Metric description |
 | `reference` | `Optional[str]` | Metric reference image URL or plot type |
-| `tags` | `List[str]` | Tags (Tags group QCMetric objects to allow for grouping and filtering) |
+| `tags` | `Dict[str, str]` | Tags (Tags group QCMetric objects. Unique keys define groups of tags, for example {'probe': 'probeA'}.) |
 | `evaluated_assets` | `Optional[List[str]]` | List of asset names that this metric depends on (Set to None except when a metric's calculation required data coming from a different data asset.) |
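To make the grouping hierarchy described in this documentation concrete, here is a standalone illustration in plain Python; the `group` helper and the sample metrics are assumptions for demonstration, not library code.

```python
from collections import defaultdict

# Illustrative metrics carrying only a name and dict-style tags.
metrics = [
    {"name": "drift map", "tags": {"modality": "ecephys", "probe": "A", "shank": "0"}},
    {"name": "pose tracking", "tags": {"modality": "behavior-videos", "video": "left body"}},
]
default_grouping = ["modality", ("probe", "video"), "shank"]

def group(metrics, levels):
    """Recursively bucket metrics by the first matching tag key at each grouping level."""
    if not levels:
        return [m["name"] for m in metrics]
    keys = levels[0] if isinstance(levels[0], tuple) else (levels[0],)
    buckets = defaultdict(list)
    for m in metrics:
        value = next((m["tags"][k] for k in keys if k in m["tags"]), None)
        buckets[value].append(m)
    return {value: group(members, levels[1:]) for value, members in buckets.items()}

print(group(metrics, default_grouping))
# {'ecephys': {'A': {'0': ['drift map']}},
#  'behavior-videos': {'left body': {None: ['pose tracking']}}}
```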

examples/exaspim_quality_control.py

Lines changed: 1 addition & 1 deletion
@@ -134,7 +134,7 @@
     ),
 ]

-quality_control = QualityControl(metrics=metrics, default_grouping=[["neuron_id"]])
+quality_control = QualityControl(metrics=metrics, default_grouping=["neuron_id"])

 serialized = quality_control.model_dump_json()
 deserialized = QualityControl.model_validate_json(serialized)

examples/quality_control.py

Lines changed: 1 addition & 1 deletion
@@ -133,7 +133,7 @@
 q = QualityControl(
     metrics=metrics,
     # in visualizations split first by modality, then by probe / video tags
-    default_grouping=[["modality"], ["probe", "video"]],
+    default_grouping=["modality", ("probe", "video")],
     # allow any metrics with tag video: Video 2 to fail without failing overall QC
     allow_tag_failures=["Video 2"],
 )
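The two comments in this hunk can be exercised end to end; the sketch below is hedged (the metric payloads are hypothetical, modeled on this commit's tests and reusing the ecephys modality entry from them for brevity).

```python
from aind_data_schema.core.quality_control import QualityControl

def video_metric(name: str, status: str, video: str) -> dict:
    """Hypothetical metric payload in the shape used by tests/test_quality_control.py."""
    return {
        "object_type": "QC metric",
        "name": name,
        "modality": {"name": "Extracellular electrophysiology", "abbreviation": "ecephys"},
        "stage": "Processing",
        "value": 1,
        "status_history": [{"evaluator": "Test", "timestamp": "2020-10-10", "status": status}],
        "tags": {"video": video},
    }

qc = QualityControl.model_validate(
    {
        "metrics": [
            video_metric("sharpness", "Pass", "Video 1"),
            video_metric("dropped frames", "Fail", "Video 2"),
        ],
        "default_grouping": ["modality", "video"],
        # the failing metric carries the tag value "Video 2", which is allowed to fail
        "allow_tag_failures": ["Video 2"],
    }
)
print(qc.evaluate_status())  # expected: Status.PASS, since the only failure is an allowed one
```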

src/aind_data_schema/core/quality_control.py

Lines changed: 8 additions & 4 deletions
@@ -128,7 +128,7 @@ class QualityControl(DataCoreModel):
     )
     notes: Optional[str] = Field(default=None, title="Notes")

-    default_grouping: List[tuple[str, ...]] = Field(
+    default_grouping: List[str | tuple[str, ...]] = Field(
         ...,
         title="Default grouping",
         description="Tag *keys* that should be used to group metrics hierarchically for visualization",
@@ -283,9 +283,13 @@ def fix_default_grouping_list(cls, value: dict) -> dict:
         """
         if "default_grouping" not in value:
             return value
-        if value["default_grouping"] and isinstance(value["default_grouping"][0], str):
-            # Add the modality as the top-level grouping, then tag_1 as the second level, similar to old portal behavior
-            value["default_grouping"] = [["modality"], ["tag_1"]]
+
+        if all(isinstance(item, str) for item in value["default_grouping"]):
+            first_metric = value["metrics"][0]
+            if isinstance(first_metric, dict) and "tags" in first_metric:
+                if isinstance(first_metric["tags"], list):
+                    value["default_grouping"] = [["modality"], ["tag_1"]]
+
         return value
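Restating the backwards-compatibility branch above as a standalone predicate (a paraphrase of the new validator logic for readability, not the library code itself):

```python
def uses_legacy_grouping(value: dict) -> bool:
    """True only when default_grouping is all strings AND the first metric still
    carries old list-style tags, mirroring the condition in fix_default_grouping_list."""
    if "default_grouping" not in value:
        return False
    if not all(isinstance(item, str) for item in value["default_grouping"]):
        return False
    first_metric = value["metrics"][0]
    return isinstance(first_metric, dict) and isinstance(first_metric.get("tags"), list)

old = {"metrics": [{"tags": ["tag1", "tag2"]}], "default_grouping": ["group1"]}
new = {"metrics": [{"tags": {"probe": "A"}}], "default_grouping": ["probe"]}
assert uses_legacy_grouping(old) and not uses_legacy_grouping(new)
```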

tests/test_composability_merge.py

Lines changed: 5 additions & 5 deletions
@@ -55,12 +55,12 @@ def test_merge_quality_control(self):

         q1 = QualityControl(
             metrics=metrics,
-            default_grouping=[("group1",)],
+            default_grouping=["group1"],
         )

         q2 = QualityControl(
             metrics=metrics + metrics,
-            default_grouping=[("group1",)],
+            default_grouping=["group1"],
         )

         q3 = q1 + q2
@@ -97,14 +97,14 @@ def test_merge_quality_control(self):
         q1 = QualityControl(
             metrics=metrics,
             key_experimenters=["Alice", "Bob"],
-            default_grouping=[("group1",), ("group1", "group2")],
+            default_grouping=["group1", ("group1", "group2")],
             allow_tag_failures=["FailTag1", "FailTag2"],
         )

         q2 = QualityControl(
             metrics=metrics,
             key_experimenters=["Bob", "Charlie"],  # Bob is duplicate
-            default_grouping=[("group1", "group2"), ("group3",)],  # ("group1", "group2") is duplicate
+            default_grouping=[("group1", "group2"), "group3"],  # ("group1", "group2") is duplicate
             allow_tag_failures=["FailTag2", "FailTag3"],  # FailTag2 is duplicate
         )

@@ -125,7 +125,7 @@ def test_merge_quality_control(self):
         self.assertEqual(set(q3.key_experimenters), {"Alice", "Bob", "Charlie"})

         self.assertEqual(q3.default_grouping.count(("group1", "group2")), 1)  # Should be deduplicated
-        self.assertEqual(set(q3.default_grouping), {("group1",), ("group1", "group2"), ("group3",)})
+        self.assertEqual(set(q3.default_grouping), {"group1", ("group1", "group2"), "group3"})

         self.assertEqual(q3.allow_tag_failures.count("FailTag2"), 1)  # Should be deduplicated
         self.assertEqual(set(q3.allow_tag_failures), {"FailTag1", "FailTag2", "FailTag3"})
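The deduplication these assertions check can be sketched in isolation; the helper below is illustrative only (the library's actual merge implementation is not part of this diff).

```python
def merge_grouping(a: list, b: list) -> list:
    """Order-preserving union; str and tuple grouping entries are both hashable."""
    seen, merged = set(), []
    for item in a + b:
        if item not in seen:
            seen.add(item)
            merged.append(item)
    return merged

print(merge_grouping(["group1", ("group1", "group2")], [("group1", "group2"), "group3"]))
# ['group1', ('group1', 'group2'), 'group3']
```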

tests/test_quality_control.py

Lines changed: 85 additions & 8 deletions
@@ -37,9 +37,7 @@ def test_tags_list_to_dict_conversion(self):
             "modality": {"name": "Extracellular electrophysiology", "abbreviation": "ecephys"},
             "stage": "Processing",
             "value": 42,
-            "status_history": [
-                {"evaluator": "Test", "timestamp": "2020-10-10", "status": "Pass"}
-            ],
+            "status_history": [{"evaluator": "Test", "timestamp": "2020-10-10", "status": "Pass"}],
             "tags": ["tag1", "tag2", "tag3"],
         }

@@ -79,7 +77,7 @@ def test_overall_status(self):

         self.assertEqual(test_metrics[0].status.status, Status.PASS)

-        q = QualityControl(metrics=test_metrics + test_metrics, default_grouping=[("group")])  # duplicate the metrics
+        q = QualityControl(metrics=test_metrics + test_metrics, default_grouping=["group"])  # duplicate the metrics

         # check that overall status gets auto-set if it has never been set before
         self.assertEqual(q.evaluate_status(), Status.PASS)
@@ -149,7 +147,7 @@ def test_evaluation_status(self):
             ),
         ]

-        qc = QualityControl(metrics=metrics, default_grouping=[("group")])
+        qc = QualityControl(metrics=metrics, default_grouping=["group"])
         self.assertEqual(qc.evaluate_status(tag="Drift map"), Status.PASS)

         # Add a pending metric, evaluation should now evaluate to pending
@@ -223,7 +221,7 @@ def test_allowed_failed_metrics(self):
         # First check that a pending evaluation still evaluates properly
         qc = QualityControl(
             metrics=metrics,
-            default_grouping=[("group")],
+            default_grouping=["group"],
         )

         self.assertEqual(qc.evaluate_status(tag="Drift map"), Status.PENDING)
@@ -480,7 +478,7 @@ def test_status_date(self):

         # Note: The date filtering is currently not implemented in the new schema
         # This test would need to be updated once date filtering is implemented
-        qc = QualityControl(metrics=[metric], default_grouping=[("group")])
+        qc = QualityControl(metrics=[metric], default_grouping=["group"])

         self.assertEqual(qc.evaluate_status(date=t3), Status.PASS)
         self.assertEqual(qc.evaluate_status(date=t2), Status.PENDING)
@@ -752,7 +750,7 @@ def test_helper_functions_integration(self):

         qc = QualityControl(
             metrics=metrics,
-            default_grouping=[("group")],
+            default_grouping=["group"],
         )

         # Test status at different times
@@ -770,6 +768,85 @@ def test_helper_functions_integration(self):
         late_status = qc.evaluate_status(date=late_date, tag="time_sensitive")
         self.assertEqual(late_status, Status.FAIL)

+    def test_backwards_compatibility_default_grouping(self):
+        """Test that fix_default_grouping_list validator handles old v2.2.X format correctly"""
+
+        # Test old v2.2.X format: list of strings for default_grouping + list-based tags
+        old_format_dict = {
+            "metrics": [
+                {
+                    "object_type": "QC metric",
+                    "name": "Old format metric",
+                    "modality": {"name": "Extracellular electrophysiology", "abbreviation": "ecephys"},
+                    "stage": "Processing",
+                    "value": 42,
+                    "status_history": [{"evaluator": "Test", "timestamp": "2020-10-10", "status": "Pass"}],
+                    "tags": ["old_tag1", "old_tag2"],
+                }
+            ],
+            "default_grouping": ["group1", "group2"],
+        }
+
+        with self.assertWarns(DeprecationWarning):
+            qc_old = QualityControl.model_validate(old_format_dict)
+
+        # Should convert to [("modality",), ("tag_1",)] for backwards compatibility
+        self.assertEqual(qc_old.default_grouping, [("modality",), ("tag_1",)])
+        # Tags should be converted to dict
+        self.assertEqual(qc_old.metrics[0].tags, {"tag_1": "old_tag1", "tag_2": "old_tag2"})
+
+    def test_new_format_default_grouping_all_strings(self):
+        """Test that new format with all strings in default_grouping is NOT converted"""
+
+        # Test new format: list of strings for default_grouping + dict-based tags
+        new_format_dict = {
+            "metrics": [
+                {
+                    "object_type": "QC metric",
+                    "name": "New format metric",
+                    "modality": {"name": "Extracellular electrophysiology", "abbreviation": "ecephys"},
+                    "stage": "Processing",
+                    "value": 42,
+                    "status_history": [{"evaluator": "Test", "timestamp": "2020-10-10", "status": "Pass"}],
+                    "tags": {"group": "test_group", "probe": "probeA"},
+                }
+            ],
+            "default_grouping": ["group", "probe"],
+        }
+
+        qc_new = QualityControl.model_validate(new_format_dict)
+
+        # Should NOT convert - keep as-is
+        self.assertEqual(qc_new.default_grouping, ["group", "probe"])
+        # Tags should remain as dict
+        self.assertEqual(qc_new.metrics[0].tags, {"group": "test_group", "probe": "probeA"})
+
+    def test_new_format_default_grouping_mixed(self):
+        """Test that new format with mixed strings and tuples in default_grouping is NOT converted"""
+
+        # Test new format: mixed strings and tuples for default_grouping + dict-based tags
+        new_format_dict = {
+            "metrics": [
+                {
+                    "object_type": "QC metric",
+                    "name": "New format metric",
+                    "modality": {"name": "Extracellular electrophysiology", "abbreviation": "ecephys"},
+                    "stage": "Processing",
+                    "value": 42,
+                    "status_history": [{"evaluator": "Test", "timestamp": "2020-10-10", "status": "Pass"}],
+                    "tags": {"group": "test_group", "probe": "probeA", "shank": "shank1"},
+                }
+            ],
+            "default_grouping": ["group", ("probe", "shank")],
+        }
+
+        qc_new = QualityControl.model_validate(new_format_dict)
+
+        # Should NOT convert - keep as-is
+        self.assertEqual(qc_new.default_grouping, ["group", ("probe", "shank")])
+        # Tags should remain as dict
+        self.assertEqual(qc_new.metrics[0].tags, {"group": "test_group", "probe": "probeA", "shank": "shank1"})
+

 if __name__ == "__main__":
     unittest.main()
