[Feature] Add scene comparison report. (Resolves #43) #56

angelosilvestre · 2025-06-19T18:05:49Z

[Feature] Add scene comparison report. (Resolves #43)

This PR introduces the GoldenSceneReport, which holds the information needed to produce a human-readable report.

Reports the following information:

- Number of failed/passed items
- Number of missing/extra candidates
- The description of each item
- The failure description of each failed item
- The description of each missing candidate
- The description of each extra candidate

lib/src/goldens/golden_comparisons.dart

lib/src/scenes/gallery.dart

matthew-carroll · 2025-06-19T18:29:57Z

lib/src/scenes/gallery.dart

+    for (final mismatch in mismatches.mismatches.values) {
+      FtgLog.pipeline.fine(" - ${mismatch.golden?.id ?? mismatch.screenshot?.id}: $mismatch");
+      switch (mismatch) {
+        case MissingGoldenMismatch(screenshot: null):


Would it be more convenient to have different classes for each of these situations? I don't really care either way. The whole API surface is up for debate.

Yeah, I think that different classes would make it clearer. Updated.
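As a rough illustration of the "different classes" idea, the sketch below models each mismatch situation as its own subclass of a sealed type, so that a switch over them is checked for exhaustiveness. Every name here (`SceneMismatch`, `PixelMismatch`, `MissingCandidate`, `ExtraCandidate`, `describe`) is hypothetical and is not taken from this PR.

```dart
// Hypothetical sketch: the names below are illustrative, not the PR's API.

/// Base type for anything that can go wrong when comparing a scene.
sealed class SceneMismatch {
  const SceneMismatch(this.id);

  /// ID of the golden or candidate this mismatch refers to.
  final String id;
}

/// The golden and the candidate both exist, but their pixels differ.
class PixelMismatch extends SceneMismatch {
  const PixelMismatch(super.id, {required this.mismatchedPixels});

  final int mismatchedPixels;
}

/// A golden exists, but the candidate scene has no matching screenshot.
class MissingCandidate extends SceneMismatch {
  const MissingCandidate(super.id);
}

/// The candidate scene has a screenshot with no corresponding golden.
class ExtraCandidate extends SceneMismatch {
  const ExtraCandidate(super.id);
}

/// Describes a mismatch. Because [SceneMismatch] is sealed, the compiler
/// reports an error if a subclass is not handled in this switch.
String describe(SceneMismatch mismatch) => switch (mismatch) {
      PixelMismatch(:final id, :final mismatchedPixels) =>
        '$id: $mismatchedPixels pixel(s) differ',
      MissingCandidate(:final id) => '$id: no candidate for this golden',
      ExtraCandidate(:final id) => '$id: no golden for this candidate',
    };
```

With this shape, a caller can no longer confuse "missing golden" with "missing candidate" by inspecting which field is null. The type itself carries that information.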

lib/src/scenes/gallery.dart

matthew-carroll · 2025-06-19T18:37:10Z

lib/src/scenes/gallery.dart

-            }
-          }
+      await tester.runAsync(() async {
+        final goldenWidth = mismatch.golden!.image.width;


The stuff inside the tester.runAsync() should probably be a standalone behavior somewhere. Maybe inside of a failure_scene.dart, where we'll eventually implement entire scenes of failures.

matthew-carroll · 2025-06-19T18:39:02Z

lib/src/scenes/gallery.dart

+      });
    }
+
+    throw Exception("Goldens failed with ${mismatches.mismatches.length} mismatch(es)");


Please verify how tests usually output failure info. Do they run a regular print followed by an exception? Or does the exception typically contain all intended output?

They just throw a TestFailure. I modified it to call fail, which is the same function that is called when an expect fails.

lib/src/scenes/gallery.dart

lib/src/scenes/failure_scene.dart

matthew-carroll · 2025-06-19T19:38:42Z

lib/src/scenes/golden_scene.dart

+
+/// A report of a golden scene test.
+///
+/// Holds information to display the results of a golden scene test.


This doc line is redundant.

Some things worth pointing out would include the fact that this reports on individual successes and failures for each golden in the scene, as well as goldens that have no candidate, and candidates that have no corresponding golden.

matthew-carroll · 2025-06-19T19:40:20Z

lib/src/scenes/golden_scene.dart

+  });
+
+  /// The human readable description of the scene.
+  final String sceneDescription;


My gut feeling is that we should probably reference the entire golden scene metadata here, allowing the report to pull out whatever it wants from that.

For example, it's very likely that when we report a failure, we'll want to check the platform that generated the golden, and the platform that generated the candidate, and report that difference to the user, if it exists. That data will be available in the overall metadata. But if we go property by property then we've gotta keep piping things in here.

What do you think?

I think it makes sense to reference the metadata. But it doesn't look like the GoldenSceneMetadata holds the scene description, so we would need to keep it here.

Maybe that change is only on my branches, but that metadata should definitely have the description. We should just add that, rather than avoid it.

I added the description to the metadata class. The goldens will need to be re-generated afterwards to store it.

matthew-carroll · 2025-06-19T19:42:30Z

lib/src/scenes/golden_scene.dart

+  final List<MissingGoldenMismatch> extraCandidates;
+
+  /// The total number of successful [items] in the scene.
+  final int totalPassed;


Would it make more sense for these totals to be synthesized properties? Otherwise, it seems like there's a possible bug where the numbers passed in don't match the actual items, missingCandidates, and extraCandidates in the scene.

I'm also wondering if "passed" and "failed" are a sufficient variety of statuses when we're talking about totals. Is an extra candidate a failure? If so, you could end up with 20 failures in a scene with 5 goldens. At a minimum, we should think that through and ensure we're reporting info that's likely to be useful.

Changed them to be synthesized properties.

I think a failure should be counted once per screenshot that was found in both the original golden and the candidate widget tree. I think we shouldn't treat a missing/extra candidate as a regular failure. Keeping them separate makes it clearer in my opinion.

Ok. Please be sure to document that in the Dart Docs. I haven't looked at the updated code yet.
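A minimal sketch of the outcome of this thread, assuming simplified stand-in types: the totals are derived from the items on demand, and missing or extra candidates are kept out of the failure count. `SceneReportSketch` and `ReportItemSketch` are hypothetical names, and storing missing/extra candidates as plain ID strings is an assumption made only to keep the example short. Only the getter names and the `GoldenTestStatus` values mirror the diff.

```dart
// Hypothetical sketch: simplified stand-ins for the types in this thread.

enum GoldenTestStatus { success, failure }

/// Result for one golden that has both a golden image and a candidate.
class ReportItemSketch {
  const ReportItemSketch(this.id, this.status);

  final String id;
  final GoldenTestStatus status;
}

class SceneReportSketch {
  const SceneReportSketch({
    required this.items,
    this.missingCandidates = const [],
    this.extraCandidates = const [],
  });

  /// Goldens that were compared against a candidate.
  final List<ReportItemSketch> items;

  /// IDs of goldens with no candidate (assumed to be plain IDs here).
  final List<String> missingCandidates;

  /// IDs of candidates with no golden (assumed to be plain IDs here).
  final List<String> extraCandidates;

  /// Number of compared goldens that passed.
  int get totalPassed =>
      items.where((e) => e.status == GoldenTestStatus.success).length;

  /// Number of compared goldens that failed. Missing and extra candidates
  /// are deliberately not counted; they are reported separately.
  int get totalFailed =>
      items.where((e) => e.status == GoldenTestStatus.failure).length;
}
```

Because the counts are computed from `items`, they cannot drift out of sync with the list, which is the bug the review comment was concerned about.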

lib/src/scenes/golden_scene.dart

matthew-carroll · 2025-06-19T19:45:03Z

lib/src/scenes/golden_scene.dart

+  });
+
+  factory GoldenReportItem.success({
+    required String description,


For a single required property, we probably don't need to name it.

Though I'm also wondering what value the description has. Is there a reason that users would want the description and nothing else? For example, we haven't passed the golden id in here, so how do we know which golden we're talking about?

Should the GoldenReport take the entire GoldenMetadata instead?

matthew-carroll · 2025-06-19T19:45:43Z

lib/src/scenes/golden_scene.dart

+
+  /// The details of the golden check for this item.
+  ///
+  /// Might contain both successful and failed checks.


Why would this contain successes and failures? Isn't this class for a single golden?

This is intended to be used for the cases you mentioned where we want to perform invisible checks. That way, a single golden can have a pixel check that succeeded and other checks that failed.

We can probably just ignore that for the moment, because we don't have sufficient API support to even generate those things.

That said, I'm thinking about things like focus and semantics as "layers" in a golden. I think it's OK for us to clearly differentiate between mismatched pixels vs mismatches in some arbitrary set of layers.

matthew-carroll · 2025-06-20T00:50:41Z

lib/src/goldens/golden_collections.dart


  final Map<String, GoldenImage> imagesById;

+  final GoldenSceneMetadata metadata;


This shouldn't be here. I assume this was added to make the scene metadata available somewhere that we have a collection. Currently these are separate concepts. Perhaps later we'll merge them both into GoldenSceneMetadata, but for now we should respect the barrier.

Is there a reasonable way to get the scene metadata where you need it?

lib/src/goldens/golden_collections.dart

matthew-carroll · 2025-06-20T22:58:11Z

lib/src/scenes/gallery.dart

+        // The golden check passed.
+        items.add(
+          GoldenReport.success(
+            metadata.images.where((image) => image.id == screenshotId).first,


What is this searching for here? It's not obvious to me...

Also, it looks like this image is passed into both possible execution paths. Any reason not to look this up before the if-statement so that we don't risk screwing up the selection in one branch but not the other?
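One way to read the second suggestion, sketched with hypothetical stand-in types: resolve the image metadata once, before branching, and let both branches use the same value. `GoldenImageMetadataSketch`, `reportLineFor`, and the `hasMismatch` flag are invented for illustration. The real code builds report objects rather than strings.

```dart
// Hypothetical sketch: simplified stand-ins, not the PR's actual types.

class GoldenImageMetadataSketch {
  const GoldenImageMetadataSketch(this.id, this.description);

  final String id;
  final String description;
}

/// Builds one report line for [screenshotId]. The metadata lookup happens
/// once, before branching, so both branches use the same entry.
String reportLineFor(
  String screenshotId,
  List<GoldenImageMetadataSketch> images, {
  required bool hasMismatch,
}) {
  // Throws a StateError if no image has this ID.
  final imageMetadata = images.firstWhere((image) => image.id == screenshotId);

  if (hasMismatch) {
    return 'FAILED: ${imageMetadata.description}';
  }
  return 'PASSED: ${imageMetadata.description}';
}
```

Naming the looked-up value (here `imageMetadata`) also addresses the first question, since the variable name states what the search is for.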

lib/src/scenes/gallery.dart

matthew-carroll · 2025-06-20T23:01:06Z

lib/src/scenes/golden_scene.dart

+  /// The total number of successful [items] in the scene.
+  int get totalPassed => items.where((e) => e.status == GoldenTestStatus.success).length;
+
+  /// The total number of failed [items] in the scene.


This Dart Doc still needs clarity about what is considered a failure, per our previous conversation about missing goldens and missing candidates.

matthew-carroll · 2025-06-20T23:02:10Z

lib/src/scenes/golden_scene.dart

+  int get totalFailed => items.where((e) => e.status == GoldenTestStatus.failure).length;
+}
+
+/// An item in a golden scene report.


The use of the term "item" is circular. It also no longer matches the name of the class. I believe this class is "A report of success or failure for a single golden within a scene."

[Feature] Add scene comparison report (Resolves #43)

f2435ff

angelosilvestre requested a review from matthew-carroll June 19, 2025 18:06

matthew-carroll reviewed Jun 19, 2025


PR updates

45a26c2

angelosilvestre requested a review from matthew-carroll June 19, 2025 19:26

matthew-carroll reviewed Jun 19, 2025


PR updates

b10a612

angelosilvestre requested a review from matthew-carroll June 19, 2025 21:33

matthew-carroll reviewed Jun 20, 2025


PR updates

831e127

angelosilvestre requested a review from matthew-carroll June 20, 2025 22:12

angelosilvestre added 2 commits June 20, 2025 19:36

Separate metadata extraction

0de9b9b

PR updates

8b18cee

matthew-carroll reviewed Jun 20, 2025


PR updates

a911782

angelosilvestre requested a review from matthew-carroll June 20, 2025 23:14

matthew-carroll approved these changes Jun 20, 2025


matthew-carroll marked this pull request as ready for review June 20, 2025 23:22

matthew-carroll merged commit c919d19 into main Jun 20, 2025
1 check passed

matthew-carroll deleted the 43_golden-scene-report branch June 20, 2025 23:22

