-
Notifications
You must be signed in to change notification settings - Fork 0
[Feature] Add scene comparison report. (Resolves #43) #56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
lib/src/scenes/gallery.dart
Outdated
for (final mismatch in mismatches.mismatches.values) { | ||
FtgLog.pipeline.fine(" - ${mismatch.golden?.id ?? mismatch.screenshot?.id}: $mismatch"); | ||
switch (mismatch) { | ||
case MissingGoldenMismatch(screenshot: null): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be more convenient to have different classes for each of these situations? I don't really care either way. The whole API surface is up for debate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think that different classes would make it clearer. Updated.
lib/src/scenes/gallery.dart
Outdated
} | ||
} | ||
await tester.runAsync(() async { | ||
final goldenWidth = mismatch.golden!.image.width; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stuff inside the tester.runAsync()
should probably be a standalone behavior somewhere. Maybe inside of a failure_scene.dart
, where we'll eventually implement entire scenes of failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved.
lib/src/scenes/gallery.dart
Outdated
}); | ||
} | ||
|
||
throw Exception("Goldens failed with ${mismatches.mismatches.length} mismatch(es)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please verify how tests usually output failure info. Do they run a regular print followed by an exception? Or does the exception typically contain all intended output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They just throw a TestFailure
. I modified it to call fail
, which is the same that is called when an expect
fails.
lib/src/scenes/golden_scene.dart
Outdated
|
||
/// A report of a golden scene test. | ||
/// | ||
/// Holds information to display the results of a golden scene test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doc line is redundant.
Some things worth pointing out would include the fact that this reports on individual success and failures for each golden in the scene, as well as goldens that have no candidate, and candidates that have no corresponding golden.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
lib/src/scenes/golden_scene.dart
Outdated
}); | ||
|
||
/// The human readable description of the scene. | ||
final String sceneDescription; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My gut feeling is that we should probably reference the entire golden scene metadata here, allowing the report to pull out whatever it wants from that.
For example, it's very likely that when we report a failure, we'll want to check the platform that generated the golden, and the platform that generated the candidate, and report that difference to the user, if it exists. That data will be available in the overall metadata. But if we go property by property then we've gotta keep piping things in here.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to reference the metadata. But it doesn't look like the GoldenSceneMetadata
holds the scene description, so we would need to keep it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe that change is only on my branches, but that metadata should definitely have the description. We should just add that, rather than avoid it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the description to the metadata class. The goldens will need to be re-generated afterwards to store it.
lib/src/scenes/golden_scene.dart
Outdated
final List<MissingGoldenMismatch> extraCandidates; | ||
|
||
/// The total number of successful [items] in the scene. | ||
final int totalPassed; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make more sense for these totals to be synthesized properties? Otherwise, it seems like there's a possible bug where the numbers passed in don't match the actual item
, missingCandidates
, and extraCandidates
in the scene.
I'm also wondering if "passed" and "failed" is a sufficient variety of statuses when we're talking about totals. Is an extra candidate a failure? If so, you could end up with 20 failures in a scene with 5 goldens. At a minimum, we should think that through and ensure we're reporting info that's likely to be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed them to be synthesized properties.
I think a failure should be counted once per screenshot that was found in both the original golden and the candidate widget tree. I think we shouldn't treat a missing/extra candidate as a regular failure. Keeping them separate makes it clearer in my opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Please be sure to document that in the Dart Docs. I haven't looked at the updated code yet.
lib/src/scenes/golden_scene.dart
Outdated
}); | ||
|
||
factory GoldenReportItem.success({ | ||
required String description, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a single required property, we probably don't need to name it.
Though I'm also wondering what value the description
has. Is there a reason that users would want the description and nothing else? For example, we haven't passed the golden id
in here, so how do we know which golden we're talking about?
Should the GoldenReport
take the entire GoldenMetadata
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
lib/src/scenes/golden_scene.dart
Outdated
|
||
/// The details of the golden check for this item. | ||
/// | ||
/// Might contain both successful and failed checks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would this contain successes and failures? Isn't this class for a single golden?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intended to be used for the cases you mentioned where we want to perform invisible checks. That way, a single golden can have a pixel check that succeeded and other checks that failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably just ignore that for the moment, because we don't have sufficient API support to even generate those things.
That said, I'm thinking about things like focus and semantics as "layers" in a golden. I think it's OK for us to clearly differentiate between mismatched pixels vs mismatches in some arbitrary set of layers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
|
||
final Map<String, GoldenImage> imagesById; | ||
|
||
final GoldenSceneMetadata metadata; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be here. I assume this was added to make the scene metadata available somewhere that we have a collection. Currently these are separate concepts. Perhaps later we'll merge them both into GoldenSceneMetadata
, but for now we should respect the barrier.
Is there a reasonable way to get the scene metadata where you need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
lib/src/scenes/gallery.dart
Outdated
// The golden check passed. | ||
items.add( | ||
GoldenReport.success( | ||
metadata.images.where((image) => image.id == screenshotId).first, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this searching for here? It's not obvious to me...
Also, it looks like this image is passed into both possible execution paths. Any reason not to look this up before the if-statement so that we don't risk screwing up the selection in one branch but not the other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
/// The total number of successful [items] in the scene. | ||
int get totalPassed => items.where((e) => e.status == GoldenTestStatus.success).length; | ||
|
||
/// The total number of failed [items] in the scene. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Dart Doc still needs clarity about what is considered a failure, per our previous conversation about missing goldens and missing candidates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
lib/src/scenes/golden_scene.dart
Outdated
int get totalFailed => items.where((e) => e.status == GoldenTestStatus.failure).length; | ||
} | ||
|
||
/// An item in a golden scene report. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of the term "item" is cyclical. It also no longer matches the name of the class. I believe this class is "A report of success or failure for a single golden within a scene."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
[Feature] Add scene comparison report. (Resolves #43)
This PR introduces the
GoldenSceneReport
, which holds information necessary to provide a human readable report.Reports the following information: