Clean up some instances of mishandled error logging #9923
jgallagher merged 2 commits into oxidecomputer:main
|
To clarify, I did this because I was looking for something to work on that would be useful, but not so urgent that I'd end up racing with someone else, and also feasible for me to tackle without any hand-holding. So far, it has felt like a good choice, but if I'm clashing with someone else's work, or this is a misguided thing to pursue for whatever other reason, please let me know. Otherwise, I plan to do a bit more similar work, and either update this PR or make another one, depending on whether anyone has reviewed it yet. |
|
@jgallagher I don't have permission to request reviewers, and I also don't know who to ask, in general. I think you can help me with that, though, so I'm pinging you here. If I understood #9804 correctly, you'd be interested in merging this 🙂 |
I am, thanks! I agree that this is useful work that is pretty off-the-beaten-path. I do have a branch that updates a bunch of log sites within |
|
I'm probably done for today, but if you're willing to push your work in progress to a branch (without making a PR) at some point in the near future, I can either steer clear of it or carry it forward, depending on what you think makes sense. While I was auditing how |
|
I was hitting mostly the logging sites and not the error definitions themselves, which is why I was guessing we wouldn't have too much overlap. My WIP branch is here: https://github.com/oxidecomputer/omicron/tree/john/log-more-errors-llm; it's probably pretty close to ready-to-PR shape, and I'll try to get that up some time soon. |
|
Sure, but don't feel like you have to expedite that on my behalf. If I end up having to deal with some merge conflicts, it's not the end of the world. Is there a particular reason not to derive Admittedly, for the places that are only calling |
03ceec1 to 627ef3b
As described in docs/error-types-and-logging.adoc, recursing into a source's Display implementation can result in duplicated error text in log output, so this commit updates the error attributes on DdmError's variants to stop doing this. Unfortunately, this type of change is breaking, in terms of how callers must log an error, but the difference isn't visible to the type system, and therefore won't result in compilation errors. Accordingly, this commit also updates any callers that were previously relying on the old Display implementation, so they walk the error chain instead.
Similar to the last commit, this updates the #[error(...)] attribute on the DnsResolver variant on EarlyNetworkSetupError, to avoid recursing into the Display implementation of ResolveError. All callers appear to be handling EarlyNetworkSetupError correctly already, so this commit doesn't make any other changes.
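To illustrate the duplication described in the commit messages above, here is a minimal sketch using only the standard library. The types `Inner`, `Redundant`, and `Clean`, and the helper `render_chain`, are hypothetical stand-ins invented for this example; they are not the actual `DdmError` or `EarlyNetworkSetupError` definitions, and `render_chain` only approximates what a chain-walking logger does.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical leaf error, standing in for an underlying cause.
#[derive(Debug)]
struct Inner;

impl fmt::Display for Inner {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "connection refused")
    }
}

impl Error for Inner {}

// Hypothetical wrapper whose Display recurses into its source AND
// which also exposes that source via `source()`.
#[derive(Debug)]
struct Redundant(Inner);

impl fmt::Display for Redundant {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "request failed: {}", self.0)
    }
}

impl Error for Redundant {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.0)
    }
}

// Hypothetical wrapper whose Display describes only itself; the cause
// is reachable solely through `source()`.
#[derive(Debug)]
struct Clean(Inner);

impl fmt::Display for Clean {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "request failed")
    }
}

impl Error for Clean {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.0)
    }
}

// Walks the `source()` chain, joining each Display output with ": ".
fn render_chain(err: &(dyn Error + 'static)) -> String {
    let mut out = err.to_string();
    let mut next = err.source();
    while let Some(cause) = next {
        out.push_str(": ");
        out.push_str(&cause.to_string());
        next = cause.source();
    }
    out
}
```

With these definitions, `render_chain(&Redundant(Inner))` repeats the cause ("request failed: connection refused: connection refused"), while `render_chain(&Clean(Inner))` prints it once. The flip side is the silent breakage the commit message mentions: `Clean(Inner).to_string()` is just "request failed", so a caller that logs with plain `Display` still compiles but loses the cause.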
627ef3b to 889deab
|
I took a look at how the |
No, not really. I was focusing on the logging sites because even if we derive
I'm not sure what you mean here - could you expand on that? |
Sure: it seems like the main problem here is that from a call site's perspective, or this: and then that line would fail to compile if the underlying error type doesn't provide the method. I just belatedly realized that deriving |
Ahh, got it. Yes, I think if we want to make more widespread use of the derive, having an explicit method like this that makes it obvious we're getting the full chain would be great.
Yeah, |
|
I suppose the logical conclusion of that approach is a trait that gets automatically implemented on other types that implement That wouldn't be materially different from how |
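For readers following along, the idea of a trait that is automatically implemented for every error type could look roughly like the sketch below. The trait name `ChainDisplay` and the method `display_chain` are hypothetical names chosen for this example; this is not the API of `slog-error-chain`, only an illustration of the blanket-implementation pattern using the standard library.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical extension trait: an explicit, greppable way to ask for
// the full error chain at a call site.
trait ChainDisplay {
    fn display_chain(&self) -> String;
}

// Blanket implementation: every type that implements `Error` gets the
// method for free; types that do not implement `Error` do not have it,
// so misuse fails at compile time.
impl<E: Error + ?Sized> ChainDisplay for E {
    fn display_chain(&self) -> String {
        let mut out = self.to_string();
        let mut next = self.source();
        while let Some(cause) = next {
            out.push_str(": ");
            out.push_str(&cause.to_string());
            next = cause.source();
        }
        out
    }
}

// Hypothetical error types used only to exercise the trait.
#[derive(Debug)]
struct Leaf;

impl fmt::Display for Leaf {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "leaf failed")
    }
}

impl Error for Leaf {}

#[derive(Debug)]
struct Wrapper(Leaf);

impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "wrapper failed")
    }
}

impl Error for Wrapper {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.0)
    }
}
```

A call site would then write `Wrapper(Leaf).display_chain()`, which makes it obvious that the whole chain is being rendered. Calling `display_chain()` on something like a `String` would not compile, because `String` does not implement `Error`.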
|
Looks like |
|
You're correct; that's a known-but-annoying test flake (#9230). |
|
As linked above, I ended up making a PR at oxidecomputer/slog-error-chain#20, which uses a blanket implementation to provide all This style can be used with slog macros as well, so I think it's worth including. I don't consider the extra call to |
This cleans up handling of DdmError and ResolveError. It also updates any downstream code that was previously relying on the redundancy in DdmError's Display implementation, such that it instead relies on the implementation of slog::Value from the slog-error-chain crate.

Related to #9804 and #9803.