-
Notifications
You must be signed in to change notification settings - Fork 612
Fix Timer leakage in TimerBasedCredentialRefresher ... #1640
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@MatthiasWerning Please sign the Contributor License Agreement! Click here to manually synchronize the status of this Pull Request. See the FAQ for frequently asked questions. |
Okay, so everything is done from my side, including local testing from within my components and adding a test case for the fixed behavior. I triggered all necessary instances internally at my company to get the CLA signed, I hope this can be resolved next week at latest. If you have any feedback or suggestions for code changes let me know. |
|
||
public ICredentialsRefresher.NotifyCredentialRefreshedAsync Callback { get; set; } | ||
|
||
public TimerRegistration(object lockObj, ICredentialsRefresher.NotifyCredentialRefreshedAsync callback) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh, this is tricky and potentially tricky to debug if errors happen, you use the same lock object in your main type.
The general recommendation is not to use the same lock object for multiple resources ands that's the case here, in the main type for your dictionary and here for your elapsed logic.
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/statements/lock#guidelines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually something I noticed as well, but I'm quite unsure on what would be an alternative. Unregistering may happen at any time, even when the timer elapsed event is being triggered in parallel. When removing the lock the worst case would be, that the timer runs one more cycle and then shuts itself down. I guess this is acceptable in order to avoid passing the lock object down. Any other ideas?
} | ||
|
||
provider.Refresh(); | ||
ScheduleTimer(provider); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this called again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timer is initialized using AutoReset = false which leads to the timer Elapsed event only being thrown once and never again. Therefore a new timer needs to be scheduled to keep the "chain" of timers running until it is unregistered by the owning TimerBasedCredentialRefresher.
I did not alter this behavior from the version before this PR, see here:
https://github.com/rabbitmq/rabbitmq-dotnet-client/blob/main/projects/RabbitMQ.Client/client/api/ICredentialsRefresher.cs#L126
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay, I'm not really familiar with that specific timer, there are simply too many in .NET.
If you call ScheduleTimer again, you need to make sure you dispose any existing old timer, so something like this might be necessary (line 180+)
{
var oldTimer = _timer;
_timer = newTimer;
oldTimer.Dispose();
}
I'm not sure, but I think using it with autoreset might be cleaner.
Just for my understanding, there are possibly multiple ICredentialsProvider provider and each might have a different refresh timeout? That's why there are multiple timers and it's not possible to have like the least common multiple timeout (besides 1s or something similar) timer and since old versions of .NET are supported, there is no PriorityQueue which might simplify that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know the timer has to be disposed, next commit will include that.
From my perspective (I'm not a C# expert) using a chain of self triggering timers is a cleaner approach than a single persisting timer firing in a specific interval. This is because the elapse event method may take an indefinite amount of time to complete. So theoretically if the elapse method requires more time than the interval itself we get some type of congestion when multiple elapse methods run and update stuff. Of course in this scenario we would have other problems as well. Thus the elapse method may never take longer than 1/3 of the credentials lifetime (JWT for example).
Also see this article here for an more in-depth example (JS, but basically transferable to this use case here): https://reallifejs.com/brainchunks/repeated-events-timeout-or-interval/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that makes sense, it might also make sense to look into making the ICredentialsProvider properly async in 8.0 and then looking at better ways to do periodic refresh, i.e. PeriodicTimer or some maybe even a dumb refresh/delay loop might be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it might also make sense to look into making the ICredentialsProvider properly async in 8.0
No time like the present ... before version 7 ships!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No time like the present ... before version 7 ships!
Yeah, I just took a good long look at it, there is quite a bit of change there, including figuring out the problem this PR solves, just with the added complexity of async and different target frameworks (only .NET 6 supports PeriodicTimer, which is the only real async capable timer, including the fact that we don't know the timer period until the token has been refreshed atleast once and it might change while the timer is running), then a lot of interfaces and related types need to be changed.
And probably a different concept for static credentials (basic auth) vs. dynamic stuff (oauth2).
I tried out the above here:
Tornhoof@8ec63c1
I personally don't know how much in use the OAuth2 stuff is, I've never used it with RabbitMQ, imho the OAuthClient requires quite a bit of overhaul (see my changes above), a lot of disposes are missing, questionable use of System.Text.Json and the whole sync over async code is questionable at best.
If you really want to release 7.0 this month, I'd recommend moving this to 8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate your input, and I'll discuss with @MarcialRosales and other Team RabbitMQ members next week.
{ | ||
if (provider.ValidUntil == null) | ||
{ | ||
throw new ArgumentNullException("ValidUntil of " + nameof(provider) + " was null"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nameof(provider) will just say "provider", you probably want provider.GetType().Name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will resolve this in the next commit.
@Tornhoof @MatthiasWerning thanks for looking into this issue. The original author, @MarcialRosales is also reviewing. |
8236789
to
371281a
Compare
371281a
to
20e9833
Compare
Made code changes and refactorings as suggested by @Tornhoof . I hope my manager will be able to sign the CLA this week, it's pretty much out of my hand as of right now. |
…y one registration across all connection recoveries
…alled again after updating
…wait of notification callback
…ration; dispose old timer on elapse; general suggested refactorings
88533cb
to
350be50
Compare
Isn't there something off with the refresher design that requires addressing instead?
Let me try to explain myself a bit more. I had a brief look at the issue description and at this PR. I thought in order to have one timer it would be possible to change the concurrent dictionary to use a The callbacks per connection though are never really expressed, which makes it difficult to have deterministic behavior when to remove callbacks etc. I might not have spent enough time with the design but something is smelly. |
@lukebakken it seems AutoRecovery tries to abort the inner connection if it was opened but never disposes the inner connection. Is that intentional? Because if the auto recovery connection would deterministically dispose the inner connection, this could be used to unregister the callback. But then you'd still have the problem that you have no deterministic point to dispose the timer and that while the new inner connection is opened refresh might be called on the old connection when the timer is due, which might not be desirable. I also wonder if it wouldn't be simpler to switch to System.Threading.Timer or Periodic Timer. At least with the System.Threading.Timer you could use the change method in the callback method to make the timer fire again and wouldn't have to recreate new timers again and again, which might also simplify the code. But even with those timers you would require disposing of the disposable timer which brings me back to thinking the refresher concept might need an some more thinking. |
I don't even see where the inner connection is aborted, to be honest. Not disposing the inner connection is certainly an oversight. @MarcialRosales @MatthiasWerning I have opened the following PR to this PR: I couldn't see the rationale for having multiple credentials provider registrations in I'm going to discuss this feature with the original implementor, @MarcialRosales, tomorrow. |
@lukebakken I had another brief look and it seems the whole existing design seems a bit raw. For example, the token refreshing in any production grade implementation would always do some sort of IO-bound call which is quite likely async. The credential provider isn't. The IOAuth2Client implementation is also not async and the token uses a weird mixture of casing on the properties. Next up, it seems there is always a one-to-one relationship by the refresher and the provider currently in the code. So why can't the design be simplified to basically initialize the refresher with the provider and let the code register delegates. Just spit balling ideas at the top of my head something like public interface ICredentialsRefresher
{
void Initialize(ICredentialsProvider provider);
IDisposable Register(NotifyCredentialRefreshed callback);
} Then the connection can dispose the returned registration, which then removes it from the internal registration list in a thread safe manner. Another approach could be to essentially marry everything together by having a token provider only that the client always calls when it needs a token and let it up to the implementation to do the necessary async fetching, caching etc. See for example how Azure does it https://github.com/search?q=repo%3AAzure/azure-sdk-for-net%20GetTokenAsync&type=code I think having a single TokenProvider interface that returns a ValueTask with a token is probably the best approach to simplify the code Again I have only spent a few minutes looking at things due to time constraints on my end so take this with a grain of salt. |
@danielmarbach thank you for the feedback! |
FYI you are aware the v6.x version has the same problem? |
Yes, of course. |
@MatthiasWerning - thanks for all of the work you have put into this. I thought about the whole "credential refresh" feature and I think I have a totally different implementation in mind. I'll open a new PR with that when it's done and will tag you, @danielmarbach and @MarcialRosales when it's ready (later today, hopefully). I'm going to close this PR but please don't delete your branch just yet, in case I'm way off-base 😸 |
... by maintaining only one registration across all connection recoveries
Proposed Changes
Replaced registrations in TimerBasedCredentialRefresher by custom class to maintain only one registration across Connection recovery. Callback will be updated for the registration on each subsequent Register call.
Since multiple dictionary operations are now required to be atomic, locking is now done via a helper object for each TimerBasedCredentialRefresher instance and passed to the registration to check if it has been disposed yet by the Unregister method.
Types of Changes
What types of changes does your code introduce to this project?
Put an
x
in the boxes that applyChecklist
CLA will be an issue within my company which I cannot promise will be resolved in a few days (this has been an issue earlier with another CLA which also isn't finally resolved yet). Unfortunately I'm not able to make decisions on behalf of my company.
CONTRIBUTING.md
document