-
Notifications
You must be signed in to change notification settings - Fork 754
VPP app install retry on 9610 #38008
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #38008 +/- ##
==========================================
- Coverage 65.87% 65.86% -0.01%
==========================================
Files 2392 2392
Lines 190799 190739 -60
Branches 8364 8310 -54
==========================================
- Hits 125689 125638 -51
+ Misses 53690 53673 -17
- Partials 11420 11428 +8
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
server/mdm/apple/vpp/api.go
Outdated
| // ManageVPPLicenses is using the deprecated V1 API to manage VPP licenses, the V1 API works synchronously. | ||
| // https://developer.apple.com/documentation/devicemanagement/manage-licenses | ||
| // This was added to hopefully reduce the number of 9610 (License not available) due to the async nature, and us not supporting the notification/async events correctly for the V2 API. | ||
| // https://github.com/fleetdm/fleet/issues/36724 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Erm, why are we adding use of a deprecated API? Seems like we'll have to back this out sooner rather than later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good question, we decided to do that since the move to V2 is not actually fully supported by us.
We moved, and started using batch/async endpoints but never implemented the notifications service that actually gets the response of those calls.
The async nature of that, plus no validation lead us to wanting to do it in a sync manner (V1), until we can prioritize the story to build out the Apple Notifications Service integration.
We followed it up with 3 retries, but since they are pretty close to the actual call, we weren't certain it is enough and we don't have a good/reliable way to reproduce this locally, as we think it stems from large orgs that hit apple API's a lot, therefore Apple might be slower in their envs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using this seems fine as Apple has deprecated it but doesn't seem to be saying they're going to stop supporting it yet. I think we should add a caveat here to make callers aware of this limitation. We probably don't want to change tihs to ever do multi-assignment without more work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably don't want to change tihs to ever do multi-assignment without more work
Given this concern, maybe we should change the function signature so that we only allow a single serial number rather than a slice as currently contemplated by the type of AssociateSerialNumbers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per our MDM stand-up, we reverted the v1 change, and kept the 3 times retry
| if err != nil { | ||
| return nil, ctxerr.Wrap(r.Context, err, "fetching host vpp install by command uuid") | ||
| } | ||
| if vppInstall.RetryCount < 3 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the retry count ever get reset? If the cap is reached, does that prevent the admin/end user from trying to install again later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding(and testing on my local fleet server) re-running a VPP install creates a new row(and as such a new counter) in host_vpp_software_installs
JordanMontgomery
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good, would still like to see what software says
Related issue: Resolves #36724
This PR updates the VPP Software installation (Apple association) to use the V1 API which is non-async.
It also retries VPP apps if we receive a 9610 error 3 times.
Checklist for submitter
If some of the following don't apply, delete the relevant line.
changes/,orbit/changes/oree/fleetd-chrome/changes.See Changes files for more information.
Testing