Make the number of repositories fetched at once configurable to handle large registries #353
@Wwwsylvia, @estebanreyl, @northtyphoon, @sajayantony, @shizhMSFT, @wangxiaoxuan273, @wju-MSFT can someone PTAL?
Code LGTM. But I'm wondering if we really need this.
Getting |
In our ACR, it is not temporary. It just takes too long and times out, as the time seems proportional to the number of tags in a repository.
This change seems perfectly reasonable to me, and I see no reason not to approve it. Repository listing can be slow when there are a lot of repos because of a server-side inefficiency in how we list repos, for which we have a longstanding work item. I won't get too deep into the details here, but in essence, if there are a lot of repos, we can run into a noticeable delay when querying the repo metadata. It shouldn't be resulting in 502s, however (maybe there is some specific network config leading to that; @JRBANCEL, are you using a proxy / firewall?). I went over our logs and don't see any 502s returned for our catalog API.
LGTM
To be honest, I can't find that log anymore. But yes, it was some gateway error regarding a timeout.
Also PTAL at #354 when you get a second 🙏
We should also add an example for the new flag in README.md and the command example message.
LGTM
Purpose of the PR
From empirical observation, listing the repositories is `O(# of OCI objects)` and not `O(# of repositories)`. Therefore, when a registry contains repositories with many objects (hundreds of thousands, say), listing repositories in batches of 100 fails with an HTTP 502 because the gateway takes too long to list the repositories.

This PR makes the batch size configurable so that these scenarios can be handled. Setting it to a smaller number like `5` just works, as the total time spent in the gateway to list fewer repositories is less than the timeout.

I followed the `concurrency` flag pattern.
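
To illustrate the idea, here is a minimal Go sketch of paginated listing with a configurable batch size. It is not the code from this PR: `pageFetcher`, `listAllRepositories`, and `fakeRegistry` are hypothetical names, and the in-memory registry stands in for the real catalog API. It only shows the trade-off described above, where a smaller batch size means more requests, each of which asks the gateway for less work.

```go
package main

import (
	"errors"
	"fmt"
)

// pageFetcher is a hypothetical stand-in for one catalog request: it returns
// up to n repository names that come after `last` ("" means start at the top).
type pageFetcher func(last string, n int) ([]string, error)

// listAllRepositories pages through the catalog, asking for batchSize names
// per request, and returns every name it saw.
func listAllRepositories(fetch pageFetcher, batchSize int) ([]string, error) {
	if batchSize <= 0 {
		return nil, errors.New("batch size must be positive")
	}
	var all []string
	last := ""
	for {
		page, err := fetch(last, batchSize)
		if err != nil {
			return nil, fmt.Errorf("listing repositories after %q: %w", last, err)
		}
		all = append(all, page...)
		// A short (or empty) page means the catalog is exhausted.
		if len(page) < batchSize {
			return all, nil
		}
		last = page[len(page)-1]
	}
}

// fakeRegistry builds a pageFetcher over an in-memory list of names and
// counts how many requests were made (assumption: names are unique).
func fakeRegistry(repos []string, calls *int) pageFetcher {
	return func(last string, n int) ([]string, error) {
		*calls++
		start := 0
		if last != "" {
			for i, r := range repos {
				if r == last {
					start = i + 1
					break
				}
			}
		}
		end := start + n
		if end > len(repos) {
			end = len(repos)
		}
		return repos[start:end], nil
	}
}

func main() {
	repos := []string{"a", "b", "c", "d", "e", "f", "g"}
	for _, size := range []int{100, 5, 2} {
		calls := 0
		got, err := listAllRepositories(fakeRegistry(repos, &calls), size)
		if err != nil {
			panic(err)
		}
		fmt.Printf("batch size %d: %d repositories in %d requests\n", size, len(got), calls)
	}
}
```

With a real registry, each request in the small-batch case should stay under the gateway timeout, at the cost of more round trips overall.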