Add timeout option and env var to the OADP CLI tool #100
base: oadp-dev
Prevents OADP CLI operation hanging indefinitely.
Introduced default timeout of 10min which can be controlled via OADP_CLI_HTTP_TIMEOUT variable.
Co-Authored-By: Claude <[email protected]>
Signed-off-by: Michal Pryc <[email protected]>
This commit adds a --request-timeout flag (following kubectl conventions) to all OADP CLI commands, with full support for cluster unreachability:
- Add --request-timeout flag to nonadmin backup logs and describe commands (takes precedence over OADP_CLI_REQUEST_TIMEOUT env var)
- Add timeoutFactory wrapper that applies dial timeout to all Velero commands by overriding all client-creating methods (KubeClient, DynamicClient, DiscoveryClient, KubebuilderClient, KubebuilderWatchClient)
- Add renameTimeoutFlag to rename Velero's --timeout to --request-timeout for consistent kubectl-style CLI experience
- Set custom net.Dialer with timeout to handle TCP dial timeouts
- Add context-based timeout handling with proper cancellation detection
- Add FormatDownloadRequestTimeoutError for helpful timeout diagnostics
- Add tests for timeout functionality (global timeout, config application, dialer timeout behavior)
The timeout now applies to both HTTP requests and TCP connection attempts across all commands, ensuring the CLI times out quickly when the cluster is unreachable instead of waiting for the default ~30s TCP timeout.
Co-Authored-By: Claude Opus 4.5 <[email protected]>
Signed-off-by: Michal Pryc <[email protected]>
// TimeoutEnvVar is the environment variable name that can be used to override the default timeout.
// Example: OADP_CLI_REQUEST_TIMEOUT=30m kubectl oadp nonadmin backup logs my-backup
const TimeoutEnvVar = "OADP_CLI_REQUEST_TIMEOUT"
Ideally this should document the valid formats, i.e. are days, months, and years valid, or only hours, minutes, and seconds?
Could mention or link the library used for parsing, which already has documentation.
}
statusInfo = fmt.Sprintf(" Current status: %s.", strings.Join(conditions, ", "))
}
return fmt.Errorf("timed out after %v waiting for NonAdminDownloadRequest %q to be processed.%s", timeout, req.Name, statusInfo)
- return fmt.Errorf("timed out after %v waiting for NonAdminDownloadRequest %q to be processed.%s", timeout, req.Name, statusInfo)
+ return fmt.Errorf("timed out after %v waiting for NonAdminDownloadRequest %q to be processed. statusInfo: %s", timeout, req.Name, statusInfo)
kaovilai
left a comment
A few nits; these could be addressed in a follow-up.
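Prevents OADP CLI operations from hanging indefinitely.
This PR adds a --request-timeout flag (following kubectl conventions)
to all OADP CLI commands, with full support for cluster unreachability:
- Add --request-timeout flag to nonadmin backup logs and describe commands
  (takes precedence over OADP_CLI_REQUEST_TIMEOUT env var)
- Add timeoutFactory wrapper that applies dial timeout to all Velero
  commands by overriding all client-creating methods (KubeClient,
  DynamicClient, DiscoveryClient, KubebuilderClient, KubebuilderWatchClient)
- Add renameTimeoutFlag to rename Velero's --timeout to --request-timeout
  for consistent kubectl-style CLI experience
- Set custom net.Dialer with timeout to handle TCP dial timeouts
- Add context-based timeout handling with proper cancellation detection
- Add FormatDownloadRequestTimeoutError for helpful timeout diagnostics
- Add tests for timeout functionality (global timeout, config application,
  dialer timeout behavior)
The timeout now applies to both HTTP requests and TCP connection
attempts across all commands, ensuring the CLI times out quickly when
the cluster is unreachable instead of waiting for the default ~30s TCP
timeout.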
This PR includes similar changes to #99