-
Notifications
You must be signed in to change notification settings - Fork 734
fix(sagemaker): improve error handling when the Space SSH host check fails #8287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
vpbhargav
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think worth testing this change in Widows/Mac and Linux if possible. Hard to validate regex validations and some of these platforms have casing, newline peculiarities.
d4c4eb1 to
078ca7a
Compare
1937081 to
467e065
Compare
|
|
||
| // Check against known versions | ||
| const knownVersions = [ | ||
| this.createSSHConfigSection(proxyCommand).trim(), // Current version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessary, but something to think about: Instead of generating all possible config values (every time), you could store a list of functions here. Instead of iterating the values, you iterate the functions, and for each one, you call the function and then check for the result in the config. If found, then you shouldn't need to proceed to the next function, avoiding generating additional strings. This isn't necessary in this case because these strings are relatively short, and there won't be very many, so the performance impact is negligible, but it's worth being aware of this "lazy" pattern for future situations.
| let startIndex = -1 | ||
| let endIndex = -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it necessary to use these pointers? This feels over-complicated. If we stick to exact string matching (using the list of past expected sm_* sections), we can easily determine the start and end indices if we need to make a replacement.
const startIndex = configContent.indexOf(pastSection)
if (startIndex == -1) {
// Not found
}
const endIndex = startIndex + pastSection.length
// Then slice and generate the new content stringWe can probably simplify even further, as I don't imagine we need the indices.
if (configContent.includes(pastSection)) {
const newContent = configContent.replace(pastSection, newSection)
// Then write newContent to the file
}| errorDetails = `\n\nError: ${updateError.message}` | ||
| } | ||
|
|
||
| const message = `Failed to update your ${sshConfigPath} file automatically.${errorDetails}\n\nOpen the file to fix the issue manually.` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have to tell the user we failed to make an update.
When we tell the user "fix the issue manually", will it be clear what they should fix? For example, will we be displaying the ssh -G sm_test error output? If this applies to the case where the sm_* section needs to be deleted by the user, we can specify that in the message.
| await this.removeSshConfigSection() | ||
| // Write the new section | ||
| await this.writeSectionToConfig(proxyCommand) | ||
| logger.info('Successfully updated sm_* section') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a constant or variable in place of sm_*? It's possible we will need to use a different string for other IDEs (since the global storage path for the sagemaker_connect script will be different) so it's better not to hard-code this.
| * This is shown when the host section exists but is outdated. | ||
| */ | ||
| private async promptToUpdateSshConfig(): Promise<boolean> { | ||
| logger.warn(`Section is outdated for ${this.configHostName}`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of these comments don't specifically mention the SSH Config. Previously, you had manually prefixed them all with SSH Config, but now that we are using a logger with the sagemaker prefix, SSH Config context is not present, making it hard to understand what these log messages are about. To improve, we can either add more context to the logger (e.g., sagemaker SSH config), or update the wording of each log message.
| try { | ||
| await this.promptOtherSshConfigError(sshError) | ||
| const configOpenedError = new ToolkitError(SshConfigOpenedForEditMessage(), { | ||
| code: 'SshConfigOpenedForEdit', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw a previous comment about defining constants for these error codes, but looks like that feedback wasn't yet implemented. Not required, but I figured I would remind you in case you planned on doing that.
| return Result.err(externalConfigError) | ||
| } | ||
| return Result.err( | ||
| ToolkitError.chain(e, 'Unexpected error while handling SSH config error', { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"error while handling ... error" sounds odd. Makes me wonder about how this logic is structured.
Problem
When attempting to connect to a SageMaker Space, if the user's SSH config had issues (like the old
User '%r'directive),users would see an unhelpful error :
This happened because:
Solution
Implemented a proactive SSH config validation that checks for outdated config before running SSH validation
Prompts users to update outdated config automatically
Provides helpful error messages with line numbers for external errors
Validates before showing progress dialogs (better UX)
Changes Made
New Files Created
packages/core/src/awsService/sagemaker/sshConfig.ts: CreatedSageMakerSshConfigclass that extends the baseSshConfigclass with SageMaker specific SSH logic.packages/core/src/test/awsService/sagemaker/sshConfig.test.ts- test suiteModified
packages/core/src/awsService/sagemaker/model.ts: RemovedensureValid()fromprepareDevEnvConnection()and added comment explaining validation happens incommands.tspackages/core/src/awsService/sagemaker/commands.ts: AddedvalidateSshConfig()which will validate before showing progress dialogs so that users can fix issues before any connection attempt startsTesting
User '%r')feature/xbranches will not be squash-merged at release time.