az ssh arc exits with code 0 even on SSH connection failure #8809

@Nishant-Kumar07

Describe the bug

When running the az ssh arc command, even if the underlying SSH connection fails completely, the command still exits with code 0, indicating success.

This behavior is incorrect and breaks automation workflows (scripts, pipelines, retries), which depend on exit codes to determine success or failure.

The screenshot below is taken from the log files generated by running the following command (the substituted values are intentionally omitted, but the code ensures that every substitution is correct):

cmdArcSSH := fmt.Sprintf("az ssh arc --debug --subscription %s --resource-group %s --name %s --local-user azureuser --private-key-file %s --port 22 --yes -- -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null &> %s", subscription, managedRGName, controlPlaneARCVM, internal.SSHPrivateKeyLocation, arcSSHCmdLogPath)

Image

I see that the code opens a subprocess here:
https://github.com/Azure/azure-cli-extensions/blob/1ceafea82844345065fd47d484ffdc75ab9a9bf1/src/ssh/azext_ssh/ssh_utils.py#L68C1-L68C21

and then waits for it, but it never checks the subprocess's exit code, so the failure is not propagated.
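To illustrate the kind of check I mean, here is a minimal sketch in plain Python. It is not the actual azext_ssh code: the function name run_and_propagate is hypothetical, and the real fix may need to go through the CLI's own error handling instead. It only shows that the value returned by wait() can be handed back to the caller rather than discarded.

```python
import subprocess
import sys


def run_and_propagate(command):
    """Run `command`, wait for it to finish, and return its exit code.

    Hypothetical helper for illustration only; `command` stands in for
    the ssh client invocation that the extension builds.
    """
    # Start the child process.
    process = subprocess.Popen(command)
    # wait() blocks until the child exits and returns its exit code.
    returncode = process.wait()
    if returncode != 0:
        # Report the failure on stderr so it also shows up in captured logs.
        print(f"child process exited with code {returncode}", file=sys.stderr)
    return returncode
```

A caller could then pass the result to sys.exit() so that the shell, script, or pipeline sees the same code that the child process returned.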

Related command

az ssh arc (the full invocation is shown in the description above)

Errors

WARNING: cli.azext_ssh.ssh_utils: SSH connection failure could still be due to Service Configuration update. Please re-run command.
DEBUG: cli.knack.cli: Event: CommandInvoker.OnTransformResult [<function _resource_group_transform at 0x7f13b022c2c0>, <function _x509_from_base64_to_hex_transform at 0x7f13b022c360>]
DEBUG: cli.knack.cli: Event: CommandInvoker.OnFilterResult []
DEBUG: cli.knack.cli: Event: Cli.SuccessfulExecute []
DEBUG: cli.knack.cli: Event: Cli.PostExecute [<function AzCliLogging.deinit_cmd_metadata_logging at 0x7f13b01d3600>]
INFO: az_command_data_logger: exit code: 0
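Until the exit code is reliable, one possible workaround is to scan the captured log for the warning shown above. The sketch below is only a heuristic under stated assumptions: the function name log_indicates_ssh_failure is hypothetical, and I have not verified that this warning is printed for every kind of connection failure or that it never appears on a successful run.

```python
# Warning text copied from the log output above.
FAILURE_MARKER = (
    "SSH connection failure could still be due to Service Configuration update"
)


def log_indicates_ssh_failure(log_text):
    """Return True if any WARNING line in the log contains the failure marker.

    Heuristic only: it assumes the warning is emitted when, and only when,
    the SSH connection failed.
    """
    return any(
        line.startswith("WARNING:") and FAILURE_MARKER in line
        for line in log_text.splitlines()
    )
```

This is fragile because it depends on the exact wording of a log message, which is why a proper non-zero exit code is still needed.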

Expected behavior

If the SSH connection succeeds:
- The CLI should return exit code 0.
- Standard output and stderr should contain the normal connection or command output.

If the SSH connection fails, for example because:
- The proxy fails to connect (404 There are no listeners connected)
- SSH times out
- Authentication fails
- The remote host is unreachable

Then the CLI should return a non-zero exit code, ideally matching the SSH or proxy process's return code (commonly 255).

This is consistent with standard CLI behavior (like ssh, scp, etc.) and is crucial for:
- Automation
- Scripting
- Retry logic
- CI/CD pipelines
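As an example of why this matters, here is a rough sketch of the kind of retry wrapper that automation typically uses. The name run_with_retries and its parameters are hypothetical and not part of any Azure tooling. The loop treats exit code 0 as success, so with the current behavior it would stop after the first attempt even though the connection failed.

```python
import subprocess
import sys
import time


def run_with_retries(command, attempts=3, delay_seconds=0.0):
    """Run `command` until it exits with 0 or the attempts are used up.

    Returns the exit code of the last attempt. Hypothetical helper that
    relies entirely on the command reporting failure via its exit code.
    """
    returncode = 1  # Reported if attempts is 0 and the command never runs.
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(command).returncode
        if returncode == 0:
            break  # Success: no retry needed.
        print(f"attempt {attempt} failed with exit code {returncode}", file=sys.stderr)
        if attempt < attempts:
            time.sleep(delay_seconds)
    return returncode
```

If az ssh arc returned the SSH or proxy process's code, a wrapper like this would retry on failure without having to parse the logs.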

Environment Summary

Image

Metadata

Assignees

No one assigned

    Labels

    Auto-Assign (Auto assign by bot), Service Attention (This issue is responsible by Azure service team.), VM SSH, bug (This issue requires a change to an existing behavior in the product in order to be resolved.)