Files: Intermittent failure to establish SSH session. #188

@scblack321

Users are sometimes seeing an error when attempting file operations.
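TL;DR, root cause:

Caused by: java.net.SocketTimeoutException: No incoming initialization response received within PT15S msec.

PT15S is the ISO-8601 duration for 15 seconds; the trailing "msec" in the message is misleading. Per the stack trace below, the exception is thrown from DefaultSftpClient.waitForInitResponse, which suggests the wait that times out is for the SFTP initialization response.

Possibly the timeout can be increased or a retry mechanism implemented, although 15 seconds does seem like it should be enough time to establish an SSH connection.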
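
To make the retry idea concrete, here is a minimal sketch, not the Tapis implementation. It retries an operation only when a java.net.SocketTimeoutException appears somewhere in the cause chain (as it does in this issue's stack trace) and rethrows everything else immediately. The class name SftpRetry, the method names, and the backoff policy are all hypothetical.

```java
import java.net.SocketTimeoutException;
import java.time.Duration;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of Tapis or Apache MINA SSHD. */
final class SftpRetry {
    private SftpRetry() {}

    /** Returns true if t, or any exception in its cause chain, is a SocketTimeoutException. */
    static boolean isTimeout(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SocketTimeoutException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs op up to maxAttempts times. Only failures caused by a socket timeout
     * are retried; the sleep between attempts doubles each time (assumed policy).
     */
    static <T> T withRetry(Callable<T> op, int maxAttempts, Duration initialBackoff) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long sleepMs = initialBackoff.toMillis();
        for (int attempt = 1; ; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                // Give up on the last attempt, or when the failure is not a timeout.
                if (attempt >= maxAttempts || !isTimeout(e)) {
                    throw e;
                }
                Thread.sleep(sleepMs);
                sleepMs *= 2;
            }
        }
    }
}
```

A wrapper like this would presumably sit around the session-creation call that appears in the trace as SshConnectionGroup.reserveSessionOnConnection. Whether a retry there is safe depends on how the pool cleans up a half-initialized session, which this issue does not show.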

More detail. Example error message:

"FILES_OPSC_ERR Operations error. Tenant: portals ApiUserId: sal OboTenant: portals OboUser: sal Operation: mkdir System: frontera Path: /home1/05089/sal/.tap Error: FILES_OPSC_ERR Operations error. OboTenant: portals OboUser: sal Operation: mkdirWithClient System: frontera Path: home1/05089/sal/.tap Error: FILES_CLIENT_SSH_OP_ERR1 Error during operation. OboTenant: portals OboUser: sal Operation: lstat System: frontera EffectiveUser: sal Host: frontera.tacc.utexas.edu Path: home1/ Error: SSH_POOL_UNABLE_TO_ESTABLISH_SESSION Unable to establish session. Tenant: portals, Host: frontera.tacc.utexas.edu, Port: 22, EffectiveUserId: sal, AuthnMethod: TMS_KEYS"

Root cause appears to be:

2025-08-18 19:18:47.702 ERROR [grizzly-http-server-0] e.u.t.t.f.l.services.FileOpsService:513 - FILES_OPSC_ERR Operations error. OboTenant: portals OboUser: kyle_yu Operation: mkdirWithClient System: frontera Path: home1/09476/kyle_yu/.tap Error: FILES_CLIENT_SSH_OP_ERR1 Error during operation. OboTenant: portals OboUser: kyle_yu Operation: lstat System: frontera EffectiveUser: kyle_yu Host: frontera.tacc.utexas.edu Path: home1/ Error: SSH_POOL_UNABLE_TO_ESTABLISH_SESSION Unable to establish session. Tenant: portals, Host: frontera.tacc.utexas.edu, Port: 22, EffectiveUserId: kyle_yu, AuthnMethod: TMS_KEYS
java.io.IOException: FILES_CLIENT_SSH_OP_ERR1 Error during operation. OboTenant: portals OboUser: kyle_yu Operation: lstat System: frontera EffectiveUser: kyle_yu Host: frontera.tacc.utexas.edu Path: home1/ Error: SSH_POOL_UNABLE_TO_ESTABLISH_SESSION Unable to establish session. Tenant: portals, Host: frontera.tacc.utexas.edu, Port: 22, EffectiveUserId: kyle_yu, AuthnMethod: TMS_KEYS
at edu.utexas.tacc.tapis.files.lib.clients.SSHDataClient.getStatInfo(SSHDataClient.java:587)
at edu.utexas.tacc.tapis.files.lib.clients.SSHDataClient.mkdir(SSHDataClient.java:265)
at edu.utexas.tacc.tapis.files.lib.services.FileOpsService.mkdir(FileOpsService.java:506)
at edu.utexas.tacc.tapis.files.lib.services.FileOpsService.mkdir(FileOpsService.java:469)
at edu.utexas.tacc.tapis.files.api.resources.OperationsApiResource.mkdir(OperationsApiResource.java:237)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:146)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:189)
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:176)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:93)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:478)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:400)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81)
at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:256)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:235)
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:684)
at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:356)
at org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:200)
at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:569)
at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:549)
at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: SSH_POOL_UNABLE_TO_ESTABLISH_SESSION Unable to establish session. Tenant: portals, Host: frontera.tacc.utexas.edu, Port: 22, EffectiveUserId: kyle_yu, AuthnMethod: TMS_KEYS
at edu.utexas.tacc.tapis.files.lib.clients.SSHDataClient.borrowAutoCloseableSftpClient(SSHDataClient.java:1036)
at edu.utexas.tacc.tapis.files.lib.clients.SSHDataClient.getStatInfo(SSHDataClient.java:580)
... 28 common frames omitted
Caused by: edu.utexas.tacc.tapis.shared.exceptions.TapisException: SSH_POOL_UNABLE_TO_ESTABLISH_SESSION Unable to establish session. Tenant: portals, Host: frontera.tacc.utexas.edu, Port: 22, EffectiveUserId: kyle_yu, AuthnMethod: TMS_KEYS
at edu.utexas.tacc.tapis.shared.ssh.SshConnectionGroup.reserveSessionOnConnection(SshConnectionGroup.java:186)
at edu.utexas.tacc.tapis.shared.ssh.SshSessionPool.reserveSessionOnConnection(SshSessionPool.java:206)
at edu.utexas.tacc.tapis.shared.ssh.SshSessionPool.borrowSftpClient(SshSessionPool.java:166)
at edu.utexas.tacc.tapis.files.lib.clients.SSHDataClient.borrowAutoCloseableSftpClient(SSHDataClient.java:1016)
... 29 common frames omitted
Caused by: java.net.SocketTimeoutException: No incoming initialization response received within PT15S msec.
at org.apache.sshd.sftp.client.impl.DefaultSftpClient.waitForInitResponse(DefaultSftpClient.java:465)
at org.apache.sshd.sftp.client.impl.DefaultSftpClient.init(DefaultSftpClient.java:388)
at org.apache.sshd.sftp.client.impl.DefaultSftpClient.<init>(DefaultSftpClient.java:117)
at org.apache.sshd.sftp.client.impl.DefaultSftpClientFactory.createDefaultSftpClient(DefaultSftpClientFactory.java:66)
at org.apache.sshd.sftp.client.impl.DefaultSftpClientFactory.createSftpClient(DefaultSftpClientFactory.java:50)
at org.apache.sshd.sftp.client.SftpClientFactory.createSftpClient(SftpClientFactory.java:69)
at org.apache.sshd.sftp.client.SftpClientFactory.createSftpClient(SftpClientFactory.java:46)
at edu.utexas.tacc.tapis.shared.ssh.apache.SSHSftpClient.<init>(SSHSftpClient.java:80)
at edu.utexas.tacc.tapis.shared.ssh.apache.SSHConnection.getSftpClient(SSHConnection.java:254)
at edu.utexas.tacc.tapis.shared.ssh.SshConnectionContext.constructSftpClient(SshConnectionContext.java:268)
at edu.utexas.tacc.tapis.shared.ssh.SshSessionHolder.createSession(SshSessionHolder.java:39)
at edu.utexas.tacc.tapis.shared.ssh.SshConnectionGroup.reserveSessionOnConnection(SshConnectionGroup.java:177)
... 32 common frames omitted
