JDBC connection issues related to gRPC (TRANSPORT_UNAVAILABLE, BAD_SESSION) in YC serverless environment #138

@atten

Description

My application consists of 2 components deployed in Yandex Cloud:

  • Serverless YDB
  • Serverless Container (Java application: JPA + hibernate + ydb-jdbc-driver, without Spring)

Java dependencies:

implementation 'jakarta.persistence:jakarta.persistence-api:3.2.0'
implementation 'org.hibernate.orm:hibernate-core:7.1.0.Final'
implementation 'org.hibernate.orm:hibernate-hikaricp:7.1.0.Final'
implementation 'tech.ydb.dialects:hibernate-ydb-dialect:1.5.0'
implementation 'tech.ydb.jdbc:ydb-jdbc-driver:2.3.16'

No matter which JDBC connection pool I use, the same pattern eventually appears:

  1. Application initializes normally:
main | INFO  | o.h.j.i.util.LogHelper                   | HHH000204: Processing PersistenceUnitInfo [name: default]
main | INFO  | org.hibernate.Version                    | HHH000412: Hibernate ORM core version 7.1.0.Final
main | INFO  | c.z.h.HikariDataSource                   | HikariPool-1 - Starting...
main | INFO  | t.y.c.i.YdbTransportImpl                 | Create YDB transport with endpoint Endpoint{host=ydb.serverless.yandexcloud.net, port=2135, node=0, location=null, overrideAuthority=null} and BalancingSettings{policy=USE_ALL_NODES, preferableLocation='null}
main | INFO  | t.y.c.i.p.ChannelFactoryLoader           | class io.grpc.netty.shaded.io.grpc.netty.NettyChannelBuilder is found, use ShadedNettyChannelFactory
main | INFO  | t.y.c.a.StaticCredentials                | create static identity for database /local
ydb-jdbc-scheduler[1924503898]-thread-1 | INFO  | t.y.core.impl.YdbDiscovery               | Waiting for init discovery...
main | INFO  | t.y.t.i.pool.SessionPool                 | init session pool, min size = 10, max size = 50, keep alive period = 30000
main | INFO  | t.y.query.impl.SessionPool               | init QuerySession pool, min size = 0, max size = 50, keep alive period = 100000
main | INFO  | c.z.hikari.pool.PoolBase                 | HikariPool-1 - Driver does not support get/set network timeout for connections. (Set network timeout is not supported yet)
main | INFO  | t.y.query.impl.SessionPool               | closing QuerySession pool
main | INFO  | t.y.t.i.pool.SessionPool                 | closing session pool
main | INFO  | c.z.h.HikariDataSource                   | HikariPool-1 - Start completed.
HikariPool-1:connection-adder | INFO  | t.y.c.i.YdbTransportImpl                 | Create YDB transport with endpoint Endpoint{host=localhost, port=2136, node=0, location=null, overrideAuthority=null} and BalancingSettings{policy=USE_ALL_NODES, preferableLocation='null}
HikariPool-1:connection-adder | INFO  | t.y.c.a.StaticCredentials                | create static identity for database /local
ydb-jdbc-scheduler[1924503898]-thread-1 | INFO  | t.y.core.impl.YdbDiscovery               | Waiting for init discovery...
HikariPool-1:connection-adder | INFO  | t.y.t.i.pool.SessionPool                 | init session pool, min size = 10, max size = 50, keep alive period = 30000
HikariPool-1:connection-adder | INFO  | t.y.query.impl.SessionPool               | init QuerySession pool, min size = 0, max size = 50, keep alive period = 100000
main | INFO  | o.h.o.connections.pooling                | HHH10001005: Database info:
	Database JDBC URL [jdbc:ydb:grpcs://ydb.serverless.yandexcloud.net:2135/ru-central1/XXX/YYY?useQueryService=true&useMetadata=true]
	Database driver: tech.ydb.jdbc.YdbDriver
	Database dialect: YdbDialect
	Database version: 0.0
	Default catalog/schema: unknown/undefined
	Autocommit mode: true
	Isolation level: SERIALIZABLE
	JDBC fetch size: 1000
	Pool: HikariCPConnectionProvider
	Minimum pool size: 0
	Maximum pool size: 1
  2. Application handles user requests
  3. (after 5 min of inactivity):
grpc-default-executor-1 | WARN | t.y.core.grpc.GrpcStatuses | gRPC issue: UNAVAILABLE, Network closed for unknown reason
grpc-default-executor-1 | WARN  | t.y.query.impl.SessionPool               | QuerySession[ydb://session/3?node_id=XXX&id=MjEyY...WQwNTZjMzQ=] finished with status Status{code = TRANSPORT_UNAVAILABLE(code=401010), issues = [gRPC error: (UNAVAILABLE) Network closed for unknown reason (S_ERROR)]}
  4. Subsequent database requests via EntityManager fail:
grpc-default-executor-1 | WARN  | t.y.core.grpc.GrpcStatuses               | gRPC issue: UNAVAILABLE, io exception
grpc-default-executor-1 | WARN  | t.y.query.impl.SessionPool               | QuerySession[ydb://session/3?node_id=XXX&id=MjEyY...WQwNTZjMzQ=] broken with status Status{code = TRANSPORT_UNAVAILABLE(code=401010), issues = [gRPC error: (UNAVAILABLE) io exception (S_ERROR), io.grpc.netty.shaded.io.netty.handler.ssl.SslClosedEngineException: SSLEngine closed already (S_ERROR)], cause = SSLEngine closed already}
grpc-default-executor-1 | WARN | t.y.core.grpc.GrpcStatuses | gRPC issue: UNAVAILABLE, Network closed for unknown reason

I've tried different connection pools for the JPA container (the built-in DriverManagerConnectionProviderImpl and HikariCP's HikariCPConnectionProvider) and played with pool settings (poolSize, maxLifetime, idleTimeout, allowPoolSuspension, etc.). This did not fix the issue; it only changed the status code from 401010 TRANSPORT_UNAVAILABLE to 400100 BAD_SESSION.
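For reference, the kind of pool settings I experimented with can be sketched as below. This is only an illustration, not a fix (it did not resolve the problem for me). The `hibernate.hikari.` prefix is how Hibernate's HikariCP provider forwards options to HikariCP; the class name `HikariTimeoutSettings` and the concrete millisecond values are hypothetical, picked only so that every timeout stays below the observed 5-minute cutoff.

```java
import java.util.Properties;

public class HikariTimeoutSettings {
    /** Idle cutoff observed in the serverless environment (about 5 minutes), in milliseconds. */
    static final long OBSERVED_IDLE_CUTOFF_MS = 5 * 60 * 1000L;

    /** Builds Hibernate properties that keep pooled connections younger than the cutoff. */
    static Properties build() {
        Properties p = new Properties();
        // Pool size as reported in the startup log.
        p.setProperty("hibernate.hikari.minimumIdle", "0");
        p.setProperty("hibernate.hikari.maximumPoolSize", "1");
        // Hypothetical values: retire a connection after 4 minutes,
        // close it after 2 minutes of idleness, and ping it every minute.
        p.setProperty("hibernate.hikari.maxLifetime", "240000");
        p.setProperty("hibernate.hikari.idleTimeout", "120000");
        p.setProperty("hibernate.hikari.keepaliveTime", "60000");
        return p;
    }

    public static void main(String[] args) {
        // Print the settings so they can be compared with what the pool reports.
        build().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Such properties would be passed as the second argument of `Persistence.createEntityManagerFactory("default", props)`. Note that these settings only govern the JDBC-level pool; they do not appear to reach the YDB session pool underneath it.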

For some reason, these errors can't be reproduced when launching the application on localhost or on a virtual machine, even with the same YDB database and connection settings. Only this specific combination (Serverless YDB + Serverless Container) produces them.

My assumptions (please correct if I'm wrong):

  1. The 5-minute connection timeout is a built-in restriction of serverless YDB and can't be changed.
  2. Appropriate connection timeouts should be set on the application side to handle this server-side restriction properly.
  3. The HikariPool message "Driver does not support get/set network timeout for connections" implies that these timeouts should be set somewhere else, probably on tech.ydb.query.impl.SessionPool.
  4. The class tech.ydb.query.impl.SessionPool does not support setting timeouts (manually or via JDBC properties).
  5. tech.ydb.query.impl.SessionPool has no mechanism to automatically recover stale sessions.

If my assumptions are right, then I guess the root cause is in the gRPC SessionPool. It seems to be unrelated to the JDBC connection pool and thus can't be configured. Why does it create 10 gRPC sessions when the JDBC pool size is set to 1? Why wasn't the stale session returned to the pool automatically? What is the purpose of the session pool if it doesn't have health checks or robust session management?
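Until this is clarified, an application-side workaround might be to retry a failed unit of work once the broken session has been discarded. The following is a minimal sketch using only standard `java.sql` types. It ASSUMES (I have not verified this) that the driver surfaces TRANSPORT_UNAVAILABLE and BAD_SESSION as `SQLRecoverableException` or `SQLTransientException` somewhere in the cause chain; the class `StaleSessionRetry` and its method names are hypothetical.

```java
import java.sql.SQLRecoverableException;
import java.sql.SQLTransientException;
import java.util.concurrent.Callable;

public class StaleSessionRetry {
    /**
     * Walks the cause chain looking for a standard "try again" JDBC exception.
     * ASSUMPTION: the driver maps stale-session errors to one of these types.
     */
    static boolean looksRetryable(Throwable t) {
        int depth = 0;
        for (Throwable c = t; c != null && depth < 20; c = c.getCause(), depth++) {
            if (c instanceof SQLTransientException || c instanceof SQLRecoverableException) {
                return true;
            }
        }
        return false;
    }

    /** Runs the work, retrying with linear backoff while the failure looks retryable. */
    static <T> T withRetry(Callable<T> work, int maxAttempts, long backoffMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.call();
            } catch (Exception e) {
                // Give up on non-retryable errors or when attempts are exhausted.
                if (!looksRetryable(e) || attempt == maxAttempts) {
                    throw e;
                }
                Thread.sleep(backoffMs * attempt);
            }
        }
        throw new IllegalStateException("unreachable");
    }
}
```

Each attempt would need to open a fresh EntityManager and transaction inside `work`, since a transaction that failed on a broken session cannot be resumed. This only masks the symptom; it does not explain why the session pool keeps a dead session.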
