vertx-zookeeper doesn't gracefully disconnect from zookeeper server #101

@srinskit

Version

Vert.x and all other Vert.x modules: 3.9.1
ZooKeeper server: checked with 3.4 and 3.6

Context

When a Vert.x instance using vertx-zookeeper exits, the ZooKeeper server logs a socket exception (EndOfStreamException) instead of seeing a clean session close. The server keeps that instance's records until the session times out, so in the meantime other Vert.x instances try to reach services that no longer exist.
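
One thing that may be worth ruling out is whether the JVM exits before an asynchronous close has finished. The following is only a minimal sketch of a shutdown hook that blocks until a close callback fires or a timeout elapses. The class `GracefulExit`, its methods, and the 5 second timeout are hypothetical names and values chosen for illustration; the sketch uses only the JDK, and wiring it to `vertx.close(...)` as in the comment is an assumption that has not been checked against the reproducer.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper, not part of Vert.x or vertx-zookeeper.
final class GracefulExit {

    /** Starts the asynchronous close and waits until it signals completion or the timeout elapses. */
    static boolean awaitClose(Consumer<Runnable> asyncClose, long timeoutMs) {
        CountDownLatch done = new CountDownLatch(1);
        // The close action receives a callback to run once it has finished.
        asyncClose.accept(done::countDown);
        try {
            return done.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Wraps awaitClose in a thread suitable for Runtime.addShutdownHook. */
    static Thread hook(Consumer<Runnable> asyncClose, long timeoutMs) {
        return new Thread(() -> awaitClose(asyncClose, timeoutMs), "graceful-exit");
    }

    public static void main(String[] args) {
        // Stand-in close action. In a Vert.x 3.x application this might be
        // (assumption): onDone -> vertx.close(ar -> onDone.run())
        Consumer<Runnable> standInClose = onDone -> new Thread(onDone).start();
        Runtime.getRuntime().addShutdownHook(hook(standInClose, 5_000));
        System.out.println("shutdown hook registered");
    }
}
```

If `awaitClose` returns false in a real application, the close did not complete within the timeout, which would be consistent with the server only noticing the client through a dropped socket.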

Reproducer

A calculator application with an API-server verticle and an adder-service verticle. Repo.
The latest ZooKeeper server Docker image, running on localhost on the default port with the default configuration. Ref.

  1. Starting Zookeeper server
docker run -p 2181:2181 zookeeper:latest
  2. Build reproducer package
mvn clean package
  3. Starting vertx instance 1
# Directory needed by API server
mkdir data

# API server and Adder service
java -jar target/calc-0.0.1-SNAPSHOT-fat.jar -m adder-service api-server --host localhost --zookeepers localhost
  4. Starting vertx instance 2
# Just the Adder service
java -jar target/calc-0.0.1-SNAPSHOT-fat.jar -m adder-service --host localhost --zookeepers localhost
  5. Check zookeeper tree
zookeepercli -servers localhost -c lsr /io.vertx

Two adder service addresses are present, as expected.

asyncMultiMap
asyncMultiMap/__vertx.subs
asyncMultiMap/__vertx.subs/adder-service-address
asyncMultiMap/__vertx.subs/adder-service-address/ab3d7101-ee73-4460-8f36-fc7a42aae813:localhost:36823
asyncMultiMap/__vertx.subs/adder-service-address/c94b5ad8-15ec-4847-9e94-768040eed01a:localhost:38607
cluster
cluster/nodes
cluster/nodes/ab3d7101-ee73-4460-8f36-fc7a42aae813
cluster/nodes/c94b5ad8-15ec-4847-9e94-768040eed01a
locks
locks/__cluster_init_lock
locks/__cluster_init_lock/leases
locks/__cluster_init_lock/locks
syncMap
syncMap/__vertx.haInfo
syncMap/__vertx.haInfo/ab3d7101-ee73-4460-8f36-fc7a42aae813
syncMap/__vertx.haInfo/c94b5ad8-15ec-4847-9e94-768040eed01a
  6. Close instance 2 with CTRL-C

The ZooKeeper server logs this warning immediately:

2020-07-10 13:10:10,102 [myid:1] - WARN  [NIOWorkerThread-6:NIOServerCnxn@364] - Unexpected exception
EndOfStreamException: Unable to read additional data from client, it probably closed the socket: address = /172.17.0.1:53626, session = 0x100014c6f350004
	at org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:163)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:326)
	at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
	at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)

Checking zookeeper tree again shows the adder service entry from instance 2 still exists.

asyncMultiMap
asyncMultiMap/__vertx.subs
asyncMultiMap/__vertx.subs/adder-service-address
asyncMultiMap/__vertx.subs/adder-service-address/ab3d7101-ee73-4460-8f36-fc7a42aae813:localhost:36823
asyncMultiMap/__vertx.subs/adder-service-address/c94b5ad8-15ec-4847-9e94-768040eed01a:localhost:38607
cluster
cluster/nodes
cluster/nodes/ab3d7101-ee73-4460-8f36-fc7a42aae813
cluster/nodes/c94b5ad8-15ec-4847-9e94-768040eed01a
locks
locks/__cluster_init_lock
locks/__cluster_init_lock/leases
locks/__cluster_init_lock/locks
syncMap
syncMap/__vertx.haInfo
syncMap/__vertx.haInfo/ab3d7101-ee73-4460-8f36-fc7a42aae813
syncMap/__vertx.haInfo/c94b5ad8-15ec-4847-9e94-768040eed01a

Zookeeper server eventually removes the session

2020-07-10 13:10:26,484 [myid:1] - INFO  [SessionTracker:ZooKeeperServer@600] - Expiring session 0x100014c6f350004, timeout of 20000ms exceeded

and the stale entries for instance 2 are gone from the tree

asyncMultiMap
asyncMultiMap/__vertx.subs
asyncMultiMap/__vertx.subs/adder-service-address
asyncMultiMap/__vertx.subs/adder-service-address/ab3d7101-ee73-4460-8f36-fc7a42aae813:localhost:36823
cluster
cluster/nodes
cluster/nodes/ab3d7101-ee73-4460-8f36-fc7a42aae813
locks
locks/__cluster_init_lock
locks/__cluster_init_lock/leases
locks/__cluster_init_lock/locks
syncMap
syncMap/__vertx.haInfo
syncMap/__vertx.haInfo/ab3d7101-ee73-4460-8f36-fc7a42aae813
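
To make the difference between the listing taken right after CTRL-C and the one taken after session expiry easier to see, the two can be diffed. This is a small sketch using only the JDK; the class name `StaleZnodes` is hypothetical, and the listings are trimmed to the per-node entries copied from the output above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for comparing two `lsr` listings.
final class StaleZnodes {

    /** Returns the paths present in `before` but absent from `after`, in listing order. */
    static Set<String> removed(List<String> before, List<String> after) {
        Set<String> gone = new LinkedHashSet<>(before);
        gone.removeAll(after);
        return gone;
    }

    public static void main(String[] args) {
        String live = "ab3d7101-ee73-4460-8f36-fc7a42aae813";
        String dead = "c94b5ad8-15ec-4847-9e94-768040eed01a";

        // Per-node entries right after instance 2 was closed.
        List<String> before = Arrays.asList(
            "asyncMultiMap/__vertx.subs/adder-service-address/" + live + ":localhost:36823",
            "asyncMultiMap/__vertx.subs/adder-service-address/" + dead + ":localhost:38607",
            "cluster/nodes/" + live,
            "cluster/nodes/" + dead,
            "syncMap/__vertx.haInfo/" + live,
            "syncMap/__vertx.haInfo/" + dead);

        // Per-node entries after the session expired.
        List<String> after = Arrays.asList(
            "asyncMultiMap/__vertx.subs/adder-service-address/" + live + ":localhost:36823",
            "cluster/nodes/" + live,
            "syncMap/__vertx.haInfo/" + live);

        // Prints the three entries that belonged to instance 2.
        removed(before, after).forEach(System.out::println);
    }
}
```

The three printed paths are the entries that stay visible to other instances between the socket drop and the session expiry.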
