@difin
Contributor

@difin difin commented Jan 14, 2026

What changes were proposed in this pull request?

Adds the ability to scale the HMS REST Catalog Server independently of HMS.
Currently, the HMS REST Catalog Server is tied to HMS: it can only be started together with HMS in a single instance.
This PR proposes an HA mode for the HMS REST Catalog Server, with horizontal scaling available when active-active mode is used.

HA Modes:

  • Active-Active: allows horizontal scaling via load distribution.
  • Active-Passive: no horizontal scaling, but supports leader failover.
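As a rough illustration of how the two modes differ in routing, here is a minimal Python sketch. It is not the ZkRegistryBase-based implementation in this PR; the class names, the `pick` method, and the instance names (`rest-1`, etc.) are all hypothetical and exist only to show the behavior described in the list above.

```python
# Hypothetical sketch: none of these names come from Hive or this PR.

class ActiveActiveRouter:
    """Spreads requests round-robin over every healthy instance."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._counter = 0

    def pick(self, healthy):
        # All healthy instances serve traffic, so adding instances adds capacity.
        candidates = [i for i in self.instances if i in healthy]
        if not candidates:
            raise RuntimeError("no healthy instance available")
        choice = candidates[self._counter % len(candidates)]
        self._counter += 1
        return choice


class ActivePassiveRouter:
    """Sends every request to a single leader; fails over when it is down."""

    def __init__(self, instances):
        # Order defines failover priority in this sketch (an assumption).
        self.instances = list(instances)

    def pick(self, healthy):
        # Only the first healthy instance (the leader) serves traffic.
        for instance in self.instances:
            if instance in healthy:
                return instance
        raise RuntimeError("no healthy instance available")


if __name__ == "__main__":
    nodes = ["rest-1", "rest-2", "rest-3"]
    all_up = set(nodes)

    aa = ActiveActiveRouter(nodes)
    print([aa.pick(all_up) for _ in range(4)])

    ap = ActivePassiveRouter(nodes)
    print(ap.pick(all_up), ap.pick({"rest-2", "rest-3"}))
```

In the active-active router every healthy instance takes a share of the requests, while the active-passive router only changes its target when the current leader drops out of the healthy set.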

HA for the HMS REST Catalog Server is implemented using ZkRegistryBase, the same as HS2 HA mode.

Why are the changes needed?

Allows the HMS REST Catalog Server to be scaled independently of HMS.

Does this PR introduce any user-facing change?

No

How was this patch tested?

New integration tests.

@difin difin force-pushed the hms_rest_catalog_server_scaling branch from f846f08 to d02fe25 Compare January 14, 2026 18:10
@okumin
Contributor

okumin commented Jan 14, 2026

The intention sounds great! I have one challenge: do we really need active-passive with ZooKeeper? A RESTful API should always be able to make use of a load balancer, whose configuration is typically easier than ZK.

@difin
Contributor Author

difin commented Jan 15, 2026

@okumin Active-passive mode is not necessary for scaling, but active-active seems to be what is needed. I used ZooKeeper for consistency and code reuse, because it is already used in several places in Hive.

@deniskuzZ
Member

deniskuzZ commented Jan 15, 2026

@okumin Active-passive mode is not necessary for scaling, but active-active seems to be what is needed. I used ZooKeeper for consistency and code reuse, because it is already used in several places in Hive.

@difin FYI there is ongoing work to decommission ZooKeeper.
cc @abstractdog

btw, why do we need a coordinator here? I would envision the following flow:
Client → Load Balancer / API Gateway → HMS REST instance

@abstractdog
Contributor

@okumin Active-passive mode is not necessary for scaling, but active-active seems to be what is needed. I used ZooKeeper for consistency and code reuse, because it is already used in several places in Hive.

@difin FYI there is ongoing work to decommission ZooKeeper. cc @abstractdog

btw, why do we need a coordinator here? I would envision the following flow: Client → Load Balancer / API Gateway → HMS REST instance

While we're looking for native Kubernetes alternatives to the things we currently do with ZK, ZK is still a valid choice: getting rid of it across the whole Hive codebase would be too much in one go, especially because it's battle-tested. So reusing ZkRegistryBase is fine for now (not to mention that Hive still runs in old-school clusters with ZK in many places, I guess).

@okumin
Contributor

okumin commented Jan 15, 2026

Client → Load Balancer / API Gateway → HMS REST instance

I also think this is more than enough in most cases.


5 participants