Two users getting the same H2O Notebook #5728

@arunaryasomayajula

Description

  1. User 1 starts SW cluster 1 (multiple nodes), with the Flow UI service on a certain port (for example, 000.000.000.001::54321).
  2. For some reason (possibly a timeout, OOM, etc.), SW cluster 1 dies and 000.000.000.001::54321 is released.
  3. In Spectrum Conductor, the status of cluster 1 is still "started", and the Flow UI link (000.000.000.001::54321) is still shown.
  4. User 2 starts SW cluster 2, which takes 000.000.000.001::54321 and assigns its Flow UI service to this port.
  5. Now user 1 and user 2 both see the same cluster from Spectrum Conductor, with the Flow UI service on 000.000.000.001::54321.
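
The sequence above can be reproduced in miniature with plain TCP sockets. The following is only a sketch of the general port-reuse effect, not of Sparkling Water or Spectrum Conductor internals: the function names, the banner strings, and the use of the loopback address are all made up for illustration. A "saved link" is just a (host, port) pair, so whoever binds the port next answers it.

```python
import socket


def listen_on(host, port):
    """Open a listening TCP socket; port 0 lets the OS pick a free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow the port to be re-bound right after the previous owner goes away.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    return srv


def who_answers(saved_link, server, banner):
    """Connect to the saved (host, port) link and return what the server sends."""
    client = socket.create_connection(saved_link, timeout=2)
    conn, _ = server.accept()
    conn.sendall(banner)
    conn.close()
    data = client.recv(1024)
    client.close()
    return data


def demo():
    # Step 1: "cluster 1" starts and its link is saved (as the scheduler UI would).
    cluster1 = listen_on("127.0.0.1", 0)
    saved_link = cluster1.getsockname()
    first = who_answers(saved_link, cluster1, b"cluster-1")

    # Step 2: cluster 1 dies and the port is released; the saved link is now stale.
    cluster1.close()

    # Step 4: "cluster 2" starts and happens to take the same port.
    cluster2 = listen_on("127.0.0.1", saved_link[1])

    # Step 5: the stale link from step 1 now reaches cluster 2.
    second = who_answers(saved_link, cluster2, b"cluster-2")
    cluster2.close()
    return first, second


if __name__ == "__main__":
    print(demo())
```

Nothing in the address itself distinguishes the two owners, which is why a link that is never invalidated can silently end up pointing at another user's cluster.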

Sparkling Water Context:

  • Sparkling Water Version: 3.40.0.1-1-2.4
  • H2O name: k023042
  • cluster size: 6
  • list of used nodes:
    (executorId, host, port)

(0,10.119.198.87,54323)
(1,10.119.198.87,54325)
(2,10.119.198.88,54323)
(3,10.119.198.88,54325)
(4,10.119.198.173,54325)
(5,10.119.198.173,54335)

Open H2O Flow in browser: https://ppvra00a0011.osds..net:54325 (CMD + click in Mac OSX)

I suspect the Flow UI crashed for some reason and port 54323 was released at Feb/20 05:02:30.

H2OContext has been closed! Please create a new H2OContext to a healthy and reachable (web enabled)
H2O cluster.
at ai.h2o.sparkling.H2OContext$$anon$1.run(H2OContext.scala:359)
Caused by: ai.h2o.sparkling.backend.exceptions.RestApiNotReachableException: H2O node https://10.119.198.87:54323 is not reachable.

The AIMD H2O notebook started at Feb/21 08:11:31, and its Flow UI bound to the freed port 54323.
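
One possible way to detect that a saved Flow link now points at a different cluster is to compare the cluster name reported on that port with the name recorded at startup (k023042 in the context above). The sketch below assumes that the Flow port also serves the H2O REST API and that GET /3/Cloud returns JSON containing a "cloud_name" field; the helper names are hypothetical, and a cluster behind HTTPS or authentication would need extra setup that is not shown.

```python
import json
import urllib.request


def cloud_name_from_body(body):
    """Extract the cluster name from a /3/Cloud JSON response body."""
    return json.loads(body).get("cloud_name")


def fetch_cloud_name(flow_url, timeout=5.0):
    """Ask whatever is listening at flow_url for its cluster name.

    Assumption: the endpoint is /3/Cloud and it is reachable without
    authentication or custom TLS settings.
    """
    url = flow_url.rstrip("/") + "/3/Cloud"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return cloud_name_from_body(resp.read())


def link_points_to(expected_name, body):
    """True if the response body came from the cluster we expect."""
    return cloud_name_from_body(body) == expected_name
```

A caller holding the name recorded at startup could then check link_points_to("k023042", body) before showing the link, and treat a mismatch or a connection error as a sign that the link is stale.
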
Providing us with the observed and expected behavior definitely helps. Giving us the following information also helps:

  • Sparkling Water/PySparkling/RSparkling version
  • Hadoop Version & Distribution
  • Execution mode: YARN-client, YARN-cluster, standalone, local, ...
  • YARN logs, if running on YARN. To collect these logs, run yarn logs -applicationId <application ID>, where the application ID is displayed when Sparkling Water is started
  • H2O & Spark logs, if not running on YARN. You can find these logs in the Spark work directory
  • Are you using Windows/Linux/Mac?
  • Spark & Sparkling Water configuration including the memory configuration

Please also provide us with the full and minimal reproducible code.
