Commit af573f3

Improve IP Monitor documentation (#675)
1 parent 0a33edc commit af573f3

3 files changed, +55 -5 lines changed

.github/workflows/trivy.yaml

Lines changed: 3 additions & 1 deletion
@@ -97,4 +97,6 @@ jobs:
9797
9898
- name: Image Scan
9999
shell: bash
100-
run: make trivy-scan
100+
run: |
101+
echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u $ --password-stdin
102+
make trivy-scan

docs/coherence/090_ipmonitor.adoc

Lines changed: 37 additions & 4 deletions
@@ -1,6 +1,6 @@
11
///////////////////////////////////////////////////////////////////////////////
22

3-
Copyright (c) 2021, Oracle and/or its affiliates.
3+
Copyright (c) 2021, 2024, Oracle and/or its affiliates.
44
Licensed under the Universal Permissive License v 1.0 as shown at
55
http://oss.oracle.com/licenses/upl.
66

@@ -10,9 +10,41 @@
1010
1111
== Coherence IPMonitor
1212
13-
The Coherence IPMonitor is a failure detection mechanism used by Coherence to detect machine failures. It does this by pinging the echo port, (port 7) on remote hosts that other cluster members are running on. When running in Kubernetes, every Pod has its own IP address, so it looks to Coherence like every member is on a different host. Failure detection using IPMonitor is less useful in Kubernetes than it is on physical machines or VMs, so the Operator disables the IPMonitor by default. This is configurable though and if it is felt that using IPMonitor is useful to an application, it can be re-enabled.
13+
The Coherence IPMonitor is a failure detection mechanism used by Coherence to detect machine failures.
14+
It does this by pinging the echo port (port 7) on remote hosts that other cluster members are running on.
15+
When running in Kubernetes, every Pod has its own IP address, so it looks to Coherence like every member is on a different host.
16+
Failure detection using IPMonitor is less useful in Kubernetes than it is on physical machines or VMs, so the Operator disables
17+
the IPMonitor by default. This is configurable though and if it is felt that using IPMonitor is useful to an application,
18+
it can be re-enabled.
1419
15-
To re-enable IPMonitor set the boolean flag `enableIpMonitor` in the `coherence` section of the Coherence resource yaml:
20+
=== Coherence Warning Message
21+
22+
Disabling IP Monitor causes Coherence to print a warning in the logs similar to the one shown below.
23+
This can be ignored when using the Operator.
24+
25+
[source]
26+
----
27+
2024-07-01 14:43:55.410/3.785 Oracle Coherence GE 14.1.1.2206.10 (dev-jonathanknight) <Warning> (thread=Coherence, member=n/a): IPMonitor has been explicitly disabled, this is not a recommended practice and will result in a minimum death detection time of 300 seconds for failed machines or networks.
28+
----
29+
30+
=== Re-Enable the IP Monitor
31+
32+
To re-enable IPMonitor, set the boolean flag `enableIpMonitor` to `true` in the `coherence` section of the Coherence resource yaml.
33+
34+
[CAUTION]
35+
====
36+
The Coherence IP Monitor works by using Java's `InetAddress.isReachable()` method to "ping" another cluster member's IP address.
37+
Under the covers the JDK will use an ICMP echo request or, if it lacks the privilege to send one, a TCP connection to port 7 of the server. This can fail if ICMP or port 7 is blocked,
38+
for example using firewalls, or in Kubernetes using Network Policies or tools such as Istio.
39+
In particular, when using Network Policies it is impossible to allow ICMP traffic, as currently Network Policies
40+
only support TCP or UDP and not ICMP.
41+
42+
If the Coherence IP Monitor is enabled in a Kubernetes cluster where port 7 is blocked then the cluster will fail to start.
43+
Typically, one member will start and become the senior member. None of the other cluster members
44+
will be able to get IP Monitor to connect to the senior member, so they will fail to start.
45+
====
46+
47+
The yaml below shows an example of re-enabling the IP Monitor.
1648
1749
[source,yaml]
1850
.coherence-storage.yaml
@@ -26,4 +58,5 @@ spec:
2658
enableIpMonitor: true
2759
----
2860
29-
Setting `enableIpMonitor` will disable the IPMonitor, which is the default behaviour when `enableIpMonitor` is not specified in the yaml.
61+
Setting the `enableIpMonitor` field to `false` will disable the IPMonitor, which is the default behaviour when `enableIpMonitor` is
62+
not specified in the yaml.
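
Reviewer note: the CAUTION block in this file's diff describes how IP Monitor relies on Java's `InetAddress.isReachable()`. The sketch below is a minimal, stand-alone way to try that same JDK call from inside a Pod, for example to check whether a Network Policy or service mesh blocks it before re-enabling IP Monitor. The class name `ReachabilityProbe`, the helper method, and the 2000 ms timeout are hypothetical and chosen only for illustration; this is not Coherence's own implementation. Per the JDK documentation, a typical implementation uses ICMP echo requests if the privilege can be obtained and otherwise tries a TCP connection to port 7 of the destination host.

```java
import java.io.IOException;
import java.net.InetAddress;

// Hypothetical helper for illustration only; not part of Coherence or the Operator.
final class ReachabilityProbe {

    /**
     * Returns true if the JDK considers the address reachable within the timeout.
     * An I/O error while probing is treated as "not reachable".
     * A negative timeout is rejected by the JDK with IllegalArgumentException.
     */
    static boolean isReachable(InetAddress address, int timeoutMillis) {
        try {
            return address.isReachable(timeoutMillis);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed usage: pass another member's Pod IP as the first argument;
        // with no argument the loopback address is probed instead.
        InetAddress target = args.length > 0
                ? InetAddress.getByName(args[0])
                : InetAddress.getLoopbackAddress();
        boolean reachable = isReachable(target, 2000);
        System.out.println(target.getHostAddress() + " reachable=" + reachable);
    }
}
```

If this reports that another member's Pod IP is not reachable, enabling IP Monitor in that cluster would likely lead to the start-up failure described in the CAUTION block.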

docs/troubleshooting/01_trouble-shooting.adoc

Lines changed: 15 additions & 0 deletions
@@ -35,6 +35,8 @@ This page will be updated and maintained over time to include common issues we s
3535
3636
* <<arm-java8, I'm using Arm64 and Java 8 and the JVM will not start due to using G1GC>>
3737
38+
* <<ipmon, Why do I see warnings about IPMonitor being disabled when Coherence starts>>
39+
3840
== Issues
3941
4042
[#no-operator]
@@ -250,3 +252,16 @@ This will cause errors on Arm64 Java 8 JMS unless the JVM option `-XX:+UnlockExp
250252
added in the Coherence resource spec (see <<docs/jvm/030_jvm_args.adoc,Adding Arbitrary JVM Arguments>>).
251253
Alternatively specify a different garbage collector, ideally on a version of Java this old, use CMS
252254
(see <<docs/jvm/040_gc.adoc,Garbage Collector Settings>>).
255+
256+
[#ipmon]
257+
=== Why do I see warnings about IPMonitor being disabled when Coherence starts
258+
259+
When Coherence starts, a message similar to the following is displayed in the Coherence container's log:
260+
261+
[source]
262+
----
263+
2024-07-01 14:43:55.410/3.785 Oracle Coherence GE 14.1.1.2206.10 (dev-jonathanknight) <Warning> (thread=Coherence, member=n/a): IPMonitor has been explicitly disabled, this is not a recommended practice and will result in a minimum death detection time of 300 seconds for failed machines or networks.
264+
----
265+
266+
This message is logged because the default behaviour of the Operator is to disable the Coherence IP Monitor,
267+
see the <<_coherence_operator_api_docs/coherence/090_ipmonitor.adoc,IP Monitor documentation>> for an explanation.
