Skip to content

Jbtm 3987 starting LRAs via a cluster of coordinators#50

Merged
mmusgrov merged 4 commits intojbosstm:mainfrom
mmusgrov:JBTM-3987
Jun 23, 2025
Merged

Jbtm 3987 starting LRAs via a cluster of coordinators#50
mmusgrov merged 4 commits intojbosstm:mainfrom
mmusgrov:JBTM-3987

Conversation

@mmusgrov
Copy link
Member

@mmusgrov mmusgrov commented Jun 16, 2025

https://issues.redhat.com/browse/JBTM-3987

The depends on pr has been merged.

!JACOCO !JDK21

This PR adds support for starting LRAs using a cluster of coordinators. This version supports a static list of coordinator endpoints via the config property "lra.coordinator.urls". The change uses the SmallRye Stork service discovery and load balancer framework. The discovery of coordinators is integrated into NarayanaLRAClient, ie client side, and currently uses the "lra.coordinator.urls" property to configure a static list of coordinators, a simple change can be made for supporting the many other discovery mechanisms such as Consul, Kubernetes etc provided by Stork but I thought it best to start out with a static list. Note that only the creation of new LRAs is load balanced and we don't support shared lifecycle management of LRAs yet so, in particular, it won't support failover.

I need the PR integration tests to run with the latest code so I included a "TEST CHANGE ONLY: enable the option to use war from Narayana when run commit that does that using the feature/fix that @arjantijms recently added.

I updated LRATest to use a cluster of two coordinators. They share the same store and even though such a config would not be viable for users it works well in the context of our test suite since the issues around sharing stores only becomes apparent when coordinators re-read the store and start managing each others transactions (I may add a test that shows the problem at a future date). I'll also follow up with a doc update and a jbossts blogspot article.

@mmusgrov mmusgrov changed the title Jbtm 3987 Jbtm 3987 starting LRAs via a cluster of coordinators Jun 16, 2025
@marcosgopen
Copy link
Member

Thanks @mmusgrov for your PR! It is great to have the 'support for starting LRAs using a cluster of coordinators'! I can see that the TCK failed, that is probably because the Stork module needs to be added in the lra-participant module of WildFly first[1].

[1]WELD-001474: Class io.narayana.lra.client.internal.NarayanaLRAClient is on the classpath, but was ignored because a class it references was not found: io.smallrye.stork.api.config.ConfigWithType from [Module \"org.jboss.narayana.lra.lra-participant\" version 1.0.2.Final-SNAPSHOT from local module loader @12591ac8 (finder: local module finder @5a7fe64f (roots: /home/runner/work/lra/lra/jboss-as/build/target/wildfly-37.0.0.Beta1-SNAPSHOT/modules,/home/runner/work/lra/lra/jboss-as/build/target/wildfly-37.0.0.Beta1-SNAPSHOT/modules/system/layers/base,/home/runner/work/lra/lra/jboss-as/testsuite/integration/microprofile-tck/lra/target/modules))].

@mmusgrov mmusgrov added the Hold label Jun 18, 2025
@mmusgrov
Copy link
Member Author

Thanks Marco, I've added the HOLD label while we investigate. I'll try out your recommendation and will report back later today.

@mmusgrov
Copy link
Member Author

mmusgrov commented Jun 18, 2025

FYI On a clean check out and build of WildFly followed by a run of the microprofile-tck testsuite [1], (after building my PR 1.0.2.Final-SNAPSHOT), I get a pass.

Since the PR runs with our version of WildFly (ie https://github.com/jbosstm/jboss-as.git) the failure must be related to our clone, although our clone was updated on Monday and only contains some unrelated testsuite changes, sha d46d20f9d1

I'll try building a standard WildFly using our script (./scripts/hudson/narayana.sh) to see if that is producing/doing something odd.

[1] ./build.sh -f testsuite/integration/microprofile-tck/lra/pom.xml -fae -B -Dversion.org.jboss.narayana.lra=1.0.2.Final-SNAPSHOT clean test

[Edit] Martin has given me a patch for our jboss-as clone that enables the change to run against the WildFly MP LRA testsuite. I'll update that with his patch and then re-trigger PR.

@mmusgrov
Copy link
Member Author

mmusgrov commented Jun 19, 2025

Thanks @xstefank - adding your commit fixes the WildFly testsuite failure ie `Test Suite: Integration - MicroProfile TCK - LRA' passes.

Since the fix was to add a "Depends on jbosstm/jboss-as#97" to the PR description I had to manually triggered the LRA-CORE / LRA github action for JDK 17 and 21 (https://github.com/jbosstm/lra/actions/runs/15759919429 shows them passing) and therefore the errors in the "Some checks were not successful" panel below remain - I wonder if there is a GH Action trigger corresponding to "Depends on ...".

Will you review the PR. @marcosgopen I'll remove the HOLD label.

@mmusgrov mmusgrov removed the Hold label Jun 19, 2025
@jbosstm-bot
Copy link

@jbosstm-bot
Copy link

@marcosgopen
Copy link
Member

Thanks Martin and Mike, I started the jobs on the CI to test this PR against our jboss-as repo.
The PR is great! I am going to approve it but let's wait for the tests to pass please. Thanks
BTW could we use a specific PR of jboss-as to test lra PRs with github actions? cc @tomjenkinson @jmfinelli

@jbosstm-bot
Copy link

@jbosstm-bot
Copy link

@mmusgrov
Copy link
Member Author

mmusgrov commented Jun 20, 2025

Thanks Martin and Mike, I started the jobs on the CI to test this PR against our jboss-as repo. The PR is great! I am going to approve it but let's wait for the tests to pass please. Thanks BTW could we use a specific PR of jboss-as to test lra PRs with github actions? cc @tomjenkinson @jmfinelli

They PR did run with with the specific PR because of the "Depends on jbosstm/jboss-as#97" in the pull description (see https://github.com/jbosstm/lra/actions/runs/15759919429 for the successful runs).

I thought we are using github actions for this repo, are we maintaining both build systems now on this repo?

@marcosgopen
Copy link
Member

marcosgopen commented Jun 20, 2025

Thanks Martin and Mike, I started the jobs on the CI to test this PR against our jboss-as repo. The PR is great! I am going to approve it but let's wait for the tests to pass please. Thanks BTW could we use a specific PR of jboss-as to test lra PRs with github actions? cc @tomjenkinson @jmfinelli

They PR did run with with the specific PR because of the "Depends on jbosstm/jboss-as#97" in the pull description (see https://github.com/jbosstm/lra/actions/runs/15759919429 for the successful runs).

I thought we are using github actions for this repo, are we maintaining both build systems now on this repo?

@mmusgrov the successful run you are pointing at (https://github.com/jbosstm/lra/actions/runs/15759919429) used the lra source code from main branch.

I thought we are using github actions for this repo, are we maintaining both build systems now on this repo?

Yes we are using github actions for this repo, but I can see the runs (https://github.com/jbosstm/lra/actions/runs/15689158243/job/44199809265?pr=50) are not using the correct jboss-as (jbosstm/jboss-as#97)

@jbosstm-bot
Copy link

@marcosgopen
Copy link
Member

I had to add explicitly 'jbosstm/jboss-as#97' in the description instead of 'jbosstm/jboss-as#97' in order to use that PR branch for testing cc @mmusgrov (this is something we need to remember/document)

@jbosstm-bot
Copy link

@mmusgrov
Copy link
Member Author

mmusgrov commented Jun 20, 2025

I had to add explicitly 'jbosstm/jboss-as#97' in the description instead of 'jbosstm/jboss-as#97' in order to use that PR branch for testing cc @mmusgrov (this is something we need to remember/document)

'jbosstm/jboss-as#97' is in the description twice now. What do we need to remember/document?

@tomjenkinson
Copy link
Member

I had to add explicitly 'jbosstm/jboss-as#97' in the description instead of 'jbosstm/jboss-as#97' in order to use that PR branch for testing cc @mmusgrov (this is something we need to remember/document)

'jbosstm/jboss-as#97' is in the description twice now. What do we need to remember/document?

I think the one that is not a full URL can be deleted from the description. I have proposed some text to help clarify the expectations (https://github.com/jbosstm/narayana/pull/2379/files)

@jbosstm-bot
Copy link

@mmusgrov
Copy link
Member Author

mmusgrov commented Jun 20, 2025

I had to add explicitly 'jbosstm/jboss-as#97' in the description instead of 'jbosstm/jboss-as#97' in order to use that PR branch for testing cc @mmusgrov (this is something we need to remember/document)

'jbosstm/jboss-as#97' is in the description twice now. What do we need to remember/document?

I think the one that is not a full URL can be deleted from the description. I have proposed some text to help clarify the expectations (https://github.com/jbosstm/narayana/pull/2379/files)

I'm not following you, the two links are identical (jbosstm/jboss-as#97 vs jbosstm/jboss-as#97).

[Edit] @tomjenkinson Ah I see what you mean: jbosstm/jboss-as#97 vs https://github.com/jbosstm/jboss-as/pull/97
I'll delete the first one.

@jbosstm-bot
Copy link

*/
public static final String LRA_COORDINATOR_URL_KEY = "lra.coordinator.url";
public static final String COORDINATOR_URLS_KEY = "lra.coordinator.urls";
public static final String USE_STORK_KEY = "lra.coordinator.discoverable";
Copy link
Member Author

@mmusgrov mmusgrov Jun 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete this since it isn't used.

fail(testName + ": current thread should not be associated with any LRAs");
}
if (lraClient.getCurrent() != null) {
lraClient.clearCurrent(false); // otherwise it will interfere with subsequent tests
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clear this at the end otherwise the assert on line 203 is always going to pass!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcosgopen I'll push an update to these two after your review in case you want to request other changes.

Copy link
Member

@marcosgopen marcosgopen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mmusgrov for your PR, together with the changes you mentioned I am approving this great PR! The CI tests passed now

@jbosstm-bot
Copy link

@jbosstm-bot
Copy link

@mmusgrov mmusgrov closed this Jun 23, 2025
@mmusgrov mmusgrov reopened this Jun 23, 2025
@mmusgrov
Copy link
Member Author

mmusgrov commented Jun 23, 2025

I will merge this but note that github actions will fail until we have merged the jboss-as changes because the GH Actions do not have the ability to use that jboss-as clone/.

[edit] I have merged the jboss-as changes now so GH Actions should be working again,.

@mmusgrov mmusgrov merged commit 93cfe88 into jbosstm:main Jun 23, 2025
2 of 8 checks passed
@mmusgrov mmusgrov deleted the JBTM-3987 branch June 23, 2025 11:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants