@ericywl
Contributor

@ericywl ericywl commented Jun 17, 2025

Motivation/summary

The apm-data plugin changed in 9.0, so 8.18 and 9.0 no longer share the same template: elastic/elasticsearch#129418. Upgrading from 8.18 to 9.0 will therefore trigger a lazy rollover.

How to test these changes

Run integration-server-test workflow: https://github.com/elastic/apm-server/actions/runs/15700651775.

@ericywl ericywl self-assigned this Jun 17, 2025
@ericywl ericywl requested a review from a team as a code owner June 17, 2025 07:13
@mergify
Contributor

mergify bot commented Jun 17, 2025

This pull request does not have a backport label. Could you fix it @ericywl? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-7.17 is the label to automatically backport to the 7.17 branch.
  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit.
  • backport-9./d is the label to automatically backport to the 9./d branch. /d is the digit.
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@ericywl ericywl added the backport-skip Skip notification from the automated backport with mergify label Jun 17, 2025
# 8.19 and 9.1 have the same template due to
# https://github.com/elastic/elasticsearch/pull/128913.
9.1:
- "8.19"
Contributor

The linked PR gets backported to 8.19.0, 9.0.3, and 9.1.0, so IMO upgrading from 8.18 to 9.0 would now lead to a lazy rollover.
Upgrading from 8.19 to 9.1 should not lead to a lazy rollover.

So I am not sure why 9.0 is listed as a lazy rollover exception now?

Contributor Author

@ericywl ericywl Jun 18, 2025

I think there may be some misunderstanding about the exceptions. The first level of the YAML lists target versions: upgrading to one of them from another version will trigger a lazy rollover. The second level lists the exceptions: upgrading from one of those versions to the first-level version above it will not trigger a lazy rollover.

So, in this case, upgrading from 8.19 to 9.1 will not have lazy-rollover. But upgrading from 8.18 to 9.0 will.

If there is a better way to represent this, please let me know. I also feel that this YAML config might be a little confusing.
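
A minimal sketch of how the two-level layout described above could be read, for illustration only. The dictionary contents, the empty exception list for 9.0, and the names `major_minor` and `expect_lazy_rollover` are assumptions and are not taken from the apm-server test code.

```python
# Hypothetical in-memory form of the two-level YAML.
# Key:   target version; upgrading to it triggers a lazy rollover.
# Value: source versions that are exceptions (no lazy rollover).
LAZY_ROLLOVER_WITH_EXCEPTIONS = {
    "9.0": [],        # assumed entry: 8.18 -> 9.0 triggers a lazy rollover
    "9.1": ["8.19"],  # 8.19 and 9.1 share the same template
}


def major_minor(version: str) -> str:
    """Reduce '9.1.0' to '9.1', since the config matches major.minor only."""
    return ".".join(version.split(".")[:2])


def expect_lazy_rollover(from_version: str, to_version: str) -> bool:
    """Return True when upgrading from_version -> to_version should roll over."""
    exceptions = LAZY_ROLLOVER_WITH_EXCEPTIONS.get(major_minor(to_version))
    if exceptions is None:
        # Target version is not listed at the first level: no lazy rollover.
        return False
    # Listed target: rollover unless the source version is an exception.
    return major_minor(from_version) not in exceptions
```

With these assumed entries, `expect_lazy_rollover("8.18.0", "9.0.0")` is true and `expect_lazy_rollover("8.19.0", "9.1.0")` is false, matching the two cases above.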

Member

It's not particularly intuitive. Do we really need the first level? All versions onwards from X (I can't remember what X is) will do lazy rollover right? Does that need to be defined in configuration, or could we put that in the code, and test every major version from there on?

Then perhaps we could define the exceptions something like:

lazy-rollover-with-exceptions:
  - from: 8.19
    to: 9.1
    reason: "8.19 and 9.1 have the same template due to https://github.com/elastic/elasticsearch/pull/128913."
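
As a rough sketch of the flat from/to/reason shape suggested above, the lookup could be as simple as the following. The class and function names are hypothetical, and the default of "every upgrade rolls over unless an exception matches" is an assumption about how the list would be used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RolloverException:
    """One from/to pair for which no lazy rollover is expected."""
    from_version: str  # major.minor, e.g. "8.19"
    to_version: str    # major.minor, e.g. "9.1"
    reason: str


# Mirrors the single entry from the suggestion above.
EXCEPTIONS = [
    RolloverException(
        from_version="8.19",
        to_version="9.1",
        reason="8.19 and 9.1 have the same template due to "
               "https://github.com/elastic/elasticsearch/pull/128913.",
    ),
]


def find_exception(from_version: str, to_version: str) -> Optional[RolloverException]:
    """Return the matching exception, or None when there is none."""
    for exc in EXCEPTIONS:
        if (exc.from_version, exc.to_version) == (from_version, to_version):
            return exc
    return None


def rollover_expected(from_version: str, to_version: str) -> bool:
    # Assumption: a lazy rollover is expected unless an exception matches.
    return find_exception(from_version, to_version) is None
```

Versions are kept as strings here on purpose: an unquoted `8.19` in YAML is read as a float by typical parsers, so a value like `8.10` would collapse to `8.1`. Quoting the versions, as the existing config does with `"8.19"`, avoids that.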

Contributor Author

@ericywl ericywl Jun 20, 2025

> It's not particularly intuitive. Do we really need the first level? All versions onwards from X (I can't remember what X is) will do lazy rollover right? Does that need to be defined in configuration, or could we put that in the code, and test every major version from there on?

It's not guaranteed that every new version will have a lazy rollover. For example, upgrading from 8.17 to 8.18 does not trigger one because both versions share the same templates in the ES apm-data plugin.

I do agree that it is unintuitive. Perhaps the way you suggested (explicit from and to, along with a reason) would be better. Or, as Silvia mentioned in a previous PR, it might be better to just define the versions with lazy rollover explicitly. I will look into improving it after this iteration. Thanks!

Contributor

@endorama endorama Jun 20, 2025

I've been struggling with the current setup as well while writing tests for the resource version bug. Since we bumped the resource version, we expect lazy rollovers in more versions, and adding all the exceptions was not intuitive (for example, we now expect a rollover if you upgrade to 8.17.8 from any 8.17 < 8.17.8). The fact that it only supports matching major.minor instead of a full version is also limiting.

I think we should ensure that every version is expected to have a lazy rollover, and that the exceptions are the ones that don't. The rationale is that if we default to "no rollover", we may miss cases where a rollover is needed but we did not update the test configuration. Given how critical rollovers are, this failure mode is safer.
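
To illustrate the full-version point, here is a small sketch of a comparison that major.minor matching cannot express. The names `parse_version` and `crosses` are hypothetical, and it assumes plain numeric versions without suffixes such as `-SNAPSHOT`.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '8.17.8' into (8, 17, 8). Assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def crosses(from_version: str, to_version: str, boundary: str) -> bool:
    """True when the upgrade starts below `boundary` and ends at or above it."""
    return (
        parse_version(from_version)
        < parse_version(boundary)
        <= parse_version(to_version)
    )
```

For the example above, `crosses("8.17.3", "8.17.8", "8.17.8")` is true, while an upgrade that starts at 8.17.8 or ends before it is not, even though every version involved has the same major.minor.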

@ericywl ericywl requested review from a team and simitt June 18, 2025 03:25
Member

@axw axw left a comment

LGTM. I'm not familiar with the details, but based on your description it looks sound.

@ericywl ericywl enabled auto-merge (squash) June 20, 2025 05:35
@ericywl ericywl merged commit 35ff144 into main Jun 20, 2025
17 checks passed
@ericywl ericywl deleted the iserver-test-fix-818-90-lazy-rollover branch June 20, 2025 05:42