-
Notifications
You must be signed in to change notification settings - Fork 25.6k
[ML] Refactor AutodetectMemoryLimitIT to use memory limit constants and improve model size assertions #135526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML] Refactor AutodetectMemoryLimitIT to use memory limit constants and improve model size assertions #135526
Conversation
…prove model size assertions - Removed muted tests for `testManyDistinctOverFields` and `testTooManyByAndOverFields`. - Introduced constants for memory limits in `AutodetectMemoryLimitIT`. - Updated assertions to check effective model size against calculated limits.
|
Pinging @elastic/ml-core (Team:ML) |
...ts/src/javaRestTest/java/org/elasticsearch/xpack/ml/integration/AutodetectMemoryLimitIT.java
Outdated
Show resolved
Hide resolved
...ts/src/javaRestTest/java/org/elasticsearch/xpack/ml/integration/AutodetectMemoryLimitIT.java
Outdated
Show resolved
Hide resolved
jan-elastic
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally look good. Some small comments.
- Changed variable names from `memoryLimit` to `memoryLimitMb` for clarity. - Updated memory limit assertions to reflect the new variable naming. - Ensured consistency in memory limit usage across multiple test cases.
| */ | ||
| public class AutodetectMemoryLimitIT extends MlNativeAutodetectIntegTestCase { | ||
|
|
||
| private static final long PROCESS_OVERHEAD_BYTES = ByteSizeValue.ofMb(20).getBytes(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this value coming from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This value is estimated empirically. I used a small, simple anomaly detection job with a trivial model to establish the lower bound on the process memory usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be helpful to add a comment explaining that? If there's a risk of that value changing at some point in the future, it would be good to know what needs to be do to recalculate and update it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the second though, checking if the autodetect process has a specific memory overhead make these tests unnecessary brittle. I removed it and only kept checking for hard_limit. In ml-cpp we have unit tests to ensure there are no memory leaks.
…tests/AutodetectMemoryLimitIT
| assertThat(modelSizeStats.getModelBytes(), lessThan(120500000L)); | ||
| assertThat(modelSizeStats.getModelBytes(), greaterThan(70000000L)); | ||
| assertThat(getEffectiveModelSize(modelSizeStats.getModelBytes()), lessThan(ByteSizeValue.ofMb(memoryLimitMb).getBytes() * 1.05)); | ||
| assertThat(modelSizeStats.getMemoryStatus(), equalTo(ModelSizeStats.MemoryStatus.HARD_LIMIT)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reporting of memory usage is different on different plattforms: on Linux and Windows we report actual process memory usage, while on MacOS this information is not available and hence we report estimated memory usage. This makes the test for a specific memory limit brittle. To improve robustness, I only check that the job is in the hard_limit state.
| assertThat(modelSizeStats.getModelBytes(), lessThan(72000000L)); | ||
| assertThat(modelSizeStats.getModelBytes(), greaterThan(24000000L)); | ||
| assertThat(getEffectiveModelSize(modelSizeStats.getModelBytes()), lessThan(ByteSizeValue.ofMb(memoryLimitMb).getBytes() * 1.05)); | ||
| assertThat(modelSizeStats.getMemoryStatus(), equalTo(ModelSizeStats.MemoryStatus.HARD_LIMIT)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only check if the job is in hard_limit state to increase test robustness on different plattforms
| assertThat(modelSizeStats.getModelBytes(), greaterThan(24000000L)); | ||
|
|
||
| assertThat(getEffectiveModelSize(modelSizeStats.getModelBytes()), lessThan(ByteSizeValue.ofMb(memoryLimitMb).getBytes() * 1.05)); | ||
| assertThat( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only check for hard_limit to increase test robustness.
| assertThat(modelSizeStats.getModelBytes(), lessThan(45000000L)); | ||
| assertThat(modelSizeStats.getModelBytes(), greaterThan(25000000L)); | ||
| assertThat(getEffectiveModelSize(modelSizeStats.getModelBytes()), lessThan(ByteSizeValue.ofMb(memoryLimitMb).getBytes() * 1.05)); | ||
| assertThat(modelSizeStats.getMemoryStatus(), equalTo(ModelSizeStats.MemoryStatus.HARD_LIMIT)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only check hard_limit for robustness.
…MemoryLimitIT tests
…leriy42/elasticsearch into tests/AutodetectMemoryLimitIT
Previously, the upper bound for model memory checks was set in absolute terms, which is not easy to understand and is brittle. I adjusted the assertions to ensure that the memory usage does not exceed 5% of the memory limit. Additionally, on Linux, we now report the process size (see #131981), which includes approximately 20 MB of native code overhead. I made handling this overhead more explicit.
More details:
testManyDistinctOverFieldsandtestTooManyByAndOverFields.AutodetectMemoryLimitIT.Closes #132308
Closes #132310
Closes #132611