Skip to content

Commit 0c65592

Browse files
Fixing ModelLoaderUtils.split() to pass tests (#126009) (#126560)
Prior to these changes, the split method would fail tests. Additionally, the method had code which could be refactored. A new variable (numRanges) was introduced to replace the direct usage of numStreams. The method was refactored to make the code easier to understand. Javadocs were updated. Tests for this method now pass. Co-authored-by: Jason-Whitmore <[email protected]>
1 parent 85b507d commit 0c65592

File tree

2 files changed

+22
-22
lines changed

2 files changed

+22
-22
lines changed

docs/changelog/126009.yaml

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
pr: 126009
2+
summary: Change ModelLoaderUtils.split to return the correct number of chunks and ranges.
3+
area: Machine Learning
4+
type: bug
5+
issues:
6+
- 121799

x-pack/plugin/ml-package-loader/src/main/java/org/elasticsearch/xpack/ml/packageloader/action/ModelLoaderUtils.java

Lines changed: 16 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -327,50 +327,44 @@ static InputStream getFileInputStream(URI uri) {
327327
* Split a stream of size {@code sizeInBytes} into {@code numberOfStreams} +1
328328
* ranges aligned on {@code chunkSizeBytes} boundaries. Each range contains a
329329
* whole number of chunks.
330-
* The first {@code numberOfStreams} ranges will be split evenly (in terms of
331-
* number of chunks not the byte size), the final range split
330+
* All ranges except the final range will be split approximately evenly
331+
* (in terms of number of chunks not the byte size), the final range split
332332
* is for the single final chunk and will be no more than {@code chunkSizeBytes}
333333
* in size. The separate range for the final chunk is because when streaming and
334334
* uploading a large model definition, writing the last part has to handled
335335
* as a special case.
336-
* Less ranges may be returned in case the stream size is too small.
336+
* Fewer ranges may be returned in case the stream size is too small.
337337
* @param sizeInBytes The total size of the stream
338338
* @param numberOfStreams Divide the bulk of the size into this many streams.
339339
* @param chunkSizeBytes The size of each chunk
340-
* @return List of {@code numberOfStreams} + 1 ranges.
340+
* @return List of {@code numberOfStreams} + 1 or fewer ranges.
341341
*/
342342
static List<RequestRange> split(long sizeInBytes, int numberOfStreams, long chunkSizeBytes) {
343343
int numberOfChunks = (int) ((sizeInBytes + chunkSizeBytes - 1) / chunkSizeBytes);
344+
345+
int numberOfRanges = numberOfStreams + 1;
344346
if (numberOfStreams > numberOfChunks) {
345-
numberOfStreams = numberOfChunks;
347+
numberOfRanges = numberOfChunks;
346348
}
347349
var ranges = new ArrayList<RequestRange>();
348350

349-
int baseChunksPerStream = numberOfChunks / numberOfStreams;
350-
int remainder = numberOfChunks % numberOfStreams;
351+
int baseChunksPerRange = (numberOfChunks - 1) / (numberOfRanges - 1);
352+
int remainder = (numberOfChunks - 1) % (numberOfRanges - 1);
351353
long startOffset = 0;
352354
int startChunkIndex = 0;
353355

354-
for (int i = 0; i < numberOfStreams - 1; i++) {
355-
int numChunksInStream = (i < remainder) ? baseChunksPerStream + 1 : baseChunksPerStream;
356-
long rangeEnd = startOffset + (numChunksInStream * chunkSizeBytes) - 1; // range index is 0 based
357-
ranges.add(new RequestRange(startOffset, rangeEnd, startChunkIndex, numChunksInStream));
358-
startOffset = rangeEnd + 1; // range is inclusive start and end
359-
startChunkIndex += numChunksInStream;
360-
}
356+
for (int i = 0; i < numberOfRanges - 1; i++) {
357+
int numChunksInRange = (i < remainder) ? baseChunksPerRange + 1 : baseChunksPerRange;
361358

362-
// Want the final range request to be a single chunk
363-
if (baseChunksPerStream > 1) {
364-
int numChunksExcludingFinal = baseChunksPerStream - 1;
365-
long rangeEnd = startOffset + (numChunksExcludingFinal * chunkSizeBytes) - 1;
366-
ranges.add(new RequestRange(startOffset, rangeEnd, startChunkIndex, numChunksExcludingFinal));
359+
long rangeEnd = startOffset + (((long) numChunksInRange) * chunkSizeBytes) - 1; // range index is 0 based
367360

368-
startOffset = rangeEnd + 1;
369-
startChunkIndex += numChunksExcludingFinal;
361+
ranges.add(new RequestRange(startOffset, rangeEnd, startChunkIndex, numChunksInRange));
362+
startOffset = rangeEnd + 1; // range is inclusive start and end
363+
startChunkIndex += numChunksInRange;
370364
}
371365

372366
// The final range is a single chunk the end of which should not exceed sizeInBytes
373-
long rangeEnd = Math.min(sizeInBytes, startOffset + (baseChunksPerStream * chunkSizeBytes)) - 1;
367+
long rangeEnd = Math.min(sizeInBytes, startOffset + (baseChunksPerRange * chunkSizeBytes)) - 1;
374368
ranges.add(new RequestRange(startOffset, rangeEnd, startChunkIndex, 1));
375369

376370
return ranges;

0 commit comments

Comments
 (0)