Skip to content

(feature) Multithreading and batches in export#91

Open
DaniilSmirnov wants to merge 1 commit intoeroshenkoam:mainfrom
DaniilSmirnov:multithreading-export
Open

(feature) Multithreading and batches in export#91
DaniilSmirnov wants to merge 1 commit intoeroshenkoam:mainfrom
DaniilSmirnov:multithreading-export

Conversation

@DaniilSmirnov
Copy link

Based on idea from https://github.com/eroshenkoam/xcresults/pull/82/changes

I add cli option for thread count (with default value - 1) and add batches and multithreading in report generation

As a result it shows 30-50% speed up depends on report size and threads count

In attachment you can see diagram with difference between versions and their speed

image

Copy link

@irdkwmnsb irdkwmnsb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since all the items are already being stored in memory - using a List<CompletableFuture>> pattern doesn't impact performance that much - we use the same amount of memory, just in a shorter timespan. Although if there is a lot of testRefIds or testSummaries or testAttachments - there will be a lot of lambdas on the heap. ThreadPoolExecutor's task queue should be bounded to a reasonable number: https://github.com/eroshenkoam/xcresults/pull/91/changes#diff-d989955202a14b833a38506d59b6fd65123f68c0fe4f1062994cdd4253b34ea7R85

description = "Thread count for export processor",
defaultValue = "1"
)
protected String threadCount;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be an Integer, instead of a String

Comment on lines +245 to +248
final List<JsonNode> summaries = new ArrayList<>();
summaries.addAll(getAttributeValues(entry.getKey(), ACTIVITY_SUMMARIES));
summaries.addAll(getAttributeValues(entry.getKey(), FAILURE_SUMMARIES));
summaries.forEach(summary -> {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason to create a summaries list. You can instead concat two streams and then do forEach on that stream.

Comment on lines +250 to +253
if (attachmentSources.containsKey(name)) {
final List<String> sources = attachmentSources.get(name);
sources.forEach(source -> attachmentsRefs.putIfAbsent(source, ref));
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be replaced with attachmentSources.computeIfPresent

}, customThreadPool);
testFutures.add(future);
}
CompletableFuture.allOf(testFutures.toArray(new CompletableFuture[0])).join();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if allOf is anyhow better than testFutures.forEach(fut -> fut.join()), while the latter is simpler to read

Comment on lines +293 to +315
private void exportAttachmentsInBatches(Map<String, String> attachmentsRefs) {
List<Map.Entry<String, String>> entries = new ArrayList<>(attachmentsRefs.entrySet());
if (entries.isEmpty()) return;

int batchSize = Math.max(1, Math.min(entries.size() / (threadCount * 2), 100));
List<List<Map.Entry<String, String>>> batches = partition(entries, batchSize);

List<CompletableFuture<Void>> batchFutures = new ArrayList<>();
for (List<Map.Entry<String, String>> batch : batches) {
CompletableFuture<Void> future = CompletableFuture.runAsync(() -> {
for (Map.Entry<String, String> entry : batch) {
try {
Path attachmentPath = outputPath.resolve(entry.getKey());
exportReference(entry.getValue(), attachmentPath);
} catch (Exception e) {
System.err.printf("Failed to export attachment %s: %s%n", entry.getKey(), e.getMessage());
}
}
}, customThreadPool);
batchFutures.add(future);
}
CompletableFuture.allOf(batchFutures.toArray(new CompletableFuture[0])).join();
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont see a reason to extract this to a different method. It's only littering the namespace

exitCode, heicPath);
}
} catch (IOException e) {
System.err.printf("IO error converting HEIC to JPEG %s: %s%n", heicPath, e.getMessage());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bad idea to supress the error and not throw it up the chain. The callers of this method might expect the jpeg to exist, if the function exited successfully.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants