Add approaches for Parallel Letter Frequency #2863
Merged
16 commits
a0ca392 Add approaches for Parallel-Letter-Frequency (masiljangajji)
b84ea23 fix: markdown according to the rules (masiljangajji)
4007c38 docs: undo PULL_REQUEST_TEMPLATE.md (masiljangajji)
616bcc2 docs: make it clear the method is from the Character (masiljangajji)
1f0531d docs: update(introduction.md) markdown format (masiljangajji)
28a8ebc docs: update markdown format and make it clear the method (masiljangajji)
49512f7 docs: update markdown format(markdownlint-cli2) (masiljangajji)
42bee45 docs: update markdown format(PULL_REQUEST_TEMPLATE.md) (masiljangajji)
faf10e0 docs: update fork-join content (masiljangajji)
72b1485 docs: update parallel-stream content (masiljangajji)
d876b63 docs: update introduction (masiljangajji)
9ed7813 docs: rollback invokeAll (masiljangajji)
01cbae2 docs: erase whitespace (masiljangajji)
fbb7525 update introduction.md (masiljangajji)
ba55162 docs: clarified that compute is called through invokeAll (masiljangajji)
e624a68 docs: parallelStream hyperlink add (masiljangajji)
27 changes: 27 additions & 0 deletions
exercises/practice/parallel-letter-frequency/.approaches/config.json
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| { | ||
| "introduction": { | ||
| "authors": [ | ||
| "masiljangajji" | ||
| ] | ||
| }, | ||
| "approaches": [ | ||
| { | ||
| "uuid": "dee2a79d-3e64-4220-b99f-55667549c12c", | ||
| "slug": "fork-join", | ||
| "title": "Fork/Join", | ||
| "blurb": "Parallel Computation Using Fork/Join", | ||
| "authors": [ | ||
| "masiljangajji" | ||
| ] | ||
| }, | ||
| { | ||
| "uuid": "75e9e93b-4da4-4474-8b6e-3c0cb9b3a9bb", | ||
| "slug": "parallel-stream", | ||
| "title": "Parallel Stream", | ||
| "blurb": "Parallel Computation Using Parallel Stream", | ||
| "authors": [ | ||
| "masiljangajji" | ||
| ] | ||
| } | ||
| ] | ||
| } |
91 changes: 91 additions & 0 deletions
exercises/practice/parallel-letter-frequency/.approaches/fork-join/content.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,91 @@ | ||
| # `Fork/Join` | ||
|
|
||
| ```java | ||
| import java.util.Map; | ||
| import java.util.List; | ||
| import java.util.concurrent.ConcurrentMap; | ||
| import java.util.concurrent.ConcurrentHashMap; | ||
| import java.util.concurrent.ForkJoinPool; | ||
| import java.util.concurrent.RecursiveTask; | ||
|
|
||
| class ParallelLetterFrequency { | ||
|
|
||
| List<String> texts; | ||
| ConcurrentMap<Character, Integer> letterCount; | ||
|
|
||
| ParallelLetterFrequency(String[] texts) { | ||
| this.texts = List.of(texts); | ||
| letterCount = new ConcurrentHashMap<>(); | ||
| } | ||
|
|
||
| Map<Character, Integer> countLetters() { | ||
| if (texts.isEmpty()) { | ||
| return letterCount; | ||
| } | ||
|
|
||
| ForkJoinPool forkJoinPool = new ForkJoinPool(); | ||
| forkJoinPool.invoke(new LetterCountTask(texts, 0, texts.size(), letterCount)); | ||
| forkJoinPool.shutdown(); | ||
|
|
||
| return letterCount; | ||
| } | ||
|
|
||
| private static class LetterCountTask extends RecursiveTask<Void> { | ||
| private static final int THRESHOLD = 10; | ||
| private final List<String> texts; | ||
| private final int start; | ||
| private final int end; | ||
| private final ConcurrentMap<Character, Integer> letterCount; | ||
|
|
||
| LetterCountTask(List<String> texts, int start, int end, ConcurrentMap<Character, Integer> letterCount) { | ||
| this.texts = texts; | ||
| this.start = start; | ||
| this.end = end; | ||
| this.letterCount = letterCount; | ||
| } | ||
|
|
||
| @Override | ||
| protected Void compute() { | ||
| if (end - start <= THRESHOLD) { | ||
| for (int i = start; i < end; i++) { | ||
| for (char c : texts.get(i).toLowerCase().toCharArray()) { | ||
| if (Character.isAlphabetic(c)) { | ||
| letterCount.merge(c, 1, Integer::sum); | ||
| } | ||
| } | ||
| } | ||
| } else { | ||
| int mid = (start + end) / 2; | ||
| LetterCountTask leftTask = new LetterCountTask(texts, start, mid, letterCount); | ||
| LetterCountTask rightTask = new LetterCountTask(texts, mid, end, letterCount); | ||
| invokeAll(leftTask, rightTask); | ||
| } | ||
| return null; | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| Because [`ConcurrentHashMap`][ConcurrentHashMap] performs each `merge` call atomically, the frequency counts can be updated safely from multiple threads. | ||
|
|
||
| If there are no strings to process, an early return skips the unnecessary work. | ||
|
|
||
| A [`ForkJoinPool`][ForkJoinPool] is then created. | ||
| The core of [`ForkJoinPool`][ForkJoinPool] is the Fork/Join mechanism, which divides tasks into smaller units and processes them in parallel. | ||
|
|
||
| `THRESHOLD` is the criterion for task division. | ||
| If the range of texts exceeds `THRESHOLD`, the task is divided into two subtasks, and [`invokeAll(leftTask, rightTask)`][invokeAll] is called to execute both subtasks in parallel. | ||
| `invokeAll` runs the `compute()` method of each `LetterCountTask` subtask, so a subtask keeps dividing itself until its range is smaller than or equal to `THRESHOLD`. | ||
| Tasks whose range is within `THRESHOLD` count the letter frequencies directly. | ||
|
|
||
| The [`Character.isAlphabetic`][isAlphabetic] method returns `true` for every character that Unicode classifies as alphabetic, which covers letters from many languages, such as English, Korean, Japanese, and Chinese. | ||
| For non-alphabetic characters, including digits, special characters, and spaces, it returns `false`. | ||
|
|
||
| Additionally, since uppercase and lowercase letters are treated as the same character (e.g., `A` and `a`), each character is converted to lowercase. | ||
|
|
||
| After updating letter frequencies, the final map is returned. | ||
|
|
||
| [ConcurrentHashMap]: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html | ||
| [ForkJoinPool]: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ForkJoinPool.html | ||
| [isAlphabetic]: https://docs.oracle.com/javase/8/docs/api/java/lang/Character.html#isAlphabetic-int- | ||
| [invokeAll]: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ForkJoinTask.html | ||
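
The divide-and-merge idea described in this file can also be sketched without a shared map. The following is a hypothetical variant, not the code from this pull request: each task returns its own `HashMap`, and the parent merges the two results after `invokeAll` has run both subtasks. The class names `MergingLetterCount` and `CountTask` are made up for illustration; `THRESHOLD` reuses the value from the approach.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical variant: tasks return local maps instead of updating a shared one.
class MergingLetterCount {

    static Map<Character, Integer> count(List<String> texts) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new CountTask(texts, 0, texts.size()));
        } finally {
            pool.shutdown();
        }
    }

    private static class CountTask extends RecursiveTask<Map<Character, Integer>> {
        private static final int THRESHOLD = 10; // same cut-off as in the approach
        private final List<String> texts;
        private final int start;
        private final int end;

        CountTask(List<String> texts, int start, int end) {
            this.texts = texts;
            this.start = start;
            this.end = end;
        }

        @Override
        protected Map<Character, Integer> compute() {
            if (end - start <= THRESHOLD) {
                // Small enough: count directly into a map only this task can see.
                Map<Character, Integer> local = new HashMap<>();
                for (int i = start; i < end; i++) {
                    for (char c : texts.get(i).toLowerCase().toCharArray()) {
                        if (Character.isAlphabetic(c)) {
                            local.merge(c, 1, Integer::sum);
                        }
                    }
                }
                return local;
            }
            int mid = (start + end) / 2;
            CountTask left = new CountTask(texts, start, mid);
            CountTask right = new CountTask(texts, mid, end);
            invokeAll(left, right); // runs compute() on both subtasks
            // Both subtasks are finished here, so join() returns immediately.
            Map<Character, Integer> merged = left.join();
            right.join().forEach((letter, n) -> merged.merge(letter, n, Integer::sum));
            return merged;
        }
    }
}
```

Calling `MergingLetterCount.count(List.of("Aa", "b1 B"))` yields a map with `a` and `b` both counted twice. The trade-off is extra map merging in exchange for no contention on a shared `ConcurrentHashMap`; which one is faster depends on the input and was not measured here.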
7 changes: 7 additions & 0 deletions
exercises/practice/parallel-letter-frequency/.approaches/fork-join/snippet.txt
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| for (int i = start; i < end; i++) { | ||
| for (char c : texts.get(i).toLowerCase().toCharArray()) { | ||
| if (Character.isAlphabetic(c)) { | ||
| letterCount.merge(c, 1, Integer::sum); | ||
| } | ||
| } | ||
| } |
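
The snippet relies on `merge` behaving as a single atomic read-modify-write step on a `ConcurrentHashMap`. A minimal sketch of that property, using a hypothetical `MergeDemo` class that is not part of this pull request: many parallel increments of the same key all arrive, so the final count equals the number of increments.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.IntStream;

// Hypothetical demo class: hammers one key from a parallel stream.
class MergeDemo {

    static int countConcurrently(int increments) {
        ConcurrentMap<Character, Integer> counts = new ConcurrentHashMap<>();
        IntStream.range(0, increments)
                .parallel()
                // Absent key: stores 1. Present key: stores Integer.sum(old, 1).
                .forEach(i -> counts.merge('x', 1, Integer::sum));
        return counts.getOrDefault('x', 0);
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(10_000));
    }
}
```

With a plain `HashMap` in place of `ConcurrentHashMap`, the same loop would be a data race and could lose updates, which is why the approach uses the concurrent map.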
142 changes: 142 additions & 0 deletions
exercises/practice/parallel-letter-frequency/.approaches/introduction.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,142 @@ | ||
| # Introduction | ||
|
|
||
| There are multiple ways to solve the Parallel Letter Frequency problem. | ||
| One approach is to use [`Collection.parallelStream`][stream], and another involves using [`ForkJoinPool`][ForkJoinPool]. | ||
|
|
||
| ## General guidance | ||
|
|
||
| To count occurrences of items, a map data structure is often used, though arrays and lists can work as well. | ||
| A [`map`][map], being a key-value pair structure, is suitable for recording frequency by incrementing the value for each key. | ||
| If the data being counted has a limited range (e.g., `Character` or `Integer` values), an `int[]` array or a [`List<Integer>`][list] can be used to record frequencies. | ||
|
|
||
| Parallel processing typically takes place in a multi-[`thread`][thread] environment. | ||
| The Java 8 [`stream`][stream] API provides methods that make parallel processing easier, including the `parallelStream()` method. | ||
| With `parallelStream()`, the workload is divided and executed in parallel on a [`ForkJoinPool`][ForkJoinPool], without the need to manually manage threads or create custom thread pools. | ||
|
|
||
| The [`ForkJoinPool`][ForkJoinPool] class, optimized for dividing and managing tasks, makes parallel processing efficient. | ||
| However, `parallelStream()` uses the common [`ForkJoinPool`][ForkJoinPool] by default, meaning that all parallel streams share the same thread pool unless configured otherwise. | ||
|
|
||
| As a result, parallel streams may interfere with each other when sharing this thread pool, potentially affecting performance. | ||
| Although this doesn't affect solving the Parallel Letter Frequency problem itself, it can cause problems in applications where several workloads compete for the shared pool. | ||
| Therefore, a custom [`ForkJoinPool`][ForkJoinPool] approach is also provided below. | ||
|
|
||
| ## Approach: `parallelStream` | ||
|
|
||
| ```java | ||
| import java.util.Map; | ||
| import java.util.List; | ||
| import java.util.concurrent.ConcurrentMap; | ||
| import java.util.concurrent.ConcurrentHashMap; | ||
|
|
||
| class ParallelLetterFrequency { | ||
|
|
||
| List<String> texts; | ||
| ConcurrentMap<Character, Integer> letterCount; | ||
|
|
||
| ParallelLetterFrequency(String[] texts) { | ||
| this.texts = List.of(texts); | ||
| letterCount = new ConcurrentHashMap<>(); | ||
| } | ||
|
|
||
| Map<Character, Integer> countLetters() { | ||
| if (!letterCount.isEmpty() || texts.isEmpty()) { | ||
| return letterCount; | ||
| } | ||
| texts.parallelStream().forEach(text -> { | ||
| for (char c: text.toLowerCase().toCharArray()) { | ||
| if (Character.isAlphabetic(c)) { | ||
| letterCount.merge(c, 1, Integer::sum); | ||
| } | ||
| } | ||
| }); | ||
| return letterCount; | ||
| } | ||
|
|
||
| } | ||
| ``` | ||
|
|
||
| For more information, check the [`parallelStream` approach][approach-parallel-stream]. | ||
|
|
||
| ## Approach: `Fork/Join` | ||
|
|
||
| ```java | ||
| import java.util.Map; | ||
| import java.util.List; | ||
| import java.util.concurrent.ConcurrentMap; | ||
| import java.util.concurrent.ConcurrentHashMap; | ||
| import java.util.concurrent.ForkJoinPool; | ||
| import java.util.concurrent.RecursiveTask; | ||
|
|
||
| class ParallelLetterFrequency { | ||
|
|
||
| List<String> texts; | ||
| ConcurrentMap<Character, Integer> letterCount; | ||
|
|
||
| ParallelLetterFrequency(String[] texts) { | ||
| this.texts = List.of(texts); | ||
| letterCount = new ConcurrentHashMap<>(); | ||
| } | ||
|
|
||
| Map<Character, Integer> countLetters() { | ||
| if (!letterCount.isEmpty() || texts.isEmpty()) { | ||
| return letterCount; | ||
| } | ||
|
|
||
| ForkJoinPool forkJoinPool = new ForkJoinPool(); | ||
| forkJoinPool.invoke(new LetterCountTask(texts, 0, texts.size(), letterCount)); | ||
| forkJoinPool.shutdown(); | ||
|
|
||
| return letterCount; | ||
| } | ||
|
|
||
| private static class LetterCountTask extends RecursiveTask<Void> { | ||
| private static final int THRESHOLD = 10; | ||
| private final List<String> texts; | ||
| private final int start; | ||
| private final int end; | ||
| private final ConcurrentMap<Character, Integer> letterCount; | ||
|
|
||
| LetterCountTask(List<String> texts, int start, int end, ConcurrentMap<Character, Integer> letterCount) { | ||
| this.texts = texts; | ||
| this.start = start; | ||
| this.end = end; | ||
| this.letterCount = letterCount; | ||
| } | ||
|
|
||
| @Override | ||
| protected Void compute() { | ||
| if (end - start <= THRESHOLD) { | ||
| for (int i = start; i < end; i++) { | ||
| for (char c : texts.get(i).toLowerCase().toCharArray()) { | ||
| if (Character.isAlphabetic(c)) { | ||
| letterCount.merge(c, 1, Integer::sum); | ||
| } | ||
| } | ||
| } | ||
| } else { | ||
| int mid = (start + end) / 2; | ||
| LetterCountTask leftTask = new LetterCountTask(texts, start, mid, letterCount); | ||
| LetterCountTask rightTask = new LetterCountTask(texts, mid, end, letterCount); | ||
| invokeAll(leftTask, rightTask); | ||
| } | ||
| return null; | ||
| } | ||
| } | ||
| } | ||
|
|
||
| ``` | ||
|
|
||
| For more information, check the [`fork/join` approach][approach-fork-join]. | ||
|
|
||
| ## Which approach to use? | ||
|
|
||
| When tasks are simple or do not require a dedicated thread pool (such as in this case), the `parallelStream` approach is recommended. | ||
| However, if the work is complex or there is a need to isolate thread pools from other concurrent tasks, the [`ForkJoinPool`][ForkJoinPool] approach is preferable. | ||
|
|
||
| [thread]: https://docs.oracle.com/javase/8/docs/api/java/lang/Thread.html | ||
| [stream]: https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html | ||
| [ForkJoinPool]: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ForkJoinPool.html | ||
| [map]: https://docs.oracle.com/javase/8/docs/api/java/util/Map.html | ||
| [list]: https://docs.oracle.com/javase/8/docs/api/java/util/List.html | ||
| [approach-parallel-stream]: https://exercism.org/tracks/java/exercises/parallel-letter-frequency/approaches/parallel-stream | ||
| [approach-fork-join]: https://exercism.org/tracks/java/exercises/parallel-letter-frequency/approaches/fork-join |
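
The general guidance mentions that an `int[]` can record frequencies when the counted values have a limited range. A minimal sketch of that idea, under the stated assumption that only the 26 ASCII letters `a` to `z` are counted (unlike the approaches above, which accept any Unicode letter). `ArrayLetterCount` and its helper methods are hypothetical names, not part of this pull request.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: index 0 holds the count for 'a', index 25 for 'z'.
class ArrayLetterCount {

    static int[] count(List<String> texts) {
        return texts.parallelStream()
                .map(ArrayLetterCount::countOne)
                // add() never mutates its arguments, so sharing the identity array is safe.
                .reduce(new int[26], ArrayLetterCount::add);
    }

    private static int[] countOne(String text) {
        int[] counts = new int[26];
        for (char c : text.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
            }
        }
        return counts;
    }

    private static int[] add(int[] left, int[] right) {
        int[] sum = new int[26];
        for (int i = 0; i < sum.length; i++) {
            sum[i] = left[i] + right[i];
        }
        return sum;
    }
}
```

For `List.of("Abc", "aB!")`, the returned array holds 2 at index 0 (`a`), 2 at index 1 (`b`), and 1 at index 2 (`c`). Because each text gets its own array and `add` builds a new one, no synchronization is needed.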
49 changes: 49 additions & 0 deletions
...cises/practice/parallel-letter-frequency/.approaches/parallel-stream/content.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| # `parallelStream` | ||
|
|
||
| ```java | ||
| import java.util.Map; | ||
| import java.util.List; | ||
| import java.util.concurrent.ConcurrentMap; | ||
| import java.util.concurrent.ConcurrentHashMap; | ||
|
|
||
| class ParallelLetterFrequency { | ||
|
|
||
| List<String> texts; | ||
| ConcurrentMap<Character, Integer> letterCount; | ||
|
|
||
| ParallelLetterFrequency(String[] texts) { | ||
| this.texts = List.of(texts); | ||
| letterCount = new ConcurrentHashMap<>(); | ||
| } | ||
|
|
||
| Map<Character, Integer> countLetters() { | ||
| if (texts.isEmpty()) { | ||
| return letterCount; | ||
| } | ||
| texts.parallelStream().forEach(text -> { | ||
| for (char c: text.toLowerCase().toCharArray()) { | ||
| if (Character.isAlphabetic(c)) { | ||
| letterCount.merge(c, 1, Integer::sum); | ||
| } | ||
| } | ||
| }); | ||
| return letterCount; | ||
| } | ||
|
|
||
| } | ||
| ``` | ||
|
|
||
| Because [`ConcurrentHashMap`][ConcurrentHashMap] performs each `merge` call atomically, the frequency counts can be updated safely from multiple threads. | ||
|
|
||
| If there are no strings to process, an early return avoids unnecessary computation. | ||
|
|
||
| To calculate letter frequency, a parallel stream is used. | ||
| The [`Character.isAlphabetic`][isAlphabetic] method returns `true` for every character that Unicode classifies as alphabetic, which covers letters from many languages, such as English, Korean, Japanese, and Chinese. | ||
| For non-alphabetic characters, including digits, special characters, and spaces, it returns `false`. | ||
|
|
||
| Since we treat uppercase and lowercase letters as the same character (e.g., `A` and `a`), characters are converted to lowercase. | ||
|
|
||
| After updating letter frequencies, the final map is returned. | ||
|
|
||
| [ConcurrentHashMap]: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html | ||
| [isAlphabetic]: https://docs.oracle.com/javase/8/docs/api/java/lang/Character.html#isAlphabetic-int- |
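
To make the filtering step concrete, here is a small sketch of what lowercasing followed by `Character.isAlphabetic` keeps and drops. `AlphabeticFilter` and `lettersOnly` are hypothetical names used only for illustration. The sample string contains two Hangul syllables written as Unicode escapes so that the source file stays ASCII.

```java
// Hypothetical helper: keeps only the characters the approach would count.
class AlphabeticFilter {

    static String lettersOnly(String text) {
        StringBuilder letters = new StringBuilder();
        for (char c : text.toLowerCase().toCharArray()) {
            // The char is widened to an int code point for isAlphabetic(int).
            if (Character.isAlphabetic(c)) {
                letters.append(c);
            }
        }
        return letters.toString();
    }

    public static void main(String[] args) {
        // Punctuation, spaces, and digits are dropped; Latin and Hangul letters stay.
        System.out.println(lettersOnly("Hi, \uD55C\uAE00 42!"));
    }
}
```

One limitation of iterating over `char` values, here and in the approach: characters outside the Basic Multilingual Plane are split into surrogate pairs, and the individual surrogates are not alphabetic, so such letters are not counted.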
7 changes: 7 additions & 0 deletions
exercises/practice/parallel-letter-frequency/.approaches/parallel-stream/snippet.txt
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| texts.parallelStream().forEach(text -> { | ||
| for (char c: text.toLowerCase().toCharArray()) { | ||
| if (Character.isAlphabetic(c)) { | ||
| letterCount.merge(c, 1, Integer::sum); | ||
| } | ||
| } | ||
| }); |
Suggest clarifying that `compute` is called through `invokeAll`. Your example only shows `invokeAll` being called, not `compute`, so it might be confusing. Also, suggest putting the `LetterCountTask` and `compute` in backticks for formatting.
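
The point of this comment can be illustrated with a small sketch that counts how often `compute` runs even though the code only ever calls `invokeAll` explicitly. `ComputeCallCounter` and `RangeTask` are hypothetical names, and the task does no real work; it only splits a range the same way `LetterCountTask` does. The sketch assumes `threshold` is at least 1, otherwise the splitting would not terminate.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: records every compute() invocation in a shared counter.
class ComputeCallCounter {

    static int countComputeCalls(int size, int threshold) {
        AtomicInteger calls = new AtomicInteger();
        ForkJoinPool pool = new ForkJoinPool();
        try {
            pool.invoke(new RangeTask(0, size, threshold, calls));
        } finally {
            pool.shutdown();
        }
        return calls.get();
    }

    private static class RangeTask extends RecursiveAction {
        private final int start;
        private final int end;
        private final int threshold;
        private final AtomicInteger calls;

        RangeTask(int start, int end, int threshold, AtomicInteger calls) {
            this.start = start;
            this.end = end;
            this.threshold = threshold;
            this.calls = calls;
        }

        @Override
        protected void compute() {
            calls.incrementAndGet();
            if (end - start <= threshold) {
                return; // leaf task: the real approach would count letters here
            }
            int mid = (start + end) / 2;
            // No direct compute() call: invokeAll executes both subtasks,
            // and executing a subtask means running its compute().
            invokeAll(new RangeTask(start, mid, threshold, calls),
                      new RangeTask(mid, end, threshold, calls));
        }
    }
}
```

For a range of 40 items and a threshold of 10, `countComputeCalls(40, 10)` returns 7: one root task, two halves, and four leaf tasks.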