Conversation


@quadri00 quadri00 commented Dec 2, 2025

CSV stream pushed to main for review.


@edsonesf edsonesf left a comment


IMHO, the current implementation does not meet the goals of the class/methods defined in the design document.

Blackboard: Team_5_GroupProjectProposal_2025_as_submitted.docx

@@ -0,0 +1,1001 @@
id,name,email

Do you need to push this output file to the repo?
It does not look related or required.


IntStream.rangeClosed(1, 1000).forEach(i -> {
    try {
        String line = i + ",User_" + i + ",user" + i + "@mail.com";

Why hard code it?

The stream could process anything, not only this use case.
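To avoid hard-coding the row layout, the generation logic could take the row formatter as a parameter, so the same loop can produce any record shape. A minimal sketch (the `generateRows` helper and the `RowGeneratorDemo` class are hypothetical, not part of the repo):

```java
import java.util.List;
import java.util.function.IntFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class RowGeneratorDemo {

    // Hypothetical helper: the caller supplies the row format,
    // instead of baking "id,name,email" into the loop body.
    static List<String> generateRows(int count, IntFunction<String> rowFormatter) {
        return IntStream.rangeClosed(1, count)
                .mapToObj(rowFormatter)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The original use case becomes just one possible formatter.
        List<String> rows = generateRows(3, i -> i + ",User_" + i + ",user" + i + "@mail.com");
        rows.forEach(System.out::println);
    }
}
```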

writer.write("id,name,email");
writer.newLine();

IntStream.rangeClosed(1, 1000).forEach(i -> {

We should not generate a large CSV file here.
The repository already has many sample CSV files at main/src/main/resources/sample-csvs

Check these:

  • snakes_count_10000.csv
  • hw_25000.csv
  • mlb_players.csv


import static org.junit.jupiter.api.Assertions.*;

class CsvStreamTest {

Tests are empty
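As a starting point for filling in CsvStreamTest, a minimal smoke test of the streaming behaviour might look like this (plain Java with explicit checks for self-containment; in the repo it would use the JUnit 5 assertions imported above, and the class name and fixture here are hypothetical):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CsvStreamSmokeTest {
    public static void main(String[] args) throws IOException {
        // Write a tiny CSV fixture to a temp file.
        Path tmp = Files.createTempFile("csv-stream-test", ".csv");
        Files.write(tmp, List.of("id,name,email", "1,User_1,user1@mail.com"));

        // Stream the rows lazily, skipping the header line.
        try (Stream<String> rows = Files.lines(tmp)) {
            List<String> data = rows.skip(1).collect(Collectors.toList());
            if (data.size() != 1) throw new AssertionError("expected one data row");
            if (!data.get(0).startsWith("1,User_1")) throw new AssertionError("unexpected row");
        } finally {
            Files.delete(tmp);
        }
        System.out.println("smoke test passed");
    }
}
```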


@bogdansharp bogdansharp left a comment


A good start!

I suggest integrating your CsvStream class with the existing CSV library. Its main role is to be a lightweight wrapper around CsvReader and provide a Java Stream<Row> interface to make processing rows easy.

A minimal recommended API could look like this:

public final class CsvStream implements Closeable {

    private CsvStream(CsvReader reader);

    /** Wraps an existing CsvReader into a CsvStream */
    public static CsvStream from(CsvReader reader);

    /** Opens a CSV file with the given configuration */
    public static CsvStream fromPath(Path path, CsvConfig config) throws Exception;

    /** Opens a CSV file with configuration and explicit headers */
    public static CsvStream fromPath(Path path, CsvConfig config, Headers headers) throws Exception;

    /** Returns a Stream of Row objects for lazy processing */
    public Stream<Row> stream();

    /** Closes the underlying CsvReader */
    public void close();
}

How a user would use CsvStream:

Path file = Paths.get("data.csv");
CsvConfig config = CsvConfig.builder().setHasHeader(true).build();

try (CsvStream csvStream = CsvStream.fromPath(file, config)) {
    csvStream.stream()
             .filter(row -> row.get("age").asInt() > 30)
             .forEach(row -> System.out.println(row.get("name")));
}
// CsvStream automatically closes the underlying CsvReader

Key Points for Implementation:

  • CsvStream should wrap a CsvReader internally.
  • stream() should lazily read rows so you don’t load the entire file into memory.
  • close() should release resources from the underlying CsvReader.
  • Use try-with-resources when consuming CsvStream to ensure proper cleanup.
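The lazy-reading and cleanup points above can be sketched as follows. Since CsvReader's API is not shown in this thread, BufferedReader.lines() stands in for it, and the LazyCsvStream name and its methods are illustrative only:

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Minimal sketch of the lazy-streaming idea; BufferedReader plays the
// role of the project's CsvReader, and String lines play the role of Row.
public final class LazyCsvStream implements Closeable {
    private final BufferedReader reader; // stand-in for CsvReader

    private LazyCsvStream(BufferedReader reader) {
        this.reader = reader;
    }

    /** Opens the file; nothing is read until the stream is consumed. */
    public static LazyCsvStream fromPath(Path path) throws IOException {
        return new LazyCsvStream(Files.newBufferedReader(path));
    }

    /** Rows are pulled lazily, so the whole file is never held in memory. */
    public Stream<String> stream() {
        // onClose ties cleanup to the returned Stream as well,
        // so try-with-resources on either object releases the reader.
        return reader.lines().onClose(this::closeQuietly);
    }

    @Override
    public void close() throws IOException {
        reader.close();
    }

    private void closeQuietly() {
        try {
            reader.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Used with try-with-resources, closing the LazyCsvStream (or the Stream it returns) releases the underlying reader, matching the cleanup rule in the last bullet.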

bogdansharp previously approved these changes Dec 3, 2025

edsonesf commented Dec 3, 2025

Working on it, continuing from @quadri00's version and implementing @bogdansharp's suggestions.

- Wraps CsvReader for lazy, memory-efficient streaming
- Implements Closeable for resource management
- Factory methods: from(CsvReader), fromPath(Path, CsvConfig), fromPath(Path, CsvConfig, Headers)
- Supports all Stream operations (filter, map, limit, forEach, count, collect)
- 9 comprehensive unit tests covering all functionality
- Extensive Javadoc with usage examples
- Implements all suggestions from Bogdan's code review
- All tests pass, build successful

edsonesf commented Dec 3, 2025

Lots of commits too; let me squash them.


edsonesf commented Dec 3, 2025

Local build passes:

mvn test
...
[INFO] Results:
[INFO] 
[INFO] Tests run: 447, Failures: 0, Errors: 0, Skipped: 0
[INFO] 
[INFO] 
[INFO] --- jacoco:0.8.12:report (report) @ csv-data-processor ---
[INFO] Loading execution data file /home/edson/Documents/ATU/SoftwareDevelopment/ATU-SoftDev-Grp5Project/target/jacoco.exec
[INFO] Analyzed bundle 'csv-data-processor' with 50 classes
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  3.083 s
[INFO] Finished at: 2025-12-03T23:34:05Z
[INFO] ------------------------------------------------------------------------

@edsonesf edsonesf assigned edsonesf and quadri00 and unassigned edsonesf Dec 3, 2025
@MichaelMcKibbin MichaelMcKibbin merged commit 0b0ed24 into main Dec 3, 2025
8 checks passed
@MichaelMcKibbin MichaelMcKibbin deleted the feature/csvstream branch December 3, 2025 23:40
@bogdansharp bogdansharp mentioned this pull request Dec 4, 2025