Skip to content

Conversation

@AndreiKingsley
Copy link
Collaborator

@AndreiKingsley AndreiKingsley commented Nov 13, 2025

Helps #527 (or closes it if we finally decided to keep both File and Path).

  • Now all public IO methods have Path overload (which does not depend on the overload of File or its implementation!).
  • Tests should be added for ALL IO overloads, so I moved it to Add tests for IO overloads #1564.

@AndreiKingsley AndreiKingsley marked this pull request as ready for review November 13, 2025 09:35
Copilot finished reviewing on behalf of AndreiKingsley November 13, 2025 09:42
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds Path overloads to all public IO methods across the DataFrame library, progressively migrating from File to the more modern java.nio.file.Path API. The changes maintain backward compatibility by keeping existing File overloads that delegate to the new Path implementations.

  • Introduces Path parameter support for all read/write operations across JSON, CSV, TSV, Excel, Arrow, JDBC, Geo, and OpenAPI modules
  • Updates the SupportedDataFrameFormat interface to make Path the primary method with File as a backward-compatible default
  • Changes SupportedFormatSample.DataFile to use Path internally (renamed field from sampleFile to sampleFilePath)

Reviewed Changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
dataframe-openapi-generator/.../isOpenApi.kt Replaces File parameter with Path in isOpenApi function
dataframe-openapi-generator/.../OpenApi.kt Updates to use sampleFilePath instead of sampleFile
dataframe-openapi-generator/api/...api Adds Path overload to public API
dataframe-json/.../json.kt Adds Path overloads for JSON read/write with delegation to URL-based implementation
dataframe-json/api/...api Exposes new Path-based JSON methods in public API
dataframe-jdbc/.../Jdbc.kt Adds Path overload (TODO placeholder implementation)
dataframe-jdbc/api/...api Adds Path method signature to public API
dataframe-geo/.../write.kt Implements Path-based GeoJSON and Shapefile writing using NIO APIs
dataframe-excel/.../xlsx.kt Adds Path overloads for Excel read/write operations
dataframe-excel/api/...api Exposes Excel Path methods in public API
dataframe-csv/.../csv.kt Adds Path overload delegating to File for backward compatibility
dataframe-csv/.../tsv.kt Adds Path overload delegating to File for backward compatibility
dataframe-csv/api/...api Adds Path methods to CSV/TSV public APIs
dataframe-arrow/.../arrowReading.kt Implements Path-based Arrow IPC and Feather reading
dataframe-arrow/.../arrowWriting.kt Implements Path-based Arrow IPC and Feather writing
dataframe-arrow/.../ArrowWriter.kt Adds Path support to ArrowWriter interface with proper NIO usage
dataframe-arrow/api/...api Exposes Arrow Path methods in public API
core/.../importDataSchema.kt Adds Path constructor and functions for schema import
core/.../guess.kt Refactors SupportedDataFrameFormat interface to use Path as primary, adds Path-based read methods
core/.../csv.kt Adds Path overload delegating to File with explanatory comment
core/.../tsv.kt Adds Path overload delegating to File with explanatory comment
core/api/core.api Updates public API with Path methods and interface changes

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


override fun readDataFrame(file: File, header: List<String>): AnyFrame = DataFrame.readExcel(file)

override fun readDataFrame(path: Path, header: List<String>): AnyFrame = DataFrame.readExcel(path.toFile())
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Path overload converts back to File via toFile(), which defeats the purpose of adding Path support. Consider adding a proper Path-based implementation that directly reads from the Path, similar to how other modules (Arrow, JSON, Geo) handle Path parameters. If conversion is necessary temporarily, add a comment explaining why (like in CSV/TSV modules).

Suggested change
override fun readDataFrame(path: Path, header: List<String>): AnyFrame = DataFrame.readExcel(path.toFile())
override fun readDataFrame(path: Path, header: List<String>): AnyFrame =
DataFrame.readExcel(path.inputStream(), header)

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hahah good advice!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where's readExcel(path)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems you only added writeExcel(path)

@AndreiKingsley AndreiKingsley changed the title File to path Path overloads Nov 13, 2025
Copy link
Collaborator

@Jolanrensen Jolanrensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Way better than the previous pr :) couple of notes


override fun readDataFrame(file: File, header: List<String>): AnyFrame = DataFrame.readExcel(file)

override fun readDataFrame(path: Path, header: List<String>): AnyFrame = DataFrame.readExcel(path.toFile())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hahah good advice!


override fun readDataFrame(file: File, header: List<String>): AnyFrame = DataFrame.readExcel(file)

override fun readDataFrame(path: Path, header: List<String>): AnyFrame = DataFrame.readExcel(path.toFile())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where's readExcel(path)?


override fun readDataFrame(file: File, header: List<String>): AnyFrame = DataFrame.readExcel(file)

override fun readDataFrame(path: Path, header: List<String>): AnyFrame = DataFrame.readExcel(path.toFile())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems you only added writeExcel(path)

// Write to an existing file with `keepFile` flag
if (keepFile && file.exists() && file.length() > 0L) {
val fis = file.inputStream()
if (keepFile && Files.exists(path) && Files.size(path) > 0L) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use Kotlin overloads

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, thank you, they are really nice!
For some reason, AI models don't know about them, and never suggest them!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, there's no Kotlin extensions for some specific methods like newByteChannel

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you could use Path("").fileSystem.provider().newByteChannel() but it's a bit longer

@AndreiKingsley AndreiKingsley merged commit 05235f4 into master Nov 19, 2025
5 checks passed
@AndreiKingsley AndreiKingsley deleted the file_to_path branch November 19, 2025 14:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants