Commit 7810325

feat: add length unit support in FileSystem limits (#781)
* feat: add length unit support in FileSystem limits

  Different filesystems and operating systems measure file and path lengths in different units:

  * macOS and Windows filesystems typically count **UTF-16 code units**.
  * Linux and other UNIX filesystems typically count **bytes**.

  This change introduces explicit unit support so these limits can be interpreted consistently.

  ### Key changes

  * **New API**
    * Added a `LengthUnit` enum and `FileSystem.getLengthUnit()` to expose the unit of measure used by `getMaxFileNameLength()` and `getMaxPathLength()`.
    * Added new overloads for `isLegalFileName` and `toLegalFileName` that accept a `Charset`, making conversions between bytes and UTF-16 explicit.
  * **Adjusted defaults**
    * Reduced the `GENERIC` filesystem defaults:
      * File name length → **1020 bytes** (covers 255 UTF-16 characters encoded as up to 3 UTF-8 bytes).
      * Path length → **1 MiB** (covers 32,767 UTF-16 code units, again at 3 UTF-8 bytes each).
  * **Testing**
    * Added unit tests to validate the new API and updated limits.

* fix: move name-length handling into `NameLengthStrategy`

  * Refactors comparison and truncation logic into `LengthUnit`, renamed to `NameLengthStrategy`.
  * Makes the `NameLengthStrategy` value internal-only.
  * Improves Javadoc for `getMaxFileNameLength` and `getMaxPathLength` to clarify that staying within the reported limit is necessary but not sufficient for a name or path to be valid on all filesystems.
* fix: Javadoc of `getMaxPathLength`
* fix: do not truncate extension
* fix: checkstyle violations
* fix: make `nameLengthStrategy`
* fix: make `nameLengthStrategy` private (2)
* Fix PMD
* fix: rename `UTF16_CHARS` -> `UTF16_CODE_UNITS`
* fix: simplify `truncate`
* fix: switch MacOS to bytes
* fix: `testMaxNameLength_MatchesRealSystem` test
* fix: `testMaxNameLength_MatchesRealSystem`
* fix: improve `truncate` tests
* fix: try fix macOS tests
* fix: add support for grapheme clusters
* fix: tests on JDK 19 or earlier

---------

Co-authored-by: Gary Gregory <[email protected]>
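
To illustrate why the unit of measure matters, here is a minimal, JDK-only sketch. It is not the Commons IO implementation: the class `NameLengthDemo` and its methods are hypothetical names chosen for this example. It measures the same name in UTF-16 code units and in encoded bytes, and it truncates to a byte budget at code point boundaries only. It assumes a stateless charset such as UTF-8 and does not attempt the grapheme-cluster or extension handling mentioned in the commit list above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo class; not part of Commons IO. */
public class NameLengthDemo {

    /** Length in UTF-16 code units, which is what String.length() reports. */
    static int utf16CodeUnits(String name) {
        return name.length();
    }

    /** Length in bytes once the name is encoded with the given charset. */
    static int byteLength(String name, Charset charset) {
        return name.getBytes(charset).length;
    }

    /**
     * Returns the longest prefix of name that encodes to at most maxBytes,
     * cutting only between code points so a surrogate pair is never split.
     * Assumption: the charset is stateless (for example UTF-8), so encoding
     * each code point separately gives its true contribution.
     */
    static String truncateToBytes(String name, int maxBytes, Charset charset) {
        int used = 0;
        int end = 0;
        while (end < name.length()) {
            int codePoint = name.codePointAt(end);
            int size = new String(Character.toChars(codePoint)).getBytes(charset).length;
            if (used + size > maxBytes) {
                break; // the next code point would exceed the byte budget
            }
            used += size;
            end += Character.charCount(codePoint);
        }
        return name.substring(0, end);
    }

    public static void main(String[] args) {
        // "café-" + U+1F600 (a surrogate pair in UTF-16) + ".txt"
        String name = "caf\u00e9-\uD83D\uDE00.txt";
        System.out.println("UTF-16 code units: " + utf16CodeUnits(name));
        System.out.println("UTF-8 bytes:       " + byteLength(name, StandardCharsets.UTF_8));
        System.out.println("Cut to 8 bytes:    " + truncateToBytes(name, 8, StandardCharsets.UTF_8));
    }
}
```

The same name can therefore fit a limit expressed in UTF-16 code units and still exceed one expressed in bytes, which is the ambiguity an explicit length unit is meant to remove.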
1 parent cd20ece commit 7810325

3 files changed: +652 -108 lines changed


src/changes/changes.xml

Lines changed: 1 addition & 0 deletions
@@ -60,6 +60,7 @@ The <action> type attribute can be add,update,fix,remove.
 <action dev="ggregory" type="add" due-to="Gary Gregory">Add org.apache.commons.io.output.ProxyOutputStream.writeRepeat(byte[], int, int, long).</action>
 <action dev="ggregory" type="add" due-to="Gary Gregory">Add org.apache.commons.io.output.ProxyOutputStream.writeRepeat(byte[], long).</action>
 <action dev="ggregory" type="add" due-to="Gary Gregory">Add org.apache.commons.io.output.ProxyOutputStream.writeRepeat(int, long).</action>
+<action dev="pkarwasz" type="add" due-to="Piotr P. Karwasz">Add length unit support in FileSystem limits.</action>
 <action dev="pkarwasz" type="add" due-to="Piotr P. Karwasz">Add IOUtils.toByteArray(InputStream, int, int) for safer chunked reading with size validation.</action>
 <!-- UPDATE -->
 <action type="update" dev="ggregory" due-to="Gary Gregory, Dependabot">Bump org.apache.commons:commons-parent from 85 to 87 #774.</action>
