Commit 7810325
feat: add length unit support in FileSystem limits (#781)
* feat: add length unit support in FileSystem limits
Different filesystems and operating systems measure file and path lengths in different units:
* macOS and Windows filesystems typically count **UTF-16 code units**.
* Linux and other UNIX filesystems typically count **bytes**.
This change introduces explicit unit support so these limits can be interpreted consistently.
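To illustrate why the unit matters, here is a minimal sketch using only the JDK. It measures the same names in UTF-16 code units and in UTF-8 bytes. The class `NameLengthDemo` and its methods are hypothetical helpers for illustration and are not part of Commons IO.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of Commons IO): measures one name in both units. */
final class NameLengthDemo {

    /** Length in UTF-16 code units, which is what String.length() reports. */
    static int utf16CodeUnits(String name) {
        return name.length();
    }

    /** Length in bytes once the name is encoded as UTF-8. */
    static int utf8Bytes(String name) {
        return name.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // U+00E9 is 1 code unit but 2 UTF-8 bytes.
        // U+1F600 is 2 code units (a surrogate pair) and 4 UTF-8 bytes.
        String[] names = {"report.txt", "r\u00e9sum\u00e9.txt", "\uD83D\uDE00.txt"};
        for (String name : names) {
            System.out.println(utf16CodeUnits(name) + " code units, " + utf8Bytes(name) + " bytes");
        }
    }
}
```

A name made only of ASCII characters has the same length in both units, so the difference only shows up with non-ASCII names.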
### Key changes
* **New API**
  * Added a `LengthUnit` enum and `FileSystem.getLengthUnit()` to expose the unit of measure used by `getMaxFileNameLength()` and `getMaxPathLength()`.
  * Added new overloads for `isLegalFileName` and `toLegalFileName` that accept a `Charset`, making conversions between bytes and UTF-16 explicit.
* **Adjusted defaults**
  * Reduced the `GENERIC` filesystem defaults:
    * File name length → **1020 bytes** (covers 255 UTF-16 characters encoded as up to 3 UTF-8 bytes).
    * Path length → **1 MiB** (covers 32,767 UTF-16 code units, again at 3 UTF-8 bytes each).
* **Testing**
  * Added unit tests to validate the new API and updated limits.
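As a rough check of the `GENERIC` defaults, the sketch below encodes names with an explicit `Charset` and compares the byte count to a limit. The commit message does not show the signatures of the new `Charset` overloads, so `ByteLimitCheck` and `fitsInBytes` are hypothetical stand-ins written against the plain JDK, not the Commons IO API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch (not the Commons IO implementation). */
final class ByteLimitCheck {

    /** GENERIC defaults as quoted in the commit message. */
    static final int MAX_NAME_BYTES = 1020;
    static final int MAX_PATH_BYTES = 1024 * 1024; // 1 MiB

    /** Returns true if the name, encoded with the given charset, fits in maxBytes. */
    static boolean fitsInBytes(CharSequence name, Charset charset, int maxBytes) {
        return name.toString().getBytes(charset).length <= maxBytes;
    }

    /** Builds a string made of count copies of one code point. */
    static String repeat(int codePoint, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.appendCodePoint(codePoint);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+20AC is one UTF-16 code unit and three UTF-8 bytes.
        String name = repeat(0x20AC, 255);   // 255 * 3 = 765 bytes
        String path = repeat(0x20AC, 32767); // 32767 * 3 = 98301 bytes
        System.out.println(fitsInBytes(name, StandardCharsets.UTF_8, MAX_NAME_BYTES));
        System.out.println(fitsInBytes(path, StandardCharsets.UTF_8, MAX_PATH_BYTES));
    }
}
```

Passing the `Charset` explicitly keeps the conversion between code units and bytes visible at the call site instead of relying on the platform default encoding.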
* fix: move name-length handling into `NameLengthStrategy`
* Moves the comparison and truncation logic into `LengthUnit`, which is renamed to `NameLengthStrategy`.
* Makes the `NameLengthStrategy` value internal-only.
* Improves Javadoc for `getMaxFileNameLength` and `getMaxPathLength` to clarify that staying within the reported limit is necessary but not sufficient for a name or path to be valid on all filesystems.
* fix: Javadoc of `getMaxPathLength`
* fix: do not truncate extension
* fix: checkstyle violations
* fix: make `nameLengthStrategy`
* fix: make `nameLengthStrategy` private (2)
* Fix PMD
* fix: rename `UTF16_CHARS` -> `UTF16_CODE_UNITS`
* fix: simplify `truncate`
* fix: switch macOS to bytes
* fix: `testMaxNameLength_MatchesRealSystem` test
* fix: `testMaxNameLength_MatchesRealSystem`
* fix: improve `truncate` tests
* fix: try fix macOS tests
* fix: add support for grapheme clusters
* fix: tests on JDK 19 or earlier
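The truncation-related fixes above (keeping the extension, simplifying `truncate`) can be sketched roughly as follows. This is an assumption-laden illustration, not the code from this commit: `TruncateSketch.truncate` is a hypothetical helper that shortens only the base name, steps back one whole code point at a time so a surrogate pair is never split, and leaves the extension untouched. It does not attempt the grapheme-cluster handling mentioned in the commit list.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch (not the NameLengthStrategy code from this commit). */
final class TruncateSketch {

    static int byteLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    /**
     * Shortens the base name, one whole code point at a time, until
     * base + extension fits in maxBytes. The extension is never shortened.
     */
    static String truncate(String name, Charset charset, int maxBytes) {
        if (byteLength(name, charset) <= maxBytes) {
            return name; // already within the limit
        }
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        String ext = dot > 0 ? name.substring(dot) : "";
        int budget = maxBytes - byteLength(ext, charset);
        int end = base.length();
        while (end > 0 && byteLength(base.substring(0, end), charset) > budget) {
            // Step back by one code point so a surrogate pair is never split.
            end = base.offsetByCodePoints(end, -1);
        }
        if (end == 0) {
            throw new IllegalArgumentException("Limit too small to keep the extension: " + maxBytes);
        }
        return base.substring(0, end) + ext;
    }

    public static void main(String[] args) {
        System.out.println(truncate("abcdef.txt", StandardCharsets.UTF_8, 8));
    }
}
```

Recomputing the encoded length on every step is quadratic in the worst case; that is acceptable for a sketch over short file names, but a real implementation would likely track the byte count incrementally.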
---------
Co-authored-by: Gary Gregory <[email protected]>

1 parent cd20ece, commit 7810325
File tree: 3 files changed, +652 -108 lines changed

- src
  - changes
  - main/java/org/apache/commons/io
  - test/java/org/apache/commons/io