Add EXPath File Module 4.0 alongside native file module #6084
Open
joewiz wants to merge 4 commits into eXist-db:develop
Migration guide: eXist-db native
| Old `file:` function (`http://exist-db.org/xquery/file`) | New EXPath `file:` equivalent (`http://expath.org/ns/file`) | Notes |
|---|---|---|
| `file:exists($path)` → `xs:boolean` | `file:exists($path)` | Parameter type changes from `item()` to `xs:string` |
| `file:is-directory($path)` → `xs:boolean` | `file:is-dir($path)` | Renamed |
| `file:is-readable($path)` → `xs:boolean` | (no equivalent) | Not in EXPath spec |
| `file:is-writeable($path)` → `xs:boolean` | (no equivalent) | Not in EXPath spec |
| `file:read($path)` → `xs:string?` | `file:read-text($path)` | Renamed; EXPath normalizes newlines (CR/CRLF → LF) per spec |
| `file:read($path, $enc)` → `xs:string?` | `file:read-text($path, $enc)` | Renamed |
| `file:read-unicode($path)` → `xs:string?` | `file:read-text($path)` | Old function stripped BOM; EXPath `read-text` handles encoding transparently |
| `file:read-unicode($path, $enc)` → `xs:string?` | `file:read-text($path, $enc)` | Same as above |
| `file:read-binary($path)` → `xs:base64Binary?` | `file:read-binary($path)` | Same name; EXPath adds optional `$offset`/`$length` overloads |
| `file:serialize($nodes, $path, $params)` → `xs:boolean?` | `file:write($path, $nodes, $params)` | Renamed; parameter order changed (path first); returns `empty-sequence()`, and errors raise exceptions |
| `file:serialize($nodes, $path, $params, $append)` → `xs:boolean?` | `file:append($path, $nodes, $params)` | Use `file:append` instead of the `$append` flag |
| `file:serialize-binary($data, $path)` → `xs:boolean` | `file:write-binary($path, $data)` | Renamed; parameter order changed; returns `empty-sequence()` |
| `file:serialize-binary($data, $path, $append)` → `xs:boolean` | `file:append-binary($path, $data)` | Use `file:append-binary` instead of the `$append` flag |
| `file:delete($path)` → `xs:boolean` | `file:delete($path)` | Returns `empty-sequence()`; EXPath adds optional `$recursive` overload |
| `file:move($source, $dest)` → `xs:boolean` | `file:move($source, $target)` | Returns `empty-sequence()` |
| `file:mkdir($path)` → `xs:boolean` | `file:create-dir($path)` | Renamed; EXPath always creates parent dirs (like old `file:mkdirs`) |
| `file:mkdirs($path)` → `xs:boolean` | `file:create-dir($path)` | Renamed; same behavior |
| `file:list($path)` → `node()*` | `file:list($path)` | Different return type: old returned XML elements, EXPath returns `xs:string*` relative paths with a trailing separator on dirs |
| `file:directory-list($path, $pattern)` → `node()?` | `file:list($path, true(), $pattern)` | Use 3-argument `file:list` with `$recursive := true()`. Returns `xs:string*` instead of XML |
| `file:sync($collection, $target, $options)` → `document-node()` | `util:file-sync($collection, $target, $options)` | Moved to util module (`http://exist-db.org/xquery/util`); same signature and behavior |
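To make the changed return shape of `file:list` concrete, here is a minimal Java sketch (Java being the language the module is implemented in). It is not the PR's code: `ListSketch`, `list`, and `demoTree` are hypothetical names, the `$pattern` glob is left out, and the sorting is only there to make the output deterministic. It assumes, as the table notes say, that the result is a flat sequence of paths relative to the listed directory, with directories marked by a trailing separator.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not the PR's implementation: mimics the shape of the
// new file:list result (relative paths, trailing separator on directories).
final class ListSketch {

    static List<String> list(Path dir, boolean recursive) {
        // Files.list yields direct children; Files.walk also yields descendants.
        try (Stream<Path> entries = recursive ? Files.walk(dir) : Files.list(dir)) {
            return entries
                    .filter(p -> !p.equals(dir)) // Files.walk includes the start dir itself
                    .map(p -> {
                        String rel = dir.relativize(p).toString();
                        return Files.isDirectory(p) ? rel + File.separator : rel;
                    })
                    .sorted() // sorting is an assumption, used only for stable output
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a small throwaway tree: a.txt, sub/, sub/b.txt
    static Path demoTree() {
        try {
            Path root = Files.createTempDirectory("list-sketch");
            Files.createDirectory(root.resolve("sub"));
            Files.createFile(root.resolve("a.txt"));
            Files.createFile(root.resolve("sub").resolve("b.txt"));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = demoTree();
        System.out.println(list(root, false)); // direct children only
        System.out.println(list(root, true));  // children and descendants
    }
}
```

Code that previously walked the XML elements returned by the old `file:list` would instead iterate over plain strings like these and test for the trailing separator to tell directories from files.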
New functions with no old equivalent
| New EXPath `file:` function | Description |
|---|---|
| `file:is-file($path as xs:string) as xs:boolean` | Tests whether a path points to a regular file |
| `file:is-absolute($path as xs:string) as xs:boolean` | Tests whether a path is absolute |
| `file:last-modified($path as xs:string) as xs:dateTime` | Returns the last modification time of a file or directory |
| `file:size($path as xs:string) as xs:integer` | Returns the byte size of a file, or 0 for a directory |
| `file:size($path, $recursive as xs:boolean?) as xs:integer` | Returns recursive size if `$recursive` is true |
| `file:read-text-lines($file as xs:string) as xs:string*` | Reads file contents as a sequence of lines (default UTF-8) |
| `file:read-text-lines($file, $encoding as xs:string?) as xs:string*` | Reads file as lines with specified encoding |
| `file:write-text($file as xs:string, $value as xs:string) as empty-sequence()` | Writes a string to a file |
| `file:write-text($file, $value, $encoding as xs:string?) as empty-sequence()` | Writes a string with specified encoding |
| `file:write-text-lines($file as xs:string, $values as xs:string*) as empty-sequence()` | Writes strings as lines separated by the platform line separator |
| `file:write-text-lines($file, $values, $encoding as xs:string?) as empty-sequence()` | Writes lines with specified encoding |
| `file:append-text($file as xs:string, $value as xs:string) as empty-sequence()` | Appends a string to a file |
| `file:append-text($file, $value, $encoding as xs:string?) as empty-sequence()` | Appends a string with specified encoding |
| `file:append-text-lines($file as xs:string, $lines as xs:string*) as empty-sequence()` | Appends strings as lines |
| `file:append-text-lines($file, $lines, $encoding as xs:string?) as empty-sequence()` | Appends lines with specified encoding |
| `file:copy($source as xs:string, $target as xs:string) as empty-sequence()` | Copies a file or directory (overwrites target if it exists) |
| `file:create-temp-dir($prefix as xs:string?, $suffix as xs:string?, $dir as xs:string?) as xs:string` | Creates a temporary directory |
| `file:create-temp-file($prefix as xs:string?, $suffix as xs:string?, $dir as xs:string?) as xs:string` | Creates a temporary file |
| `file:children($path as xs:string) as xs:string*` | Returns absolute paths of immediate children of a directory |
| `file:descendants($path as xs:string) as xs:string*` | Returns absolute paths of all descendants recursively |
| `file:list-roots() as xs:string*` | Returns the root directories of the file system |
| `file:name($path as xs:string) as xs:string` | Returns the name (last segment) of a path |
| `file:parent($path as xs:string) as xs:string?` | Returns the parent directory of a path |
| `file:path-to-native($path as xs:string) as xs:string` | Returns the native, canonical path |
| `file:path-to-uri($path as xs:string) as xs:anyURI` | Returns the path as a `file://` URI |
| `file:resolve-path($path as xs:string) as xs:string` | Resolves a relative path against the current working directory |
| `file:resolve-path($path, $base as xs:string?) as xs:string` | Resolves against a base directory |
| `file:dir-separator() as xs:string` | Returns the OS directory separator (e.g., `/` or `\`) |
| `file:line-separator() as xs:string` | Returns the OS line separator (e.g., `\n` or `\r\n`) |
| `file:path-separator() as xs:string` | Returns the OS path separator (e.g., `:` or `;`) |
| `file:temp-dir() as xs:string` | Returns the system temporary directory path |
| `file:base-dir() as xs:string?` | Returns the base directory of the current query |
| `file:current-dir() as xs:string` | Returns the current working directory |
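For readers more familiar with the JDK than with EXPath, the following sketch shows standard-library calls that behave roughly like a few of the path and system-property functions in the table. It is illustrative only: `PathSketch` and its methods are hypothetical names, nothing here is taken from the PR's classes, and spec details (for example whether directory results carry a trailing separator) are not reproduced.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative only: JDK calls that behave similarly to some functions above.
final class PathSketch {

    // Roughly file:name: the last segment of the path ("" for a root).
    static String name(String path) {
        Path last = Paths.get(path).getFileName();
        return last == null ? "" : last.toString();
    }

    // Roughly file:parent: null stands in for the empty sequence.
    static String parent(String path) {
        Path parent = Paths.get(path).toAbsolutePath().getParent();
        return parent == null ? null : parent.toString();
    }

    // Roughly file:resolve-path with an explicit base directory.
    static String resolve(String path, String base) {
        return Paths.get(base).resolve(path).normalize().toString();
    }

    public static void main(String[] args) {
        System.out.println(name("/tmp/data/report.xml"));
        System.out.println(parent("/tmp/data/report.xml"));
        System.out.println(resolve("../logs/app.log", "/tmp/data"));
        System.out.println(File.separator);                        // cf. file:dir-separator
        System.out.println(File.pathSeparator);                    // cf. file:path-separator
        System.out.println(System.lineSeparator().length());       // cf. file:line-separator
        System.out.println(System.getProperty("java.io.tmpdir"));  // cf. file:temp-dir
    }
}
```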
🤖 Co-authored with Claude Code
… 4.0

Replace the custom file module (http://exist-db.org/xquery/file) with a spec-compliant EXPath File Module 4.0 (http://expath.org/ns/file) implementation, improving interoperability with other XQuery processors.

The new module implements all 35+ functions from the EXPath File Module 4.0 specification including file properties, I/O, manipulation, path utilities, and system properties. The old module's eXist-specific file:sync function is relocated to util:file-sync.

Includes 64 XQuery integration tests covering all major functions and error conditions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add missing function overloads:
- file:read-text/read-text-lines 3-arg $fallback form
- file:create-temp-dir/create-temp-file 2-arg form

Fix error codes per EXPath File 4.0 spec:
- file:copy/move raise file:no-dir when target parent missing
- file:create-dir raises file:exists when path component is a file
- file:read-binary rejects negative $length with file:out-of-range
- file:write-binary validates $offset against file size

Fix readBinary hang: replace BinaryValueFromInputStream (which uses CachingFilterInputStream/FilterInputStreamCacheMonitor infrastructure that prevents clean BrokerPool shutdown) with BinaryValueFromBinaryString. Reads file into byte[], base64-encodes, wraps in lightweight value type with no open handles and no-op close(). Tradeoff: ~2.4x memory for file content, acceptable for typical file module use cases.

Resolve relative paths against XQuery static base URI when set as a file: URI, falling back to JVM working directory.

Detect XML-illegal characters in read-text/read-text-lines: raise file:io-error by default, or replace with U+FFFD when $fallback=true.

QT4 XQTS expath-file: 183/190 (96.3%), 0 hangs, 0 errors.
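The readBinary change in this commit trades open stream handles for an eager copy. A minimal sketch of that eager pattern, using only the JDK, is below. The class and method names (`EagerBinaryRead`, `readAsBase64`, `base64Length`, `demo`) are hypothetical, and eXist-db's `BinaryValueFromBinaryString` is deliberately not used, so this shows the idea rather than the actual implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Sketch of the eager "read everything, then base64-encode" approach.
final class EagerBinaryRead {

    // Reads the whole file and returns its base64 text; no stream stays open.
    static String readAsBase64(Path file) {
        try {
            byte[] raw = Files.readAllBytes(file);
            return Base64.getEncoder().encodeToString(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Base64 emits 4 characters per 3 input bytes (rounded up, with padding).
    static long base64Length(long rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    // Writes a zero-filled temp file of the given size and returns the encoded length.
    static int demo(int size) {
        try {
            Path tmp = Files.createTempFile("eager-read", ".bin");
            try {
                Files.write(tmp, new byte[size]);
                return readAsBase64(tmp).length();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int size = 3_000;
        System.out.println("raw bytes:    " + size);
        System.out.println("base64 chars: " + demo(size));
    }
}
```

The encoded text alone is about 4/3 the size of the raw bytes, so holding both the byte array and the encoded string costs more than twice the file size, which is the kind of overhead the commit message describes as a tradeoff.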
Remove unused ENTRY_PARAM field, replace raw RuntimeException with IllegalStateException, fix parameter reassignment in noOtherNCNameAttribute(), and suppress NPathComplexity on list().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Change the EXPath File module to coexist alongside eXist's original file module rather than replacing it. This is a non-breaking, additive change suitable for a feature release.

- Change default prefix from "file" to "exfile" in ExpathFileModule.java (namespace URI unchanged: http://expath.org/ns/file)
- Restore original file module (extensions/modules/file/) from develop
- Register BOTH modules in conf.xml:
  - http://exist-db.org/xquery/file (original, unchanged)
  - http://expath.org/ns/file (EXPath File 4.0, new)
- Keep file:sync in original FileModule; util:file-sync also available
- Update all EXPath test files to use exfile: prefix
- Restore exist-distribution pom.xml dependency on exist-file
- Restore image module's original dependency on exist-file

Both module test suites pass:
- EXPath File: 108 tests, 0 failures
- Original File: 63 tests, 0 failures

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Add a spec-compliant EXPath File Module 4.0 implementation alongside eXist's original file module. Both modules are available simultaneously — the original is unchanged, and the EXPath module adds W3C-aligned file operations with interoperability across XQuery processors (Saxon, BaseX, etc.).
- `file:*`: eXist's original file module (`http://exist-db.org/xquery/file`), unchanged, fully backward compatible
- `exfile:*`: EXPath File 4.0 module (`http://expath.org/ns/file`), new, W3C-aligned, 96% XQTS compliance
- `file:sync` remains in the original module; `util:file-sync` is also available as a convenience alias

Prefix convention and migration timeline
The EXPath module uses the default prefix `exfile:` to avoid conflicts with the existing `file:` prefix. Explicit imports always work regardless of default prefix conventions.

- … `file:`, EXPath `exfile:`
- … `file:`, original gets `file-legacy:`

Details
New EXPath File Module (10 Java classes in `extensions/expath/src/main/java/org/expath/exist/file/`):

| Class | Functions or error codes |
|---|---|
| `FileProperties` | `exists`, `is-dir`, `is-file`, `is-absolute`, `last-modified`, `size` |
| `FileIO` | `read-text`, `read-text-lines`, `read-binary` (with offset/length) |
| `FileWrite` | `write`, `write-text`, `write-text-lines`, `write-binary` (with offset) |
| `FileAppend` | `append`, `append-binary`, `append-text`, `append-text-lines` |
| `FileManipulation` | `copy`, `move`, `delete` (recursive), `create-dir`, `create-temp-dir`, `create-temp-file`, `list` (with glob), `children`, `descendants`, `list-roots` |
| `FilePaths` | `name`, `parent`, `path-to-native`, `path-to-uri`, `resolve-path` |
| `FileSystemProperties` | `dir-separator`, `line-separator`, `path-separator`, `temp-dir`, `base-dir`, `current-dir` |
| `ExpathFileErrorCode` | `not-found`, `invalid-path`, `exists`, `no-dir`, `is-dir`, `is-relative`, `unknown-encoding`, `out-of-range`, `io-error` |
| `ExpathFileModuleHelper` | |
| `ExpathFileModule` | |

Security: all EXPath functions require DBA role. Every function in the EXPath module, including system properties (`exfile:dir-separator`, `exfile:temp-dir`, etc.), requires the calling user to have the DBA role, preventing unauthenticated users from learning anything about the server's filesystem.

Key capabilities beyond the original module:
- EXPath error codes (`exfile:not-found`, `exfile:no-dir`, etc.)
- `exfile:read-text` normalizes newlines (CR/CRLF → LF) per spec
- `exfile:read-text` / `exfile:read-text-lines` detect XML-illegal characters and raise `exfile:io-error`; the 3-arg `$fallback=true()` form replaces them with U+FFFD
- `exfile:read-binary` supports `$offset`/`$length` parameters
- `exfile:delete` supports `$recursive` parameter
- Relative paths resolve against the XQuery static base URI when it is set as a `file:` URI
- `exfile:is-absolute`, `exfile:children`, `exfile:descendants`, `exfile:list-roots`, `exfile:create-temp-dir`, `exfile:create-temp-file`, `exfile:write` (serialized), `exfile:append`, `exfile:write-text-lines`, `exfile:append-text-lines`, `exfile:append-binary`

XQTS Results
QT4 XQTS `expath-file` test set: 183/190 (96.3%), 0 errors, 0 hangs

Remaining 7 failures

- A carriage return character in the XML test catalog is normalized to LF by the XML parser before the XQuery engine sees it
- `exfile:exists("../sandpit")`: sandpit copied to temp dir
- `exfile:path-to-native("//test.txt")`: platform-specific (BaseX also fails identically)

All 7 failures are external to the eXist-db implementation.
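The `read-text` behavior listed under the key capabilities (CR/CRLF normalized to LF, XML-illegal characters either rejected or replaced with U+FFFD) can be sketched in plain Java as follows. This is a hypothetical illustration, not the PR's code: `ReadTextSketch`, `isXmlChar`, and `clean` are made-up names, the check uses the XML 1.0 `Char` production, and an `IllegalArgumentException` stands in for raising `exfile:io-error`.

```java
// Hypothetical sketch; names and structure are assumptions, not the PR's code.
final class ReadTextSketch {

    // XML 1.0 Char: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Normalizes newlines and handles XML-illegal characters.
    // fallback=false: reject illegal characters; fallback=true: replace them with U+FFFD.
    static String clean(String text, boolean fallback) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '\r') {
                // A lone CR or a CRLF pair becomes a single LF.
                if (i < text.length() && text.charAt(i) == '\n') {
                    i++;
                }
                out.append('\n');
            } else if (isXmlChar(cp)) {
                out.appendCodePoint(cp);
            } else if (fallback) {
                out.append('\uFFFD');
            } else {
                // Stand-in for the io-error raised by the real function.
                throw new IllegalArgumentException(
                        "XML-illegal character U+" + Integer.toHexString(cp).toUpperCase());
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(clean("line1\r\nline2\rline3", false).split("\n").length);
        System.out.println(clean("bad" + (char) 1 + "char", true));
    }
}
```

Because the normalization happens on the text itself, a test that expects a literal CR in the result cannot pass if the CR has already been turned into LF before it reaches the function, which is consistent with the catalog-related failures noted above.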
readBinary implementation note
`exfile:read-binary` reads the file into a `byte[]` via `Files.readAllBytes()` (or `RandomAccessFile.readFully()` for partial reads), base64-encodes it, and wraps it in `BinaryValueFromBinaryString`, a lightweight value type with no open file handles and a no-op `close()`.

The previous stream-backed approach (`BinaryValueFromInputStream`) used eXist-db's `CachingFilterInputStream`/`FilterInputStreamCacheMonitor` infrastructure, which prevented clean BrokerPool shutdown and caused deadlocks in the XQTS runner. The new approach trades ~2.4x memory for file content (raw bytes + base64 string) for zero resource leak risk, which is appropriate for the typical use cases of the EXPath File Module. Applications processing very large binary files should use streaming APIs (e.g., EXPath Binary Module) rather than loading entire files into XDM values.

Test plan
- … `exfile:` prefix)
- … `%test:pending`)
- … `ExpathFileTests` runner
- … `expath-file` test set: 183/190 (96.3%), 0 hangs
- … `import module namespace exfile="http://expath.org/ns/file"; exfile:exists("/tmp")`

🤖 Generated with Claude Code