0.8.0 (2023-05-31)
Breaking Changes
- Rename methods of FileConnection classes:
  - get_directory → resolve_dir
  - get_file → resolve_file
  - listdir → list_dir
  - mkdir → create_dir
  - rmdir → remove_dir

  The new names are more consistent. These methods were undocumented in previous versions, but they may already be in use, so this is a breaking change. (#36)
- Deprecate the onetl.core.FileFilter class and replace it with new classes:
  - onetl.file.filter.Glob
  - onetl.file.filter.Regexp
  - onetl.file.filter.ExcludeDir

  The old class will be removed in v1.0.0. (#43)
- Deprecate the onetl.core.FileLimit class and replace it with the new class onetl.file.limit.MaxFilesCount. The old class will be removed in v1.0.0. (#44)
- Change behavior of the BaseFileLimit.reset method. This method should now return self instead of None. The return value may be the same limit object or a copy; this is an implementation detail. (#44)
- Replaced FileDownloader options .filter and .limit with new options .filters and .limits.

  Before:

      FileDownloader(
          ...,
          filter=FileFilter(glob="*.txt", exclude_dir="/path"),
          limit=FileLimit(count_limit=10),
      )

  After:

      FileDownloader(
          ...,
          filters=[Glob("*.txt"), ExcludeDir("/path")],
          limits=[MaxFilesCount(10)],
      )

  This allows developers to implement their own filter and limit classes and combine them with existing ones. The old behavior is still supported, but it will be removed in v1.0.0. (#45)
- Removed the default value for FileDownloader.limits; users should pass the limits list explicitly. (#45)
- Move classes from module onetl.core:

      from onetl.core import DBReader
      from onetl.core import DBWriter
      from onetl.core import FileDownloader
      from onetl.core import FileUploader
      from onetl.core import FileResult
      from onetl.core import FileSet

  to new modules onetl.db and onetl.file:

      from onetl.db import DBReader
      from onetl.db import DBWriter

      from onetl.file import FileDownloader
      from onetl.file import FileUploader

      # not a public interface
      from onetl.file.file_result import FileResult
      from onetl.file.file_set import FileSet

  Imports from the old module onetl.core can still be used, but they are marked as deprecated. The module will be removed in v1.0.0. (#46)
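
To illustrate why lists of filters and limits compose well, here is a minimal, self-contained sketch. It does not import onetl: the classes GlobSketch, ExcludeDirSketch and MaxFilesCountSketch, the select helper, and the method names match and stops_at are hypothetical stand-ins chosen for this example, not the library's implementation. The only behavior taken from the notes above is that reset returns the limit object, which lets a caller reset and collect limits in a single expression.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


class GlobSketch:
    """Hypothetical filter: accepts files whose name matches a glob pattern."""

    def __init__(self, pattern):
        self.pattern = pattern

    def match(self, path):
        return fnmatch(PurePosixPath(path).name, self.pattern)


class ExcludeDirSketch:
    """Hypothetical filter: rejects files located under the given directory."""

    def __init__(self, directory):
        self.directory = PurePosixPath(directory)

    def match(self, path):
        return self.directory not in PurePosixPath(path).parents


class MaxFilesCountSketch:
    """Hypothetical limit: stops once more than `limit` files were offered."""

    def __init__(self, limit):
        self.limit = limit
        self.seen = 0

    def reset(self):
        self.seen = 0
        return self  # returning self (not None) allows chaining

    def stops_at(self, path):
        self.seen += 1
        return self.seen > self.limit


def select(paths, filters, limits):
    """Keep paths accepted by every filter until any limit is reached."""
    limits = [limit.reset() for limit in limits]  # relies on reset() returning the object
    selected = []
    for path in paths:
        if not all(f.match(path) for f in filters):
            continue  # rejected by at least one filter
        if any(limit.stops_at(path) for limit in limits):
            break  # a limit was reached, stop processing
        selected.append(path)
    return selected


paths = ["/data/a.txt", "/data/b.csv", "/path/c.txt", "/data/d.txt", "/data/e.txt"]
result = select(
    paths,
    filters=[GlobSketch("*.txt"), ExcludeDirSketch("/path")],
    limits=[MaxFilesCountSketch(2)],
)
print(result)  # ['/data/a.txt', '/data/d.txt']
```

Because every filter and limit is a small object with one responsibility, a custom class only needs to provide the same methods to be mixed into the lists alongside the existing ones.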
Features
- Add rename_dir method. The method was added to the following connections:
  - FTP
  - FTPS
  - HDFS
  - SFTP
  - WebDAV

  It allows renaming or moving a directory, with all its content, to a new path. S3 does not have directories, so there is no such method in that class. (#40)
- Add onetl.file.FileMover class. It allows moving files between directories of a remote file system. Its signature is almost the same as in FileDownloader, but without HWM support. (#42)
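
The semantics of moving a directory together with its content can be sketched on a local file system. This is only an analogy under stated assumptions: rename_dir_sketch is a hypothetical helper built on the standard library, it is not the onetl method, and the choice to fail when the target already exists is an assumption made for this example.

```python
import tempfile
from pathlib import Path


def rename_dir_sketch(source, target):
    """Hypothetical local stand-in: move a directory and everything inside it."""
    source, target = Path(source), Path(target)
    if not source.is_dir():
        raise NotADirectoryError(f"{source} is not a directory")
    if target.exists():
        # Assumption for this sketch: never overwrite an existing target
        raise FileExistsError(f"{target} already exists")
    target.parent.mkdir(parents=True, exist_ok=True)
    source.rename(target)
    return target


with tempfile.TemporaryDirectory() as root:
    old_dir = Path(root) / "incoming"
    (old_dir / "nested").mkdir(parents=True)
    (old_dir / "nested" / "report.txt").write_text("data")

    new_dir = rename_dir_sketch(old_dir, Path(root) / "archive" / "2023")

    # Nested content travels with the directory; the old path is gone
    moved = sorted(p.relative_to(new_dir).as_posix() for p in new_dir.rglob("*"))
    old_exists = old_dir.exists()
    print(moved, old_exists)
```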
Improvements
- Document all public methods in FileConnection classes:
  - download_file
  - resolve_dir
  - resolve_file
  - get_stat
  - is_dir
  - is_file
  - list_dir
  - create_dir
  - path_exists
  - remove_file
  - rename_file
  - remove_dir
  - upload_file
  - walk

  (#39)
- Update documentation of the check method of all connections: add a usage example and document the result type. (#39)
- Add new exception type FileSizeMismatchError. Methods connection.download_file and connection.upload_file now raise this exception instead of RuntimeError if the target file has a different size than the source after download/upload. (#39)
- Add new exception type DirectoryExistsError. It is raised if the target directory already exists. (#40)
- Improved FileDownloader / FileUploader exception logging. If DEBUG logging is enabled, the exception is printed with its stacktrace instead of only the exception message. (#42)
- Updated documentation of FileUploader:
  - The class does not support read strategies; a note was added to the documentation.
  - Added examples of using the run method with an explicitly passed list of files, with both absolute and relative paths.
  - Fixed outdated imports and class names in examples.

  (#42)
- Updated documentation of the DownloadResult class: fix outdated imports and class names. (#42)
- Improved file filters documentation section. Document the interface class onetl.base.BaseFileFilter and the function match_all_filters. (#43)
- Improved file limits documentation section. Document the interface class onetl.base.BaseFileLimit and the functions limits_stop_at / limits_reached / reset_limits. (#44)
- Added changelog. The changelog is generated from separate news files using towncrier. (#47)
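
A dedicated exception type for size mismatches lets callers handle a truncated transfer separately from other runtime failures. The sketch below shows the general idea with the standard library only. FileSizeMismatchSketch and verify_same_size are hypothetical names, and deriving the exception from OSError is an assumption made here; neither reflects how onetl defines or raises FileSizeMismatchError.

```python
import shutil
import tempfile
from pathlib import Path


class FileSizeMismatchSketch(OSError):
    """Hypothetical stand-in for a dedicated size-mismatch exception."""


def verify_same_size(source, target):
    """Compare file sizes after a transfer and fail loudly if they differ."""
    expected = Path(source).stat().st_size
    actual = Path(target).stat().st_size
    if expected != actual:
        raise FileSizeMismatchSketch(
            f"{target} has {actual} bytes, expected {expected}"
        )
    return actual


with tempfile.TemporaryDirectory() as root:
    source = Path(root) / "source.bin"
    target = Path(root) / "target.bin"
    source.write_bytes(b"0123456789")

    # Successful transfer: sizes match
    shutil.copyfile(source, target)
    copied_size = verify_same_size(source, target)

    # Simulate a truncated transfer: the check raises the dedicated exception
    target.write_bytes(b"01234")
    try:
        verify_same_size(source, target)
    except FileSizeMismatchSketch as exc:
        caught = exc
    else:
        caught = None

print(copied_size, type(caught).__name__)
```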
Misc
- Improved CI workflow for tests.
  - If a developer has not changed the source code of a specific connector or its dependencies, run tests only against the maximum supported versions of Spark, Python, Java and the db/file server.
  - If a developer has made changes in a specific connector, in core classes, or in dependencies, run tests for both minimal and maximum versions.
  - Once a week, run all tests against both minimal and latest versions to detect breaking changes in dependencies.
  - The minimal tested Spark version is 2.3.1 instead of 2.4.8. (#32)
Full Changelog: 0.7.2...0.8.0