Only the breaking and most significant changes are described here. The full changelog and documentation for all released versions can be found in the nicely formatted commit history.
- Local development has been migrated to using Hatch
- Rebased packaging on PEP 621
- Extracted experimental application/server from the codebase
- Implemented "Metadata.from_descriptor(allow_invalid=False)" (#1501)
- Various architectural and standards-compatibility improvements (minor breaking changes):
  - Added new Console commands:
    - list
    - explore
    - query
    - script
    - convert
    - publish
  - Rebased Console commands on Rich (nice output in the Console)
  - Fixed `extract` returning results whose shape depended on the source type (now it's always a dictionary indexed by the resource name)
  - Enforced type safety -- many tabular commands will be marked as impossible for non-tabular resources if a type checker is used
  - Improved `frictionless.Resource(source)` guessing abilities; if you just want to open a table resource, use `frictionless.resources.TableResource(path=path)`
- Implemented `catalog/dataset/package/resource.dereference` (#1451)
- Various architectural and standards-compatibility improvements (minor breaking changes):
  - Improved type detection mechanism (including remote descriptors)
  - Added `resources` module including `File/Text/Json/TableResource`
  - Deprecated `resource.type` argument -- use the classes above
  - Changed `catalog.packages[]` to `catalog.datasets[].package`
  - Made `resource.schema` optional (`resource.has_schema` is removed)
  - Made `resource.normpath` optional (`resource.normdata` is removed)
  - Standards-compatibility improvements: profile, stats
  - Renamed `system/plugin.select_Check/etc` to `system/plugin.select_check_class/etc`
- Added support for `sqlalchemy@2` (#1427)
- Implemented `program/resource.index` preview (#1395)
- Support `dialect.skip_blank_rows` (#1387)
- Support `steps.resource_update` for resource transformations (#1381)
- Added support for `wkt` format in `fields.StringField` (#1363 by @jze)
- Support `descriptor` argument for `actions/program.extract` (#1372)
- Frictionless Framework (v5) is out of Beta and released on PyPI
- Implemented CKAN Integration (#1185)
- `ForeignKeyError` has been extended with additional information: `fieldNames`, `fieldCells`, `referenceName`, and `referenceFieldNames`
- Implemented GitHub Integration (#1185)
- First beta version of Frictionless Framework (v5)
- Added Dialect support to packages (#1137)
- Fixed processing of incompatible decimal char in table schema and data (#1089)
- Added support for Time Zone data (#1097)
- Improved validation messages by adding `summary` and partial validation details (#1106)
- Implemented new feature `summary` (#1127)
  - `schema.to_summary`
  - `report.to_summary`
  - Added CLI command `summary`
- Fixed file compression `package.to_zip` (#1104)
- Implemented feature to validate single resource (#1112)
- Improved error message to notify about invalid fields (#1117)
- Fixed type conversion of NaN values for data of type Int64 (#1115)
- Exposed valid/invalid flags in CLI `extract` command (#1130)
- Implemented feature `package.to_er_diagram` (#1135)
- Implemented `checks.ascii_value` (#1064)
- Implemented `checks.deviated_cell` (#1069)
- Implemented `detector.field_true/false_values` (#1074)
- Deprecated high-level legacy actions (use class-based alternatives):
  - `describe_*`
  - `extract_*`
  - `transform_*`
  - `validate_*`
- Implemented pipeline actions:
  - `pipeline.validate` (will replace `validate_pipeline` in v5)
  - `pipeline.transform` (will replace `transform_pipeline` in v5)
- Implemented inquiry actions:
  - `inquiry.validate` (will replace `validate_inquiry` in v5)
- Implemented schema actions:
  - `Schema.describe` (will replace `describe_schema` in v5)
  - `schema.validate` (will replace `validate_schema` in v5)
- Implemented new transform steps:
  - `steps.field_merge`
  - `steps.field_pack`
- Implemented package actions:
  - `Package.describe` (will replace `describe_package` in v5)
  - `package.extract` (will replace `extract_package` in v5)
  - `package.validate` (will replace `validate_package` in v5)
  - `package.transform` (will replace `transform_package` in v5)
- Implemented resource actions:
  - `Resource.describe` (will replace `describe_resource` in v5)
  - `resource.extract` (will replace `extract_resource` in v5)
  - `resource.validate` (will replace `validate_resource` in v5)
  - `resource.transform` (will replace `transform_resource` in v5)
- Added to_markdown() feature to metadata (#1052)
- Added a feature that allows exporting a table schema as Excel (#1040)
- Added nontabular note to validation results to indicate nontabular file (#1046)
- Excel stats now shows bytes and hash (#1045)
- Added pprint feature which displays metadata in a readable and pretty way (#1039)
- Improved error message if resource.data is not a string (#1036)
- Made Detector's private properties public and writable (#1025)
- Improved the order of the metadata in the YAML representation
- Exposed Dialect options via CLI such as `sheet`, `table`, `keys`, and `keyed` (#886)
- Validate 'schema.fields[].example' (#998)
- Allows descriptors that subclass collections.abc.Mapping (#985)
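To illustrate what a descriptor that subclasses `collections.abc.Mapping` looks like, here is a minimal sketch. The `ReadOnlyDescriptor` class is hypothetical and not part of the library, and the assumption that such an object can be passed wherever a `dict` descriptor is accepted is based only on the entry above, so the library call itself is left out.

```python
from collections.abc import Mapping


class ReadOnlyDescriptor(Mapping):
    """Hypothetical read-only descriptor wrapper (not part of frictionless)."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


descriptor = ReadOnlyDescriptor({"name": "table", "path": "table.csv"})

# A Mapping is not a dict, but it converts to one without loss.
is_plain_dict = isinstance(descriptor, dict)
as_dict = dict(descriptor)

# Assumption: after #985 an object like `descriptor` is accepted in place of
# a dict descriptor; that call is omitted so the sketch runs without the library.
```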
- Added support for `SqlDialect.basepath` (#982) (https://framework.frictionlessdata.io/docs/tutorials/formats/sql-tutorial)
- Added table dimensions check (#985)
- Added "extract --trusted" flag
- Added "--json/yaml" CLI options for transform
- Improved layout/schema detection algorithms (#945)
- Renamed `inlineDialect.keys` to `inlineDialect.data_keys` due to a conflict with `dict.keys` property
- Normalized metadata properties (increased type safety)
- Add fields, limit, sort and filter options to CkanDialect (#912)
- Implemented `system/plugin.create_candidates` (#893)
- Implemented `system.get/use_http_session` (#892)
- SQL Where Clause (#882)
- Implemented descriptor type detection for `extract/validate` (#881)
- Support external profiles for data package (#864)
- Added `json` argument to `resource.to_snap`
- Support resource/field renaming in transform (#843)
- Support `--path` CLI argument (#829)
- Added support for `Package(innerpath)` argument for unzipping a data package's descriptor
- Support control/dialect as JSON in CLI (#806)
- Implemented `describe_dialect` and `describe(path, type="dialect")`
- Support `--dialect` argument in CLI
- Implemented `Schema.from_jsonschema` (#797)
- Use `field.constraints.maxLength` for SQL's VARCHAR (#795)
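The idea behind the `maxLength` entry can be sketched in a few lines. This is only an illustration under stated assumptions: `string_column_type` is a hypothetical helper, not the library's SQL plugin, and the `TEXT` fallback for unbounded strings is an assumption rather than documented behaviour.

```python
def string_column_type(field):
    """Pick a SQL type for a string field descriptor (illustrative only)."""
    constraints = field.get("constraints", {})
    max_length = constraints.get("maxLength")
    if max_length is not None:
        # A bounded string can be stored in a VARCHAR of the same length.
        return f"VARCHAR({max_length})"
    # Assumption: strings without a maxLength constraint fall back to TEXT.
    return "TEXT"


bounded = {"name": "code", "type": "string", "constraints": {"maxLength": 8}}
unbounded = {"name": "note", "type": "string"}

bounded_type = string_column_type(bounded)
unbounded_type = string_column_type(unbounded)
```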
- Implemented `resource.to_view()` (#781)
- Make `fields[].arrayItem` errors more granular (#767)
- Added support for `fields[].arrayItem` (#750)
- Released `frictionless@4` 🎉
- Updated loaders (#658) (BREAKING)
  - Renamed `filelike` loader to `stream` loader
  - Migrated from `text` loader to `buffer` loader
- Improve transform API (#657) (BREAKING)
  - Switched to the `transform_resource(resource)` signature
  - Switched to the `transform_package(package)` signature
- Improved resource/package import/export (#655) (BREAKING)
  - Reworked `parser.write_row_stream` API
  - Reworked `resource.from/to` API
  - Reworked `package.from/to` API
  - Reworked `Storage` API
  - Reworked `system.create_storage` API
  - Merged `PandasStorage` into `PandasParser`
  - Merged `SpssStorage` into `SpssParser`
- Improved transformation steps (#650) (BREAKING)
  - Split value/formula/function concepts
  - Renamed a few minor step arguments
- Improved layout and data streams concepts (#648) (BREAKING)
  - Renamed `data_stream` to `list_stream`
  - Renamed `readData` to `readLists`
  - Renamed `sample` to `fragment` (`sample` now is raw lists)
  - Implemented `loader.buffer`
  - Implemented `parser.sample`
  - Added support for function based checks
  - Added support for function based steps
- Reworked Error.tags (BREAKING)
- Reworked Check API and split labels/header (BREAKING)
- Rebased on `Detector` class (BREAKING)
  - Migrated all `infer_*`, `sync/patch_schema` and `detect_encoding` parameters to `Detector`
  - Made `resource.infer` omit empty objects
  - Added `resource.read_*(size)` argument
  - Added `resource.labels` property
- Improved checks/steps API (#621) (BREAKING)
  - Updated `validate(extra_checks=[...])` to `validate(checks=[{"code": 'code', ...}])`
- Updated describe/extract/transform/validate APIs (BREAKING)
  - Removed `validate_table` (use `validate_resource`)
  - Removed legacy `Table` and `File` classes
  - Removed `dataflows` plugin
  - Replaced `nopool` by `parallel` (not parallel by default)
  - Renamed `report.tables` to `report.tasks`
  - Rebased on `report.tasks[].resource` (instead of plain path/scheme/format/etc)
  - Flattened Pipeline steps signature
- Introduced Layout class (BREAKING)
  - Renamed `Query` class and arguments/properties to `Layout`
  - Moved `header` options from `Dialect` to `Layout`
- Updated transform API
  - Added `transform(type)` argument
- Updated describe API (BREAKING)
  - Renamed `describe(source_type)` argument to `type`
- Updated extract API (BREAKING)
  - Removed `extract_table` (use `extract_resource` with the same API)
  - Renamed `extract(source_type)` argument to `type`
- Initial API/codebase improvements for v4 (BREAKING)
  - Allow `Package/Resource(source)` notation (guess descriptor/path/etc)
  - Renamed `schema.infer` -> `Schema.from_sample`
  - Renamed `resource.inline` -> `resource.memory`
  - Renamed `compression_path` -> `innerpath`
  - Renamed `compression: no` -> `compression: ""`
  - Updated `Package/Resource.infer` not to infer stats (use `stats=True`)
  - Removed `Package/Resource.infer(only_sample)` argument
  - Removed `Resource.from/to_zip` (use `Package.from/to_zip`)
  - Removed `Resource.source` (use `Resource.data` or `Resource.fullpath`)
  - Removed `package/resource.infer(source)` argument (use constructors)
  - Added some new API (will be covered in the updated docs after the v4 release)
- Make Resource independent from Table/File (#607) (BREAKING)
  - Resource can be opened like Table (it's recommended to use Resource instead of Table)
  - Renamed `resource.read_sample()` to `resource.sample`
  - Renamed `resource.read_header()` to `resource.header`
  - Renamed `resource.read_stats()` to `resource.stats`
  - Removed `resource.to_table()`
  - Removed `resource.to_file()`
- Optimize Row/Header/Table and rename header errors (#601) (BREAKING)
  - Row object is now lazy; it casts data on-demand preserving the same API
  - Method `resource/table.read_data(_stream)` now includes a header row if present
  - Renamed `errors.ExtraHeaderError` -> `ExtraLabelError` (`extra-label-error`)
  - Renamed `errors.MissingHeaderError` -> `MissingLabelError` (`missing-label-error`)
  - Renamed `errors.BlankHeaderError` -> `BlankLabelError` (`blank-label-error`)
  - Renamed `errors.DuplicateHeaderError` -> `DuplicateLabelError` (`duplicate-label-error`)
  - Renamed `errors.NonMatchingHeaderError` -> `IncorrectLabelError` (`incorrect-label-error`)
  - Renamed `schema.read/write_data` -> `read/write_cells`
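For code that refers to the old header error class names, the renames listed above can be captured in a small lookup table. This is a sketch only: `RENAMED_ERRORS` and `migrate_error_name` are hypothetical names introduced for illustration, and only the class names given in the list are used.

```python
# Old class name -> new class name, copied from the rename list above.
RENAMED_ERRORS = {
    "ExtraHeaderError": "ExtraLabelError",
    "MissingHeaderError": "MissingLabelError",
    "BlankHeaderError": "BlankLabelError",
    "DuplicateHeaderError": "DuplicateLabelError",
    "NonMatchingHeaderError": "IncorrectLabelError",
}


def migrate_error_name(name):
    """Return the post-rename class name, leaving unknown names untouched."""
    return RENAMED_ERRORS.get(name, name)


old_names = ["BlankHeaderError", "NonMatchingHeaderError", "TypeError"]
new_names = [migrate_error_name(name) for name in old_names]
```

Names that were not renamed pass through unchanged, so the helper can be applied to a whole list of error names without filtering first.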
- Renamed aws plugin to s3 (#594) (BREAKING)

      $ pip install frictionless[aws] # before
      $ pip install frictionless[s3] # after

- Drafted support for writing Multipart Data (#583)
- Added support for writing to Remote Data (#582)
- Added support for writing to Google Sheets (#581)
- Renamed `gsheet` plugin/format to `gsheets` (BREAKING: minor)
- Added support for writing to S3 (#580)
- Update Loader/Parser API to write to different targets (#579) (BREAKING: minor)
- Implemented a standalone multipart loader (#573)
- Fixed Header not being an original one (#572)
- Fix bad format validation (#571)
- Added a default errors limit of 1000 (#570)
- Added support for field.float_number (#569)
- Improved ckan plugin (#560)
- Removed the non-working elastic plugin draft (#558)
- Support custom types (#557)
- Added "resolve" option to "resource/package.to_zip" (#556)
- Moved `frictionless.controls` to `frictionless.plugins.*` (BREAKING)
- Moved `frictionless.dialects` to `frictionless.plugins.*` (BREAKING)
- Moved `frictionless.exceptions.FrictionlessException` to `frictionless.FrictionlessException` (BREAKING)
- Moved `excel` dependencies to `frictionless[excel]` extras (BREAKING)
- Moved `json` dependencies to `frictionless[json]` extras (BREAKING)
- Consider `json` files to be metadata by default (BREAKING)

Code example:

    # Before
    # pip install frictionless
    from frictionless import dialects, exceptions
    excel_dialect = dialects.ExcelDialect()
    json_dialect = dialects.JsonDialect()
    exception = exceptions.FrictionlessException()

    # After
    # pip install frictionless[excel,json]
    from frictionless import FrictionlessException
    from frictionless.plugins.excel import ExcelDialect
    from frictionless.plugins.json import JsonDialect
    # The dialect classes are now imported directly from their plugins
    excel_dialect = ExcelDialect()
    json_dialect = JsonDialect()
    exception = FrictionlessException()

- Implemented `resource.write` (#537)
- Added url parameter to SQL import/export (#535)
- Made tables with header and no data rows valid (#534) (BREAKING: minor)
- Various CLI improvements (#532)
  - Added autocompletion
  - Added stdin support
  - Added "extract --csv"
  - Exposed more options
- Added experimental CKAN support (#528)
- Add a "nopool" argument to validate (#527)
- Stop sorting keyed sources as the order is now guaranteed by Python (#512) (BREAKING)
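The language guarantee behind this change is that `dict` preserves insertion order (Python 3.7 and later). The sketch below uses made-up rows and is not the library's own code; it only shows how labels taken from a keyed row differ between the sorted and the insertion-order approach.

```python
# Hypothetical keyed rows, e.g. from an inline data source.
keyed_rows = [
    {"name": "english", "id": 1},
    {"name": "german", "id": 2},
]

# Sorting the keys gives alphabetical labels.
sorted_labels = sorted(keyed_rows[0])

# Relying on dict insertion order keeps the labels as written.
ordered_labels = list(keyed_rows[0])
```

Code that depended on alphabetical column order for keyed sources may therefore see a different column order after this change.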
- Added "nolookup" argument for validate_package (#515)
- Add transform functionality (#505)
- Methods `schema.get/remove_field` now raise if not found (#505) (BREAKING)
- Methods `package.get/remove_resource` now raise if not found (#505) (BREAKING)
- Lower case resource.scheme/format/hashing/encoding/compression (#499) (BREAKING)
- Support "header_case" option for dialects (#488)
- Added support for DB2 format (#485)
- Improved SPSS plugin (#483)
- Improved BigQuery plugin (#470)
- Added support for SQL Views (#466)
- Rebased AwsLoader on streaming (#460)
- Added `hashing` parameter to `describe/describe_package`
- Removed `table.onerror` property (BREAKING)
- Added timezone for datetime/time parsing (#457) (BREAKING)
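The difference between timezone-aware and naive values can be shown with the standard library alone. This sketch does not use the library's parser from #457; it only illustrates, as one plausible reason such a change is marked breaking, that aware and naive `datetime` values do not mix.

```python
from datetime import datetime, timedelta, timezone

naive = datetime.fromisoformat("2020-01-01T15:00:00")
aware = datetime.fromisoformat("2020-01-01T15:00:00+03:00")

# The aware value keeps its offset and can be normalized to UTC.
offset = aware.utcoffset()
utc_value = aware.astimezone(timezone.utc)

# Naive and aware values cannot be ordered against each other.
try:
    naive < aware
    comparable = True
except TypeError:
    comparable = False
```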
- Fixed metadata.to_yaml (#455)
- Removed the `expand` argument from `metadata.to_dict` (BREAKING)
- Added native schema support to SqlParser (#452)
- Make Resource the main internal interface (#446) (BREAKING: for plugin authors)
- Move Resource's stats to `resource.stats` (BREAKING)
- Rename `on_error` to `onerror` (BREAKING)
- Added `resource.stats.fields`
- Add an `on_error` argument to Table/Resource/Package (#445)
- Added streaming to the extract functions (#442)
- Added experimental BigQuery support (#424)
- Added experimental SPSS support (#421)
- Rebased on a `goodtables` successor versioning
- Add support for SQL/Pandas import/export (#31)
- Add support for custom JSONEncoder classes (#24)
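A custom `JSONEncoder` is a standard-library concept, so it can be sketched without the library. The `ExtendedEncoder` class and the sample row are hypothetical, and how the encoder class is handed to the library is not stated in this changelog, so only the `json` module call is shown.

```python
import json
from datetime import date
from decimal import Decimal


class ExtendedEncoder(json.JSONEncoder):
    """Hypothetical encoder that serializes Decimal and date values as strings."""

    def default(self, o):
        if isinstance(o, Decimal):
            return str(o)
        if isinstance(o, date):
            return o.isoformat()
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(o)


row = {"price": Decimal("9.99"), "day": date(2020, 1, 31)}
encoded = json.dumps(row, cls=ExtendedEncoder)
decoded = json.loads(encoded)
```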
- Normalize header terminology
- Initial public version