Description
Overview
I am seeing a very strange error when running a transform (either in Python code or via the command-line tool). Depending on the size of the input file, the transform either succeeds or throws an "I/O operation on closed file" exception. The number of lines required to trigger it varies, even between execution environments.
On an M1 Mac Mini, 198 lines currently crashes and 197 lines passes. On a Gitpod instance (Ubuntu), the threshold was about the same yesterday, but today it is closer to 150. In our own code it can take 10k+ lines. But there is always a size above which the transform fails, and that size is far short of limits such as settings.FIELD_SIZE_LIMIT.
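For anyone trying to pin down the threshold in their own environment, a binary search over the input size may be quicker than trimming the file by hand. The following is only a sketch: `smallest_failing_size` and the `fails` callback are hypothetical names, not part of frictionless, and the search assumes that every size above the threshold fails and every size below it passes, which may not hold if the failure is flaky.

```python
def smallest_failing_size(fails, lo, hi):
    """Return the smallest n in (lo, hi] for which fails(n) is true.

    Assumes fails(lo) is False, fails(hi) is True, and that failure is
    monotonic in n (once it starts failing, larger inputs also fail).
    `fails` is a hypothetical callback supplied by the caller.
    """
    if fails(lo):
        raise ValueError("lower bound already fails")
    if not fails(hi):
        raise ValueError("upper bound does not fail")
    # Invariant: fails(lo) is False and fails(hi) is True.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    # Stand-in predicate mimicking the M1 numbers reported above.
    print(smallest_failing_size(lambda n: n >= 198, 1, 10_000))
```

In practice `fails(n)` would write an n-line input file, run the transform, and report whether the exception occurred. Since the threshold has moved between days on the same machine, repeating the search a few times is probably wise.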
Example Command Line
% frictionless transform data/crash-transform/data.csv --pipeline data/crash-transform/pipeline.json
╭─ Error ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ [step-error] Step is not valid: "cell_replace" raises "[step-error] Step is not valid: "table_normalize" raises "[source-error] The data source has not supported or has │
│ inconsistent contents: I/O operation on closed file. " " │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
pipeline.json
{
  "steps": [
    { "type": "table-normalize" },
    { "type": "cell-replace", "pattern": "BLANK", "replace": "" },
    { "type": "cell-replace", "pattern": "blank", "replace": "" },
    { "type": "cell-replace", "pattern": "NULL", "replace": "" },
    { "type": "cell-replace", "pattern": "null", "replace": "" },
    {
      "name": "NewSumField",
      "type": "field-add",
      "formula": "Field1 + Field2"
    },
    { "name": "NewConstantField", "type": "field-add", "value": "NewValue" }
  ]
}
data.csv
Field1,Field2,Random1,Random2,Random3,Random4,Random5,Random6,Random7,Random8,Random9
0,0,BLANK,blank,NULL,null,5val0,6val0,7val0,8val0,9val0
1,10,1val1,2val1,3val1,4val1,5val1,6val1,7val1,8val1,9val1
2,20,1val2,2val2,3val2,4val2,5val2,6val2,7val2,8val2,9val2
3,30,BLANK,2val3,3val3,4val3,5val3,6val3,7val3,8val3,9val3
... extend as needed ...
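To extend the file to an arbitrary length, something like the script below could generate it. It is a sketch using only the standard library. The rows it produces match the four sample rows above, but the rule of putting BLANK in Random1 on every third row is an assumption extrapolated from rows 0 and 3, and `build_csv` is a hypothetical helper name.

```python
import csv
import io

# Header matching the sample: Field1, Field2, Random1..Random9.
HEADER = ["Field1", "Field2"] + [f"Random{k}" for k in range(1, 10)]


def make_row(i):
    """Build data row i following the pattern of the sample rows."""
    row = [str(i), str(i * 10)] + [f"{k}val{i}" for k in range(1, 10)]
    if i == 0:
        # Row 0 carries all four placeholder values the pipeline replaces.
        row[2:6] = ["BLANK", "blank", "NULL", "null"]
    elif i % 3 == 0:
        # Assumption: every third row repeats the BLANK seen in row 3.
        row[2] = "BLANK"
    return row


def build_csv(n_rows):
    """Return CSV text with the header plus n_rows data rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    for i in range(n_rows):
        writer.writerow(make_row(i))
    return buffer.getvalue()


if __name__ == "__main__":
    # 197 data rows plus the header gives a 198-line file.
    with open("data.csv", "w", newline="") as handle:
        handle.write(build_csv(197))
```

Note that `n_rows` counts data rows only, so the file has one more line than that once the header is included.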
Sample files