Cannot fetch data from a data source because there is a file with size 0 and no name which does not pass the sanity check but I would like to fetch the other files

I'm using Windows, Python 3.11, and quilt3 version 7.0.0 installed with pip. I want to fetch the data at https://open.quiltdata.com/b/cellpainting-gallery/tree/cpg0023-mpi/mpi/images/Batch1/images/C2018-04-10.00-181207-A/2018-12-08/48519/ and followed the Get files / Code instructions on that page for getting files with the quilt3 Python API.

However, this code:

```
import quilt3 as q3

if __name__ == "__main__":
    b = q3.Bucket("s3://cellpainting-gallery")
    b.fetch("cpg0023-mpi/mpi/images/Batch1/images/C2018-04-10.00-181207-A/2018-12-08/48519/",
            "file:///M:/Temporary/48519/")
```

throws:

```
Traceback (most recent call last):
  File "test.py", line 5, in <module>    
    b.fetch("cpg0023-mpi/mpi/images/Batch1/images/C2018-04-10.00-181207-A/2018-12-08/48519/",
  File "..\Lib\site-packages\quilt3\bucket.py", line 184, in fetch
    copy_file(source, dest)
  File "..\Lib\site-packages\quilt3\data_transfer.py", line 901, in copy_file
    sanity_check(rel_path)
  File "..\Lib\site-packages\quilt3\data_transfer.py", line 891, in sanity_check
    raise ValueError("Invalid relative path: %r" % rel_path)
ValueError: Invalid relative path: ''
```

Stopping in the debugger at line 901 of copy_file in data_transfer.py and inspecting the result of list_url(src), I see that the entries to be copied are

`[('', 0), ('181207_A01_s1_w12B5D7C20-9D24-4794-A524-9E4F2B881179.tif', 2337756), ('181207_A01_s1_w2058D7D88-B06E-4FBF-BAAD-C684F63EDD3E.tif', ...`

That is, the listing contains an entry with an empty name and size 0. It does not pass sanity_check, which raises the error above and aborts the fetch.
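The failure can be reproduced without touching S3. The sketch below is not the quilt3 source: the body of `sanity_check` here is an assumption (a check that rejects empty, `.` and `..` path components), written only so that it raises the same message as the traceback. The listing is the first two entries shown above.

```python
def sanity_check(rel_path: str) -> None:
    # Hypothetical stand-in for the check in data_transfer.py; the real
    # implementation may differ. Only the error message is taken from the traceback.
    for part in rel_path.split("/"):
        if part in ("", ".", ".."):
            raise ValueError("Invalid relative path: %r" % rel_path)


# First two entries of the listing observed in the debugger.
listing = [
    ("", 0),
    ("181207_A01_s1_w12B5D7C20-9D24-4794-A524-9E4F2B881179.tif", 2337756),
]

errors = []
for rel_path, size in listing:
    try:
        sanity_check(rel_path)
    except ValueError as exc:
        # Collect the message instead of aborting, to show which entry fails.
        errors.append(str(exc))

print(errors)
```

Under that assumption, only the unnamed entry is rejected; the `.tif` entry passes.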

I don't need that entry. For my purposes sanity_check is too strict here: it would be better to skip the empty, unnamed entry than to fail the whole fetch.

I edited data_transfer.py, from line 900 onward, to read:

```
        for rel_path, size in list_url(src):
            if size > 0:
                sanity_check(rel_path)
                url_list.append((src.join(rel_path), dest.join(rel_path), size))
```

That does the trick for me. No error is thrown and I can download all the files I want (all the files with size > 0).
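One caveat with the `size > 0` test is that it would also skip any named file that happens to be empty. A narrower variant, sketched below, drops only entries whose relative path is empty. The helper name `drop_unnamed_entries` and the `empty_marker.txt` entry are hypothetical, added for illustration; I have not tested this inside quilt3 itself.

```python
def drop_unnamed_entries(entries):
    """Keep every (rel_path, size) pair whose relative path is non-empty."""
    return [(rel_path, size) for rel_path, size in entries if rel_path]


listing = [
    ("", 0),  # the unnamed, zero-byte entry from the listing above
    ("181207_A01_s1_w12B5D7C20-9D24-4794-A524-9E4F2B881179.tif", 2337756),
    ("empty_marker.txt", 0),  # hypothetical named zero-byte file, kept on purpose
]

kept = drop_unnamed_entries(listing)
for rel_path, size in kept:
    print(rel_path, size)
```

In the patched loop this would correspond to testing `if rel_path:` instead of `if size > 0:`, so that sanity_check still runs on every named entry.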
