
Failing aarch tests due to s3_server fixture / minio_server_health_check #117

@h-vetinari


This has been going on for a while; when we catch a slow agent, the tests on aarch fail as follows:

=========================== short test summary info ============================
ERROR pyarrow/tests/parquet/test_dataset.py::test_read_s3fs - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/parquet/test_dataset.py::test_read_directory_s3fs - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/parquet/test_dataset.py::test_read_partitioned_directory_s3fs - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/parquet/test_dataset.py::test_write_to_dataset_pathlib_nonlocal - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/parquet/test_dataset.py::test_write_to_dataset_with_partitions_s3fs - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/parquet/test_dataset.py::test_write_to_dataset_no_partitions_s3fs - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_filesystem_pickling[builtin_pickle-S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_filesystem_pickling[builtin_pickle-PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_filesystem_is_functional_after_pickling[builtin_pickle-S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_filesystem_is_functional_after_pickling[builtin_pickle-PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_filesystem_pickling[cloudpickle-S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_filesystem_pickling[cloudpickle-PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_filesystem_is_functional_after_pickling[cloudpickle-S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_filesystem_is_functional_after_pickling[cloudpickle-PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/parquet/test_metadata.py::test_write_metadata_fs_file_combinations - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/parquet/test_parquet_file.py::test_parquet_file_with_filesystem[True] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/parquet/test_parquet_file.py::test_parquet_file_with_filesystem[False] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/parquet/test_parquet_writer.py::test_parquet_writer_filesystem_s3 - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/parquet/test_parquet_writer.py::test_parquet_writer_filesystem_s3_uri - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/parquet/test_parquet_writer.py::test_parquet_writer_filesystem_s3fs - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_dataset.py::test_make_fragment_with_size - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3[threaded] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3[serial] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_dataset.py::test_open_dataset_from_fileinfos[threaded] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_dataset.py::test_open_dataset_from_fileinfos[serial] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_dataset.py::test_open_dataset_from_uri_s3_fsspec - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_dataset.py::test_open_dataset_from_s3_with_filesystem_uri - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_dataset.py::test_write_dataset_s3 - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_dataset.py::test_write_dataset_s3_put_only - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_s3fs_limited_permissions_create_bucket - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_filesystem_equals_none[S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_filesystem_equals_none[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_normalize_path[S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_normalize_path[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_non_path_like_input_raises[S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_non_path_like_input_raises[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_create_dir[S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_create_dir[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_copy_file[S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_copy_file[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_move_file[S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_move_file[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_delete_file[S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_delete_file[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_input_stream[S3FileSystem-None-None-identity] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_input_stream[S3FileSystem-None-64-identity] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_input_stream[S3FileSystem-gzip-None-compress] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_input_stream[S3FileSystem-gzip-256-compress] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_input_stream[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))-None-None-identity] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_input_stream[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))-None-64-identity] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_input_stream[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))-gzip-None-compress] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_input_stream[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))-gzip-256-compress] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_input_file[S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_input_file[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_input_stream_not_found[S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_input_stream_not_found[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_output_stream[S3FileSystem-None-None-identity] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_output_stream[S3FileSystem-None-64-identity] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_output_stream[S3FileSystem-gzip-None-decompress] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_output_stream[S3FileSystem-gzip-256-decompress] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_output_stream[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))-None-None-identity] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_output_stream[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))-None-64-identity] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_output_stream[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))-gzip-None-decompress] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_output_stream[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))-gzip-256-decompress] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_append_stream[S3FileSystem-None-None-identity-identity] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_append_stream[S3FileSystem-None-64-identity-identity] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_append_stream[S3FileSystem-gzip-None-compress-decompress] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_append_stream[S3FileSystem-gzip-256-compress-decompress] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_append_stream[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))-None-None-identity-identity] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_append_stream[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))-None-64-identity-identity] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_append_stream[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))-gzip-None-compress-decompress] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_append_stream[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))-gzip-256-compress-decompress] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_output_stream_metadata[S3FileSystem] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_open_output_stream_metadata[PyFileSystem(FSSpecHandler(s3fs.S3FileSystem()))] - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_filesystem_from_uri_s3 - urllib.error.HTTPError: HTTP Error 503: Service Unavailable
ERROR pyarrow/tests/test_fs.py::test_copy_files - urllib.error.HTTPError: HTTP Error 503: Service Unavailable

I cannot actually see the URL, but all the failures seem to be S3-related, and there's the following stack trace (I haven't verified that it applies to every failure, but it probably does):

==================================== ERRORS ====================================
_______________________ ERROR at setup of test_read_s3fs _______________________

s3_connection = ('localhost', 52375, 'arrow', 'apachearrow')
tmpdir_factory = TempdirFactory(_tmppath_factory=TempPathFactory(_given_basetemp=None, _trace=<pluggy._tracing.TagTracerSub object at 0x40002614ac20>, _basetemp=PosixPath('/tmp/pytest-of-conda/pytest-0'), _retention_count=3, _retention_policy='all'))

    @pytest.fixture(scope='session')
    def s3_server(s3_connection, tmpdir_factory):
        @retry(attempts=5, delay=0.1, backoff=2)
        def minio_server_health_check(address):
            resp = urllib.request.urlopen(f"http://{address}/minio/health/cluster")
            assert resp.getcode() == 200
    
        tmpdir = tmpdir_factory.getbasetemp()
        host, port, access_key, secret_key = s3_connection
    
        address = '{}:{}'.format(host, port)
        env = os.environ.copy()
        env.update({
            'MINIO_ACCESS_KEY': access_key,
            'MINIO_SECRET_KEY': secret_key
        })
    
        args = ['minio', '--compat', 'server', '--quiet', '--address',
                address, tmpdir]
        proc = None
        try:
            proc = subprocess.Popen(args, env=env)
        except OSError:
            pytest.skip('`minio` command cannot be located')
        else:
            # Wait for the server to startup before yielding
>           minio_server_health_check(address)

pyarrow/tests/conftest.py:219: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pyarrow/tests/conftest.py:188: in wrapper
    raise last_exception
pyarrow/tests/conftest.py:180: in wrapper
    return func(*args, **kwargs)
pyarrow/tests/conftest.py:197: in minio_server_health_check
    resp = urllib.request.urlopen(f"http://{address}/minio/health/cluster")
../urllib/request.py:216: in urlopen
    return opener.open(url, data, timeout)
../urllib/request.py:525: in open
    response = meth(req, response)
../urllib/request.py:634: in http_response
    response = self.parent.error(
../urllib/request.py:563: in error
    return self._call_chain(*args)
../urllib/request.py:496: in _call_chain
    result = func(*args)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <urllib.request.HTTPDefaultErrorHandler object at 0x40009408dc00>
req = <urllib.request.Request object at 0x40009408d8d0>
fp = <http.client.HTTPResponse object at 0x400024a367d0>, code = 503
msg = 'Service Unavailable'
hdrs = <http.client.HTTPMessage object at 0x400024a36b30>

    def http_error_default(self, req, fp, code, msg, hdrs):
>       raise HTTPError(req.full_url, code, msg, hdrs, fp)
E       urllib.error.HTTPError: HTTP Error 503: Service Unavailable

../urllib/request.py:643: HTTPError
Captured stdout setup -----------------------------

API: SYSTEM()
Time: 05:11:27 UTC 05/08/2024
Error: Unable to listen on `[::1]:52375`: listen tcp [::1]:52375: bind: cannot assign requested address (*errors.errorString)
       4: internal/logger/logger.go:260:logger.LogIf()
       3: cmd/server-main.go:779:cmd.serverMain.func8.1.1()
       2: internal/http/server.go:98:http.(*Server).Init()
       1: cmd/server-main.go:778:cmd.serverMain.func8.1()

Usually restarting once or twice solves the issue, but it's getting annoying because of the sheer number of times it's happening.

The function minio_server_health_check already has a built-in retry/back-off, so I think it's worth trying to increase that (at least for the feedstock here).
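
To get a feel for how much headroom a larger retry budget would buy, here is a rough sketch. The `retry` helper in `conftest.py` is not shown in the traceback, so the version below is only an assumed reimplementation with the same `attempts`/`delay`/`backoff` parameters (it sleeps between attempts, not after the last one), and `total_wait` is a hypothetical helper added purely for illustration:

```python
import functools
import time


def retry(attempts=3, delay=1.0, backoff=2.0):
    """Assumed stand-in for the conftest.py helper: retry on any exception,
    sleeping `delay` seconds and multiplying it by `backoff` after each failure."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            last_exception = None
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    last_exception = exc
                    # No point sleeping after the final failed attempt.
                    if attempt < attempts - 1:
                        time.sleep(wait)
                        wait *= backoff
            raise last_exception
        return wrapper
    return decorator


def total_wait(attempts, delay, backoff):
    """Hypothetical helper: total seconds slept if every attempt fails."""
    return sum(delay * backoff ** i for i in range(attempts - 1))


if __name__ == "__main__":
    # Settings currently visible in the traceback.
    print(f"attempts=5:  {total_wait(5, 0.1, 2):.1f}s")
    # One possible larger budget.
    print(f"attempts=10: {total_wait(10, 0.1, 2):.1f}s")
```

If the real helper behaves like this sketch, the current `attempts=5, delay=0.1, backoff=2` gives minio only about 1.5 seconds to come up, while `attempts=10` would allow roughly 51 seconds in the worst case.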
