TPC error in push-mode from StoRM WebDAV to EOS for file size greater than 1MB #38

@andrearendina

Hi all,

I have noticed a consistent error in third-party push copies from StoRM WebDAV to EOS for files larger than 1 MB.

In my tests, the StoRM WebDAV server version is storm-webdav-1.4.1-1.el7.noarch, the EOS version is 5.0.22, and the XRootD version is 5.4.3. Below are two simple examples using the gfal tool: a successful transfer and a failed one.

$  gfal-copy -v -t 36000 --copy-mode push davs://ds-509.cr.cnaf.infn.it:8443/juno-test/rucio4juno/test_parallel_transfer/divina_commedia davs://eos-mgm.cr.cnaf.infn.it:9000/eos/dockertest/andrea/
Copying 557042 bytes davs://ds-509.cr.cnaf.infn.it:8443/juno-test/rucio4juno/test_parallel_transfer/divina_commedia => davs://eos-mgm.cr.cnaf.infn.it:9000/eos/dockertest/andrea/divina_commedia
[...]
event: [1652695555951] BOTH   http_plugin       TRANSFER:TYPE   3rd push
event: [1652695556028] BOTH   http_plugin       TRANSFER:EXIT   davs://ds-509.cr.cnaf.infn.it:8443/juno-test/rucio4juno/test_parallel_transfer/divina_commedia => davs://eos-mgm.cr.cnaf.infn.it:9000/eos/dockertest/andrea/divina_commedia
$ gfal-copy -v -t 36000 --copy-mode push davs://ds-509.cr.cnaf.infn.it:8443/juno-test/rucio4juno/test_parallel_transfer/test2M davs://eos-mgm.cr.cnaf.infn.it:9000/eos/dockertest/andrea/
Copying 2097152 bytes davs://ds-509.cr.cnaf.infn.it:8443/juno-test/rucio4juno/test_parallel_transfer/test2M => davs://eos-mgm.cr.cnaf.infn.it:9000/eos/dockertest/andrea/test2M
[...]
event: [1652695787847] BOTH   http_plugin       TRANSFER:TYPE   3rd push
event: [1652695787899] BOTH   http_plugin       TRANSFER:EXIT   ERROR: Copy failed with mode 3rd push, with error: Transfer failed: failure: SSLException while pushing https://eos-mgm.cr.cnaf.infn.it:9000/eos/dockertest/andrea/test2M: Connection reset by peer (Write failed)
event: [1652695787913] BOTH   http_plugin       TRANSFER:EXIT   davs://ds-509.cr.cnaf.infn.it:8443/juno-test/rucio4juno/test_parallel_transfer/test2M => davs://eos-mgm.cr.cnaf.infn.it:9000/eos/dockertest/andrea/test2M
gfal-copy error: 5 (Input/output error) - TRANSFER  ERROR: Copy failed with mode 3rd push, with error: Transfer failed: failure: SSLException while pushing https://eos-mgm.cr.cnaf.infn.it:9000/eos/dockertest/andrea/test2M: Connection reset by peer (Write failed)
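
To pin down the exact threshold, the same test could be repeated with files just below, at, and just above 1 MiB. The following is only a sketch of how such files might be generated locally: the `make_test_file` and `boundary_sizes` helpers and the file names are hypothetical, and the files would still have to be uploaded to the StoRM WebDAV storage area before running the gfal-copy commands shown above.

```python
import os
import tempfile

MIB = 1024 * 1024  # 1048576 bytes


def make_test_file(directory, size):
    """Create a file of exactly `size` random bytes and return its path.

    Hypothetical helper, used only to produce test payloads.
    """
    path = os.path.join(directory, f"test_{size}")
    with open(path, "wb") as f:
        f.write(os.urandom(size))
    return path


def boundary_sizes(limit=MIB, delta=1):
    """Return sizes just below, exactly at, and just above `limit`."""
    return [limit - delta, limit, limit + delta]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for size in boundary_sizes():
            path = make_test_file(tmp, size)
            print(path, os.path.getsize(path))
```

If the suspected limit is right, a transfer result that changes between these three sizes would confirm it; if it does not, the threshold lies elsewhere.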

Checking the EOS logs, it seems there is a miscommunication between the two file transfer services: EOS always attempts to parse the whole file as a header. As a result, in the first example above the entire file content was written into the logs.
Indeed, from the following buffer sizes reported in the EOS logs, I suppose that this behaviour can only work while the file size is below 1 MB:

220516 10:05:55 395 http_Protocol: getDataOneShot BuffAvailable: 1048576 maxread: 1048576
220516 10:05:55 395 http_Protocol: getDataOneShot sslavail: 1048576

and this would be the reason why the transfers fail for files larger than 1 MB.
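
As a quick sanity check on these numbers, the buffer size can be extracted from the log and compared with the two test file sizes. This is only a sketch: the `buffer_sizes` and `fits_in_buffer` helpers are hypothetical, and the regular expression assumes the log format shown above.

```python
import re

# Matches the "getDataOneShot BuffAvailable: N maxread: N" lines quoted above.
LINE_RE = re.compile(
    r"getDataOneShot BuffAvailable: (?P<avail>\d+) maxread: (?P<maxread>\d+)"
)


def buffer_sizes(log_text):
    """Return (BuffAvailable, maxread) pairs found in the log text."""
    return [(int(m["avail"]), int(m["maxread"])) for m in LINE_RE.finditer(log_text)]


def fits_in_buffer(file_size, buffer_size):
    """True if the whole file could be read into a buffer of this size."""
    return file_size <= buffer_size


sample = """\
220516 10:05:55 395 http_Protocol: getDataOneShot BuffAvailable: 1048576 maxread: 1048576
220516 10:05:55 395 http_Protocol: getDataOneShot sslavail: 1048576
"""

avail, _ = buffer_sizes(sample)[0]
print(avail == 1024 * 1024)             # True: the buffer is exactly 1 MiB
print(fits_in_buffer(557042, avail))    # True: divina_commedia (successful transfer)
print(fits_in_buffer(2097152, avail))   # False: test2M (failed transfer)
```

The two outcomes match the behaviour seen with gfal-copy, which is consistent with the hypothesis but does not prove it.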
The EOS logs for the failed transfer show the following error:

220516 10:09:47 1641 http_Protocol: getDataOneShot BuffAvailable: 1043735 maxread: 1043735
220516 10:09:47 1641 http_Protocol: getDataOneShot sslavail: 1043735
220516 10:09:47 1641 http_Protocol: read 8192 of 1043735 bytes
220516 10:09:47 1641 http_Protocol:  rc:77 got hdr line: E9wjrlRm7txrFiOrA4nJWjotjWOZhe417faFkbGylOwm1YEbHj1cRdALQbKGtjD3CH9KxKRl6QHV

220516 10:09:47 1641 http_Protocol:  Parsing first line: E9wjrlRm7txrFiOrA4nJWjotjWOZhe417faFkbGylOwm1YEbHj1cRdALQbKGtjD3CH9KxKRl6QHV

220516 10:09:47 1641 http_Protocol:  Parsing of first line failed with -1
220516 10:09:47 1641 http_Protocol:  Cleanup
220516 10:09:47 1641 http_Protocol:  SSL_shutdown failed
220516 10:09:47 1641 http_Protocol:  Reset
220516 10:09:47 1641 http_Req:  XrdHttpReq request ended.

Here EOS tries to parse the first line of the file content as a header line and fails.
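
To illustrate why that line cannot be parsed, here is a minimal request-line check. It is a toy model and not the actual XrdHttp parser: the `parse_request_line` helper is hypothetical, and the `PUT` request line is only an assumed example of what a push copy would normally send to the destination.

```python
import re

# Simplified HTTP/1.x request line: METHOD SP request-target SP HTTP/x.y
REQUEST_LINE_RE = re.compile(
    r"^(?P<method>[A-Z]+) (?P<target>\S+) HTTP/(?P<version>\d\.\d)$"
)


def parse_request_line(line):
    """Return (method, target, version), or None if the line is not a request line."""
    m = REQUEST_LINE_RE.match(line.rstrip("\r\n"))
    if m is None:
        return None
    return m["method"], m["target"], m["version"]


# Assumed example of a well-formed request line.
good = "PUT /eos/dockertest/andrea/test2M HTTP/1.1\r\n"
# The line EOS reported as "hdr line" in the log above (file content).
bad = "E9wjrlRm7txrFiOrA4nJWjotjWOZhe417faFkbGylOwm1YEbHj1cRdALQbKGtjD3CH9KxKRl6QHV\n"

print(parse_request_line(good))  # ('PUT', '/eos/dockertest/andrea/test2M', '1.1')
print(parse_request_line(bad))   # None
```

Any parser expecting a request line at that point would reject the file content in the same way, which matches the "Parsing of first line failed with -1" message.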

As further information, the StoRM WebDAV logs show that the third-party copy is submitted successfully in both cases.
In summary, it seems that in this specific kind of file transfer EOS is not able to distinguish the header from the content and parses all of the content as a header.
As a consequence, the file transfer fails whenever the file size is greater than 1 MB.

Sorry if I made any mistakes or was unclear. Please feel free to move this issue if needed.
