Commit da392e6

remove StreamingBodyAdaptor that didn't allow choosing the chunk size (#2929)
This StreamingBodyAdaptor caused performance issues: calling `__iter__()` on a StreamingBodyAdaptor object defaulted to chunks of 1 byte, which significantly slowed down reads. I wasn't able to pass the `size` parameter to change the chunk size.

The botocore StreamingBody implementation performs better: https://github.com/boto/botocore/blob/master/botocore/response.py#L90
It defaults to chunks of 1024 bytes, which can be adjusted. It also offers `iter_lines()` and `iter_chunks()` methods.
1 parent 3c2712b commit da392e6
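
The 1-byte behaviour described in the commit message can be reproduced without S3. The sketch below is a minimal illustration, assuming the slowdown comes from `io.IOBase`'s inherited iteration, which builds each line through `readline()` and, for a subclass that only defines `read()`, requests one byte per call. `CountingAdaptor` is a hypothetical stand-in for the removed class, and `io.BytesIO` stands in for the S3 response body.

```python
import io


class CountingAdaptor(io.IOBase):
    """Hypothetical stand-in for the removed adaptor; records every read size."""

    def __init__(self, raw):
        self.raw = raw
        self.read_sizes = []

    def read(self, size=-1):
        # Record the size requested by io.IOBase's inherited readline().
        self.read_sizes.append(size)
        return self.raw.read(size)


payload = b"first line\nsecond line\n"
adaptor = CountingAdaptor(io.BytesIO(payload))  # BytesIO stands in for the S3 body

# Iterating uses io.IOBase.__iter__, which calls readline() for each line.
lines = list(adaptor)
read_sizes = adaptor.read_sizes
```

Each yielded item is still a full line, but every underlying `read()` call asks for a single byte, so the number of calls grows with the payload size.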

1 file changed: +1 -17 lines changed
luigi/contrib/s3.py

Lines changed: 1 addition & 17 deletions
@@ -23,7 +23,6 @@
 from __future__ import division
 
 import datetime
-import io
 import itertools
 import logging
 import os
@@ -76,21 +75,6 @@ class DeprecatedBotoClientException(Exception):
     pass
 
 
-class _StreamingBodyAdaptor(io.IOBase):
-    """
-    Adapter class wrapping botocore's StreamingBody to make a file like iterable
-    """
-
-    def __init__(self, streaming_body):
-        self.streaming_body = streaming_body
-
-    def read(self, size):
-        return self.streaming_body.read(size)
-
-    def close(self):
-        return self.streaming_body.close()
-
-
 class S3Client(FileSystem):
     """
     boto3-powered S3 client.
@@ -586,7 +570,7 @@ def move_to_final_destination(self):
 
 class ReadableS3File(object):
     def __init__(self, s3_key):
-        self.s3_key = _StreamingBodyAdaptor(s3_key.get()['Body'])
+        self.s3_key = s3_key.get()['Body']
         self.buffer = []
         self.closed = False
         self.finished = False
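
With the adaptor gone, `self.s3_key` is the response body itself, so a caller can choose the read size. The sketch below shows one way to do chunked reads against any object with a `read(size)` method. `iter_chunks_of` is a hypothetical helper, not part of luigi or botocore, and `io.BytesIO` is used as a stand-in for `s3_key.get()['Body']` so the example runs without S3 access.

```python
import io
from functools import partial


def iter_chunks_of(body, chunk_size=1024):
    """Yield chunks by calling body.read(chunk_size) until it returns b''."""
    # iter(callable, sentinel) keeps calling until the sentinel value appears.
    return iter(partial(body.read, chunk_size), b"")


body = io.BytesIO(b"x" * 2500)  # stand-in for s3_key.get()['Body']
chunk_lengths = [len(chunk) for chunk in iter_chunks_of(body, chunk_size=1024)]
```

On a real botocore `StreamingBody`, the `iter_chunks()` and `iter_lines()` methods mentioned in the commit message serve the same purpose.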
