Commit 89ff7be

Cache the file during contiguous upload for Dropbox
Dropbox's rate limiting may cause uploading, copying, or moving many files in parallel to fail with either "too_many_requests" or "too_many_write_operations": the former literally means too many requests, while the latter indicates namespace lock contention. See https://www.dropbox.com/developers/reference/data-ingress-guide for details. In addition, Dropbox does not reveal how it rate-limits requests and is actively testing different algorithms. It recommends that clients retry according to the "Retry-After" header in the 429 response. However, a straightforward retry does not work for upload requests, since the stream will already have been consumed by the time the request finishes. Note that inter-provider copy and move both use upload internally. The solution is to cache the stream locally in a temporary file and stream from that file for both the initial request and any subsequent 429 retries.
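The caching idea in the message above can be sketched independently of WaterButler. This is a minimal illustration, assuming only an async stream interface whose `read()` returns an empty bytes object at end of stream; `FakeStream` is purely hypothetical and not a WaterButler class:

```python
import asyncio
import tempfile


async def cache_stream(stream):
    """Drain a one-shot async stream into a seekable temporary file.

    Once the bytes live in a local file, the upload body can be rebuilt
    from it for the initial request and for every 429 retry, which is
    impossible with the original single-use stream.
    """
    cache = tempfile.TemporaryFile()
    chunk = await stream.read()
    while chunk:  # an empty bytes object signals end of stream
        cache.write(chunk)
        chunk = await stream.read()
    cache.seek(0)  # rewind so the first attempt reads from byte 0
    return cache


class FakeStream:
    """Illustrative one-shot async stream (not a WaterButler API)."""

    def __init__(self, data: bytes, chunk_size: int = 4):
        self._data = data
        self._pos = 0
        self._chunk_size = chunk_size

    async def read(self) -> bytes:
        chunk = self._data[self._pos:self._pos + self._chunk_size]
        self._pos += self._chunk_size
        return chunk
```

Because the cache is a regular file object, each retry attempt can seek back to offset 0 and replay the same bytes.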
1 parent 1e2841c commit 89ff7be

File tree

1 file changed: +51 −17 lines


waterbutler/providers/dropbox/provider.py

Lines changed: 51 additions & 17 deletions
@@ -1,6 +1,7 @@
 import json
 import typing
 import logging
+import tempfile
 from http import HTTPStatus

 from waterbutler.core import provider, streams
@@ -281,29 +282,62 @@ async def _contiguous_upload(self,
         :param conflict: whether to replace upon conflict
         :rtype: `dict`
         :return: A dictionary of the metadata about the file just uploaded
+
+        Quirks: Dropbox Rate Limiting
+
+        Making requests to the Dropbox API via OAuth2 may be rate-limited with a 429 response.  The error message
+        can be "too_many_requests", which literally means too many requests, or "too_many_write_operations", which
+        means namespace lock contention has occurred.  Both can be resolved by retrying the request after the time
+        indicated in the response header "Retry-After".  In addition, Dropbox's rate-limiting algorithm is a black
+        box and they keep trying different ones, which makes balancing the requests less effective.
+
+        References: https://www.dropbox.com/developers/reference/data-ingress-guide
+
+        Quirks: Retrying Upload Requests
+
+        When an upload request finishes, the stream will have been consumed.  In order to retry such a request, WB
+        needs to cache a temporary local file from the incoming stream so both the initial request and retries can
+        stream from this file.
         """

         path_arg = {"path": path.full_path}
         if conflict == 'replace':
             path_arg['mode'] = 'overwrite'

-        resp = await self.make_request(
-            'POST',
-            self._build_content_url('files', 'upload'),
-            headers={
-                'Content-Type': 'application/octet-stream',
-                'Dropbox-API-Arg': json.dumps(path_arg),
-                'Content-Length': str(stream.size),
-            },
-            data=stream,
-            expects=(200, 409,),
-            throws=core_exceptions.UploadError,
-        )
-
-        data = await resp.json()
-        if resp.status == 409:
-            self.dropbox_conflict_error_handler(data, path.path)
-        return data
+        file_cache = tempfile.TemporaryFile()
+        chunk = await stream.read()
+        while chunk:
+            file_cache.write(chunk)
+            chunk = await stream.read()
+
+        rate_limit_retry = 0
+        while rate_limit_retry < 2:
+            file_stream = streams.FileStreamReader(file_cache)
+            resp = await self.make_request(
+                'POST',
+                self._build_content_url('files', 'upload'),
+                headers={
+                    'Content-Type': 'application/octet-stream',
+                    'Dropbox-API-Arg': json.dumps(path_arg),
+                    'Content-Length': str(file_stream.size),
+                },
+                data=file_stream,
+                expects=(200, 409, 429, ),
+                throws=core_exceptions.UploadError,
+            )
+            data = await resp.json()
+            if resp.status == 429:
+                rate_limit_retry += 1
+                logger.debug('Retry {} for {}'.format(rate_limit_retry, str(path)))
+                continue
+            elif resp.status == 409:
+                file_cache.close()
+                self.dropbox_conflict_error_handler(data, path.path)
+            else:
+                file_cache.close()
+                return data
+        file_cache.close()
+        raise core_exceptions.UploadError(message='Upload failed for {} due to rate limiting'.format(str(path)))

     async def _chunked_upload(self, stream: streams.BaseStream, path: WaterButlerPath,
                               conflict: str='replace') -> dict:
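The committed loop retries immediately on 429, while the commit message notes that Dropbox recommends waiting per the "Retry-After" header. A minimal, hypothetical sketch of honoring that header is below; it handles only the delta-seconds form and falls back to a default delay for an absent or HTTP-date value. None of these names are part of the commit:

```python
import asyncio


def parse_retry_after(headers, default_delay=1.0):
    """Return the delay in seconds requested via Retry-After.

    Handles the delta-seconds form of the header; an absent or
    non-numeric value (e.g. an HTTP-date) falls back to default_delay.
    """
    raw = headers.get('Retry-After')
    try:
        return max(float(raw), 0.0)
    except (TypeError, ValueError):
        return default_delay


async def backoff(headers):
    """Sleep for the server-requested interval before retrying."""
    await asyncio.sleep(parse_retry_after(headers))
```

Such a wait would slot in just before the `continue` in the 429 branch of the retry loop, using the 429 response's headers.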
