This seems to happen with particularly large files (440+ pages, all image-based):
❯ python scribd-downloader.py
Input link Scribd: https://www.scribd.com/document/[REDACTED]
Link embed: https://www.scribd.com/embeds/[REDACTED]/content
Output filename: [REDACTED].pdf
Starting headless Chrome browser...
✅ Cookie dialogs hidden
Found 433 pages, scrolling...
Scrolled 10/433 pages...
Scrolled 20/433 pages...
Scrolled 30/433 pages...
Scrolled 40/433 pages...
Scrolled 50/433 pages...
Scrolled 60/433 pages...
Scrolled 70/433 pages...
Scrolled 80/433 pages...
Scrolled 90/433 pages...
Scrolled 100/433 pages...
Scrolled 110/433 pages...
Scrolled 120/433 pages...
Scrolled 130/433 pages...
Scrolled 140/433 pages...
Scrolled 150/433 pages...
Scrolled 160/433 pages...
Scrolled 170/433 pages...
Scrolled 180/433 pages...
Scrolled 190/433 pages...
Scrolled 200/433 pages...
Scrolled 210/433 pages...
Scrolled 220/433 pages...
Scrolled 230/433 pages...
Scrolled 240/433 pages...
Scrolled 250/433 pages...
Scrolled 260/433 pages...
Scrolled 270/433 pages...
Scrolled 280/433 pages...
Scrolled 290/433 pages...
Scrolled 300/433 pages...
Scrolled 310/433 pages...
Scrolled 320/433 pages...
Scrolled 330/433 pages...
Scrolled 340/433 pages...
Scrolled 350/433 pages...
Scrolled 360/433 pages...
Scrolled 370/433 pages...
Scrolled 380/433 pages...
Scrolled 390/433 pages...
Scrolled 400/433 pages...
Scrolled 410/433 pages...
Scrolled 420/433 pages...
Scrolled 430/433 pages...
✅ All 433 pages loaded
✅ Top toolbar removed
✅ Bottom toolbar removed
✅ Cleaned 1 scroll containers
✅ Print CSS injected
Saving PDF as: [REDACTED].pdf
Page size: Executive (7.25" x 10.5")
Margins: None
Headers/Footers: Disabled
❌ Error saving PDF: HTTPConnectionPool(host='localhost', port=58181): Read timed out. (read timeout=120)
⚠️ Auto-save failed. Opening print dialog as fallback...
Traceback (most recent call last):
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/urllib3/connectionpool.py", line 534, in _make_request
    response = conn.getresponse()
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/urllib3/connection.py", line 571, in getresponse
    httplib_response = super().getresponse()
  File "/usr/lib/python3.13/http/client.py", line 1450, in getresponse
    response.begin()
    ~~~~~~~~~~~~~~^^
  File "/usr/lib/python3.13/http/client.py", line 336, in begin
    version, status, reason = self._read_status()
                              ~~~~~~~~~~~~~~~~~^^
  File "/usr/lib/python3.13/http/client.py", line 297, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
               ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/socket.py", line 719, in readinto
    return self._sock.recv_into(b)
           ~~~~~~~~~~~~~~~~~~~~^^^
TimeoutError: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/wrk/avatar/src/scribd-downloader/scribd-downloader.py", line 448, in <module>
    driver.execute_script("window.print();")
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/selenium/webdriver/remote/webdriver.py", line 518, in execute_script
    return self.execute(command, {"script": script, "args": converted_args})["value"]
           ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/selenium/webdriver/remote/webdriver.py", line 429, in execute
    response = cast(RemoteConnection, self.command_executor).execute(driver_command, params)
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/selenium/webdriver/remote/remote_connection.py", line 406, in execute
    return self._request(command_info[0], url, body=data)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/selenium/webdriver/remote/remote_connection.py", line 430, in _request
    response = self._conn.request(method, url, body=body, headers=headers, timeout=self._client_config.timeout)
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/urllib3/_request_methods.py", line 143, in request
    return self.request_encode_body(
           ~~~~~~~~~~~~~~~~~~~~~~~~^
        method, url, fields=fields, headers=headers, **urlopen_kw
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/urllib3/_request_methods.py", line 278, in request_encode_body
    return self.urlopen(method, url, **extra_kw)
           ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/urllib3/poolmanager.py", line 457, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/urllib3/connectionpool.py", line 841, in urlopen
    retries = retries.increment(
        method, url, error=new_e, _pool=self, _stacktrace=sys.exc_info()[2]
    )
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/urllib3/util/retry.py", line 490, in increment
    raise reraise(type(error), error, _stacktrace)
          ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/urllib3/util/util.py", line 39, in reraise
    raise value
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    response = self._make_request(
        conn,
    ...<10 lines>...
        **response_kw,
    )
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/avatar/src/scribd-downloader/lib/python3.13/site-packages/urllib3/connectionpool.py", line 367, in _raise_timeout
    raise ReadTimeoutError(
        self, url, f"Read timed out. (read timeout={timeout_value})"
    ) from err
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='localhost', port=58181): Read timed out. (read timeout=120)
[1] 312770 exit 1 python scribd-downloader.py
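
A possible workaround, sketched here and not verified against this script: both failures above are the same 120-second read timeout on the connection between Selenium and the local driver, so raising that limit before saving might let a large document finish. The sketch assumes the installed Selenium exposes a settable `timeout` on `driver.command_executor.client_config` (the traceback's `self._client_config.timeout` suggests something like it exists, but the public attribute name is an assumption). The names `timeout_for`, `raise_command_timeout`, and `per_page_s` are hypothetical, and the per-page budget is a guess rather than a measured value.

```python
from types import SimpleNamespace

DEFAULT_TIMEOUT_S = 120  # the read timeout reported in the error above


def timeout_for(pages: int, per_page_s: float = 1.0) -> float:
    """Hypothetical heuristic: budget per_page_s seconds per page,
    never going below the 120 s default."""
    return max(DEFAULT_TIMEOUT_S, pages * per_page_s)


def raise_command_timeout(driver, seconds: float) -> None:
    """Raise the read timeout used for commands sent to the driver.

    Assumption: driver.command_executor.client_config.timeout is settable
    in the installed Selenium version. Check this before relying on it.
    """
    driver.command_executor.client_config.timeout = seconds


if __name__ == "__main__":
    # Stand-in object so the sketch runs without a browser; in the script
    # this would be the real Chrome driver instance.
    fake_driver = SimpleNamespace(
        command_executor=SimpleNamespace(
            client_config=SimpleNamespace(timeout=DEFAULT_TIMEOUT_S)
        )
    )
    raise_command_timeout(fake_driver, timeout_for(433, per_page_s=2.0))
    print(fake_driver.command_executor.client_config.timeout)
```

In the script, the call would go just before the PDF is saved, with the page count the scroll step already reports (433 here). The `window.print()` fallback goes through the same connection, so it would benefit from the same change.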