I've started rewriting some web scraping code that I use to scrape a website containing many links to audio (MP3) files. The website requires maintaining cookies/authentication between requests, but when I get to the part where I actually download the MP3 files themselves, I'm not sure whether Scrapling can handle that sort of thing. I am doing something like this to load the initial web pages:

```python
from tempfile import NamedTemporaryFile

with FetcherSession() as session:
    landing_page = session.get(url)
    # additional logic to handle auth, and grab links to audio files
    ...
    binary_response = session.get(audio_link)
    length = binary_response.headers.get("content-length")
    with NamedTemporaryFile(suffix=".mp3") as tf:
        bytes_written = tf.write(binary_response.content)  # <-- this doesn't work
```

There is no …

Before anyone says "just use something else for the binary requests": again, I need to preserve authentication/cookies across requests. I'm not sure whether Scrapling uses something like requests or httpx under the hood, but if it does, it would be great if there were some way to expose that underlying session. If not, is there a way to hand off the cookies/proxy info/etc. from the FetcherSession to something like requests? Or, even better, is there any kind of parameter that can be passed to …
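To make the handoff idea concrete, the sketch below (continuing from the snippet above) is roughly what I have in mind as a fallback. The `cookies` attribute on the response is purely an assumption on my part, not something I've confirmed in Scrapling's API, and `requests` is only standing in for whatever client would do the raw download:

```python
from tempfile import NamedTemporaryFile

import requests  # stand-in client for the binary download only

# Assumption: the Scrapling response exposes its cookies as a plain dict.
# If it doesn't, this handoff is exactly what I'm asking how to do.
cookies = getattr(binary_response, "cookies", None) or {}

fallback = requests.get(audio_link, cookies=cookies, stream=True, timeout=60)
fallback.raise_for_status()

with NamedTemporaryFile(suffix=".mp3") as tf:
    # Stream the MP3 to disk in chunks instead of holding it all in memory
    for chunk in fallback.iter_content(chunk_size=8192):
        tf.write(chunk)
```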
Replies: 1 comment
Hey mate, replacing `binary_response.content` with `binary_response.body` would do the job.
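In other words, the last two lines of the snippet in the question would become the following (same placeholder names as above; that `.body` carries the raw response bytes is what the reply relies on):

```python
with NamedTemporaryFile(suffix=".mp3") as tf:
    # .body exposes the raw bytes of the response, so the write succeeds
    bytes_written = tf.write(binary_response.body)
```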