Commit edb562a

feat(downloading-files)
1 parent 67a9c2e commit edb562a

2 files changed: +99 -1 lines changed
Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
1+
---
2+
title: Downloading files
3+
description: Learn how to automate the downloading and saving of files to the disk using Puppeteer or Playwright.
4+
menuWeight: 3
5+
paths:
6+
- puppeteer-playwright/common-use-cases/downloading-files
7+
---
8+
9+
# Downloading files
10+
11+
Downloading a file using Puppeteer can be tricky. On some systems, there can be issues with the usual file saving process that prevent you from doing it the easy way. However, there are different techniques that work (most of the time).
12+
13+
These techniques are only necessary when we don't have a direct link to the file, which is usually the case when the file is generated on demand by a more complicated data export.
14+
15+
## [](#setting-up-a-download-path) Setting up a download path
16+
17+
Let's start with the easiest technique. This method tells the browser which folder to save a file in when Puppeteer clicks a download link or button.
18+
19+
```JavaScript
await page._client.send('Page.setDownloadBehavior', { behavior: 'allow', downloadPath: './my-downloads' });
```
22+
23+
We use the mysterious `_client` API, which gives us access to all the functions of the underlying Chrome DevTools Protocol. In effect, it extends Puppeteer's functionality. Because `_client` is a private property, it may change between Puppeteer versions. Then we can download the file by clicking on the button.
24+
25+
```JavaScript
await page.click('.export-button');
```
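
If relying on the private `_client` property is a concern, the same protocol command can probably be sent through a public CDP session instead. The following is a minimal sketch, not the method used in this article: `allowDownloads` is a hypothetical helper name, and it assumes that your Puppeteer version exposes either `page.createCDPSession()` or `page.target().createCDPSession()`. The path is resolved to an absolute one on the assumption that the browser handles absolute paths more reliably.

```JavaScript
const path = require('node:path');

// Hypothetical helper (not part of Puppeteer): sends Page.setDownloadBehavior
// through a public CDP session instead of the private `_client` property.
async function allowDownloads(page, downloadPath) {
    // Newer Puppeteer versions expose createCDPSession() on the page,
    // older ones only on the page's target.
    const client = typeof page.createCDPSession === 'function'
        ? await page.createCDPSession()
        : await page.target().createCDPSession();

    await client.send('Page.setDownloadBehavior', {
        behavior: 'allow',
        // Assumption: an absolute path is the safer choice here.
        downloadPath: path.resolve(downloadPath),
    });

    return client;
}
```

It would be called once before triggering the download, for example `await allowDownloads(page, './my-downloads');`.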
28+
29+
To keep the example simple, let's wait one minute for the download to finish. In a real use case, you would check the state of the file in the file system instead.
30+
31+
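```JavaScript
// A plain timer works in any Puppeteer version; page.waitFor() is deprecated.
await new Promise((resolve) => setTimeout(resolve, 60000));
```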
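
Checking the state of the file in the file system could look roughly like the sketch below. `waitForDownload` is a hypothetical helper, and it rests on two assumptions: the download folder starts out empty, and the browser is Chrome or Chromium, which gives in-progress downloads a temporary `.crdownload` suffix.

```JavaScript
const fs = require('node:fs');

// Hypothetical helper: polls `dir` until it contains at least one finished file.
// Assumes in-progress downloads carry Chrome's temporary `.crdownload` suffix.
async function waitForDownload(dir, timeoutMs = 60000, pollMs = 500) {
    const deadline = Date.now() + timeoutMs;

    while (Date.now() < deadline) {
        const finished = fs.existsSync(dir)
            ? fs.readdirSync(dir).filter((name) => !name.endsWith('.crdownload'))
            : [];
        if (finished.length > 0) return finished;

        // Nothing finished yet, so sleep briefly before checking again.
        await new Promise((resolve) => setTimeout(resolve, pollMs));
    }

    throw new Error(`No finished download in ${dir} after ${timeoutMs} ms`);
}
```

Compared to a fixed one-minute pause, this returns as soon as the file is complete and fails loudly if nothing arrives in time.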
34+
35+
To extract the file from the file system into memory, we have to first find its name, and then we can read it.
36+
37+
```JavaScript
import fs from 'fs';

const fileNames = fs.readdirSync('./my-downloads');

// Let's pick the first one
const fileData = fs.readFileSync(`./my-downloads/${fileNames[0]}`);
```
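
Picking the first file name only works when the folder holds nothing else. If older files may already be present, one option is to read the most recently modified file instead. The sketch below illustrates that idea; `readNewestFile` is a hypothetical helper, and it assumes the folder contains only regular files.

```JavaScript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: returns the name and contents of the most recently
// modified file in `dir`. Assumes `dir` contains only regular files.
function readNewestFile(dir) {
    const newest = fs.readdirSync(dir)
        .map((name) => ({ name, mtimeMs: fs.statSync(path.join(dir, name)).mtimeMs }))
        .sort((a, b) => b.mtimeMs - a.mtimeMs)[0];

    if (!newest) throw new Error(`No files found in ${dir}`);

    return { name: newest.name, data: fs.readFileSync(path.join(dir, newest.name)) };
}
```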
45+
46+
## [](#intercepting-a-file-download-request) Intercepting and replicating a file download request
47+
48+
For this second option, we can trigger the file download, intercept the request going out, and then replicate it to get the actual data. First, we need to enable request interception. This is done using the following line of code:
49+
50+
```JavaScript
await page.setRequestInterception(true);
```
53+
54+
Next, we need to trigger the actual file export. We might need to fill in some form, select an exported file type, etc. In the end, it will look something like this:
55+
56+
```JavaScript
// Intentionally not awaited, so the request listener can be attached right away
page.click('.export-button');
```
59+
60+
We don't await this promise, because we'll be waiting for the result of this action anyway (the triggered request).
61+
62+
The crucial part is intercepting the request that would result in downloading the file. Since the interception is already enabled, we just need to wait for the request to be sent.
63+
64+
```JavaScript
const xRequest = await new Promise((resolve) => {
    page.on('request', (interceptedRequest) => {
        // Abort the request so the browser doesn't download the file itself
        interceptedRequest.abort();
        resolve(interceptedRequest);
    });
});
```
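
The listener above captures whichever request happens to be sent first, and it keeps aborting every request that follows. If the page sends other requests around the same time, a filtered variant may be safer. The sketch below is one possible shape, not the approach from this article: `interceptDownloadRequest` and the `isDownload` predicate are hypothetical names, and how to recognize the export URL depends on the target site.

```JavaScript
// Hypothetical helper: resolves with the first intercepted request whose URL
// satisfies `isDownload`, and lets every other request continue untouched.
function interceptDownloadRequest(page, isDownload) {
    return new Promise((resolve) => {
        let captured = false;

        page.on('request', (interceptedRequest) => {
            if (!captured && isDownload(interceptedRequest.url())) {
                captured = true;
                // Stop the browser from downloading the file itself.
                interceptedRequest.abort();
                resolve(interceptedRequest);
            } else {
                // With interception enabled, unhandled requests would stall.
                interceptedRequest.continue();
            }
        });
    });
}
```

To avoid missing the request, the promise would be created before the click, for example `const pending = interceptDownloadRequest(page, (url) => url.includes('/export'));`, and awaited afterwards. The `'/export'` fragment is only an illustrative assumption.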
72+
73+
The last thing is to convert the intercepted Puppeteer request into a request-promise options object. We need to have the `request-promise` package installed.
74+
75+
```JavaScript
import request from 'request-promise';
```
78+
79+
Since the headers of the intercepted request do not include cookies, we need to add them ourselves.
80+
81+
```JavaScript
const options = {
    encoding: null, // return the body as a Buffer instead of a string
    method: xRequest.method(),
    uri: xRequest.url(),
    body: xRequest.postData(),
    headers: { ...xRequest.headers() },
};

// Add the cookies
const cookies = await page.cookies();
options.headers.Cookie = cookies.map((ck) => `${ck.name}=${ck.value}`).join('; ');

// Resend the request
const response = await request(options);
```
97+
98+
Now, the response contains the binary data of the downloaded file. It can be saved to the disk, uploaded somewhere, or [submitted with another form]({{@link puppeteer_playwright/common_use_cases/submitting_a_form_with_a_file_attachment.md}}).
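
Saving the data to the disk can be sketched as follows. `saveDownload` is a hypothetical helper, and the folder and file name in the usage note are placeholders. It assumes the data is a `Buffer`, which is what `request-promise` returns when `encoding` is set to `null`.

```JavaScript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: writes the downloaded bytes to `dir/fileName`,
// creating `dir` first if it does not exist yet.
function saveDownload(data, dir, fileName) {
    fs.mkdirSync(dir, { recursive: true });
    const target = path.join(dir, fileName);
    fs.writeFileSync(target, data);
    return target;
}
```

For example, `saveDownload(response, './my-downloads', 'export.bin');` would write the file and return its path.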

content/academy/puppeteer_playwright/common_use_cases/submitting_a_form_with_a_file_attachment.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
11
---
22
title: Submitting a form with a file attachment
33
description: Understand how to download a file, attach it to a form using a headless browser in Playwright or Puppeteer, then submit the form.
4-
menuWeight: 3
4+
menuWeight: 4
55
paths:
66
- puppeteer-playwright/common-use-cases/submitting-a-form-with-a-file-attachment
77
---
