Commit 5260411

pawel/fix google drive download links (#510)
According to the Google API documentation, `webContentLink` and `exportLink` are intended to be used in browsers, not by scripts. As a result, `webContentLink` can, for example, redirect to Google's auth login page, which then gets downloaded and sent to partition. Instead, we should use the `googleclient` methods that [call the appropriate Google Drive APIs to perform download/export operations](https://developers.google.com/workspace/drive/api/guides/manage-downloads#python):

- `get_media` to download standalone files
- `export` to export Google Workspace native files (Google Docs, Google Slides, Google Sheets) to the corresponding office formats (docx, pptx, xlsx, respectively)
- `download` to export Google Workspace native files whose exported size exceeds 10 MB; this operation uses the LRO (Long Running Operation) mechanism described [here](https://developers.google.com/workspace/drive/api/guides/long-running-operations)
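The choice between the three methods can be sketched as a small dispatch function. This is only an illustration of the decision described in the commit message, not the connector's actual code: `DownloadPlan`, `plan_download`, `EXPORT_MIME_TYPES`, and the `export_too_large` flag are hypothetical names, and how the connector detects that an export exceeds the size limit is not shown here, so the sketch takes that as an input.

```python
from dataclasses import dataclass
from typing import Optional

# Google Workspace native MIME types and the office formats they export to.
# The mapping (Docs -> docx, Slides -> pptx, Sheets -> xlsx) follows the commit message.
EXPORT_MIME_TYPES = {
    "application/vnd.google-apps.document":
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "application/vnd.google-apps.presentation":
        "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    "application/vnd.google-apps.spreadsheet":
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}


@dataclass(frozen=True)
class DownloadPlan:
    """Hypothetical container: which Drive `files` method to call and with what target type."""
    method: str                              # "get_media", "export", or "download"
    export_mime_type: Optional[str] = None   # only set for Workspace native files


def plan_download(mime_type: str, export_too_large: bool = False) -> DownloadPlan:
    """Pick a download strategy for a Drive file.

    `export_too_large` is an assumption of this sketch: the caller tells us
    whether a plain export would exceed the size limit (about 10 MB per the commit).
    """
    export_mime_type = EXPORT_MIME_TYPES.get(mime_type)
    if export_mime_type is None:
        # Standalone (non-Workspace) file: download the bytes directly.
        return DownloadPlan(method="get_media")
    if export_too_large:
        # Large Workspace file: use the long-running `download` operation.
        return DownloadPlan(method="download", export_mime_type=export_mime_type)
    # Small Workspace file: a regular export is enough.
    return DownloadPlan(method="export", export_mime_type=export_mime_type)


if __name__ == "__main__":
    print(plan_download("text/plain"))
    print(plan_download("application/vnd.google-apps.document"))
    print(plan_download("application/vnd.google-apps.spreadsheet", export_too_large=True))
```

Keeping the decision separate from the API calls makes it easy to unit-test without credentials; the returned plan would then be mapped onto the corresponding Drive client request.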
1 parent 484e863 commit 5260411

12 files changed

+3298
-327
lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@
 ### Fixes
 
 * **Fix Makes user_pname optional for Sharepoint**
+* **Fix Google Drive download links and enhance download method to use LRO for large files**
 
 ## 1.0.27

test_e2e/expected-structured-output/google-drive/fake.docx.json

Lines changed: 2 additions & 3 deletions
@@ -10,7 +10,7 @@
       ],
       "filetype": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
       "data_source": {
-        "url": "https://drive.google.com/uc?id=1SpQuE7jHz9nMt5hfQXsiok1SgIdRYX5o&export=download",
+        "url": null,
         "record_locator": {
           "file_id": "1SpQuE7jHz9nMt5hfQXsiok1SgIdRYX5o"
         },
@@ -53,8 +53,7 @@
               "groups": []
             }
           }
-        ],
-        "filesize_bytes": 36602
+        ]
       }
     }
   }
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+[
+  {
+    "type": "Title",
+    "element_id": "5067da8d054c11853ddf38452877ba25",
+    "text": "three",
+    "metadata": {
+      "languages": [
+        "eng"
+      ],
+      "filetype": "text/plain",
+      "data_source": {
+        "url": null,
+        "record_locator": {
+          "file_id": "1cTKXAreuj-wYmL38nFnqKvz3X8UKcaMC"
+        },
+        "date_created": "1686809759.687",
+        "date_modified": "1686809739.0",
+        "permissions_data": [
+          {
+            "read": {
+              "users": [
+                "03887347926440898356",
+                "04774006893477068632",
+                "09147371668407854156",
+                "13662041828528429192",
+                "18298851591250030956"
+              ],
+              "groups": [
+                "10619079449796831495"
+              ]
+            }
+          },
+          {
+            "update": {
+              "users": [
+                "03887347926440898356",
+                "04774006893477068632",
+                "09147371668407854156",
+                "13662041828528429192",
+                "18298851591250030956"
+              ],
+              "groups": [
+                "10619079449796831495"
+              ]
+            }
+          },
+          {
+            "delete": {
+              "users": [
+                "04774006893477068632"
+              ],
+              "groups": []
+            }
+          }
+        ]
+      }
+    }
+  }
+]
