Commit 2b97d0d

Merge pull request #62 from oslokommune/DP-1619-public-data-no-auth
DP-1619 Allow unauthenticated access to public data
2 parents 8034854 + b8b7e29 commit 2b97d0d

4 files changed: 52 additions, 13 deletions
CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+## ?.?.?
+
+* Authentication is no longer necessary for downloading public ("green")
+  datasets.
+
 ## 0.6.1
 
 * `PostEvent.post_event` now also supports retries.

README.md

Lines changed: 11 additions & 4 deletions
@@ -52,6 +52,7 @@ If environment variables are not available, the system will try to load from a d
 
 Table of contents:
 - [Upload data](#upload-data)
+- [Download data](#download-data)
 - [Sending events](#sending-events)
 - [Create and manage event streams](#create-and-manage-event-streams)
 - [Creating datasets with versions and editions](#creating-datasets-with-versions-and-editions)
@@ -60,7 +61,7 @@ Table of contents:
 ## Upload data
 
 When uploading data you need to refer to an existing dataset that you own, a version and an edition.
-If these are non existent then you can create them yourself. This can be achieved [using the sdk](#create-a-new-dataset-with-version-and-edition),
+If these are non existent then you can create them yourself. This can be achieved [using the sdk](#creating-datasets-with-versions-and-editions),
 or you can use our [command line interface](https://github.com/oslokommune/okdata-cli).
 
 
@@ -125,9 +126,15 @@ print(trace_events)
 
 ## Download data
 
-When downloading data you need to refer to an existing dataset that you own, a version and an edition.
-If these are non existent then you can create them yourself. This can be achieved [using the sdk](#create-a-new-dataset-with-version-and-edition),
-or you can use our [command line interface](https://github.com/oslokommune/okdata-cli).
+To download data you need to refer to a dataset that you have access to. This
+could be a public dataset, a restricted dataset you've been given access to, or
+a dataset that you own yourself. If the dataset is public, [authenticating
+yourself](#environment-variables) is not necessary.
+
+You will also need to refer to the specific version and edition of the dataset
+that you want to download. If this is your own dataset, make sure to create a
+[version and edition](#creating-datasets-with-versions-and-editions) before
+attempting to download it.
 
 ```python
 from okdata.sdk.data.download import Download

okdata/sdk/data/download.py

Lines changed: 7 additions & 6 deletions
@@ -15,13 +15,14 @@ def __init__(self, config=None, auth=None, env=None):
         self.data_exporter_url = self.config.get("dataExporterUrl")
 
     def get_files(self, dataset_id, version, edition, retries=0):
-
-        get_download_urls_url = (
-            f"{self.data_exporter_url}/{dataset_id}/{version}/{edition}"
+        url = "{}/{}{}/{}/{}".format(
+            self.data_exporter_url,
+            "" if self.auth.token_provider else "public/",
+            dataset_id,
+            version,
+            edition,
         )
-
-        response = self.get(get_download_urls_url, retries=retries)
-        return response.json()
+        return self.get(url, retries=retries).json()
 
     def download(self, dataset_id, version, edition, output_path, retries=0):
         downloaded_files = []
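
The URL selection in `get_files` can be illustrated in isolation. The sketch below is not part of the commit: `build_files_url` is a hypothetical standalone helper, the `has_token_provider` flag stands in for `self.auth.token_provider`, and the base URL is a placeholder rather than the real data exporter endpoint.

```python
def build_files_url(base_url, dataset_id, version, edition, has_token_provider):
    """Hypothetical helper mirroring the URL choice made in `get_files`."""
    # Callers without a token provider are routed to the "public/" prefix;
    # callers with one keep the original path.
    prefix = "" if has_token_provider else "public/"
    return "{}/{}{}/{}/{}".format(base_url, prefix, dataset_id, version, edition)


# Placeholder base URL, assumed for illustration only.
base = "https://example.org/data-exporter"

print(build_files_url(base, "my-dataset", "1", "latest", True))
print(build_files_url(base, "my-dataset", "1", "latest", False))
```

The two calls differ only in the `public/` path segment, which matches the pair of URLs registered by the `mock_http_calls` and `mock_http_calls_public` fixtures in the test changes.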

tests/data/download_test.py

Lines changed: 29 additions & 3 deletions
@@ -4,8 +4,6 @@
 
 from okdata.sdk.data.download import Download
 
-data_downloader = Download()
-
 file_name = "kake.csv"
 s3_key = f"procecced/raw/green/{file_name}"
 download_url = "https://www.dowload-stuff.com"
@@ -18,6 +16,20 @@
 
 
 def test_download(mock_home_dir, mock_http_calls):
+    data_downloader = Download()
+    output_path = f"{os.environ['HOME']}/my/path"
+    result = data_downloader.download(
+        dataset_id, version, edition, output_path=output_path
+    )
+    exp_output_file_path = f"{output_path}/{file_name}"
+    with open(exp_output_file_path, "r") as f:
+        assert str(f.read()) == test_file_content
+    assert result == {"files": [exp_output_file_path]}
+
+
+def test_download_public(mock_home_dir, mock_http_calls_public):
+    data_downloader = Download()
+    data_downloader.auth.token_provider = None
     output_path = f"{os.environ['HOME']}/my/path"
     result = data_downloader.download(
         dataset_id, version, edition, output_path=output_path
@@ -38,7 +50,21 @@ def mock_http_calls(requests_mock):
 
     requests_mock.register_uri(
         "GET",
-        f"{data_downloader.data_exporter_url}/{dataset_id}/{version}/{edition}",
+        f"{Download().data_exporter_url}/{dataset_id}/{version}/{edition}",
+        text=json.dumps([{"key": s3_key, "url": download_url}]),
+        status_code=200,
+    )
+
+    requests_mock.register_uri(
+        "GET", download_url, text=test_file_content, status_code=200
+    )
+
+
+@pytest.fixture(scope="function")
+def mock_http_calls_public(requests_mock):
+    requests_mock.register_uri(
+        "GET",
+        f"{Download().data_exporter_url}/public/{dataset_id}/{version}/{edition}",
         text=json.dumps([{"key": s3_key, "url": download_url}]),
         status_code=200,
     )
