Commit b0b3917

Freshness update for dataset-bing-covid-19.md . . .
1 parent 17eb27c commit b0b3917

1 file changed

+54
-43
lines changed

articles/open-datasets/dataset-bing-covid-19.md

Lines changed: 54 additions & 43 deletions
@@ -3,42 +3,61 @@ title: Bing COVID-19
33
description: Learn how to use the Bing COVID-19 dataset in Azure Open Datasets.
44
ms.service: open-datasets
55
ms.topic: sample
6-
ms.date: 04/16/2021
6+
ms.reviewer: franksolomon
7+
ms.date: 06/11/2024
78
---
89

910
# Bing COVID-19
1011

11-
Bing COVID-19 data includes confirmed, fatal, and recovered cases from all regions, updated daily.
12-
This data is reflected in the [Bing COVID-19 Tracker](https://bing.com/covid).
12+
Bing COVID-19 data includes confirmed, fatal, and recovered cases from all regions, updated daily. The [Bing COVID-19 Tracker](https://bing.com/covid) reflects this data.
1313

14-
Bing collects data from multiple trusted, reliable sources, including the [World Health Organization (WHO)](https://www.who.int/emergencies/diseases/novel-coronavirus-2019), [Centers for Disease Control and Prevention (CDC)](https://www.cdc.gov/coronavirus/2019-ncov/index.html), national/regional and state public health departments, [BNO News](https://bnonews.com/index.php/2020/04/the-latest-coronavirus-cases/), [24/7 Wall St.](https://247wallst.com/), and [Wikipedia](https://en.wikipedia.org/wiki/2019%E2%80%9320_coronavirus_pandemic).
14+
Bing collects data from multiple trusted, reliable sources, including:
15+
16+
- [BNO News](https://bnonews.com/index.php/2020/04/the-latest-coronavirus-cases/)
17+
- [Centers for Disease Control and Prevention (CDC)](https://www.cdc.gov/coronavirus/2019-ncov/index.html)
18+
- National/regional and state public health departments
19+
- [Wikipedia](https://en.wikipedia.org/wiki/2019%E2%80%9320_coronavirus_pandemic)
20+
- The [World Health Organization (WHO)](https://www.who.int/emergencies/diseases/novel-coronavirus-2019)
21+
- [24/7 Wall St.](https://247wallst.com/)
1522

1623
[!INCLUDE [Open Dataset usage notice](../../includes/open-datasets-usage-note.md)]
1724

1825
## Datasets
19-
Modified datasets are available in CSV, JSON, JSON-Lines, and Parquet.
2026

21-
- https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.csv
22-
- https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.json
23-
- https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.jsonl
24-
- https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.parquet
27+
Modified Bing COVID-19 datasets are available in CSV, JSON, JSON-Lines, and Parquet:
28+
29+
- [CSV](https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.csv)
30+
- [JSON](https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.json)
31+
- [JSON-Lines](https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.jsonl)
32+
- [Parquet](https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.parquet)
2533

26-
All modified datasets have ISO 3166 subdivision codes and load times added, and use lower case column names with underscore separators.
34+
All modified datasets have ISO 3166 subdivision codes and load times added. They use lower case column names with underscore separators.
2735

28-
Raw data: https://pandemicdatalake.blob.core.windows.net/public/raw/covid-19/bing_covid-19_data/latest/Bing-COVID19-Data.csv
36+
[CSV-format raw data](https://pandemicdatalake.blob.core.windows.net/public/raw/covid-19/bing_covid-19_data/latest/Bing-COVID19-Data.csv)
2937

30-
Previous versions of modified and raw data: https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/
38+
Earlier versions of modified and raw data are available at [this resource](https://github.com/microsoft/Bing-COVID-19-Data/commits/).
3139

3240
## Data volume
33-
All datasets are updated daily. As of May 11, 2020 they contained 125,576 rows (CSV 16.1 MB, JSON 40.0 MB, JSONL 39.6 MB, Parquet 1.1 MB).
41+
All datasets receive daily updates. As of March 5, 2023 they contained 4,766,737 rows. The dataset is available in these file formats:
42+
43+
- CSV (560.3 MB)
44+
- JSON (1515.6 MB)
45+
- JSONL (1506.2 MB)
46+
- Parquet (55.4 MB)
3447

3548
## License and use rights attribution
36-
This data is available strictly for educational and academic purposes, such as medical research, government agencies, and academic institutions, under [terms and conditions](https://github.com/microsoft/Bing-COVID-19-Data/blob/master/LICENSE.txt).
3749

38-
Data used or cited in publications should include an attribution to ‘Bing COVID-19 Tracker’ with a link to www.bing.com/covid.
50+
The data is available strictly for educational and academic purposes under [these terms and conditions](https://github.com/microsoft/Bing-COVID-19-Data/blob/master/LICENSE.txt). Valid purposes include:
51+
52+
- academic institutions
53+
- government agencies
54+
- medical research
55+
56+
Data used or cited in publications should include an attribution to **‘Bing COVID-19 Tracker’**, with a link to [www.bing.com/covid](https://www.bing.com/covid).
3957

4058
## Contact
41-
For any questions or feedback about this or other datasets in the COVID-19 Data Lake, please contact [email protected].
59+
60+
For any questions or feedback about this or other datasets in the COVID-19 Data Lake contact [[email protected]](mailto:[email protected]).
4261

4362
## Columns
4463

@@ -77,33 +96,27 @@ For any questions or feedback about this or other datasets in the COVID-19 Data
7796
| 339003 | 2020-01-29 | 6065 | 0 | null | null | Worldwide | null | null | null | 4/26/2021 12:06:34 AM | 1472 | 0 |
7897
| 339004 | 2020-01-30 | 7818 | 0 | null | null | Worldwide | null | null | null | 4/26/2021 12:06:34 AM | 1753 | 0 |
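
The two sample rows above are consistent with the second-to-last value being the day-over-day difference of the third value (7818 - 6065 = 1753). The column headers fall outside this hunk, so reading those values as a cumulative count and its daily change is an assumption. A minimal sketch of that relationship follows; the helper name `daily_change` is hypothetical and not part of the dataset or the notebook:

```python
from datetime import date

# Two rows copied from the sample table above: (date, cumulative count, reported change).
# Which columns these values belong to is an assumption; the header row is not in this hunk.
rows = [
    (date(2020, 1, 29), 6065, 1472),
    (date(2020, 1, 30), 7818, 1753),
]


def daily_change(cumulative):
    """Return day-over-day differences for a date-sorted list of cumulative counts."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


cumulative = [count for _, count, _ in rows]
changes = daily_change(cumulative)

# The computed difference can be compared with the change reported on the second row.
print(changes, rows[1][2])
```

The first row has no predecessor in this excerpt, so its reported change (1472) cannot be checked from these two rows alone.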
7998

80-
## Data access
81-
82-
### Azure Notebooks
99+
## Data access - Azure Notebooks
83100

84101
# [azure-storage](#tab/azure-storage)
85102

86103
<!-- nbstart https://opendatasets-api.azure.com/discoveryapi/OpenDataset/DownloadNotebook?serviceType=AzureNotebooks&package=azure-storage&registryId=bing-covid-19-data -->
87104

105+
> [!NOTE]
106+
> This notebook documents the URLs and sample code to access the [**Bing COVID-19 Dataset**](https://github.com/microsoft/Bing-COVID-19-Data).
88107
89-
#### This notebook documents the URLs and sample code to access the [Bing COVID-19 Dataset](https://github.com/microsoft/Bing-COVID-19-Data)
108+
Use these URLs to get specific file formats hosted on Azure Blob Storage:
90109

91-
Use the following URLs to get specific file formats hosted on Azure Blob Storage:
110+
- [CSV](https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.csv)
111+
- [JSON](https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.json)
112+
- [JSON-Lines](https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.jsonl)
113+
- [Parquet](https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.parquet)
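
The four links above differ only in their file extension, so the URL for a given format can be built from a shared prefix. A small sketch under that assumption; `BASE_URL`, `EXTENSIONS`, and `dataset_url` are illustrative names, not part of the notebook:

```python
# Path prefix shared by the four curated files listed above.
BASE_URL = (
    "https://pandemicdatalake.blob.core.windows.net/public/curated/"
    "covid-19/bing_covid-19_data/latest/bing_covid-19_data"
)

# File extension for each published format name.
EXTENSIONS = {
    "csv": "csv",
    "json": "json",
    "json-lines": "jsonl",
    "parquet": "parquet",
}


def dataset_url(fmt):
    """Return the URL of the latest curated file for a format name (hypothetical helper)."""
    key = fmt.lower()
    if key not in EXTENSIONS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{BASE_URL}.{EXTENSIONS[key]}"


print(dataset_url("Parquet"))
```

The returned string can be passed to a Pandas reader such as `pd.read_parquet`, in the same way the notebook passes the literal URL.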
92114

93-
CSV: https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.csv
115+
Download the dataset file using the built-in capability of Pandas to download from an HTTP URL. Pandas has readers for various file formats:
94116

95-
JSON: https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.json
96-
97-
JSONL: https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.jsonl
98-
99-
Parquet: https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.parquet
100-
101-
Download the dataset file using the built-in capability download from an http URL in Pandas. Pandas has readers for various file formats:
102-
103-
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_parquet.html
104-
105-
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
117+
[pandas.read_parquet](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_parquet.html)
106118

119+
[pandas.read_csv](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)
107120

108121
```python
109122
import pandas as pd
@@ -115,13 +128,13 @@ df = pd.read_parquet("https://pandemicdatalake.blob.core.windows.net/public/cura
115128
df.head(10)
116129
```
117130

118-
Lets check the data types of the various fields and verify that the updated column is datetime format
131+
To verify that the updated column has a datetime format, check the data types of the various fields:
119132

120133
```python
121134
df.dtypes
122135
```
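
If a date column turns out to be plain text rather than a datetime type, it has to be parsed before plotting. The sketch below uses only the Python standard library and the two string formats visible in the sample rows earlier in this file (`2020-01-29` and `4/26/2021 12:06:34 AM`). That the second value is the load time, and that every row uses the same formats, are assumptions. The `%p` directive also assumes an English-style AM/PM locale.

```python
from datetime import date, datetime

# Strings copied from a sample row of the dataset table.
updated_text = "2020-01-29"
load_time_text = "4/26/2021 12:06:34 AM"

# ISO 8601 calendar dates parse directly.
updated = date.fromisoformat(updated_text)

# Month/day/year with a 12-hour clock; 12:06 AM is six minutes past midnight.
load_time = datetime.strptime(load_time_text, "%m/%d/%Y %I:%M:%S %p")

print(updated.isoformat(), load_time.isoformat())
```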
123136

124-
We will now look into Worldwide data and plot some simple charts to visualize the data
137+
Review the Worldwide data. To visualize the data, build some charts:
125138

126139
```python
127140
df_Worldwide=df[df['country_region']=='Worldwide']
@@ -144,21 +157,20 @@ df_Worldwide.plot(kind='line',x='updated',y="deaths_change",grid=True)
144157

145158
# [pyspark](#tab/pyspark)
146159

147-
Sample not available for this platform/package combination.
160+
A sample isn't available for this platform / package combination.
148161

149162
---
150163

151-
### Azure Databricks
164+
## Data Access - Azure Databricks
152165

153166
# [azure-storage](#tab/azure-storage)
154167

155-
Sample not available for this platform/package combination.
168+
A sample isn't available for this platform / package combination.
156169

157170
# [pyspark](#tab/pyspark)
158171

159172
<!-- nbstart https://opendatasets-api.azure.com/discoveryapi/OpenDataset/DownloadNotebook?serviceType=AzureDatabricks&package=pyspark&registryId=bing-covid-19-data -->
160173

161-
162174
```python
163175
# Azure storage access info
164176
blob_account_name = "pandemicdatalake"
@@ -193,17 +205,16 @@ display(spark.sql('SELECT * FROM source LIMIT 10'))
193205

194206
---
195207

196-
### Azure Synapse
208+
## Data Access - Azure Synapse
197209

198210
# [azure-storage](#tab/azure-storage)
199211

200-
Sample not available for this platform/package combination.
212+
A sample isn't available for this platform / package combination.
201213

202214
# [pyspark](#tab/pyspark)
203215

204216
<!-- nbstart https://opendatasets-api.azure.com/discoveryapi/OpenDataset/DownloadNotebook?serviceType=AzureSynapse&package=pyspark&registryId=bing-covid-19-data -->
205217

206-
207218
```python
208219
# Azure storage access info
209220
blob_account_name = "pandemicdatalake"
@@ -240,4 +251,4 @@ display(spark.sql('SELECT * FROM source LIMIT 10'))
240251

241252
## Next steps
242253

243-
View the rest of the datasets in the [Open Datasets catalog](dataset-catalog.md).
254+
View the rest of the datasets in the [Open Datasets catalog](./dataset-catalog.md).
