Commit 6f4bf76

Update dataset-illumina-platinum-genomes.md
1 parent ce93af7 commit 6f4bf76

1 file changed: +4 -4 lines changed

articles/open-datasets/dataset-illumina-platinum-genomes.md

Lines changed: 4 additions & 4 deletions
@@ -8,7 +8,7 @@ ms.date: 04/16/2021

 # Illumina Platinum Genomes

-Whole-genome sequencing is enabling researchers worldwide to characterize the human genome more fully and accurately. This requires a comprehensive, genome-wide catalog of high-confidence variants called in a set of genomes as a benchmark. Illumina has generated deep, whole-genome sequence data of 17 individuals in a three-generation pedigree. Illumina has called variants in each genome using a range of currently available algorithms.
+Whole-genome sequencing is enabling researchers worldwide to characterize the human genome more fully and accurately. This effort requires a comprehensive, genome-wide catalog of high-confidence variants called in a set of genomes as a benchmark. Illumina generated deep, whole-genome sequence data of 17 individuals in a three-generation pedigree. Illumina called variants in each genome using a range of currently available algorithms.

 For more information on the data, see the official [Illumina site](https://www.illumina.com/platinumgenomes.html).

@@ -51,7 +51,7 @@ For any questions or feedback about the dataset, contact platinumgenomes@illumin

 ## Getting the Illumina Platinum Genomes from Azure Open Datasets and Doing Initial Analysis

-Use Jupyter notebooks, GATK, and Picard to do the following:
+Use Jupyter notebooks, GATK, and Picard in analyses such as:

 1. Annotate genotypes using VariantFiltration
 2. Select Specific Variants
@@ -73,7 +73,7 @@ This notebook requires the following libraries:

 ## Getting the Genomics data from Azure Open Datasets

-Several public genomics data has been uploaded as an Azure Open Dataset [here](https://azure.microsoft.com/services/open-datasets/catalog/). We create a blob service linked to this open dataset. You can find examples of data calling procedure from Azure Open Dataset for `Illumina Platinum Genomes` datasets in below:
+Several public genomics datasets have been uploaded as Azure Open Datasets [here](https://azure.microsoft.com/services/open-datasets/catalog/). We create a blob service linked to this open dataset. The following examples show how to access data from Azure Open Datasets for the `Illumina Platinum Genomes` dataset:

 ### Downloading the specific 'Illumina Platinum Genomes'

@@ -160,7 +160,7 @@ Extract fields from a VCF file to a tab-delimited table. This tool extracts spec

 INFO/site-level fields:

-Use the `-F` argument to extract INFO fields; each field will occupy a single column in the output file. The field can be any standard VCF column (for example, CHROM, ID, QUAL) or any annotation name in the INFO field (for example, AC, AF). The tool also supports the following fields:
+Use the `-F` argument to extract INFO fields; each field occupies a single column in the output file. The field can be any standard VCF column (for example, CHROM, ID, QUAL) or any annotation name in the INFO field (for example, AC, AF). The tool also supports the following fields:

 EVENTLENGTH (length of the event)
 TRANSITION (1 for a bi-allelic transition (SNP), 0 for bi-allelic transversion (SNP), -1 for INDELs and multi-allelics)
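
To make the TRANSITION encoding described above concrete, here is a minimal Python sketch of the same rule. It is an illustration of the definition only, not GATK's implementation. The function name `transition_flag` is hypothetical, and returning -1 for anything that is not a simple A/C/G/T substitution (for example, symbolic alleles) is an assumption of this sketch.

```python
from typing import Sequence

# Purine<->purine (A<->G) and pyrimidine<->pyrimidine (C<->T) substitutions
# are transitions; every other single-base substitution is a transversion.
TRANSITION_PAIRS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def transition_flag(ref: str, alts: Sequence[str]) -> int:
    """Return 1 for a bi-allelic transition SNP, 0 for a bi-allelic
    transversion SNP, and -1 otherwise (INDELs, multi-allelic sites).

    Hypothetical helper mirroring the TRANSITION field description;
    it does not call or reproduce GATK code.
    """
    # More than one ALT allele means the site is multi-allelic.
    if len(alts) != 1:
        return -1

    ref_base, alt_base = ref.upper(), alts[0].upper()

    # Anything other than a single-base A/C/G/T substitution is treated
    # as "not a SNP" here (assumption: covers INDELs and symbolic alleles).
    is_single_base = len(ref_base) == 1 and len(alt_base) == 1
    if not is_single_base or ref_base == alt_base:
        return -1
    if ref_base not in "ACGT" or alt_base not in "ACGT":
        return -1

    return 1 if (ref_base, alt_base) in TRANSITION_PAIRS else 0
```

For example, `transition_flag("A", ["G"])` yields 1, `transition_flag("A", ["C"])` yields 0, and a deletion such as `transition_flag("AT", ["A"])` yields -1.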
