
Commit a186e65

Merge pull request #896 from MicrosoftDocs/main
10/18 11:00 AM IST Publish
2 parents 6e9a4a2 + d213900 commit a186e65

15 files changed: +65 -81 lines changed

articles/ai-services/openai/how-to/batch.md

Lines changed: 0 additions & 65 deletions
````diff
@@ -157,71 +157,6 @@ Yes. Similar to other deployment types, you can create content filters and assoc
 
 Yes, from the quota page in the Studio UI. Default quota allocation can be found in the [quota and limits article](../quotas-limits.md#global-batch-quota).
 
-### How do I tell how many tokens my batch request contains, and how many tokens are available as quota?
-
-The `2024-10-01-preview` REST API adds two new response headers:
-
-* `deployment-enqueued-tokens` - An approximate token count for your jsonl file, calculated immediately after the batch request is submitted. This value represents an estimate based on the number of characters and is not the true token count.
-* `deployment-maximum-enqueued-tokens` - The total enqueued tokens available for this global batch model deployment.
-
-These response headers are only available when making a POST request to begin batch processing of a file with the REST API. The language-specific client libraries do not currently return these new response headers. To return all response headers, you can add `-i` to the standard REST request.
-
-```http
-curl -i -X POST https://YOUR_RESOURCE_NAME.openai.azure.com/openai/batches?api-version=2024-10-01-preview \
-  -H "api-key: $AZURE_OPENAI_API_KEY" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "input_file_id": "file-abc123",
-    "endpoint": "/chat/completions",
-    "completion_window": "24h"
-  }'
-```
-
-```output
-HTTP/1.1 200 OK
-Content-Length: 619
-Content-Type: application/json; charset=utf-8
-Vary: Accept-Encoding
-Request-Context: appId=
-x-ms-response-type: standard
-deployment-enqueued-tokens: 139
-deployment-maximum-enqueued-tokens: 740000
-Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
-X-Content-Type-Options: nosniff
-x-aml-cluster: vienna-swedencentral-01
-x-request-time: 2.125
-apim-request-id: c8bf4351-c6f5-4bfe-9a79-ef3720eca8af
-x-ms-region: Sweden Central
-Date: Thu, 17 Oct 2024 01:45:45 GMT
-
-{
-  "cancelled_at": null,
-  "cancelling_at": null,
-  "completed_at": null,
-  "completion_window": "24h",
-  "created_at": 1729129545,
-  "error_file_id": null,
-  "expired_at": null,
-  "expires_at": 1729215945,
-  "failed_at": null,
-  "finalizing_at": null,
-  "id": "batch_c8dd49a7-c808-4575-9957-b188cd0dd642",
-  "in_progress_at": null,
-  "input_file_id": "file-f89384af0082485da43cb26b49dc25ce",
-  "errors": null,
-  "metadata": null,
-  "object": "batch",
-  "output_file_id": null,
-  "request_counts": {
-    "total": 0,
-    "completed": 0,
-    "failed": 0
-  },
-  "status": "validating",
-  "endpoint": "/chat/completions"
-}
-```
-
 ### What happens if the API doesn't complete my request within the 24 hour time frame?
 
 We aim to process these requests within 24 hours; we don't expire the jobs that take longer. You can cancel the job anytime. When you cancel the job, any remaining work is cancelled and any already completed work is returned. You'll be charged for any completed work.
````
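
The answer that remains notes that you can cancel a job at any time, with completed work returned and billed. A minimal sketch of that cancellation flow, assuming the `openai` Python client (1.x) against an Azure OpenAI resource; the resource name, API version, and batch ID are placeholders (the ID reuses the one from the removed example output):

```python
# Hedged sketch: cancel a long-running batch job with the openai 1.x client.
# The endpoint, API version, and batch ID below are placeholder values.
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR_RESOURCE_NAME.openai.azure.com/",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-01-preview",
)

# Cancellation stops any remaining work; already completed work is
# returned (and charged), per the answer above.
batch = client.batches.cancel("batch_c8dd49a7-c808-4575-9957-b188cd0dd642")
print(batch.status)  # typically "cancelling", then "cancelled"
```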

articles/ai-services/openai/how-to/code-interpreter.md

Lines changed: 12 additions & 8 deletions
````diff
@@ -6,16 +6,16 @@ services: cognitive-services
 manager: nitinme
 ms.service: azure-ai-openai
 ms.topic: how-to
-ms.date: 05/20/2024
-author: mrbullwinkle
-ms.author: mbullwin
+ms.date: 10/15/2024
+author: aahill
+ms.author: aahi
 recommendations: false
 
 ---
 
 # Azure OpenAI Assistants Code Interpreter (Preview)
 
-Code Interpreter allows the Assistants API to write and run Python code in a sandboxed execution environment. With Code Interpreter enabled, your Assistant can run code iteratively to solve more challenging code, math, and data analysis problems. When your Assistant writes code that fails to run, it can iterate on this code by modifying and running different code until the code execution succeeds.
+Code Interpreter allows the Assistants API to write and run Python code in a sandboxed execution environment. With Code Interpreter enabled, your Assistant can run code iteratively to solve more challenging code, math, and data analysis problems. When your Assistant writes code that fails to run, it can iterate on this code by modifying and running different code until the code execution succeeds.
 
 > [!IMPORTANT]
 > Code Interpreter has [additional charges](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) beyond the token based fees for Azure OpenAI usage. If your Assistant calls Code Interpreter simultaneously in two different threads, two code interpreter sessions are created. Each session is active by default for one hour.
@@ -28,7 +28,7 @@ Code Interpreter allows the Assistants API to write and run Python code in a san
 
 The [models page](../concepts/models.md#assistants-preview) contains the most up-to-date information on regions/models where Assistants and code interpreter are supported.
 
-We recommend using assistants with the latest models to take advantage of the new features, as well as the larger context windows, and more up-to-date training data.
+We recommend using assistants with the latest models to take advantage of the new features, larger context windows, and more up-to-date training data.
 
 ### API Versions
 
@@ -69,7 +69,7 @@ We recommend using assistants with the latest models to take advantage of the ne
 
 ### File upload API reference
 
-Assistants use the [same API for file upload as fine-tuning](/rest/api/azureopenai/files/upload?view=rest-azureopenai-2024-02-15-preview&tabs=HTTP&preserve-view=true). When uploading a file you have to specify an appropriate value for the [purpose parameter](/rest/api/azureopenai/files/upload?view=rest-azureopenai-2024-02-15-preview&tabs=HTTP&preserve-view=true#purpose).
+Assistants use the [same API for file upload as fine-tuning](/rest/api/azureopenai/files/upload?view=rest-azureopenai-2024-02-15-preview&tabs=HTTP&preserve-view=true). When uploading a file, you have to specify an appropriate value for the [purpose parameter](/rest/api/azureopenai/files/upload?view=rest-azureopenai-2024-02-15-preview&tabs=HTTP&preserve-view=true#purpose).
 
 ## Enable Code Interpreter
 
@@ -136,7 +136,7 @@ assistant = client.beta.assistants.create(
   instructions="You are an AI assistant that can write code to help answer math questions.",
   model="gpt-4-1106-preview",
   tools=[{"type": "code_interpreter"}],
-  file_ids=[file.id]
+  tool_resources={"code_interpreter": {"file_ids": [file.id]}}
 )
 ```
 
@@ -161,7 +161,11 @@ curl https://YOUR_RESOURCE_NAME.openai.azure.com/openai/assistants?api-version=2
       { "type": "code_interpreter" }
     ],
     "model": "gpt-4-1106-preview",
-    "file_ids": ["assistant-123abc456"]
+    "tool_resources": {
+      "code_interpreter": {
+        "file_ids": ["assistant-1234"]
+      }
+    }
   }'
 ```
 
````
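
The last two hunks above replace the retired top-level `file_ids` field with `tool_resources`, which scopes attached files to the tool that uses them. A hedged end-to-end sketch of the new pattern with the `openai` Python client (1.x); the data file, API version, and model deployment name are placeholders:

```python
# Hedged sketch of the tool_resources pattern the hunks above introduce.
# "data.csv", the API version, and the model name are placeholder values.
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR_RESOURCE_NAME.openai.azure.com/",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-05-01-preview",
)

# Files used by Code Interpreter are uploaded with purpose="assistants".
file = client.files.create(file=open("data.csv", "rb"), purpose="assistants")

# Attach the file through tool_resources instead of the old file_ids field.
assistant = client.beta.assistants.create(
    instructions="You are an AI assistant that can write code to help answer math questions.",
    model="gpt-4-1106-preview",
    tools=[{"type": "code_interpreter"}],
    tool_resources={"code_interpreter": {"file_ids": [file.id]}},
)
```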

articles/ai-services/translator/language-support.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -114,7 +114,7 @@ ms.author: lajanuar
 |Mongolian (Traditional)|`mn-Mong`||| | | |
 |Myanmar|`my`||| || |
 |Nepali|`ne`||| || |
-|Norwegian|`nb`||||||
+|Norwegian Bokmål|`nb`||||||
 |Nyanja|`nya`||| | | |
 |Odia|`or`||||| |
 |Pashto|`ps`||| || |
@@ -241,7 +241,7 @@ ms.author: lajanuar
 |Mongolian (Traditional)|`mn-Mong`|Yes|No|
 |Myanmar (Burmese)|`my`|Yes|No|
 |Nepali|`ne`|Yes|Yes|
-|Norwegian|`nb`|Yes|Yes|
+|Norwegian Bokmål|`nb`|Yes|Yes|
 |Odia|`or`|Yes|No|
 |Pashto|`ps`|Yes|No|
 |Persian|`fa`|Yes|No|
```
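
Only the display name changes in these rows; the BCP-47 code stays `nb`, so existing calls keep working. A hedged sketch of a Translator v3 `translate` call that targets Norwegian Bokmål (the subscription key and region are placeholders):

```python
# Hedged sketch: translate text to Norwegian Bokmål ("nb") via the
# Translator v3 REST API. The key and region are placeholder values.
import os

import requests

response = requests.post(
    "https://api.cognitive.microsofttranslator.com/translate",
    params={"api-version": "3.0", "to": "nb"},
    headers={
        "Ocp-Apim-Subscription-Key": os.environ["TRANSLATOR_KEY"],
        "Ocp-Apim-Subscription-Region": "westus2",  # placeholder region
        "Content-Type": "application/json",
    },
    json=[{"text": "Hello, world!"}],
)
print(response.json()[0]["translations"][0]["text"])
```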

articles/machine-learning/how-to-deploy-custom-container.md

Lines changed: 20 additions & 5 deletions
```diff
@@ -226,15 +226,30 @@ blue_deployment = ManagedOnlineDeployment(
 
 ---
 
-There are a few important concepts to notice in this YAML/Python parameter:
+There are a few important concepts to note in this YAML/Python parameter:
 
-#### Readiness route vs. liveness route
+#### Base image
 
-An HTTP server defines paths for both _liveness_ and _readiness_. A liveness route is used to check whether the server is running. A readiness route is used to check whether the server is ready to do work. In machine learning inference, a server could respond 200 OK to a liveness request before loading a model. The server could respond 200 OK to a readiness request only after the model is loaded into memory.
+The base image is specified as a parameter in the environment, and `docker.io/tensorflow/serving:latest` is used in this example. As you inspect the container, you can find that this server uses `ENTRYPOINT` to start an entry point script, which takes environment variables such as `MODEL_BASE_PATH` and `MODEL_NAME`, and exposes ports such as `8501`. These details are specific to this chosen server. You can use this understanding of the server to determine how to define the deployment. For example, if you set environment variables for `MODEL_BASE_PATH` and `MODEL_NAME` in the deployment definition, the server (in this case, TF Serving) takes the values to initiate the server. Likewise, if you set the port for the routes to be `8501` in the deployment definition, user requests to those routes are correctly routed to the TF Serving server.
 
-For more information about liveness and readiness probes, see the [Kubernetes documentation](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/).
+Note that this specific example is based on the TF Serving case, but you can use any container that stays up and responds to requests coming to the liveness, readiness, and scoring routes. You can refer to other examples and see how the Dockerfile is formed (for example, using `CMD` instead of `ENTRYPOINT`) to create the containers.
 
-Notice that this deployment uses the same path for both liveness and readiness, since TF Serving only defines a liveness route.
+#### Inference config
+
+Inference config is a parameter in the environment, and it specifies the port and path for three types of routes: the liveness, readiness, and scoring routes. Inference config is required if you want to run your own container with a managed online endpoint.
+
+#### Readiness route vs. liveness route
+
+The API server you choose may provide a way to check the status of the server. There are two types of routes that you can specify: _liveness_ and _readiness_. A liveness route is used to check whether the server is running. A readiness route is used to check whether the server is ready to do work. In the context of machine learning inferencing, a server could respond 200 OK to a liveness request before loading a model, and the server could respond 200 OK to a readiness request only after the model is loaded into memory.
+
+For more information about liveness and readiness probes in general, see the [Kubernetes documentation](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/).
+
+The liveness and readiness routes are determined by the API server of your choice, as you would have identified when testing the container locally in an earlier step. Note that the example deployment in this article uses the same path for both liveness and readiness, since TF Serving only defines a liveness route. Refer to other examples for different patterns to define the routes.
+
+#### Scoring route
+
+The API server you choose provides a way to receive the payload to work on. In the context of machine learning inferencing, a server receives the input data via a specific route. Identify this route for your API server as you test the container locally in an earlier step, and specify it when you define the deployment to create.
+Note that the successful creation of the deployment also updates the `scoring_uri` parameter of the endpoint, which you can verify with `az ml online-endpoint show -n <name> --query scoring_uri`.
 
 #### Locating the mounted model
 
```
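
The four new subsections (base image, inference config, liveness/readiness routes, scoring route) all surface in the deployment definition itself. A hedged Python SDK v2 sketch of how they might fit together for the TF Serving example the hunk references; the endpoint name, model path, and route paths are illustrative, not taken from this commit:

```python
# Hedged sketch (azure-ai-ml SDK v2) combining the concepts above for the
# TF Serving example. Endpoint name, model path, and routes are illustrative.
from azure.ai.ml.entities import Environment, ManagedOnlineDeployment, Model

env = Environment(
    # Base image: the off-the-shelf TF Serving container.
    image="docker.io/tensorflow/serving:latest",
    # Inference config: port and path for each of the three routes.
    # TF Serving only defines a liveness route, so liveness and readiness
    # share the same path here.
    inference_config={
        "liveness_route": {"port": 8501, "path": "/v1/models/half_plus_two"},
        "readiness_route": {"port": 8501, "path": "/v1/models/half_plus_two"},
        "scoring_route": {"port": 8501, "path": "/v1/models/half_plus_two:predict"},
    },
)

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="tfserving-endpoint",
    model=Model(name="tfserving-model", path="./half_plus_two"),
    environment=env,
    environment_variables={
        # Consumed by TF Serving's entry point script, as the base image
        # subsection describes.
        "MODEL_BASE_PATH": "/var/azureml-app/azureml-models/tfserving-model/1",
        "MODEL_NAME": "half_plus_two",
    },
    instance_type="Standard_DS3_v2",
    instance_count=1,
)
```

Once the deployment is created, `az ml online-endpoint show -n tfserving-endpoint --query scoring_uri` should report a URI ending in the scoring route, as the scoring route subsection notes.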

articles/open-datasets/dataset-1000-genomes.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -9,6 +9,8 @@ ms.date: 07/10/2024
 
 # 1000 Genomes
 
+[!INCLUDE [Open Dataset access change notice](./includes/open-datasets-change-note.md)]
+
 The 1000 Genomes Project ran between 2008 and 2015, to create the largest public catalog of human variation and genotype data. The final data set contains data for 2,504 individuals from 26 populations and 84 million identified variants. For more information, visit the 1000 Genome Project [website](https://www.internationalgenome.org/) and these publications:
 
 [Pilot Analysis: A map of human genome variation from population-scale sequencing Nature 467, 1061-1073 (28 October 2010)](https://www.nature.com/articles/nature09534)
```

articles/open-datasets/dataset-clinvar-annotations.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -9,6 +9,8 @@ ms.date: 06/13/2024
 
 # ClinVar Annotations
 
+[!INCLUDE [Open Dataset access change notice](./includes/open-datasets-change-note.md)]
+
 The [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar/) resource is a freely accessible, public archive of reports - with supporting evidence - about the relationships among human variations and phenotypes. It facilitates access to and communication about the claimed relationships between human variation and observed health status, and about the history of that interpretation. It provides access to a broader set of clinical interpretations that researchers can incorporate into genomics workflows and applications.
 
 Visit the [Data Dictionary](https://www.ncbi.nlm.nih.gov/projects/clinvar/ClinVarDataDictionary.pdf) and the [FAQ resource](https://www.ncbi.nlm.nih.gov/clinvar/docs/faq/) for more information about the data.
```

articles/open-datasets/dataset-encode.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -8,6 +8,8 @@ ms.date: 04/16/2021
 
 # ENCODE: Encyclopedia of DNA Elements
 
+[!INCLUDE [Open Dataset access change notice](./includes/open-datasets-change-note.md)]
+
 The [Encyclopedia of DNA Elements (ENCODE) Consortium](https://www.encodeproject.org/help/project-overview/) is an ongoing international collaboration of research groups funded by the National Human Genome Research Institute (NHGRI). ENCODE's goal is to build a comprehensive parts list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control cells and circumstances in which a gene is active.
 
 ENCODE investigators employ various assays and methods to identify functional elements. The discovery and annotation of gene elements is accomplished primarily by sequencing a diverse range of RNA sources, comparative genomics, integrative bioinformatic methods, and human curation. Regulatory elements are typically investigated through DNA hypersensitivity assays, assays of DNA methylation, and immunoprecipitation (IP) of proteins that interact with DNA and RNA, that is, modified histones, transcription factors, chromatin regulators, and RNA-binding proteins, followed by sequencing.
```

articles/open-datasets/dataset-gatk-resource-bundle.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -8,6 +8,8 @@ ms.date: 04/16/2021
 
 # GATK Resource Bundle
 
+[!INCLUDE [Open Dataset access change notice](./includes/open-datasets-change-note.md)]
+
 The [GATK resource bundle](https://gatk.broadinstitute.org/hc/articles/360035890811-Resource-bundle) is a collection of standard files for working with human resequencing data with the GATK.
 
 [!INCLUDE [Open Dataset usage notice](./includes/open-datasets-usage-note.md)]
```

articles/open-datasets/dataset-genomics-data-lake.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -30,6 +30,8 @@ The Genomics Data Lake is hosted in the West US 2 and West Central US Azure regi
 | [GATK Resource Bundle](dataset-gatk-resource-bundle.md) | GATK Resource bundle |
 | [TCGA Open Data](dataset-the-cancer-genome-atlas.md) | TCGA Open Data |
 | [Pan UK-Biobank](dataset-panancestry-uk-bio-bank.md) | Pan UK-Biobank |
+| [ImmuneCODE database](dataset-immunecode.md) | ImmuneCODE database |
+| [Open Targets dataset](dataset-panancestry-uk-bio-bank.md) | Open Targets dataset |
 
 ## Next steps
 
```

articles/open-datasets/dataset-human-reference-genomes.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -8,6 +8,8 @@ ms.date: 04/16/2021
 
 # Human Reference Genomes
 
+[!INCLUDE [Open Dataset access change notice](./includes/open-datasets-change-note.md)]
+
 This dataset includes two human-genome references assembled by the [Genome Reference Consortium](https://www.ncbi.nlm.nih.gov/grc): Hg19 and Hg38.
 
 For more information on Hg19 (GRCh37) data, see the [GRCh37 report at NCBI](https://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.13/).
```
