Update k8s compute TSG and log info

jiaochenlu · jiaochenlu · commit ebb3c00659eb · 2023-01-12T14:25:13.000+08:00
diff --git a/articles/machine-learning/how-to-troubleshoot-online-endpoints.md b/articles/machine-learning/how-to-troubleshoot-online-endpoints.md
@@ -600,7 +600,7 @@ Below are common error codes when consuming Kubernetes online endpoints with RES
 | 429 | Too many pending requests | Your model is getting more requests than it can handle. We allow maximum 2 * `max_concurrent_requests_per_instance` * `instance_count` requests in parallel at any time. Additional requests are rejected. You can confirm these settings in your model deployment config under `request_settings` and `scale_settings`, respectively. If you're using auto-scaling, your model is getting requests faster than the system can scale up. With auto-scaling, you can try to resend requests with [exponential backoff](https://aka.ms/exponential-backoff). Doing so can give the system time to adjust. Apart from enable auto-scaling, you could also increase the number of instances by using the below [code](#how-to-prevent-503-status-codes). |
 | 502 | Has thrown an exception or crashed in the `run()` method of the score.py file | When there's an error in `score.py`, for example an imported package does not exist in the conda environment, a syntax error, or a failure in the `init()` method. You can follow [here](#error-resourcenotready) to debug the file. |
 | 503 | Receive large spikes in requests per second | The autoscaler is designed to handle gradual changes in load. If you receive large spikes in requests per second, clients may receive an HTTP status code 503. Even though the autoscaler reacts quickly, it takes AKS a significant amount of time to create more containers. You can follow [here](#how-to-prevent-503-status-codes) to prevent 503 status codes.|
-| 504 | Request has timed out | A 504 status code indicates that the request has timed out. The default timeout is 1 minute. You can increase the timeout or try to speed up the endpoint by modifying the score.py to remove unnecessary calls. If these actions don't correct the problem, you can follow [here](#error-resourcenotready) to debug the score.py file. The code may be in a non-responsive state or an infinite loop. |
+| 504 | Request has timed out | A 504 status code indicates that the request has timed out. The default timeout is 5 seconds. You can increase the timeout or try to speed up the endpoint by modifying score.py to remove unnecessary calls. If these actions don't correct the problem, you can follow [here](#error-resourcenotready) to debug the score.py file. The code may be in a non-responsive state or an infinite loop. |
 | 500 | Internal server error | Azure ML-provisioned infrastructure is failing. |
 
 
diff --git a/articles/machine-learning/reference-kubernetes.md b/articles/machine-learning/reference-kubernetes.md
@@ -391,15 +391,18 @@ More information about how to use ARM template can be found from [ARM template d
 ## Azureml extension release note
 > [!NOTE]
  >
- > New features are released at a biweekly cadance.
+ > New features are released at a biweekly cadence.
 
 | Date | Version |Version description |
 |---|---|---|
-| Aug 29, 2022 | 1.1.9 | Improved health check logic. Bugs fixed.|
-| Jun 23, 2022 | 1.1.6 | Bugs fixed. |
-| Jun 15, 2022 | 1.1.5 | Updated training to use new common runtime to run jobs. Removed Azure Relay usage for AKS extension. Removed service bus usage from the extension. Updated security context usage. Updated inference scorefe to v2. Updated to use Volcano as training job scheduler. Bugs fixed. |
+| Dec 27, 2022 | 1.1.17 | Moved Fluent Bit from a DaemonSet to sidecars. Added MDC support. Refined error messages. Added support for cluster mode (Windows, Linux) jobs. Bugfixes. |
+| Nov 29, 2022 | 1.1.16 | Added instance type validation through a new CRD. Support Tolerance. Shortened SVC name. Workload Core hour. Multiple bugfixes and improvements. |
+| Sep 13, 2022 | 1.1.10 | Bugfixes.|
+| Aug 29, 2022 | 1.1.9 | Improved health check logic. Bugfixes.|
+| Jun 23, 2022 | 1.1.6 | Bugfixes. |
+| Jun 15, 2022 | 1.1.5 | Updated training to use new common runtime to run jobs. Removed Azure Relay usage for AKS extension. Removed service bus usage from the extension. Updated security context usage. Updated inference azureml-fe to v2. Updated to use Volcano as training job scheduler. Bugfixes. |
 | Oct 14, 2021 | 1.0.37 | PV/PVC volume mount support in AMLArc training job. |
-| Sept 16, 2021 | 1.0.29 | New regions available, WestUS, CentralUS, NorthCentralUS, KoreaCentral. Job queue explainability. See job queue details in AML Workspace Studio. Auto-killing policy. Support max_run_duration_seconds in ScriptRunConfig. The system will attempt to automatically cancel the run if it took longer than the setting value. Performance improvement on cluster autoscale support. Arc agent and ML extension deployment from on premises container registry.|
+| Sept 16, 2021 | 1.0.29 | New regions available, WestUS, CentralUS, NorthCentralUS, KoreaCentral. Job queue explainability. See job queue details in AML Workspace Studio. Auto-killing policy. Support max_run_duration_seconds in ScriptRunConfig. The system will attempt to automatically cancel the run if it took longer than the setting value. Performance improvement on cluster auto scaling support. Arc agent and ML extension deployment from on premises container registry.|
 | August 24, 2021 | 1.0.28 | Compute instance type is supported in job YAML. Assign Managed Identity to AMLArc compute.|
 | August 10, 2021 | 1.0.20 |New Kubernetes distribution support, K3S - Lightweight Kubernetes. Deploy AzureML extension to your AKS cluster without connecting via Azure Arc. Automated Machine Learning (AutoML) via Python SDK. Use 2.0 CLI to attach the Kubernetes cluster to AML Workspace. Optimize AzureML extension components CPU/memory resources utilization.|
 | July 2, 2021 | 1.0.13 | New Kubernetes distributions support, OpenShift Kubernetes and GKE (Google Kubernetes Engine). Autoscale support. If the user-managed Kubernetes cluster enables the autoscale, the cluster will be automatically scaled out or scaled in according to the volume of active runs and deployments. Performance improvement on job launcher, which shortens the job execution time to a great deal.|