MicrosoftDocs
diff --git a/learn-pr/azure/introduction-to-data-for-machine-learning/1-introduction.yml b/learn-pr/azure/introduction-to-data-for-machine-learning/1-introduction.yml
Lines changed: 14 additions & 14 deletions
diff --git a/learn-pr/azure/introduction-to-data-for-machine-learning/2-detect-correct-data.yml b/learn-pr/azure/introduction-to-data-for-machine-learning/2-detect-correct-data.yml
Lines changed: 15 additions & 15 deletions
diff --git a/learn-pr/azure/introduction-to-data-for-machine-learning/3-exercise-detect-visualize-missing-data.yml b/learn-pr/azure/introduction-to-data-for-machine-learning/3-exercise-detect-visualize-missing-data.yml
Lines changed: 14 additions & 14 deletions
diff --git a/learn-pr/azure/introduction-to-data-for-machine-learning/4-examine-data-types.yml b/learn-pr/azure/introduction-to-data-for-machine-learning/4-examine-data-types.yml
Lines changed: 15 additions & 15 deletions
diff --git a/learn-pr/azure/introduction-to-data-for-machine-learning/5-exercise-normalize-data-predict-missing-values.yml b/learn-pr/azure/introduction-to-data-for-machine-learning/5-exercise-normalize-data-predict-missing-values.yml
Lines changed: 15 additions & 15 deletions
diff --git a/learn-pr/azure/introduction-to-data-for-machine-learning/6-evaluate-image-language-data.yml b/learn-pr/azure/introduction-to-data-for-machine-learning/6-evaluate-image-language-data.yml
Lines changed: 14 additions & 14 deletions
diff --git a/learn-pr/azure/introduction-to-data-for-machine-learning/7-exercise-one-hot-vectors.yml b/learn-pr/azure/introduction-to-data-for-machine-learning/7-exercise-one-hot-vectors.yml
Lines changed: 15 additions & 15 deletions
diff --git a/learn-pr/azure/introduction-to-data-for-machine-learning/8-knowledge-check.yml b/learn-pr/azure/introduction-to-data-for-machine-learning/8-knowledge-check.yml
Lines changed: 60 additions & 60 deletions
diff --git a/learn-pr/azure/introduction-to-data-for-machine-learning/9-summary.yml b/learn-pr/azure/introduction-to-data-for-machine-learning/9-summary.yml
Lines changed: 14 additions & 14 deletions
diff --git a/learn-pr/azure/introduction-to-data-for-machine-learning/includes/1-introduction.md b/learn-pr/azure/introduction-to-data-for-machine-learning/includes/1-introduction.md
Lines changed: 3 additions & 3 deletions
@@ -1,14 +1,14 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.introduction-to-data-for-machine-learning.introduction
-title: Introduction
-metadata:
-  title: Introduction
-  description: Introduction to data for machine learning module.
-  ms.date: 10/10/2024
-  author: fbsolo-ms1
-  ms.author: franksolomon
-  ms.reviewer: franksolomon
-  ms.topic: unit
-durationInMinutes: 2
-content: |
-  [!include[](includes/1-introduction.md)] 
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.introduction-to-data-for-machine-learning.introduction
+title: Introduction
+metadata:
+  title: Introduction
+  description: Introduction to data for machine learning module.
+  ms.date: 05/21/2025
+  author: fbsolo-ms1
+  ms.author: franksolomon
+  ms.reviewer: franksolomon
+  ms.topic: unit
+durationInMinutes: 2
+content: |
+  [!include[](includes/1-introduction.md)] 
@@ -1,15 +1,15 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.introduction-to-data-for-machine-learning.detect-correct-data
-title: Good, bad, and missing data
-metadata:
-  title: Good, bad, and missing data
-  description: Conceptual unit introducing types of data in machine learning
-  ms.date: 10/10/2024
-  author: fbsolo-ms1
-  ms.author: franksolomon
-  ms.reviewer: franksolomon
-  ms.topic: unit
-durationInMinutes: 3
-content: |
-  [!include[](includes/2-detect-correct-data.md)]
-
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.introduction-to-data-for-machine-learning.detect-correct-data
+title: Good, bad, and missing data
+metadata:
+  title: Good, bad, and missing data
+  description: Conceptual unit introducing types of data in machine learning
+  ms.date: 05/21/2025
+  author: fbsolo-ms1
+  ms.author: franksolomon
+  ms.reviewer: franksolomon
+  ms.topic: unit
+durationInMinutes: 3
+content: |
+  [!include[](includes/2-detect-correct-data.md)]
+
@@ -1,14 +1,14 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.introduction-to-data-for-machine-learning.exercise-detect-visualize-missing-data
-title: Exercise - Visualize missing data
-metadata:
-  title: Exercise - Visualize missing data
-  description: Learn how to detect and visualize missing data.
-  ms.date: 10/10/2024
-  author: fbsolo-ms1
-  ms.author: franksolomon
-  ms.topic: unit
-durationInMinutes: 8
-sandbox: true
-notebook: notebooks/3-3-exercise-detect-visualize-missing-data.ipynb
-
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.introduction-to-data-for-machine-learning.exercise-detect-visualize-missing-data
+title: Exercise - Visualize missing data
+metadata:
+  title: Exercise - Visualize missing data
+  description: Learn how to detect and visualize missing data.
+  ms.date: 05/21/2025
+  author: fbsolo-ms1
+  ms.author: franksolomon
+  ms.topic: unit
+durationInMinutes: 8
+sandbox: true
+notebook: notebooks/3-3-exercise-detect-visualize-missing-data.ipynb
+
@@ -1,15 +1,15 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.introduction-to-data-for-machine-learning.examine-data-types
-title: Examine different types of data
-metadata:
-  title: Examine different types of data
-  description: Conceptual unit about examining different types of data in machine learning
-  ms.date: 10/10/2024
-  author: fbsolo-ms1
-  ms.author: franksolomon
-  ms.topic: unit
-  ms.reviewer: franksolomon
-durationInMinutes: 4
-content: |
-  [!include[](includes/4-examine-data-types.md)]
-
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.introduction-to-data-for-machine-learning.examine-data-types
+title: Examine different types of data
+metadata:
+  title: Examine different types of data
+  description: Conceptual unit about examining different types of data in machine learning
+  ms.date: 05/21/2025
+  author: fbsolo-ms1
+  ms.author: franksolomon
+  ms.topic: unit
+  ms.reviewer: franksolomon
+durationInMinutes: 4
+content: |
+  [!include[](includes/4-examine-data-types.md)]
+
@@ -1,15 +1,15 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.introduction-to-data-for-machine-learning.exercise-normalize-data-predict-missing-values
-title: Exercise - Work with data to predict missing values
-metadata:
-  title: Exercise - Work with data to predict missing values
-  description: Exercise unit about predicting missing values in machine learning
-  ms.date: 10/10/2024
-  author: fbsolo-ms1
-  ms.author: franksolomon
-  ms.topic: unit
-  ms.reviewer: franksolomon
-durationInMinutes: 8
-sandbox: true
-notebook: notebooks/3-5-exercise-normalize-data-predict-missing-values.ipynb
-
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.introduction-to-data-for-machine-learning.exercise-normalize-data-predict-missing-values
+title: Exercise - Work with data to predict missing values
+metadata:
+  title: Exercise - Work with data to predict missing values
+  description: Exercise unit about predicting missing values in machine learning
+  ms.date: 05/21/2025
+  author: fbsolo-ms1
+  ms.author: franksolomon
+  ms.topic: unit
+  ms.reviewer: franksolomon
+durationInMinutes: 8
+sandbox: true
+notebook: notebooks/3-5-exercise-normalize-data-predict-missing-values.ipynb
+
@@ -1,14 +1,14 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.introduction-to-data-for-machine-learning.evaluate-image-language-data
-title: One-hot vectors
-metadata:
-  title: One-hot vectors
-  description: Conceptual unit about one-hot vectors in machine learning
-  ms.date: 10/10/2024
-  author: fbsolo-ms1
-  ms.author: franksolomon
-  ms.topic: unit
-  ms.reviewer: franksolomon
-durationInMinutes: 5
-content: |
-  [!include[](includes/6-evaluate-image-language-data.md)] 
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.introduction-to-data-for-machine-learning.evaluate-image-language-data
+title: One-hot vectors
+metadata:
+  title: One-hot vectors
+  description: Conceptual unit about one-hot vectors in machine learning
+  ms.date: 05/21/2025
+  author: fbsolo-ms1
+  ms.author: franksolomon
+  ms.topic: unit
+  ms.reviewer: franksolomon
+durationInMinutes: 5
+content: |
+  [!include[](includes/6-evaluate-image-language-data.md)] 
@@ -1,15 +1,15 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.introduction-to-data-for-machine-learning.exercise-one-hot-vectors
-title: Exercise - Predict unknown values using one-hot vectors
-metadata:
-  title: Exercise - Predict unknown values using one-hot vectors
-  description: Exercise unit using one-hot vectors to predict unknown values in machine learning
-  ms.date: 10/10/2024
-  author: fbsolo-ms1
-  ms.author: franksolomon
-  ms.topic: unit
-  ms.reviewer: franksolomon
-durationInMinutes: 10
-notebook: notebooks/3-7-exercise-one-hot-vectors.ipynb
-sandbox: true
-
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.introduction-to-data-for-machine-learning.exercise-one-hot-vectors
+title: Exercise - Predict unknown values using one-hot vectors
+metadata:
+  title: Exercise - Predict unknown values using one-hot vectors
+  description: Exercise unit using one-hot vectors to predict unknown values in machine learning
+  ms.date: 05/21/2025
+  author: fbsolo-ms1
+  ms.author: franksolomon
+  ms.topic: unit
+  ms.reviewer: franksolomon
+durationInMinutes: 10
+notebook: notebooks/3-7-exercise-one-hot-vectors.ipynb
+sandbox: true
+
@@ -1,60 +1,60 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.introduction-to-data-for-machine-learning.knowledge-check
-title: Module assessment
-metadata:
-  title: Module assessment
-  description: Multiple-choice questions
-  ms.date: 10/10/2024
-  author: fbsolo-ms1
-  ms.author: franksolomon
-  ms.reviewer: franksolomon
-  ms.topic: unit
-durationInMinutes: 3
-quiz:
-  title: Check your knowledge
-  questions:
-  - content: 'Why do we clean our data before training?'
-    choices:
-    - content: "Removing rows of data makes our model more powerful."
-      isCorrect: false
-      explanation: "Incorrect. While removing bad data makes models perform better, only removing data doesn't make models more powerful."
-    - content: "Cleaning data helps us select features that help the performance of the model."
-      isCorrect: false
-      explanation: "Incorrect. Cleaning data might help us select features, but cleaning data is used to fix problems with the data."
-    - content: "Removing rows that have errors prevents these rows from misleading the training process."
-      isCorrect: true
-      explanation: "Correct. Cleaning data helps prevent errors from incomplete or error-prone data points."
-  - content: 'What kind of data are best encoded with one-hot vectors?'
-    choices:
-    - content: "Ordinal data"
-      isCorrect: false
-      explanation: "Incorrect. One-hot vectors are best used in other areas where we have clear classes."
-    - content: "Categorical data with two possible values"
-      isCorrect: false
-      explanation: "Incorrect. This kind of data can be encoded in a single column as a 0 and a 1."
-    - content: "Categorical data with three or more values"
-      isCorrect: true
-      explanation: "Correct. One-hot vectors are best used with multiple classes or categories so that models can better interpret them."
-  - content: 'What is a data sample? What is a population?'
-    choices:
-    - content: "A sample is all possible data we care about. A population is the subset of that data which we actually have on hand."
-      isCorrect: false
-      explanation: "Incorrect. A sample is a portion, or subset, of the data we care about. A population is all the available data."
-    - content: "Both population and sample refer to data we use to train our model."
-      isCorrect: false
-      explanation: "Incorrect. Although we can train models with population and sample data, they mean different things."
-    - content: "A population is all possible data we care about. A sample is the subset of that data which we actually have on hand."
-      isCorrect: true
-      explanation: "Correct. A population is all the possible data we could collect for a data set, and a sample is a portion of the data which we already have."
-  - content: "You have a model that doesn't perform well. Which of these options definitely do **not** help improve its performance?"
-    choices:
-    - content: "Adding more samples (rows)."
-      isCorrect: false
-      explanation: "Incorrect. Adding rows of data likely helps your dataset become more representative, and so helps your model train."
-    - content: "Adding a few features (columns) that you know relate to what the model is trying to predict."
-      isCorrect: false
-      explanation: "Incorrect. So long as you have enough rows of data, adding relevant features is likely to help your model train."
-    - content: "Adding a large number of features that you know have no relation to what the model is trying to predict."
-      isCorrect: true
-      explanation: "Correct. Adding more features that aren't relevant probably harms its performance"
-
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.introduction-to-data-for-machine-learning.knowledge-check
+title: Module assessment
+metadata:
+  title: Module assessment
+  description: Multiple-choice questions
+  ms.date: 05/21/2025
+  author: fbsolo-ms1
+  ms.author: franksolomon
+  ms.reviewer: franksolomon
+  ms.topic: unit
+durationInMinutes: 3
+quiz:
+  title: Check your knowledge
+  questions:
+  - content: 'Why do we clean our data before training?'
+    choices:
+    - content: "Removing rows of data makes our model more powerful."
+      isCorrect: false
+      explanation: "Incorrect. While removing bad data makes models perform better, only removing data doesn't make models more powerful."
+    - content: "Cleaning data helps us select features that help the performance of the model."
+      isCorrect: false
+      explanation: "Incorrect. Cleaning data might help us select features, but cleaning data is used to fix problems with the data."
+    - content: "Removing rows that have errors prevents these rows from misleading the training process."
+      isCorrect: true
+      explanation: "Correct. Cleaning data helps prevent errors from incomplete or error-prone data points."
+  - content: 'What kind of data are best encoded with one-hot vectors?'
+    choices:
+    - content: "Ordinal data"
+      isCorrect: false
+      explanation: "Incorrect. One-hot vectors are best used in other areas where we have clear classes."
+    - content: "Categorical data with two possible values"
+      isCorrect: false
+      explanation: "Incorrect. This kind of data can be encoded in a single column as a 0 and a 1."
+    - content: "Categorical data with three or more values"
+      isCorrect: true
+      explanation: "Correct. One-hot vectors are best used with multiple classes or categories so that models can better interpret them."
+  - content: 'What is a data sample? What is a population?'
+    choices:
+    - content: "A sample is all possible data we care about. A population is the subset of that data which we actually have on hand."
+      isCorrect: false
+      explanation: "Incorrect. A sample is a portion, or subset, of the data we care about. A population is all the available data."
+    - content: "Both population and sample refer to data we use to train our model."
+      isCorrect: false
+      explanation: "Incorrect. Although we can train models with population and sample data, they mean different things."
+    - content: "A population is all possible data we care about. A sample is the subset of that data which we actually have on hand."
+      isCorrect: true
+      explanation: "Correct. A population is all the possible data we could collect for a data set, and a sample is a portion of the data which we already have."
+  - content: "You have a model that doesn't perform well. Which of these options definitely do **not** help improve its performance?"
+    choices:
+    - content: "Adding more samples (rows)."
+      isCorrect: false
+      explanation: "Incorrect. Adding rows of data likely helps your dataset become more representative, and so helps your model train."
+    - content: "Adding a few features (columns) that you know relate to what the model is trying to predict."
+      isCorrect: false
+      explanation: "Incorrect. So long as you have enough rows of data, adding relevant features is likely to help your model train."
+    - content: "Adding a large number of features that you know have no relation to what the model is trying to predict."
+      isCorrect: true
+      explanation: "Correct. Adding more features that aren't relevant probably harms its performance"
+
@@ -1,14 +1,14 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.introduction-to-data-for-machine-learning.summary
-title: Summary
-metadata:
-  title: Summary
-  description: An overview of the content covered in the module.
-  ms.date: 10/10/2024
-  author: fbsolo-ms1
-  ms.author: franksolomon
-  ms.reviewer: franksolomon
-  ms.topic: unit
-durationInMinutes: 2
-content: |
-  [!include[](includes/9-summary.md)] 
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.introduction-to-data-for-machine-learning.summary
+title: Summary
+metadata:
+  title: Summary
+  description: An overview of the content covered in the module.
+  ms.date: 05/21/2025
+  author: fbsolo-ms1
+  ms.author: franksolomon
+  ms.reviewer: franksolomon
+  ms.topic: unit
+durationInMinutes: 2
+content: |
+  [!include[](includes/9-summary.md)] 
@@ -1,14 +1,14 @@
 Machine learning gets its predictive power from the data that shapes it. To build effective models, you must understand the data you use.
 
-Here, we explore how both humans and computers categorize, store, and interpret data. We examine what makes a good dataset, and how to fix issues in our available data. We also practice exploration of new data, and we see how deep thinking about a dataset can help us build better predictive models.
+Here, we explore how both humans and computers categorize, store, and interpret data. We examine what makes a good dataset, and how to fix issues in our available data. We also practice exploring new data, and we see how deep thinking about a dataset can help us build better predictive models.
 
 ## Scenario: the last voyage of the Titanic
 
-As an eager marine archaeologist, you have an unusually keen interest in maritime disasters. Late one night, while clicking between images of whale bones and ancient scrolls about Atlantis, you find a public dataset that lists known passengers and crew of the first, and last, voyage of the Titanic. Drawn in by the balance between fate and chance, you wonder, what factors determined the survival of a Titanic passenger? Data from this period are somewhat incomplete. Much information for certain passengers is unknown. You must find ways to patch up this data before you can fully analyze it.
+As an eager marine archaeologist, you have an unusually keen interest in maritime disasters. Late one night, while clicking between images of whale bones and ancient scrolls about Atlantis, you find a public dataset that lists known passengers and crew of the first (and last) voyage of the Titanic. Drawn in by the balance between fate and chance, you wonder, what factors determined the survival of a Titanic passenger? Data from this period are somewhat incomplete. Information for certain passengers is unknown. You must find ways to patch up this data before you can fully analyze it.
 
 ## Prerequisites
 
-- Some familiarity with machine learning concepts (such as models and cost) helps, but it's not required.
+- Some familiarity with machine-learning concepts (such as models and cost) helps, but it's not required.
 
 ## Learning objectives