---
title: "Inferential Statistics"
sidebar_label: Inferential Statistics
description: "Understanding how to make predictions and inferences about populations using samples, hypothesis testing, and p-values."
tags: [statistics, inference, hypothesis-testing, p-value, confidence-intervals, mathematics-for-ml]
---

In Descriptive Statistics, we describe the data we have. In **Inferential Statistics**, we use that data to make "educated guesses" or predictions about data we *don't* have. This is the foundation of scientific discovery and model validation in Machine Learning.

## 1. The Core Workflow

Inferential statistics allows us to take a small sample and project those findings onto a larger population.

```mermaid
sankey-beta
%% source,target,value
Population,Sample,30
Sample,Analysis,30
Analysis,Point Estimates,15
Analysis,Confidence Intervals,15
Point Estimates,Population Inference,15
Confidence Intervals,Population Inference,15
```

## 2. Point Estimation

A **Point Estimate** is a single value (a statistic) used to estimate a population parameter.

* **Sample Mean ($\bar{x}$)** estimates the **Population Mean ($\mu$)**.
* **Sample Variance ($s^2$)** estimates the **Population Variance ($\sigma^2$)**.

However, because a sample captures only part of the population, a point estimate almost never matches the true parameter exactly. We use **Confidence Intervals** to express that uncertainty.

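The idea above can be sketched in a few lines of NumPy: draw a sample from a synthetic population with known parameters and compare the sample statistics to the true values. The population (mean 50, standard deviation 10) and sample size are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Synthetic "population" with known parameters: mu = 50, sigma = 10 (illustrative)
population = rng.normal(loc=50, scale=10, size=100_000)
sample = rng.choice(population, size=200, replace=False)

x_bar = sample.mean()      # point estimate of the population mean (mu)
s_sq = sample.var(ddof=1)  # point estimate of the population variance (sigma^2)

print(f"sample mean {x_bar:.2f} vs population mean {population.mean():.2f}")
print(f"sample variance {s_sq:.2f} vs population variance {population.var():.2f}")
```

Note `ddof=1` (Bessel's correction): dividing by $n-1$ instead of $n$ makes the sample variance an unbiased estimator of $\sigma^2$.
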
## 3. Hypothesis Testing

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics.

### The Two Hypotheses

1. **Null Hypothesis ($H_0$):** The "status quo." It assumes there is no effect or no difference. (e.g., "This new feature does not improve model accuracy.")
2. **Alternative Hypothesis ($H_a$):** What we want to prove. (e.g., "This new feature improves model accuracy.")

### The Decision Process

We use the **P-value** to decide whether to reject the Null Hypothesis.

```mermaid
flowchart TD
    Start["State Hypotheses H0 and Ha"] --> Alpha["Set Significance Level α - usually 0.05"]
    Alpha --> Test["Perform Statistical Test - t-test, Z-test"]
    Test --> PVal{Calculate P-value}
    PVal -- "P < α" --> Reject["Reject H0: Results are Statistically Significant"]
    PVal -- "P ≥ α" --> Fail["Fail to Reject H0: No significant effect found"]
```

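As a concrete sketch of this workflow, here is a two-sample t-test with SciPy, using the feature-improves-accuracy hypotheses from above. The fold accuracies are made-up numbers for illustration:

```python
from scipy import stats

# Hypothetical accuracy scores from 10 CV folds for each model variant
baseline     = [0.89, 0.90, 0.91, 0.905, 0.895, 0.90, 0.91, 0.89, 0.90, 0.905]
with_feature = [0.92, 0.93, 0.915, 0.925, 0.93, 0.92, 0.91, 0.935, 0.92, 0.925]

alpha = 0.05  # significance level chosen BEFORE looking at the data
t_stat, p_value = stats.ttest_ind(with_feature, baseline)

if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject H0 (statistically significant)")
else:
    print(f"p = {p_value:.4f} >= {alpha}: fail to reject H0")
```
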
## 4. Confidence Intervals

A **Confidence Interval (CI)** provides a range of values that is likely to contain the population parameter.

$$
\text{CI} = \text{Point Estimate} \pm (\text{Critical Value} \times \text{Standard Error})
$$

:::note Example
We are 95% confident that the true accuracy of our model on all future data is between 88% and 92%.
:::

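Plugging into the formula above, a minimal sketch with SciPy — the fold accuracies are hypothetical, and the critical value comes from the t-distribution because the sample is small:

```python
import numpy as np
from scipy import stats

# Hypothetical accuracy scores from 10 cross-validation folds
scores = np.array([0.88, 0.91, 0.90, 0.92, 0.89, 0.90, 0.91, 0.88, 0.90, 0.91])

point_estimate = scores.mean()
standard_error = stats.sem(scores)                       # s / sqrt(n)
critical_value = stats.t.ppf(0.975, df=len(scores) - 1)  # two-sided 95%

margin = critical_value * standard_error
low, high = point_estimate - margin, point_estimate + margin
print(f"95% CI for mean accuracy: [{low:.3f}, {high:.3f}]")
```

For large samples the t critical value approaches the familiar z value of 1.96.
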
## 5. Common Statistical Tests in ML

| Test | Use Case | Example in ML |
| --- | --- | --- |
| **Z-Test** | Comparing means with a large sample (n > 30) and known variance. | Comparing the average spend of two large user groups. |
| **T-Test** | Comparing means with a small sample (n < 30) or unknown variance. | Comparing performance of two model architectures on a small dataset. |
| **Chi-Square Test** | Testing relationships between categorical variables. | Is the "Click" rate independent of the "Device Type"? |
| **ANOVA** | Comparing means across 3 or more groups. | Does the choice of optimizer (Adam, SGD, RMSprop) significantly change accuracy? |

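The Chi-Square row can be tried directly with `scipy.stats.chi2_contingency` on a click-by-device contingency table; the counts below are invented for illustration:

```python
from scipy import stats

# Hypothetical contingency table of observed counts
#            Click  No Click
observed = [[  30,    170],   # Mobile
            [  45,    155]]   # Desktop

chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}, dof = {dof}")
# If p < alpha, we reject independence: click rate appears to depend on device type.
```

SciPy applies Yates' continuity correction by default for 2×2 tables, so the statistic is slightly smaller than the uncorrected hand calculation.
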
## 6. Type I and Type II Errors

When making inferences, we can be wrong in two ways:

```mermaid
quadrantChart
    title Statistical Decision Matrix
    x-axis "Null Hypothesis is True" --> "Null Hypothesis is False"
    y-axis "Reject Null" --> "Fail to Reject"
    quadrant-1 "Type II Error (False Negative)"
    quadrant-2 "Correct Decision (True Negative)"
    quadrant-3 "Type I Error (False Positive)"
    quadrant-4 "Correct Decision (True Positive)"
```

<br />

1. **Type I Error ($\alpha$):** You claim there is an effect when there isn't (False Positive).
2. **Type II Error ($\beta$):** You fail to detect an effect that actually exists (False Negative).

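A quick simulation makes $\alpha$ tangible: if we run many t-tests where $H_0$ is true by construction (both samples drawn from the same distribution), roughly 5% of them reject $H_0$ anyway. Sample sizes and the trial count are arbitrary choices for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)
alpha, n_trials = 0.05, 5000

false_positives = 0
for _ in range(n_trials):
    # Both samples come from the same N(0, 1), so H0 ("no difference") is true;
    # any rejection here is, by definition, a Type I error.
    a = rng.normal(size=30)
    b = rng.normal(size=30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1

type1_rate = false_positives / n_trials
print(f"Observed Type I error rate: {type1_rate:.3f}")  # hovers near alpha = 0.05
```
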
## 7. Why this matters for ML Engineers

* **A/B Testing:** Inferential statistics is the engine behind A/B testing new model versions in production.
* **Feature Selection:** We use tests like Chi-Square to see if a feature actually has a relationship with the target variable.
* **Model Comparison:** If Model A has 91% accuracy and Model B has 91.5%, is that difference "real" or just luck? Inferential stats tells you if the improvement is **statistically significant**.

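To make the model-comparison point concrete, here is a simplified two-proportion z-test, assuming (hypothetically) that each model was scored on its own independent test set of 2,000 examples. If both models share one test set, a paired test such as McNemar's is the more appropriate tool:

```python
import math
from scipy import stats

n = 2000                                    # hypothetical test-set size per model
correct_a, correct_b = 1820, 1830           # 91.0% vs 91.5% accuracy

p_a, p_b = correct_a / n, correct_b / n
p_pool = (correct_a + correct_b) / (2 * n)  # pooled proportion under H0
se = math.sqrt(p_pool * (1 - p_pool) * (2 / n))

z = (p_b - p_a) / se
p_value = 2 * stats.norm.sf(abs(z))         # two-sided p-value

print(f"z = {z:.2f}, p = {p_value:.3f}")
# A 0.5-point gap on 2,000 examples is well within sampling noise here.
```
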
---

Understanding inference allows us to trust our model's results. Now, we dive into the specific probability distributions that model the randomness we see in the real world.