docs/Machine-Learning/Overfitting, Underfitting.md
33 additions & 33 deletions
@@ -4,13 +4,13 @@ description: Comprehensive guide to Overfitting and Underfitting with mathematic
comments: true
---
-# =Ø Overfitting and Underfitting
+# 🎯 Overfitting and Underfitting
Overfitting and Underfitting are fundamental concepts in machine learning that describe how well a model generalizes to unseen data - the central challenge in building reliable predictive models.
**Overfitting** occurs when a model learns the training data too well, capturing noise and specific patterns that don't generalize to new data. **Underfitting** happens when a model is too simple to capture the underlying patterns in the data.
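The contrast in the context lines above can be made concrete with a small sketch (not part of the docs in this diff; scikit-learn and the synthetic data are illustrative assumptions): a degree-1 polynomial underfits a noisy sine curve, while a degree-15 polynomial fits the training noise and generalizes poorly.

```python
# Sketch: under- vs over-fitting via polynomial degree on noisy synthetic data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.uniform(0, 1, size=(60, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=60)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):  # too simple, reasonable, too flexible
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

A large gap between training and test error at high degree is the overfitting signature; high error on both at degree 1 is underfitting.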
@@ -42,7 +42,7 @@ Overfitting and Underfitting are fundamental concepts in machine learning that d
-**Regularization**: Primary technique to prevent overfitting
-**Cross-Validation**: Method to detect and measure fitting issues
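A hedged illustration of those two list items (library, dataset, and alpha values are illustrative, not taken from the doc): Ridge regularization limits effective model complexity, and the gap between the training score and the cross-validated score is a practical fitting-issue detector.

```python
# Sketch: regularization strength vs. train/CV gap as an overfitting check.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=50, noise=10.0, random_state=0)

for alpha in (0.01, 1.0, 100.0):  # larger alpha = stronger regularization
    model = Ridge(alpha=alpha)
    cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    train_r2 = model.fit(X, y).score(X, y)
    print(f"alpha={alpha:7.2f}  train R^2={train_r2:.3f}  CV R^2={cv_r2:.3f}")
```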
-## >à Intuition
+## 🧠 Intuition
### How Overfitting and Underfitting Work
@@ -86,7 +86,7 @@ Training and validation error as functions of:
-**Sample size**: $\text{Error}(n)$
-**Model complexity**: $\text{Error}(\lambda)$ where $\lambda$ controls complexity
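Both curves named above can be sketched with scikit-learn's learning_curve and validation_curve utilities (the estimator, dataset, and parameter ranges below are assumptions for illustration, not from the doc):

```python
# Sketch: error as a function of sample size n and of a complexity knob.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve, validation_curve

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
est = LogisticRegression(max_iter=1000)

# Error(n): vary the number of training samples
sizes, train_sc, val_sc = learning_curve(
    est, X, y, train_sizes=np.linspace(0.1, 1.0, 5), cv=5)
print("Error(n):", dict(zip(sizes, np.round(1 - val_sc.mean(axis=1), 3))))

# Error as a function of complexity: vary inverse regularization strength C
C_range = np.logspace(-3, 2, 6)
train_sc, val_sc = validation_curve(
    est, X, y, param_name="C", param_range=C_range, cv=5)
print("Error(C):", dict(zip(np.round(C_range, 3), np.round(1 - val_sc.mean(axis=1), 3))))
```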
-## =" Implementation using Libraries
+## 🛠️ Implementation using Libraries
### Scikit-learn Implementation
@@ -148,7 +148,7 @@ for i, (name, model) in enumerate(models.items(), 1):
docs/Machine-Learning/PCA.md
20 additions & 20 deletions
@@ -4,7 +4,7 @@ description: Comprehensive guide to Principal Component Analysis with mathematic
comments: true
---
-# =Ø Principal Component Analysis (PCA)
+# 🎯 Principal Component Analysis (PCA)
PCA is a fundamental dimensionality reduction technique that transforms high-dimensional data into a lower-dimensional space while preserving maximum variance, making it invaluable for data visualization, noise reduction, and feature extraction.
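A minimal usage sketch of the idea in that context line, assuming scikit-learn's PCA (the iris dataset and the choice of two components are illustrative, not from the diff):

```python
# Sketch: project standardized data onto 2 components and check retained variance.
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X = load_iris().data                      # 4 original features
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)        # reduced to 2 dimensions
print(X_2d.shape)                         # (150, 2)
print(pca.explained_variance_ratio_)      # variance retained per component
```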
@@ -38,7 +38,7 @@ Principal Component Analysis (PCA) is an unsupervised linear dimensionality redu
-**Sparse PCA**: Incorporates sparsity constraints on components
-**Incremental PCA**: For large datasets that don't fit in memory
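Both variants named in those context lines are available in scikit-learn; a brief sketch under illustrative settings (batch count, alpha, and the random data are assumptions):

```python
# Sketch: the two PCA variants listed above, via scikit-learn.
import numpy as np
from sklearn.decomposition import IncrementalPCA, SparsePCA

X = np.random.RandomState(0).randn(1000, 20)

# Incremental PCA: fit in batches rather than loading everything at once
ipca = IncrementalPCA(n_components=5)
for batch in np.array_split(X, 10):       # stand-in for chunks read from disk
    ipca.partial_fit(batch)
X_reduced = ipca.transform(X)

# Sparse PCA: components with many exactly-zero loadings
spca = SparsePCA(n_components=5, alpha=1.0, random_state=0)
X_sparse = spca.fit_transform(X)
print(X_reduced.shape, X_sparse.shape)
```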
-## >à Intuition
+## 🧠 Intuition
### How PCA Works
@@ -242,7 +242,7 @@ def compare_with_without_pca(X, y, n_components=2):
compare_with_without_pca(X_scaled, y, n_components=2)
```
-## From Scratch Implementation
+## � From Scratch Implementation
```python
import numpy as np
@@ -458,7 +458,7 @@ for comp, error in zip(components, errors):
print(f"{comp} components: {error:.6f}")
```
-## Assumptions and Limitations
+## � Assumptions and Limitations
### Key Assumptions
@@ -506,15 +506,15 @@ for comp, error in zip(components, errors):
- With very sparse data (consider specialized sparse PCA)
- When you need exactly interpretable features for regulatory compliance
-## =¡ Interview Questions
+## ❓ Interview Questions
??? question "What is the mathematical intuition behind PCA and how does it work?"
**Answer:** PCA finds the directions (principal components) in the data that capture the maximum variance. Mathematically, it performs eigenvalue decomposition on the covariance matrix:
1. **Center the data**: Subtract the mean from each feature
-3. **Find eigenvalues and eigenvectors**: C*v = »*v
+3. **Find eigenvalues and eigenvectors**: C*v = λ*v
4. **Sort by eigenvalues**: Largest eigenvalues correspond to directions with most variance
5. **Project data**: Transform original data onto selected eigenvectors
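The steps listed in that answer map directly onto a short NumPy sketch (an editorial example, not the repository's own implementation; the function name and data are illustrative):

```python
# From-scratch PCA via eigendecomposition of the covariance matrix.
import numpy as np

def pca_via_eigendecomposition(X, n_components):
    X_centered = X - X.mean(axis=0)                 # 1. center the data
    C = np.cov(X_centered, rowvar=False)            # 2. covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)            # 3. solve C v = lambda v
    order = np.argsort(eigvals)[::-1]               # 4. sort by eigenvalue, descending
    components = eigvecs[:, order[:n_components]]
    return X_centered @ components                  # 5. project the data

X = np.random.RandomState(0).randn(100, 5)
print(pca_via_eigendecomposition(X, n_components=2).shape)  # (100, 2)
```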
@@ -530,7 +530,7 @@ for comp, error in zip(components, errors):
**Example**: Without standardization, if you have height (cm, ~170) and weight (kg, ~70), height will dominate simply due to larger numerical values, not because it's more important.
-**Solution**: Use z-score standardization: (x - ¼) / Ã for each feature.
+**Solution**: Use z-score standardization: (x - μ) / σ for each feature.
??? question "How do you choose the optimal number of principal components?"
@@ -583,20 +583,20 @@ for comp, error in zip(components, errors):