
Commit 12c00ff

Edits for pausing content creation
1 parent b0dd401 commit 12c00ff

10 files changed: +101 -23 lines changed

_includes/footer/custom.html

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 <!-- start custom footer snippets -->
 
-<p>Open to Work <i class="fa fa-briefcase" aria-hidden="true"></i></p>
+<p>Open to Networking <i class="fa fa-briefcase" aria-hidden="true"></i></p>
 <!-- end custom footer snippets -->

_posts/2025-02-22-newsvendor.md

Lines changed: 1 addition & 0 deletions
@@ -87,6 +87,7 @@ If the quantity ordered is also continuous, like 3.42 gallons, then there is a p
 
 $$F(q^*)=\frac{c_s-c_p}{c_s+c_h}$$
 
+To prove the critical ratio is the same, you would set the derivative of the cost function equal to zero; however, computing this derivative will require Leibniz's Rule.
 
 ## Example
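
A quick numerical illustration of the critical ratio above (a sketch only: it assumes normally distributed demand, made-up cost values, and uses scipy for the quantile):

```python
import scipy.stats as stats

# Hypothetical costs: selling price c_s, purchase cost c_p, holding cost c_h
c_s, c_p, c_h = 10.0, 4.0, 1.0
critical_ratio = (c_s - c_p) / (c_s + c_h)   # F(q*) = (c_s - c_p) / (c_s + c_h)

# Assume demand ~ Normal(mu=100, sigma=20); q* is the demand quantile at the critical ratio
q_star = stats.norm.ppf(critical_ratio, loc=100, scale=20)
print(f"critical ratio = {critical_ratio:.3f}, q* = {q_star:.1f}")
```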

_posts/2025-03-05-estimates.md

Lines changed: 13 additions & 15 deletions
@@ -146,33 +146,31 @@ $$ f(x; \lambda) = \lambda e^{-\lambda x}, \quad x \geq 0 $$
 
 The loglikelihood function and derivative follow.
 
-$$l(\lambda) = \sum_{i=1}^{n} \log(\lambda e^{-\lambda x}) = n \log \lambda - \lambda \sum_{i=1}^n x_i$$
+$$l(\lambda) = \sum_{i=1}^{n} \log(\lambda e^{-\lambda x_i}) = n \log \lambda - \lambda \sum_{i=1}^n x_i$$ \\
 $$ \frac{dl}{d\lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i $$
 
-The MLE has derivative equal to zero
-$$\frac{n}{\lambda} - \sum_{i=1}^{n} x_i = 0 $$
-$$ \hat{\lambda} = \frac{n}{\sum_{i=1}^{n} x_i} $$
+Setting the derivative equal to zero gives the MLE \\
+$$\frac{n}{\lambda} - \sum_{i=1}^{n} x_i = 0 $$ \\
+$$ \hat{\lambda} = \frac{n}{\sum_{i=1}^{n} x_i} $$ \\
 
 ## Normal
 
 For a normal distribution with mean $\mu$ and variance $\sigma^2$, the pdf is
 $$ f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x - \mu)^2}{2\sigma^2}} $$
 
-The likelihood is
+The likelihood is \\
 $$ L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x_i - \mu)^2}{2\sigma^2}} $$
 
-The loglikelihood is
+The loglikelihood is \\
 $$ l(\mu, \sigma^2) = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 $$
 
-To find the MLEs, we take the partial derivatives of $l$ with respect to $\mu$ and $\sigma^2$ and set them equal to zero. For $\mu$
+To find the MLEs, we take the partial derivatives of $l$ with respect to $\mu$ and $\sigma^2$ and set them equal to zero. For $\mu$ \\
+$$ \frac{\partial l}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \mu) = 0 $$ \\
+$$ \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i $$ \\
 
-$$ \frac{\partial l}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \mu) = 0 $$
-$$ \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i $$
-
-For $\sigma^2$
-
-$$ \frac{\partial l}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i=1}^{n} (x_i - \mu)^2 = 0 $$
-$$ \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})^2 $$
+For $\sigma^2$ \\
+$$ \frac{\partial l}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i=1}^{n} (x_i - \mu)^2 = 0 $$ \\
+$$ \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})^2 $$ \\
 
 For Exponential and Normal, the maximum likelihood estimates are the same as the method of moments estimates.
 That's a good illustration of why method of moments is a good rule of thumb for simple distributions.
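
A quick numerical check of these closed forms (a sketch only, using simulated data with made-up true parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

# Exponential: MLE is n / sum(x_i), the reciprocal of the sample mean (true lambda = 2.5)
x = rng.exponential(scale=1 / 2.5, size=100_000)
lambda_hat = len(x) / x.sum()
print(lambda_hat)                        # close to 2.5

# Normal: MLEs are the sample mean and the 1/n (biased) sample variance (true mu = 3, sigma^2 = 4)
y = rng.normal(loc=3.0, scale=2.0, size=100_000)
mu_hat = y.mean()
sigma2_hat = ((y - mu_hat) ** 2).mean()  # same as y.var(ddof=0)
print(mu_hat, sigma2_hat)                # close to 3.0 and 4.0
```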
@@ -230,7 +228,7 @@ The Fisher Information quantifies the amount of information that an observable r
 Let $X=(X_1,X_2,...,X_N)$ be a vector of i.i.d. random variables, $T(X)$ be a transformation of $X$, and $x$ denote an outcome/value of $X$. Then $X_i$ has pmf/pdf $f(x_i;\theta)$ with parameter $\theta$ and $X$ has joint pmf/pdf $f(x;\theta) $ $=$ $f(x_1,x_2,...,x_n; \theta) $ $ =$ $\prod_{i=1}^n f(x_i; \theta)$.
 An estimate for $\theta$ calculated from a data sample is not necessarily a sufficient statistic. However, any efficient estimate for $\theta$ must be a function of a sufficient statistic $T(X)$. Definition: $T(X)$ is a sufficient statistic if the probability distribution of $X$ given a value for $T(X)$ is constant with respect to $\theta$. In math notation, $f(x \| T(X)=T(x))$ is not a function of $\theta$. In other words, the statistic $T(X)$ provides as much information about $\theta$ as the entire data sample $X$.
 
-The Fisher-Neyman Factorization Theorem provides a shortcut to prove that a statistic is sufficient. The theorem states $T(X)$ is a sufficient statistic for $\theta$ if joint pmf/pdf $f(x; \theta)$ $=$ $g(T(x), \theta) h(x)$ for some $g$ that is a function of $T(X)$ and the parameters and some $h$ that is any function of $X$ and the parameters besides $\theta$.
+The Fisher-Neyman Factorization Theorem provides a shortcut to prove that a statistic is sufficient. The theorem states $T(X)$ is a sufficient statistic for $\theta$ if joint pmf/pdf $f(x; \theta)$ $=$ $g(T(x), \theta) h(x)$ for some $g$ that is a function of $T(X)$ and the parameters and some $h$ that is any function of $X$ and the parameters besides $\theta$. In other words, $g$ is an expression in which $x$ appears only inside $T(x)$, and $h$ is an expression without $\theta$.
 
 For example, let $X_1, X_2, ..., X_n \sim \text{Poisson} (\lambda)$. The maximum likelihood estimate for $\lambda$ is the sample mean $\frac{1}{n} \sum_{i=1}^n x_i$.
 This is a function of $T(x)$ $ =$ $\sum_{i=1}^n x_i$, which is a sufficient statistic for $\lambda$. The sufficiency can be shown by the definition or theorem.
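
To make the Poisson example concrete, the factorization can be written out directly from the Poisson pmf:

$$ f(x; \lambda) = \prod_{i=1}^{n} \frac{\lambda^{x_i} e^{-\lambda}}{x_i!} = \underbrace{\lambda^{\sum_{i=1}^{n} x_i}\, e^{-n\lambda}}_{g(T(x),\, \lambda)} \cdot \underbrace{\prod_{i=1}^{n} \frac{1}{x_i!}}_{h(x)} $$

so with $T(x) = \sum_{i=1}^n x_i$, the factorization theorem confirms $T(X)$ is sufficient for $\lambda$.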

_posts/2026-01-01-regression1.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+layout: single
+title: "Multivariable Linear Regression"
+excerpt: ""
+categories:
+- Intro-Data-Science
+tags:
+- regression
+toc: true
+toc_label: "Table of Contents"
+# toc_icon:
+# header:
+# image:
+# teaser:
+---
Lines changed: 16 additions & 2 deletions
@@ -1,6 +1,6 @@
 ---
 layout: single
-title: "Model Selection in Linear Regression"
+title: "Model Selection for Linear Regression"
 excerpt: ""
 categories:
 - Intro-Data-Science
@@ -14,7 +14,21 @@ toc_label: "Table of Contents"
 # teaser:
 ---
 
-# Var
+# Train-Test Split
+The **train-test split** is a method used to evaluate the performance of a machine learning model. The dataset is divided into two subsets: the training set and the testing set. The model is trained on the training set and evaluated on the testing set.
+
+
+## Train-Test-Validate
+The **train-test-validate** approach involves splitting the dataset into three subsets: training, validation, and testing. The model is trained on the training set, tuned on the validation set, and evaluated on the testing set. The test data is reserved until after models are selected and compared, providing an unseen dataset to check whether the model selection process and the final model generalize to new data.
+
+
+## Cross-fold Validation
+**Cross-fold validation** (also called $k$-fold cross-validation or CV) is a technique where the dataset is divided into $k$ subsets, or folds. The model is trained and evaluated $k$ times, each time using a different fold as the testing set and the remaining folds as the training set. The final performance is the average of the $k$ evaluations.
+
+
+##
+
+# Variable Selection
 
 ## Forward Selection
 
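A minimal scikit-learn sketch of the three evaluation strategies added above (the synthetic data and the plain linear model are placeholder assumptions, not part of the post):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.3, size=200)

# Train-test split: hold out 20% of the data for the final evaluation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Train-test-validate: carve a validation set out of the training data for tuning
X_tr, X_val, y_tr, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)

# k-fold cross-validation (k = 5) on the training data only
cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = cross_val_score(LinearRegression(), X_train, y_train, cv=cv)
print("mean CV R^2:", cv_scores.mean())

# Only after a model is chosen: fit on all training data and score once on the held-out test set
final_model = LinearRegression().fit(X_train, y_train)
print("test R^2:", final_model.score(X_test, y_test))
```
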
_posts/2026-01-01-regression3.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+layout: single
+title: "Gradient Descent for Linear Regression"
+excerpt: ""
+categories:
+- Intro-Data-Science
+tags:
+- regression
+toc: true
+toc_label: "Table of Contents"
+# toc_icon:
+# header:
+# image:
+# teaser:
+---

_posts/2026-01-01-regression4.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+layout: single
+title: "Regularization for Linear Regression"
+excerpt: ""
+categories:
+- Intro-Data-Science
+tags:
+- regression
+toc: true
+toc_label: "Table of Contents"
+# toc_icon:
+# header:
+# image:
+# teaser:
+---

_posts/2026-01-01-regression5.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+layout: single
+title: "Object-Oriented Programming for Linear Regression"
+excerpt: ""
+categories:
+- Intro-Data-Science
+tags:
+- regression
+toc: true
+toc_label: "Table of Contents"
+# toc_icon:
+# header:
+# image:
+# teaser:
+---

code/regression_classification.py

Lines changed: 8 additions & 3 deletions
@@ -39,7 +39,7 @@ def __init__(self):
 
     def predict(self, X: np.array) -> np.array:
        self.check_num_features(X)
-        X_b = np.c_[np.ones((X.shape[0], 1)), X]
+        X_b = self.add_bias(X)
         return X_b @ self.coefficients
 
     def fit(self, X: np.array, y: np.array):
@@ -168,8 +168,13 @@ def fit(self, X: np.array, y: np.array, learning_rate=1, epochs=1000, sgd=False)
         self.beta = np.zeros((self.num_features + 1, self.num_classes))
         self.loss_history = self.accuracy_history = []
         for _ in range(epochs):
-            probs = self.softmax(X_b @ self.beta)
-            grad = -X_b.T @ (y_enc - probs) / X_b.shape[0]
+            if sgd:
+                i = np.random.randint(X_b.shape[0])
+                probs = self.softmax(X_b[i:i+1] @ self.beta)
+                grad = -X_b[i:i+1].T @ (y_enc[i:i+1] - probs)
+            else:
+                probs = self.softmax(X_b @ self.beta)
+                grad = -X_b.T @ (y_enc - probs) / X_b.shape[0]
             self.beta -= learning_rate * grad
             self.loss_history.append(self.loss(probs, y_enc))
             self.accuracy_history.append(accuracy(y, self.predict(X)))

index.html

Lines changed: 2 additions & 2 deletions
@@ -5,8 +5,8 @@
 
 <p>Welcome to my site!</p>
 
-<p> I am a part-time student at Georgia Tech graduating May 3rd <i class="fa fa-graduation-cap" aria-hidden="true"></i>.
-I am currently seeking new opportunities to solve the hardest problems in data science <i class="fa fa-puzzle-piece" aria-hidden="true"></i>
+<p> I am a student at Georgia Tech graduating May 3rd <i class="fa fa-graduation-cap" aria-hidden="true"></i>.
+I am a Data Scientist at FIS seeking to solve the hardest machine learning problems <i class="fa fa-puzzle-piece" aria-hidden="true"></i>
 and continue to deploy use cases of large language models <i class="fa fa-language" aria-hidden="true"></i>. </p>
 
 <iframe src="assets/files/Charles_Bauer_Resume.pdf" width="100%" height="400px"></iframe>
