_posts/2025-02-22-newsvendor.md (1 addition, 0 deletions)
@@ -87,6 +87,7 @@ If the quantity ordered is also continuous, like 3.42 gallons, then there is a p
$$F(q^*)=\frac{c_s-c_p}{c_s+c_h}$$
To prove the critical ratio is the same, you would set the derivative of the cost function equal to zero; however, computing this derivative will require Leibniz's Rule.
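As a sketch of that derivation, assume (consistent with the ratio above) that $c_s$, $c_p$, and $c_h$ are the per-unit selling price, purchase cost, and holding cost, and that demand $D$ has pdf $f$ and cdf $F$. The expected cost of ordering $q$ units (purchase plus holding minus sales revenue) is

$$C(q) = c_p\,q + c_h\,E[(q-D)^+] - c_s\,E[\min(q,D)].$$

Leibniz's Rule handles the integrals whose limits depend on $q$, for example

$$\frac{d}{dq}\int_0^q (q-x)\,f(x)\,dx = (q-q)f(q) + \int_0^q f(x)\,dx = F(q),$$

which gives

$$C'(q) = c_p + c_h\,F(q) - c_s\,(1 - F(q)).$$

Setting $C'(q^*) = 0$ and solving for $F(q^*)$ recovers the critical ratio $F(q^*) = \frac{c_s - c_p}{c_s + c_h}$.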
For the Exponential and Normal distributions, the maximum likelihood estimates are the same as the method of moments estimates.
That's a good illustration of why the method of moments is a handy rule of thumb for simple distributions.
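For instance, in the Exponential case with mean $1/\lambda$, the method of moments matches the first moment, $\bar{x} = 1/\hat{\lambda}$, giving $\hat{\lambda} = 1/\bar{x}$. Maximum likelihood sets the derivative of the log-likelihood to zero:

$$\ell(\lambda) = n\log\lambda - \lambda\sum_{i=1}^n x_i, \qquad \ell'(\hat{\lambda}) = \frac{n}{\hat{\lambda}} - \sum_{i=1}^n x_i = 0 \implies \hat{\lambda} = \frac{1}{\bar{x}},$$

the same estimate.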
@@ -230,7 +228,7 @@ The Fisher Information quantifies the amount of information that an observable r
Let $X=(X_1,X_2,\ldots,X_n)$ be a vector of i.i.d. random variables, let $T(X)$ be a transformation of $X$, and let $x$ denote an outcome/value of $X$. Then each $X_i$ has pmf/pdf $f(x_i;\theta)$ with parameter $\theta$, and $X$ has joint pmf/pdf $f(x;\theta) = f(x_1,x_2,\ldots,x_n;\theta) = \prod_{i=1}^n f(x_i;\theta)$.
An estimate for $\theta$ calculated from a data sample is not necessarily a sufficient statistic. However, any efficient estimate for $\theta$ must be a function of a sufficient statistic $T(X)$. Definition: $T(X)$ is a sufficient statistic if the probability distribution of $X$ given a value of $T(X)$ is constant with respect to $\theta$. In math notation, $f(x \mid T(X)=T(x))$ is not a function of $\theta$. In other words, the statistic $T(X)$ provides as much information about $\theta$ as the entire data sample $X$.
The Fisher-Neyman Factorization Theorem provides a shortcut to prove that a statistic is sufficient. The theorem states that $T(X)$ is a sufficient statistic for $\theta$ if the joint pmf/pdf factors as $f(x;\theta) = g(T(x),\theta)\,h(x)$ for some $g$ that depends on $x$ only through $T(x)$ and some $h$ that may depend on $x$ and any parameters besides $\theta$. In other words, $g$ is an expression with $x$ appearing only inside $T(x)$, and $h$ is an expression without $\theta$.
For example, let $X_1, X_2, ..., X_n \sim \text{Poisson} (\lambda)$. The maximum likelihood estimate for $\lambda$ is the sample mean $\frac{1}{n} \sum_{i=1}^n x_i$.
This is a function of $T(x) = \sum_{i=1}^n x_i$, which is a sufficient statistic for $\lambda$. The sufficiency can be shown by either the definition or the factorization theorem.
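Via the factorization theorem, for instance, the joint pmf splits as

$$f(x;\lambda) = \prod_{i=1}^n \frac{\lambda^{x_i} e^{-\lambda}}{x_i!} = \underbrace{\lambda^{\sum_{i=1}^n x_i}\, e^{-n\lambda}}_{g(T(x),\,\lambda)} \cdot \underbrace{\left(\prod_{i=1}^n x_i!\right)^{-1}}_{h(x)},$$

where $g$ involves $x$ only through $T(x) = \sum_{i=1}^n x_i$ and $h$ is free of $\lambda$.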
_posts/2026-01-01-regression2.md (16 additions, 2 deletions)
@@ -1,6 +1,6 @@
---
layout: single
title: "Model Selection for Linear Regression"
excerpt: ""
categories:
- Intro-Data-Science
@@ -14,7 +14,21 @@ toc_label: "Table of Contents"
# teaser:
---
# Train-Test Split
The **train-test split** is a method used to evaluate the performance of a machine learning model. The dataset is divided into two subsets: the training set and the testing set. The model is trained on the training set and evaluated on the testing set.
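A minimal sketch with scikit-learn (the synthetic data and linear model are illustrative placeholders, not this post's running example):

```python
# Hold out 20% of the rows as a testing set; train on the rest.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)  # fit on training data only
print("test R^2:", model.score(X_test, y_test))   # evaluate on held-out data
```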
## Train-Test-Validate
The **train-test-validate** approach splits the dataset into three subsets: training, validation, and testing. The model is trained on the training set, tuned on the validation set, and evaluated on the testing set. The test set is reserved until after models are selected and compared, so it provides truly unseen data for checking whether the model selection process and the final model generalize.
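One way to sketch this is to call `train_test_split` twice; the 60/20/20 proportions below are an illustrative choice, not a rule:

```python
# Split 60/20/20 into training, validation, and testing sets.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# Tune and compare candidate models on (X_val, y_val);
# touch (X_test, y_test) only once, after the final model is chosen.
```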
## Cross-Validation

**Cross-validation** (also called $k$-fold CV) is a technique where the dataset is divided into $k$ subsets, or folds. The model is trained and evaluated $k$ times, each time using a different fold as the testing set and the remaining folds as the training set. The final performance is the average of the $k$ evaluations.
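A short sketch with scikit-learn, again on placeholder data ($k=5$ here is a common default; `cross_val_score` uses the regressor's default $R^2$ score):

```python
# 5-fold cross-validation: each fold serves once as the testing set.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

folds = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=folds)  # one score per fold
print("mean R^2 across folds:", scores.mean())
```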