where :math:`B` is the number of bins, :math:`p(x)` and :math:`q(x)` is the empirical density function of the base and target population, respectively. Note that the PSI calculation is related to the binning method, and we have two options for binning, i.e., "uniform" and "quantile". The number of bins is fixed at 10.
@@ -26,7 +26,7 @@ where :math:`B` is the number of bins, :math:`p(x)` and :math:`q(x)` is the empi
.. math::
\begin{align}
- WD_{1} = \sum^{B}_{i=1} |P(x) - Q(x)|,
+ WD_{1} = \int |P(x) - Q(x)| dx,
\end{align}
where and :math:`P(x)` and :math:`Q(x)` are the cumulative distribution of the target and base population.
@@ -35,11 +35,11 @@ where and :math:`P(x)` and :math:`Q(x)` are the cumulative distribution of the t
.. math::
\begin{align}
- KS = \max |P(x) - Q(x)|,
+ KS = \sup_x |P(x) - Q(x)|.
\end{align}
- In PiML, the WD1 and KS statistics are calculated by the `wasserstein_distance` and `ks_2samp` functions from `scipy.stats`, where we don't need to specify the binning method.
-
+ In PiML, the WD1 and KS statistics are calculated by the `wasserstein_distance` and `ks_2samp` functions from `scipy.stats`.
<p>where <span class="math notranslate nohighlight">\(B\)</span> is the number of bins, <span class="math notranslate nohighlight">\(p(x)\)</span> and <span class="math notranslate nohighlight">\(q(x)\)</span> is the empirical density function of the base and target population, respectively. Note that the PSI calculation is related to the binning method, and we have two options for binning, i.e., “uniform” and “quantile”. The number of bins is fixed at 10.</p>
<ul class="simple">
<li><p><strong>Wasserstein distance 1D (WD1)</strong>: WD1 calculates the absolute difference between the cumulative distribution functions of the two samples.</p></li>
</ul>
<div class="math notranslate nohighlight">
\[\begin{align}
- WD_{1} = \sum^{B}_{i=1} |P(x) - Q(x)|,
+ WD_{1} = \int |P(x) - Q(x)| dx,
\end{align}\]</div>
<p>where and <span class="math notranslate nohighlight">\(P(x)\)</span> and <span class="math notranslate nohighlight">\(Q(x)\)</span> are the cumulative distribution of the target and base population.</p>
<ul class="simple">
<li><p><strong>Kolmogorov-Smirnov (KS)</strong>: KS calculates the maximum absolute distance between the cumulative distribution functions of the two samples. In PiML, the WD1 statistics are calculated by the function from <code class="docutils literal notranslate"><span class="pre">scipy.stats</span></code>.</p></li>
</ul>
<div class="math notranslate nohighlight">
\[\begin{align}
- KS = \max |P(x) - Q(x)|,
+ KS = \sup_x |P(x) - Q(x)|.
\end{align}\]</div>
- <p>In PiML, the WD1 and KS statistics are calculated by the <code class="docutils literal notranslate"><span class="pre">wasserstein_distance</span></code> and <code class="docutils literal notranslate"><span class="pre">ks_2samp</span></code> functions from <code class="docutils literal notranslate"><span class="pre">scipy.stats</span></code>, where we don’t need to specify the binning method.</p>
+ <p>In PiML, the WD1 and KS statistics are calculated by the <code class="docutils literal notranslate"><span class="pre">wasserstein_distance</span></code> and <code class="docutils literal notranslate"><span class="pre">ks_2samp</span></code> functions from <code class="docutils literal notranslate"><span class="pre">scipy.stats</span></code>.</p>
</section>
<section id="usage">
<h2><span class="section-number">2.7.2. </span>Usage<a class="headerlink" href="#usage" title="Permalink to this heading">¶</a></h2>
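Review note: the move from a sum over bins to an integral (and from max to sup) can be sanity-checked numerically. The sketch below is not PiML code. It assumes two synthetic one-dimensional samples, and the helper names `ecdf` and `wd1_and_ks_from_cdfs` are hypothetical. It evaluates both formulas directly from the empirical CDFs, with no binning, and compares the results against `scipy.stats.wasserstein_distance` and `scipy.stats.ks_2samp`.

```python
import numpy as np
from scipy.stats import ks_2samp, wasserstein_distance


def ecdf(sample, grid):
    """Empirical CDF of `sample`, evaluated at each point of `grid`."""
    sample = np.sort(sample)
    return np.searchsorted(sample, grid, side="right") / sample.size


def wd1_and_ks_from_cdfs(base, target):
    """Return (WD1, KS) computed directly from the two empirical CDFs."""
    # Pool and sort all observations: both CDFs only change at these points.
    grid = np.sort(np.concatenate([base, target]))
    gap = np.abs(ecdf(target, grid) - ecdf(base, grid))
    # The CDFs are step functions, constant between consecutive pooled
    # points, so the integral of |P - Q| is an exact finite sum.
    wd1 = np.sum(gap[:-1] * np.diff(grid))
    # The supremum of |P - Q| is attained at one of the pooled points.
    ks = gap.max()
    return wd1, ks


# Synthetic data, used here only for illustration.
rng = np.random.default_rng(0)
base = rng.normal(0.0, 1.0, size=500)
target = rng.normal(0.5, 1.2, size=400)

wd1_manual, ks_manual = wd1_and_ks_from_cdfs(base, target)
wd1_scipy = wasserstein_distance(base, target)
ks_scipy = ks_2samp(base, target).statistic

print(f"WD1: manual={wd1_manual:.6f}, scipy={wd1_scipy:.6f}")
print(f"KS:  manual={ks_manual:.6f}, scipy={ks_scipy:.6f}")
```

Neither quantity involves a number of bins, which is consistent with the commit dropping the bin-indexed sum from the WD1 formula.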