Commit 6666822
'[pre-commit.ci 🤖] Apply code format tools to PR'
1 parent c583437 commit 6666822

2 files changed: +22 -22 lines changed

_data/authors.yml

Lines changed: 1 addition & 1 deletion
@@ -120,4 +120,4 @@ Raktim Mukhopadhyay:
 url: "https://github.com/rmj3197"
 - label: "LinkedIn"
 icon: "fab fa-fw fa-linkedin"
-url: "https://www.linkedin.com/in/raktimm/"
+url: "https://www.linkedin.com/in/raktimm/"

_posts/2025-01-17-quadratik.md

Lines changed: 21 additions & 21 deletions
@@ -34,9 +34,9 @@ last_modified: 2025-01-17

 ## Goodness-of-Fit (GoF) Tests

-Goodness-of-Fit (GoF) tests are classical tools for assessing the compatibility of data with a given probability model. GoF tests typically compute a distance-like metric between the null distribution and observations, rejecting the null hypothesis if the distance exceeds a critical value.
+Goodness-of-Fit (GoF) tests are classical tools for assessing the compatibility of data with a given probability model. GoF tests typically compute a distance-like metric between the null distribution and observations, rejecting the null hypothesis if the distance exceeds a critical value.

-The methods for normality, two-sample, and k-sample test use a bandwidth parameter `h`. We have also provided an algorithm for determining the optimal value of `h` based on the mid-power analysis (please see Markatou and Saraceno (2024)). You can find more details on algorithm in our [manual](https://quadratik.readthedocs.io/en/latest/user_guide/hselect.html).
+The methods for normality, two-sample, and k-sample test use a bandwidth parameter `h`. We have also provided an algorithm for determining the optimal value of `h` based on the mid-power analysis (please see Markatou and Saraceno (2024)). You can find more details on algorithm in our [manual](https://quadratik.readthedocs.io/en/latest/user_guide/hselect.html).

 In this section, the various GoF tests are shown with corresponding examples.

@@ -64,20 +64,20 @@ normality_test = KernelTest(
 print(normality_test.summary(print_fmt="grid"))
 ```

-The results of this test is shown below.
+The results of this test is shown below.
 <figure style="float: center;">
 <picture>
 <source srcset="/images/quadratik/normality-test-results.webp" type="image/webp">
 <img src="/images/quadratik/normality-test-results.jpg" alt="Results for the Normality Test." />
 </picture>
 </figure>

-The test rightly fails to reject the null hypothesis, as the samples have been generated from a standard normal distribution.
+The test rightly fails to reject the null hypothesis, as the samples have been generated from a standard normal distribution.

 ### Two-Sample Test
 The two-sample GoF test is used to determine whether two separate samples are likely drawn from the same population distribution.

-To illustrate the two sample test, we generate n = 200 random samples from a multivariate standard normal distribution and a skewed normal distribution with value of skewness parameter lambda = 0.1.
+To illustrate the two sample test, we generate n = 200 random samples from a multivariate standard normal distribution and a skewed normal distribution with value of skewness parameter lambda = 0.1.

 ```python
 import numpy as np
@@ -103,21 +103,21 @@ two_sample_test = KernelTest(h=2, num_iter=150, random_state=42).test(X_2, Y_2)
 print(two_sample_test.summary(print_fmt = "grid"))
 ```

-The results of the test is shown below.
+The results of the test is shown below.
 <figure style="float: center;">
 <picture>
 <source srcset="/images/quadratik/two-sample-test-results.webp" type="image/webp">
 <img src="/images/quadratik/two-sample-test-results.png" alt="Results for the Two Sample Test." />
 </picture>
 </figure>

-The test rejects the null hypothesis, as the samples have been generated from two different distributions.
+The test rejects the null hypothesis, as the samples have been generated from two different distributions.

 ### K-Sample Test

 Similar to the two-sample test, the k-sample test examines whether k groups of samples are obtained from the same distribution.

-For illustrating the k-sample test, we use the glass identification dataset from the [UCI ML repository](https://archive.ics.uci.edu/dataset/42/glass+identification). We use the first three classes of glass types to illustrate the working of the k-sample test.
+For illustrating the k-sample test, we use the glass identification dataset from the [UCI ML repository](https://archive.ics.uci.edu/dataset/42/glass+identification). We use the first three classes of glass types to illustrate the working of the k-sample test.

 ```python
 # Importing required libraries
@@ -142,7 +142,7 @@ k_sample_test = KernelTest(h=2, num_iter=150, random_state=42).test(X, y)
 # Printing the test summary
 print(k_sample_test.summary(print_fmt="grid"))
 ```
-The results of the test is shown below.
+The results of the test is shown below.
 <figure style="float: center;">
 <picture>
 <source srcset="/images/quadratik/k-sample-test-results.webp" type="image/webp">
@@ -154,9 +154,9 @@ The null hypothesis is rejected for the k-sample test indicates that there is **

 ### Uniformity Test on the Sphere

-In this we test the null hypothesis of uniformity on the sphere. We illustrate this test using an example.
+In this we test the null hypothesis of uniformity on the sphere. We illustrate this test using an example.

-The data for this example is generated from a multivariate standard normal distribution, and is further divided by the L2 norm of generated vectors. This processed data is uniformly distributed on the surface of the unit sphere.
+The data for this example is generated from a multivariate standard normal distribution, and is further divided by the L2 norm of generated vectors. This processed data is uniformly distributed on the surface of the unit sphere.

 ```python
 import numpy as np
@@ -174,7 +174,7 @@ unif_test = PoissonKernelTest(rho=0.5, random_state=42).test(data_unif)
 # printing the summary for uniformity test
 print(unif_test.summary(print_fmt = "grid"))
 ```
-The results of the test is shown below.
+The results of the test is shown below.
 <figure style="float: center;">
 <picture>
 <source srcset="/images/quadratik/uniformity-test-results.webp" type="image/webp">
@@ -222,7 +222,7 @@ segmented_images = []
 plt.figure(figsize=(16, 8))

 num_k_values = len(k_values)
-num_cols = 6
+num_cols = 6
 num_rows = (num_k_values + num_cols - 1) // num_cols

 for i, k in enumerate(k_values, start=1):
@@ -249,7 +249,7 @@ plt.tight_layout()
 plt.show()
 ```

-The image is segmented into k clusters with k ranging from 2 to 8. Below, we display the regions identified for each value of k.
+The image is segmented into k clusters with k ranging from 2 to 8. Below, we display the regions identified for each value of k.

 <figure style="float: center;">
 <picture>
@@ -258,7 +258,7 @@ The image is segmented into k clusters with k ranging from 2 to 8. Below, we dis
 </picture>
 </figure>

-Starting from k = 5, the segmented images reveal only minor changes in the identified segments upon closer examination. Let us see if we can validate our observation using the elbow plots.
+Starting from k = 5, the segmented images reveal only minor changes in the identified segments upon closer examination. Let us see if we can validate our observation using the elbow plots.

 ```python
 validation_metrics, elbow_plots = pkbc.validation()
@@ -271,7 +271,7 @@ elbow_plots
 </picture>
 </figure>

-The elbow plots show a clear elbow at k = 5, which aligns with our observation that all regions of the image are effectively identified at this value of k.
+The elbow plots show a clear elbow at k = 5, which aligns with our observation that all regions of the image are effectively identified at this value of k.

 The clustering algorithm proposed in Golzy and Markatou has been used in other works such as Golzy et al. (2023), Strelnikoff at al. (2020), and Strelnikoff et al. (2024).

@@ -293,7 +293,7 @@ samples_rejacg = pkbd.rpkb(
 n=500, mu=[1, 1, 1], rho=0.9, method="rejacg", random_state=42)
 ```

-The generated samples can also be visualized on the unit sphere.
+The generated samples can also be visualized on the unit sphere.

 ```python
 import matplotlib.pyplot as plt
@@ -344,7 +344,7 @@ plt.tight_layout()

 <br>

-More details on Poisson Kernel-Based Distributions can be found in the package documentation [here](https://quadratik.readthedocs.io/en/latest/user_guide/pkbd.html).
+More details on Poisson Kernel-Based Distributions can be found in the package documentation [here](https://quadratik.readthedocs.io/en/latest/user_guide/pkbd.html).

 ## Dashboard

@@ -366,7 +366,7 @@ UI().run()

 `QuadratiK` provides methods to researchers and practitioners to delve deeper into their data, draw robust inference, and conduct potentially impactful analyses and inference across a wide array of disciplines. The `QuadratiK` package is also available in `R` and is hosted on [CRAN](https://cran.r-project.org/web/packages/QuadratiK/index.html). You can learn more about `QuadratiK` in our [arXiv preprint](https://arxiv.org/abs/2402.02290). Additional theoretical papers of interest are listed in the reference section.

-Please feel free to reach me at raktimmu at buffalo.edu.
+Please feel free to reach me at raktimmu at buffalo.edu.

 Thank you! Happy coding to you — may your bugs be few, and your data ever insightful! 🚀😊

@@ -383,7 +383,7 @@ Thank you! Happy coding to you — may your bugs be few, and your data ever insi
 - Markatou, M., & Saraceno, G. (2024). A unified framework for multivariate two-sample and k-sample kernel-based quadratic distance goodness-of-fit tests. DOI: 10.48550/arXiv.2407.16374v1

 - Golzy, M., Rosen, G. H., Kruse, R. L., Hooshmand, K., Mehr, D. R., & Murray, K. S. (2023). Holistic assessment of quality of life predicts survival in older patients with bladder cancer. Urology, 174, 141-149.
-
+
 - Strelnikoff, S., Jammalamadaka, A., & Warmsley, D. (2020, December). Causal maps for multi-document summarization. In 2020 IEEE International Conference on Big Data (Big Data) (pp. 4437-4445). IEEE.

-- Strelnikoff, S., Jammalamadaka, A., & Warmsley, D. M. (2024). U.S. Patent No. 11,907,307. Washington, DC: U.S. Patent and Trademark Office.
+- Strelnikoff, S., Jammalamadaka, A., & Warmsley, D. M. (2024). U.S. Patent No. 11,907,307. Washington, DC: U.S. Patent and Trademark Office.
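Every removed/added pair in this commit reads identically as displayed, which is consistent with a whitespace-only cleanup such as stripping trailing spaces. That reading is an assumption: the commit message only says "Apply code format tools", and the page does not name the hook that ran. Under that assumption, and assuming LF line endings, the check can be sketched in Python with two hypothetical helpers (neither is part of pre-commit or `QuadratiK`):

```python
def strip_trailing_whitespace(text: str) -> str:
    """Remove trailing spaces and tabs from every line, keeping LF line breaks."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def differs_only_in_trailing_whitespace(removed: str, added: str) -> bool:
    """Return True when two lines differ, but only in their trailing whitespace."""
    return removed != added and removed.rstrip() == added.rstrip()


if __name__ == "__main__":
    # Hypothetical "before" line: the trailing spaces are assumed, not shown in the diff.
    old_line = "num_cols = 6   \n"
    new_line = strip_trailing_whitespace(old_line)
    print(repr(new_line))
    print(differs_only_in_trailing_whitespace(old_line, new_line))
```

Running a check like this over each `-`/`+` pair is one way to confirm that a formatting commit changed no content before approving it.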
