<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>

<meta charset="utf-8">
<meta name="generator" content="quarto-1.6.39">

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">

<meta name="author" content="Sam McDowell">
<meta name="dcterms.date" content="2024-12-09">

<title>IPO Prediction - Honors Petition</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
ul.task-list li input[type="checkbox"] {
  width: 0.8em;
  margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */
  vertical-align: middle;
}
/* CSS for syntax highlighting */
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
  { counter-reset: source-line 0; }
pre.numberSource code > span
  { position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
  { content: counter(source-line);
    position: relative; left: -1em; text-align: right; vertical-align: baseline;
    border: none; display: inline-block;
    -webkit-touch-callout: none; -webkit-user-select: none;
    -khtml-user-select: none; -moz-user-select: none;
    -ms-user-select: none; user-select: none;
    padding: 0 4px; width: 4em;
  }
pre.numberSource { margin-left: 3em;  padding-left: 4px; }
div.sourceCode
  {   }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
</style>


<script src="index_files/libs/clipboard/clipboard.min.js"></script>
<script src="index_files/libs/quarto-html/quarto.js"></script>
<script src="index_files/libs/quarto-html/popper.min.js"></script>
<script src="index_files/libs/quarto-html/tippy.umd.min.js"></script>
<script src="index_files/libs/quarto-html/anchor.min.js"></script>
<link href="index_files/libs/quarto-html/tippy.css" rel="stylesheet">
<link href="index_files/libs/quarto-html/quarto-syntax-highlighting-e26003cea8cd680ca0c55a263523d882.css" rel="stylesheet" id="quarto-text-highlighting-styles">
<script src="index_files/libs/bootstrap/bootstrap.min.js"></script>
<link href="index_files/libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
<link href="index_files/libs/bootstrap/bootstrap-8a79a254b8e706d3c925cde0a310d4f0.min.css" rel="stylesheet" append-hash="true" id="quarto-bootstrap" data-mode="light">

  <script src="https://cdnjs.cloudflare.com/polyfill/v3/polyfill.min.js?features=es6"></script>
  <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml-full.js" type="text/javascript"></script>

<script type="text/javascript">
const typesetMath = (el) => {
  if (window.MathJax) {
    // MathJax Typeset
    window.MathJax.typeset([el]);
  } else if (window.katex) {
    // KaTeX Render
    var mathElements = el.getElementsByClassName("math");
    var macros = [];
    for (var i = 0; i < mathElements.length; i++) {
      var texText = mathElements[i].firstChild;
      if (mathElements[i].tagName == "SPAN") {
        window.katex.render(texText.data, mathElements[i], {
          displayMode: mathElements[i].classList.contains('display'),
          throwOnError: false,
          macros: macros,
          fleqn: false
        });
      }
    }
  }
}
window.Quarto = {
  typesetMath
};
</script>

</head>

<body>

<div id="quarto-content" class="page-columns page-rows-contents page-layout-article">
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
  <nav id="TOC" role="doc-toc" class="toc-active">
    <h2 id="toc-title">Table of contents</h2>

  <ul>
  <li><a href="#ipo-prediction" id="toc-ipo-prediction" class="nav-link active" data-scroll-target="#ipo-prediction">IPO Prediction</a>
  <ul class="collapse">
  <li><a href="#abstract" id="toc-abstract" class="nav-link" data-scroll-target="#abstract">Abstract</a></li>
  <li><a href="#introduction" id="toc-introduction" class="nav-link" data-scroll-target="#introduction">Introduction</a></li>
  <li><a href="#methods" id="toc-methods" class="nav-link" data-scroll-target="#methods">Methods</a></li>
  <li><a href="#results" id="toc-results" class="nav-link" data-scroll-target="#results">Results</a></li>
  <li><a href="#discussion" id="toc-discussion" class="nav-link" data-scroll-target="#discussion">Discussion</a></li>
  <li><a href="#opportunities-for-future-research" id="toc-opportunities-for-future-research" class="nav-link" data-scroll-target="#opportunities-for-future-research">Opportunities for Future Research</a></li>
  <li><a href="#conclusion" id="toc-conclusion" class="nav-link" data-scroll-target="#conclusion">Conclusion</a></li>
  <li><a href="#references" id="toc-references" class="nav-link" data-scroll-target="#references">References</a></li>
  </ul></li>
  </ul>
</nav>
</div>
<main class="content" id="quarto-document-content">

<header id="title-block-header" class="quarto-title-block default">
<div class="quarto-title">
<h1 class="title">IPO Prediction - Honors Petition</h1>
</div>


<div class="quarto-title-meta">

    <div>
    <div class="quarto-title-meta-heading">Author</div>
    <div class="quarto-title-meta-contents">
             <p>Sam McDowell </p>
          </div>
  </div>

    <div>
    <div class="quarto-title-meta-heading">Published</div>
    <div class="quarto-title-meta-contents">
      <p class="date">December 9, 2024</p>
    </div>
  </div>


  </div>


</header>


<section id="ipo-prediction" class="level1">
<h1>IPO Prediction</h1>
<p>Honors Petition Project</p>
<p>Liberty University - School of Business</p>
<section id="abstract" class="level3">
<h3 class="anchored" data-anchor-id="abstract">Abstract</h3>
<p>This study investigates the prediction of the stock price of an Initial Public Offering (IPO) shortly after its listing by grouping similar IPOs and applying time series forecasting techniques. The research uses data compiled from IPO Scoop and Yahoo Finance. IPOs are grouped by K-Means clustering on financial information and, separately, by industry and by month of IPO. A Vector Autoregression (VAR) model is then used to forecast stock performance within each group. Regardless of the grouping technique used, the results showed that financial data alone did not group IPOs with similar stock market reactions. Additionally, issues with stationarity, autocorrelation and causality in the stock data limited the predictive performance of the VAR model. Future research opportunities include expanding the amount of data observed, exploring advanced stationarizing methods, and investigating alternative machine learning models for improved IPO prediction accuracy.</p>
</section>
<section id="introduction" class="level3">
<h3 class="anchored" data-anchor-id="introduction">Introduction</h3>
<p>The stock market is a major avenue for those wishing to make money. As much as $300 billion changes hands on the stock market every day (nasdaqtrader.com). With the rise of machine learning and artificial intelligence, much research has been done to predict these markets. Someone in possession of an accurate prediction method could make large sums of money by knowing when to buy and sell. There are approximately 8,000 stocks eligible for trading in the US stock market (NYSE). Before a stock can be publicly traded, the company must go through a process called an Initial Public Offering, or IPO (Fernando, 2024). The period when a company first releases shares to the public is one of extreme volatility and therefore one in which large profits can be made. This study attempts to predict the stock performance of an IPO by grouping it with similar IPOs and performing a forecast.</p>
<p>This study collects datasets of IPO financials and stock data. The data is then clustered and grouped for analysis. A multivariate autoregressive model is used to make predictions on the stocks in a group based on the time series of their prices. The results and their implications are then discussed.</p>
</section>
<section id="methods" class="level3">
<h3 class="anchored" data-anchor-id="methods">Methods</h3>
<p>To begin, data was collected for the grouping and forecasting. The first dataset is from IPO Scoop. It includes the company name, the company’s symbol, its industry, the date the stock was first offered publicly, the number of shares offered, the offer price, the market cap of the company, its revenue and its net income. The numeric fields were standardized using Scikit-Learn’s StandardScaler, which rescales each feature to have a mean of 0 and a standard deviation of 1 (scikit-learn.org, 2024, StandardScaler).</p>
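<p>The standardization step can be illustrated with a minimal sketch. It uses only the Python standard library rather than Scikit-Learn, and the <code>standardize</code> helper and the offer prices are hypothetical, chosen only to show the arithmetic. Like StandardScaler, it divides by the population standard deviation.</p>

```python
from statistics import fmean, pstdev


def standardize(values):
    """Rescale values to mean 0 and population standard deviation 1."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # A constant column has no spread; map every entry to zero.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical offer prices for four IPOs (illustrative numbers only).
offer_prices = [10.0, 12.0, 14.0, 16.0]
scaled = standardize(offer_prices)
print(scaled)
```

<p>Standardizing matters here because K-Means relies on Euclidean distance, so a feature measured in millions of shares would otherwise outweigh a feature measured in dollars per share.</p>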
<p>A second dataset was collected through Yahoo Finance. It consisted of the first 60 daily candles after each IPO: the open, high, low and close prices, as well as the volume of shares traded each day.</p>
<p>Once those datasets were collected, similar stocks were grouped. The primary grouping method in this study was K-Means clustering on the financial information dataset, which split the data into two clusters. The IPOs were also grouped by industry, with all stocks from the same industry placed in one group, and by month of IPO, with all stocks that launched in the same month placed in one group.</p>
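<p>The two non-clustering groupings amount to bucketing IPOs by a shared key. The sketch below shows the idea with made-up records. The <code>group_by</code> helper is hypothetical, and treating "same month" as the same calendar month of the same year is an assumption, since the text does not say whether years are pooled.</p>

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (symbol, industry, IPO date). Not the study's data.
ipos = [
    ("AAA", "Technology", date(2023, 3, 14)),
    ("BBB", "Health Care", date(2023, 3, 28)),
    ("CCC", "Technology", date(2023, 7, 6)),
]


def group_by(records, key):
    """Collect symbols into lists keyed by key(record)."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record[0])
    return dict(groups)


by_industry = group_by(ipos, key=lambda r: r[1])
# Assumption: "same month" means same year and same calendar month.
by_month = group_by(ipos, key=lambda r: (r[2].year, r[2].month))

print(by_industry)
print(by_month)
```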
<p>Finally, time series analysis was performed to make a forecast from the stock data of the stocks in a given group. This was done using Vector Autoregression (or VAR). The first 55 days of stock performance for a group of stocks were used to fit a VAR model. Day 60 was the target for prediction. This was tested on the clusters created by K-Means Clustering, the groups created by industry and the groups created by month of IPO.</p>
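<p>The fit-on-55-days, predict-day-60 procedure can be sketched as follows. This is not the study's code: it fits a first-order VAR by ordinary least squares with NumPy, whereas the lag order and library used in the study are not stated here. The functions <code>fit_var1</code> and <code>forecast</code> are hypothetical names, and the price series is synthetic.</p>

```python
import numpy as np


def fit_var1(series):
    """Least-squares fit of y_t = c + A @ y_(t-1) + e_t.

    series has shape (T, k): T days, k stocks in the group.
    """
    lagged = series[:-1]
    target = series[1:]
    # A leading column of ones lets the intercept c be estimated with A.
    design = np.hstack([np.ones((len(lagged), 1)), lagged])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    intercept = coef[0]        # shape (k,)
    lag_matrix = coef[1:].T    # shape (k, k)
    return intercept, lag_matrix


def forecast(intercept, lag_matrix, last_obs, steps):
    """Iterate the fitted recursion forward by the given number of steps."""
    y = np.asarray(last_obs, dtype=float)
    for _ in range(steps):
        y = intercept + lag_matrix @ y
    return y


# Synthetic stand-in for 60 days of log closing prices of 3 grouped stocks.
rng = np.random.default_rng(0)
log_prices = np.log(20.0) + np.cumsum(rng.normal(0, 0.02, size=(60, 3)), axis=0)

train = log_prices[:55]                       # days 1 to 55
c, A = fit_var1(train)
day60_forecast = forecast(c, A, train[-1], steps=5)   # 5 steps ahead: day 60
day60_error = day60_forecast - log_prices[59]
print(day60_forecast, day60_error)
```

<p>Because every stock's next value depends on the previous values of all stocks in the group, the forecast is only as good as the assumption that grouped IPOs actually move together, which is the assumption the grouping step is meant to test.</p>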
</section>
<section id="results" class="level3">
<h3 class="anchored" data-anchor-id="results">Results</h3>
<p>When the IPOs were plotted on their first two principal components, many outliers were visible. The first principal component was dominated by the number of shares (a loading of about 0.998) and explained roughly 98.4% of the variance. The second was spread roughly evenly across the remaining data points.</p>
<div id="4978c1df" class="cell" data-execution_count="1">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> analysis.clustering_util <span class="im">import</span> <span class="op">*</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a>x, _ <span class="op">=</span> getData()</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a>x, pca <span class="op">=</span> addPCA(x)</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(<span class="bu">len</span>(pca.components_)):</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(<span class="ss">f"Component </span><span class="sc">{</span>i<span class="op">+</span><span class="dv">1</span><span class="sc">}</span><span class="ss">"</span>, pca.components_[i])</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(<span class="st">"</span><span class="ch">\t</span><span class="st">Explained Variance:"</span>, pca.explained_variance_ratio_[i])</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a><span class="co"># plot the ipos</span></span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>plt.scatter(x[<span class="st">'pc1'</span>], x[<span class="st">'pc2'</span>])</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>plt.title(<span class="st">'Visualize IPOs'</span>)</span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a>plt.xlabel(<span class="st">'PC 1'</span>)</span>
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a>plt.ylabel(<span class="st">'PC 2'</span>)</span>
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a>plt.show()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-stdout">
<pre><code>Component 1 [0.99783894 0.03052901 0.04543049 0.02070526 0.02987977]
    Explained Variance: 0.9841593409460709
Component 2 [-0.06407769  0.53952155  0.52025562  0.31929104  0.57636609]
    Explained Variance: 0.008320573663589195</code></pre>
</div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="index_files/figure-html/cell-2-output-2.png" width="594" height="449" class="figure-img"></p>
</figure>
</div>
</div>
</div>
<p>These outliers were removed by dropping IPOs that were far from the mean of the principal components, which produced a much tighter grouping.</p>
<div id="a3ee0a50" class="cell" data-execution_count="2">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> analysis.clustering_util <span class="im">import</span> <span class="op">*</span></span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a>x, _ <span class="op">=</span> getData()</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a>x, _ <span class="op">=</span> addPCA(x)</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a>x <span class="op">=</span> removeOutliers(x)</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a><span class="co"># plot the ipos</span></span>
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a>plt.scatter(x[<span class="st">'pc1'</span>], x[<span class="st">'pc2'</span>])</span>
<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a>plt.title(<span class="st">'Visualize IPOs w/o Outliers'</span>)</span>
<span id="cb3-10"><a href="#cb3-10" aria-hidden="true" tabindex="-1"></a>plt.xlabel(<span class="st">'PC 1'</span>)</span>
<span id="cb3-11"><a href="#cb3-11" aria-hidden="true" tabindex="-1"></a>plt.ylabel(<span class="st">'PC 2'</span>)</span>
<span id="cb3-12"><a href="#cb3-12" aria-hidden="true" tabindex="-1"></a>plt.show()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="index_files/figure-html/cell-3-output-1.png" width="602" height="449" class="figure-img"></p>
</figure>
</div>
</div>
</div>
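<p>The body of <code>removeOutliers</code> is not shown in this report, so the following is only one plausible reading of "far from the mean of the principal components". It measures each IPO's distance from the centroid in the (PC 1, PC 2) plane and keeps points within a cutoff. The function name <code>remove_outliers</code>, the two-standard-deviation cutoff and the sample points are all assumptions.</p>

```python
from math import hypot
from statistics import fmean, pstdev


def remove_outliers(points, n_std=2.0):
    """Keep (pc1, pc2) points whose distance from the centroid is at most
    the mean distance plus n_std standard deviations of the distances."""
    cx = fmean([p[0] for p in points])
    cy = fmean([p[1] for p in points])
    dists = [hypot(p[0] - cx, p[1] - cy) for p in points]
    cutoff = fmean(dists) + n_std * pstdev(dists)
    return [p for p, d in zip(points, dists) if d <= cutoff]


# Hypothetical projection: a tight 3x3 grid of IPOs plus one extreme point.
points = [(float(i), float(j)) for i in range(3) for j in range(3)]
points.append((40.0, 40.0))

kept = remove_outliers(points)
print(len(points), len(kept))
```

<p>A cutoff based on the mean and standard deviation is itself pulled toward extreme points, so in practice a median-based rule may behave better. That choice is left open here.</p>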
<p>Once the outliers were removed, it was necessary to determine how many clusters the K-Means algorithm should create, because the algorithm requires the number of groups to be set before clustering (scikit-learn, 2019, K-Means). An elbow plot indicated that the ideal number of clusters was 2. The K-Means algorithm was then run to create the two clusters.</p>
<div id="f0e94887" class="cell" data-execution_count="3">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> analysis.clustering_util <span class="im">import</span> <span class="op">*</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a>x, _ <span class="op">=</span> getData()</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a>x, _ <span class="op">=</span> addPCA(x)</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a>x <span class="op">=</span> removeOutliers(x)</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a><span class="co"># Calculate WCSS for different number of clusters</span></span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>wcss <span class="op">=</span> []</span>
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(<span class="dv">1</span>, <span class="dv">16</span>):</span>
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a>    kmeans <span class="op">=</span> KMeans(n_clusters<span class="op">=</span>i, random_state<span class="op">=</span><span class="dv">42</span>)</span>
<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a>    kmeans.fit(x)</span>
<span id="cb4-12"><a href="#cb4-12" aria-hidden="true" tabindex="-1"></a>    wcss.append(kmeans.inertia_)</span>
<span id="cb4-13"><a href="#cb4-13" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb4-14"><a href="#cb4-14" aria-hidden="true" tabindex="-1"></a><span class="co"># show the elbow plot</span></span>
<span id="cb4-15"><a href="#cb4-15" aria-hidden="true" tabindex="-1"></a>plt.plot(<span class="bu">range</span>(<span class="dv">1</span>, <span class="dv">16</span>), wcss, marker<span class="op">=</span><span class="st">'o'</span>)</span>
<span id="cb4-16"><a href="#cb4-16" aria-hidden="true" tabindex="-1"></a>plt.title(<span class="st">'Elbow Method'</span>)</span>
<span id="cb4-17"><a href="#cb4-17" aria-hidden="true" tabindex="-1"></a>plt.xlabel(<span class="st">'Number of Clusters'</span>)</span>
<span id="cb4-18"><a href="#cb4-18" aria-hidden="true" tabindex="-1"></a>plt.ylabel(<span class="st">'Within-Cluster Sum of Squares (WCSS)'</span>)</span>
<span id="cb4-19"><a href="#cb4-19" aria-hidden="true" tabindex="-1"></a>plt.show()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="index_files/figure-html/cell-4-output-1.png" width="603" height="449" class="figure-img"></p>
</figure>
</div>
</div>
</div>
<div id="c2098e98" class="cell" data-execution_count="4">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> analysis.clustering_util <span class="im">import</span> <span class="op">*</span></span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a>x, s <span class="op">=</span> getData(includeClusters<span class="op">=</span><span class="va">False</span>)</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a>x, pca <span class="op">=</span> addPCA(x)</span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a>x[<span class="st">"symbol"</span>] <span class="op">=</span> s</span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a>x <span class="op">=</span> removeOutliers(x)</span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a>y <span class="op">=</span> x[[<span class="st">"pc1"</span>, <span class="st">"pc2"</span>, <span class="st">"symbol"</span>]]</span>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a>x.drop(columns<span class="op">=</span>[<span class="st">"pc1"</span>, <span class="st">"pc2"</span>, <span class="st">"symbol"</span>], inplace<span class="op">=</span><span class="va">True</span>)</span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a><span class="co"># Create KMeans instance with 2 clusters</span></span>
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a>kmeans <span class="op">=</span> KMeans(n_clusters<span class="op">=</span><span class="dv">2</span>)</span>
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a>kmeans.fit(x)</span>
<span id="cb5-13"><a href="#cb5-13" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-14"><a href="#cb5-14" aria-hidden="true" tabindex="-1"></a><span class="co"># Project the centroids into PCA space and store the cluster labels</span></span>
<span id="cb5-15"><a href="#cb5-15" aria-hidden="true" tabindex="-1"></a>centroids <span class="op">=</span> pca.transform(kmeans.cluster_centers_) <span class="co"># scale to same as components</span></span>
<span id="cb5-16"><a href="#cb5-16" aria-hidden="true" tabindex="-1"></a>x[<span class="st">'Cluster'</span>] <span class="op">=</span> kmeans.labels_</span>
<span id="cb5-17"><a href="#cb5-17" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> c <span class="kw">in</span> y.columns:</span>
<span id="cb5-18"><a href="#cb5-18" aria-hidden="true" tabindex="-1"></a>    x[c] <span class="op">=</span> y[c]</span>
<span id="cb5-19"><a href="#cb5-19" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-20"><a href="#cb5-20" aria-hidden="true" tabindex="-1"></a>old, s <span class="op">=</span> getData(includeClusters<span class="op">=</span><span class="va">True</span>)</span>
<span id="cb5-21"><a href="#cb5-21" aria-hidden="true" tabindex="-1"></a>old[<span class="st">"symbol"</span>] <span class="op">=</span> s</span>
<span id="cb5-22"><a href="#cb5-22" aria-hidden="true" tabindex="-1"></a>old <span class="op">=</span> old[old[<span class="st">"symbol"</span>].isin(x[<span class="st">'symbol'</span>])]</span>
<span id="cb5-23"><a href="#cb5-23" aria-hidden="true" tabindex="-1"></a>x[<span class="st">"Industry_Cluster"</span>] <span class="op">=</span> old[<span class="st">"Industry_Cluster"</span>]</span>
<span id="cb5-24"><a href="#cb5-24" aria-hidden="true" tabindex="-1"></a>x[<span class="st">"Month_Cluster"</span>] <span class="op">=</span> old[<span class="st">"Month_Cluster"</span>]</span>
<span id="cb5-25"><a href="#cb5-25" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-26"><a href="#cb5-26" aria-hidden="true" tabindex="-1"></a><span class="co"># Plot the IPOs colored by cluster and mark the centroids</span></span>
<span id="cb5-27"><a href="#cb5-27" aria-hidden="true" tabindex="-1"></a>plt.scatter(x[<span class="st">'pc1'</span>], x[<span class="st">'pc2'</span>], c<span class="op">=</span>x[<span class="st">'Cluster'</span>], cmap<span class="op">=</span><span class="st">'viridis'</span>)</span>
<span id="cb5-28"><a href="#cb5-28" aria-hidden="true" tabindex="-1"></a>plt.scatter(centroids[:, <span class="dv">0</span>], centroids[:, <span class="dv">1</span>], s<span class="op">=</span><span class="dv">100</span>, c<span class="op">=</span><span class="st">'red'</span>, marker<span class="op">=</span><span class="st">'X'</span>)</span>
<span id="cb5-29"><a href="#cb5-29" aria-hidden="true" tabindex="-1"></a>plt.title(<span class="st">'K-means Clustering'</span>)</span>
<span id="cb5-30"><a href="#cb5-30" aria-hidden="true" tabindex="-1"></a>plt.xlabel(<span class="st">'PC 1'</span>)</span>
<span id="cb5-31"><a href="#cb5-31" aria-hidden="true" tabindex="-1"></a>plt.ylabel(<span class="st">'PC 2'</span>)</span>
<span id="cb5-32"><a href="#cb5-32" aria-hidden="true" tabindex="-1"></a>plt.show()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="index_files/figure-html/cell-5-output-1.png" width="602" height="449" class="figure-img"></p>
</figure>
</div>
</div>
</div>
<p>The groups from the clustered data were log transformed and tested for stationarity using the Augmented Dickey-Fuller test and the KPSS test. The data were also tested for autocorrelation using the Durbin-Watson test and for causality with the Granger Causality test.</p>
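<p>Of these tests, the Durbin-Watson statistic is simple enough to compute by hand: <span class="math inline">\(d = \sum_{t=2}^{T}(e_t - e_{t-1})^2 / \sum_{t=1}^{T} e_t^2\)</span>. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The sketch below is a plain-Python illustration on made-up sequences. It is not the <code>durbinWatson_test</code> helper used in this study, whose implementation is not shown, and the statistic is normally applied to regression residuals.</p>

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Made-up sequences chosen to sit at the two extremes of the 0 to 4 range.
alternating = [1.0, -1.0] * 10                 # sign flips every step
trending = [float(t) for t in range(1, 21)]    # steadily increasing

dw_alternating = durbin_watson(alternating)
dw_trending = durbin_watson(trending)
print(dw_alternating, dw_trending)
```

<p>A steadily trending series such as a raw or log price path tends to score close to 0, which is consistent with the autocorrelation issues reported for the stock data.</p>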
<details>
<summary>
Show Results of Stationarity and Correlation Tests
</summary>
<div id="c3023057" class="cell" data-execution_count="5">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> prediction.prediction_util <span class="im">import</span> <span class="op">*</span></span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a>stocks <span class="op">=</span> pd.read_csv(<span class="st">"data-collection/stocks.csv"</span>, header<span class="op">=</span>[<span class="dv">0</span>,<span class="dv">1</span>], index_col<span class="op">=</span>[<span class="dv">0</span>])</span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a>ipos <span class="op">=</span> pd.read_csv(<span class="st">"analysis/clustered.csv"</span>)</span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a>group1 <span class="op">=</span> ipos[ipos[<span class="st">"Cluster"</span>] <span class="op">==</span> <span class="dv">1</span>]</span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a>close <span class="op">=</span> stocks.xs(<span class="st">'Close'</span>, axis<span class="op">=</span><span class="dv">1</span>, level<span class="op">=</span><span class="dv">1</span>)</span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a>close.index <span class="op">=</span> stocks.index</span>
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-12"><a href="#cb6-12" aria-hidden="true" tabindex="-1"></a>g1StockData <span class="op">=</span> getGroupClosingPrices(group1, close)</span>
<span id="cb6-13"><a href="#cb6-13" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> c <span class="kw">in</span> g1StockData.columns:</span>
<span id="cb6-15"><a href="#cb6-15" aria-hidden="true" tabindex="-1"></a>    <span class="co"># p = PowerTransformer(method='box-cox')</span></span>
<span id="cb6-16"><a href="#cb6-16" aria-hidden="true" tabindex="-1"></a>    <span class="co"># g1StockData[c] = p.fit_transform(g1StockData[[c]])</span></span>
<span id="cb6-17"><a href="#cb6-17" aria-hidden="true" tabindex="-1"></a>    <span class="co"># lambdas[c] = p</span></span>
<span id="cb6-18"><a href="#cb6-18" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-19"><a href="#cb6-19" aria-hidden="true" tabindex="-1"></a>    g1StockData[c] <span class="op">=</span> g1StockData[c].<span class="bu">apply</span>(<span class="kw">lambda</span> x: np.log(x) <span class="cf">if</span> x <span class="op">!=</span> <span class="dv">0</span> <span class="cf">else</span> <span class="dv">0</span>)</span>
<span id="cb6-20"><a href="#cb6-20" aria-hidden="true" tabindex="-1"></a>    <span class="co"># g1StockData[c] = g1StockData[c].diff()</span></span>
<span id="cb6-21"><a href="#cb6-21" aria-hidden="true" tabindex="-1"></a>    <span class="co"># g1StockData[c].dropna(inplace=True)</span></span>
<span id="cb6-22"><a href="#cb6-22" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-23"><a href="#cb6-23" aria-hidden="true" tabindex="-1"></a>    ts, p, lags <span class="op">=</span> adf_test(g1StockData[c].dropna())</span>
<span id="cb6-24"><a href="#cb6-24" aria-hidden="true" tabindex="-1"></a>    tsK, pK, lagsK <span class="op">=</span> kpss_test(g1StockData[c].dropna())</span>
<span id="cb6-25"><a href="#cb6-25" aria-hidden="true" tabindex="-1"></a>    db <span class="op">=</span> durbinWatson_test(g1StockData[c].dropna())</span>
<span id="cb6-26"><a href="#cb6-26" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-27"><a href="#cb6-27" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(c)</span>
<span id="cb6-28"><a href="#cb6-28" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(<span class="ss">f"Test Statistic:  ADF: </span><span class="sc">{</span><span class="bu">round</span>(ts, <span class="dv">5</span>)<span class="sc">}</span><span class="ch">\t</span><span class="ss"> KPSS: </span><span class="sc">{</span><span class="bu">round</span>(tsK, <span class="dv">5</span>)<span class="sc">}</span><span class="ss">"</span>)</span>
<span id="cb6-29"><a href="#cb6-29" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(<span class="ss">f"P-Value:         ADF: </span><span class="sc">{</span><span class="bu">round</span>(p, <span class="dv">5</span>)<span class="sc">}</span><span class="ch">\t</span><span class="ss"> KPSS: </span><span class="sc">{</span><span class="bu">round</span>(pK, <span class="dv">5</span>)<span class="sc">}</span><span class="ss">"</span>)</span>
<span id="cb6-30"><a href="#cb6-30" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(<span class="st">"DurbinWatson: </span><span class="sc">%.5f</span><span class="st">"</span> <span class="op">%</span> db)</span>
<span id="cb6-31"><a href="#cb6-31" aria-hidden="true" tabindex="-1"></a>    others <span class="op">=</span> [x <span class="cf">for</span> x <span class="kw">in</span> g1StockData.columns <span class="cf">if</span> x <span class="op">!=</span> c]</span>
<span id="cb6-32"><a href="#cb6-32" aria-hidden="true" tabindex="-1"></a>    gc <span class="op">=</span> {}</span>
<span id="cb6-33"><a href="#cb6-33" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(<span class="st">"Granger Causality Tests"</span>)</span>
<span id="cb6-34"><a href="#cb6-34" aria-hidden="true" tabindex="-1"></a>    <span class="cf">for</span> x <span class="kw">in</span> others:</span>
<span id="cb6-35"><a href="#cb6-35" aria-hidden="true" tabindex="-1"></a>        gc[x] <span class="op">=</span> <span class="bu">min</span>(granger_test(g1StockData[c], g1StockData[x]).values())</span>
<span id="cb6-36"><a href="#cb6-36" aria-hidden="true" tabindex="-1"></a>        <span class="bu">print</span>(x, gc[x])</span>
<span id="cb6-37"><a href="#cb6-37" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> p <span class="op">&gt;</span> <span class="fl">0.05</span> <span class="kw">and</span> pK <span class="op">&lt;</span> <span class="fl">0.05</span>:</span>
<span id="cb6-38"><a href="#cb6-38" aria-hidden="true" tabindex="-1"></a>        g1StockData.drop(columns<span class="op">=</span>c, inplace<span class="op">=</span><span class="va">True</span>)</span>
<span id="cb6-39"><a href="#cb6-39" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-stdout">
<pre><code>CEP
Test Statistic:  ADF: -4.87106   KPSS: 0.1238
P-Value:         ADF: 0.00035    KPSS: 0.0911
DurbinWatson: 2.60440
Granger Causality Tests
PCSC 0.018398930051991544
RAPP 0.008267953926903267
BOW 0.05741566079961837
SVCO 0.0121665509297127
SERV 2.569131905239486e-10
CTNM 0.02967932533155508
BOLD 0.18356175675065428
MGX 0.06379038435606435
ANRO 0.20632205725568947
GUTS 0.0878093488952149
AVBP 0.004055205046340644
PSBD 0.3979728065556022

PCSC
Test Statistic:  ADF: -4.62428   KPSS: 0.1262
P-Value:         ADF: 0.00094    KPSS: 0.08668
DurbinWatson: 2.38954
Granger Causality Tests
CEP 0.7253002930981398
RAPP 0.43459609424848644
BOW 0.359057709271086
SVCO 0.040303422753851424
SERV 0.41501878912969214
CTNM 0.7253418581320649
BOLD 0.2663773523258561
MGX 0.03906736884873641
ANRO 0.0014636482375851685
GUTS 0.16652633698950112
AVBP 0.1503057139056179
PSBD 0.13942511575564365

RAPP
Test Statistic:  ADF: -4.32761   KPSS: 0.09268
P-Value:         ADF: 0.00285    KPSS: 0.1
DurbinWatson: 2.49984
Granger Causality Tests
CEP 0.1491025289315778
PCSC 0.24683257178984666
BOW 0.0015308497938752474
SVCO 0.271968704424047
SERV 2.964543549643123e-06
CTNM 0.32712422924875234
BOLD 0.3800796264120684
MGX 0.07417145017904349
ANRO 0.061289219354879194
GUTS 0.6670089867081552
AVBP 0.1567205804916075
PSBD 0.2699846396119399

BOW
Test Statistic:  ADF: -3.50887   KPSS: 0.09492
P-Value:         ADF: 0.03845    KPSS: 0.1
DurbinWatson: 1.99343
Granger Causality Tests
CEP 0.08187624065376915
PCSC 0.07820943168985632
RAPP 0.0015354569942819976
SVCO 0.10987423303731886
SERV 0.00841786882017044
CTNM 0.27012954328021144
BOLD 0.4872434826867401
MGX 0.008223209264913298
ANRO 0.13760515087361985
GUTS 0.44831944555222236
AVBP 0.07073941992246385
PSBD 0.5639070614036178

SVCO
Test Statistic:  ADF: -2.11824   KPSS: 0.09272
P-Value:         ADF: 0.53587    KPSS: 0.1
DurbinWatson: 2.23338
Granger Causality Tests
CEP 0.06902017530568726
PCSC 0.1567415593481579
RAPP 0.44857759326255353
BOW 0.10823394282139179
SERV 0.8051005120915178
CTNM 0.10307672725747383
BOLD 0.22386537760437183
MGX 0.332347794853642
ANRO 0.2099719303448638
GUTS 0.2988377874132186
AVBP 0.04336817680056798
PSBD 0.2892461037832255

SERV
Test Statistic:  ADF: -8.96804   KPSS: 0.11209
P-Value:         ADF: 0.0    KPSS: 0.1
DurbinWatson: 1.11358
Granger Causality Tests
CEP 0.06462052848771162
PCSC 0.1892794558040669
RAPP 8.896077916219496e-05
BOW 0.0027888334075957142
SVCO 0.42176447390128335
CTNM 0.20765494226960765
BOLD 0.6789097631385217
MGX 0.01939890198032806
ANRO 0.15884854366484147
GUTS 0.07064977547159591
AVBP 0.0037905861095408554
PSBD 0.44058983289404186

CTNM
Test Statistic:  ADF: 0.80003    KPSS: 0.15447
P-Value:         ADF: 1.0    KPSS: 0.04294
DurbinWatson: 2.20008
Granger Causality Tests
CEP 0.058906305734582824
PCSC 0.021638659116818077
RAPP 0.30008368477227115
BOW 0.15410656418381666
SVCO 0.005122683164076716
SERV 0.05881123113837669
BOLD 0.012558127613428166
MGX 0.33044012194744693
ANRO 0.2936603137581739
GUTS 0.5360551466000933
AVBP 0.027862877896793453
PSBD 0.23607031550233595

BOLD
Test Statistic:  ADF: -1.4041    KPSS: 0.11033
P-Value:         ADF: 0.8597     KPSS: 0.1
DurbinWatson: 1.82203
Granger Causality Tests
CEP 0.04356706658795905
PCSC 0.03775419194608621
RAPP 0.053533023327477704
BOW 0.019801000619290934
SVCO 0.005790308756127357
SERV 0.03134746227526417
MGX 0.16734535491985691
ANRO 0.4886258573085557
GUTS 0.617136139698123
AVBP 0.008347740411823654
PSBD 0.1930252113473966

MGX
Test Statistic:  ADF: -2.82966   KPSS: 0.16411
P-Value:         ADF: 0.18626    KPSS: 0.0349
DurbinWatson: 2.03101
Granger Causality Tests
CEP 0.000663349102709499
PCSC 0.04670514351062663
RAPP 0.1651709562152099
BOW 0.01778559410392999
SVCO 0.02357217844753844
SERV 0.0005631330697704794
BOLD 0.03285320346685174
ANRO 0.538488128707094
GUTS 0.07606586130611459
AVBP 0.049464801975924136
PSBD 0.4081812810686565

ANRO
Test Statistic:  ADF: -3.01873   KPSS: 0.09723
P-Value:         ADF: 0.12692    KPSS: 0.1
DurbinWatson: 1.78261
Granger Causality Tests
CEP 0.11201327208601963
PCSC 0.0458022369186799
RAPP 0.3472755559698738
BOW 0.1348123730957287
SVCO 0.08796906843321683
SERV 0.2852377036252603
BOLD 0.29966270734746264
GUTS 0.06029185132202115
AVBP 0.03646614043962709
PSBD 0.19640297548897914

GUTS
Test Statistic:  ADF: -3.45481   KPSS: 0.08782
P-Value:         ADF: 0.04448    KPSS: 0.1
DurbinWatson: 1.66913
Granger Causality Tests
CEP 0.023061061783352637
PCSC 0.007030840590820504
RAPP 0.00027722097377959364
BOW 0.03619210221331742
SVCO 0.19619121305951717
SERV 0.0037801175651760812
BOLD 0.4186587552080332
ANRO 0.09487739164133752
AVBP 0.00030409943489079894
PSBD 0.26722969222751713

AVBP
Test Statistic:  ADF: -3.65177   KPSS: 0.13849
P-Value:         ADF: 0.02574    KPSS: 0.0639
DurbinWatson: 2.48813
Granger Causality Tests
CEP 0.018154827029324288
PCSC 0.241929250177746
RAPP 0.004680924445177441
BOW 0.04886560649480386
SVCO 0.26927624224636953
SERV 2.492120189140341e-07
BOLD 0.07605732902465005
ANRO 0.8189941911585441
GUTS 0.28126009001128
PSBD 0.5325622151333134

PSBD
Test Statistic:  ADF: -3.62758   KPSS: 0.08807
P-Value:         ADF: 0.02759    KPSS: 0.1
DurbinWatson: 2.55556
Granger Causality Tests
CEP 0.28346592934572545
PCSC 0.2525818375421246
RAPP 0.4643586566096587
BOW 0.2727132098706565
SVCO 0.027263023113685864
SERV 0.17495528105213884
BOLD 0.1688591650549159
ANRO 0.3709951453581626
GUTS 0.22173312173355147
AVBP 0.16875412103652332
</code></pre>
</div>
</div>
</details>
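<p>The loop above drops a ticker only when both tests point the same way: the ADF p-value is above 0.05 and the KPSS p-value is below 0.05. The sketch below restates that rule as a standalone function and applies it to four of the p-value pairs printed above. It assumes <code>adf_test</code> and <code>kpss_test</code> return the standard p-values (ADF null: unit root; KPSS null: stationarity); the name <code>classify_stationarity</code> and its labels are illustrative and not part of <code>prediction_util</code>.</p>

```python
# Illustrative helper; not part of prediction_util.
def classify_stationarity(adf_p, kpss_p, alpha=0.05):
    """Combine an ADF and a KPSS p-value into a single label."""
    adf_stationary = adf_p <= alpha    # ADF rejects the unit-root null
    kpss_stationary = kpss_p >= alpha  # KPSS does not reject stationarity
    if adf_stationary and kpss_stationary:
        return "stationary"
    if not adf_stationary and not kpss_stationary:
        return "non-stationary"        # the only case the notebook drops
    return "inconclusive"              # the tests disagree; the notebook keeps these


# (ADF p-value, KPSS p-value) pairs copied from the output above
p_values = {
    "CEP": (0.00035, 0.0911),
    "SVCO": (0.53587, 0.1),
    "CTNM": (1.0, 0.04294),
    "MGX": (0.18626, 0.0349),
}

labels = {ticker: classify_stationarity(a, k) for ticker, (a, k) in p_values.items()}
dropped = [ticker for ticker, label in labels.items() if label == "non-stationary"]

for ticker, label in labels.items():
    print(ticker, label)
print("dropped:", dropped)
```

<p>Under this rule CTNM and MGX are the two tickers removed, which is why they are absent from the later Granger lists and from the forecast table. Tickers such as SVCO, where the two tests disagree, are kept.</p>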
<div id="3d708fed" class="cell" data-execution_count="6">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb8"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>plt.figure(figsize<span class="op">=</span>(<span class="dv">9</span>, <span class="fl">4.8</span>))</span>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> c <span class="kw">in</span> g1StockData.columns:</span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a>    plt.plot(g1StockData.index, g1StockData[c].<span class="bu">apply</span>(<span class="kw">lambda</span> x: np.exp(x) <span class="cf">if</span> x <span class="op">!=</span> <span class="dv">0</span> <span class="cf">else</span> <span class="dv">0</span>), label<span class="op">=</span>c)</span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a>plt.legend()</span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a>plt.xticks(rotation<span class="op">=</span><span class="dv">90</span>)</span>
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a>plt.show()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="index_files/figure-html/cell-7-output-1.png" width="717" height="433" class="figure-img"></p>
</figure>
</div>
</div>
</div>
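<p>The log transform in these cells stores a zero price as 0, and the inverse used for plotting maps 0 back to 0. Because log(1) is also 0, a closing price of exactly 1.0 would come back as 0 after the round trip. The sketch below reproduces that edge case with the standard library and shows one possible alternative that marks missing prices as NaN instead. All four function names are illustrative, and whether any price in the data equals 1.0 is not known from the output shown here.</p>

```python
import math


def to_log(price):
    # Same guard as the notebook: a zero price is stored as 0.
    return math.log(price) if price != 0 else 0


def from_log(value):
    # Same inverse as the notebook: 0 is read back as a zero price.
    return math.exp(value) if value != 0 else 0


def to_log_nan(price):
    # Alternative (assumption): non-positive prices become NaN.
    return math.log(price) if price > 0 else math.nan


def from_log_nan(value):
    # exp(NaN) stays NaN, so no special case is needed on the way back.
    return math.exp(value)


round_trip_one = from_log(to_log(1.0))              # price 1.0 collapses to 0
round_trip_one_nan = from_log_nan(to_log_nan(1.0))  # price 1.0 survives
round_trip_typical = from_log(to_log(10.07))        # ordinary prices are unaffected

print(round_trip_one, round_trip_one_nan, round_trip_typical)
```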
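<p>Once the clusters were created, the stocks from a single cluster were passed to a VAR model. The small cluster (<code>Cluster == 1</code> in <code>clustered.csv</code>) was analyzed: the model was fit on the first 55 of the 60 days, and the last five days were held out to evaluate the forecast.</p>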
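<p>One detail worth checking in a hold-out setup like the one in the next cell is which rows seed the forecast. The model is fit on <code>g1StockData.iloc[:-5]</code>, while <code>forecast_input</code> is taken as <code>g1StockData.values[-(lag_order):]</code>, which is the end of the full frame rather than the end of the training slice. The sketch below uses a plain list of day numbers as a stand-in for the 60 rows to show the difference between the two slices; <code>lag_order = 1</code> is an assumption (the notebook reads it from <code>results.k_ar</code>).</p>

```python
days = list(range(1, 61))  # stand-in for the 60 rows of g1StockData
holdout = 5
lag_order = 1              # assumed; the notebook reads it from results.k_ar

train = days[:-holdout]    # rows the model is fit on (days 1..55)

seed_from_full = days[-lag_order:]    # slice used in the notebook
seed_from_train = train[-lag_order:]  # slice ending where the training data ends

# Days a 5-step forecast covers when seeded from the end of the training slice
forecast_days = [seed_from_train[-1] + step for step in range(1, holdout + 1)]

print("seed from full frame:", seed_from_full)
print("seed from training slice:", seed_from_train)
print("forecast covers days:", forecast_days)
```

<p>Seeding from the training slice makes the five forecast steps line up with the <code>Day 56</code> to <code>Day 60</code> labels used for <code>forecast_df</code>.</p>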
<div id="b4c085f0" class="cell" data-execution_count="7">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> statsmodels.tools.sm_exceptions <span class="im">import</span> ValueWarning</span>
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> prediction.prediction_util <span class="im">import</span> <span class="op">*</span></span>
<span id="cb9-4"><a href="#cb9-4" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> statsmodels.tsa.api <span class="im">import</span> VAR</span>
<span id="cb9-5"><a href="#cb9-5" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> matplotlib.pyplot <span class="im">as</span> plt</span>
<span id="cb9-6"><a href="#cb9-6" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> sklearn.metrics <span class="im">import</span> mean_absolute_error, mean_squared_error</span>
<span id="cb9-7"><a href="#cb9-7" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-8"><a href="#cb9-8" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> warnings</span>
<span id="cb9-9"><a href="#cb9-9" aria-hidden="true" tabindex="-1"></a>warnings.filterwarnings(<span class="st">'ignore'</span>, category<span class="op">=</span>ValueWarning)</span>
<span id="cb9-10"><a href="#cb9-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-11"><a href="#cb9-11" aria-hidden="true" tabindex="-1"></a>stocks <span class="op">=</span> pd.read_csv(<span class="st">"data-collection/stocks.csv"</span>, header<span class="op">=</span>[<span class="dv">0</span>,<span class="dv">1</span>], index_col<span class="op">=</span>[<span class="dv">0</span>])</span>
<span id="cb9-12"><a href="#cb9-12" aria-hidden="true" tabindex="-1"></a>ipos <span class="op">=</span> pd.read_csv(<span class="st">"analysis/clustered.csv"</span>)</span>
<span id="cb9-13"><a href="#cb9-13" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-14"><a href="#cb9-14" aria-hidden="true" tabindex="-1"></a>group1 <span class="op">=</span> ipos[ipos[<span class="st">"Cluster"</span>] <span class="op">==</span> <span class="dv">1</span>]</span>
<span id="cb9-15"><a href="#cb9-15" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-16"><a href="#cb9-16" aria-hidden="true" tabindex="-1"></a>close <span class="op">=</span> stocks.xs(<span class="st">'Close'</span>, axis<span class="op">=</span><span class="dv">1</span>, level<span class="op">=</span><span class="dv">1</span>)</span>
<span id="cb9-17"><a href="#cb9-17" aria-hidden="true" tabindex="-1"></a>close.index <span class="op">=</span> stocks.index</span>
<span id="cb9-18"><a href="#cb9-18" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-19"><a href="#cb9-19" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-20"><a href="#cb9-20" aria-hidden="true" tabindex="-1"></a>g1StockData <span class="op">=</span> getGroupClosingPrices(group1, close)</span>
<span id="cb9-21"><a href="#cb9-21" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-22"><a href="#cb9-22" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> c <span class="kw">in</span> g1StockData.columns:</span>
<span id="cb9-23"><a href="#cb9-23" aria-hidden="true" tabindex="-1"></a>    <span class="co"># p = PowerTransformer(method='box-cox')</span></span>
<span id="cb9-24"><a href="#cb9-24" aria-hidden="true" tabindex="-1"></a>    <span class="co"># g1StockData[c] = p.fit_transform(g1StockData[[c]])</span></span>
<span id="cb9-25"><a href="#cb9-25" aria-hidden="true" tabindex="-1"></a>    <span class="co"># lambdas[c] = p</span></span>
<span id="cb9-26"><a href="#cb9-26" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-27"><a href="#cb9-27" aria-hidden="true" tabindex="-1"></a>    g1StockData[c] <span class="op">=</span> g1StockData[c].<span class="bu">apply</span>(<span class="kw">lambda</span> x: np.log(x) <span class="cf">if</span> x <span class="op">!=</span> <span class="dv">0</span> <span class="cf">else</span> <span class="dv">0</span>)</span>
<span id="cb9-28"><a href="#cb9-28" aria-hidden="true" tabindex="-1"></a>    <span class="co"># g1StockData[c] = g1StockData[c].diff()</span></span>
<span id="cb9-29"><a href="#cb9-29" aria-hidden="true" tabindex="-1"></a>    <span class="co"># g1StockData[c].dropna(inplace=True)</span></span>
<span id="cb9-30"><a href="#cb9-30" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-31"><a href="#cb9-31" aria-hidden="true" tabindex="-1"></a>    ts, p, lags <span class="op">=</span> adf_test(g1StockData[c].dropna())</span>
<span id="cb9-32"><a href="#cb9-32" aria-hidden="true" tabindex="-1"></a>    tsK, pK, lagsK <span class="op">=</span> kpss_test(g1StockData[c].dropna())</span>
<span id="cb9-33"><a href="#cb9-33" aria-hidden="true" tabindex="-1"></a>    db <span class="op">=</span> durbinWatson_test(g1StockData[c].dropna())</span>
<span id="cb9-34"><a href="#cb9-34" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> p <span class="op">&gt;</span> <span class="fl">0.05</span> <span class="kw">and</span> pK <span class="op">&lt;</span> <span class="fl">0.05</span>:</span>
<span id="cb9-35"><a href="#cb9-35" aria-hidden="true" tabindex="-1"></a>        g1StockData.drop(columns<span class="op">=</span>c, inplace<span class="op">=</span><span class="va">True</span>)</span>
<span id="cb9-36"><a href="#cb9-36" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-37"><a href="#cb9-37" aria-hidden="true" tabindex="-1"></a>date_labels <span class="op">=</span> pd.date_range(start<span class="op">=</span><span class="st">'2024-01-01'</span>, periods<span class="op">=</span><span class="dv">60</span>, freq<span class="op">=</span><span class="st">'D'</span>)</span>
<span id="cb9-38"><a href="#cb9-38" aria-hidden="true" tabindex="-1"></a>g1StockData[<span class="st">"date"</span>] <span class="op">=</span> date_labels</span>
<span id="cb9-39"><a href="#cb9-39" aria-hidden="true" tabindex="-1"></a>g1StockData.set_index(<span class="st">"date"</span>, inplace<span class="op">=</span><span class="va">True</span>)</span>
<span id="cb9-40"><a href="#cb9-40" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-41"><a href="#cb9-41" aria-hidden="true" tabindex="-1"></a>model <span class="op">=</span> VAR(g1StockData.iloc[:<span class="op">-</span><span class="dv">5</span>])</span>
<span id="cb9-42"><a href="#cb9-42" aria-hidden="true" tabindex="-1"></a>results <span class="op">=</span> model.fit(maxlags<span class="op">=</span><span class="dv">1</span>, ic<span class="op">=</span><span class="st">'aic'</span>)</span>
<span id="cb9-43"><a href="#cb9-43" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-44"><a href="#cb9-44" aria-hidden="true" tabindex="-1"></a>lag_order <span class="op">=</span> results.k_ar</span>
<span id="cb9-45"><a href="#cb9-45" aria-hidden="true" tabindex="-1"></a>forecast_input <span class="op">=</span> g1StockData.values[<span class="op">-</span>(lag_order):]</span>
<span id="cb9-46"><a href="#cb9-46" aria-hidden="true" tabindex="-1"></a>forecast <span class="op">=</span> results.forecast(y<span class="op">=</span>forecast_input, steps<span class="op">=</span><span class="dv">5</span>)</span>
<span id="cb9-47"><a href="#cb9-47" aria-hidden="true" tabindex="-1"></a>forecast_df <span class="op">=</span> pd.DataFrame(forecast, index<span class="op">=</span>[<span class="ss">f"Day </span><span class="sc">{</span>i<span class="sc">}</span><span class="ss">"</span> <span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(<span class="dv">56</span>, <span class="dv">61</span>)], columns<span class="op">=</span>g1StockData.columns)</span>
<span id="cb9-48"><a href="#cb9-48" aria-hidden="true" tabindex="-1"></a>g1StockData.index <span class="op">=</span> [<span class="ss">f"Day </span><span class="sc">{</span>i<span class="sc">}</span><span class="ss">"</span> <span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(<span class="dv">1</span>, <span class="dv">61</span>)]</span>
<span id="cb9-49"><a href="#cb9-49" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-50"><a href="#cb9-50" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> c <span class="kw">in</span> forecast_df.columns:</span>
<span id="cb9-51"><a href="#cb9-51" aria-hidden="true" tabindex="-1"></a>    g1StockData[c] <span class="op">=</span> g1StockData[c].<span class="bu">apply</span>(<span class="kw">lambda</span> x: np.exp(x) <span class="cf">if</span> x <span class="op">!=</span> <span class="dv">0</span> <span class="cf">else</span> <span class="dv">0</span>)</span>
<span id="cb9-52"><a href="#cb9-52" aria-hidden="true" tabindex="-1"></a>    forecast_df[c] <span class="op">=</span> forecast_df[c].<span class="bu">apply</span>(<span class="kw">lambda</span> x: np.exp(x) <span class="cf">if</span> x <span class="op">!=</span> <span class="dv">0</span> <span class="cf">else</span> <span class="dv">0</span>)</span>
<span id="cb9-53"><a href="#cb9-53" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-54"><a href="#cb9-54" aria-hidden="true" tabindex="-1"></a>plt.figure(figsize<span class="op">=</span>(<span class="dv">9</span>, <span class="fl">4.8</span>))</span>
<span id="cb9-55"><a href="#cb9-55" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i, c <span class="kw">in</span> <span class="bu">enumerate</span>(g1StockData.columns):</span>
<span id="cb9-56"><a href="#cb9-56" aria-hidden="true" tabindex="-1"></a>    plt.plot(g1StockData.iloc[:<span class="op">-</span><span class="dv">5</span>].index, g1StockData[c].iloc[:<span class="op">-</span><span class="dv">5</span>], label<span class="op">=</span>c)</span>
<span id="cb9-57"><a href="#cb9-57" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i, c <span class="kw">in</span> <span class="bu">enumerate</span>(g1StockData.columns):</span>
<span id="cb9-58"><a href="#cb9-58" aria-hidden="true" tabindex="-1"></a>    plt.plot(forecast_df.index, <span class="bu">list</span>(forecast_df.iloc[:, i]), color<span class="op">=</span>plt.gca().lines[i].get_color(), label<span class="op">=</span><span class="va">None</span>)</span>
<span id="cb9-59"><a href="#cb9-59" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-60"><a href="#cb9-60" aria-hidden="true" tabindex="-1"></a>plt.legend(loc<span class="op">=</span><span class="st">'upper left'</span>)</span>
<span id="cb9-61"><a href="#cb9-61" aria-hidden="true" tabindex="-1"></a>plt.xticks(rotation<span class="op">=</span><span class="dv">90</span>)</span>
<span id="cb9-62"><a href="#cb9-62" aria-hidden="true" tabindex="-1"></a>plt.show()</span>
<span id="cb9-63"><a href="#cb9-63" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-64"><a href="#cb9-64" aria-hidden="true" tabindex="-1"></a>correct <span class="op">=</span> g1StockData.iloc[<span class="op">-</span><span class="dv">1</span>]</span>
<span id="cb9-65"><a href="#cb9-65" aria-hidden="true" tabindex="-1"></a>pred <span class="op">=</span> forecast_df.iloc[<span class="op">-</span><span class="dv">1</span>]</span>
<span id="cb9-66"><a href="#cb9-66" aria-hidden="true" tabindex="-1"></a>prev <span class="op">=</span> g1StockData.iloc[<span class="op">-</span><span class="dv">2</span>]</span>
<span id="cb9-67"><a href="#cb9-67" aria-hidden="true" tabindex="-1"></a>error <span class="op">=</span> <span class="bu">abs</span>(correct <span class="op">-</span> pred)</span>
<span id="cb9-68"><a href="#cb9-68" aria-hidden="true" tabindex="-1"></a>errorPct <span class="op">=</span> error <span class="op">/</span> correct <span class="op">*</span> <span class="dv">100</span></span>
<span id="cb9-69"><a href="#cb9-69" aria-hidden="true" tabindex="-1"></a>results <span class="op">=</span> pd.DataFrame({<span class="st">"correct"</span>:correct, <span class="st">"pred"</span>:pred, <span class="st">"prev"</span>:prev, <span class="st">"errorPct"</span>:errorPct, <span class="st">"error"</span>:error})</span>
<span id="cb9-70"><a href="#cb9-70" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(results)</span>
<span id="cb9-71"><a href="#cb9-71" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-72"><a href="#cb9-72" aria-hidden="true" tabindex="-1"></a><span class="co"># Calculate MAE</span></span>
<span id="cb9-73"><a href="#cb9-73" aria-hidden="true" tabindex="-1"></a>mae <span class="op">=</span> mean_absolute_error(results[<span class="st">'correct'</span>], results[<span class="st">'pred'</span>])</span>
<span id="cb9-74"><a href="#cb9-74" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(<span class="st">'Mean Absolute Error:'</span>, mae)</span>
<span id="cb9-75"><a href="#cb9-75" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-76"><a href="#cb9-76" aria-hidden="true" tabindex="-1"></a><span class="co"># Calculate MSE</span></span>
<span id="cb9-77"><a href="#cb9-77" aria-hidden="true" tabindex="-1"></a>mse <span class="op">=</span> mean_squared_error(results[<span class="st">'correct'</span>], results[<span class="st">'pred'</span>])</span>
<span id="cb9-78"><a href="#cb9-78" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(<span class="st">'Mean Squared Error:'</span>, mse)</span>
<span id="cb9-79"><a href="#cb9-79" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-80"><a href="#cb9-80" aria-hidden="true" tabindex="-1"></a>count <span class="op">=</span> <span class="dv">0</span></span>
<span id="cb9-81"><a href="#cb9-81" aria-hidden="true" tabindex="-1"></a>errorPct <span class="op">=</span> []</span>
<span id="cb9-82"><a href="#cb9-82" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(<span class="bu">len</span>(correct)):</span>
<span id="cb9-83"><a href="#cb9-83" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> correct.iloc[i] <span class="op">&lt;</span> prev.iloc[i] <span class="kw">and</span> pred.iloc[i] <span class="op">&lt;</span> prev.iloc[i]:</span>
<span id="cb9-84"><a href="#cb9-84" aria-hidden="true" tabindex="-1"></a>        count <span class="op">+=</span> <span class="dv">1</span></span>
<span id="cb9-85"><a href="#cb9-85" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> correct.iloc[i] <span class="op">&gt;</span> prev.iloc[i] <span class="kw">and</span> pred.iloc[i] <span class="op">&gt;</span> prev.iloc[i]:</span>
<span id="cb9-86"><a href="#cb9-86" aria-hidden="true" tabindex="-1"></a>        count <span class="op">+=</span> <span class="dv">1</span></span>
<span id="cb9-87"><a href="#cb9-87" aria-hidden="true" tabindex="-1"></a>    errorPct.append(<span class="bu">abs</span>(correct.iloc[i] <span class="op">-</span> pred.iloc[i]) <span class="op">/</span> correct.iloc[i] <span class="op">*</span> <span class="dv">100</span>)</span>
<span id="cb9-88"><a href="#cb9-88" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(<span class="st">"Percent Directionally Correct:"</span>, count <span class="op">/</span> <span class="bu">len</span>(correct), <span class="ss">f"(</span><span class="sc">{</span>count<span class="sc">}</span><span class="ss">/</span><span class="sc">{</span><span class="bu">len</span>(correct)<span class="sc">}</span><span class="ss">)"</span>)</span>
<span id="cb9-89"><a href="#cb9-89" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(<span class="st">"MAE of Percent Error:"</span>, <span class="bu">sum</span>(results[<span class="st">"errorPct"</span>]) <span class="op">/</span> <span class="bu">len</span>(results))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="index_files/figure-html/cell-8-output-1.png" width="717" height="433" class="figure-img"></p>
</figure>
</div>
</div>
<div class="cell-output cell-output-stdout">
<pre><code>        correct       pred       prev   errorPct     error
CEP   10.070000  10.052977  10.070000   0.169043  0.017023
PCSC  10.050000  10.087239  10.050000   0.370537  0.037239
RAPP  20.240000  19.579422  21.750000   3.263726  0.660578
BOW   27.840000  27.296792  26.950001   1.951178  0.543208
SVCO  16.180000  17.306084  15.880000   6.959726  1.126084
SERV   2.800000   1.797361   2.880000  35.808534  1.002639
BOLD   5.410000   7.359080   4.770000  36.027351  1.949080
ANRO  15.560000  15.782397  14.180000   1.429287  0.222397
GUTS   6.980000   7.280076   6.700000   4.299082  0.300076
AVBP  15.900000  15.026402  15.500000   5.494325  0.873598
PSBD  16.110001  16.617865  16.290001   3.152482  0.507865
Mean Absolute Error: 0.6581623048931214
Mean Squared Error: 0.7241801019328876
Percent Directionally Correct: 0.6363636363636364 (7/11)
MAE of Percent Error: 8.993206453218596</code></pre>
</div>
</div>
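<p>The summary figures can be recomputed directly from the printed table, which also shows where the directional misses come from. The sketch below copies the <code>correct</code>, <code>pred</code> and <code>prev</code> columns from the output above and uses only the standard library; the helper name <code>direction</code> is illustrative.</p>

```python
# (correct, pred, prev) copied from the printed table above
rows = {
    "CEP": (10.070000, 10.052977, 10.070000),
    "PCSC": (10.050000, 10.087239, 10.050000),
    "RAPP": (20.240000, 19.579422, 21.750000),
    "BOW": (27.840000, 27.296792, 26.950001),
    "SVCO": (16.180000, 17.306084, 15.880000),
    "SERV": (2.800000, 1.797361, 2.880000),
    "BOLD": (5.410000, 7.359080, 4.770000),
    "ANRO": (15.560000, 15.782397, 14.180000),
    "GUTS": (6.980000, 7.280076, 6.700000),
    "AVBP": (15.900000, 15.026402, 15.500000),
    "PSBD": (16.110001, 16.617865, 16.290001),
}


def direction(new, old):
    """Return 1 for a rise, -1 for a fall, 0 for no change."""
    return (new > old) - (new < old)


n = len(rows)
mae = sum(abs(c - p) for c, p, _ in rows.values()) / n
mape = sum(abs(c - p) / c * 100 for c, p, _ in rows.values()) / n

# A hit requires the actual price to move and the prediction to move the same way.
hits = sum(
    1
    for c, p, v in rows.values()
    if direction(c, v) != 0 and direction(c, v) == direction(p, v)
)
unchanged = [ticker for ticker, (c, _, v) in rows.items() if c == v]

print("MAE:", round(mae, 5))
print("Mean percent error:", round(mape, 4))
print("Directional hits:", hits, "of", n)
print("Unchanged on the last day:", unchanged)
```

<p>CEP and PCSC closed unchanged from the previous day, so they can never count as directionally correct under the strict comparisons in the notebook. The 7/11 figure therefore corresponds to 7 of the 9 tickers whose price actually moved.</p>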
<p>Then the clusters were analyzed by industry group.</p>
<details>
<summary>
Show results of Industry Group Tests
</summary>
<div id="dc44ff03" class="cell" data-execution_count="8">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb11"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> statsmodels.tools.sm_exceptions <span class="im">import</span> ValueWarning</span>
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> prediction.prediction_util <span class="im">import</span> <span class="op">*</span></span>
<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> statsmodels.tsa.api <span class="im">import</span> VAR</span>
<span id="cb11-5"><a href="#cb11-5" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> matplotlib.pyplot <span class="im">as</span> plt</span>
<span id="cb11-6"><a href="#cb11-6" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> sklearn.metrics <span class="im">import</span> mean_absolute_error, mean_squared_error</span>
<span id="cb11-7"><a href="#cb11-7" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-8"><a href="#cb11-8" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> warnings</span>
<span id="cb11-9"><a href="#cb11-9" aria-hidden="true" tabindex="-1"></a>warnings.filterwarnings(<span class="st">'ignore'</span>, category<span class="op">=</span>ValueWarning)</span>
<span id="cb11-10"><a href="#cb11-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-11"><a href="#cb11-11" aria-hidden="true" tabindex="-1"></a>stocks <span class="op">=</span> pd.read_csv(<span class="st">"data-collection/stocks.csv"</span>, header<span class="op">=</span>[<span class="dv">0</span>,<span class="dv">1</span>], index_col<span class="op">=</span>[<span class="dv">0</span>])</span>
<span id="cb11-12"><a href="#cb11-12" aria-hidden="true" tabindex="-1"></a>ipos <span class="op">=</span> pd.read_csv(<span class="st">"analysis/clustered.csv"</span>)</span>
<span id="cb11-13"><a href="#cb11-13" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-14"><a href="#cb11-14" aria-hidden="true" tabindex="-1"></a>industriesLabels <span class="op">=</span> [</span>
<span id="cb11-15"><a href="#cb11-15" aria-hidden="true" tabindex="-1"></a>    <span class="st">"Basic Materials"</span>,</span>
<span id="cb11-16"><a href="#cb11-16" aria-hidden="true" tabindex="-1"></a>    <span class="st">"Blank Check"</span>,</span>
<span id="cb11-17"><a href="#cb11-17" aria-hidden="true" tabindex="-1"></a>    <span class="st">"Consumer Goods"</span>,</span>
<span id="cb11-18"><a href="#cb11-18" aria-hidden="true" tabindex="-1"></a>    <span class="st">"Consumer Services"</span>,</span>
<span id="cb11-19"><a href="#cb11-19" aria-hidden="true" tabindex="-1"></a>    <span class="st">"Financials"</span>,</span>
<span id="cb11-20"><a href="#cb11-20" aria-hidden="true" tabindex="-1"></a>    <span class="st">"Health Care"</span>,</span>
<span id="cb11-21"><a href="#cb11-21" aria-hidden="true" tabindex="-1"></a>    <span class="st">"Industrials"</span>,</span>
<span id="cb11-22"><a href="#cb11-22" aria-hidden="true" tabindex="-1"></a>    <span class="st">"Oil and Gas"</span>,</span>
<span id="cb11-23"><a href="#cb11-23" aria-hidden="true" tabindex="-1"></a>    <span class="st">"Other"</span>,</span>
<span id="cb11-24"><a href="#cb11-24" aria-hidden="true" tabindex="-1"></a>    <span class="st">"Technology"</span>,</span>
<span id="cb11-25"><a href="#cb11-25" aria-hidden="true" tabindex="-1"></a>    <span class="st">"Utilities"</span>,</span>
<span id="cb11-26"><a href="#cb11-26" aria-hidden="true" tabindex="-1"></a>]</span>
<span id="cb11-27"><a href="#cb11-27" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-28"><a href="#cb11-28" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> INUM, ILABEL <span class="kw">in</span> <span class="bu">enumerate</span>(industriesLabels):</span>
<span id="cb11-29"><a href="#cb11-29" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> INUM <span class="op">!=</span> <span class="dv">0</span>:</span>
<span id="cb11-30"><a href="#cb11-30" aria-hidden="true" tabindex="-1"></a>        <span class="bu">print</span>(<span class="st">"</span><span class="ch">\n\n</span><span class="st">"</span>)</span>
<span id="cb11-31"><a href="#cb11-31" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(<span class="st">"INDUSTRY:"</span>, ILABEL)</span>
<span id="cb11-32"><a href="#cb11-32" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-33"><a href="#cb11-33" aria-hidden="true" tabindex="-1"></a>    group1 <span class="op">=</span> ipos[ipos[<span class="st">"Industry_Cluster"</span>] <span class="op">==</span> INUM]</span>
<span id="cb11-34"><a href="#cb11-34" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-35"><a href="#cb11-35" aria-hidden="true" tabindex="-1"></a>    close <span class="op">=</span> stocks.xs(<span class="st">'Close'</span>, axis<span class="op">=</span><span class="dv">1</span>, level<span class="op">=</span><span class="dv">1</span>)</span>
<span id="cb11-36"><a href="#cb11-36" aria-hidden="true" tabindex="-1"></a>    close.index <span class="op">=</span> stocks.index</span>
<span id="cb11-37"><a href="#cb11-37" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-38"><a href="#cb11-38" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-39"><a href="#cb11-39" aria-hidden="true" tabindex="-1"></a>    g1StockData <span class="op">=</span> getGroupClosingPrices(group1, close)</span>
<span id="cb11-40"><a href="#cb11-40" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-41"><a href="#cb11-41" aria-hidden="true" tabindex="-1"></a>    <span class="cf">for</span> c <span class="kw">in</span> g1StockData.columns:</span>
<span id="cb11-42"><a href="#cb11-42" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-43"><a href="#cb11-43" aria-hidden="true" tabindex="-1"></a>        g1StockData[c] <span class="op">=</span> g1StockData[c].<span class="bu">apply</span>(<span class="kw">lambda</span> x: np.log(x) <span class="cf">if</span> x <span class="op">!=</span> <span class="dv">0</span> <span class="cf">else</span> <span class="dv">0</span>)</span>
<span id="cb11-44"><a href="#cb11-44" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-45"><a href="#cb11-45" aria-hidden="true" tabindex="-1"></a>        ts, p, lags <span class="op">=</span> adf_test(g1StockData[c].dropna())</span>
<span id="cb11-46"><a href="#cb11-46" aria-hidden="true" tabindex="-1"></a>        tsK, pK, lagsK <span class="op">=</span> kpss_test(g1StockData[c].dropna())</span>
<span id="cb11-47"><a href="#cb11-47" aria-hidden="true" tabindex="-1"></a>        db <span class="op">=</span> durbinWatson_test(g1StockData[c].dropna())</span>
<span id="cb11-48"><a href="#cb11-48" aria-hidden="true" tabindex="-1"></a>        <span class="bu">print</span>(c)</span>
<span id="cb11-49"><a href="#cb11-49" aria-hidden="true" tabindex="-1"></a>        <span class="bu">print</span>(<span class="ss">f"Test Statistic:  ADF: </span><span class="sc">{</span><span class="bu">round</span>(ts, <span class="dv">5</span>)<span class="sc">}</span><span class="ch">\t</span><span class="ss"> KPSS: </span><span class="sc">{</span><span class="bu">round</span>(tsK, <span class="dv">5</span>)<span class="sc">}</span><span class="ss">"</span>)</span>
<span id="cb11-50"><a href="#cb11-50" aria-hidden="true" tabindex="-1"></a>        <span class="bu">print</span>(<span class="ss">f"P-Value:         ADF: </span><span class="sc">{</span><span class="bu">round</span>(p, <span class="dv">5</span>)<span class="sc">}</span><span class="ch">\t</span><span class="ss"> KPSS: </span><span class="sc">{</span><span class="bu">round</span>(pK, <span class="dv">5</span>)<span class="sc">}</span><span class="ss">"</span>)</span>
<span id="cb11-51"><a href="#cb11-51" aria-hidden="true" tabindex="-1"></a>        <span class="bu">print</span>(<span class="st">"DurbinWatson: </span><span class="sc">%.5f</span><span class="st">"</span> <span class="op">%</span> db)</span>
<span id="cb11-52"><a href="#cb11-52" aria-hidden="true" tabindex="-1"></a>        others <span class="op">=</span> [x <span class="cf">for</span> x <span class="kw">in</span> g1StockData.columns <span class="cf">if</span> x <span class="op">!=</span> c]</span>
<span id="cb11-53"><a href="#cb11-53" aria-hidden="true" tabindex="-1"></a>        gc <span class="op">=</span> {}</span>
<span id="cb11-54"><a href="#cb11-54" aria-hidden="true" tabindex="-1"></a>        <span class="bu">print</span>(<span class="st">"Granger Causality Tests"</span>)</span>
<span id="cb11-55"><a href="#cb11-55" aria-hidden="true" tabindex="-1"></a>        <span class="cf">for</span> x <span class="kw">in</span> others:</span>
<span id="cb11-56"><a href="#cb11-56" aria-hidden="true" tabindex="-1"></a>            gc[x] <span class="op">=</span> <span class="bu">min</span>(granger_test(g1StockData[c], g1StockData[x]).values())</span>
<span id="cb11-57"><a href="#cb11-57" aria-hidden="true" tabindex="-1"></a>            <span class="bu">print</span>(x, gc[x])</span>
<span id="cb11-58"><a href="#cb11-58" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> p <span class="op">&gt;</span> <span class="fl">0.05</span> <span class="kw">and</span> pK <span class="op">&lt;</span> <span class="fl">0.05</span>:</span>
<span id="cb11-59"><a href="#cb11-59" aria-hidden="true" tabindex="-1"></a>            g1StockData.drop(columns<span class="op">=</span>c, inplace<span class="op">=</span><span class="va">True</span>)</span>
<span id="cb11-60"><a href="#cb11-60" aria-hidden="true" tabindex="-1"></a>        <span class="bu">print</span>()</span>
<span id="cb11-61"><a href="#cb11-61" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-62"><a href="#cb11-62" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> <span class="bu">len</span>(g1StockData.columns) <span class="op">&lt;</span> <span class="dv">2</span>:</span>
<span id="cb11-63"><a href="#cb11-63" aria-hidden="true" tabindex="-1"></a>        <span class="bu">print</span>(<span class="st">"Insufficient number of symbols after preprocessing"</span>)</span>
<span id="cb11-64"><a href="#cb11-64" aria-hidden="true" tabindex="-1"></a>        <span class="cf">continue</span></span>
<span id="cb11-65"><a href="#cb11-65" aria-hidden="true" tabindex="-1"></a>    g1StockData <span class="op">=</span> g1StockData.iloc[:, <span class="dv">0</span>:<span class="dv">15</span>]</span>
<span id="cb11-66"><a href="#cb11-66" aria-hidden="true" tabindex="-1"></a>    date_labels <span class="op">=</span> pd.date_range(start<span class="op">=</span><span class="st">'2024-01-01'</span>, periods<span class="op">=</span><span class="dv">60</span>, freq<span class="op">=</span><span class="st">'D'</span>)</span>
<span id="cb11-67"><a href="#cb11-67" aria-hidden="true" tabindex="-1"></a>    g1StockData[<span class="st">"date"</span>] <span class="op">=</span> date_labels</span>
<span id="cb11-68"><a href="#cb11-68" aria-hidden="true" tabindex="-1"></a>    g1StockData.set_index(<span class="st">"date"</span>, inplace<span class="op">=</span><span class="va">True</span>)</span>
<span id="cb11-69"><a href="#cb11-69" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-70"><a href="#cb11-70" aria-hidden="true" tabindex="-1"></a>    model <span class="op">=</span> VAR(g1StockData.iloc[:<span class="op">-</span><span class="dv">5</span>])</span>
<span id="cb11-71"><a href="#cb11-71" aria-hidden="true" tabindex="-1"></a>    results <span class="op">=</span> model.fit(maxlags<span class="op">=</span><span class="dv">1</span>, ic<span class="op">=</span><span class="st">'aic'</span>)</span>
<span id="cb11-72"><a href="#cb11-72" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-73"><a href="#cb11-73" aria-hidden="true" tabindex="-1"></a>    lag_order <span class="op">=</span> results.k_ar</span>
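<span id="cb11-74"><a href="#cb11-74" aria-hidden="true" tabindex="-1"></a>    forecast_input <span class="op">=</span> g1StockData.values[<span class="op">-</span>(lag_order):]  <span class="co"># NOTE: seeds the forecast with the last rows of the full series, which include the 5 held-out days</span></span>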
<span id="cb11-75"><a href="#cb11-75" aria-hidden="true" tabindex="-1"></a>    forecast <span class="op">=</span> results.forecast(y<span class="op">=</span>forecast_input, steps<span class="op">=</span><span class="dv">5</span>)</span>
<span id="cb11-76"><a href="#cb11-76" aria-hidden="true" tabindex="-1"></a>    forecast_df <span class="op">=</span> pd.DataFrame(forecast, index<span class="op">=</span>[<span class="ss">f"Day </span><span class="sc">{</span>i<span class="sc">}</span><span class="ss">"</span> <span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(<span class="dv">56</span>, <span class="dv">61</span>)], columns<span class="op">=</span>g1StockData.columns)</span>
<span id="cb11-77"><a href="#cb11-77" aria-hidden="true" tabindex="-1"></a>    g1StockData.index <span class="op">=</span> [<span class="ss">f"Day </span><span class="sc">{</span>i<span class="sc">}</span><span class="ss">"</span> <span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(<span class="dv">1</span>, <span class="dv">61</span>)]</span>
<span id="cb11-78"><a href="#cb11-78" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-79"><a href="#cb11-79" aria-hidden="true" tabindex="-1"></a>    <span class="cf">for</span> c <span class="kw">in</span> forecast_df.columns:</span>
<span id="cb11-80"><a href="#cb11-80" aria-hidden="true" tabindex="-1"></a>        g1StockData[c] <span class="op">=</span> g1StockData[c].<span class="bu">apply</span>(<span class="kw">lambda</span> x: np.exp(x) <span class="cf">if</span> x <span class="op">!=</span> <span class="dv">0</span> <span class="cf">else</span> <span class="dv">0</span>)</span>
<span id="cb11-81"><a href="#cb11-81" aria-hidden="true" tabindex="-1"></a>        forecast_df[c] <span class="op">=</span> forecast_df[c].<span class="bu">apply</span>(<span class="kw">lambda</span> x: np.exp(x) <span class="cf">if</span> x <span class="op">!=</span> <span class="dv">0</span> <span class="cf">else</span> <span class="dv">0</span>)</span>
<span id="cb11-82"><a href="#cb11-82" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-83"><a href="#cb11-83" aria-hidden="true" tabindex="-1"></a>    plt.figure(figsize<span class="op">=</span>(<span class="dv">9</span>, <span class="fl">4.8</span>))</span>
<span id="cb11-84"><a href="#cb11-84" aria-hidden="true" tabindex="-1"></a>    <span class="cf">for</span> i, c <span class="kw">in</span> <span class="bu">enumerate</span>(g1StockData.columns):</span>
<span id="cb11-85"><a href="#cb11-85" aria-hidden="true" tabindex="-1"></a>        plt.plot(g1StockData.iloc[:<span class="op">-</span><span class="dv">5</span>].index, g1StockData[c].iloc[:<span class="op">-</span><span class="dv">5</span>], label<span class="op">=</span>c)</span>
<span id="cb11-86"><a href="#cb11-86" aria-hidden="true" tabindex="-1"></a>    <span class="cf">for</span> i, c <span class="kw">in</span> <span class="bu">enumerate</span>(g1StockData.columns):</span>
<span id="cb11-87"><a href="#cb11-87" aria-hidden="true" tabindex="-1"></a>        plt.plot(forecast_df.index, <span class="bu">list</span>(forecast_df.iloc[:, i]), color<span class="op">=</span>plt.gca().lines[i].get_color(), label<span class="op">=</span><span class="va">None</span>)</span>
<span id="cb11-88"><a href="#cb11-88" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-89"><a href="#cb11-89" aria-hidden="true" tabindex="-1"></a>    plt.legend(loc<span class="op">=</span><span class="st">'upper left'</span>)</span>
<span id="cb11-90"><a href="#cb11-90" aria-hidden="true" tabindex="-1"></a>    plt.xticks(rotation<span class="op">=</span><span class="dv">90</span>)</span>
<span id="cb11-91"><a href="#cb11-91" aria-hidden="true" tabindex="-1"></a>    plt.show()</span>
<span id="cb11-92"><a href="#cb11-92" aria-hidden="true" tabindex="-1"></a>    g1StockData.replace([<span class="dv">0</span>], <span class="fl">0.001</span>, inplace<span class="op">=</span><span class="va">True</span>)</span>
<span id="cb11-93"><a href="#cb11-93" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-94"><a href="#cb11-94" aria-hidden="true" tabindex="-1"></a>    correct <span class="op">=</span> g1StockData.iloc[<span class="op">-</span><span class="dv">1</span>]</span>
<span id="cb11-95"><a href="#cb11-95" aria-hidden="true" tabindex="-1"></a>    pred <span class="op">=</span> forecast_df.iloc[<span class="op">-</span><span class="dv">1</span>]</span>
<span id="cb11-96"><a href="#cb11-96" aria-hidden="true" tabindex="-1"></a>    prev <span class="op">=</span> g1StockData.iloc[<span class="op">-</span><span class="dv">2</span>]</span>
<span id="cb11-97"><a href="#cb11-97" aria-hidden="true" tabindex="-1"></a>    error <span class="op">=</span> <span class="bu">abs</span>(correct <span class="op">-</span> pred)</span>
<span id="cb11-98"><a href="#cb11-98" aria-hidden="true" tabindex="-1"></a>    errorPct <span class="op">=</span> error <span class="op">/</span> correct <span class="op">*</span> <span class="dv">100</span></span>
<span id="cb11-99"><a href="#cb11-99" aria-hidden="true" tabindex="-1"></a>    results <span class="op">=</span> pd.DataFrame({<span class="st">"correct"</span>:correct, <span class="st">"pred"</span>:pred, <span class="st">"prev"</span>:prev, <span class="st">"errorPct"</span>:errorPct, <span class="st">"error"</span>:error})</span>
<span id="cb11-100"><a href="#cb11-100" aria-hidden="true" tabindex="-1"></a>    results.replace([np.inf, <span class="op">-</span>np.inf], <span class="dv">100</span>, inplace<span class="op">=</span><span class="va">True</span>)</span>
<span id="cb11-101"><a href="#cb11-101" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(results)</span>
<span id="cb11-102"><a href="#cb11-102" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-103"><a href="#cb11-103" aria-hidden="true" tabindex="-1"></a>    <span class="co"># Calculate MAE</span></span>
<span id="cb11-104"><a href="#cb11-104" aria-hidden="true" tabindex="-1"></a>    mae <span class="op">=</span> mean_absolute_error(results[<span class="st">'correct'</span>], results[<span class="st">'pred'</span>])</span>
<span id="cb11-105"><a href="#cb11-105" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(<span class="st">'Mean Absolute Error:'</span>, mae)</span>
<span id="cb11-106"><a href="#cb11-106" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-107"><a href="#cb11-107" aria-hidden="true" tabindex="-1"></a>    <span class="co"># Calculate MSE</span></span>
<span id="cb11-108"><a href="#cb11-108" aria-hidden="true" tabindex="-1"></a>    mse <span class="op">=</span> mean_squared_error(results[<span class="st">'correct'</span>], results[<span class="st">'pred'</span>])</span>
<span id="cb11-109"><a href="#cb11-109" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(<span class="st">'Mean Squared Error:'</span>, mse)</span>
<span id="cb11-110"><a href="#cb11-110" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-111"><a href="#cb11-111" aria-hidden="true" tabindex="-1"></a>    count <span class="op">=</span> <span class="dv">0</span></span>
<span id="cb11-113"><a href="#cb11-113" aria-hidden="true" tabindex="-1"></a>    <span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(<span class="bu">len</span>(correct)):</span>
<span id="cb11-114"><a href="#cb11-114" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> correct.iloc[i] <span class="op">&lt;</span> prev.iloc[i] <span class="kw">and</span> pred.iloc[i] <span class="op">&lt;</span> prev.iloc[i]:</span>
<span id="cb11-115"><a href="#cb11-115" aria-hidden="true" tabindex="-1"></a>            count <span class="op">+=</span> <span class="dv">1</span></span>
<span id="cb11-116"><a href="#cb11-116" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> correct.iloc[i] <span class="op">&gt;</span> prev.iloc[i] <span class="kw">and</span> pred.iloc[i] <span class="op">&gt;</span> prev.iloc[i]:</span>
<span id="cb11-117"><a href="#cb11-117" aria-hidden="true" tabindex="-1"></a>            count <span class="op">+=</span> <span class="dv">1</span></span>
<span id="cb11-119"><a href="#cb11-119" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(<span class="st">"Percent Directionally Correct:"</span>, count <span class="op">/</span> <span class="bu">len</span>(correct), <span class="ss">f"(</span><span class="sc">{</span>count<span class="sc">}</span><span class="ss">/</span><span class="sc">{</span><span class="bu">len</span>(correct)<span class="sc">}</span><span class="ss">)"</span>)</span>
<span id="cb11-120"><a href="#cb11-120" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(<span class="st">"MAE of Percent Error:"</span>, <span class="bu">sum</span>(results[<span class="st">"errorPct"</span>]) <span class="op">/</span> <span class="bu">len</span>(results))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
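<p>The hold-out forecast in the cell above can be sketched without <code>statsmodels</code>. The following is a minimal illustration, not the notebook's code: it fits a VAR(1) by ordinary least squares on synthetic data, holds out the last 5 rows, and seeds the forecast with the last <em>training</em> row so that no held-out value enters the prediction. The names <code>fit_var1</code> and <code>forecast_var1</code>, and the synthetic series itself, are assumptions made for this example.</p>

```python
import numpy as np


def fit_var1(train):
    """Least-squares fit of y_t = c + A @ y_{t-1} + e_t."""
    # Regressors: a constant column plus the previous observation.
    X = np.hstack([np.ones((len(train) - 1, 1)), train[:-1]])
    Y = train[1:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    c = coef[0]       # intercept vector, shape (k,)
    A = coef[1:].T    # lag-1 coefficient matrix, shape (k, k)
    return c, A


def forecast_var1(c, A, last_obs, steps):
    """Iterate the fitted recursion forward from the last known row."""
    y = last_obs
    path = []
    for _ in range(steps):
        y = c + A @ y
        path.append(y)
    return np.array(path)


# Synthetic two-series example (hypothetical data, 60 "days").
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.1], [0.0, 0.3]])
c_true = np.array([1.0, 2.0])
y = np.zeros((60, 2))
for t in range(1, 60):
    y[t] = c_true + A_true @ y[t - 1] + rng.normal(scale=0.05, size=2)

HOLDOUT = 5
train, test = y[:-HOLDOUT], y[-HOLDOUT:]

c_hat, A_hat = fit_var1(train)
# Seed with the last training row, not the last row of the full series.
forecast = forecast_var1(c_hat, A_hat, train[-1], HOLDOUT)

print("max abs forecast error:", np.abs(forecast - test).max())
```

<p>With <code>statsmodels</code>, the equivalent choice would be to pass the last <code>k_ar</code> rows of the training slice to <code>forecast</code>.</p>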
<div class="cell-output cell-output-stdout">
<pre><code>INDUSTRY: Basic Materials
PGHL
Test Statistic:  ADF: -1.26919   KPSS: 0.11052
P-Value:         ADF: 0.89534    KPSS: 0.1
DurbinWatson: 1.94677
Granger Causality Tests

Insufficient number of symbols after preprocessing


INDUSTRY: Blank Check
CEP
Test Statistic:  ADF: -4.87106   KPSS: 0.1238
P-Value:         ADF: 0.00035    KPSS: 0.0911
DurbinWatson: 2.60440
Granger Causality Tests
PCSC 0.018398930051991544

PCSC
Test Statistic:  ADF: -4.62428   KPSS: 0.1262
P-Value:         ADF: 0.00094    KPSS: 0.08668
DurbinWatson: 2.38954
Granger Causality Tests
CEP 0.7253002930981398
</code></pre>
</div>
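<p>The ADF and KPSS p-values printed above are easier to read together once the two null hypotheses are kept apart: ADF's null is a unit root, while KPSS's null is stationarity. The sketch below is one way to combine them, assuming the <code>adf_test</code> and <code>kpss_test</code> helpers return standard p-values; the function name <code>stationarity_verdict</code> is hypothetical. The notebook's cell drops a symbol only in the "non-stationary" case (<code>p &gt; 0.05 and pK &lt; 0.05</code>).</p>

```python
def stationarity_verdict(adf_p, kpss_p, alpha=0.05):
    """Combine ADF and KPSS p-values into a single label."""
    adf_rejects_unit_root = adf_p < alpha         # evidence for stationarity
    kpss_rejects_stationarity = kpss_p < alpha    # evidence against stationarity

    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "non-stationary"   # the only case the cell above drops
    if adf_rejects_unit_root and kpss_rejects_stationarity:
        return "conflicting"      # both tests reject their null
    return "inconclusive"         # neither test rejects its null


# p-values copied from the output above
for symbol, adf_p, kpss_p in [
    ("PGHL", 0.89534, 0.1),
    ("CEP", 0.00035, 0.0911),
    ("PCSC", 0.00094, 0.08668),
]:
    print(symbol, stationarity_verdict(adf_p, kpss_p))
```

<p>Under this reading, PGHL is inconclusive rather than non-stationary, so it is kept; Basic Materials is skipped only because one symbol is too few to fit a VAR.</p>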
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="index_files/figure-html/cell-9-output-2.png" width="738" height="433" class="figure-img"></p>
</figure>
</div>
</div>
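<p>The prices plotted above have been through the cell's log transform and its inverse, both of which special-case zero. One edge case of that guard is worth seeing in isolation: a price of exactly 1.0 has a log of 0, so the inverse maps it to 0 rather than back to 1. The sketch below reproduces the two lambdas with <code>math</code> and shows a variant that uses NaN as the marker instead, assuming zeros stand for missing prices (an assumption; the source does not say). The function names are hypothetical.</p>

```python
import math


def to_log(x):
    # Same rule as the cell above: log of non-zero values, 0 otherwise.
    return math.log(x) if x != 0 else 0


def from_log(x):
    # Same rule as the cell above: exp of non-zero values, 0 otherwise.
    return math.exp(x) if x != 0 else 0


def to_log_nan(x):
    # Variant: mark non-positive prices as NaN so that 0 stays a valid log value.
    return math.log(x) if x > 0 else math.nan


def from_log_nan(x):
    # exp(NaN) is NaN, so the missing marker survives the round trip.
    return math.exp(x)


prices = [10.07, 1.0, 0.0]
round_trip = [from_log(to_log(p)) for p in prices]
round_trip_nan = [from_log_nan(to_log_nan(p)) for p in prices]

print(round_trip)      # the 1.0 entry comes back as 0
print(round_trip_nan)  # the 1.0 entry comes back as 1.0, the 0.0 entry as nan
```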
<div class="cell-output cell-output-stdout">
<pre><code>      correct       pred   prev  errorPct     error
CEP     10.07  10.042405  10.07  0.274024  0.027594
PCSC    10.05  10.067043  10.05  0.169584  0.017043
Mean Absolute Error: 0.02231873080203428
Mean Squared Error: 0.0005259568404847629
Percent Directionally Correct: 0.0 (0/2)
MAE of Percent Error: 0.22180426492823258


INDUSTRY: Consumer Goods
RAY
Test Statistic:  ADF: -2.28919   KPSS: 0.11491
P-Value:         ADF: 0.43994    KPSS: 0.1
DurbinWatson: 1.91186
Granger Causality Tests
SOWG 0.004003475237870572
TWG 0.020365358445125735
SMXT 0.013377846185227816

SOWG
Test Statistic:  ADF: -3.55781   KPSS: 0.15682
P-Value:         ADF: 0.0336     KPSS: 0.04098
DurbinWatson: 2.43214
Granger Causality Tests
RAY 0.4316359253342723
TWG 0.04146568996712947
SMXT 0.16901595863786065

TWG
Test Statistic:  ADF: -2.82923   KPSS: 0.13148
P-Value:         ADF: 0.18642    KPSS: 0.07689
DurbinWatson: 2.11375
Granger Causality Tests
RAY 0.06546057972842291
SOWG 0.39078103822872623
SMXT 0.11521212541109388

SMXT
Test Statistic:  ADF: -2.55973   KPSS: 0.1097
P-Value:         ADF: 0.29877    KPSS: 0.1
DurbinWatson: 1.67778
Granger Causality Tests
RAY 0.2682625693186744
SOWG 0.12048535035184825
TWG 0.08016804138302787
</code></pre>
</div>
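<p>The summary metrics printed after each plot can be reproduced by hand. The sketch below recomputes them for the Blank Check table above from its rounded values, using the same strict inequalities as the cell; the function name <code>evaluate</code> and its return keys are hypothetical. It also shows why the directional score is 0/2 there: for both symbols <code>correct</code> equals <code>prev</code>, and an unchanged price satisfies neither <code>&lt;</code> nor <code>&gt;</code>.</p>

```python
def evaluate(correct, pred, prev):
    """Recompute the cell's error metrics from three equal-length lists."""
    n = len(correct)
    abs_err = [abs(c - p) for c, p in zip(correct, pred)]
    pct_err = [e / c * 100 for e, c in zip(abs_err, correct)]

    # A hit requires the actual and predicted moves to have the same strict sign
    # relative to the previous day; an unchanged actual price is never a hit.
    hits = sum(
        1
        for c, p, v in zip(correct, pred, prev)
        if (c < v and p < v) or (c > v and p > v)
    )
    return {
        "mae": sum(abs_err) / n,
        "mean_pct_error": sum(pct_err) / n,
        "directional_hits": hits,
        "n": n,
    }


# Rows CEP and PCSC from the Blank Check table above
blank_check = evaluate(
    correct=[10.07, 10.05],
    pred=[10.042405, 10.067043],
    prev=[10.07, 10.05],
)
print(blank_check)
```

<p>The recomputed MAE and mean percent error agree with the printed values up to the rounding of the table entries.</p>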
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="index_files/figure-html/cell-9-output-4.png" width="717" height="433" class="figure-img"></p>
</figure>
</div>
</div>
<div class="cell-output cell-output-stdout">
<pre><code>      correct       pred    prev   errorPct     error
RAY      2.50   3.431852   2.810  37.274091  0.931852
SOWG    10.05   8.411556   9.500  16.302924  1.638444
TWG      0.78   0.780586   0.834   0.075153  0.000586
SMXT    13.07  13.201538  12.570   1.006417  0.131539
Mean Absolute Error: 0.6756052670864232
Mean Squared Error: 0.8925374329244472
Percent Directionally Correct: 0.5 (2/4)
MAE of Percent Error: 13.664646286079725


INDUSTRY: Consumer Services
AZI
Test Statistic:  ADF: -2.29635   KPSS: 0.16328
P-Value:         ADF: 0.43597    KPSS: 0.0356
DurbinWatson: 1.12856
Granger Causality Tests
NIPG 0.09937659474591048
QMMM 0.3352194983661654
JDZG 0.03504951472137446
SERV 0.4677169406154553
JUNE 0.1005614834228527
MMA 0.22687343611730698
INTJ 0.23235007759795726
HAO 0.4710785057521988

NIPG
Test Statistic:  ADF: -6.15742   KPSS: 0.15337
P-Value:         ADF: 0.0    KPSS: 0.04386
DurbinWatson: 1.84246
Granger Causality Tests
QMMM 0.3855373175146004
JDZG 0.00013485828216075612
SERV 1.041218800971101e-05
JUNE 0.2263764759132391
MMA 0.010184965434085516
INTJ 0.001257680146827705
HAO 1.8885212438398537e-05

QMMM
Test Statistic:  ADF: -2.91843   KPSS: 0.06643
P-Value:         ADF: 0.15633    KPSS: 0.1
DurbinWatson: 2.21855
Granger Causality Tests
NIPG 0.06401927026276631
JDZG 0.41884232209928685
SERV 2.8107159652094402e-14
JUNE 0.06237769669556758