@@ -63,19 +63,19 @@ The dataset `Beijing` is the Beijing Multi-Site Air-Quality Data Set. It consist
 This dataset only contains numerical variables.
 
 ``` python
-# df_data = data.get_data_corrupted("Beijing", ratio_masked=.2, mean_size=120)
+df_data = data.get_data_corrupted("Beijing", ratio_masked=.2, mean_size=120)
 
 # cols_to_impute = ["TEMP", "PRES", "DEWP", "NO2", "CO", "O3", "WSPM"]
 # cols_to_impute = df_data.columns[df_data.isna().any()]
-# cols_to_impute = ["TEMP", "PRES"]
+cols_to_impute = ["TEMP", "PRES"]
 
 ```
 
 The dataset `Artificial` is designed to have a sum of a periodical signal, a white noise and some outliers.
 
 ``` python
-df_data = data.get_data_corrupted("Artificial", ratio_masked=.2, mean_size=10)
-cols_to_impute = ["signal"]
+# df_data = data.get_data_corrupted("Artificial", ratio_masked=.2, mean_size=10)
+# cols_to_impute = ["signal"]
 ```
 
 Let's take a look at variables to impute. We only consider a station, Aotizhongxin.
@@ -123,7 +123,7 @@ imputer_spline = imputers.ImputerInterpolation(groups=["station"], method="splin
 imputer_shuffle = imputers.ImputerShuffle(groups=["station"])
 imputer_residuals = imputers.ImputerResiduals(groups=["station"], period=7, model_tsa="additive", extrapolate_trend="freq", method_interpolation="linear")
 
-imputer_rpca = imputers.ImputerRPCA(groups=["station"], columnwise=True, period=100, max_iter=100, tau=2, lam=.3)
+imputer_rpca = imputers.ImputerRPCA(groups=["station"], columnwise=True, period=365, max_iter=200, tau=2, lam=.3)
 imputer_rpca_opti = imputers.ImputerRPCA(groups=["station"], columnwise=True, period=365, max_iter=100)
 
 imputer_ou = imputers.ImputeEM(groups=["station"], method="multinormal", max_iter_em=34, n_iter_ou=15, strategy="ou")
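
The first hunk describes the `Artificial` dataset as the sum of a periodic signal, white noise and some outliers. For readers who want to picture that structure, here is a minimal standard-library sketch of such a series. It is not the library's generator: the helper name `make_artificial_signal` and all of its parameters (`period`, `noise_std`, `n_outliers`, `outlier_size`) are hypothetical and chosen only for illustration.

```python
import math
import random


def make_artificial_signal(n=200, period=20, noise_std=0.1,
                           n_outliers=5, outlier_size=5.0, seed=0):
    """Hypothetical helper: periodic signal + white noise + a few outliers."""
    rng = random.Random(seed)
    # Periodic component: a sine wave that repeats every `period` steps.
    periodic = [math.sin(2 * math.pi * t / period) for t in range(n)]
    # White noise: independent zero-mean Gaussian draws.
    noise = [rng.gauss(0.0, noise_std) for _ in range(n)]
    # Outliers: large spikes of random sign at distinct random positions.
    outliers = [0.0] * n
    for t in rng.sample(range(n), n_outliers):
        outliers[t] = rng.choice([-1.0, 1.0]) * outlier_size
    # The observed series is the element-wise sum of the three components.
    signal = [p + e + o for p, e, o in zip(periodic, noise, outliers)]
    return signal, periodic, outliers


signal, periodic, outliers = make_artificial_signal()
```

Keeping the three components separate makes it possible to compare an imputer's output against the clean periodic part rather than only against the noisy observations.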