jiachzou/selective_multiple_testing

Selective Multiple Testing

Authors: Markus Pelger (mpelger@stanford.edu), Jiacheng Zou (jiachengzou@alumni.stanford.edu)

Context

We provide a one-stop collection of resources for covariate selective inference with Family-Wise Error Rate (FWER) control on large panel asset pricing data and models, as described in Selective Multiple Testing: Inference for Large Panels with Many Covariates. Specifically, the code lets users perform the rolling-window estimations described in $\S$ 7.3 of the paper. A constant-window model or an alternative window size can be obtained by changing the window parameter or the estimation step in the code.
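As a rough illustration of what a rolling-window scheme over the 660 monthly observations looks like, here is a minimal sketch. The helper `rolling_windows` and the window length of 240 months are hypothetical choices made for this example only; they are not taken from the repository or the paper.

```python
def rolling_windows(n_obs, window, step=1):
    """Yield (start, stop) index pairs of consecutive estimation windows."""
    for start in range(0, n_obs - window + 1, step):
        yield start, start + window

# 660 monthly observations; the 240-month window is an assumed value.
windows = list(rolling_windows(n_obs=660, window=240, step=1))
first_start, first_stop = windows[0]
last_start, last_stop = windows[-1]
```

Each `(start, stop)` pair can be used to slice the rows of the response and covariate matrices before running the unit-level regressions on that window.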

Data:

  1. Response variables: $Y \in \mathbb{R}^{660 \times 243}$, excess returns of test portfolios downloaded and processed from Kenneth French's website. The cross-section consists of 243 test portfolios, and there are 660 monthly observations. We regress out the market factor from each of the individual factors.
  2. Covariates: $X \in \mathbb{R}^{660 \times 114}$, asset pricing high-minus-low factors downloaded and processed from Hou, K., Xue, C. and Zhang, L., 2020. Replicating anomalies. The Review of Financial Studies, 33(5), pp.2019-2133. There are 114 covariates and 660 monthly observations. We regress out the market factor from each of the individual factors.
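The "regress out the market factor" step can be sketched as an ordinary least squares projection applied column by column. This is an illustration on synthetic data, not the authors' preprocessing script: the function name `regress_out`, the inclusion of an intercept, and the simulated inputs are all assumptions.

```python
import numpy as np

def regress_out(panel, market):
    """Return OLS residuals of each column of `panel` on a constant and `market`."""
    n_obs = panel.shape[0]
    # Design matrix with an intercept (an assumption) and the market factor.
    design = np.column_stack([np.ones(n_obs), market])
    coef, *_ = np.linalg.lstsq(design, panel, rcond=None)
    return panel - design @ coef

# Synthetic stand-ins with the same shape as the covariate panel.
rng = np.random.default_rng(0)
market = rng.normal(size=660)
panel = 0.5 * market[:, None] + rng.normal(size=(660, 114))
residuals = regress_out(panel, market)
```

By construction, the residual columns are orthogonal to the market factor in sample, so the remaining variation is what the market does not explain.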

R code:

empirics:
simulations:
  • simulation.R is a self-contained R script that

Python code on selection:

  • funs.py provides a minimal stand-alone function that requires only pandas and numpy to perform our Selective Multiple Testing selection method on a matrix of p-values, controlling the Family-Wise Error Rate (FWER).

Python demo

When there are $N$ units and $J$ features, the evidence from the unit-level regressions can be stored in a matrix:

  • a $J \times N$ matrix $P$ of log p-values;
  • whenever $P_{jn}$ is missing, the $j$th feature is not in the support set of the $n$th unit-level model.
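One way to assemble such a matrix is sketched below. The unit names, feature names, and p-values are made up; the use of natural logarithms and of a pandas DataFrame as the container are assumptions, since the text above does not specify the log base or the input type.

```python
import numpy as np
import pandas as pd

# Hypothetical unit-level results: each unit maps the features in its
# support set to their p-values; features outside the support are absent.
unit_pvals = {
    "unit_A": {"f1": 0.001, "f3": 0.20},
    "unit_B": {"f1": 0.03, "f2": 0.0004},
    "unit_C": {"f2": 0.50},
}

# Rows are features (J), columns are units (N); absent entries become NaN.
log_pval_matrix = np.log(pd.DataFrame(unit_pvals)).sort_index()
J, N = log_pval_matrix.shape
```

Here a NaN in row $j$ and column $n$ encodes that feature $j$ is not in the support set of unit $n$.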

To run the code, we can select features subject to an FWER target of $\alpha$:

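import numpy as np
import pandas as pd
from funs import panel_unordered  # assumes panel_unordered is defined in funs.py

# log_pval_matrix: the J x N matrix of log p-values described above (features x units)
J, N = log_pval_matrix.shape
alpha_vec = [0.00001, 0.01, 0.05]  # the FWER thresholds you want to try
pmt_rejection_table = panel_unordered(log_pval_matrix)
rho = pmt_rejection_table['rho'].unique()[0]  # the panel cohesiveness coefficient
for alpha in alpha_vec:
    # features selected by Selective Multiple Testing at FWER level alpha
    selected_panel_multiple_testing = np.sort(pmt_rejection_table.index[pmt_rejection_table['rho_inv.N.p_1'] <= alpha]).tolist()
    # features selected by the Bonferroni benchmark, with threshold alpha / (J * N)
    selected_Bonferroni_multiple_testing = np.sort(pmt_rejection_table.index[pmt_rejection_table['p_1'] <= alpha / (J * N)]).tolist()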
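To make the Bonferroni benchmark in the last line concrete without depending on funs.py, the sketch below recomputes it directly from a log p-value matrix. It assumes that `p_1` denotes the smallest p-value in a feature's row and that the log p-values are natural logs; both are guesses about the table returned by `panel_unordered`, and `bonferroni_select` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

def bonferroni_select(log_pvals, alpha):
    """Select features whose smallest p-value is at most alpha / (J * N)."""
    J, N = log_pvals.shape
    # Smallest p-value per feature, ignoring missing (NaN) entries.
    min_p = np.exp(log_pvals).min(axis=1, skipna=True)
    return sorted(min_p.index[min_p <= alpha / (J * N)].tolist())

# Made-up 3 x 3 example; NaN marks features outside a unit's support.
pvals = pd.DataFrame(
    {"unit_A": [0.001, np.nan, 0.20],
     "unit_B": [0.03, 0.0004, np.nan],
     "unit_C": [np.nan, 0.50, np.nan]},
    index=["f1", "f2", "f3"],
)
selected = bonferroni_select(np.log(pvals), alpha=0.05)
```

With $J \cdot N = 9$ the threshold is $0.05 / 9 \approx 0.0056$, so `f1` and `f2` pass while `f3` does not.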

Additional resources

For a method-focused code base, we provide a Python version of Selective Multiple Testing in the GitHub repository for our accompanying paper Large Dimensional Change Point Detection with FWER Control as Automatic Stopping.

Usage

To cite this code, in addition to the data sources, please use the following citation:

@article{pelger2022selective,
  title={Selective Multiple Testing: Inference for Large Panels with Many Covariates},
  author={Pelger, Markus and Zou, Jiacheng},
  journal={Available at SSRN 4315891},
  year={2022}
}

About

Accompanying code for Pelger, M. and Zou, J., 2022. Selective Multiple Testing: Inference for Large Panels with Many Covariates. Available at SSRN 4315891.
