Code to implement analyses of PCA correction and other corrections in microbiome data

Follow the code in this repository in order to reproduce the analyses in the preprint on bioRxiv.

Needed packages

packs = c("compositions")
install.packages(packs)

If also testing edgeR and DESeq2:

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install("edgeR")
BiocManager::install("DESeq2")

Quick summary of steps

Needed input data:

metadata.txt that contains the information regarding phenotype and other experimental variables (if performing correlation analyses)
feature_table_rel.rds or feature_table_rel.txt

These files should be stored in a directory of your choice. The path to this directory is hereafter referred to as <your_directory>.
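As a rough illustration of the two accepted feature-table formats, the sketch below writes a toy table to a temporary directory and reads it back according to a format argument. The helper name `read_feature_table`, the tab-delimited layout of the txt file, and the features-by-samples orientation are assumptions made for this example; the repository's scripts may read the files differently.

```r
# Hypothetical helper (not part of the repository): load
# feature_table_rel.<fmt> from a directory.
read_feature_table <- function(dir, fmt = c("rds", "txt")) {
  fmt <- match.arg(fmt)
  path <- file.path(dir, paste0("feature_table_rel.", fmt))
  if (fmt == "rds") {
    readRDS(path)
  } else {
    # Assumption: tab-delimited text with feature names in the first column.
    as.matrix(read.table(path, header = TRUE, sep = "\t", row.names = 1,
                         check.names = FALSE))
  }
}

# Toy relative-abundance table: 3 features (rows) x 2 samples (columns).
toy <- matrix(c(0.5, 0.3, 0.2, 0.1, 0.6, 0.3), nrow = 3,
              dimnames = list(c("taxonA", "taxonB", "taxonC"), c("s1", "s2")))

# Write both formats into a temporary stand-in for <your_directory>.
tmp_dir <- tempfile("microbatch_demo")
dir.create(tmp_dir)
saveRDS(toy, file.path(tmp_dir, "feature_table_rel.rds"))
write.table(toy, file.path(tmp_dir, "feature_table_rel.txt"),
            sep = "\t", quote = FALSE, col.names = NA)

from_rds <- read_feature_table(tmp_dir, "rds")
from_txt <- read_feature_table(tmp_dir, "txt")
```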

Step 1: CLR Transformation

Input: feature table | Output: CLR-transformed data in both rds and txt formats

Rscript Correction/transformations.R <your_directory> <input format of feature table: txt or rds>

Rscript Correction/transformations.R ~/Documents/MicroBatch/microbatch_vc/data/Gibbonsr_max_k5 rds
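For readers unfamiliar with the centered log-ratio (CLR) transformation, the following is a minimal base-R sketch of the idea, not the code in transformations.R. The function name `clr_transform`, the features-by-samples orientation, and the choice of pseudocount (half the smallest non-zero value) are assumptions made for illustration.

```r
# Minimal CLR sketch in base R (not the repository's implementation).
# Assumption: rows are features, columns are samples.
clr_transform <- function(rel_abund) {
  # Assumption: zeros are replaced by half the smallest non-zero value.
  pseudocount <- min(rel_abund[rel_abund > 0]) / 2
  rel_abund[rel_abund == 0] <- pseudocount
  log_abund <- log(rel_abund)
  # Subtract each sample's mean log abundance (the log of its geometric mean).
  sweep(log_abund, 2, colMeans(log_abund), "-")
}

# Toy relative-abundance table with one zero entry.
toy <- matrix(c(0.5, 0.3, 0.2, 0.0, 0.7, 0.3), nrow = 3,
              dimnames = list(c("taxonA", "taxonB", "taxonC"), c("s1", "s2")))
clr_toy <- clr_transform(toy)
```

After the transformation, the values within each sample sum to zero, which is a quick sanity check on any CLR output.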

Step 2: Calculate PCs (if doing PCA correction)

Input: feature table | Output: eigenvalues and pca scores

Rscript Correction/calc_pcs.R <your_directory> <input format of feature table: txt or rds> <data transformation (relative abundance or CLR): rel or rel_clr>

Rscript Correction/calc_pcs.R ~/Documents/MicroBatch/microbatch_vc/data/Gibbonsr_max_k5 rds rel_clr
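The two outputs of this step (eigenvalues and PCA scores) can be illustrated with base R's `prcomp`. This is only a sketch on simulated data: the number of components kept (`n_pcs`) and the samples-by-features orientation are assumptions, and calc_pcs.R may compute the components differently.

```r
set.seed(1)
# Toy CLR-like matrix: 20 samples (rows) x 6 features (columns).
x <- matrix(rnorm(20 * 6), nrow = 20)
n_pcs <- 3  # hypothetical number of components to keep

pca <- prcomp(x, center = TRUE, scale. = FALSE)
eigenvalues <- pca$sdev[seq_len(n_pcs)]^2  # variances of the top components
pca_scores <- pca$x[, seq_len(n_pcs)]      # samples x components

print(eigenvalues)
print(dim(pca_scores))
```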

Step 3: Correction

Input: feature table & metadata (for limma, percentilenorm, combat, etc.), & PCA scores (for PCA correction) | Output: eigenvalues and pca scores

*Make sure metadata categories are already cleaned up into distinct groups if you're going to use percentilenorm.

Rscript Correction/correction.R ~/Documents/MicroBatch/microbatch_vc/data/Gibbonsr_max_k5 rds rel limma study bin_crc_normal

Rscript Correction/correction.R ~/Documents/MicroBatch/microbatch_vc/data/Gibbonsr_max_k5 rds logcpm limma study bin_crc_normal
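For the PCA correction option, one common way to use the scores from Step 2 is to regress the top components out of every feature and keep the residuals. The sketch below illustrates that idea on simulated data. The function name `regress_out_pcs` and the choice of two components are hypothetical, and correction.R may implement the correction differently.

```r
# Hypothetical helper: remove variation explained by PC scores.
regress_out_pcs <- function(x, scores) {
  # x: samples x features; scores: samples x k PC scores.
  design <- cbind(intercept = 1, scores)
  fit <- lm.fit(design, x)
  # Add the feature means back so only PC-associated variation is removed.
  sweep(fit$residuals, 2, colMeans(x), "+")
}

set.seed(1)
# Toy data: 30 samples (rows) x 5 features (columns).
x <- matrix(rnorm(30 * 5), nrow = 30)
scores <- prcomp(x, center = TRUE)$x[, 1:2]  # top 2 PCs (arbitrary choice)
corrected <- regress_out_pcs(x, scores)
```

The corrected features keep their original means but are uncorrelated with the components that were regressed out.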

Analysis Opt 1: Correlation analyses

Correlation > pc_correlations_plot.R
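As a rough picture of what a PC-versus-metadata correlation analysis looks like, the sketch below correlates simulated PC scores with simulated metadata columns. The variable names (`batch`, `age`), the simulated data, and the use of Spearman correlation are assumptions made for illustration and are not taken from pc_correlations_plot.R.

```r
set.seed(1)
n <- 40
# Hypothetical metadata: a binary batch label and a continuous covariate.
metadata <- data.frame(batch = rep(c(0, 1), each = n / 2),
                       age = rnorm(n, mean = 50, sd = 10))
# Toy PC scores: PC1 tracks the batch label, PC2 is pure noise.
pca_scores <- cbind(PC1 = metadata$batch + rnorm(n, sd = 0.2),
                    PC2 = rnorm(n))

# Rows are PCs, columns are metadata variables.
pc_metadata_cor <- cor(pca_scores, metadata, method = "spearman")
print(round(pc_metadata_cor, 2))
```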

Analysis Opt 2: Titration analyses

Titration > titratoins.ipynb

Analysis Opt 3: Variance partitioning

VariancePartitioning > variance_partitioning.R

VariancePartitioning > variance_partitioning_plotting.R

Analysis Opt 4: Prediction

For binary phenotype:

RF_Classifier > classifier.py

For continuous phenotype:

LinRegression > regression.py

## Breakdown of steps and expected output

Step 1: CLR Transformation

Correction > transformations.R

Rscript transformations.R ~/Documents/MicroBatch/microbatch_vc/data/Gibbonsr_max_k5 rds

Example output:

[1] "Starting CLR transformation of data"
[1] "Adding pseudocount to zeroes"
[1] 6.748107e-06
[1] "Number of zero values in data"
[1] 0
[1] "Proportion of values in data that are zero"
[1] 0
[1] "Transformation done and exported to /Users/leahbriscoe/Documents/MicroBatch/microbatch_vc/data/Gibbonsr_max_k5"

Step 2: Calculate PCs (if doing PCA correction)

Correction > calc_pcs.R

Example output:

Loading required package: bigstatsr
Time difference of 1.820936 secs
[1] "eigen values"
 [1] 1.58260062 0.21910492 0.12427020 0.09583746 0.08502080 0.07795310
 [7] 0.06726517 0.06550892 0.06077923 0.05249138 0.05210248 0.04988210
[13] 0.04238802 0.03892835 0.03716319
[1] "dim pca scores"
[1] 723  15
