Update README.md

qizhijie · web-flow · commit cb2344937508 · 2020-04-28T22:57:25.000-07:00
diff --git a/README.md b/README.md
@@ -16,9 +16,9 @@ The schematic diagram below describes the various stages of the PROPERseqTools p
 4. The pre-processed read pairs are mapped to transcriptome with BWA separately. ‘-a’ option is enabled to keep all found alignments using default threshold of BWA. This is used in the later filtering of potential homologous read pairs. 
 5. The mapped read pairs are then deduplicated based on the external coordinates of their primary alignments.
 6. The deduplicated read pairs are output as mapped read pairs. Their transcriptome alignment information is stored in both `.bed` and `.bam` files.
-7. The transcriptome alignment information of mapped read pairs is utilized to check if two ends’ primary alignments are mapped to different protein-coding genes. The selected read pairs are further checked to see if both ends have over 50% of their read bases matches the reference transcriptome based on the CIGAR string and if both ends have no shared lesser alignments. 
-8. The read pairs passing the quality checks output as chimeric read pairs from the library in `chimericReadPairs.csv`. 
-9. With chimeric read pairs from the positive library from data processing step 3 as input, for each gene pair appeared in the positive library, we applied chi-square test as shown in Figure 4B. Benjamini-Hochberg adjustment is applied to correct all the p-values. Gene pairs with an adjusted p-value less than 0.05 and with an odds ratio larger than 1 are kept. Gene pairs with mapped chimeric read pair count in the positive library larger than 4 times the average number of mapped chimeric read pairs per gene pair in the positive library are kept. The average number of mapped chimeric read pairs per gene pair in the positive library, [X], is computed as Supplementary Equation 3. 
+7. The transcriptome alignment information of mapped read pairs is used to select read pairs whose two ends’ primary alignments map to different protein-coding genes. The selected read pairs are then checked to confirm that, for both ends, over 50% of the read bases match the reference transcriptome according to the CIGAR string, and that the two ends have no shared lesser alignments. 
+8. The read pairs passing the quality checks are output as chimeric read pairs from the library in `chimericReadPairs.csv`. 
+9. A chi-square test is applied to the chimeric read pairs. Benjamini-Hochberg adjustment is applied to correct all the p-values. Gene pairs with an adjusted p-value less than 0.05 (default) and with an odds ratio larger than 1 (default) are kept. Gene pairs whose mapped chimeric read pair count in the library is larger than 4 (default) times the average number of mapped chimeric read pairs per gene pair in the positive library are kept. 
 10. The kept gene pairs are output as protein-protein interactions in `proteinProteinInteractions.csv`.
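
Note on steps 9 and 10: the filtering described there can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the PROPERseqTools implementation. The README does not spell out the contingency table or the averaging formula, so the 2x2 table layout, the use of a plain mean as the average count per gene pair, the requirement that a gene pair pass all three filters at once, and every function name (`chi_square_2x2`, `benjamini_hochberg`, `call_interactions`) are hypothetical. Gene pairs are assumed to be stored once each as unordered pairs of distinct genes.

```python
import math
from collections import Counter


def chi_square_2x2(a, b, c, d):
    """Chi-square test (1 degree of freedom, no continuity correction)
    on the 2x2 table [[a, b], [c, d]]. Returns (p_value, odds_ratio)."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0, float("nan")  # degenerate table: no evidence either way
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the survival function is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    if b * c == 0:
        odds = float("inf") if a * d > 0 else float("nan")
    else:
        odds = (a * d) / (b * c)
    return p_value, odds


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for k, i in enumerate(order):
        rank = n - k  # rank of this p-value in ascending order
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def call_interactions(pair_counts, alpha=0.05, min_odds=1.0, fold=4.0):
    """pair_counts maps (gene1, gene2) to its chimeric read pair count.
    Assumed table for a pair (g1, g2): a = read pairs linking g1 and g2,
    b = pairs with g1 but not g2, c = pairs with g2 but not g1, d = the rest."""
    total = sum(pair_counts.values())
    per_gene = Counter()
    for (g1, g2), count in pair_counts.items():
        per_gene[g1] += count
        per_gene[g2] += count
    mean_count = total / len(pair_counts)  # assumed: plain mean per gene pair

    pairs = list(pair_counts)
    stats = []
    for g1, g2 in pairs:
        a = pair_counts[(g1, g2)]
        b = per_gene[g1] - a
        c = per_gene[g2] - a
        d = total - a - b - c
        stats.append(chi_square_2x2(a, b, c, d))
    adjusted = benjamini_hochberg([p for p, _ in stats])

    # Assumed: a gene pair must pass all three filters to be kept.
    return [
        pair
        for pair, (_, odds), q in zip(pairs, stats, adjusted)
        if q < alpha and odds > min_odds and pair_counts[pair] > fold * mean_count
    ]
```

The three keyword arguments mirror the defaults named in step 9 (0.05, 1, and 4); how the actual tool exposes them is not shown in this diff.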