Repository Details

All data and source code can be obtained from the project GitHub repository: https://github.com/anchitaaghag/Codon-Optimization-for-Biopharming

The data and code files can be used in the following order:

I. README.md

For an overview of the project and a quick how-to.

II. LICENSE.md

If license information is required.

III. Repository_Details.md

The current file with repository details.

IV. Create_Custom_Codon_Usage Folder

             A. Kazusa_Codon_Usage_Data Sub-Folder

                         1. The Using_Command_Line_to_Process_Kazuza_CU.md document describes how to process codon count data obtained from the Kazusa database. The associated files are also provided in this sub-folder:

                                                a) Kazuza_CU_for_each_CDS_Format.txt

                                                b) Kazuza_CU_for_each_CDS_in_N_benthamiana.txt

                                                c) Kazuza_CU_for_each_CDS_in_N_tabacum.txt

                                                d) N_benthamiana_Codon_Counts_Only.txt

                                                e) N_tabacum_Codon_Counts_Only.txt

             B. Build_Codon_Usage_Table.R

             C. Functions_For_Stat.R

             D. Statistical_Analysis_of_Codon_Usage.R

             E. Features_Data.csv (if required)

             F. NCBI_Data.csv (if required)

             G. Updated_Codon_Usage_Information.csv (to be used in the Statistical Analysis R file)

V. Codon_Optimization_Tool Folder

             A. Updated_Codon_Usage.txt

             B. Reverse_Translate_Function.R

             C. Codon_Optimization_Script.R

             D. Example_Protein_Sequences.txt

             E. Example_Results.txt

VI. Test_Codon_Optimization_Tool Folder

             A. The Tool_Testing_Data.xlsx spreadsheet provides an overview of the steps taken and data obtained from SolGenomics tBLASTn and CAIcal Server outputs.

             B. CO_Tool_Input_Protein_Sequences.txt

             C. The CO_Tool_Output_DNA_Sequences Sub-Folder contains the results of running the CO_Tool_Input_Protein_Sequences.txt file through the codon optimization tool ten times. The resulting files are:

                         1. Run_1_Results.txt

                         2. Run_2_Results.txt

                         3. Run_3_Results.txt

                         4. Run_4_Results.txt

                         5. Run_5_Results.txt

                         6. Run_6_Results.txt

                         7. Run_7_Results.txt

                         8. Run_8_Results.txt

                         9. Run_9_Results.txt

                         10. Run_10_Results.txt

             D. Sol_Genomics_DNA_Sequences.txt

             E. CAI_Values.txt (created from the Tool_Testing_Data.xlsx spreadsheet)

             F. GC_Values.txt (created from the Tool_Testing_Data.xlsx spreadsheet)

             G. Tool_Testing.R
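
As a quick sanity check after cloning, the ten run files listed under the CO_Tool_Output_DNA_Sequences sub-folder can be verified with a short script. This is only a minimal sketch and not part of the repository: the helper names `expected_run_files` and `missing_run_files` are hypothetical, and the sub-folder path is assumed to be relative to the repository root.

```python
from pathlib import Path


def expected_run_files(n_runs: int = 10) -> list[str]:
    """Return the names Run_1_Results.txt ... Run_<n_runs>_Results.txt."""
    return [f"Run_{i}_Results.txt" for i in range(1, n_runs + 1)]


def missing_run_files(folder: Path, n_runs: int = 10) -> list[str]:
    """List the expected result files that are not present in `folder`."""
    return [
        name
        for name in expected_run_files(n_runs)
        if not (folder / name).is_file()
    ]


if __name__ == "__main__":
    # Assumption: the script is run from the repository root.
    out_dir = Path("Test_Codon_Optimization_Tool") / "CO_Tool_Output_DNA_Sequences"
    missing = missing_run_files(out_dir)
    if missing:
        print(f"Missing result files: {missing}")
    else:
        print("All ten run files found.")
```

Running the script from a different directory only requires changing `out_dir`.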