Commit 11ddbf5

Koeng101, v-raja, TimothyStiles, and Isaac Guerreiro authored
Synthesis fixer (#98)
* Added manifesto of how and why
* Working a little on the synthesisFixer
* First round of new synthesisFixer with sqlite
* Work in progress
* Work in progress with SQL synthesisFixer
* Passes go test
* Figuring out tests
* FixCds now works with an example
* updates
* Moved synthesis fixer from transformations -> synthesis
* Improved comment for synthesis.go
* Added FindTypeIIS for Pichia work
* Removed overlap loop. It can be buggy and a proper overlap function should be added later.
* Fixed index issue
* Added test case fixing error in sqlFix where NA would not look for GC or AT biases
* 1000 iterations -> 100
* Fix linter issues
* Added generator for naughty sequences
* Added repeat remover
* Added iterations as variable
* Added GcContent function
* Remove intersect
* Added FixCdsSimple example
* refactor synthesis stuff into its own subpackage and update fn calls after `repackage` PR
* update direct dependencies and fn calls to match update
* update `codon.chooser()` err to include source of error
* update copyright year.
* added random.DNASequence function. (#179)
* Updated go.mod to remove dependency on CGO.
* Add example
* Update to synthesis.go
* Added additional docs
* Add comment to sqlite blank import
* Fixed stuff for code climate
* Added concurrent parsing feature for genbank files (#182)
* Added concurrent parsing feature for genbank files
* Added ParseFlatConcurrent
* Updated to have generic CheckSum instead of just MD5
* Tutorials (#184)
* Added example_test for genbank and gff
* Added SIMPLE tutorials and notes for each parser.
* Revamped seqhash docs and added simple tutorial
* moved example seqhash to example_test - does this make it runnable?

Co-authored-by: Timothy Stiles <tim@stiles.io>

* Fixed problem with stop codons
* Added change log and fixed gcbias bug
* Change log now has ordering
* Added change comment string
* Added gc content checker
* Test where the codon choosed by the sql query is actually not with the highest weight
* Create a feature to removes repetitions between cds and a external sequence as host genome or plasmid to avoid homology recombination
* fix linters indications
* Hairpin remover function for FixCds and test
* Fix for checking if we have a full CDS sequence without interrupted codons
* Fix for genbank files exported from benchling that have a different index for slash in qualifiers
* Lint correction
* Synthesis fix for correctly choose the highest synonymous codon
* fixed typo
* Fixed reversion error with DESC statement.
* Fixed linter for synthesisFixer
* Fix error message
* enzymes -> sequencesToRemove
* Changed name of enzymes
* Changed fixiterations
* 100% test coverage with updates
* Changed wg -> waitgroup
* Fix linter issues
* Add better message for failures
* Add comment to synthesis fixer
* Added comments on sql
* codonLength
* Removed kmer table stuff - will add again with kmer table for synthesisFixer

Co-authored-by: Vivek Raja <vivek.r.raja@gmail.com>
Co-authored-by: Tim <TimothyStiles@users.noreply.github.com>
Co-authored-by: Timothy Stiles <tim@stiles.io>
Co-authored-by: Isaac Guerreiro <ig00114@saqueepague.com.br>
1 parent 499ec41 commit 11ddbf5

11 files changed

+1538
-31
lines changed

checks/checks.go

Lines changed: 13 additions & 1 deletion
@@ -1,9 +1,21 @@
 package checks
 
-import "github.com/TimothyStiles/poly/transform"
+import (
+	"github.com/TimothyStiles/poly/transform"
+	"strings"
+)
 
 // IsPalindromic accepts a sequence of even length and returns if it is
 // palindromic. More here - https://en.wikipedia.org/wiki/Palindromic_sequence
 func IsPalindromic(sequence string) bool {
 	return sequence == transform.ReverseComplement(sequence)
 }
+
+// GcContent checks the GcContent of a given sequence.
+func GcContent(sequence string) float64 {
+	sequence = strings.ToUpper(sequence)
+	GuanineCount := strings.Count(sequence, "G")
+	CytosineCount := strings.Count(sequence, "C")
+	GuanineAndCytosinePercentage := float64(GuanineCount+CytosineCount) / float64(len(sequence))
+	return GuanineAndCytosinePercentage
+}

checks/checks_test.go

Lines changed: 7 additions & 0 deletions
@@ -13,3 +13,10 @@ func TestIsPalindromic(t *testing.T) {
 		t.Errorf("IsPalindromic failed call BsaI NOT a palindrome")
 	}
 }
+
+func TestGcContent(t *testing.T) {
+	content := GcContent("GGTATC")
+	if content != 0.5 {
+		t.Errorf("GcContent did not properly calculate GC content")
+	}
+}
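
The test above compares a float64 against 0.5 with `!=`, which works here because 3/6 is exactly representable. As a hedged sketch of how the check could be widened, the program below runs a small table of cases through a local copy of the GcContent logic and compares with a tolerance. The names `gcContent`, `approxEqual`, and `failedCases`, the extra sequences, and the 1e-9 tolerance are all assumptions for illustration, not part of the commit.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// gcContent is a local copy of the GcContent logic from the diff,
// so this sketch runs on its own.
func gcContent(sequence string) float64 {
	sequence = strings.ToUpper(sequence)
	gc := strings.Count(sequence, "G") + strings.Count(sequence, "C")
	return float64(gc) / float64(len(sequence))
}

// approxEqual reports whether a and b differ by at most tolerance.
func approxEqual(a, b, tolerance float64) bool {
	return math.Abs(a-b) <= tolerance
}

// failedCases returns the names of table entries whose computed GC
// fraction differs from the expected value by more than 1e-9.
func failedCases() []string {
	cases := []struct {
		name     string
		sequence string
		want     float64
	}{
		{"half GC", "GGTATC", 0.5},
		{"lowercase", "ggtatc", 0.5},
		{"no GC", "ATAT", 0},
		{"all GC", "GCGC", 1},
		{"one third", "GAT", 1.0 / 3.0},
	}
	var failed []string
	for _, c := range cases {
		if got := gcContent(c.sequence); !approxEqual(got, c.want, 1e-9) {
			failed = append(failed, c.name)
		}
	}
	return failed
}

func main() {
	fmt.Println(len(failedCases())) // 0 when every case passes
}
```

In a real `_test.go` file the same table would typically drive `t.Run` subtests with `t.Errorf` on mismatch; the standalone form is used here only so the sketch can be run directly.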
