spatino1234/bwt

Burrows Wheeler Transform (BWT): DNA Alignment and Analysis - Carleton College CS Senior Comps 2024

🧬 BWT Project 🧬

Welcome to our project repository! This is the home of our work on the Burrows-Wheeler Transform (BWT), a reversible text transform that underpins fast DNA sequence alignment and analysis. Dive into our world of genomics, algorithms, and computational biology!

Team Extraordinaire 👩‍💻👨‍💻

🚀 Project Components 🚀

C Implementation 🖥️

  • BWT Transform and Alignment: A robust C implementation complete with a sleek terminal-based user interface.
  • Other Alignment Algorithms 🔍
  • See the README.txt in the C folder for instructions.

GUI Magic 🌈

  • Visualize BWT: Experience the BWT transformation through our dynamic graphical interface.
  • 📝 How to Use: See the README.txt in the GUI folder for instructions.

Website Showcase 🌐

  • HTML & CSS Genius: Explore our project's website, crafted with care and coding prowess.

Radix Sort Experiments 🧪

  • Radix Sort Sandbox: Python and early C experimentation with the fascinating Radix Sort algorithm.
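For readers new to the algorithm, here is a minimal sketch of a least-significant-digit (LSD) radix sort in Python. It is an illustration only, not the code from the sandbox folder; the function name `radix_sort` and the `base` parameter are assumptions made for this example.

```python
def radix_sort(values, base=256):
    """LSD radix sort for non-negative integers; returns a new sorted list.

    Illustrative sketch: the name and signature are hypothetical, not taken
    from the repository.
    """
    if not values:
        return []
    if min(values) < 0:
        raise ValueError("this sketch only handles non-negative integers")

    result = list(values)
    largest = max(result)
    place = 1  # weight of the digit currently being sorted on
    while largest // place > 0:
        # Distribute into one bucket per digit value. Appending keeps the
        # pass stable, which is what makes LSD radix sort correct.
        buckets = [[] for _ in range(base)]
        for v in result:
            buckets[(v // place) % base].append(v)
        result = [v for bucket in buckets for v in bucket]
        place *= base
    return result
```

Each pass is linear in the input size, and the number of passes depends only on how many base-`base` digits the largest key has.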

Benchmarking Tools ⏱️

  • Performance Analysis: Python programs meticulously designed for benchmarking our algorithms.

Python BWT Implementations 🐍

  • Classic BWT.py: The core Python implementation of BWT transform, reversal, and pattern matching.
  • BWT with a Twist: Our special NumPy-enhanced version of the BWT transform.
  • Python Power: Check out our Python implementations of the Boyer-Moore and Naive alignment algorithms.
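To give a feel for what the transform and its reversal do, here is a small sketch using the textbook sorted-rotations construction. It is not the contents of BWT.py; the function names and the choice of `$` as the end-of-text sentinel are assumptions for this example, and sorting full rotations is only practical for short inputs.

```python
def bwt_transform(text, sentinel="$"):
    """Return the BWT of text (last column of the sorted rotations).

    Hypothetical helper for illustration; assumes the sentinel does not
    occur in text and sorts before every other character.
    """
    if sentinel in text:
        raise ValueError("text must not contain the sentinel")
    text += sentinel
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def bwt_inverse(bwt, sentinel="$"):
    """Recover the original text (without the sentinel) from its BWT."""
    # A stable sort of positions by character links each entry of the
    # first column to the matching occurrence in the last column.
    order = sorted(range(len(bwt)), key=lambda i: bwt[i])
    row = bwt.index(sentinel)  # the row holding the unrotated text
    chars = []
    for _ in range(len(bwt) - 1):
        row = order[row]
        chars.append(bwt[row])
    return "".join(chars)


if __name__ == "__main__":
    encoded = bwt_transform("banana")
    print(encoded)               # annb$aa
    print(bwt_inverse(encoded))  # banana
```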

🧬 The World of DNA and BWT

  • Human DNA, a vast universe of over 3 billion characters (A, C, G, T), poses complex challenges in genome sequencing. Our project tackles these challenges head-on, exploring efficient alignment of short DNA sequences to a reference genome – a crucial task given the sheer volume and potential imperfections in the data.
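As a baseline for what "aligning a short sequence to a reference" means, the sketch below slides a read along the reference and reports every position where it fits with at most a given number of mismatches. It is a deliberately simple illustration, not the repository's naive aligner; the name `naive_align` and its parameters are assumptions.

```python
def naive_align(read, reference, max_mismatches=0):
    """Return (position, mismatches) for every placement of read in
    reference with at most max_mismatches substitutions.

    Hypothetical helper for illustration; handles substitutions only,
    not insertions or deletions.
    """
    if not read:
        return []
    hits = []
    for start in range(len(reference) - len(read) + 1):
        mismatches = 0
        for a, b in zip(read, reference[start:start + len(read)]):
            if a != b:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many differences, try the next position
        else:
            # Loop finished without break: this placement is acceptable.
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    reference = "GATTACAGATTACA"
    print(naive_align("TTAC", reference))                    # exact matches
    print(naive_align("TTGC", reference, max_mismatches=1))  # one substitution allowed
```

This approach compares the read against every position, so its cost grows with the product of read length and reference length, which is the motivation for index-based methods on genome-scale references.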

Why BWT? 🤔

  • A Compression Powerhouse: BWT is not a compressor by itself; it is a reversible rearrangement that tends to group identical characters into runs, which makes repetitive sequences like human DNA much easier to compress and index.
  • Versatility: From file compression (think bzip2) to DNA read alignment, BWT shows up in a wide range of tools.
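The grouping effect is easy to see on a toy input. The sketch below applies a simple sorted-rotations BWT to a highly repetitive string and run-length encodes both versions. The helper names and the `$` sentinel are assumptions for this example, and real compressors such as bzip2 add further stages that are not shown here.

```python
from itertools import groupby


def bwt_transform(text, sentinel="$"):
    """Sorted-rotations BWT; hypothetical helper, fine for short inputs only."""
    text += sentinel
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def run_length_encode(s):
    """Collapse s into a list of (character, run length) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


if __name__ == "__main__":
    text = "ACGT" * 8            # a highly repetitive toy "genome"
    transformed = bwt_transform(text)
    print(len(run_length_encode(text)), "runs before the transform")
    print(len(run_length_encode(transformed)), "runs after the transform")
```

On this input the original string has no two equal neighbours, while the transformed string collapses into a handful of long runs, which is exactly what a run-length or entropy coder can exploit.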

🌟 The Project Mission

  • Decode the BWT and FM-Index: Understand the why and how behind the transform and the search index built on top of it.
  • Reference Genome Realities: Delve into the creation and ethical considerations of reference genomes.
  • Aligners in Action: Run and analyze outputs from existing BWT-based aligners, comparing their efficacy.
  • Hands-On Implementation: We're not just studying; we're building! From exact matches to handling mismatches and gaps, we're on it.
  • Data Playground: Engaging with real and simulated datasets to test and refine our alignment algorithms.
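For the exact-match case, the core idea of FM-index search can be sketched in a few lines: process the pattern from its last character to its first, narrowing a range of rows in the sorted suffix list at each step. The code below is a teaching sketch under simplifying assumptions (full suffix array kept in memory, uncompressed occurrence tables, `$` as sentinel); the function names are hypothetical and it is not the project's C or Python implementation. Mismatches and gaps are not handled.

```python
def build_fm_index(text, sentinel="$"):
    """Build a toy FM-index: suffix array, first-column offsets, occurrence table."""
    text += sentinel
    # Naive suffix array construction; acceptable only for short texts.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # i == 0 wraps around to the sentinel

    alphabet = sorted(set(text))
    # first[c] = number of characters in text that sort strictly before c
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += text.count(c)

    # occ[c][i] = number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, first, occ


def find_exact(pattern, sa, first, occ):
    """Return sorted start positions of pattern in the indexed text."""
    if not pattern:
        return []
    lo, hi = 0, len(sa)  # half-open range of candidate rows
    for ch in reversed(pattern):
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


if __name__ == "__main__":
    index = build_fm_index("GATTACAGATTACA")
    print(find_exact("TTA", *index))  # [2, 9]
    print(find_exact("GGG", *index))  # []
```

The number of narrowing steps depends on the pattern length rather than the reference length, which is why this style of search suits short reads against a large genome.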

Running Locally

  1. Compile and run our C implementation:
     gcc bwt.c -o bwt
     ./bwt
