🧬 BWT Project 🧬
Welcome to our project repository! This is the home of our work on the Burrows-Wheeler Transform (BWT), a reversible text transform that underpins fast DNA sequence alignment and analysis. Dive into our world of genomics, algorithms, and computational biology!
Team Extraordinaire 👩‍💻👨‍💻
🚀 Project Components 🚀
C Implementation 🖥️
- BWT Transform and Alignment: A robust C implementation complete with a sleek terminal-based user interface.
- Other Alignment Algorithms 🔍
- See the README.txt in the C folder for instructions.
GUI Magic 🌈
- Visualize BWT: Experience the BWT transformation through our dynamic graphical interface.
- 📝 How to Use: See the README.txt in the GUI folder for instructions.
Website Showcase 🌐
- HTML & CSS Genius: Explore our project's website, crafted with care and coding prowess.
Radix Sort Experiments 🧪
- Radix Sort Sandbox: Python and early C experimentation with the fascinating Radix Sort algorithm.
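
For readers new to the algorithm, here is a minimal sketch of a least-significant-digit radix sort in Python. It is an illustration only, not the code in the sandbox folder; the function name `radix_sort` and the `base` parameter are assumptions chosen for this example, and it handles non-negative integers only.

```python
def radix_sort(values, base=256):
    """LSD radix sort for non-negative integers (illustrative sketch)."""
    values = list(values)
    if not values:
        return values
    largest = max(values)
    place = 1
    # Process one digit per pass, from least to most significant.
    while largest // place > 0:
        buckets = [[] for _ in range(base)]
        for v in values:
            buckets[(v // place) % base].append(v)
        # Concatenating buckets in order keeps each pass stable.
        values = [v for bucket in buckets for v in bucket]
        place *= base
    return values


print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

Each pass is stable, which is what lets later (more significant) digits refine the order without undoing earlier passes.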
Benchmarking Tools ⏱️
- Performance Analysis: Python programs meticulously designed for benchmarking our algorithms.
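
As a rough idea of what a timing harness can look like, the sketch below times a naive string search on random DNA. Every name here (`random_dna`, `naive_find`, `best_time`) is hypothetical and chosen for illustration; the repository's own benchmarking scripts may be organized quite differently.

```python
import random
import time


def random_dna(length, seed=0):
    """Generate a reproducible random DNA string."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


def naive_find(text, pattern):
    """Return every start position where pattern occurs in text."""
    return [i for i in range(len(text) - len(pattern) + 1)
            if text.startswith(pattern, i)]


def best_time(func, *args, repeats=5):
    """Run func(*args) several times and return the fastest wall-clock time."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


genome = random_dna(100_000)
seconds = best_time(naive_find, genome, "ACGTAC")
print(f"naive search: {seconds:.4f} s")
```

Taking the best of several runs reduces noise from other processes; a fixed seed keeps the input identical between runs.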
Python BWT Implementations 🐍
- Classic BWT.py: The core Python implementation of BWT transform, reversal, and pattern matching.
- BWT with a Twist: Our special Numpy-enhanced version of the BWT transform.
- Python Power: Python implementations of the Boyer-Moore and naive alignment algorithms.
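
To make the transform and its reversal concrete, here is a minimal sketch built on sorted rotations. It is deliberately naive (quadratic memory) and is not the code in BWT.py; the function names and the `$` terminator are assumptions made for this example.

```python
def bwt(text, terminator="$"):
    """Forward BWT: last column of the sorted rotations of text + terminator."""
    text += terminator
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, terminator="$"):
    """Invert the BWT by repeatedly prepending the last column and re-sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original text is the row that ends with the terminator.
    original = next(row for row in table if row.endswith(terminator))
    return original[:-1]


transformed = bwt("banana")
print(transformed)               # annb$aa
print(inverse_bwt(transformed))  # banana
```

The terminator must not occur in the input and must sort before every other character, which `$` does for plain A/C/G/T or lowercase text.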
🧬 The World of DNA and BWT
- Human DNA, a vast universe of over 3 billion characters (A, C, G, T), poses complex challenges in genome sequencing. Our project tackles these challenges head-on, exploring efficient alignment of short DNA sequences to a reference genome – a crucial task given the sheer volume and potential imperfections in the data.
Why BWT? 🤔
- A Compression Powerhouse: BWT is not a compressor by itself; it is a reversible transform that groups identical characters into long runs, which makes repetitive sequences like human DNA far easier to compress and index.
- Versatility: From file compression (think bzip2) to DNA, BWT's adaptability is nothing short of amazing.
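
A small experiment shows why the transform helps compression. In this sketch (naive rotation sort, hypothetical function names, `$` as an assumed terminator), a repetitive input with no adjacent repeats turns into a handful of long runs that a run-length encoder can store compactly.

```python
from itertools import groupby


def bwt(text, terminator="$"):
    """Forward BWT via sorted rotations (naive, for illustration)."""
    text += terminator
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def run_lengths(s):
    """Run-length encode s as a list of (character, count) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


repetitive = "ACGT" * 4
transformed = bwt(repetitive)
print(transformed)               # TTTT$AAAACCCCGGGG
print(run_lengths(transformed))  # 5 runs instead of 17 single characters
```

Real compressors such as bzip2 follow the transform with further stages; this example only shows the run-grouping effect.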
🌟 The Project Mission
- Decode the BWT and FM-Index: Understand the why and how behind these compression superheroes.
- Reference Genome Realities: Delve into the creation and ethical considerations of reference genomes.
- Aligners in Action: Run and analyze outputs from existing BWT-based aligners, comparing their efficacy.
- Hands-On Implementation: We're not just studying; we're building! From exact matches to handling mismatches and gaps, we're on it.
- Data Playground: Engaging with real and simulated datasets to test and refine our alignment algorithms.
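
For the exact-match case, the core FM-index idea is backward search: process the pattern right to left, narrowing a range of rows in the sorted suffix list. The sketch below is one simple way to write it, assuming a `$` terminator and an uncompressed occurrence table; the names `build_fm_index` and `find` are hypothetical, and it does not handle mismatches or gaps.

```python
from collections import Counter


def build_fm_index(text, terminator="$"):
    """Build a suffix array, first-column offsets, and occurrence counts."""
    text += terminator
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT character for each suffix is the character just before it.
    last = "".join(text[i - 1] for i in suffix_array)
    counts = Counter(text)
    # first_occ[c]: number of characters in text strictly smaller than c.
    first_occ, total = {}, 0
    for ch in sorted(counts):
        first_occ[ch] = total
        total += counts[ch]
    # occ[c][i]: occurrences of c in last[:i].
    occ = {ch: [0] * (len(last) + 1) for ch in counts}
    for i, ch in enumerate(last):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return suffix_array, first_occ, occ


def find(pattern, index):
    """Return sorted start positions of exact matches of pattern."""
    suffix_array, first_occ, occ = index
    lo, hi = 0, len(suffix_array)
    for ch in reversed(pattern):
        if ch not in first_occ:
            return []
        lo = first_occ[ch] + occ[ch][lo]
        hi = first_occ[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[lo:hi])


index = build_fm_index("banana")
print(find("ana", index))  # [1, 3]
print(find("nab", index))  # []
```

A production index would sample the occurrence table and suffix array to save memory; storing them in full keeps this example short.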
Running Locally
- Compile and run our C implementation from the folder containing bwt.c:
gcc bwt.c -o bwt
./bwt