rtracklayer improvements
R has many tools for genomic data analysis, but currently lacks support for (1) generating UCSC track hub meta-data files, and (2) writing bigBed files, which are binary files for displaying genomic regions on track hubs.
- PeakSegPipeline has code for generating track hubs, and currently relies on the bedToBigBed command line program for creating bigBed files.
- rtracklayer can create bigWig but not bigBed files.
The interested student will implement two major features for the rtracklayer package: track hub meta-data generation and bigBed file support.
A track hub is a group of text files that describes a set of genomic data to display on the UCSC browser. It contains links to binary indexed files such as bigWig and bigBed. R needs a function for creating these text files.
The student should implement functions such as

    trackHub(
      multiWig(
        bigWig("http://path/to/data.bigWig", "red"),
        bigWig("http://path/to/peaks.bigWig", "black")),
      bigBed("http://path/to/labels.bigBed"),
      trackDb="trackDb.txt",
      genomes="genomes.txt",
      db="hg19",
      hub="hub.txt")

This call would generate trackDb.txt, genomes.txt, and hub.txt, which could then be uploaded to a web server for display on UCSC.
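To make the target output concrete, here is a minimal base R sketch of what writing the three text files could look like. It is not the proposed rtracklayer API: the function names `stanza` and `write_hub_sketch`, the argument names, the track names ("coverage", "labels"), the hub name, and the placeholder email are all assumptions made for illustration. The stanza keys (`hub`, `genomesFile`, `genome`, `trackDb`, `track`, `container multiWig`, `parent`, `bigDataUrl`, `color`) follow the UCSC track hub text format.

```r
# Hypothetical helper: format a named character vector as "key value" lines.
stanza <- function(fields) {
  paste(names(fields), unlist(fields), collapse = "\n")
}

# Hypothetical sketch, not the proposed trackHub() interface.
write_hub_sketch <- function(dir, db, big_wig_urls, big_wig_colors,
                             big_bed_url,
                             hub_name = "exampleHub",      # assumed name
                             email = "user@example.com") { # placeholder
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)

  # hub.txt: top-level description that points at genomes.txt.
  hub <- stanza(c(
    hub = hub_name, shortLabel = hub_name, longLabel = hub_name,
    genomesFile = "genomes.txt", email = email))

  # genomes.txt: one stanza per assembly, pointing at trackDb.txt.
  genomes <- stanza(c(genome = db, trackDb = "trackDb.txt"))

  # trackDb.txt: a multiWig container, its bigWig children, one bigBed track.
  container <- stanza(c(
    track = "coverage", container = "multiWig", type = "bigWig",
    shortLabel = "coverage", longLabel = "coverage",
    aggregate = "transparentOverlay"))
  children <- vapply(seq_along(big_wig_urls), function(i) {
    id <- paste0("coverage", i)
    # UCSC expects colors as "R,G,B", so convert R color names.
    rgb <- paste(grDevices::col2rgb(big_wig_colors[[i]]), collapse = ",")
    stanza(c(
      track = id, parent = "coverage", type = "bigWig",
      bigDataUrl = big_wig_urls[[i]],
      shortLabel = id, longLabel = id, color = rgb))
  }, character(1))
  labels <- stanza(c(
    track = "labels", type = "bigBed", bigDataUrl = big_bed_url,
    shortLabel = "labels", longLabel = "labels"))
  track_db <- paste(c(container, children, labels), collapse = "\n\n")

  paths <- c(
    hub = file.path(dir, "hub.txt"),
    genomes = file.path(dir, "genomes.txt"),
    trackDb = file.path(dir, "trackDb.txt"))
  writeLines(hub, paths[["hub"]])
  writeLines(genomes, paths[["genomes"]])
  writeLines(track_db, paths[["trackDb"]])
  invisible(paths)
}

out <- write_hub_sketch(
  dir = file.path(tempdir(), "hub"),
  db = "hg19",
  big_wig_urls = c("http://path/to/data.bigWig",
                   "http://path/to/peaks.bigWig"),
  big_wig_colors = c("red", "black"),
  big_bed_url = "http://path/to/labels.bigBed")
```

A real implementation would presumably represent tracks as objects (as the `multiWig()`, `bigWig()`, and `bigBed()` calls suggest) rather than as parallel vectors; the sketch only shows the shape of the files to be produced.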
The bigBed file format is useful for displaying genomic regions on UCSC track hubs. The student should implement a BigBedFile class with methods import, export, etc., similar to the existing BigWigFile class.
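As a rough illustration of the class-plus-methods shape, the sketch below defines a small S4 class and an export-like method that prepares sorted BED input and, only if the `bedToBigBed` command line program is found on the PATH, calls it to do the conversion. Everything named here is hypothetical (`BigBedFileSketch`, `exportSketch`, the `chrom_sizes` argument, the returned list); the actual project would add methods to rtracklayer's own generics and write the binary format directly instead of shelling out.

```r
library(methods)

# Hypothetical stand-in for the proposed BigBedFile class.
setClass("BigBedFileSketch", representation(resource = "character"))
BigBedFileSketch <- function(path) new("BigBedFileSketch", resource = path)

# Hypothetical generic; not rtracklayer's export().
setGeneric("exportSketch",
           function(object, con, ...) standardGeneric("exportSketch"))

setMethod("exportSketch", signature("data.frame", "BigBedFileSketch"),
  function(object, con, chrom_sizes, ...) {
    # bedToBigBed expects input sorted by chromosome, then start;
    # method = "radix" sorts character keys in the C locale.
    chrom <- as.character(object$chrom)
    ord <- order(chrom, object$chromStart, method = "radix")
    bed <- data.frame(
      chrom = chrom[ord],
      # Integers avoid scientific notation such as 1e+05 in the text file.
      chromStart = as.integer(object$chromStart[ord]),
      chromEnd = as.integer(object$chromEnd[ord]))
    bed_txt <- tempfile(fileext = ".bed")
    write.table(bed, bed_txt, sep = "\t", quote = FALSE,
                row.names = FALSE, col.names = FALSE)

    # Two-column chromosome sizes file: name, length.
    sizes_txt <- tempfile(fileext = ".chrom.sizes")
    write.table(data.frame(names(chrom_sizes), as.integer(chrom_sizes)),
                sizes_txt, sep = "\t", quote = FALSE,
                row.names = FALSE, col.names = FALSE)

    # Command line form: bedToBigBed in.bed chrom.sizes out.bb
    args <- c(bed_txt, sizes_txt, con@resource)
    exe <- Sys.which("bedToBigBed")
    converted <- nzchar(exe) && system2(exe, shQuote(args)) == 0
    list(bed = bed_txt, args = args, converted = isTRUE(converted))
  })

# Example regions in BED coordinates (0-based start, exclusive end).
regions <- data.frame(
  chrom = c("chr2", "chr1", "chr1"),
  chromStart = c(100, 500, 0),
  chromEnd = c(200, 600, 100))
res <- exportSketch(regions,
                    BigBedFileSketch(tempfile(fileext = ".bigBed")),
                    chrom_sizes = c(chr1 = 1000, chr2 = 1000))
```

When `bedToBigBed` is not installed, `res$converted` is `FALSE` and only the intermediate text files are written, which is the dependency this project aims to remove.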
This project will provide R with functionality for creating track hub meta-data files, along with bigBed files.
Students, please contact mentors below after completing at least one of the tests below.
- Toby Hocking [email protected] is the author of the R package PeakSegPipeline, which has code for generating track hubs and currently relies on the bedToBigBed command line program for creating bigBed files.
- Michael Lawrence [email protected] is a member of R-core and the author of the R package rtracklayer, which can create bigWig but not bigBed files.
Students, please do one or more of the following tests before contacting the mentors above.
TODO: write several tests that potential students can do to demonstrate their capabilities for this particular project. Ask some hard questions that will give you insight about how the students write code to solve problems. You'll see that the harder the questions that you ask, the easier it will be for you to choose between the students that apply for your project! Please modify the suggestions below to make them specific for your project.
- Easy: something that any useR should be able to do, e.g. download some existing package listed in the Related Work, and run it on some example data.
- Medium: something a bit more complicated. You can encourage students to write a script or some functions that show their R coding abilities.
- Hard: Can the student write a package with Rd files, tests, and vignettes? If your package interfaces with non-R code, can the student write in that other language?
Students, please post a link to your test results here.