Home
The Ruby Ensembl API was first devised by Jan Aerts in June 2007 on release 45 of the core database. The purpose of the API was to mimic all functionality that’s available in the Perl API for that database. With the help of the Ensembl core team, the API for the core database was worked out while he was a “Geek for a Week”.
An example
require 'ensembl'
include Ensembl::Core
DBConnection.connect('homo_sapiens',50)
puts "== Get a slice =="
slice = Slice.fetch_by_region('chromosome','4',10000,99999,-1)
puts slice.display_name
puts "== Print all genes for that slice (regardless of coord_system genes are annotated on) =="
slice.genes.each do |gene|
  puts gene.stable_id + "\t" + gene.status + "\t" + gene.slice.display_name
end
puts "== Get a transcript and print its 5'UTR, CDS and protein sequence =="
transcript = Transcript.find_by_stable_id('ENST00000380593')
puts "5'UTR: " + transcript.five_prime_utr_seq
puts "CDS: " + transcript.cds_seq
puts "peptide: " + transcript.protein_seq
DBConnection.connection.disconnect!
DBConnection.connect('bos_taurus',45)
puts "== Transforming a cow gene from chromosome level to scaffold level =="
gene = Gene.find(2408)
gene_on_scaffold = gene.transform('scaffold')
puts "Original: " + gene.slice.display_name
puts "Now: " + gene_on_scaffold.slice.display_name
puts "== What things are related to a 'gene' object? =="
puts 'Genes belong to: ' + Gene.reflect_on_all_associations(:belongs_to).collect{|a| a.name.to_s}.join(',')
puts 'Genes have many: ' + Gene.reflect_on_all_associations(:has_many).collect{|a| a.name.to_s}.join(',')
Output is:
== Get a slice ==
chromosome:NCBI36:4:10000:99999:-1
== Print all genes for that slice (regardless of coord_system genes are annotated on) ==
ENSG00000197701 KNOWN chromosome:NCBI36:4:43227:77340:1
ENSG00000207643 NOVEL chromosome:NCBI36:4:55032:55124:1
== Get a transcript and print its 5'UTR, CDS and protein sequence ==
5'UTR: ggaggaggtgaggagggtttgctgggtgg…agcactaggtcttcccgtcacctccacctctctcc
CDS: atgacccggctctgcttacccagacccgaagcacgtg…caaccccatcccactgcctgtgtctgttga
peptide: MTRLCLPRPEAREDPIPVPP…HDSPRRHSGFGSIEGQPHPTACVC*
== Transforming a cow gene from chromosome level to scaffold level ==
Original: chromosome:Btau_3.1:4:8104409:8496477:-1
Now: scaffold:Btau_3.1:Chr4.003.10:1590801:1982869:1
== What things are related to a 'gene' object? ==
Genes belong to: seq_region
Genes have many: object_xrefs,attrib_types,xrefs,transcripts,gene_attribs
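The display names in the output above are colon-separated. The following is a minimal sketch in plain Ruby that splits such a name into its parts, assuming the field order is coordinate system, assembly version, seq_region name, start, stop and strand. The `SliceName` struct and the `parse_display_name` helper are hypothetical names used for illustration only; they are not part of the gem.

```ruby
# Hypothetical helper, not part of the ruby-ensembl-api gem.
# Holds the fields of a display name such as
# "chromosome:NCBI36:4:10000:99999:-1".
SliceName = Struct.new(:coord_system, :version, :seq_region, :start, :stop, :strand) do
  # Number of bases covered, with inclusive start and stop coordinates.
  def span
    stop - start + 1
  end
end

# Split a display name on colons and convert the numeric fields to integers.
# Assumes none of the fields themselves contain a colon.
def parse_display_name(name)
  coord_system, version, seq_region, start, stop, strand = name.split(':')
  SliceName.new(coord_system, version, seq_region,
                Integer(start), Integer(stop), Integer(strand))
end

# The two display names printed by the cow gene example above.
original    = parse_display_name('chromosome:Btau_3.1:4:8104409:8496477:-1')
transformed = parse_display_name('scaffold:Btau_3.1:Chr4.003.10:1590801:1982869:1')

puts original.coord_system              # chromosome
puts transformed.seq_region             # Chr4.003.10
puts original.span == transformed.span  # true
```

Comparing the two spans is a quick sanity check: transforming a gene to another coordinate system changes its coordinates and possibly its strand, but in this example not its length.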
At the moment only the core and variation databases are covered, leaving out others like compara, funcgen and other_features. Hopefully these will be added in the future. Marc Hoeppner from Stockholm University is working on the compara part.
This Ruby API to the Ensembl databases is very much inspired by the Perl API provided by the Ensembl team. Given that they are two different languages, there are of course some differences:
- There is only one API for the different Ensembl releases. In the Perl API, users need to load the version of the API that matches the database release they want to work with.
- The Slice class is defined slightly differently. In the Perl API, the “slice” of an object is the whole seq_region (read: chromosome) that the object is defined on. For example, the “slice” of the gene BRCA2 is chromosome 13. In contrast, the “slice” in the Ruby API is delineated by the start and stop positions of that object; the “slice” for the same gene using the Ruby API is chromosome:GRCh37:13:32889611:32973347:1. This makes additional functionality available for the Ruby objects. You can for example check whether one object overlaps with or is contained within another object, for example: gene1.overlaps?(gene2).
- Ruby’s introspection makes it possible to investigate the structure of the database from within the code, and for example check what types of object are related to e.g. a gene: Gene.reflect_on_all_associations(:belongs_to) reports that a gene “belongs to” a seq_region and an analysis.
- + binaries
- - only core and variation
- - not developed by Ensembl core team
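To illustrate why a slice bounded by start and stop positions makes overlap checks possible, here is a minimal sketch in plain Ruby. The `Region` struct is a hypothetical stand-in for an object with a slice, and `within?` is a name chosen for this sketch; neither is the gem's own implementation, and the coordinates are taken from the two genes in the example output above.

```ruby
# Hypothetical stand-in for an object located on a seq_region.
# Coordinates are 1-based and inclusive. Not the gem's Slice class.
Region = Struct.new(:seq_region, :start, :stop) do
  # True when both regions are on the same seq_region and share
  # at least one base.
  def overlaps?(other)
    seq_region == other.seq_region &&
      start <= other.stop && other.start <= stop
  end

  # True when this region lies entirely inside the other region.
  def within?(other)
    seq_region == other.seq_region &&
      start >= other.start && stop <= other.stop
  end
end

# The slice and the two genes from the human example (NCBI36, chromosome 4).
slice  = Region.new('4', 10_000, 99_999)
gene_a = Region.new('4', 43_227, 77_340)
gene_b = Region.new('4', 55_032, 55_124)

puts gene_a.overlaps?(gene_b)  # true
puts gene_b.within?(gene_a)    # true
puts slice.within?(gene_a)     # false
```

If the slice of an object were always the whole chromosome, as described for the Perl API, a comparison like this could not tell two genes on the same chromosome apart.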
The API is made available as a gem from gems.github.com. To install, type
gem install ruby-ensembl-api
To help in the development, feel free to fork this repository.
An extensive tutorial is available here. This tutorial is a ruby version of the perl tutorial available at the Ensembl website (with permission).
Full documentation on classes and methods can be found here.
- Add additional databases: compara, …