Home
The Ruby Ensembl API was first devised by Jan Aerts in June 2007 against release 45 of the core database. Its purpose was to mimic all functionality available in the Perl API for that database. The API for the core database was worked out with the help of the Ensembl core team while Jan was a “Geek for a Week”.
Unlike the Perl version, a single API covers consecutive Ensembl releases. The underlying Ruby code picks up changes in the database schema automatically.
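As a rough illustration of how code can adapt to a schema at runtime, here is a minimal plain-Ruby sketch. It is not the library's implementation: the `reflect_on_all_associations` call in the example script suggests the API builds on ActiveRecord, which reads column names from the database, but the class names, the `load_schema` method and the column lists below are all made up for illustration and no database is involved.

```ruby
# Hypothetical illustration only: none of these names come from the Ensembl API.
class SketchRecord
  # Define one reader method per column name reported by the schema.
  def self.load_schema(columns)
    columns.each do |column|
      define_method(column) { @row[column] }
    end
  end

  def initialize(row)
    @row = row
  end
end

# Made-up column lists standing in for two database releases.
release_a = %w[gene_id seq_region_id status]
release_b = release_a + %w[canonical_transcript_id]

class GeneA < SketchRecord; end
class GeneB < SketchRecord; end
GeneA.load_schema(release_a)
GeneB.load_schema(release_b)

# The extra column in the second list becomes a method without any code change.
gene = GeneB.new('gene_id' => 1, 'canonical_transcript_id' => 42)
puts gene.canonical_transcript_id
puts GeneA.method_defined?(:canonical_transcript_id)
```

The same class definition serves both column lists; only the data describing the schema differs.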
require 'ensembl'
include Ensembl::Core
DBConnection.connect('homo_sapiens',50)
puts "== Get a slice =="
slice = Slice.fetch_by_region('chromosome','4',10000,99999,-1)
puts slice.display_name
puts "== Print all gene for that slice (regardless of coord_system genes are annotated on) =="
slice.genes.each do |gene|
  puts gene.stable_id + "\t" + gene.status + "\t" + gene.slice.display_name
end
puts "== Get a transcript and print its 5'UTR, CDS and protein sequence =="
transcript = Transcript.find_by_stable_id('ENST00000380593')
puts "5'UTR: " + transcript.five_prime_utr_seq
puts "CDS: " + transcript.cds_seq
puts "peptide: " + transcript.protein_seq
DBConnection.connection.disconnect!
DBConnection.connect('bos_taurus',45)
puts "== Transforming a cow gene from chromosome level to scaffold level =="
gene = Gene.find(2408)
cloned_gene = gene.transform('scaffold')
puts "Original: " + gene.slice.display_name
puts "Now: " + cloned_gene.slice.display_name
puts "== What things are related to a 'gene' object? =="
puts 'Genes belong to: ' + Gene.reflect_on_all_associations(:belongs_to).collect{|a| a.name.to_s}.join(',')
puts 'Genes have many: ' + Gene.reflect_on_all_associations(:has_many).collect{|a| a.name.to_s}.join(',')
Output is:
== Get a slice ==
chromosome:NCBI36:4:10000:99999:-1
== Print all gene for that slice (regardless of coord_system genes are annotated on) ==
ENSG00000197701 KNOWN chromosome:NCBI36:4:43227:77340:1
ENSG00000207643 NOVEL chromosome:NCBI36:4:55032:55124:1
== Get a transcript and print its 5'UTR, CDS and protein sequence ==
5'UTR: ggaggaggtgaggagggtttgctgggtgg…agcactaggtcttcccgtcacctccacctctctcc
CDS: atgacccggctctgcttacccagacccgaagcacgtg…caaccccatcccactgcctgtgtctgttga
peptide: MTRLCLPRPEAREDPIPVPP…HDSPRRHSGFGSIEGQPHPTACVC*
== Transforming a cow gene from chromosome level to scaffold level ==
Original: chromosome:Btau_3.1:4:8104409:8496477:-1
Now: scaffold:Btau_3.1:Chr4.003.10:1590801:1982869:1
== What things are related to a 'gene' object? ==
Genes belong to: seq_region
Genes have many: object_xrefs,attrib_types,xrefs,transcripts,gene_attribs
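The display names printed above appear to be colon-separated fields: coordinate system, assembly, sequence region, start, stop and strand. The following is a minimal sketch, in plain Ruby with no database connection, that splits such a name apart. `SliceName`, `span` and `parse_display_name` are hypothetical helpers and not part of the API, and the field interpretation is an assumption read off the sample output.

```ruby
# Hypothetical helpers, not part of the Ensembl API.
# Field meanings are inferred from the sample output above.
SliceName = Struct.new(:coord_system, :assembly, :seq_region, :start, :stop, :strand) do
  # Number of bases covered, counting both endpoints.
  def span
    stop - start + 1
  end
end

def parse_display_name(name)
  coord_system, assembly, seq_region, start, stop, strand = name.split(':')
  SliceName.new(coord_system, assembly, seq_region,
                Integer(start), Integer(stop), Integer(strand))
end

# The two display names from the cow gene transformation above.
original = parse_display_name('chromosome:Btau_3.1:4:8104409:8496477:-1')
transformed = parse_display_name('scaffold:Btau_3.1:Chr4.003.10:1590801:1982869:1')

puts original.span
puts transformed.span
puts original.span == transformed.span
```

In this sample the gene covers the same number of bases on the chromosome and on the scaffold; only the coordinate system, position and strand differ.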
At the moment only the core database is covered; others such as funcgen, other_features and variation are not yet supported. Hopefully these will be added in the future. Marc Hoeppner from Stockholm University is working on the compara part, and Francesco Strozzi on variation.
The API is made available as a gem from gems.github.com. To install, type
sudo gem install jandot-ruby-ensembl-api --source http://gems.github.com
To help in the development, feel free to fork this repository.
Although this wiki might provide documentation in the future, for now all rdoc documentation and a tutorial are still available at the original RubyForge project page.
The API was initially under Subversion source control at RubyForge (http://bioruby-annex.rubyforge.org), but has since moved to GitHub.