jandot edited this page Sep 14, 2010 · 27 revisions

Introduction & history

The Ruby Ensembl API was first devised by Jan Aerts in June 2007, on release 45 of the core database. Its purpose is to mimic all functionality that’s available in the Perl API for that database. The API for the core database was worked out with the help of the Ensembl core team, while he was a “Geek for a Week”.

Unlike the Perl version, there is only one API for consecutive Ensembl releases. The underlying Ruby code picks up changes in the database schema automatically.
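The idea of picking up schema changes at runtime can be illustrated with a small, self-contained sketch. It does not touch the API or a database: the `SchemaDrivenRecord` class and both column lists are hypothetical, standing in for what an object-relational layer reads from a table definition when it connects. It is only meant to show why a column added in a later release would need no code change.

```ruby
# Hypothetical illustration, not the API's actual code: readers are
# generated from a column list discovered at runtime, so a new column
# needs no change to the class itself.
class SchemaDrivenRecord
  def initialize(columns, row)
    @values = columns.zip(row).to_h
    columns.each do |column|
      # Define one reader per column, on this object only.
      define_singleton_method(column) { @values[column] }
    end
  end
end

# Made-up column lists for two releases of the same table.
columns_old = [:gene_id, :biotype]
columns_new = [:gene_id, :biotype, :status]

old_gene = SchemaDrivenRecord.new(columns_old, [1, 'protein_coding'])
new_gene = SchemaDrivenRecord.new(columns_new, [1, 'protein_coding', 'KNOWN'])

puts old_gene.respond_to?(:status)  # false
puts new_gene.status                # KNOWN
```

The same class serves both column lists; only the data describing the table differs.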

A small example

require 'ensembl'

include Ensembl::Core

DBConnection.connect('homo_sapiens',50)

puts "== Get a slice =="
slice = Slice.fetch_by_region('chromosome','4',10000,99999,-1)
puts slice.display_name

puts "== Print all gene for that slice (regardless of coord_system genes are annotated on) =="
slice.genes.each do |gene|
  puts gene.stable_id + "\t" + gene.status + "\t" + gene.slice.display_name
end

puts "== Get a transcript and print its 5'UTR, CDS and protein sequence =="
transcript = Transcript.find_by_stable_id('ENST00000380593')
puts "5'UTR: " + transcript.five_prime_utr_seq
puts "CDS: " + transcript.cds_seq
puts "peptide: " + transcript.protein_seq

DBConnection.connection.disconnect!
DBConnection.connect('bos_taurus',45)

puts "== Transforming a cow gene from chromosome level to scaffold level =="
gene = Gene.find(2408)
cloned_gene = gene.transform('scaffold')
puts "Original: " + gene.slice.display_name
puts "Now: " + cloned_gene.slice.display_name

puts "== What things are related to a 'gene' object? =="
puts 'Genes belong to: ' + Gene.reflect_on_all_associations(:belongs_to).collect{|a| a.name.to_s}.join(',')
puts 'Genes have many: ' + Gene.reflect_on_all_associations(:has_many).collect{|a| a.name.to_s}.join(',')

Output is:


== Get a slice ==
chromosome:NCBI36:4:10000:99999:-1
== Print all gene for that slice (regardless of coord_system genes are annotated on) ==
ENSG00000197701 KNOWN chromosome:NCBI36:4:43227:77340:1
ENSG00000207643 NOVEL chromosome:NCBI36:4:55032:55124:1
== Get a transcript and print its 5'UTR, CDS and protein sequence ==
5'UTR: ggaggaggtgaggagggtttgctgggtgg…agcactaggtcttcccgtcacctccacctctctcc
CDS: atgacccggctctgcttacccagacccgaagcacgtg…caaccccatcccactgcctgtgtctgttga
peptide: MTRLCLPRPEAREDPIPVPP…HDSPRRHSGFGSIEGQPHPTACVC*
== Transforming a cow gene from chromosome level to scaffold level ==
Original: chromosome:Btau_3.1:4:8104409:8496477:-1
Now: scaffold:Btau_3.1:Chr4.003.10:1590801:1982869:1
== What things are related to a 'gene' object? ==
Genes belong to: seq_region
Genes have many: object_xrefs,attrib_types,xrefs,transcripts,gene_attribs
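
The slice names in this output appear to follow a colon-separated layout: coordinate system, assembly version, sequence region name, start, stop and strand. As a rough sketch under that assumption, such a name can be taken apart in plain Ruby. The `parse_display_name` helper and the `SliceName` struct are hypothetical and not part of the API.

```ruby
# Hypothetical helper, not part of the API: split a slice display name
# into its fields, assuming the six-field layout seen in the output above.
SliceName = Struct.new(:coord_system, :version, :seq_region, :start, :stop, :strand)

def parse_display_name(name)
  fields = name.split(':')
  raise ArgumentError, "expected 6 fields, got #{fields.size}" unless fields.size == 6

  coord_system, version, seq_region, start, stop, strand = fields
  SliceName.new(coord_system, version, seq_region,
                Integer(start), Integer(stop), Integer(strand))
end

slice_name = parse_display_name('chromosome:NCBI36:4:10000:99999:-1')
puts slice_name.seq_region                   # 4
puts slice_name.stop - slice_name.start + 1  # 90000
```

Converting start, stop and strand to integers makes it easy to compute the slice length or compare strands, as in the cow gene above, where the strand changes from -1 to 1 after the transform.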

At the moment only the core database is covered; others such as funcgen, other_features and variation are not. Hopefully these will be added in the future. Marc Hoeppner from Stockholm University is working on the compara part, and Francesco Strozzi on variation.

Installation

The API is made available as a gem from gems.github.com. To install, type


sudo gem install jandot-ruby-ensembl-api --source http://gems.github.com

To help in the development, feel free to fork this repository.

Documentation

Although this wiki might provide documentation in the future, for now all RDoc documentation and a tutorial are still available at the original RubyForge project page.

Source control

The API was initially under Subversion source control at RubyForge (http://bioruby-annex.rubyforge.org), but has since moved to GitHub.
