jandot edited this page Jan 10, 2011 · 27 revisions

Introduction and history

The Ruby Ensembl API was first devised by Jan Aerts in June 2007, against release 45 of the core database. Its purpose was to mimic the functionality available in the Perl API for that database. The API for the core database was worked out with the help of the Ensembl core team during a “Geek for a Week” visit.

An example

require 'ensembl'

include Ensembl::Core

DBConnection.connect('homo_sapiens',50)

puts "== Get a slice =="
slice = Slice.fetch_by_region('chromosome','4',10000,99999,-1)
puts slice.display_name

puts "== Print all genes for that slice (regardless of coord_system genes are annotated on) =="
slice.genes.each do |gene|
  puts gene.stable_id + "\t" + gene.status + "\t" + gene.slice.display_name
end

puts "== Get a transcript and print its 5'UTR, CDS and protein sequence =="
transcript = Transcript.find_by_stable_id('ENST00000380593')
puts "5'UTR: " + transcript.five_prime_utr_seq
puts "CDS: " + transcript.cds_seq
puts "peptide: " + transcript.protein_seq

DBConnection.connection.disconnect!
DBConnection.connect('bos_taurus',45)

puts "== Transforming a cow gene from chromosome level to scaffold level =="
gene = Gene.find(2408)
gene_on_scaffold = gene.transform('scaffold')
puts "Original: " + gene.slice.display_name
puts "Now: " + gene_on_scaffold.slice.display_name

puts "== What things are related to a 'gene' object? =="
puts 'Genes belong to: ' + Gene.reflect_on_all_associations(:belongs_to).collect{|a| a.name.to_s}.join(',')
puts 'Genes have many: ' + Gene.reflect_on_all_associations(:has_many).collect{|a| a.name.to_s}.join(',')

Output is:

== Get a slice ==
chromosome:NCBI36:4:10000:99999:-1
== Print all genes for that slice (regardless of coord_system genes are annotated on) ==
ENSG00000197701 KNOWN chromosome:NCBI36:4:43227:77340:1
ENSG00000207643 NOVEL chromosome:NCBI36:4:55032:55124:1
== Get a transcript and print its 5'UTR, CDS and protein sequence ==
5'UTR: ggaggaggtgaggagggtttgctgggtgg…agcactaggtcttcccgtcacctccacctctctcc
CDS: atgacccggctctgcttacccagacccgaagcacgtg…caaccccatcccactgcctgtgtctgttga
peptide: MTRLCLPRPEAREDPIPVPP…HDSPRRHSGFGSIEGQPHPTACVC*
== Transforming a cow gene from chromosome level to scaffold level ==
Original: chromosome:Btau_3.1:4:8104409:8496477:-1
Now: scaffold:Btau_3.1:Chr4.003.10:1590801:1982869:1
== What things are related to a 'gene' object? ==
Genes belong to: seq_region
Genes have many: object_xrefs,attrib_types,xrefs,transcripts,gene_attribs
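The display names printed above are colon-separated strings of the form coord_system:version:seq_region:start:stop:strand. As a minimal sketch of how such a string breaks down, the snippet below splits one into its parts. The SliceName struct and the parse_display_name helper are hypothetical names made up for this illustration; they are not part of the gem.

```ruby
# Hypothetical helper for illustration only; not part of the ruby-ensembl-api gem.
SliceName = Struct.new(:coord_system, :version, :seq_region, :start, :stop, :strand)

# Split a display name such as "chromosome:NCBI36:4:10000:99999:-1"
# into its six colon-separated fields, converting the numeric ones to integers.
def parse_display_name(name)
  coord_system, version, seq_region, start, stop, strand = name.split(':')
  SliceName.new(coord_system, version, seq_region, start.to_i, stop.to_i, strand.to_i)
end

parsed = parse_display_name('chromosome:NCBI36:4:10000:99999:-1')
puts "#{parsed.seq_region}: #{parsed.start}-#{parsed.stop} (strand #{parsed.strand})"
```

The same split works for the scaffold-level name in the cow example, since the scaffold identifier itself contains no colons.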

At the moment only the core and variation databases are covered; others such as compara, funcgen and other_features are not yet supported. Hopefully these will be added in the future. Marc Hoeppner from Stockholm University is working on the compara part.

Comparison to the Perl API

This Ruby API to the Ensembl databases is very much inspired by the Perl API provided by the Ensembl team. Given that they are two different languages, there are of course some differences:

  • There is only one API for the different Ensembl releases. In the Perl API, users need to load the version of the API that matches the database release they want to work with.
  • The Slice class is defined slightly differently. In the Perl API, the “slice” of an object is the whole seq_region (read: chromosome) that the object is defined on. For example: the “slice” of the gene BRCA2 is chromosome 13. In contrast, the “slice” in the Ruby API is delineated by the start and stop positions of that object; the “slice” for the same gene using the Ruby API is chromosome:GRCh37:13:32889611:32973347:1. This makes additional functionality available for the Ruby objects. You can, for example, check whether one object overlaps with or is contained within another object: gene1.overlaps?(gene2).
  • Ruby’s introspection makes it possible to investigate the structure of the database from within the code, and for example check what types of object are related to e.g. a gene: Gene.reflect_on_all_associations(:belongs_to) reports that a gene “belongs to” a seq_region and an analysis.
  • + binaries
  • - only core and variation
  • - not developed by Ensembl core team
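Because a slice in the Ruby API is bounded by the object's own start and stop, overlap and containment checks reduce to simple interval comparisons. The sketch below illustrates that idea with a hypothetical Region struct; it does not use the gem's classes, and the within? method name is an assumption made for this example (only overlaps? is named above).

```ruby
# Hypothetical stand-in for an object whose slice is delimited by start and stop;
# the real gem's classes are not used here.
Region = Struct.new(:seq_region, :start, :stop) do
  # True when both regions sit on the same seq_region and share at least one base.
  def overlaps?(other)
    seq_region == other.seq_region && start <= other.stop && other.start <= stop
  end

  # True when this region lies entirely inside the other one.
  # (Method name chosen for this sketch only.)
  def within?(other)
    seq_region == other.seq_region && start >= other.start && stop <= other.stop
  end
end

# Coordinates taken from the example output and the BRCA2 slice quoted above.
gene1 = Region.new('4', 43_227, 77_340)
gene2 = Region.new('4', 55_032, 55_124)
gene3 = Region.new('13', 32_889_611, 32_973_347)

puts gene1.overlaps?(gene2)  # true
puts gene2.within?(gene1)    # true
puts gene1.overlaps?(gene3)  # false
```

With whole-chromosome slices, as in the Perl API, the same comparison would report every pair of genes on one chromosome as overlapping, which is why the narrower definition is useful here.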

Installation

The API is made available as a gem from gems.github.com. To install, type

gem install ruby-ensembl-api

To help in the development, feel free to fork this repository.

Documentation

An extensive tutorial is available here. This tutorial is a Ruby version of the Perl tutorial available at the Ensembl website (with permission).

Full documentation on classes and methods can be found here.

To do

  • Add additional databases: compara, …
