Skip to content

Issues with annotation of variants #13

@SamakshSingh99

Description

@SamakshSingh99

Hi!
I used the following command to run the VarGen pipeline but I am getting the following error:

> disease_variants <- vargen_pipeline(vargen_dir = "./vargen_data/", 
+                                     omim_morbid_ids = c("125853",
+                                                         "222100", 
+                                                         "601665", 
+                                                         "604302",
+                                                         "212750",
+                                                         "145500",
+                                                         "603813",
+                                                         "166710",
+                                                         "266600",
+                                                         "610938",
+                                                         "601367",
+                                                         "223100",
+                                                         "600807"),
+                                     gtex_tissues = gtex_tissue,
+                                     gwas_traits = disease_traits, 
+                                     verbose = T)
[1] "Connecting to the gene mart..."
[1] "Connecting to the snp mart..."
[1] "Building the gwascat object..."
[1] "Reading the enhancer tss association file for FANTOM5... './vargen_data//enhancer_tss_associations.bed'"
[1] "Starting the pipeline..."
[1] "Getting genes for OMIM: 125853"
[1] "Getting genes for OMIM: 222100"
[1] "Getting genes for OMIM: 601665"
[1] "Getting genes for OMIM: 604302"
[1] "Getting genes for OMIM: 212750"
[1] "Getting genes for OMIM: 145500"
[1] "Getting genes for OMIM: 603813"
[1] "Getting genes for OMIM: 166710"
[1] "Getting genes for OMIM: 266600"
[1] "Getting genes for OMIM: 610938"
[1] "Getting genes for OMIM: 601367"
[1] "Getting genes for OMIM: 223100"
[1] "Getting genes for OMIM: 600807"
[1] "Writing the list of genes to: .//genes_info.tsv"
[1] "Getting the GTEx variants..."
[1] "Loading GTEx lookup table... Please be patient"
|--------------------------------------------------|
|==================================================|
[1] "Number of GTEx ids removed (no corresponding rsid): 135"
Error in vecseq(f__, len__, if (allow.cartesian || notjoin || !anyDuplicated(f__,  : 
  Join results in 1895121 rows; more than 471669 = nrow(x)+nrow(i). Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and data.table issue tracker for advice.
In addition: Warning messages:
1: In vargen_pipeline(vargen_dir = "./vargen_data/", omim_morbid_ids = c("125853",  :
  Gene mart not provided (or not a valid Mart object).We used one from connect_to_gene_ensembl() instead.
2: In vargen_pipeline(vargen_dir = "./vargen_data/", omim_morbid_ids = c("125853",  :
  Snp mart not provided (or not a valid Mart object).We used one from connect_to_snp_ensembl() instead.
3: In type.convert.default(X[[i]], ...) :
  'as.is' should be specified by the caller; using TRUE
4: In type.convert.default(X[[i]], ...) :
  'as.is' should be specified by the caller; using TRUE
5: In type.convert.default(X[[i]], ...) :
  'as.is' should be specified by the caller; using TRUE
6: In type.convert.default(X[[i]], ...) :
  'as.is' should be specified by the caller; using TRUE
7: In type.convert.default(X[[i]], ...) :
  'as.is' should be specified by the caller; using TRUE

Following gtex tissues were taken into consideration :

gtex_tissue <- select_gtex_tissues("./vargen_data/GTEx_Analysis_v8_eQTL/", 
                                   c("adipose",
                                     "artery_coronary",
                                     "pancreas", 
                                     "breast", 
                                     "heart", 
                                     "liver",
                                     "stomach",
                                     "lung",
                                     "fibroblasts",
                                     "small_intestine", "whole_blood", "brain"))

Also, when I am not including the gtex tissue in the pipeline I am able to run the pipeline with some warnings but unable to perfrom annotations:

Running Pipeline :

> disease_variants <- vargen_pipeline(vargen_dir = "./vargen_data/", 
+                                     omim_morbid_ids = c("125853",
+                                                         "222100", 
+                                                         "601665", 
+                                                         "604302",
+                                                         "212750",
+                                                         "145500",
+                                                         "603813",
+                                                         "166710",
+                                                         "266600",
+                                                         "610938",
+                                                         "601367",
+                                                         "223100",
+                                                         "600807"),
+                                     gwas_traits = disease_traits, 
+                                     verbose = T)
[1] "Connecting to the gene mart..."
[1] "Connecting to the snp mart..."
[1] "Building the gwascat object..."
[1] "Reading the enhancer tss association file for FANTOM5... './vargen_data//enhancer_tss_associations.bed'"
[1] "Starting the pipeline..."
[1] "Getting genes for OMIM: 125853"
[1] "Getting genes for OMIM: 222100"
[1] "Getting genes for OMIM: 601665"
[1] "Getting genes for OMIM: 604302"
[1] "Getting genes for OMIM: 212750"
[1] "Getting genes for OMIM: 145500"
[1] "Getting genes for OMIM: 603813"
[1] "Getting genes for OMIM: 166710"
[1] "Getting genes for OMIM: 266600"
[1] "Getting genes for OMIM: 610938"
[1] "Getting genes for OMIM: 601367"
[1] "Getting genes for OMIM: 223100"
[1] "Getting genes for OMIM: 600807"
[1] "Writing the list of genes to: .//genes_info.tsv"
[1] "No values for 'gtex_tissues', skipping GTEx step..."
[1] "Getting the gwas variants,,,"
[1] "Writing the variants to .//vargen_variants.tsv"
Warning messages:
1: In vargen_pipeline(vargen_dir = "./vargen_data/", omim_morbid_ids = c("125853",  :
  Gene mart not provided (or not a valid Mart object).We used one from connect_to_gene_ensembl() instead.
2: In vargen_pipeline(vargen_dir = "./vargen_data/", omim_morbid_ids = c("125853",  :
  Snp mart not provided (or not a valid Mart object).We used one from connect_to_snp_ensembl() instead.
3: In type.convert.default(X[[i]], ...) :
  'as.is' should be specified by the caller; using TRUE
4: In type.convert.default(X[[i]], ...) :
  'as.is' should be specified by the caller; using TRUE
5: In type.convert.default(X[[i]], ...) :
  'as.is' should be specified by the caller; using TRUE
6: In type.convert.default(X[[i]], ...) :
  'as.is' should be specified by the caller; using TRUE
7: In type.convert.default(X[[i]], ...) :
  'as.is' should be specified by the caller; using TRUE
8: In (function (seqlevels, genome, new_style)  :
  cannot switch GRCh38's seqlevels from NCBI to UCSC style

Performing annotation:

> disease_annotation <- annotate_variants(disease_variants$rsid, verbose = T)
Error in if (cj == upper) next : missing value where TRUE/FALSE needed
In addition: Warning message:
In (1:g) * nnm : NAs produced by integer overflow

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions