How to deal with an incomplete data set and non-ENTREZ IDs (novel isoforms) #6

@NikoLichi

Hi there,
I am giving this tool a try instead of my usual R packages, and so far I have had to modify some of the code in the documentation at https://spycone.readthedocs.io/en/latest/gene-level-workflow.html#Prepare-the-dataset. It would be great to have a more up-to-date version of it.

Now, my questions:

  1. I normally do not use ENTREZ IDs (I use ENSEMBL IDs), and I am also working with novel isoforms, which means that not all of them have regular gene names. How could this be handled in the pipelines (transcript and gene level)?
  2. I have 5 time points and 5 replicates for each time point, but unfortunately, one of the samples needed to be removed from the data set due to quality issues. Then, when creating the Spycone object, the function complains. Is there a way to solve this?

Here is the error:
`Cell In[24], line 1
----> 1 tp5_dset = spy.dataset(ts=df_counts_sort,
2 gene_id = gene_list,
3 symbs=gene_list,
4 species=9606,
5 reps1 = 5,
6 timepts = 5)

File ~/miniconda3/envs/jypyTimeSeries/lib/python3.11/site-packages/spycone/DataSet.py:126, in dataset.__init__(self, ts, species, reps1, timepts, gtf, gene_id, transcript_id, timeserieslist, symbs, discretization_steps)
123 self.ts[0] = np.array(self.ts[0], dtype="double")
125 if self.timepts*self.reps1 != self.ts[0].shape[1]:
--> 126 raise ValueError("Number of columns is not the same as number of time points.")
128 if self.species not in self.SPECIES:
129 raise ValueError("Please provide a supported species ID.")

ValueError: Number of columns is not the same as number of time points.`
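
For context, the check at line 125 of the traceback compares `timepts * reps1` with the number of columns, so a table with 24 columns cannot pass with `reps1 = 5` and `timepts = 5`. Below is a minimal sketch of one possible workaround, not a Spycone feature: it pads the table back to 25 columns by inserting the row-wise mean of the remaining replicates of the affected time point. The helper names (`check_layout`, `impute_missing_replicate`), the column names, and the toy counts are all hypothetical, and the sketch assumes the columns are ordered by time point first and replicate second. Whether an imputed replicate is acceptable for the downstream analysis is a separate question.

```python
import numpy as np
import pandas as pd


def check_layout(counts: pd.DataFrame, timepts: int, reps: int) -> bool:
    """Mirror the check in the traceback: timepts * reps must equal the column count."""
    return timepts * reps == counts.shape[1]


def impute_missing_replicate(counts: pd.DataFrame, reps: int,
                             missing_tp: int, new_name: str) -> pd.DataFrame:
    """Insert one column for the time point that lost a replicate.

    Assumes (hypothetically) that columns are grouped by time point, that
    exactly one replicate is missing, and that `missing_tp` is 0-based.
    The new column is the row-wise mean of the remaining replicates.
    """
    start = missing_tp * reps          # first column of the affected time point
    n_present = reps - 1               # replicates that survived QC
    present = counts.iloc[:, start:start + n_present]
    padded = counts.copy()
    padded.insert(start + n_present, new_name, present.mean(axis=1))
    return padded


# Toy data standing in for the real count table: 5 time points x 5 replicates.
rng = np.random.default_rng(0)
cols = [f"tp{t}_rep{r}" for t in range(1, 6) for r in range(1, 6)]
full = pd.DataFrame(rng.poisson(50, size=(4, 25)).astype(float), columns=cols)

# Simulate the sample removed for quality reasons (hypothetical choice).
df_counts = full.drop(columns="tp3_rep4")
print(check_layout(df_counts, timepts=5, reps=5))   # False: 24 columns

padded = impute_missing_replicate(df_counts, reps=5, missing_tp=2,
                                  new_name="tp3_imputed")
print(check_layout(padded, timepts=5, reps=5))      # True: 25 columns
```

The padded table would then be what gets passed as `ts`. An alternative under the same assumptions is to drop one replicate from every time point and set `reps1 = 4`, which avoids imputation at the cost of discarding four real samples.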

Thanks and all the best,
Nicolas
