Prebuilt databases do not have nodes.dmp for the use of classifiedRefiner module; segfault if using gtdb_r226 instead of gtdb_r220+virus+human database

(1) The prebuilt database worked well for the classify workflow, but when I proceeded to remove the unclassified/human portions, it failed. Should I just copy the NCBI taxdump files into the database directory?
(2) Is it normal to have ~95% of reads unclassified? What should I expect?
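
For reference, here is a minimal sketch of how the unclassified fraction in (2) can be tallied from the classifications file. It assumes the first tab-separated column of `*_classifications.tsv` is a 0/1 "classified or not" flag; that column layout is my assumption and is not shown in the log below. The function name `unclassified_fraction` is only illustrative.

```python
import csv


def unclassified_fraction(tsv_path):
    """Fraction of rows whose first column is "0" (assumed: unclassified)."""
    total = 0
    unclassified = 0
    with open(tsv_path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank lines and anything that is not a 0/1 flag
            # (for example a header line, if one is present).
            if not row or row[0] not in ("0", "1"):
                continue
            total += 1
            if row[0] == "0":
                unclassified += 1
    # Avoid dividing by zero on an empty file.
    return unclassified / total if total else 0.0


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(f"{unclassified_fraction(sys.argv[1]):.1%} unclassified")
```

Run as a script with the path to the classifications file as its only argument, it prints the percentage of rows flagged as unclassified.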

```
classifiedRefiner 07a.metabuli/JL304_B27_2_classifications.tsv ../metabuli/gtdb+virus+human --threads 4 --remove-unclassified --report 1

Metabuli Version (commit):                                                                      1.1.1
Remove unclassified reads                                                                       true
Exclude taxId as well as its children                                                       
Select taxId as well as its children                                                        
Select columns with number, (7:full lineage, generated if absent)                           
Make report of refined classification file                                                      true
Adjust classification to the specified rank                                                 
0: without higher rank, 1: with higher rank, 2: separate file for higher rank classification    0
Threads                                                                                         4
Min. sequence similarity score                                                                  0

Loading nodes file ...File ../metabuli/gtdb+virus+human/nodes.dmp not found!
```
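
The failure above comes down to a file missing from the database directory, so a small pre-flight check can catch it before launching the refiner. This is only a sketch: the helper name `missing_taxonomy_files` is hypothetical, and `nodes.dmp` is the only file the log confirms is needed. Any other taxdump files passed in `required` would be guesses.

```python
from pathlib import Path


def missing_taxonomy_files(db_dir, required=("nodes.dmp",)):
    """Return the names in `required` that are not regular files in db_dir."""
    db = Path(db_dir)
    return [name for name in required if not (db / name).is_file()]


if __name__ == "__main__":
    # Database path taken from the command in the log above.
    missing = missing_taxonomy_files("../metabuli/gtdb+virus+human")
    if missing:
        print("missing from database directory:", ", ".join(missing))
    else:
        print("all required taxonomy files found")
```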

(3) In addition, the classify workflow only worked for trimmed reads with the prebuilt gtdb_r220+virus+human database, not with the prebuilt gtdb_r226. Every other combination I tried ended in a segfault. I wonder whether this is caused by the large data size relative to the available RAM (remote server max: 500 GB; input file size: ~5 GB, paired).

Tom

Prebuilt databases do not have nodes.dmp for the use of classifiedRefiner module; segfault if using gtdb_r226 instead of gtdb_r220+virus+human database #175

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Prebuilt databases do not have nodes.dmp for the use of classifiedRefiner module; segfault if using gtdb_r226 instead of gtdb_r220+virus+human database #175

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions