Open
Description
The substitute matrix, S, shows a high load imbalance: 113.14 in the run below, compared with 3.12 for A and 2.57 for AS. Fixing this may require keeping a randomized mapping of k-mers to k-mer IDs.
See the email thread "About Large Runs" from 12/29/2019. Here's a logfile from a run that shows this effect and also fails with std::bad_alloc after the AS matrix is reported.
Process Grid (p x p x t): 68 x 68 x 2
INFO: Program started on Sat Dec 28 20:04:12 2019
INFO: Job ID knl_fa_shuff_subs25/knl_fa_shuff_subs25_c61ed871-1547-4285-8082-f05137282334
Parameters...
Input file (-i): /global/cscratch1/sd/esaliya/data/isolates/archaea/sanitized_2728834_impure_2729008_len_lte_2000_in_shuffled_isolates_proteins_archaea.fasta
Original sequence count (-c): 2728834
Kmer length (k): 6
Kmer stride (s): 1
Overlap in bytes (-O): 10000
Max seed count (--sc): 1
Gap open penalty (-g): -11
Gap extension penalty (-e): -2
Overlap file (--of): None
Alignment file (--af): knl_fa_shuff_subs25/knl_fa_shuff_subs25_align.txt
Alignment write frequency (--afreq): 100000
No align (--na): False
Full align (--fa): True
Xdrop align (--xa): False
Banded align (--ba): False
Index map (--idxmap): knl_fa_shuff_subs25_archaea_idx_map.txt
Alphabet (--alph): 0
Use substitute kmers (--subs): True | sub kmers: 25
Creating fileknl_fa_shuff_subs25_archaea_idx_map.txt with 41438932 bytes
File knl_fa_shuff_subs25_archaea_idx_map.txt is actually 41438932 bytes seen from process 4623
INFO: Modfied sequence count
Final sequence count: 2728822 (0.000440% removed)
Matrix A:
Load imbalance: 3.118424
As a whole: 2728822 rows and 244140625 columns and 718716196 nonzeros
Matrix At: As a whole: 244140625 rows and 2728822 columns and 718716196 nonzeros
Matrix S:
Load imbalance: 113.142021
As a whole: 244140625 rows and 244140625 columns and 723834658 nonzeros
Matrix AS:
Load imbalance: 2.567925
As a whole: 2728822 rows and 244140625 columns and 10751320837 nonzeros
terminate called after throwing an instance of 'std::bad_alloc'
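A toy sketch of the randomized k-mer ID mapping proposed above, not the project's code. It assumes load imbalance means the maximum nonzero count per process divided by the mean on a p x p block distribution, which may differ from how the logged figures are computed. The names `load_imbalance` and `random_relabeling` are hypothetical, and the sizes are shrunk far below the 244140625 (25^6) k-mer IDs in the log.

```python
import random
from collections import Counter


def load_imbalance(coords, n, p):
    """Max nonzeros per block divided by mean nonzeros per block.

    Assumes an n x n matrix split into a p x p grid of contiguous blocks.
    """
    block = -(-n // p)  # ceil(n / p): rows (and columns) per block
    counts = Counter((r // block, c // block) for r, c in coords)
    mean = len(coords) / (p * p)
    return max(counts.values()) / mean


def random_relabeling(n, seed):
    """Return a random permutation of the IDs 0..n-1 (hypothetical helper)."""
    perm = list(range(n))
    random.Random(seed).shuffle(perm)
    return perm


# Toy matrix: every nonzero sits among the lowest n/8 IDs, so with
# contiguous blocks all of them land on a single process.
n, p = 1024, 4
rng = random.Random(0)
coords = {(rng.randrange(n // 8), rng.randrange(n // 8)) for _ in range(20000)}

# Rows and columns of S are both k-mer IDs, so the same permutation is
# applied to both dimensions.
perm = random_relabeling(n, seed=1)
shuffled = {(perm[r], perm[c]) for r, c in coords}

before = load_imbalance(coords, n, p)
after = load_imbalance(shuffled, n, p)
print(f"imbalance before: {before:.2f}, after: {after:.2f}")
```

At the scale in the log, an explicit permutation table would need one entry per possible k-mer, so a cheap bijective hash over the ID range might be a better fit; that is a design guess, not something tested here.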