1- Io_lib: Version 1.14.8
1+ Io_lib: Version 1.14.9
22=======================
33
44Io_lib is a library of file reading and writing code to provide a general
5- purpose trace file (and Experiment File) reading interface. The programmer
6- simply calls the (eg) read_reading to create a "Read" C structure with the
7- data loaded into memory. It has been compiled and tested on a variety
8- of unix systems, MacOS X and MS Windows.
5+ purpose SAM/BAM/CRAM, trace file (and Experiment File) reading
6+ interface. Programmatically {S,B,CR}AM can be manipulated using the
7+ scram_*() API functions while DNA Chromatogram ("trace") files can be
8+ read using the read_reading() function.
9+
10+ It has been compiled and tested on a variety of unix systems, MacOS X
11+ and MS Windows.
912
1013The directories below here contain the io_lib code. These support the
1114following file formats:
1215
16+ SAM/BAM sequence files
17+ CRAM sequence files
1318 SCF trace files
1419 ABI trace files
1520 ALF trace files
@@ -18,62 +23,73 @@ following file formats:
1823 SRF trace archives
1924 Experiment files
2025 Plain text files
21- SAM/BAM sequence files
22- CRAM sequence files
2326
2427These link together to form a single "libstaden-read" library supporting
2528all the file formats via a single read_reading (or fread_reading or
2629mfread_reading) function call and analogous write_reading functions
2730too. See the file include/Read.h for the generic 'Read' structure.
2831
29- See the CHANGES for a summary of older updates or ChangeLog for the
32+ See the CHANGES for a summary of older updates or git logs for the
3033full details.
3134
32- Version 1.14.8 (22nd April 2016)
35+ Version 1.14.9 (9th February 2017)
3336--------------
3437
35- * SAM: Small speed up to record parsing.
38+ Updates:
39+
40+ * BAM: Added CRC checking. Bizarrely this was absent here and in most
41+ other BAM implementations too. Pure BAM decode of an uncompressed
42+ BAM is around 9% slower and compressed BAM to compressed BAM is
43+ almost identical. The most significant hit is reading uncompressed
44+ BAM (and doing nothing else) which is 120% slower as CRC dominates.
45+ Options are available to disable the CRC checking in case this is an
46+ issue (scramble -!).
47+
48+ * CRAM: Now supports bgzipped fasta references.
49+
50+ * CRAM/SAM: Headers are now kept in the same basic type order while
51+ transcoding. (Eg all @PG before all @SQ, or vice versa, depending on
52+ input ordering.)
53+
54+ * CRAM: Compression level 1 is now faster but larger. (The old -1 and
55+ -2 were too similar.)
56+
57+ * CRAM: Improved compression efficiency in some files, when switching
58+ from sorted to unsorted data.
59+
60+ * CRAM: Various speedups relating to memory handling,
61+ multi-threaded performance and the rANS codec.
62+
63+ * CRAM: Block CRC checks are now only done when the block is used,
64+ speeding up multi-threading and tools that do not decode all blocks
65+ (eg flagstat).
3666
37- * CRAM: Scramble now has -p and -P options to control whether to
38- force the BAM auxiliary sizes (8 vs 16 vs 32-bit integer quantities)
39- rather than reducing to smallest size required, and whether to
40- preserve the order of auxiliary tags including RG, NM and MD.
67+ * Scramble -g and -G options to generate and reuse bgzip indices when
68+ reading and writing BAM files.
4169
42- This latter option requires storing these values verbatim instead of
43- regenerating them on-the-fly, but note this only preserves tag order
44- with Scramble / Htslib. Htsjdk will still produce these fields out
45- of order.
70+ * Scramble -q option to omit updating the @PG header records.
4671
47- * CRAM no longer stores data in the CORE block, permitting greater
48- flexibility in choosing which fields to decode. (This change is
49- also mirrored in htslib and htsjdk.)
72+ * Experimental cram_filter tool has been added, to rapidly produce
73+ cram subsets.
5074
51- * CRAM: ref.fai files in a different order to @SQ headers should now
52- work correctly.
75+ * Migrated code base to git, with GitHub as the primary repository.
5376
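The BAM CRC check listed above can be illustrated with a minimal sketch. It assumes the check is the standard CRC-32 that gzip (and therefore each BGZF block) stores for the uncompressed payload. The function names crc32_update and block_crc_ok are hypothetical and are not io_lib's API; a real build would more likely call zlib's crc32(), which is table-driven and much faster than this bitwise loop.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 using the reflected polynomial 0xEDB88320, the same
 * checksum that gzip stores after each compressed member.
 * Pass crc = 0 for the first call; pass the previous result to continue.
 */
uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++) {
            /* Shift right; XOR in the polynomial if the low bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/*
 * Hypothetical helper: returns 1 if the uncompressed block payload
 * matches the CRC stored alongside it, 0 otherwise.
 */
int block_crc_ok(const unsigned char *data, size_t len, uint32_t stored_crc)
{
    return crc32_update(0, data, len) == stored_crc;
}
```

A reader would call block_crc_ok() on each block after inflating it and treat a 0 result as corruption. The per-byte work in the loop is why a checksum pass can dominate when nothing else is done with the data.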
54- * CRAM required-fields parameters no longer forces quality decoding
55- when asking for sequence.
77+ Bug fixes:
5678
57- * CRAM: More robustness / safety checks during decoding; itf8 bounds
58- checks, running out of memory, bounds checks in BETA codec, and
59- more.
79+ * BAM: Fixed the bin value calculation for placed but unmapped reads.
6080
61- * CRAM auto-generated read names are consistent regardless of range
62- queries. They also now match those produced by htslib.
81+ * CRAM: Fixed file descriptor leak in refs_load_fai().
6382
64- * A few compiler warnings in cram_dump / cram_size have gone away.
65- Many small CRAM code tweaks to aid comparisons to htslib. It should
66- also be easier to build under Microsoft Visual Studio (although no
67- project file is provided).
83+ * CRAM: Fixed a crash in MD5 calculation for sequences beyond the
84+ reference end.
6885
69- * CRAM: the rANS codec should now be slightly faster at decoding.
86+ * CRAM: Bug fixes when encoding malformed @SQ records.
7087
71- * CRAM bug fix: removed potential (but unobserved) possibility of
72- 8-bit quantities stored as a 16-bit value in BAM being converted
73- incorrectly within CRAM.
88+ * CRAM: Fixed a rare renormalisation bug in rANS codec.
7489
75- * SAM bug fix: no more complaining about "unknown" sort order.
90+ * Fixed tests so that make -j works.
7691
92+ * Removed ancient, broken and unused popen() code.
7793
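For context on the BAM bin fix listed above, the sketch below shows the reg2bin calculation from the SAM specification. It assumes that a placed but unmapped read is treated as an alignment of length one, which is how the specification is understood here. The release note does not say what the earlier incorrect value was, so this is not a reconstruction of the bug. The helper bam_bin_for_read and its parameters are hypothetical, not io_lib functions.

```c
#include <stdint.h>

/*
 * reg2bin as given in the SAM specification: the bin number for a
 * zero-based, half-open interval [beg, end) in the standard
 * five-level binning scheme (16 kb bins at the finest level).
 */
int reg2bin(int64_t beg, int64_t end)
{
    --end;
    if (beg >> 14 == end >> 14) return (int)(((1 << 15) - 1) / 7 + (beg >> 14));
    if (beg >> 17 == end >> 17) return (int)(((1 << 12) - 1) / 7 + (beg >> 17));
    if (beg >> 20 == end >> 20) return (int)(((1 << 9) - 1) / 7 + (beg >> 20));
    if (beg >> 23 == end >> 23) return (int)(((1 << 6) - 1) / 7 + (beg >> 23));
    if (beg >> 26 == end >> 26) return (int)(((1 << 3) - 1) / 7 + (beg >> 26));
    return 0;
}

/*
 * Hypothetical helper. pos is the zero-based leftmost position and
 * ref_len the number of reference bases covered by the CIGAR.
 * An unmapped read, or one covering no reference bases, is assumed
 * to occupy a single base at pos.
 */
int bam_bin_for_read(int64_t pos, int64_t ref_len, int unmapped)
{
    int64_t span = (unmapped || ref_len < 1) ? 1 : ref_len;
    return reg2bin(pos, pos + span);
}
```

With this helper, a read placed at position 16000 with a 1000-base alignment spans two of the finest bins and so lands in a coarser one, while the same read flagged as unmapped stays in the finest-level bin that contains position 16000.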
7894
7995Building