Commit d02cef3

Merge branch 'dev'
2 parents 238640f + 3d603d1 commit d02cef3

13 files changed, +84 -22 lines changed

README.md

Lines changed: 28 additions & 5 deletions
@@ -30,7 +30,6 @@ You can also use conda to install *slow5tools* as `conda install slow5tools -c b
 
 ### Building a release
 
-
 Users are recommended to build from the [latest release](https://github.com/hasindu2008/slow5tools/releases) tar ball.
 
 Quick example for Ubuntu :
@@ -70,7 +69,7 @@ make
 
 ### Other building options
 
-- You can optionally enable [*zstd* compression](https://facebook.github.io/zstd) support when building *slow5lib* by invoking `make zstd=1`. This requires __zstd 1.3 or higher development libraries__ installed on your system (*libzstd1-dev* package for *apt*, *libzstd-devel* for *yum/dnf* and *zstd* for *homebrew*). SLOW5 files compressed with *zstd* offer slightly smaller file size and better performance compared to the default *zlib*. However, *zlib* runtime library is available by default on almost all distributions unlike *zstd* and thus files compressed with *zlib* will be more 'portable' (also see [notes](https://github.com/hasindu2008/slow5tools#notes)).
+- You can optionally enable [*zstd* compression](https://facebook.github.io/zstd) support when building *slow5lib* by invoking `make zstd=1`. This requires __zstd 1.3 or higher development libraries__ installed on your system (*libzstd1-dev* package for *apt*, *libzstd-devel* for *yum/dnf* and *zstd* for *homebrew*). SLOW5 files compressed with *zstd* offer smaller file size and better performance compared to the default *zlib*. However, *zlib* runtime library is available by default on almost all distributions unlike *zstd* and thus files compressed with *zlib* will be more 'portable' (also see [notes](https://github.com/hasindu2008/slow5tools#notes)).
 
 - *slow5tools* from version 0.3.0 onwards by default requires vector instructions (SSSE3 or higher for Intel/AMD and neon for ARM). If your processor is an ancient processor with no such vector instructions, invoke make as `make no_simd=1`.
 
@@ -85,11 +84,17 @@ make
 Similarly, to locally build *zstd* and link against that:
 
 ```
-scripts/install-zstd.sh # download and compiles HDF5 in the current folder
+scripts/install-zstd.sh # download and compiles zstd in the current folder
 ./configure --enable-localzstd
-make
+make # don't run make zstd=1. libzstd.a is statically linked this time.
 ```
 
+- On Mac M1 or in any system if `./configure` cannot find the hdf5 libraries installed through the package manager, you can specify the location as *LDFLAGS=-L/path/to/shared/lib/ CPPFLAGS=-I/path/to/headers/*. For example on Mac M1:
+```
+./configure LDFLAGS=-L/opt/homebrew/lib/ CPPFLAGS=-I/opt/homebrew/include/
+make
+```
+
 - You can build a docker image as follows.
 ```
 git clone https://github.com/hasindu2008/slow5tools && cd slow5tools
@@ -99,7 +104,7 @@ make
 
 ## Usage
 
-Visit the [man page](https://hasindu2008.github.io/slow5tools/commands.html) for all the commands and options.
+Visit the [man page](https://hasindu2008.github.io/slow5tools/commands.html) for all the commands and options. See [here](https://hasindu2008.github.io/slow5tools/oneliners.html) for example bash one-liners with slow5tools. A guide on using BLOW5 for archiving and steps to verify if data integrity is preserved is [here](https://hasindu2008.github.io/slow5tools/archive.html). A script for performing real-time FAST5 to BLOW5 conversion during sequencing is provided [here](https://github.com/hasindu2008/slow5tools/tree/master/scripts/realtime-f2s).
 
 ### Examples
 
@@ -141,6 +146,24 @@ slow5tools s2f blow5_dir -d fast5
 
 Visit [here](https://hasindu2008.github.io/slow5tools/workflows.html) for example workflows.
 
+
+### Troubleshooting/Questions
+
+Visit the [frequently asked questions](https://hasindu2008.github.io/slow5tools/faq.html) or open an [issue](https://github.com/hasindu2008/slow5tools/issues).
+
+
+### Upcoming features and optimisations
+
+Following are some features and optimisations in our todo list which will be implemented based on the need. If anyone is interested please request [here](https://github.com/hasindu2008/slow5tools/issues). Contributions are welcome.
+
+- pipelining input, processing and output in *merge, get, etc.* (expected runtime improvement upto 2X)
+- reading from stdin for *view*
+- binary releases for ARM64 processors on Linux
+- binary releases for MacOS
+- decoupling conversion modules (currently f2s and s2f; any future formats) so that slow5tools only deal with S/BLOW5 files and thus can be easily compiled. Currently, compiling slow5tools is not straight forward due to the HDF5 (FAST5) dependency
+- any other features that are potentially useful to many
+
+
 ### Notes
 
 *slow5lib* from version 0.3.0 onwards has built in [StreamVByte](https://github.com/lemire/streamvbyte) compression support to enable even smaller file sizes, which is applied to the raw signal by default when producing BLOW5 files. *zlib* compression is then applied by default to each record. If *zstd* is used instead of *zlib* on top of *StreamVByte*, it is similar to ONT's latest [vbz](https://github.com/nanoporetech/vbz_compression) compression. BLOW5 files compressed with *zstd+StreamVByte* are still significantly smaller than vbz compressed FAST5 files.

docs/getting_started.md

Lines changed: 6 additions & 0 deletions
@@ -84,6 +84,12 @@ make
 ./configure --enable-localzstd
 make
 ```
+
+- On Mac M1 or in any system if `./configure` cannot find the hdf5 libraries installed through the package manager, you can specify the location as *LDFLAGS=-L/path/to/shared/lib/ CPPFLAGS=-I/path/to/headers/*. For example on Mac M1:
+```
+./configure LDFLAGS=-L/opt/homebrew/lib/ CPPFLAGS=-I/opt/homebrew/include/
+make
+
 
 - You can build a docker image as follows.
 ```
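
The `./configure` override added in this hunk hard-codes `/opt/homebrew`, which is the Homebrew default on Apple Silicon. As a minimal sketch (not part of the commit), the prefix could instead be queried with `brew --prefix`, falling back to the path from the hunk when `brew` is not installed. The variable names `PREFIX` and `CONFIGURE_FLAGS` are illustrative assumptions, and the script only prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Sketch: build the configure flags from the Homebrew prefix.
# PREFIX and CONFIGURE_FLAGS are hypothetical names used only in this example.

if command -v brew >/dev/null 2>&1; then
    PREFIX=$(brew --prefix)      # e.g. /opt/homebrew on Apple Silicon
else
    PREFIX=/opt/homebrew         # fallback: the path used in the hunk above
fi

CONFIGURE_FLAGS="LDFLAGS=-L${PREFIX}/lib/ CPPFLAGS=-I${PREFIX}/include/"

# Print the command instead of executing it, so the sketch is safe to run
# outside a slow5tools source tree.
echo "./configure ${CONFIGURE_FLAGS}"
```

Inside a source tree, the printed line can be run as is, followed by `make`.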

docs/workflows.md

Lines changed: 8 additions & 1 deletion
@@ -33,7 +33,7 @@ f5c eventalign -t 8 -r reads.fq -g ref.fa -b reads.bam --slow5 signals.blow5 > m
 
 ## Nanopolish
 
-[Nanopolish](https://github.com/jts/nanopolish) master branch now supports SLOW5 file format.
+[Nanopolish](https://github.com/jts/nanopolish) version 0.14.0 onwards supports SLOW5 file format.
 
 ```bash
 #convert fast5 files to slow5 files using 8 I/O processes
@@ -57,3 +57,10 @@ slow5tools f2s fast5_dir -d blow5_dir -p 8
 # run sigmap
 ./sigmap -m -r ref.fa -p <model> -x index -s blow5_dir -o mapping.paf -t 8
 ```
+
+## Bonito
+
+SLOW5 support for ONT's Bonito basecaller is now available as a [pull request](https://github.com/nanoporetech/bonito/pull/252) along with usage instructions and benchmarks.
+
+
+

scripts/realtime-f2s/pipeline.sh

Lines changed: 14 additions & 2 deletions
@@ -12,6 +12,9 @@ die() {
 
 LOG=start_end_trace.log
 
+MAX_PROC=$(nproc)
+MAX_PROC=$(echo "${MAX_PROC}/2" | bc)
+
 ## Handle flags
 while getopts "d:l:f:" o; do
 case "${o}" in
@@ -35,10 +38,12 @@ shift $((OPTIND-1))
 
 $SLOW5TOOLS --version &> /dev/null || die "[pipeline.sh] slow5tools not found in path. Exiting."
 
+echo "[pipeline.sh] Starting pipeline with $MAX_PROC max processes"
 #test -e ${LOG} && rm ${LOG}
-
+counter=0
 while read FILE
 do
+(
 F5_FILEPATH=$FILE # first argument
 F5_DIR=${F5_FILEPATH%/*} # strip filename from .fast5 filepath
 PARENT_DIR=${F5_DIR%/*} # get folder one heirarchy higher
@@ -86,5 +91,12 @@ do
 
 echo "[pipeline.sh] $F5_FILEPATH" >> $TMP_FILE
 echo -e "${F5_FILEPATH}\t${SLOW5_FILEPATH}\t${START_TIME}\t${END_TIME}" >> ${LOG}
-
+)&
+((counter++))
+if [ $counter -ge $MAX_PROC ]; then
+echo "[pipeline.sh] Waiting for $counter jobs to finish."
+wait
+counter=0
+fi
 done
+wait
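
The change to `pipeline.sh` wraps the loop body in a background subshell and calls `wait` once `MAX_PROC` jobs have been started, so conversions run in batches of at most half the available cores. The control flow can be sketched in isolation as follows. This is an illustration under stated assumptions, not the script itself: `convert_one`, `DONE_LOG` and the input file names are hypothetical placeholders. The fallback of 2 cores when `nproc` is unavailable and the lower bound of one job are additions that are not in the commit.

```shell
#!/usr/bin/env bash
# Sketch of the batching pattern from pipeline.sh, reduced to its control flow.
# convert_one, DONE_LOG and the file names below are hypothetical placeholders.

# Half of the available cores, but never fewer than one job at a time.
CORES=$(nproc 2>/dev/null || echo 2)   # assumption: fall back to 2 if nproc is missing
MAX_PROC=$(( CORES / 2 ))
if [ "$MAX_PROC" -lt 1 ]; then
    MAX_PROC=1
fi

DONE_LOG=$(mktemp)

convert_one() {
    # Stand-in for the real FAST5 to BLOW5 conversion step.
    echo "converted $1" >> "$DONE_LOG"
}

counter=0
for f in a.fast5 b.fast5 c.fast5 d.fast5 e.fast5; do
    ( convert_one "$f" ) &             # run each job in a background subshell
    counter=$((counter + 1))
    if [ "$counter" -ge "$MAX_PROC" ]; then
        wait                           # block until the whole batch has finished
        counter=0
    fi
done
wait                                   # collect the final, possibly partial, batch

echo "up to $MAX_PROC parallel jobs; log written to $DONE_LOG"
```

One consequence of this design is that each batch finishes only when its slowest job does, so a new job never starts while any job from the current batch is still running.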

src/cmd.h

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 #ifndef CMD_H
 #define CMD_H
 
-#define SLOW5TOOLS_VERSION "0.4.0"
+#define SLOW5TOOLS_VERSION "0.4.0-dirty"
 
 #define DEFAULT_NUM_THREADS 8
 #define DEFAULT_NUM_PROCESSES 8

src/get.c

Lines changed: 2 additions & 6 deletions
@@ -124,10 +124,6 @@ int get_main(int argc, char **argv, struct program_meta *meta) {
 opt_t user_opts;
 init_opt(&user_opts);
 
-// Default options
-int32_t num_threads = DEFAULT_NUM_THREADS;
-int64_t read_id_batch_capacity = DEFAULT_BATCH_SIZE;
-
 // Input arguments
 char* read_list_file_in = NULL;
 
@@ -272,7 +268,7 @@ int get_main(int argc, char **argv, struct program_meta *meta) {
 
 // Setup multithreading structures
 core_t core;
-core.num_thread = num_threads;
+core.num_thread = user_opts.num_threads;
 core.fp = slow5file;
 core.format_out = user_opts.fmt_out;
 core.press_method = press_out;
@@ -287,7 +283,7 @@ int get_main(int argc, char **argv, struct program_meta *meta) {
 bool end_of_file = false;
 while (!end_of_file) {
 int64_t num_ids = 0;
-while (num_ids < read_id_batch_capacity) {
+while (num_ids < user_opts.read_id_batch_capacity) {
 char *buf = NULL;
 size_t cap_buf = 0;
 ssize_t nread;

src/view.c

Lines changed: 1 addition & 1 deletion
@@ -146,7 +146,7 @@ int view_main(int argc, char **argv, struct program_meta *meta) {
 EXIT_MSG(EXIT_FAILURE, argv, meta);
 return EXIT_FAILURE;
 } else if (optind != argc - 1) { // TODO handle more than 1 file?
-ERROR(">1 input file%s", "");
+ERROR("more than 1 input file is given%s", "");
 fprintf(stderr, HELP_SMALL_MSG, argv[0]);
 EXIT_MSG(EXIT_FAILURE, argv, meta);
 return EXIT_FAILURE;

test/data/exp/index/.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1 +1,2 @@
11
*.idx
2+
example_multi_rg_v0.2.0_none_none.blow5
450 Bytes
Binary file not shown.
