Commit 5645567

Merge branch 'master' of github.com:smithlabcode/smithlab_cpp
2 parents 350444a + 5b76fd5 commit 5645567

5 files changed: 30 additions, 38 deletions

README.md

Lines changed: 14 additions & 12 deletions
@@ -1,20 +1,22 @@
 # The smithlab_cpp library

-This library contains code that has been used in the Smith lab for
-several years, and that we now depend on for several of our data
-analysis tools. Many of those tools use older versions of this source
-code in subdirectories of other repos.
+This library contains code that has been used in the Smith lab for too
+many years, and that we depend on for several of our data analysis
+tools. Many of those tools use older versions of this source code in
+subdirectories of other repos.

 ## Requirements

-- A C++ compiler that knows C++11. The GNU `g++` compiler works well
-  for this after version 5.3.
-- The GNU Scientific Library, GSL, which is likely already on your
-  system, or easily installed through a package manager.
+- A C++ compiler that knows C++17. The GNU `g++` compiler works well
+  for this after version GCC9, and the default since GCC11. You can
+  get this from apt, conda and brew, but beware on macos where it's
+  not really installed by default even though it pretends to be.
 - The [zlib library](https://zlib.net), which we use for I/O of files
-  in gzip format. You likely have this on your system.
+  in gzip format. You likely have this on your system. You can get
+  this from apt, conda and brew, and it's likely installed already.
 - Optional: The [HTSLib library](http://htslib.org), which we use for
-  I/O of SAM and BAM format files.
+  I/O of SAM and BAM format files. You can get this from apt, conda
+  and brew.

 ## Building and installing the smithlab_cpp library

@@ -55,7 +57,7 @@ use it the way it has been used from 2010-2019, then you can use the
 `Makefile` in this repo without running the `./configure` script:
 ```
 $ make OptionParser.o
-g++ -Wall -std=c++11 -c -o OptionParser.o OptionParser.cpp
+g++ -Wall -std=c++17 -c -o OptionParser.o OptionParser.cpp
 ```
 Note: if you run the `./configure` script it will overwrite the
 `Makefile` indicated above. If that happens, just get a new one. The
@@ -72,7 +74,7 @@ result should be less total code overall in smithlab_cpp.
 - `QualityScore.*pp` likely should be removed, as we only use
   sequencing quality scores in specific places, and in those places
   have chosen to re-implement anything that would be here.
-- `smithlab_os.* Any use of character arrays should be replaced with
+- `smithlab_os.*` Any use of character arrays should be replaced with
   strings for filenames. Implementation of many functions in the cpp
   file is sloppy.
 - `smithlab_utils.*pp`: lots to replace here. Many functions seem
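
The Requirements hunk above raises the minimum language standard from C++11 to C++17. A minimal sketch of how a source file could check for that, assuming the compiler reports its standard through the `__cplusplus` macro. The helper name `supports_cpp17` is hypothetical and not part of smithlab_cpp:

```cpp
// Hypothetical helper, not part of smithlab_cpp. The default argument
// is the value of __cplusplus for the current translation unit;
// 201703L is the value the C++17 standard assigns to that macro.
constexpr bool
supports_cpp17(const long cplusplus_value = __cplusplus) {
  return cplusplus_value >= 201703L;
}
```

A line such as `static_assert(supports_cpp17(), "C++17 or later is required");` would then stop the build early under an older `-std=` setting.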

chromosome_utils.cpp

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ adjust_start_pos(const size_t orig_start, const string &chrom_name) {

 static size_t
 adjust_region_size(const size_t orig_start,
-                   const string &chrom_name,
+                   const string &chrom_name, // ADS: remove this soon
                    const size_t orig_size) {
   static const double LINE_WIDTH = 50.0;
   const size_t preceding_newlines_start =

sam_record.cpp

Lines changed: 3 additions & 3 deletions
@@ -112,13 +112,13 @@ sam_rec::sam_rec(const string &line) {
   istringstream iss; // ADS: change to set the buffer from "line"
   iss.rdbuf()->pubsetbuf(const_cast<char*>(line.c_str()), line.size());*/
   istringstream iss(line); // ADS: unfortunate macos stuff?
-  uint32_t will_become_mapq = 0; // to not read mapq as character
-                                 // since it's uint8_t
+  int32_t will_become_mapq = 0; // to not read mapq as character
+                                // since it's uint8_t
   if (!(iss >>
         qname >> flags >> rname >> pos >> will_become_mapq >>
         cigar >> rnext >> pnext >> tlen >> seq >> qual))
     throw runtime_error("incorrect SAM record:\n" + line);
-  if (mapq > 255)
+  if (will_become_mapq < 0 || will_become_mapq > 255)
     throw runtime_error("invalid mapq in SAM record: " + line);
   mapq = static_cast<uint8_t>(will_become_mapq);
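
The `sam_record.cpp` hunk reads MAPQ into a wider signed integer and range-checks that value before narrowing it. The removed check tested `mapq` itself, which the code comment describes as a `uint8_t` and which therefore can never exceed 255. A minimal sketch of the same pattern in isolation follows; `parse_mapq` and `read_as_char` are hypothetical names used only for illustration and do not appear in smithlab_cpp:

```cpp
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: parse a MAPQ field by extracting into a wider
// signed integer, range-checking, and only then narrowing to uint8_t.
inline std::uint8_t
parse_mapq(const std::string &field) {
  std::istringstream iss(field);
  std::int32_t wide = 0;
  if (!(iss >> wide))
    throw std::runtime_error("cannot parse mapq: " + field);
  if (wide < 0 || wide > 255)
    throw std::runtime_error("invalid mapq: " + field);
  return static_cast<std::uint8_t>(wide);
}

// For contrast: extracting directly into a uint8_t reads a single
// character, because uint8_t is a typedef for unsigned char on
// common platforms.
inline std::uint8_t
read_as_char(const std::string &field) {
  std::istringstream iss(field);
  std::uint8_t c = 0;
  iss >> c;
  return c;
}
```

With the input `"42"`, `parse_mapq` yields the number 42 while `read_as_char` yields the character `'4'`, which is the problem the `will_become_mapq` temporary works around.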

smithlab_os.cpp

Lines changed: 5 additions & 3 deletions
@@ -24,6 +24,7 @@
 #include <cmath>
 #include <unordered_map>
 #include <exception>
+#include <filesystem>

 #include "smithlab_os.hpp"
 #include "smithlab_utils.hpp"
@@ -535,9 +536,10 @@ read_dir(const string& dirname, vector<string> &filenames) {

 bool
 is_valid_output_file(const string &filename) {
-  const bool file_exists = (access(filename.c_str(), F_OK) == 0);
-  if (file_exists)
-    return (!isdir(filename.c_str()) &&
+  // ADS: seems like there is no way around "access" and apparently
+  // access is not a great solution anyway.
+  if (std::filesystem::exists(filename))
+    return (!std::filesystem::is_directory(filename) &&
             access(filename.c_str(), W_OK) == 0);
   else {
     // ADS: check if dir exists and is writeable
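
The hunk cuts off inside the `else` branch, so the rest of `is_valid_output_file` is not visible here. The following is a sketch of how the complete check might look, not the library's code. It assumes a POSIX system (for `access`) and assumes that a missing file is acceptable when its parent directory exists and is writable, as the trailing comment suggests. The name `can_write_output` is hypothetical:

```cpp
#include <filesystem>
#include <string>
#include <system_error>
#include <unistd.h> // access, W_OK (POSIX)

// Hypothetical sketch, not the smithlab_cpp implementation.
inline bool
can_write_output(const std::string &filename) {
  namespace fs = std::filesystem;
  std::error_code ec; // use the non-throwing overloads
  const fs::path p(filename);
  if (fs::exists(p, ec))
    // an existing target must be a writable non-directory
    return !fs::is_directory(p, ec) && access(p.c_str(), W_OK) == 0;
  // target is absent: its directory must exist and be writable
  fs::path parent = p.parent_path();
  if (parent.empty())
    parent = "."; // bare filename means the current directory
  return fs::is_directory(parent, ec) && access(parent.c_str(), W_OK) == 0;
}
```

As in the commit, `std::filesystem` answers the existence and directory questions while `access` still answers the permission question. The result is only advisory, since permissions can change between the check and the eventual open.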

smithlab_utils.hpp

Lines changed: 7 additions & 19 deletions
@@ -300,7 +300,6 @@ template <class T> std::string toa(T t) {
   return s.str();
 }

-
 ////////////////////////////////////////////////////////////////////////
 // Code for dealing with the DNA alphabet

@@ -338,8 +337,6 @@ bits2string_masked(size_t mask, size_t bits) {
   return s;
 }

-
-
 inline std::string
 bits2string_for_positions(size_t positions, size_t bits) {
   std::string s;
@@ -356,22 +353,14 @@ percent(const size_t a, const size_t b) {
   return static_cast<size_t>((100.0*a)/b);
 }

-
-
-
-
-////////////////////Code from Alphabet//////////////////////
-
-
-
-bool
-inline valid_base(char c) {
+inline bool
+valid_base(char c) {
   char i = std::toupper(c);
   return (i == 'A' || i == 'C' || i == 'G' || i == 'T');
 }

-size_t
-inline mer2index(const char *s, size_t n) {
+inline size_t
+mer2index(const char *s, size_t n) {
   size_t multiplier = 1, index = 0;
   do {
     --n;
@@ -381,9 +370,9 @@ inline mer2index(const char *s, size_t n) {
   return index;
 }

-size_t
-inline kmer_counts(const std::vector<std::string> &seqs,
-                   std::vector<size_t> &counts, size_t k) {
+inline size_t
+kmer_counts(const std::vector<std::string> &seqs,
+            std::vector<size_t> &counts, size_t k) {
   counts.clear();
   size_t nwords =
     static_cast<size_t>(pow(static_cast<float>(smithlab::alphabet_size),
@@ -404,7 +393,6 @@ inline kmer_counts(const std::vector<std::string> &seqs,
   return total;
 }

-
 /*
  * How to use the ProgressBar:
  *
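
The bodies of `mer2index` and `kmer_counts` are only partly visible in these hunks. The following is a sketch of the base-4 k-mer indexing that functions like these usually rely on, not the library's code. It assumes the encoding A=0, C=1, G=2, T=3 and an alphabet of size 4, neither of which is shown in the diff. The names `base_to_int`, `kmer_to_index` and `n_kmers` are hypothetical:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Assumed encoding (not taken from the diff): A=0, C=1, G=2, T=3.
// Any other character maps to 4 as an out-of-range sentinel.
inline std::size_t
base_to_int(const char c) {
  switch (std::toupper(static_cast<unsigned char>(c))) {
  case 'A': return 0;
  case 'C': return 1;
  case 'G': return 2;
  case 'T': return 3;
  default: return 4;
  }
}

// Treat the k-mer as a base-4 number whose first base is the most
// significant digit. The caller must pass only A, C, G or T.
inline std::size_t
kmer_to_index(const std::string &kmer) {
  std::size_t index = 0;
  for (const char c : kmer)
    index = 4 * index + base_to_int(c);
  return index;
}

// Number of distinct k-mers over a 4-letter alphabet, 4^k, computed
// with a shift rather than floating-point pow. Valid only while 2*k
// is smaller than the bit width of size_t.
inline std::size_t
n_kmers(const std::size_t k) {
  return std::size_t{1} << (2 * k);
}
```

For example, `kmer_to_index("ACGT")` is 0*64 + 1*16 + 2*4 + 3 = 27, and a count vector for k = 3 would need `n_kmers(3)` = 64 slots.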
