Commit 35d1202

Merge pull request #146 from pbashyal-nmdp/0.7.0_release
0.7.0 release
2 parents f362de1 + 9d5fdcf commit 35d1202

File tree

3 files changed

+111
-82
lines changed

README.md

Lines changed: 30 additions & 23 deletions
Original file line number | Diff line number | Diff line change
@@ -3,6 +3,14 @@
33

44
ARD reduction for HLA with Python
55

6+
`py-ard` works with Python 3.8 and higher.
7+
8+
## Install from PyPi
9+
10+
```shell
11+
pip install py-ard
12+
```
13+
614
## Install from source
715

816
```shell
@@ -11,13 +19,6 @@ source venv/bin/activate
1119

1220
python setup.py install
1321
```
14-
15-
## Install from PyPi
16-
17-
```shell
18-
pip install py-ard
19-
```
20-
2122
## Testing
2223

2324
To run behavior-driven development (BDD) tests locally via the behave framework, you'll need to set up a virtual
@@ -30,10 +31,15 @@ pip install -r test-requirements.txt
3031

3132
# Running Behave and all BDD tests
3233
behave
34+
35+
# Run unit-tests
36+
python -m unittest tests.test_pyard
3337
```
3438

3539
## Using `py-ard` from Python code
3640

41+
`py-ard` can be used in a program to reduce or expand an HLA GL String representation. If `pyard` encounters an invalid allele, it raises an exception rather than silently returning an empty result.
42+
3743
### Initialize `py-ard`
3844

3945
Import `pyard` package.
@@ -42,8 +48,7 @@ Import `pyard` package.
4248
import pyard
4349
```
4450

45-
The cache size of pre-computed reductions can be changed from the default of 1000
46-
51+
The cache size of pre-computed reductions can be changed from the default of 1000 (_not working_; will be fixed in a later release).
4752
```python
4853
pyard.max_cache_size = 1_000_000
4954
```
@@ -74,7 +79,7 @@ ard = pyard.ARD()
7479

7580
### Reduce Typings
7681

77-
Reduce a single locus HLA Typing
82+
Reduce a single locus HLA Typing.
7883

7984
```python
8085
allele = "A*01:01:01"
@@ -107,13 +112,13 @@ ard.redux_gl('B14', 'lg')
107112

108113
## Valid Reduction Types
109114

110-
|Reduction Type | Description |
111-
|-------------- |-------------|
112-
| `G` | Reduce to G Group Level |
113-
| `lg` | Reduce to 2 field ARD level (append `g`) |
114-
| `lgx` | Reduce to 2 field ARD level |
115-
| `W` | Reduce/Expand to 3 field WHO nomenclature level|
116-
| `exon` | Reduce/Expand to exon level |
115+
| Reduction Type | Description |
116+
|----------------|-------------------------------------------------|
117+
| `G` | Reduce to G Group Level |
118+
| `lg` | Reduce to 2 field ARD level (append `g`) |
119+
| `lgx` | Reduce to 2 field ARD level |
120+
| `W` | Reduce/Expand to 3 field WHO nomenclature level |
121+
| `exon` | Reduce/Expand to exon level |
117122

118123
# Command Line Tools
119124

@@ -160,6 +165,12 @@ $ pyard-import --v2-to-v3-mapping map2to3.csv
160165
$ pyard-import --db-version 3450 --refresh-mac
161166
```
162167

168+
### Show the status of all `py-ard` databases
169+
170+
```shell
171+
$ pyard-status
172+
```
173+
163174
### Reduce a GL String from command line
164175

165176
```shell
@@ -172,10 +183,6 @@ DRB1*08:01:01G/DRB1*08:02:01G/DRB1*08:03:02G/DRB1*08:04:01G/DRB1*08:05/ ...
172183
$ pyard -v 3290 --gl 'A1' -r lgx # For a particular version of DB
173184
A*01:01/A*01:02/A*01:03/A*01:06/A*01:07/A*01:08/A*01:09/A*01:10/A*01:12/ ...
174185
```
186+
### Batch Reduce a CSV file
175187

176-
### Show the status of all `py-ard` databases
177-
178-
```shell
179-
$ pyard-status
180-
```
181-
188+
`pyard-reduce-csv` can be used to batch process a CSV file with HLA typings. See the [documentation](extras/README.md) for instructions on how to configure and run it.

extras/README.md

Lines changed: 80 additions & 58 deletions
Original file line numberDiff line numberDiff line change
@@ -4,111 +4,133 @@
44

55
**Example Scripts to batch reduce HLA typings from a CSV File**
66

7-
`pyard-reduce-csv` command can be used with a config file(that describes ways
8-
to reduce the file) can be used to take a CSV file with HLA typing data and
9-
reduce certain columns and produce a new CSV or an Excel file.
10-
11-
Install `py-ard` and use `pyard-reduce-csv` command specifying the changes in a JSON
12-
config file and running `pyard-reduce-csv -c <config-file>` will produce result based
13-
on the configuration in the config file.
7+
The `pyard-reduce-csv` command takes a CSV file with HLA typing data and a config file (which describes how to reduce the file),
8+
reduces the specified columns, and produces a new CSV or Excel file.
149

10+
Install `py-ard`, specify the changes in a JSON config file, and
11+
run `pyard-reduce-csv -c <config-file>` to produce an output file based on that configuration.
1512

1613
See [Example JSON config file](reduce_conf.json).
1714

18-
1915
### Input CSV filename
16+
2017
`in_csv_filename` Directory path and file name of the Input CSV file
2118

2219
### Output CSV filename
20+
2321
`out_csv_filename` Directory path and file name of the Reduced Output CSV file
2422

2523
### CSV Columns to read
24+
2625
`columns_from_csv` The column names to read from CSV file
2726

2827
```json
2928
[
30-
"nmdp_id",
31-
"r_a_typ1",
32-
"r_a_typ2",
33-
"r_b_typ1",
34-
"r_b_typ2",
35-
"r_c_typ1",
36-
"r_c_typ2",
37-
"r_drb1_typ1",
38-
"r_drb1_typ2",
39-
"r_dpb1_typ1",
40-
"r_dpb1_typ2"
41-
]
29+
"nmdp_id",
30+
"r_a_typ1",
31+
"r_a_typ2",
32+
"r_b_typ1",
33+
"r_b_typ2",
34+
"r_c_typ1",
35+
"r_c_typ2",
36+
"r_drb1_typ1",
37+
"r_drb1_typ2",
38+
"r_dpb1_typ1",
39+
"r_dpb1_typ2"
40+
]
4241
```
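As a rough illustration of what `columns_from_csv` selects, the sketch below reads only the listed columns from a CSV using the standard library. The helper name `read_columns` and the sample data are hypothetical; this is not how `pyard-reduce-csv` is implemented internally.

```python
import csv
import io


def read_columns(csv_file, columns_from_csv):
    """Read only the configured columns from an open CSV file."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    # Fail early if the config names a column the file does not have.
    missing = [name for name in columns_from_csv if name not in header]
    if missing:
        raise KeyError(f"Columns missing from CSV: {missing}")
    return [{name: row[name] for name in columns_from_csv} for row in reader]


# Made-up sample data; the "notes" column is not requested and is dropped.
sample = io.StringIO(
    "nmdp_id,r_a_typ1,r_a_typ2,notes\n"
    "1,A*01:01:01,A*02:01:01,x\n"
)
rows = read_columns(sample, ["nmdp_id", "r_a_typ1", "r_a_typ2"])
print(rows)
```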
4342

4443
### CSV Columns to reduce
44+
4545
`columns_to_reduce_in_csv` List of columns which have typing information and need to be reduced.
4646

47-
**NOTE**: The locus is the 2nd term in the column name
48-
E.g., for column `column R_DRB1_type1`, `DPB1` is the locus name
47+
**Important**: The locus is the 2nd term in the column name, with terms separated by `_`. The program uses this term to determine the
48+
locus for the typings in that column.
49+
50+
E.g., for column `R_DRB1_type1`, `DRB1` is the locus name.
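The naming rule can be sketched in a few lines of Python. The function name `locus_from_column` is hypothetical, and upper-casing the term is an assumption made here so that `r_drb1_typ1` and `R_DRB1_type1` give the same locus; the actual tool may handle case differently.

```python
def locus_from_column(column_name):
    """Return the locus encoded as the 2nd `_`-separated term of a column name."""
    terms = column_name.split("_")
    if len(terms) < 2 or not terms[1]:
        raise ValueError(f"No locus term in column name: {column_name!r}")
    # Assumption: normalize to upper case, as HLA locus names are written.
    return terms[1].upper()


for column in ["r_a_typ1", "r_drb1_typ2", "R_DPB1_type1"]:
    print(column, "->", locus_from_column(column))
```

Note that the rule is purely positional: a non-typing column such as `nmdp_id` would yield `ID`, which is why only typing columns belong in `columns_to_reduce_in_csv`.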
4951

5052
```json
5153
[
52-
"r_a_typ1",
53-
"r_a_typ2",
54-
"r_b_typ1",
55-
"r_b_typ2",
56-
"r_c_typ1",
57-
"r_c_typ2",
58-
"r_drb1_typ1",
59-
"r_drb1_typ2",
60-
"r_dpb1_typ1",
61-
"r_dpb1_typ2"
62-
],
54+
"r_a_typ1",
55+
"r_a_typ2",
56+
"r_b_typ1",
57+
"r_b_typ2",
58+
"r_c_typ1",
59+
"r_c_typ2",
60+
"r_drb1_typ1",
61+
"r_drb1_typ2",
62+
"r_dpb1_typ1",
63+
"r_dpb1_typ2"
64+
]
6365
```
6466

65-
6667
### Redux Options
67-
`redux_type` Reduction Type
6868

69-
Valid Options: `G`, `lg` and `lgx`
69+
`redux_type` Reduction Type
7070

71-
### Compression Options
72-
`apply_compression` Compression to use for output file
71+
Valid Options are:
7372

74-
Valid options: `'gzip'`, `'zip'` or `null`
73+
| Reduction Type | Description |
74+
|----------------|-------------------------------------------------|
75+
| `G` | Reduce to G Group Level |
76+
| `lg` | Reduce to 2 field ARD level (append `g`) |
77+
| `lgx` | Reduce to 2 field ARD level |
78+
| `W` | Reduce/Expand to 3 field WHO nomenclature level |
79+
| `exon` | Reduce/Expand to exon level |
7580

76-
### Verbose log Options
77-
`log_comment` Show verbose log ?
7881

79-
Valid options: `true` or `false`
82+
### Kinds of typings to reduce
8083

81-
### Types of typings to reduce
8284
```json
83-
"verbose_log": true,
84-
"reduce_serology": false,
85-
"reduce_v2": true,
86-
"reduce_3field": true,
87-
"reduce_P": true,
88-
"reduce_XX": false,
89-
"reduce_MAC": true,
85+
"reduce_serology": false,
86+
"reduce_v2": true,
87+
"convert_v2_to_v3": false,
88+
"reduce_3field": true,
89+
"reduce_P": true,
90+
"reduce_XX": false,
91+
"reduce_MAC": true,
9092
```
9193
Valid options: `true` or `false`
9294

93-
9495
### Locus Name in Allele
95-
`locus_in_allele_name`
96-
Is locus name present in allele ? E.g. A*01:01 vs 01:01
96+
97+
`locus_in_allele_name`
98+
Is the locus name present in the allele? E.g. `A*01:01` vs `01:01`
9799

98100
Valid options: `true` or `false`
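A minimal sketch of what this option implies for a single allele: when the locus is not part of the value, it has to be prefixed before the typing can be reduced. The helper `with_locus` is hypothetical and only handles a single allele, not a full GL String.

```python
def with_locus(locus, typing, locus_in_allele_name):
    """Return the typing with its locus prefix, adding it only when absent."""
    if locus_in_allele_name:
        # The value already looks like "A*01:01"; use it unchanged.
        return typing
    return f"{locus}*{typing}"


print(with_locus("A", "A*01:01", True))
print(with_locus("A", "01:01", False))
```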
99101

100102
### Output Format
103+
101104
`output_file_format` Format of the output file
102105

103-
Valid options: `csv` or `xlsx`
106+
Valid options: `csv` or `xlsx`
107+
108+
For Excel output, the `openpyxl` library must be installed. Install it with:
109+
```shell
110+
pip install openpyxl
111+
```
112+
104113

105-
### Create New Column
106-
`new_column_for_redux` Add a separate column for processed column or replace
107-
the current column. Creates a `reduced_` version of the column.
114+
### Create New Column
115+
116+
`new_column_for_redux` Whether to add a separate column for each processed column. If `true`, a `reduced_` version of the column is created; otherwise, the original column is replaced with the reduced version.
108117

109118
Valid options: `true`, `false`
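The effect of the two settings on a single row can be sketched as follows. The function `apply_reduction` is hypothetical, the reduced value is passed in as a placeholder rather than computed by `py-ard`, and treating `reduced_` as a prefix on the original column name is an assumption.

```python
def apply_reduction(row, column, reduced_value, new_column_for_redux):
    """Return a copy of the row with the reduced value stored per the option."""
    updated = dict(row)
    # Assumption: the new column is named by prefixing "reduced_".
    target = f"reduced_{column}" if new_column_for_redux else column
    updated[target] = reduced_value
    return updated


row = {"nmdp_id": "1", "r_a_typ1": "A*01:01:01"}
print(apply_reduction(row, "r_a_typ1", "A*01:01", True))
print(apply_reduction(row, "r_a_typ1", "A*01:01", False))
```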
110119

111120
### Map to DRBX
112-
`map_drb345_to_drbx` Map to DRBX Typings based on DRB3, DRB4 and DRB5 typings.
121+
122+
`map_drb345_to_drbx` Map to DRBX Typings based on DRB3, DRB4 and DRB5 typings using [WMDA method](https://www.nature.com/articles/1705672).
113123

114124
Valid options: `true` or `false`
125+
126+
### Compression Options
127+
128+
`apply_compression` Compression to use for output file. Applies only to CSV files.
129+
130+
Valid options: `'gzip'`, `'zip'` or `null`
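To illustrate what the three settings mean for a CSV output file, here is a standard-library sketch. The function `write_csv`, the file names, and the archive member name `reduced.csv` are hypothetical; the tool itself may write its output differently.

```python
import csv
import gzip
import io
import os
import tempfile
import zipfile


def write_csv(rows, fieldnames, path, apply_compression=None):
    """Write rows as CSV, optionally compressed with gzip or zip."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    data = buffer.getvalue().encode("utf-8")

    if apply_compression is None:
        with open(path, "wb") as handle:
            handle.write(data)
    elif apply_compression == "gzip":
        with gzip.open(path, "wb") as handle:
            handle.write(data)
    elif apply_compression == "zip":
        with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as archive:
            # Hypothetical member name for the CSV inside the archive.
            archive.writestr("reduced.csv", data)
    else:
        raise ValueError(f"Unsupported compression: {apply_compression!r}")


out_dir = tempfile.mkdtemp()
gz_path = os.path.join(out_dir, "out.csv.gz")
write_csv([{"r_a_typ1": "A*01:01"}], ["r_a_typ1"], gz_path, "gzip")
with gzip.open(gz_path, "rt", encoding="utf-8", newline="") as handle:
    print(handle.read())
```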
131+
132+
### Verbose log Options
133+
134+
`verbose_log` Show verbose log?
135+
136+
Valid options: `true` or `false`
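Before running a batch job, it can help to check a config against the valid options listed above. The following is a sketch only: `validate_config` is a hypothetical helper, not part of `py-ard`, and the rule that every column to reduce must also be read from the CSV is an assumption.

```python
import json

VALID_REDUX_TYPES = {"G", "lg", "lgx", "W", "exon"}
VALID_OUTPUT_FORMATS = {"csv", "xlsx"}
VALID_COMPRESSION = {"gzip", "zip", None}
BOOLEAN_KEYS = [
    "reduce_serology", "reduce_v2", "convert_v2_to_v3", "reduce_3field",
    "reduce_P", "reduce_XX", "reduce_MAC", "locus_in_allele_name",
    "new_column_for_redux", "map_drb345_to_drbx", "verbose_log",
]


def validate_config(config):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if config.get("redux_type") not in VALID_REDUX_TYPES:
        problems.append(f"redux_type: {config.get('redux_type')!r}")
    if config.get("output_file_format") not in VALID_OUTPUT_FORMATS:
        problems.append(f"output_file_format: {config.get('output_file_format')!r}")
    if config.get("apply_compression") not in VALID_COMPRESSION:
        problems.append(f"apply_compression: {config.get('apply_compression')!r}")
    # Only flags that are present are checked; each must be true or false.
    for key in BOOLEAN_KEYS:
        if key in config and not isinstance(config[key], bool):
            problems.append(f"{key}: expected true or false")
    # Assumption: a column can only be reduced if it is also read.
    unread = set(config.get("columns_to_reduce_in_csv", [])) - set(
        config.get("columns_from_csv", [])
    )
    if unread:
        problems.append(f"columns to reduce but not read: {sorted(unread)}")
    return problems


config = json.loads("""
{
  "columns_from_csv": ["nmdp_id", "r_a_typ1", "r_a_typ2"],
  "columns_to_reduce_in_csv": ["r_a_typ1", "r_a_typ2"],
  "redux_type": "lgx",
  "reduce_serology": false,
  "output_file_format": "csv",
  "new_column_for_redux": false,
  "map_drb345_to_drbx": false,
  "apply_compression": "gzip",
  "verbose_log": true
}
""")
print(validate_config(config))
```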

extras/reduce_conf.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,6 @@
2727
"r_dpb1_typ2"
2828
],
2929
"redux_type": "lgx",
30-
"apply_compression": "gzip",
3130
"reduce_serology": false,
3231
"reduce_v2": true,
3332
"convert_v2_to_v3": false,
@@ -40,5 +39,6 @@
4039
"output_file_format": "csv",
4140
"new_column_for_redux": false,
4241
"map_drb345_to_drbx": false,
42+
"apply_compression": "gzip",
4343
"verbose_log": true
4444
}
