Commit 075e2ef

Rename dataset loader and add compression argument to more components (#930)
* rename component and workflow
* add output_compression
1 parent fcf16f1 commit 075e2ef

17 files changed: +75 -36 lines changed

src/datasets/api/comp_dataset_loader.yaml

Lines changed: 5 additions & 0 deletions
@@ -12,4 +12,9 @@ arguments:
     __merge__: file_raw.yaml
     direction: "output"
     required: true
+  - name: --output_compression
+    type: string
+    choices: [gzip, lzf]
+    required: false
+    example: gzip
 test_resources: []
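
For readers less familiar with this kind of component config, the new argument can be pictured as an optional, choice-restricted string flag. The following is only a rough analogue written with Python's standard `argparse`; the real command-line interface is generated from the YAML, and `build_parser` is a hypothetical name used for illustration. It assumes that an omitted flag yields `None`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-alone analogue of the YAML argument spec above."""
    parser = argparse.ArgumentParser(
        description="Illustrative analogue of the --output_compression argument"
    )
    # Mirrors the existing required output argument.
    parser.add_argument("--output", required=True)
    # Mirrors: type: string, choices: [gzip, lzf], required: false.
    parser.add_argument(
        "--output_compression",
        type=str,
        choices=["gzip", "lzf"],
        required=False,
        default=None,  # assumption: an omitted flag means "no compression"
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--output", "out.h5ad", "--output_compression", "gzip"]
    )
    print(args.output, args.output_compression)
```

Values outside `gzip` and `lzf` are rejected at parse time in this sketch, which is the behaviour the `choices` list is meant to express.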

src/datasets/api/comp_normalization.yaml

Lines changed: 5 additions & 0 deletions
@@ -16,6 +16,11 @@ arguments:
     __merge__: file_normalized.yaml
     direction: output
     required: true
+  - name: --output_compression
+    type: string
+    choices: [gzip, lzf]
+    required: false
+    example: gzip
   - name: "--normalization_id"
     type: string
     description: "The normalization id to store in the dataset metadata. If not specified, the functionality name will be used."

src/datasets/api/comp_processor_hvg.yaml

Lines changed: 5 additions & 0 deletions
@@ -20,6 +20,11 @@ arguments:
     direction: output
     __merge__: file_hvg.yaml
     required: true
+  - name: --output_compression
+    type: string
+    choices: [gzip, lzf]
+    required: false
+    example: gzip
   - name: "--var_hvg"
     type: string
     default: "hvg"

src/datasets/api/comp_processor_knn.yaml

Lines changed: 5 additions & 0 deletions
@@ -20,6 +20,11 @@ arguments:
     direction: output
     __merge__: file_knn.yaml
     required: true
+  - name: --output_compression
+    type: string
+    choices: [gzip, lzf]
+    required: false
+    example: gzip
   - name: "--key_added"
     type: string
     default: "knn"

src/datasets/api/comp_processor_pca.yaml

Lines changed: 5 additions & 0 deletions
@@ -24,6 +24,11 @@ arguments:
     direction: output
     __merge__: file_pca.yaml
     required: true
+  - name: --output_compression
+    type: string
+    choices: [gzip, lzf]
+    required: false
+    example: gzip
   - name: "--obsm_embedding"
     type: string
     default: "X_pca"

src/datasets/api/comp_processor_subset.yaml

Lines changed: 5 additions & 0 deletions
@@ -22,6 +22,11 @@ arguments:
     __merge__: file_common_dataset.yaml
     direction: output
     required: false
+  - name: --output_compression
+    type: string
+    choices: [gzip, lzf]
+    required: false
+    example: gzip
 test_resources:
   - path: /resources_test/common/pancreas
     dest: resources_test/common/pancreas

src/datasets/api/comp_processor_svd.yaml

Lines changed: 5 additions & 0 deletions
@@ -28,6 +28,11 @@ arguments:
     direction: output
     __merge__: file_svd.yaml
     required: false
+  - name: --output_compression
+    type: string
+    choices: [gzip, lzf]
+    required: false
+    example: gzip
   - name: "--obsm_embedding"
     type: string
     default: "X_svd"

src/datasets/loaders/scrnaseq/op3/config.vsh.yaml

Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
-name: op3
+name: openproblems_op3
 namespace: datasets/loaders/scrnaseq
 description: |
   "Loads and preprocesses the OP3 dataset from GEO accession GSE279945."
@@ -39,7 +39,7 @@ argument_groups:
   - name: "--dataset_id"
     type: string
     description: "Unique identifier for the dataset"
-    default: "op3"
+    default: "openproblems_op3"
   - name: "--dataset_name"
     type: string
     description: "Human-readable name for the dataset"

src/datasets/processors/hvg/script.py

Lines changed: 1 addition & 2 deletions
@@ -32,5 +32,4 @@
 adata.var[par["var_hvg_score"]] = out['dispersions_norm'].values
 
 print(">> Writing data", flush=True)
-adata.write_h5ad(par['output'])
-
+adata.write_h5ad(par['output'], compression=par["output_compression"])

src/datasets/processors/knn/script.py

Lines changed: 1 addition & 1 deletion
@@ -23,5 +23,5 @@
 )
 
 print(">> Writing data", flush=True)
-adata.write_h5ad(par['output'])
+adata.write_h5ad(par['output'], compression=par["output_compression"])
 
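
Both script changes forward `par["output_compression"]` straight to `write_h5ad`, whose `compression` parameter accepts `"gzip"`, `"lzf"`, or `None`. As a minimal sketch of the same pattern, the helper below builds the keyword arguments from a `par` dictionary. The name `h5ad_write_kwargs` is hypothetical and not part of this commit, and the sketch assumes that an omitted flag shows up as `None` (or a missing key) in `par`.

```python
from typing import Any, Dict, Optional

# Mirrors the `choices` list declared in the API YAML files.
ALLOWED_COMPRESSION = ("gzip", "lzf")


def h5ad_write_kwargs(par: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Build keyword arguments for write_h5ad from a `par`-style dict."""
    # .get() tolerates a missing key; the scripts in this commit index directly.
    compression = par.get("output_compression")
    if compression is not None and compression not in ALLOWED_COMPRESSION:
        raise ValueError(
            f"Unsupported compression {compression!r}; "
            f"expected one of {ALLOWED_COMPRESSION} or None"
        )
    return {"compression": compression}


if __name__ == "__main__":
    par = {"output": "output.h5ad", "output_compression": "gzip"}
    print(h5ad_write_kwargs(par))
    # With anndata installed, the write call would then look like:
    # adata.write_h5ad(par["output"], **h5ad_write_kwargs(par))
```

Using `.get()` here is a defensive choice for components whose config does not yet declare the argument; the committed scripts instead rely on the key always being present.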
