Commit 11d1434

1.1.3 (#17)
* prep changelog
* Fix PR template
* WIP parallelise eager job submission
* Correct syntax error
* No printing to screen for arrays. fix whitespace
* fix number of jobs
* make executable
* Add array qsub command
* update .gitignore
* print qsub command before submission
* Fix job naming
* Initial commit of poseidon package creation
* Rscript to fill in janno and overwrite columns
* Bugfixes
* Add suffix option and correct output janno path
* move script
* Add janno recreation. Other minor changes
* Add pandora results to janno
* Add log info. New package creation completed.
* Minor changes. Add Library_Names column
* Add script to mirror Population and Sex from janno to fam/ind
* Update CHANGELOG.md
* Update package updating.
* Add debug option. Add AE version in poseidon pkgs
* Remove debug cause of clash. Error when update fails.
* Update CHANGELOG.md
* Only delete temp files if validation passed.
* Bugfix. Runs now updated only if a change in the data occurs.
* move update script to scripts/
* Server-side testing paths
* Add path to trident executable
* server-paths
* Bump version
* Add environment yml file
* Update CHANGELOG.md
* Update output folder to live
* increase resources for AE_spawner jobs
* More resource tweaking for array jobs
* Increase memory further
* Remove path from environment yml
* Bump version
* Match Run_ID, not Batch_ID
* Array log subdir
* Update CHANGELOG.md
* prep CHANGELOG.md
* 40G memory max for array job
* indentation fix
* correct column naming
* document changes
* correct Nr_libs in column selection
* correct paths
* Update .gitignore
* Optimisation. Version bump. Distinct iids used for joining.
* Bump version
* Update CHANGELOG.md
* bump version
* Add mention of memory changes
1 parent 3bddb11 commit 11d1434

5 files changed

Lines changed: 28 additions & 9 deletions

.gitignore

Lines changed: 7 additions & 1 deletion
@@ -8,4 +8,10 @@ eager_outputs/
 .Rproj.user
 .nfs*
 dev/
-test_data/
+test_data/
+*.*.results.txt
+*Autorun_eager_queue.txt
+.tmp/
+eager_inputs_old/
+eager_outputs_old/
+array_Logs/

CHANGELOG.md

Lines changed: 13 additions & 0 deletions
@@ -3,6 +3,19 @@
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.1.3] - 17/03/2023
+
+### `Added`
+
+### `Fixed`
+- Column naming in `fill_in_janno.R`. `Nr_Libs` -> `Nr_Libraries`.
+- `prepare_eager_tsv.R` no longer joins with non-unique iids. Optimised performance and less likely to kill the TSV maker.
+- Increased memory given to eager spawner array jobs.
+
+### `Dependencies`
+
+### `Deprecated`
+
 ## [1.1.2] - 02/01/2023
 
 ### `Added`

scripts/fill_in_janno.R

Lines changed: 3 additions & 3 deletions
@@ -111,8 +111,8 @@ poseidon_tsv_cols <- tsv_dat %>% dplyr::select(Sample_Name, Library_ID, Stranded
 unique(UDG_Treatment) %>% length(.) > 1 ~ 'mixed',
 TRUE ~ unique(UDG_Treatment)
 ),
-Nr_Libs=dplyr::n(),
-Capture_Type=paste0(rep("1240K", Nr_Libs), collapse=";"),
+Nr_Libraries=dplyr::n(),
+Capture_Type=paste0(rep("1240K", Nr_Libraries), collapse=";"),
 Library_Built=dplyr::case_when(
 Strandedness == 'single' ~ 'ss',
 Strandedness == 'double' ~ 'ds',
@@ -188,7 +188,7 @@ updated_columns <- eager2poseidon::compile_eager_result_tables(
 "Contamination_Meas",
 "Damage",
 "UDG",
-"Nr_Libs",
+"Nr_Libraries",
 "Library_Names", ## Column including all the Library_IDs merged into these genotypes
 "Library_Built",
 "Capture_Type"
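
The two changed lines in the first hunk have to move together: `Capture_Type` is built from the count column defined on the line before it, so renaming only one of them would leave a reference to a column that no longer exists. The following is a minimal sketch of that dependency, assuming the surrounding call is a grouped `dplyr::summarise()` (the hunk does not show this) and using a made-up `libs` table with hypothetical IDs.

```r
library(dplyr)

## Hypothetical input: one row per library, two libraries for IND001.
libs <- tibble(
  Sample_Name = c("IND001", "IND001", "IND002"),
  Library_ID  = c("IND001.A0101", "IND001.A0102", "IND002.A0101")
)

per_sample <- libs %>%
  group_by(Sample_Name) %>%
  summarise(
    Nr_Libraries = n(),  ## Count of libraries per sample
    ## Refers to the column created on the previous line, so the two names must match.
    Capture_Type = paste0(rep("1240K", Nr_Libraries), collapse = ";"),
    .groups = "drop"
  )

print(per_sample)
```

With this input, the IND001 row carries `Nr_Libraries` of 2 and a `Capture_Type` of `1240K;1240K`, while IND002 gets a single `1240K`.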

scripts/prepare_eager_tsv.R

Lines changed: 4 additions & 4 deletions
@@ -43,10 +43,10 @@ save_ind_tsv <- function(data, rename, output_dir, ...) {
 if (!dir.exists(ind_dir)) {write(paste0("[prepare_eager_tsv.R]: Creating output directory '",ind_dir,"'"), stdout())}
 
 dir.create(ind_dir, showWarnings = F, recursive = T) ## Create output directory and subdirs if they do not exist.
-data %>% select(-individual.Full_Individual_Id) %>% readr::write_tsv(file=paste0(ind_dir,"/",ind_id,".tsv")) ## Output structure can be changed here.
+data %>% select(-individual.Full_Individual_Id) %>% readr::write_tsv(file=paste0(ind_dir,"/",ind_id,".tsv")) ## Output structure can be changed here.
 
 ## Print Autorun_eager version to file
-AE_version <- "1.1.2"
+AE_version <- "1.1.3"
 cat(AE_version, file=paste0(ind_dir,"/autorun_eager_version.txt"), fill=T, append = F)
 }
 
@@ -124,13 +124,13 @@ complete_pandora_table <- join_pandora_tables(
 convert_all_ids_to_values(., con = con) %>%
 filter(sample.Ethically_culturally_sensitive == FALSE) ## Exclude ethically/culturally sensitive data. Conservative since it excludes NAs
 
-tibble_input_iids <- complete_pandora_table %>% filter(sequencing.Run_Id == sequencing_batch_id) %>% select(individual.Full_Individual_Id)
+tibble_input_iids <- complete_pandora_table %>% filter(sequencing.Run_Id == sequencing_batch_id) %>% select(individual.Full_Individual_Id) %>% distinct()
 
 ## Pull information from pandora, keeping only matching IIDs and requested Sequencing types.
 results <- inner_join(complete_pandora_table, tibble_input_iids, by=c("individual.Full_Individual_Id"="individual.Full_Individual_Id")) %>%
 filter(grepl(paste0("\\.", analysis_type), sequencing.Full_Sequencing_Id), analysis.Analysis_Id == autorun_name_from_analysis_type(analysis_type)) %>%
 select(individual.Full_Individual_Id,individual.Organism,library.Full_Library_Id,library.Protocol,analysis.Result_Directory,sequencing.Sequencing_Id,sequencing.Full_Sequencing_Id,sequencing.Single_Stranded) %>%
-distinct() %>% ## TODO comment: would be worrying if not already unique, maybe consider throwing a warn?
+distinct() %>% ## Need distinct() call because of hoe analysis tab is read in, which created one copy of each row per analysis field.
 group_by(individual.Full_Individual_Id) %>%
 filter(!is.na(analysis.Result_Directory)) %>% ## Exclude individuals with no results directory (seem to mostly be controls)
 mutate(
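
The `distinct()` added to `tibble_input_iids` matters because an inner join emits one output row for every matching pair of rows. If an individual ID appears several times on both sides, the intermediate table grows multiplicatively before the later `distinct()` can shrink it again. Below is a minimal sketch of that effect, using a small made-up table in place of `complete_pandora_table` (the IDs and the `pandora` name are hypothetical) and keeping only two of the real columns.

```r
library(dplyr)

## Hypothetical stand-in for complete_pandora_table:
## IND001 has two rows, IND002 has one.
pandora <- tibble(
  individual.Full_Individual_Id = c("IND001", "IND001", "IND002"),
  library.Full_Library_Id       = c("IND001.A0101", "IND001.A0102", "IND002.A0101")
)

iids_raw      <- pandora %>% select(individual.Full_Individual_Id)  ## Old behaviour: duplicates kept
iids_distinct <- iids_raw %>% distinct()                            ## New behaviour: one row per iid

## suppressWarnings() because some dplyr versions warn about many-to-many matches.
joined_raw <- suppressWarnings(
  inner_join(pandora, iids_raw, by = "individual.Full_Individual_Id")
)
joined_distinct <- inner_join(pandora, iids_distinct, by = "individual.Full_Individual_Id")

nrow(joined_raw)       ## 5 rows: 2 x 2 for IND001, plus 1 for IND002
nrow(joined_distinct)  ## 3 rows: same as the input table
```

Both joins give the same rows once duplicates are removed. The difference is only in the size of the intermediate result, which is consistent with the changelog describing the change as a performance optimisation.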

scripts/update_poseidon_package.sh

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 #!/usr/bin/env bash
 
-VERSION="1.1.2"
+VERSION="1.1.3"
 
 ## Colours for printing to terminal
 Yellow=$(tput sgr0)'\033[1;33m' ## Yellow normal face
