# .gitignore - Pipeline PanForest
# ===============================
# ============================================
# SNAKEMAKE - Outputs and temporary files
# ============================================
.snakemake/
logs/
benchmarks/
*.log
slurm-*.out
# ============================================
# RESULTS - Generated files
# ============================================
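# Ignore the contents of results/ rather than the directory itself;
# Git cannot re-include a file (such as results/.gitkeep) whose parent
# directory is excluded
results/*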
data/interim/*
data/final/*
# Random Forest batch processing
data/interim/rf_batches/
results/random_forest/batches/
# D Statistic batch processing
data/interim/d_stat_batches/
results/statistics/batches/
# Keep directory structure
!results/.gitkeep
!data/interim/.gitkeep
!data/final/.gitkeep
# ============================================
# RAW DATA - Downloaded genomes
# ============================================
data/raw/genomes/genomes/
data/raw/genomes/ncbi_metadata.jsonl
data/raw/genomes/gff_files.txt
data/raw/genomes/accessions_filtered.txt
data/raw/genomes/deduplication_stats.txt
data/raw/ncbi_dataset/
data/raw/*.gff
data/raw/*.gbk
data/raw/*.fna
data/raw/*.faa
data/raw/*.fasta
data/raw/*.fa
data/raw/*.fastq
data/raw/*.fastq.gz
# Keep original repo data for Panaroo output from paper (gene_presence_absence.csv)
!data/raw/example_*.csv
!data/raw/.gitkeep
# ============================================
# PANAROO - Large intermediate outputs
# ============================================
# If Panaroo generates outputs in data/raw, ignore them
data/raw/panaroo_output/
data/raw/gene_data/
data/raw/*.aln
data/raw/*.tree
data/raw/*.nwk
data/raw/*.phy
# ============================================
# PYTHON - Bytecode and cache
# ============================================
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
*.egg
*.egg-info/
dist/
build/
.eggs/
pip-log.txt
pip-delete-this-directory.txt
# ============================================
# R - Temporary files and data
# ============================================
.Rhistory
.RData
.Rproj.user
*.Rproj
.Rapp.history
*_cache/
/cache/
# ============================================
# CONDA
# ============================================
.conda/
*.tar.bz2
.snakemake/conda/
workflow/envs/.conda/
# ============================================
# REGENERABLE GRAPHICS
# ============================================
dag.png
rulegraph.png
filegraph.png
*.pdf
*.svg
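# Keep final figures. Git cannot re-include a file whose parent directory
# is excluded, so re-include each parent directory first and then ignore
# its contents before negating the figure patterns.
!results/
results/*
!results/.gitkeep
!results/figures/
results/figures/*
!results/figures/paper_*.png
!results/figures/paper_*.pdf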
# ============================================
# DATABASE
# ============================================
results/database/*.db
results/database/*.sqlite
results/database/*.sqlite3
*.db-journal
# ============================================
# LARGE CSV - Importance matrices
# ============================================
# results/random_forest/imp.csv
# results/networks/simplified_imp.csv
# ============================================
# IDEs
# ============================================
# VSCode
.vscode/
*.code-workspace
# ============================================
# OPERATING SYSTEM
# ============================================
# Linux
.directory
.Trash-*
# ============================================
# DOCUMENTATION
# ============================================
CLUSTER_USAGE.md