Commit b4c6c69

Merge pull request #1150 from nf-core/flexible_tx2gene
Be more flexible on attribute values in GTFs
2 parents 221bdca + 87f603c commit b4c6c69

2 files changed: +17 -11 lines changed

CHANGELOG.md

Lines changed: 9 additions & 8 deletions
@@ -17,6 +17,15 @@ Special thanks to the following for their contributions to the release:
 - [Phil Ewels](https://github.com/ewels)
 - [Vlad Savelyev](https://github.com/vladsavelyev)
 
+### Enhancements & fixes
+
+- [PR #1135](https://github.com/nf-core/rnaseq/pull/1135) - Update [action-tower-launch](https://github.com/marketplace/actions/action-tower-launch) to v2 which supports more variable handling
+- [PR #1141](https://github.com/nf-core/rnaseq/pull/1141) - Important! Template update for nf-core/tools v2.11
+- [PR #1143](https://github.com/nf-core/rnaseq/pull/1143) - Move fasta check back to Groovy ([#1142](https://github.com/nf-core/rnaseq/issues/1142))
+- [PR #1144](https://github.com/nf-core/rnaseq/pull/1144) - Interface to kmer size for pseudoaligners
+- [PR #1149](https://github.com/nf-core/rnaseq/pull/1149) - Fix and patch version commands for Fastp, FastQC and UMI-tools modules ([#1103](https://github.com/nf-core/rnaseq/issues/1103))
+- [PR #1150](https://github.com/nf-core/rnaseq/pull/1150) - Be more flexible on attribute values in GTFs ([#1132](https://github.com/nf-core/rnaseq/issues/1132))
+
 ### Parameters
 
 | Old parameter | New parameter |
@@ -27,14 +36,6 @@ Special thanks to the following for their contributions to the release:
 > **NB:** Parameter has been **added** if just the new parameter information is present.
 > **NB:** Parameter has been **removed** if new parameter information isn't present.
 
-### Enhancements & fixes
-
-- [PR #1135](https://github.com/nf-core/rnaseq/pull/1135) - Update [action-tower-launch](https://github.com/marketplace/actions/action-tower-launch) to v2 which supports more variable handling
-- [PR #1141](https://github.com/nf-core/rnaseq/pull/1141) - Important! Template update for nf-core/tools v2.11
-- [PR #1149](https://github.com/nf-core/rnaseq/pull/1149) - Fix and patch version commands for Fastp, FastQC and UMI-tools modules ([#1103](https://github.com/nf-core/rnaseq/issues/1103))
-- [PR #1144](https://github.com/nf-core/rnaseq/pull/1144) - Interface to kmer size for pseudoaligners
-- [PR #1143](https://github.com/nf-core/rnaseq/pull/1143) - Move fasta check back to Groovy ([#1142](https://github.com/nf-core/rnaseq/issues/1142))
-
 ## [[3.13.2](https://github.com/nf-core/rnaseq/releases/tag/3.13.2)] - 2023-11-21
 
 ### Credits
bin/tx2gene.py

Lines changed: 8 additions & 3 deletions
@@ -6,6 +6,7 @@
 import argparse
 import glob
 import os
+import re
 from collections import Counter, defaultdict, OrderedDict
 from collections.abc import Set
 from typing import Dict
@@ -50,14 +51,18 @@ def discover_transcript_attribute(gtf_file: str, transcripts: Set[str]) -> str:
     Returns:
         str: The attribute name that corresponds to transcripts in the GTF file.
     """
+
     votes = Counter()
     with open(gtf_file) as inh:
         # Read GTF file, skipping header lines
         for line in filter(lambda x: not x.startswith("#"), inh):
             cols = line.split("\t")
-            # Parse attribute column and update votes for each attribute found
-            attributes = dict(item.strip().split(" ", 1) for item in cols[8].split(";") if item.strip())
-            votes.update(key for key, value in attributes.items() if value.strip('"') in transcripts)
+
+            # Use regular expression to correctly split the attributes string
+            attributes_str = cols[8]
+            attributes = dict(re.findall(r'(\S+) "(.*?)(?<!\\)";', attributes_str))
+
+            votes.update(key for key, value in attributes.items())
 
     if not votes:
         # Log a warning if no matching attribute is found
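
The change to attribute parsing above can be illustrated with a small standalone sketch. It is not part of the commit: the function names `parse_attributes_split`, `parse_attributes_regex` and `vote_for_transcript_attribute`, and the example attribute string, are made up for illustration. Only the two parsing expressions and the regular expression are taken from the diff. The sketch assumes the aim is to cope with attribute values that contain a semicolon, which the split-based parser cannot handle. It also keeps the `in transcripts` membership test when voting. The added `votes.update(...)` line in the diff has no such test, and the source does not say whether that was intended, so treat the filter here as an assumption based on the function's docstring.

```python
import re
from collections import Counter

# Pattern copied from the diff: key, space, double-quoted value, semicolon.
# The lookbehind keeps an escaped quote (\") from ending the value early.
ATTR_PATTERN = re.compile(r'(\S+) "(.*?)(?<!\\)";')


def parse_attributes_split(attributes_str):
    """Removed approach: split on ';', then split each item on the first space."""
    return dict(
        item.strip().split(" ", 1)
        for item in attributes_str.split(";")
        if item.strip()
    )


def parse_attributes_regex(attributes_str):
    """Added approach: extract key "value"; pairs with a regular expression."""
    return dict(ATTR_PATTERN.findall(attributes_str))


def vote_for_transcript_attribute(attribute_strings, transcripts):
    """Hypothetical helper: count keys whose value is a known transcript ID.

    The membership filter is an assumption; the added line in the diff
    votes for every key.
    """
    votes = Counter()
    for attributes_str in attribute_strings:
        attributes = parse_attributes_regex(attributes_str)
        votes.update(key for key, value in attributes.items() if value in transcripts)
    return votes


# Made-up attribute column whose gene_name value contains a semicolon.
example = 'gene_id "G1"; transcript_id "T1"; gene_name "abc;def";'

if __name__ == "__main__":
    print(parse_attributes_regex(example))
    try:
        parse_attributes_split(example)
    except ValueError as err:
        # Splitting on ';' leaves the fragment 'def"', which has no key/value pair.
        print("split-based parser failed:", err)
    print(vote_for_transcript_attribute([example], {"T1"}))
```

The regex version returns values without their surrounding quotes, which is why the added code no longer needs `value.strip('"')`. Without a membership test every attribute key on a line receives a vote, so the most common key would no longer identify the transcript attribute.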
