Conversation
…ding errors. Added test for this.
tcezard
left a comment
There was a problem hiding this comment.
This is a good improvement already. It makes the code cleaner.
I've added a few suggestions that you can address right away or in the next PR.
| pragma_fileformat = generate_custom_unstructured_metainformation_line("fileformat", "VCFv4.4") | ||
| pragmas_to_add.append(pragma_fileformat) |
There was a problem hiding this comment.
These 2 lines could be simplified
| pragma_fileformat = generate_custom_unstructured_metainformation_line("fileformat", "VCFv4.4") | |
| pragmas_to_add.append(pragma_fileformat) | |
| pragmas_to_add.append(generate_custom_unstructured_metainformation_line("fileformat", "VCFv4.4")) |
Same thing could be done on the other ones below.
| for pragma in gvf_pragmas: | ||
| # file date | ||
| if pragma.startswith("##file-date"): | ||
| date = pragma.split(" ")[1].replace("-", "") |
There was a problem hiding this comment.
For a latter refactoring but the if/elif could be converted to a mapping that looks like:
for pragma in gvf_pragmas:
pragma_name, pragma_value = parse_pragma(pragma) # <== need a function to parse the pragma
if pragma_name in list_of_pragma:
vcf_header_key = pragma_name_to_vcf_header.get(pragma_name) # <== need a map that match the pragma_name to a VCF header name
pragmas_to_add.append(generate_custom_unstructured_metainformation_line(vcf_header_key, pragma_value))
tests/test_convert_gvf_to_vcf.py
Outdated
| header_type = "FORMAT" | ||
| file_lines = read_file(prefix,header_type) | ||
| assert len(file_lines) > 1 | ||
| #2 |
There was a problem hiding this comment.
I wouldn't number the tests as it will quickly become unsustainable or unreliable.
| #3 | ||
| def test_read_info_attributes(self): | ||
| info_attribute_dict = read_info_attributes(self.info_attribute_input_file) | ||
| assert len(info_attribute_dict) > 1 |
There was a problem hiding this comment.
It is worth adding something that check length of the array and the content of the first element to ensure is formatted as intended.
| kv = next(iter(info_attribute_dict.items())) | ||
| key_to_check = "3'_inner_flank_link" | ||
| value_to_check = ["3'_inner_flank_link", '.', 'String', 'Three prime inner flank link', 'DGVa'] | ||
| assert kv[0] == key_to_check | ||
| assert kv[1] == value_to_check |
There was a problem hiding this comment.
If you're goinf to test 3'_inner_flank_link you might as well get it explicitly from the dict.
Unless you think that the fact that it is the first element added to the dict is important but I don't see why it would matter
| kv = next(iter(info_attribute_dict.items())) | |
| key_to_check = "3'_inner_flank_link" | |
| value_to_check = ["3'_inner_flank_link", '.', 'String', 'Three prime inner flank link', 'DGVa'] | |
| assert kv[0] == key_to_check | |
| assert kv[1] == value_to_check | |
| expected_value = ["3'_inner_flank_link", '.', 'String', 'Three prime inner flank link', 'DGVa'] | |
| assert info_attribute_dict.get("3'_inner_flank_link") == expected_value |
changes made to clean up