Add option for a custom alignment file in caprieval#1418
Add option for a custom alignment file in caprieval#1418AnnaKravchenko wants to merge 11 commits intomainfrom
Conversation
| if align_fname: | ||
| if not Path(align_fname).exists(): | ||
| raise FileNotFoundError(f"Custom alignment file {align_fname} not found") | ||
| align_func = partial(align_custom, custom_alig_file = Path(align_fname)) | ||
|
|
There was a problem hiding this comment.
This here is a bit odd, the function is deciding between structural/sequence alignment - what is the type of this new alignment you are adding?
There was a problem hiding this comment.
align_fname is a path to a custom alignment file - for those extremely rare cases when haddock cannot handle alignment, e.g. short DNA and extended DNA.
Here I am avoiding adding new allowed value in alignment_method parameter (would be smth like alignment_method = custom). Instead, I am adding a new parameter align_fname. And so the idea is to chek if user provided a custom alignment file - and also that this file exists - before checkig for the value in alignment_method.
Does it make sence?
There was a problem hiding this comment.
But not that I am checking - no need for the FileNotFoundError here, libutil will take care of it.
There was a problem hiding this comment.
I see, this is actually a good case for kwargs;
def get_align(method: str, **kwargs) -> partial[dict[str, dict[int, int]]]:
if method == "structure":
lovoalign_exec = kwargs.get("lovoalign_exec")
if lovoalign_exec is None:
raise ValueError("lovoalign_exec is required for structural alginment")
align_func = partial(align_strct, lovoalign_exec=lovoalign_exec)
elif method == "sequence":
align_func = partial(align_seq)
elif method == "custom":
align_fname = kwargs.get("align_fname")
if align_fname is None:
raise ValueError("align_fname is required for structure alignment")
align_func = partial(align_custom, custom_alig_file=Path(align_fname))
else:
available_alns = ("sequence", "structure", "custom")
raise ValueError(
f"Alignment method {method!r} not recognized. "
f"Available options are {', '.join(available_alns)}"
)
return align_funcThen this get_align can be called as:
foo = get_align("structure", lovoalign_exec=exec)
# OR
foo = get_align("custom", align_fname=fname)(I typed the code out of my head maybe the syntax is wrong)
There was a problem hiding this comment.
This implementation in my point of view is cleaner and simpler, *in your implementation you are effectively triggering a new type of method by using a parameter which is not "method", here you are using align_fname instead - you see the complexity?
There was a problem hiding this comment.
Yes and no. I do see how the code is cleaner, but I also see a new dependency between method == “custom” and align_fname=fname.
In practice, most if not all users will specify align_fname, while method will remain at its default, so “sequence” - this is smth we’ve seen resently with the per-interface-scoring PR. Sure, it’s possible to handle this mismatch, print error message etc., but wouldn’t it be more straightforward to ignore method var whenever align_fname is defined? Given I add comment in the code to explain this behavior?
There was a problem hiding this comment.
The frequency (or the practice) of a code pathway is not really relevant as the code exists regardless of how many times it is used.
This get_align is a factory function, and the proposed implementation will break this design. So it's not a matter of whether this is documented or not, it's more about design patterns. My suggestion is to stick to the patterns, as they exist for a reason.
There was a problem hiding this comment.
Then how do you interupt the run if align_fname=fname and method == “ sequence”?
Raising error in libparallel.py just logs warnings, but the run keeps going.
There was a problem hiding this comment.
Yep, this is also by design, parameter validation is done upstream. For this specific case this can be a method in the CAPRI class:
class CAPRI:
def __init__(
# ...
):
# ...
with log_error_and_exit():
self.validate_alignment_parameters()
def validate_alignment_parameters(self):
# handle the validation logic here
if self.params["align_fname"] is not None and method == "sequence":
raise ValueError("if `align_fname` is defined method must be `custom`")
# other checks etcf80c05c to
4ad1abd
Compare
4ad1abd to
f369cf1
Compare
Checklist
CHANGELOG.mdis updated to incorporate new changesSummary of the Pull Request
Related Issue
#1410
Additional Info
This does work on DNA-ligan example under the condition that full alignment is provided for both receptor and ligand chains. If any given chain is missing from the alignement - this chain is ignored by caprieval completely.