prepare_hpo_data #39

@pnrobinson

Can we simplify this? Why are we returning a Tuple here?

def prepare_hpo_data(
        phenopackets: List[ppkt.Phenopacket], 
        hpo_file: Optional[Union[IO, str]] = None,
        variant_effect_type: Optional[VariantEffect] = None,
        mane_tx_id: Optional[Union[str, List[str]]] = None,
        external_target_matrix: Optional[pd.DataFrame] = None, 
        threshold: float = 0, 
        mode: Optional[str] = None,
        use_label: bool = True,
        nan_strategy: Optional[str] = None,
    ) -> Tuple[Tuple[pd.DataFrame,Optional[pd.DataFrame]], pd.DataFrame]:

It seems we are not using the second item of the tuple. Also, the return annotation does not seem to match the type that the hpo_data parameter expects.
In general, we should avoid returning nested tuples like this because they are hard to understand. If we need this much complexity, we should create a data class or something similar to store the various items.
Here is the error that I see:

hpo_matrix, _ = PhenopacketMatrixProcessor.prepare_hpo_data(
    phenopackets=phenopackets, 
    threshold=0, 
    mode=None, 
    use_label=True,
    nan_strategy=None)

analyzer = HPOStatisticsAnalyzer(
    hpo_data = hpo_matrix,
    min_individuals_for_correlation_test=30)


Argument of type "Tuple[DataFrame, DataFrame | None]" cannot be assigned to parameter "hpo_data" of type "Tuple[DataFrame, DataFrame, DataFrame | None]" in function "__init__"
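
To make the data class suggestion above concrete, here is a minimal sketch of what such a container could look like. The class name `HpoData` and all three field names are hypothetical, since the issue does not say what each tuple element holds; pandas is imported for annotations only so that the sketch runs on its own.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Needed for the annotations only; nothing here calls pandas at runtime.
    import pandas as pd


@dataclass(frozen=True)
class HpoData:
    """Hypothetical replacement for the nested tuple return value.

    The field names are assumptions, not taken from the code base.
    """

    # Assumed to be the first item of the inner tuple.
    hpo_matrix: pd.DataFrame
    # Assumed to be the optional second item of the inner tuple.
    target_matrix: Optional[pd.DataFrame] = None
    # Stand-in for the second top-level item that callers currently discard.
    extra_matrix: Optional[pd.DataFrame] = None

    def has_targets(self) -> bool:
        """Return True if a target matrix was supplied."""
        return self.target_matrix is not None
```

With a return type like this, `prepare_hpo_data` and `HPOStatisticsAnalyzer` could share a single annotation, and a caller would read `result.hpo_matrix` instead of unpacking positions, so a mismatch like the one reported above would be caught at the field level.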
