Create 01_input/-located guide file that controls clustering #70

@AndrewRadev

Description

Right now, we generate GeoStaS clusterings with K from 2 to 10, but that range is arbitrary. We should have a simple CSV file in 01_input/ for each protein that lists the clusterings to run and (optionally) their Ks, e.g.:

method                K
Chainsaw
Merizo
GeoStaS K-means       4-7
GeoStaS Hierarchical  5-8

If this file doesn't exist, it can be generated automatically with the default list; users can then edit it to narrow down the types of clustering they're interested in and rerun. The rerun should be fast, because all the intermediate files would already be there; only the clustering output and the collected clustering output would change (we'd have that input file as a dependency in the select rules).
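A minimal sketch of how such a guide file could be read, and created when missing, assuming a comma-separated layout with a `method,K` header and K given as either a single number or a `low-high` range. The names `DEFAULT_ROWS`, `parse_k` and `load_guide` are hypothetical, and the default list is an assumption too: the 2-10 range only mirrors what we generate today.

```python
import csv
from pathlib import Path

# Hypothetical default list; the 2-10 range mirrors the Ks generated today.
DEFAULT_ROWS = [
    ("Chainsaw", ""),
    ("Merizo", ""),
    ("GeoStaS K-means", "2-10"),
    ("GeoStaS Hierarchical", "2-10"),
]


def parse_k(spec):
    """Turn '' into [], '5' into [5], and '4-7' into [4, 5, 6, 7]."""
    spec = spec.strip()
    if not spec:
        return []
    low, _, high = spec.partition("-")
    low = int(low)
    high = int(high) if high else low
    if high < low:
        raise ValueError(f"invalid K range: {spec!r}")
    return list(range(low, high + 1))


def load_guide(path):
    """Read the guide file, writing the default list first if it is missing."""
    path = Path(path)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["method", "K"])
            writer.writerows(DEFAULT_ROWS)

    with path.open(newline="") as handle:
        # Each entry is (method name, list of K values; empty if K is omitted).
        return [
            (row["method"].strip(), parse_k(row.get("K") or ""))
            for row in csv.DictReader(handle)
            if (row.get("method") or "").strip()
        ]
```

A select rule could then call `load_guide` on the per-protein file and expand only the listed method and K combinations, so editing the file changes which clustering outputs get collected.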

In general, the input dir should be for the original input files, but in this case, I think this counts.
