Question: How to determine optimal K using ADAMIXTURE? #2

@rellaovo

Hi ADAMIXTURE team,

Thank you for developing and sharing this excellent tool! I’m writing to ask for some guidance regarding model selection.

Background

In ADMIXTURE, I usually determine the optimal number of clusters (K) using cross-validation, for example:

admixture pop.bed 16 --cv=10 -j8 --seed=12345

which reports:

CV error (K=16): 0.21043

This makes it straightforward to compare different K values and select the one with the lowest CV error.
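For reference, the comparison step can be sketched roughly as follows. This is only a minimal sketch, assuming the output of each ADMIXTURE run was saved to its own log file and that the CV line has exactly the format shown above; the helper names `parse_cv_errors` and `best_k` are hypothetical and not part of either tool.

```python
import re
import sys

# Matches lines of the form "CV error (K=16): 0.21043",
# i.e. the format ADMIXTURE printed in the run quoted above.
CV_LINE = re.compile(r"CV error \(K=(\d+)\): (\d+(?:\.\d+)?)")


def parse_cv_errors(log_text):
    """Return a {K: cv_error} dict for every CV error line in the text."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}


def best_k(cv_errors):
    """Return the K whose CV error is lowest."""
    return min(cv_errors, key=cv_errors.get)


if __name__ == "__main__":
    # Assumption: log file paths (one per K) are passed as arguments.
    errors = {}
    for path in sys.argv[1:]:
        with open(path) as handle:
            errors.update(parse_cv_errors(handle.read()))
    if errors:
        for k in sorted(errors):
            print(f"K={k}\tCV error={errors[k]}")
        print(f"Lowest CV error at K={best_k(errors)}")
```

Something equivalent for ADAMIXTURE's output is what I am hoping to find.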

My question

In ADAMIXTURE, I did not find a --cv option in the CLI. I understand that ADAMIXTURE reports log-likelihood and other metrics, but I am unsure about the recommended approach for selecting the best K.

Could you please advise:

  • What is the recommended way to choose the optimal K in ADAMIXTURE?
  • Is there an equivalent to ADMIXTURE’s cross-validation workflow?

If there is documentation, an example workflow, or a recommended strategy that I may have overlooked, I would greatly appreciate being pointed to it.

Thank you very much for your time and help!

Best regards,
Aaron
