Clarification on using `gather` function with databases having different scale factors

Hi,
I’m trying to confirm whether I’m misunderstanding any part of the documentation regarding `sourmash gather/fastgather/fastmultigather` and how these commands handle databases built with different scale factors.

While reviewing this guide (https://hackmd.io/vH2LMY38TEy8miUXI1OSNg), I noticed that it seems possible to run `sourmash gather` using a query signature generated with `--scaled 1000` against a combination of databases where some are built with `scaled=1000` and others with `scaled=10000`. I’ve tested a similar setup and the command executes without errors.

I want to make sure I’m not missing any documentation on this. Specifically, I would like to confirm:
* Is it officially supported to gather against databases with different scale factors?
* Is there a recommended relationship between query scale and database scale? For example, if a query signature is generated with `--scaled 1000`, should all databases have `scaled=1000`, `scaled≤1000`, or `scaled≥1000`, or is any combination acceptable?
* Does mixing scales (e.g., 1000 + 10000) affect sensitivity, containment estimates, abundance, or the final interpretation in any meaningful way?
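
To make my current understanding concrete, here is a minimal sketch of the scaled (FracMinHash) idea as I understand it: a sketch keeps only hashes below `2^64 / scaled`, so a `scaled=1000` sketch can be downsampled to `scaled=10000`, but not the other way around. This is not sourmash's code. The hash function (SHA-256 truncated to 64 bits), the helper names, and the random test sequence are all my own illustrative assumptions. Whether `gather` applies this kind of downsampling per database is part of what I'm asking.

```python
import hashlib
import random

MAX_HASH = 2**64  # size of the 64-bit hash space


def kmer_hash(kmer: str) -> int:
    # Stand-in 64-bit hash for illustration only (an assumption, not sourmash's hash).
    return int.from_bytes(hashlib.sha256(kmer.encode()).digest()[:8], "big")


def sketch(seq: str, ksize: int, scaled: int) -> set[int]:
    # Keep only k-mer hashes that fall below the threshold 2^64 / scaled.
    threshold = MAX_HASH // scaled
    kept = set()
    for i in range(len(seq) - ksize + 1):
        h = kmer_hash(seq[i:i + ksize])
        if h < threshold:
            kept.add(h)
    return kept


def downsample(hashes: set[int], new_scaled: int) -> set[int]:
    # Going to a larger scaled value only requires dropping hashes above the new threshold.
    threshold = MAX_HASH // new_scaled
    return {h for h in hashes if h < threshold}


# Hypothetical query: a seeded random sequence, so the run is reproducible.
rng = random.Random(0)
query_seq = "".join(rng.choices("ACGT", k=200_000))

query_1k = sketch(query_seq, ksize=21, scaled=1000)
query_10k = sketch(query_seq, ksize=21, scaled=10000)

# Downsampling the scaled=1000 sketch reproduces the scaled=10000 sketch exactly.
print(downsample(query_1k, 10000) == query_10k)
```

If this picture is right, a `scaled=1000` query could always be compared against a `scaled=10000` database after downsampling, at the cost of using roughly a tenth of the hashes, which is why I'm unsure how sensitivity and containment estimates are affected.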

Sorry if this is a silly question. I had always assumed the query and database scales needed to match, so maybe I’ve been carrying around a wrong assumption.
If there are best practices for choosing appropriate scales (or any documentation I may have overlooked), I’d really appreciate any guidance.

Thanks so much!


Clarification on using `gather` function with databases having different scale factors #3870
