@srivarra srivarra commented Sep 17, 2025

Annotations are saved in an AnnData object and then written to zarr.

Here is an example of an annotation-specific AnnData object:

AnnData object with n_obs × n_vars = 18861 × 768
    obs: 'id', 'fov_name', 'track_id', 'parent_track_id', 'parent_id', 't', 'y', 'x'
    obsm: 'X_pca', 'X_phate', 'X_projections', 'X_umap'

You can access the observations as a pandas DataFrame with adata.obs.

Embeddings are saved in obsm, which is a key-value store. For example, you can access X_pca with adata.obsm["X_pca"].

The feature matrix is saved as the main data array of the AnnData object and is available as adata.X.
Removes the leading / in FOV names ("/B/1/000000" $\rightarrow$ "B/1/000000").
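A minimal sketch of that FOV-name change, for reference. The helper name `normalize_fov_name` is made up for illustration and is not necessarily what the PR uses.

```python
def normalize_fov_name(fov_name: str) -> str:
    """Drop any leading '/' so '/B/1/000000' becomes 'B/1/000000'."""
    return fov_name.lstrip("/")


cleaned = normalize_fov_name("/B/1/000000")
# Names that are already clean are left untouched.
unchanged = normalize_fov_name("B/1/000000")
```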

I tried using it with the current implementation of UMAP, but ran into some errors with n_neighbors, so I added a quick kwarg fix.
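The comment does not say what the error was or what the fix looks like. One plausible shape, sketched here purely as an assumption, is to expose n_neighbors as a keyword argument and clamp it to the number of samples before forwarding it to UMAP. The helper name `umap_kwargs` is hypothetical; 15 is the umap-learn default for n_neighbors.

```python
def umap_kwargs(n_samples: int, n_neighbors: int = 15, **extra) -> dict:
    """Build keyword arguments for a UMAP call with a safe n_neighbors.

    Hypothetical helper: clamps n_neighbors into [2, n_samples - 1].
    """
    if n_samples < 3:
        raise ValueError("need at least 3 samples to pick a valid n_neighbors")
    safe = max(2, min(n_neighbors, n_samples - 1))
    return {"n_neighbors": safe, **extra}


small = umap_kwargs(n_samples=10)                   # requested 15, clamped to 9
large = umap_kwargs(n_samples=18861, n_components=2)  # 15 is kept as requested
```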

Todos:

@srivarra srivarra requested a review from edyoshikun September 17, 2025 23:49
@srivarra srivarra marked this pull request as ready for review September 18, 2025 00:11
@edyoshikun edyoshikun requested a review from ziw-liu September 19, 2025 03:20
@ziw-liu
Collaborator

ziw-liu commented Sep 19, 2025

Is it easy to include a converter from the old format?

@srivarra
Author

> Is it easy to include a converter from the old format?

Yes, it's pretty straightforward. I'll add a function to do so.

@ziw-liu ziw-liu added enhancement New feature or request representation Representation learning (SSL) labels Sep 19, 2025
@ziw-liu ziw-liu added this to the v0.4.0 milestone Sep 19, 2025
@edyoshikun
Member

edyoshikun commented Sep 19, 2025

I suggest adding a simple script that shows how to use this for plotting the PCs or wrangling the new format. Once we have that converter, then we should be good to go.

@srivarra srivarra requested a review from ziw-liu September 19, 2025 22:13
@srivarra
Author

@edyoshikun Ready for re-review

@edyoshikun
Member

I'm getting an error when loading the annotations. There is a mismatch between the fov_name in the embedding xarray and annotation_df['fov_name']. The fov_name should not have the leading forward slash.

annotations_path = "/hpc/websites/public.czbiohub.org/comp.micro/viscy/DynaCLR_data/DENV/test/20240204_A549_DENV_ZIKV_timelapse/extracted_inf_state.csv"
embeddings_path = "/hpc/websites/public.czbiohub.org/comp.micro/viscy/DynaCLR_data/DENV/test/20240204_A549_DENV_ZIKV_timelapse/precomputed_embeddings/infection_160patch_94ckpt_rev6_dynaclr.zarr"
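A quick way to see which names disagree is to compare the two sets of FOV names before and after stripping the leading slash. This is only a sketch with made-up names; which side actually carries the slash in these files is not shown here, and `fov_name_mismatch` is a hypothetical helper.

```python
def fov_name_mismatch(embedding_fovs, annotation_fovs):
    """Return the FOV names that appear on only one side."""
    emb, ann = set(embedding_fovs), set(annotation_fovs)
    return {
        "only_in_embeddings": sorted(emb - ann),
        "only_in_annotations": sorted(ann - emb),
    }


# Hypothetical names: one side keeps the leading slash, the other does not.
embedding_fovs = ["/B/1/000000", "/B/2/000000"]
annotation_fovs = ["B/1/000000", "B/2/000000"]

before = fov_name_mismatch(embedding_fovs, annotation_fovs)
after = fov_name_mismatch(
    [name.lstrip("/") for name in embedding_fovs],
    [name.lstrip("/") for name in annotation_fovs],
)
```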

@ziw-liu
Collaborator

ziw-liu commented Sep 22, 2025

Can we have tests for these functions?

@srivarra srivarra requested a review from edyoshikun September 29, 2025 22:32
if __name__ == "__main__":
    from jsonargparse import CLI

    CLI(main)
Collaborator

Can you add a test for the CLI?

from viscy.representation.embedding_writer import get_available_index_columns


def convert_xarray_annotation_to_anndata(
Collaborator

Test is failing because the function has the same name as the module (generally a thing to avoid).
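To illustrate one common way this kind of name clash bites (an assumption about the general mechanism, not a diagnosis of this particular failure): if a package re-exports a function that has the same name as its submodule, the function replaces the submodule as the package attribute. The sketch below builds a throwaway package called `demo_pkg` (a hypothetical name) to show it.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Write demo_pkg/convert.py plus an __init__.py that re-exports convert()."""
    pkg = root / "demo_pkg"
    pkg.mkdir()
    (pkg / "convert.py").write_text("def convert():\n    return 'converted'\n")
    (pkg / "__init__.py").write_text("from demo_pkg.convert import convert\n")


tmp = tempfile.mkdtemp()
build_demo_package(Path(tmp))
sys.path.insert(0, tmp)
demo_pkg = importlib.import_module("demo_pkg")

# The package attribute is now the function, not the submodule.
attribute = demo_pkg.convert
# The submodule still exists, but only under its full name in sys.modules.
submodule = sys.modules["demo_pkg.convert"]
shadowed = not isinstance(attribute, types.ModuleType)
```

Anything that resolves `demo_pkg.convert` by attribute lookup, such as a patch target string in a test, ends up with the function rather than the module. Renaming either the module or the function avoids the ambiguity.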
