|
| 1 | +.. _multimodal_configuration: |
| 2 | + |
| 3 | +######################## |
| 4 | +Multimodal configuration |
| 5 | +######################## |
| 6 | + |
| 7 | +After running the main conversion pipeline, populate the YAML parameters file required to run the multimodal integration pipeline. |
| 8 | + |
| 9 | +.. _multimodal_parameters_file: |
| 10 | + |
| 11 | +*************** |
| 12 | +Parameters file |
| 13 | +*************** |
| 14 | + |
| 15 | +The parameters file looks like this: |
| 16 | + |
| 17 | +.. code-block:: yaml |
| 18 | + |
| 19 | +    outdir: "/path/to/output/" |
| 20 | + |
| 21 | +    url: http://localhost:3000/ |
| 22 | +    project: my_project |
| 23 | +    title: "My Project" |
| 24 | + |
| 25 | +    data: |
| 26 | +      - |
| 27 | +        dataset: scrnaseq |
| 28 | +        obs_type: cell |
| 29 | +        anndata: /path/to/main/output/scrnaseq-anndata.zarr |
| 30 | +        offset: 0 |
| 31 | +        is_spatial: false |
| 32 | +        vitessce_options: |
| 33 | +          spatial: |
| 34 | +            xy: obsm/spatial |
| 35 | +          mappings: |
| 36 | +            obsm/X_umap: [0,1] |
| 37 | +          matrix: X |
| 38 | +      - |
| 39 | +        dataset: visium |
| 40 | +        obs_type: spot |
| 41 | +        anndata: /path/to/main/output/visium-anndata.zarr |
| 42 | +        offset: 1000000 |
| 43 | +        is_spatial: true |
| 44 | +        raw_image: /path/to/main/output/visium-raw.zarr |
| 45 | +        label_image: /path/to/main/output/visium-label.zarr |
| 46 | +        vitessce_options: |
| 47 | +          spatial: |
| 48 | +            xy: obsm/spatial |
| 49 | +          matrix: X |
| 50 | + |
| 51 | +In contrast to the main conversion pipeline's parameters file, this file defines a single ``project`` to which multiple ``dataset`` entries belong. |
| 52 | + |
| 53 | +Each ``dataset`` block defines the name of the dataset and paths to the converted data and image files (if any). |
| 54 | + |
| 55 | +Each ``dataset`` also requires a set of ``vitessce_options`` that specify the location of certain data (spatial coordinates, embeddings, expression matrix, etc.) within the AnnData object that is processed/generated. |
| 56 | +This follows the same structure as in the :ref:`main conversion's vitessce_options <vitessce_options>`. |
| 57 | + |
| 58 | +Additionally, each ``dataset`` requires: |
| 59 | + |
| 60 | +* ``obs_type``, the type of observation of the dataset. For example, "cell" or "spot". |
| 61 | +* ``offset``, an integer offset to add to the dataset's IDs so they don't clash with those of the other datasets. |
| 62 | +* ``is_spatial``, whether the dataset contains spatial information and has associated image files (raw and/or label images). |
| 63 | + |
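As an illustration of why ``offset`` matters, here is a minimal Python sketch. It assumes observation IDs are plain integers and that reindexing simply adds the offset to each ID; the ``reindex_ids`` helper and the sample IDs are hypothetical and are not part of the pipeline.

```python
def reindex_ids(ids, offset):
    """Shift integer observation IDs by a per-dataset offset (illustrative only)."""
    return [i + offset for i in ids]


# Hypothetical IDs: both datasets number their observations from 1.
scrnaseq_ids = reindex_ids([1, 2, 3], offset=0)
visium_ids = reindex_ids([1, 2, 3], offset=1000000)

# Without offsets the two ID sets would collide; with them they are disjoint.
assert set(scrnaseq_ids).isdisjoint(visium_ids)
print(scrnaseq_ids, visium_ids)
```

Under these assumptions, an offset should be larger than the highest ID of any dataset placed before it, which is why the example above uses ``0`` and ``1000000``.
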
| 64 | +Raw images are only read, never modified, so the pipeline does not generate new output from them. |
| 65 | +So that the output directory (defined by ``outdir``) contains every file that must be served to the web application, |
| 66 | +the pipeline copies the raw images to the output directory by default. |
| 67 | +Copying can take a long time, depending on the size of the image. |
| 68 | +You may prefer to copy or move the image manually, or to serve it from its own directory, separate from the rest of the output. |
| 69 | +To disable the default copying, set ``copy_raw: false`` as a project-wide parameter (at the same level as ``outdir``, ``project``, etc.). |
| 70 | +For example: |
| 71 | + |
| 72 | +.. code-block:: yaml |
| 73 | + |
| 74 | +    outdir: "/path/to/output/" |
| 75 | +    url: http://localhost:3000/ |
| 76 | +    project: my_project |
| 77 | +    title: "My Project" |
| 78 | +    copy_raw: false |
| 79 | + |
| 80 | + |
| 81 | +With additional features |
| 82 | +======================== |
| 83 | + |
| 84 | +Running the multimodal integration pipeline with the example parameters file above performs the reindexing and intersection steps. |
| 85 | +To also concatenate additional features (such as cell types) so they can be visualised as continuous values, some extra parameters need to be added. |
| 86 | + |
| 87 | +As a project-wide parameter (at the same level as ``outdir``, ``project``, etc.): |
| 88 | + |
| 89 | +* ``extend_feature_name``, the name of the additional feature. For example, "celltype". |
| 90 | + |
| 91 | +And at a ``dataset`` level: |
| 92 | + |
| 93 | +* ``extend_feature``, the location of the additional feature information. |
| 94 | + This can be either the path to a *cell2location* output file, or the location within the AnnData object where the feature is stored as a categorical within ``obs``. |
| 95 | + For example, ``/path/to/c2l.h5ad`` containing predicted continuous values, or ``obs/celltype`` containing categoricals. |
| 96 | + |
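The two accepted forms of ``extend_feature`` can be told apart by their shape. The sketch below is only an illustration: it assumes that values starting with ``obs/`` name a column inside the AnnData object and that anything else is a file path. The ``parse_extend_feature`` helper is hypothetical, and the pipeline's actual rule may differ.

```python
def parse_extend_feature(value):
    """Classify an ``extend_feature`` value (illustrative rule, not the pipeline's)."""
    prefix = "obs/"
    if value.startswith(prefix):
        # A location within the AnnData object: return the ``obs`` column name.
        return ("obs", value[len(prefix):])
    # Otherwise assume a path to a file such as a cell2location output.
    return ("file", value)


print(parse_extend_feature("obs/celltype"))
print(parse_extend_feature("/path/to/c2l.h5ad"))
```
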
| 97 | +The full parameters file will then look like this: |
| 98 | + |
| 99 | +.. code-block:: yaml |
| 100 | + |
| 101 | +    outdir: "/path/to/output/" |
| 102 | + |
| 103 | +    url: http://localhost:3000/ |
| 104 | +    project: my_project |
| 105 | +    title: "My Project" |
| 106 | + |
| 107 | +    extend_feature_name: celltype |
| 108 | + |
| 109 | +    data: |
| 110 | +      - |
| 111 | +        dataset: scrnaseq |
| 112 | +        obs_type: cell |
| 113 | +        anndata: /path/to/main/output/scrnaseq-anndata.zarr |
| 114 | +        extend_feature: obs/celltype |
| 115 | +        offset: 0 |
| 116 | +        is_spatial: false |
| 117 | +        vitessce_options: |
| 118 | +          spatial: |
| 119 | +            xy: obsm/spatial |
| 120 | +          mappings: |
| 121 | +            obsm/X_umap: [0,1] |
| 122 | +          matrix: X |
| 123 | +      - |
| 124 | +        dataset: visium |
| 125 | +        obs_type: spot |
| 126 | +        anndata: /path/to/main/output/visium-anndata.zarr |
| 127 | +        extend_feature: /path/to/c2l.h5ad |
| 128 | +        offset: 1000000 |
| 129 | +        is_spatial: true |
| 130 | +        raw_image: /path/to/main/output/visium-raw.zarr |
| 131 | +        label_image: /path/to/main/output/visium-label.zarr |
| 132 | +        vitessce_options: |
| 133 | +          spatial: |
| 134 | +            xy: obsm/spatial |
| 135 | +          matrix: X |
| 136 | + |
| 137 | +With these parameters, the multimodal integration pipeline concatenates the expression matrix with the additional feature values so that both can be queried and visualised across datasets within the same portal. |
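
To make the idea of concatenation concrete, the following Python sketch encodes a categorical feature as continuous columns and appends them to a small expression matrix. It assumes a one-hot encoding for categoricals, which is an assumption rather than a documented detail of the pipeline; the ``one_hot`` and ``concatenate_features`` helpers, the gene names and the cell types are all hypothetical.

```python
def one_hot(labels):
    """Encode categorical labels as one 0.0/1.0 column per category."""
    categories = sorted(set(labels))
    rows = [[1.0 if label == c else 0.0 for c in categories] for label in labels]
    return categories, rows


def concatenate_features(matrix, gene_names, feature_rows, feature_names):
    """Append feature columns to each observation's expression row."""
    names = list(gene_names) + list(feature_names)
    rows = [list(expr) + list(extra) for expr, extra in zip(matrix, feature_rows)]
    return names, rows


# Hypothetical data: three observations, two genes, one categorical feature.
expression = [[5.0, 0.0], [1.0, 3.0], [0.0, 2.0]]
genes = ["GeneA", "GeneB"]
celltypes = ["T cell", "B cell", "T cell"]

categories, encoded = one_hot(celltypes)
names, combined = concatenate_features(expression, genes, encoded, categories)
print(names)
for row in combined:
    print(row)
```

In this sketch, genes and cell types end up as columns of the same matrix, so a single query can return either kind of value.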