Replies: 3 comments 1 reply
-
Hi @cbe135, the input channel of the mask autoencoder is 8. This is mainly because we use a binary representation to encode the input mask, which saves memory: 8 channels can represent 2**8 = 256 labels (0 to 255), with each channel holding one bit (see `tutorials/generation/maisi/scripts/utils.py`, lines 175 to 190 at commit 8b90a16). For example, label 1 is encoded as [0, 0, 0, 0, 0, 0, 0, 1]. For your use case, are the labels of your dataset covered by the pre-defined label dict?
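For readers without the repo handy, here is a minimal sketch of what such a bit-wise encoding could look like. This is a hypothetical re-implementation for illustration only, not the actual MAISI code; the real functions live in `tutorials/generation/maisi/scripts/utils.py` and may differ in ordering, dtype, and API.

```python
import numpy as np

def encode_mask_binary(mask: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Encode an integer label mask into a multi-channel binary form.

    Each output channel holds one bit of the label value (most significant
    bit first), so num_bits channels can represent 2**num_bits labels
    (0..255 for 8 bits).
    """
    bits = [(mask >> b) & 1 for b in range(num_bits - 1, -1, -1)]
    return np.stack(bits, axis=0).astype(np.float32)

def decode_mask_binary(encoded: np.ndarray) -> np.ndarray:
    """Invert the binary encoding back to integer labels."""
    num_bits = encoded.shape[0]
    weights = 2 ** np.arange(num_bits - 1, -1, -1)
    return np.tensordot(weights, encoded.round(), axes=1).astype(np.int64)

mask = np.array([[0, 1], [5, 255]])
enc = encode_mask_binary(mask)  # shape (8, 2, 2); label 1 -> [0,0,0,0,0,0,0,1]
```

With this convention, a single-channel integer mask becomes an 8-channel binary tensor, which matches the example above where label 1 is encoded as [0, 0, 0, 0, 0, 0, 0, 1].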
-
Hi @guopengf, thank you.
-
While implementing our own mask autoencoder, we ran into two questions.
Specifically, our dataset's masks are single-channel, so we would use an input channel count of 1 and an output channel count of 1.
Our questions concern the different input and output channel settings used when pretraining the provided model weights.
The output channel count of 128 makes sense, since there are 128 possible label choices for organs and disease phenomena.
As for the input channel count, we were wondering: why is the input channel 7?
Also, for fine-tuning on our dataset, we were thinking of averaging/modifying the first and last layers of the model to match the desired input and output channels. Are there other recommended approaches?
Thank you.
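As a concrete reading of the averaging idea mentioned above, here is one way the first/last-layer adaptation could be sketched in PyTorch. The `nn.Sequential` model below is a hypothetical stand-in for the pretrained mask autoencoder (the real MAISI model and its layer names differ); only the weight-surgery pattern is the point.

```python
import torch
import torch.nn as nn

# Hypothetical stub standing in for a pretrained 8-in / 128-out autoencoder.
pretrained_in, pretrained_out = 8, 128
model = nn.Sequential(
    nn.Conv3d(pretrained_in, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv3d(32, pretrained_out, kernel_size=3, padding=1),
)

# Adapt the first layer: average the pretrained kernels over the
# input-channel dimension so a 1-channel input reuses the learned filters.
old_first = model[0]
new_first = nn.Conv3d(1, old_first.out_channels,
                      kernel_size=old_first.kernel_size,
                      padding=old_first.padding)
with torch.no_grad():
    new_first.weight.copy_(old_first.weight.mean(dim=1, keepdim=True))
    new_first.bias.copy_(old_first.bias)
model[0] = new_first

# Replace the last layer for a 1-channel output. It is freshly initialized,
# since the pretrained 128-channel head has no direct mapping to 1 channel.
old_last = model[-1]
model[-1] = nn.Conv3d(old_last.in_channels, 1,
                      kernel_size=old_last.kernel_size,
                      padding=old_last.padding)

x = torch.randn(1, 1, 8, 8, 8)
y = model(x)  # shape (1, 1, 8, 8, 8)
```

Averaging preserves the pretrained filters' low-level structure for the new single-channel input, while the re-initialized head is typically learned from scratch during fine-tuning.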