Line 85 in ec5afa9:

```python
features: Dict[str, _tensor_t] = {
```
I'd like to use this feature extractor with a standard ViT (CLIP) model. I found that the raw output of an attention layer is a tuple of the form (activation, None). Since this feature extractor is also used in icnn.py, it raises an error when we perform reconstruction analysis on an attention layer. One way to avoid this issue is to select the first element whenever the output is a tuple:
```python
if isinstance(output, tuple):
    features[layer] = output[0]
else:
    features[layer] = output
```
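A minimal, framework-free sketch of the proposed fix in a hook-style extractor. The function and variable names here (`store_feature`, `features`) are illustrative, not the repository's actual API; the point is only the tuple-unwrapping logic:

```python
# Sketch of the proposed fix: a hook-style callback that stores a layer's
# output, unwrapping tuples such as (activation, None) returned by
# attention layers in ViT/CLIP models.
from typing import Any, Dict

features: Dict[str, Any] = {}

def store_feature(layer: str, output: Any) -> None:
    # Attention layers may return (activation, attn_weights);
    # keep only the activation so downstream code sees a plain tensor.
    if isinstance(output, tuple):
        features[layer] = output[0]
    else:
        features[layer] = output

# Simulated outputs: a conv layer yields a plain value, while an
# attention layer yields a tuple whose second element is None.
store_feature("conv1", [1.0, 2.0])
store_feature("attn1", ([3.0, 4.0], None))
```

With this change, `features["attn1"]` holds the activation itself rather than the tuple, so reconstruction code that expects a tensor-like value works unchanged.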