@buiminhhien2k

Hi there,

This PR adds the UMIC score. It differs from the UMIC authors' implementation in two ways:

  1. The UMIC authors precomputed the image features for the popular benchmark datasets but did not provide the code that computes them. This PR computes the visual features with Detectron2, in the class ImageFeatureEmbedder, so this version can be applied to new datasets as well.
  2. Their work seems to use faster_rcnn_R_101_C4_3x.yaml, which produces visual features of size (B, N, 2048), whereas mine uses faster_rcnn_R_101_FPN_3x.yaml, which produces visual features of size (B, N, 1024). The idea is the same: both represent an image with region-level features. Replicating their setup with the faster_rcnn_R_101_C4_3x.yaml config is potential future work.

The files models/uniter/model.py and models/uniter/layer.py originate from the UNITER repository. My only change is replacing their FusedLayerNorm from apex with torch's LayerNorm: installing the whole of apex just for a norm layer seemed unreasonable when an equivalent layer can be imported directly from torch.
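For reference, the swap can be sketched roughly as below. This is a minimal illustration rather than the exact diff: it assumes the original code aliased apex's layer as `LayerNorm`, and `hidden_size = 768` with `eps=1e-12` are illustrative values, not taken from this PR.

```python
import torch
from torch import nn

# Assumed original import (requires apex):
#   from apex.normalization.fused_layer_norm import FusedLayerNorm as LayerNorm
# Replacement: alias torch's built-in layer under the same name, so call
# sites of the form LayerNorm(hidden_size, eps=...) stay unchanged.
LayerNorm = nn.LayerNorm

hidden_size = 768  # illustrative value, not taken from the PR
norm = LayerNorm(hidden_size, eps=1e-12)

# Dummy batch shaped (B, N, hidden_size); normalization is over the last dim.
x = torch.randn(2, 5, hidden_size)
y = norm(x)
```

Because `nn.LayerNorm` also exposes its learnable parameters as `weight` and `bias`, the rest of the model code should not need to change.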

Lastly, the instructions for installing detectron2 are already included in the umic_score.py file. I didn't modify your requirements.txt to include my detectron2 version because getting it running in my environment was quite tricky, so I recommend installing it separately.

Cheers,

Hien Bui
