Description
🚀 The feature
@Alexandre-SCHOEPP @NicolasHug @AntoineSimoulin
Thank you for releasing the tv_tensor keypoints functionality. I use it in my university teaching, where students build a custom keypoint dataloader for training Keypoint R-CNN.
The tv_tensors utilities handle the geometric transformations correctly, but they do not yet support the visibility flag, which Keypoint R-CNN requires for training (its targets store each keypoint as [x, y, visibility]).
Ultimately, this leads to redundant and inefficient code when building custom dataloaders. See the snippet below:
# Imports assumed by this excerpt (it runs inside the dataset's item-loading method)
import json
import torch
from torchvision import tv_tensors
from torchvision.transforms.v2 import functional as F

with open(annotation_path) as f:
    data = json.load(f)
shapes = data["shapes"]
keypoints = []
for shape in shapes:
    cx, cy = shape["points"][0]
    keypoints.append([cx, cy])
keypoints = torch.tensor(keypoints, dtype=torch.float32)
keypoints = keypoints.view(-1, 1, 2)  # [N instances, 1 keypoint each, (x, y)]
target = {}
target["keypoints"] = tv_tensors.KeyPoints(keypoints, canvas_size=F.get_size(img))
if self.transforms is not None:
    img, target = self.transforms(img, target)
## add visibility flag - redundant code
rows = target["keypoints"].shape[0]
visibility = torch.full((rows, 1, 1), 2.0, device=target["keypoints"].device, dtype=target["keypoints"].dtype)
target["keypoints"] = torch.cat([target["keypoints"], visibility], dim=2)  # -> [N, 1, 3]
Motivation, pitch
It would be much more efficient if I could include the visibility flag directly in the keypoints list and then wrap it with the tv_tensors.KeyPoints class. However, this is currently not possible because of the input tensor shape: the class expects only the (x, y) coordinates for each keypoint, so a third value does not fit.
A useful new feature would be to apply the transformations only to the first two values (x, y) while leaving the third value (visibility) unchanged.
I’m curious to hear your thoughts on this potential feature. Thank you in advance.
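To make the request concrete, here is a rough sketch of the behavior I have in mind, written with plain torch only. The names `transform_xy_keep_visibility` and `hflip_xy` are hypothetical helpers for illustration and are not part of torchvision; the flip is a toy stand-in for a geometric transform and may not match torchvision's exact pixel convention.

```python
import torch


def transform_xy_keep_visibility(keypoints_xyv, transform_xy):
    """Apply `transform_xy` to the (x, y) columns only and re-attach visibility.

    keypoints_xyv: tensor of shape [N, K, 3] in (x, y, visibility) order.
    transform_xy:  callable mapping an [N, K, 2] tensor to an [N, K, 2] tensor
                   (hypothetical stand-in for a geometric transform).
    """
    xy = keypoints_xyv[..., :2]
    visibility = keypoints_xyv[..., 2:]  # keeps the last dim, shape [N, K, 1]
    xy = transform_xy(xy)
    return torch.cat([xy, visibility.to(xy.dtype)], dim=-1)


def hflip_xy(xy, width=100.0):
    # Toy horizontal flip on a canvas `width` pixels wide: x -> width - x.
    flipped = xy.clone()
    flipped[..., 0] = width - flipped[..., 0]
    return flipped


# Two instances with one keypoint each: [N=2, K=1, (x, y, visibility)]
keypoints = torch.tensor([[[10.0, 20.0, 2.0]], [[30.0, 40.0, 1.0]]])
out = transform_xy_keep_visibility(keypoints, hflip_xy)
```

With something like this built into tv_tensors.KeyPoints, the x and y values would follow the image transform while the visibility column passes through unchanged, and the extra torch.cat step after the transforms in my dataloader would no longer be needed.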
Alternatives
No response
Additional context
No response