Description
I am currently working with the provided datasets and noticed a significant discrepancy between the processed_LAE-1M versions and the original official datasets (e.g., DOTAv2 / DIOR).
**Observation:** I compared `processed_LAE-1M_DOTAv2_train.json` with the official `DOTAv2_train.json` and found that the "1M" version has drastically fewer annotations.
- **File size:** the 1M version (67 MB) is much smaller than the original (165 MB).
- **Annotation count:** in dense scenes, the original dataset has ~2000 objects, while the 1M version often has fewer than 1000.
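For reference, this is roughly how I compared the two files. It assumes the JSON follows a standard COCO-style layout (a top-level `"annotations"` list whose entries carry an `"image_id"`); the helper names are my own, not from the repo:

```python
import json
from collections import Counter

def count_annotations(path):
    """Total number of entries in the (assumed) COCO-style 'annotations' list."""
    with open(path) as f:
        data = json.load(f)
    return len(data.get("annotations", []))

def max_objects_per_image(path):
    """Largest per-image annotation count, to spot dense scenes."""
    with open(path) as f:
        data = json.load(f)
    per_image = Counter(ann["image_id"] for ann in data.get("annotations", []))
    return max(per_image.values(), default=0)

# Hypothetical usage on the two files being compared:
# count_annotations("DOTAv2_train.json")
# count_annotations("processed_LAE-1M_DOTAv2_train.json")
```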
**My question:** What is the specific purpose or intended usage of this processed_LAE-1M dataset?
- Is it intended to be a subset for low-shot/semi-supervised learning?
- Is it generated by a model (pseudo-labels) rather than human annotation?
- Or is this a potential data processing error?
I am confused because the annotations seem too sparse to serve as standard ground truth for fully supervised training. Clarification on how this dataset fits into the LAE pipeline would be greatly appreciated.
Thanks!