I ran Grounding DINO on my own image, which contains boxes, an emergency alert, a lamp, and other objects. I used the official Colab notebook for Grounding DINO. According to the notebook, it's possible to detect multiple objects using a single prompt by separating them with commas.
However, when I prompt for three objects at once, for example 'boxes, emergency alert, lamp', none of them is detected. If I include only two objects in the prompt, only the last one is detected.
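For reference, here is a minimal sketch of how I build the prompt string. The helper `build_prompt` is hypothetical and only for illustration; it is not part of Grounding DINO or the notebook. The comma-separated form is the one I used. The period-separated form is an assumption on my part: as far as I understand, the Grounding DINO README suggests separating category names with `.`, so it may be worth comparing the two.

```python
def build_prompt(labels, separator=", "):
    """Join category names into one text prompt (hypothetical helper)."""
    # Normalize each label: trim whitespace, lowercase, skip empty entries.
    cleaned = [label.strip().lower() for label in labels if label.strip()]
    return separator.join(cleaned)


labels = ["boxes", "emergency alert", "lamp"]

# Format I used, following the notebook's comma-separated convention.
comma_prompt = build_prompt(labels)

# Assumption: period-separated variant, to compare against the comma form.
period_prompt = build_prompt(labels, separator=" . ")

print(comma_prompt)
print(period_prompt)
```

Either string would then be passed as the text prompt to the notebook's prediction cell in place of the one I typed by hand.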
Can you please help me understand why I'm unable to detect multiple objects using a single prompt?
Thank you in advance.