Use Open3D point cloud [methods](https://github.com/remyxai/VQASynth/blob/32e5fb31ebfbfbbef4090df02d9ad648699847df/tests/data_processing/clipseg_data_processing.py#L229) to get the 3D bounding box of each object. Synthesize QA pairs for 3D object detection tasks similar to 