The result of /ai/annotate-image differs slightly from the result the frontend produces, apparently because the frontend applies some not-yet-identified conversion to the image before sending it.
Base image: image
Base image from frontend: image_input_frontent
Sending image_input_frontent directly to the endpoint yields the same result as the frontend, so the difference comes from the input image rather than from the endpoint itself.
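
To narrow down what the frontend conversion changes, it may help to compare the decoded pixel data of the two inputs. The snippet below is only a rough sketch: `diff_stats` is a hypothetical helper, and it assumes both images have already been decoded into flat buffers of 8-bit values (how they are decoded is left open here). The toy buffers merely stand in for `image` and `image_input_frontent`.

```python
from typing import Sequence


def diff_stats(a: Sequence[int], b: Sequence[int]) -> dict:
    """Summarise how two flat pixel buffers differ (hypothetical helper)."""
    if len(a) != len(b):
        # A length mismatch already points at a resize or a channel change.
        return {"same_size": False, "differing": None, "max_abs_diff": None}
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return {
        "same_size": True,
        "differing": sum(1 for d in diffs if d),  # number of values that changed
        "max_abs_diff": max(diffs, default=0),    # largest single-value change
    }


# Toy buffers standing in for the decoded bytes of the two inputs.
base = bytes([10, 200, 30, 40])
frontend = bytes([10, 198, 30, 41])
print(diff_stats(base, frontend))
```

A differing size would suggest resizing or a dropped/added channel, while many small per-value differences would be more consistent with lossy re-encoding or a colour-space conversion.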