Now that we got image recognition to work, we should try "Object Detection". Something like this or this or this.
I wonder if we can download pre-trained models for that and then display the images from Smalltalk. Did you ever try, @SergeStinckwich?