Summary
Investigate how viable it is for a CLI program to also control the computer through Computer-Use tools (such as mouse click, mouse move, and take screenshot).
Since images are expensive in tokens, we need a way to "denoise" screenshots and reduce their size so that they include only what is essential.
Only a subset of LLMs will support this.
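One possible way to shrink screenshots, offered only as a starting point for the investigation: send just the region that changed since the previous screenshot, optionally downscaled. The sketch below is an assumption, not a chosen design. It uses plain nested lists of grayscale values as a stand-in for real image data, and the names `changed_region`, `crop`, and `downscale` are hypothetical; a real implementation would likely use an imaging library instead.

```python
from typing import List, Optional, Tuple

# Hypothetical in-memory format: rows of grayscale pixel values (0-255).
Frame = List[List[int]]
# (left, top, right, bottom); right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def changed_region(prev: Frame, curr: Frame, threshold: int = 8) -> Optional[Box]:
    """Return the bounding box of pixels that differ by more than `threshold`,
    or None when the two frames are effectively identical."""
    xs, ys = [], []
    for y, (row_prev, row_curr) in enumerate(zip(prev, curr)):
        for x, (a, b) in enumerate(zip(row_prev, row_curr)):
            if abs(a - b) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def crop(frame: Frame, box: Box, padding: int = 0) -> Frame:
    """Cut `box` out of `frame`, widened by `padding` pixels and clamped to the frame."""
    left, top, right, bottom = box
    height, width = len(frame), len(frame[0])
    left, top = max(0, left - padding), max(0, top - padding)
    right, bottom = min(width, right + padding), min(height, bottom + padding)
    return [row[left:right] for row in frame[top:bottom]]


def downscale(frame: Frame, factor: int) -> Frame:
    """Keep every `factor`-th pixel in both directions (nearest-neighbour)."""
    return [row[::factor] for row in frame[::factor]]
```

With this approach, the model would receive a small crop around whatever changed after each action rather than the full screen. Whether that keeps enough context for the model to act reliably is one of the questions this investigation should answer.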
Acceptance Criteria