By adding `tsv` or `hocr` to the end of the tesseract command, you can get the positions of words. We should support this as a return type, possibly converted to JSON (this might be the cleanest option).
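
As a rough sketch of what the JSON conversion could look like, the snippet below parses Tesseract's TSV output into a list of word records. The column names (`level`, `left`, `top`, `width`, `height`, `conf`, `text`) are the ones Tesseract writes in its TSV header. The function names `tsv_to_words` and `run_tesseract_tsv`, the shape of the JSON, and the values in `SAMPLE_TSV` are assumptions made up for illustration, not an existing API or real OCR output.

```python
import csv
import io
import json
import subprocess


def tsv_to_words(tsv_text):
    """Convert Tesseract TSV text into a list of word dicts (hypothetical helper)."""
    # QUOTE_NONE: recognized text may contain quote characters that are not CSV quoting.
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        # Level 5 rows are individual words; other levels are pages, blocks, paragraphs, lines.
        if row.get("level") != "5" or not text:
            continue
        words.append({
            "text": text,
            "left": int(row["left"]),
            "top": int(row["top"]),
            "width": int(row["width"]),
            "height": int(row["height"]),
            "conf": float(row["conf"]),
        })
    return words


def run_tesseract_tsv(image_path):
    """Run the tesseract CLI and return its TSV output (requires tesseract on PATH)."""
    result = subprocess.run(
        ["tesseract", image_path, "stdout", "tsv"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Made-up sample in Tesseract's TSV layout, used only to exercise the parser.
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t640\t480\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t36\t92\t60\t24\t96\tHello\n"
    "5\t1\t1\t1\t1\t2\t102\t92\t70\t24\t95\tworld\n"
)

if __name__ == "__main__":
    print(json.dumps(tsv_to_words(SAMPLE_TSV), indent=2))
```

With a real image, `json.dumps(tsv_to_words(run_tesseract_tsv("page.png")))` would give the same structure, where `page.png` is a placeholder path. Keeping only word-level rows is one possible design choice; a nested page/block/line structure could be built from the other levels if callers need it.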