I had trouble with one PDF. It contained double-page spreads as JPEGs, but `pdfimages` extracted each one twice, which doubled the file size (I guess the PDF referenced each image twice with different cropping, or something like that). Moreover, the images were rotated, and they were stored as colour files even though the content was monochrome. These incantations solved it for me, albeit doing twice as much work as necessary:
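
    # extract embedded images in their native format (JPEGs stay JPEGs)
    pdfimages -all input.pdf tmp
    # rotate in place by 270 degrees, losslessly
    exiftran -i -2 *.jpg
    # left half of each spread, converted to greyscale without recompressing
    for i in tmp-???.jpg ; do jpegtran -grayscale -copy none -crop 877x1240+0+0 "$i" > "$i-l.jpg" ; done
    # right half of each spread
    for i in tmp-???.jpg ; do jpegtran -grayscale -copy none -crop 877x1240+876+0 "$i" > "$i-r.jpg" ; done
    # OCR all the pages, in order, into output.pdf
    ls tmp-*.jpg-?.jpg > tmp.list
    tesseract tmp.list output pdf
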
`jpegtran` is from `libjpeg-progs` on Debian-based Linux distributions. I'm not sure which package provides `exiftran`; I already had it installed. It's better to use `jpegtran` than ImageMagick's `convert` or similar tools because `jpegtran` doesn't decode and recompress the image (recompression is a lossy operation).
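
The crop geometries above are hard-coded for my scans. If your spreads have a different size, the two `-crop` arguments can be derived from the image dimensions. The sketch below is only an illustration: `spread_geometry` is a hypothetical helper name, and the 1753x1240 size in the example is my assumption about what the numbers above correspond to (877 + 876 = 1753), not something read from the PDF.

```shell
# Hypothetical helper: print the -crop geometry for the left and right
# halves of a double-page spread, one per line.
# Usage: spread_geometry WIDTH HEIGHT
spread_geometry() {
    w=$1
    h=$2
    half=$(( (w + 1) / 2 ))   # width of each half, rounded up
    right=$(( w - half ))     # x offset where the right half starts
    echo "${half}x${h}+0+0"
    echo "${half}x${h}+${right}+0"
}

# Example, assuming a spread that is 1753 pixels wide and 1240 high:
spread_geometry 1753 1240
```

For an odd width the two halves overlap by one pixel column, as in the commands above. As far as I know, `jpegtran` also moves a crop offset that is not on a JPEG block boundary up and left to the nearest boundary and enlarges the region to compensate, so the right-hand page may come out a few pixels wider than requested.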