First you must tweak those images. I recommend a batch tool like XnViewMP, which is free and multiplatform.
It has a file explorer. Select all your images, then go to Tools > Batch convert and add actions.
Here are my actions:
- HLS, to make it grayscale:
    - Hue: 0
    - Lightness: 0
    - Saturation: -127
- Levels, lowering the white point a bit so that the gray noise disappears:
    - Black point: 0
    - White point: 212 (may vary depending on the image)
- Reduce noise filter
- Adjust, to increase the contrast:
    - Brightness: 0
    - Contrast: 127 (this one matters)
    - Gamma: 1.06
- Minimum, to make the black thicker:
    - Filter size: 5x5 (may vary depending on the image)
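
To make the Levels and Minimum steps more concrete, here is a minimal pure-Python sketch of what such filters typically compute on a grayscale image stored as a list of rows of 0-255 values. The function names `apply_levels` and `minimum_filter` are made up for illustration, and the formulas are the usual textbook ones; I am assuming, not claiming, that XnViewMP does something similar internally.

```python
def apply_levels(gray, black=0, white=212):
    """Linearly stretch [black, white] to [0, 255], clipping values outside it.

    With white=212, any pixel at 212 or brighter becomes pure white,
    which is what removes light gray background noise.
    """
    span = white - black
    return [
        [min(255, max(0, round((v - black) * 255 / span))) for v in row]
        for row in gray
    ]


def minimum_filter(gray, size=5):
    """Replace each pixel with the darkest value in its size x size neighbourhood.

    Dark strokes grow by size // 2 pixels in every direction, so thin
    black glyphs become thicker. Edges use the part of the window that
    lies inside the image.
    """
    height, width = len(gray), len(gray[0])
    r = size // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            ys = range(max(0, y - r), min(height, y + r + 1))
            xs = range(max(0, x - r), min(width, x + r + 1))
            row.append(min(gray[j][i] for j in ys for i in xs))
        out.append(row)
    return out
```

Calling `minimum_filter(apply_levels(img))` on such a list of rows applies both steps in the order used above; the real tool works on image files, so treat this only as an explanation of the settings.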
Don't forget to save as TIFF (see the Output tab). After that I run tesseract:
tesseract test.tif text -psm 7
Note that I selected PSM 7 (treat the image as a single text line). If you have multiple lines, you'll probably need mode 6 or 3. On Tesseract 4 and later the option is spelled --psm instead of -psm.
And here are the contents of the text.txt output file:
570 394 666 638 043
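
If you have many converted images, the tesseract call above can be scripted. This is a small sketch using only Python's standard library; `build_tesseract_cmd`, `run_ocr` and the `legacy_flag` parameter are names I made up for illustration, and it assumes the `tesseract` executable is on your PATH.

```python
import subprocess


def build_tesseract_cmd(image, output_base, psm=7, legacy_flag=False):
    """Return the argument list for one tesseract run.

    legacy_flag=True uses the old single-dash "-psm" spelling (Tesseract 3.x);
    the default uses "--psm" (Tesseract 4 and later).
    """
    flag = "-psm" if legacy_flag else "--psm"
    return ["tesseract", image, output_base, flag, str(psm)]


def run_ocr(image, output_base, psm=7, legacy_flag=False):
    """Run tesseract, then return the text it wrote to <output_base>.txt."""
    cmd = build_tesseract_cmd(image, output_base, psm, legacy_flag)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    with open(output_base + ".txt", encoding="utf-8") as fh:
        return fh.read().strip()
```

For the example above, `run_ocr("test.tif", "text", psm=7, legacy_flag=True)` issues the same command I ran by hand and returns the contents of text.txt.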