Python, pytesseract not recognizing image


I have this simple code (see below).

The path to the exe is correct, and the path to the screenshot is correct: there is an image at that location.

screenshot.jpg

Why can't it recognize the text, even though the text is perfectly readable?

I tried using ImageEnhance, thresholding, and converting to grayscale. I also tried the .png format, but it still doesn't work. Why is this happening?

import pytesseract as pt
from PIL import Image

pt.pytesseract.tesseract_cmd = 'D:\\tesseract\\tesseract.exe'

def extract_text_from_image(image_path):
    # Open the image with Pillow and run Tesseract OCR on it
    img = Image.open(image_path)
    text = pt.image_to_string(img)
    return text

image_path = 'D:\\script\\materials\\screenshot.jpg'
text = extract_text_from_image(image_path)
print(text)

There is 1 answer

Answered by H.Syd

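I managed to read the text from your image by passing page segmentation mode 6 to the image_to_string method. Mode 6 tells Tesseract to treat the image as a single uniform block of text, rather than relying on the default fully automatic page segmentation (mode 3), which may not find the text region in an image like this one.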

pt.image_to_string(img, config="--psm 6")

You can find out more about the different segmentation modes and what they do here: https://pyimagesearch.com/2021/11/15/tesseract-page-segmentation-modes-psms-explained-how-to-improve-your-ocr-accuracy/
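If mode 6 does not help with a different image, one way to find a suitable mode is to run the same image through a few modes and compare the output. This is only a minimal sketch: psm_config and try_psm_modes are hypothetical helper names (not part of pytesseract), the choice of modes 3, 6, 7 and 11 is just an assumption about which ones are worth trying first, and the ocr parameter exists only so the loop can be exercised without Tesseract installed.

```python
from typing import Callable, Dict, Iterable, Optional


def psm_config(mode: int) -> str:
    """Build the Tesseract config string for one page segmentation mode."""
    # Tesseract 4+ defines page segmentation modes 0 through 13.
    if not 0 <= mode <= 13:
        raise ValueError(f"page segmentation mode must be 0-13, got {mode}")
    return f"--psm {mode}"


def try_psm_modes(
    img,
    modes: Iterable[int] = (3, 6, 7, 11),
    ocr: Optional[Callable[..., str]] = None,
) -> Dict[int, str]:
    """Run OCR once per mode and return {mode: recognized text}.

    3 = fully automatic (default), 6 = single uniform block of text,
    7 = single text line, 11 = sparse text.
    """
    if ocr is None:
        # Imported lazily so this helper can be loaded without pytesseract.
        import pytesseract
        ocr = pytesseract.image_to_string

    results = {}
    for mode in modes:
        results[mode] = ocr(img, config=psm_config(mode)).strip()
    return results


if __name__ == "__main__":
    import pytesseract
    from PIL import Image

    # Paths taken from the question; adjust them to your own setup.
    pytesseract.pytesseract.tesseract_cmd = 'D:\\tesseract\\tesseract.exe'
    img = Image.open('D:\\script\\materials\\screenshot.jpg')

    for mode, text in try_psm_modes(img).items():
        print(f"--- psm {mode} ---")
        print(text)
```

Whichever mode prints the expected text is the one to pass in config. An empty string for a mode just means Tesseract found nothing with that layout assumption.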