How can I extract specific text from a PDF using Python?
Example: the PDF contains (Name: Python , Color: Blue). In that case I want to extract only the text that comes after "Name:" and stop at the "," between "Python" and "Color".
Any help is appreciated.
import PyPDF2
# Open the file in binary mode, which PdfReader needs for a file object
pdf = open("C:\\Users\\ME\\Desktop\\test.pdf", "rb")
reader = PyPDF2.PdfReader(pdf)
page = reader.pages[0]  # first page only
print(page.extract_text())
This prints all of the text on the page, not just the part I want.
Since extract_text() returns a string, you can use a regular expression to pull out the part you want:
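A minimal sketch of the regex approach, assuming the extracted text looks like the sample in the question ("Name: Python , Color: Blue"). The helper name `extract_field` is made up for illustration and is not part of PyPDF2, and the real output of `extract_text()` may have different spacing or line breaks, so the pattern may need adjusting.

```python
import re

def extract_field(text, label):
    """Return the text after `label`, up to the next comma or line break.

    Hypothetical helper for illustration; returns None if the label is absent.
    """
    # re.escape guards against labels containing regex metacharacters.
    pattern = re.escape(label) + r"\s*([^,\n]+)"
    match = re.search(pattern, text)
    return match.group(1).strip() if match else None

# Stand-in for the string returned by page.extract_text()
text = "Name: Python , Color: Blue"

print(extract_field(text, "Name:"))   # Python
print(extract_field(text, "Color:"))  # Blue
```

In the original script you would pass `page.extract_text()` as `text`. The character class `[^,\n]+` is what stops the match at the comma, so "Color" is never included in the result for "Name:".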