Cannot convert Hebrew characters using pdftotext


I have a PDF file that I can open and read without problems, and that I can send to anyone:

[Image: the PDF, which I can read and see clearly]

Now I want to convert it to text. I am on Linux, so I tried these three commands:

  1. pdftotext -enc ISO-8859-8 -layout barIlan.pdf bar.txt
  2. pdftotext -enc UTF-8 -layout barIlan.pdf bar.txt
  3. pdftotext -layout barIlan.pdf bar.txt
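The three commands above all write to bar.txt, so each run overwrites the previous one. A minimal sketch of running the same variants side by side and counting how many Hebrew letters each output actually contains. The output file names and the `count_hebrew` helper are made up for illustration, and the sketch assumes the poppler build of pdftotext, whose default output encoding is UTF-8:

```shell
#!/bin/sh
# count_hebrew (hypothetical helper): count Hebrew letters (U+05D0..U+05EA)
# in a UTF-8 text file. In UTF-8 these are the byte pairs D7 90 .. D7 AA,
# so we scan a hex dump and do not depend on the current locale.
count_hebrew() {
    od -An -v -tx1 "$1" | tr '\n' ' ' | tr -s ' ' |
        grep -E -o 'd7 (9[0-9a-f]|a[0-9a])' | wc -l | tr -d ' '
}

pdf=barIlan.pdf
# Only run the conversions if the tool and the file are actually present.
if command -v pdftotext >/dev/null 2>&1 && [ -f "$pdf" ]; then
    # Same three variants as above, each written to its own file.
    pdftotext -layout "$pdf" bar-default.txt
    pdftotext -enc UTF-8 -layout "$pdf" bar-utf8.txt
    pdftotext -enc ISO-8859-8 -layout "$pdf" bar-8859-8.raw
    # Convert the ISO-8859-8 output to UTF-8 so all three can be counted alike.
    iconv -f ISO-8859-8 -t UTF-8 bar-8859-8.raw > bar-8859-8.txt

    for f in bar-default.txt bar-utf8.txt bar-8859-8.txt; do
        printf '%s: %s Hebrew letters\n' "$f" "$(count_hebrew "$f")"
    done
fi
```

A count of zero for every variant would suggest that the `-enc` option is not where the problem lies, since no output encoding yields Hebrew code points.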

Each command converted the PDF to a text file, but when I open the output file I see:

[Image: the text after I convert it]

Changing the output encoding made no difference: none of the three commands produced readable Hebrew.

I am fairly sure it is an encoding problem, because I have another Hebrew PDF, and running pdftotext -layout Ariel.pdf ariel.txt on it works and shows the Hebrew characters correctly.
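One way to compare the two files is to look at their fonts with pdffonts, which ships alongside pdftotext. Its `uni` column reports whether each font carries a ToUnicode map. A missing map is one possible reason for unreadable extracted text, though nothing in this post confirms that this is the cause here. The sketch below assumes the usual pdffonts table layout, in which `uni` is the third field from the end of each row. The helper name `fonts_without_tounicode` is made up for illustration:

```shell
#!/bin/sh
# fonts_without_tounicode (hypothetical helper): read pdffonts output on stdin
# and print the first word of the name of every font whose "uni" column is "no".
# Assumes each row ends with: emb sub uni <object number> <generation>.
fonts_without_tounicode() {
    awk 'NR > 2 && NF >= 7 && $(NF-2) == "no" { print $1 }'
}

# Compare the PDF that fails with the one that works.
for pdf in barIlan.pdf Ariel.pdf; do
    if command -v pdffonts >/dev/null 2>&1 && [ -f "$pdf" ]; then
        echo "== $pdf =="
        pdffonts "$pdf"
        echo "-- fonts without a ToUnicode map:"
        pdffonts "$pdf" | fonts_without_tounicode
    fi
done
```

If barIlan.pdf lists fonts with `uni` set to `no` while Ariel.pdf does not, that difference would be worth mentioning in the question.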
