I am trying to create a corpus using pdf documents


When I run the code below to build the corpus, I get the following error:

PDF error: Unknown Metadata type: 'XMP'
PDF error: Unknown Metadata type: 'XMP'

library(tm)
# Build a corpus from the PDF files using tm's PDF reader
corp <- Corpus(URISource(files),
               readerControl = list(reader = readPDF))

All the PDF documents are saved in the working directory, and I created the vector of file names with the following code:

files <- list.files(pattern = "pdf$")

I can also read the files and see the number of pages in each PDF with the following code:

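library(pdftools)
# pdf_text() returns one string per page, so each list element has one entry per page
my_corpus <- lapply(files, pdf_text)
length(my_corpus)          # number of PDF files
lapply(my_corpus, length)  # number of pages in each file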

How can I solve this?
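
One possible workaround, offered as a sketch rather than a confirmed fix: since `pdf_text` already reads each file, the corpus could be built from that extracted text directly, without going through `readPDF`. This does not explain the `XMP` message itself; it only avoids the `readPDF` path, on the assumption that `pdf_text` reads the files without problems, as described above. The name `texts` is made up for illustration.

```r
library(tm)        # VCorpus, VectorSource
library(pdftools)  # pdf_text

# List the PDF files in the working directory (same pattern as in the question).
files <- list.files(pattern = "pdf$")

# pdf_text() returns one string per page; join the pages so that
# each PDF becomes a single document. `texts` is a hypothetical name.
texts <- vapply(files,
                function(f) paste(pdf_text(f), collapse = "\n"),
                character(1))

# Build the corpus from the extracted text instead of using readPDF.
corp <- VCorpus(VectorSource(texts))

length(corp)  # should equal length(files)
```

Joining the pages keeps one corpus document per PDF; dropping the `paste` step and passing `unlist(my_corpus)` to `VectorSource` would instead give one document per page.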
