Parsing HTML with Spray

I get the exception "The entity "nbsp" was referenced, but not declared" when unmarshalling an HttpEntity into a NodeSeq with spray.httpx.unmarshalling.BasicUnmarshallers.NodeSeqUnmarshaller. The response is valid HTML, but it contains the &nbsp; entity, which makes it invalid XML. I do not control the server.

I could probably preprocess the HTML to strip out &nbsp;, but what is the accepted way to parse HTML that contains &nbsp; with Spray?

There is 1 answer

Brian Kent (best answer)

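You could write a custom Unmarshaller that wraps jsoup. jsoup parses HTML leniently and knows the named HTML entities, so &nbsp; does not trip it up the way it does a strict XML parser.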
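A minimal sketch of such an unmarshaller, assuming Spray 1.x and jsoup are both on the classpath. The names JsoupUnmarshalling and JsoupDocumentUnmarshaller are made up for illustration. The result type is a jsoup Document rather than a NodeSeq, so downstream code would query it with jsoup selectors instead of scala.xml.

```scala
import org.jsoup.Jsoup
import org.jsoup.nodes.Document
import spray.http.{HttpEntity, MediaTypes}
import spray.httpx.unmarshalling.Unmarshaller

// Hypothetical holder object; import its members where the unmarshaller is needed.
object JsoupUnmarshalling {

  // Accepts text/html entities and hands the decoded body to jsoup,
  // which tolerates HTML-only entities such as &nbsp;.
  implicit val JsoupDocumentUnmarshaller: Unmarshaller[Document] =
    Unmarshaller[Document](MediaTypes.`text/html`) {
      case HttpEntity.NonEmpty(contentType, data) =>
        // Decode the bytes with the charset declared by the response.
        Jsoup.parse(data.asString(contentType.charset))
      case HttpEntity.Empty =>
        // Assumption: an empty body is treated as an empty document.
        Jsoup.parse("")
    }
}
```

With import JsoupUnmarshalling._ and import spray.httpx.unmarshalling._ in scope, response.entity.as[Document] should then yield either a deserialization error or the parsed document, and the same implicit should be picked up by unmarshal[Document] in a spray-client pipeline.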