I am using QXmlSimpleReader to parse an XML file with internally defined entities in it, e.g.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ELEMENT root (#PCDATA)>
<!ENTITY ent "some internally defined entity">
]>
<root>
text &ent; text
</root>
I am handling the document with a QXmlDefaultHandler subclass and the most I can do about internal entities is to have their usage reported.
By default, all internally defined entities (&ent; in the example above) are automatically replaced with their declared text. How can I change this behavior so that I can specify the string each entity is replaced with? I am also fine with switching to another Qt XML reader if that is required to make it work.
I found one way to do it, although it is more of a hack than a proper solution, since it doesn't stop Qt from actually expanding the entities into their resolved characters. It's just a workaround in which those characters are ignored.
First, make the QXmlSimpleReader report entities by setting the appropriate feature, and have your handler handle both content and lexical info.
Next, in that handler, override bool QXmlLexicalHandler::startEntity(const QString &name) and bool QXmlLexicalHandler::endEntity(const QString &name), and keep a state flag recording whether the reader is currently reading an entity. While it is, just ignore the input from bool QXmlContentHandler::characters(const QString &ch) and instead handle the resolution in startEntity or endEntity.