I am having a hard time figuring out how to bind a ResolveEntityHandler of my own to a SAX parser. On SO there is this answer, but unfortunately I cannot reproduce its result.
When I run the following code, which is copied from the aforementioned answer and only updated to Python 3,
import io
import xml.sax
from xml.sax.handler import ContentHandler

# Inheriting from EntityResolver and DTDHandler is not necessary
class TestHandler(ContentHandler):

    # This method is only called for external entities. Must return a value.
    def resolveEntity(self, publicID, systemID):
        print("TestHandler.resolveEntity(): %s %s" % (publicID, systemID))
        return systemID

    def skippedEntity(self, name):
        print("TestHandler.skippedEntity(): %s" % (name))

    def unparsedEntityDecl(self, name, publicID, systemID, ndata):
        print("TestHandler.unparsedEntityDecl(): %s %s" % (publicID, systemID))

    def startElement(self, name, attrs):
        summary = attrs.get('summary', '')
        print('TestHandler.startElement():', summary)

def main(xml_string):
    try:
        parser = xml.sax.make_parser()
        curHandler = TestHandler()
        parser.setContentHandler(curHandler)
        parser.setEntityResolver(curHandler)
        parser.setDTDHandler(curHandler)

        stream = io.StringIO(xml_string)
        parser.parse(stream)
        stream.close()
    except xml.sax.SAXParseException as e:
        print("ERROR %s" % e)

XML = """<!DOCTYPE test SYSTEM "test.dtd">
<test summary='step: &num;'>Entity: &not;</test>
"""

main(XML)
and the external test.dtd
<!ENTITY num "FOO">
<!ENTITY pic SYSTEM 'bar.gif' NDATA gif>
What I got is
TestHandler.startElement(): step:
TestHandler.skippedEntity(): not
Process finished with exit code 0
So my questions are:
Why is resolveEntity never called?

What you are seeing has to do with a change in Python 3.7.1:
Changed in version 3.7.1: The SAX parser no longer processes general external entities by default to increase security. Before, the parser created network connections to fetch remote files or loaded local files from the file system for DTD and entities. The feature can be enabled again with method setFeature() on the parser object and argument feature_external_ges.
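If you want to confirm that this is what is happening on your interpreter, you can ask the parser for the current value of the feature. This is only a small sketch: the helper name external_ges_enabled_by_default is made up for illustration, and it assumes make_parser() returns the default expat-based reader on Python 3.7.1 or later.

```python
import xml.sax
from xml.sax.handler import feature_external_ges


def external_ges_enabled_by_default():
    """Return True if a fresh parser processes general external entities."""
    # Hypothetical helper; assumes the default expat-based SAX reader.
    parser = xml.sax.make_parser()
    return bool(parser.getFeature(feature_external_ges))


if __name__ == "__main__":
    print("external general entities enabled:",
          external_ges_enabled_by_default())
```

On an affected version this should report that the feature is off, which is why the resolver is never consulted for test.dtd.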
To get the same behaviour as in earlier versions, add these lines:
from xml.sax.handler import feature_external_ges
and (in the main function)
parser.setFeature(feature_external_ges, True)
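Put together, a self-contained variant of the question's code could look like the sketch below. To keep it runnable without any files on disk, it differs from the original in a few ways that are my own assumptions: the handler (renamed RecordingHandler) collects events in a list instead of printing, the DTD lives in a hypothetical in-memory constant DTD_BYTES that contains only the num entity, and resolveEntity returns an InputSource wrapping those bytes instead of returning the system ID.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical in-memory stand-in for the external test.dtd file.
DTD_BYTES = b'<!ENTITY num "FOO">\n'

XML = """<!DOCTYPE test SYSTEM "test.dtd">
<test summary='step: &num;'>Entity: &not;</test>
"""


class RecordingHandler(ContentHandler, EntityResolver):
    """Records the callbacks it receives instead of printing them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def resolveEntity(self, publicId, systemId):
        # Only reached once feature_external_ges has been switched on.
        self.events.append(("resolveEntity", publicId, systemId))
        source = InputSource()
        source.setByteStream(io.BytesIO(DTD_BYTES))
        return source

    def skippedEntity(self, name):
        self.events.append(("skippedEntity", name))

    def startElement(self, name, attrs):
        self.events.append(("startElement", name, attrs.get("summary", "")))


def parse_with_external_entities(xml_string):
    parser = xml.sax.make_parser()
    # Re-enable the behaviour that was switched off in Python 3.7.1.
    parser.setFeature(feature_external_ges, True)
    handler = RecordingHandler()
    parser.setContentHandler(handler)
    parser.setEntityResolver(handler)
    parser.parse(io.StringIO(xml_string))
    return handler.events


if __name__ == "__main__":
    for event in parse_with_external_entities(XML):
        print(event)
```

Keep in mind why the default was changed: with the feature enabled and a resolver that simply returns the system ID, the parser will again open local files or network connections named in the document, so only do this for input you trust or resolve entities yourself as in the sketch.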