
DTD Validation Failing (Python)


I am writing a Python script that generates files from an XML document and a DTD passed as inputs, but it fails because DTD validation does not pass, even though I cannot see any problem by eye.

Here is my code:

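import sys
import libxml2

DTD = 'scenario.dtd'

def OpenXML(xmlDesc):
    # Load the DTD from disk and create a validation context.
    dtd = libxml2.parseDTD(None, DTD)
    ctxt = libxml2.newValidCtxt()
    # xmlDesc holds the XML as a string, so it is parsed from memory.
    doc = libxml2.parseDoc(xmlDesc)

    frags = doc.xpathEval('/scenario/config_script/param/*')
    for frag in frags:
        frag.unlinkNode()   # We remove children of param for validation

    if doc.validateDtd(ctxt, dtd) != 1:
        print("ERROR : DTD Validation failed ! ")
        sys.exit(1)

    doc.freeDoc()
    dtd.freeDtd()

    # Re-parse the original string so the returned document keeps the
    # children of param (parseDoc, not parseFile, since xmlDesc is a string).
    return libxml2.parseDoc(xmlDesc)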

Here are the DTD and the XML string I pass as the parameter (xmlDesc).

Original DTD (scenario.dtd)

 <!ELEMENT scenario (name, description, config_script*)>
 <!ELEMENT name (#PCDATA)>
 <!ELEMENT description (#PCDATA)>
 <!ELEMENT config_script (param)>
 <!ELEMENT param ANY>

 <!ATTLIST scenario target (win32|win64|linux32|linux64) "win32">
 <!ATTLIST config_script name CDATA #REQUIRED>
 <!ATTLIST config_script repository CDATA #REQUIRED>

Value of the dtd variable (1st line of the function)

<!DOCTYPE none SYSTEM "scenario.dtd" [
 <!ELEMENT scenario (name, description, config_script*)>
 <!ELEMENT name (#PCDATA)>
 <!ELEMENT description (#PCDATA)>
 <!ELEMENT config_script (param)>
 <!ELEMENT param ANY>

 <!ATTLIST scenario target (win32|win64|linux32|linux64) "win32">
 <!ATTLIST config_script name CDATA #REQUIRED>
 <!ATTLIST config_script repository CDATA #REQUIRED>

]>

XML (it is all on one line in my case; I have broken the lines here for readability)

<config_scripts>
    <script name="reset" repository="config_os">
        <param>
            <user>
                <name/>
                <full_name/>
                <password/>
                <groups/>
            </user>
        </param>
    </script>
</config_scripts>

In the end I get this error: ERROR : DTD Validation failed !

The console also shows:

No declaration for element config_script
No declaration for element script
No declaration for attribute name of element script
No declaration for attribute repository of element script
No declaration for element user 
No declaration for element full_name
No declaration for element password
No declaration for element groups

But as far as I know, they are declared. Or is it because I left all the elements empty?

Any ideas?

Best regards and thank you


Solution

  • I'm not sure if there is anything wrong with the Python code, but I can tell you what's wrong with your DTD.

    First, the name in your doctype declaration should match the name of your root element. You have none, but your root element is config_scripts.

    Your doctype declaration names scenario.dtd as its system identifier even though the declarations are already the content of scenario.dtd. You should remove the system identifier.

    In your XML you have a script element, which is not declared. You do have config_script declared, though, so either the XML or the DTD needs to be changed. I changed the DTD in my example. (I also combined the ATTLIST declarations.)

    You also didn't declare these elements: user, full_name, password, and groups.

    Here's what the DTD should look like (without any modifications to the XML):

    <!DOCTYPE config_scripts [
    <!ELEMENT scenario (name, description, config_script*)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT description (#PCDATA)>
    <!ELEMENT config_scripts (script)>
    
    <!ELEMENT script (param)>
    <!ATTLIST script 
               name CDATA #REQUIRED
               repository CDATA #REQUIRED> 
    
    <!ELEMENT param ANY>
    
    <!ELEMENT user (name,full_name,password,groups)>
    <!ELEMENT full_name (#PCDATA)>
    <!ELEMENT password (#PCDATA)>
    <!ELEMENT groups (#PCDATA)>
    
    <!ATTLIST scenario target (win32|win64|linux32|linux64) "win32">
    ]>
    

    The XML validates against this DTD in oXygen, so if any other changes need to be made, they will most likely need to be made in the Python code.
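
The mismatch described above can also be spotted without a validator. The following is only a minimal sketch using the Python standard library: it compares the element names used in the XML with the names that have an `<!ELEMENT ...>` declaration in the DTD text. The helper names (`declared_elements`, `used_elements`, `undeclared_elements`) are hypothetical and not part of libxml2, and the check ignores content models and attributes, so it is a quick diagnostic rather than real DTD validation.

```python
import re
import xml.etree.ElementTree as ET

# The original DTD from the question (element declarations only).
DTD_TEXT = """
<!ELEMENT scenario (name, description, config_script*)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT config_script (param)>
<!ELEMENT param ANY>
"""

# The XML from the question.
XML_TEXT = """
<config_scripts>
    <script name="reset" repository="config_os">
        <param>
            <user>
                <name/>
                <full_name/>
                <password/>
                <groups/>
            </user>
        </param>
    </script>
</config_scripts>
"""


def declared_elements(dtd_text):
    """Return the element names that have an <!ELEMENT ...> declaration."""
    return set(re.findall(r"<!ELEMENT\s+([^\s>]+)", dtd_text))


def used_elements(xml_text):
    """Return every element name that appears in the XML document."""
    root = ET.fromstring(xml_text.strip())
    return {node.tag for node in root.iter()}


def undeclared_elements(dtd_text, xml_text):
    """Return the sorted element names used in the XML but not declared."""
    return sorted(used_elements(xml_text) - declared_elements(dtd_text))


if __name__ == "__main__":
    for name in undeclared_elements(DTD_TEXT, XML_TEXT):
        print("No declaration for element " + name)
```

Run against the original DTD, this reports config_scripts, script, user, full_name, password, and groups, which are the elements the corrected DTD adds declarations for.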