python xml-parsing jenkins yaml jenkins-job-builder

How to convert a Jenkins job config.xml to YAML in Python for use with jenkins-job-builder?


jenkins-job-builder is a nice tool that helps me maintain jobs in YAML files. See the example in the configuration chapter.

I have lots of old Jenkins jobs, so it would be nice to have a Python script, xml2yaml, that converts an existing Jenkins job config.xml to the YAML file format.

Do you have any suggestions for a quick solution in Python?

The output doesn't need to be usable by jenkins-job-builder directly; it just needs to be YAML that I can use for reference.

Some parts, such as namespaces, can be ignored during the conversion.

A config.xml segment looks like:

<project>
  <logRotator class="hudson.tasks.LogRotator">
    <daysToKeep>-1</daysToKeep>
    <numToKeep>20</numToKeep>
    <artifactDaysToKeep>-1</artifactDaysToKeep>
    <artifactNumToKeep>-1</artifactNumToKeep>
  </logRotator>
  ...
</project>

The YAML output could be:

- project:
   logrotate:
     daysToKeep: -1
     numToKeep: 20
     artifactDaysToKeep: -1
     artifactNumToKeep: -1

If you are not familiar with config.xml in Jenkins, you can look at the infra_backend-merge-all-repo job at https://ci.jenkins-ci.org


Solution

  • It's hard to tell from your question exactly what you're looking for here, but assuming you're looking for the basic structure:

    Python has good XML parsing support on most platforms. You'll probably want something simple and easy to use, like minidom. See the XML Processing Modules chapter in the Python docs for your version of Python.

    Once you've opened the file, find the project element, walk down from there, and apply a simple mapping from XML tags to YAML keys. That should work pretty well given how simple the YAML format is.

    from xml.dom.minidom import parse

    def getText(nodelist):
        # Concatenate the text of all text nodes in the list.
        rc = []
        for node in nodelist:
            if node.nodeType == node.TEXT_NODE:
                rc.append(node.data)
        return ''.join(rc)

    def getTextForTag(parent, tag):
        # Text of the first <tag> element below parent, or '' if there is none.
        elements = parent.getElementsByTagName(tag)
        if elements.length > 0:
            return getText(elements[0].childNodes)
        return ''

    def printValueForTag(parent, indent, tag, valueName=''):
        # Print "name: value" when the tag exists and has text.
        value = getTextForTag(parent, tag)
        if len(value) > 0:
            if valueName == '':
                valueName = tag
            print(indent + valueName + ": " + value)

    def emitLogRotate(indent, rotator):
        print(indent + "logrotate:")
        indent += '  '
        printValueForTag(rotator, indent, 'daysToKeep')
        printValueForTag(rotator, indent, 'numToKeep')

    def emitProject(project):
        print("- project:")
        # assumes every project has a logRotator, so no check is done
        emitLogRotate("   ", project.getElementsByTagName('logRotator')[0])
        # next section...

    dom = parse('config.xml')
    emitProject(dom)
    

    This snippet prints only a few lines of the eventual configuration file, but it points you in the right direction for a simple translator. From what I've seen, there isn't much room for a fully automatic translation scheme because the names differ between config.xml and the YAML format. As you add more options you could streamline the code and make it table-driven, but that's "just a matter of programming"; this will at least get you started with the DOM parsers in Python.
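
As a rough illustration of the table-driven idea, here is a minimal sketch under a few assumptions. It uses xml.etree.ElementTree from the standard library rather than minidom, which is a swapped-in parser and not what the snippet above uses. The SECTIONS table and the project_to_yaml function are hypothetical names introduced for this example. Only the logRotator entry is filled in, and its YAML key simply copies the example output from the question; it has not been checked against what jenkins-job-builder actually accepts.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping table: XML tag -> (YAML section name, child tags to copy).
# Only logRotator comes from the question; add entries for other sections.
SECTIONS = {
    "logRotator": ("logrotate", ["daysToKeep", "numToKeep",
                                 "artifactDaysToKeep", "artifactNumToKeep"]),
}

def project_to_yaml(root):
    """Return YAML text for the sections of <project> listed in SECTIONS."""
    lines = ["- project:"]
    for xml_tag, (yaml_name, fields) in SECTIONS.items():
        section = root.find(xml_tag)
        if section is None:  # this job has no such section, skip it
            continue
        lines.append("    " + yaml_name + ":")
        for field in fields:
            text = section.findtext(field)
            if text is not None and text.strip():
                lines.append("      " + field + ": " + text.strip())
    return "\n".join(lines)

# The config.xml segment from the question, inlined so the sketch runs as-is.
SAMPLE = """<project>
  <logRotator class="hudson.tasks.LogRotator">
    <daysToKeep>-1</daysToKeep>
    <numToKeep>20</numToKeep>
    <artifactDaysToKeep>-1</artifactDaysToKeep>
    <artifactNumToKeep>-1</artifactNumToKeep>
  </logRotator>
</project>"""

if __name__ == "__main__":
    print(project_to_yaml(ET.fromstring(SAMPLE)))
```

For a real job you would replace ET.fromstring(SAMPLE) with ET.parse('config.xml').getroot(). Sections missing from a given job are skipped, so the same table can be run over many jobs; supporting a new section means adding one entry to SECTIONS instead of writing another emit function.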