python xml csv openrefine grel

What is the Python way to express a GREL line that creates as many tags as needed in an XML document?


I'm using OpenRefine to do something that I KNOW Python can do: converting a CSV into an XML metadata document. I can figure out most of it, but the one thing that trips me up is this GREL line:

{{forEach(cells["subjectTopicsLocal"].value.split('; '), v, '<subject authority="local"><topic>'+v.escape("xml")+'</topic></subject>')}}

What this does is beautiful for me. I've got a "subject" field in my Excel spreadsheet. My volunteers enter keywords, separated with a "; ". I don't know how many keywords they'll come up with, and sometimes there is only one. That GREL line creates a new <subject authority="local"><topic></topic></subject> for each term entered, and of course slides the term into the <topic> element.

I know there has to be a Python expression that can do this. Could someone recommend best practice for this? I'd appreciate it!


Solution

  • Basically you want to use 'split' in Python to convert the string from your subject field into a Python list, and then you can iterate over the list.

    So, assuming you've already read the content of the 'subject' field from a row in your CSV/Excel document and assigned it to a string variable 'subj', you could do something like:

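    from xml.sax.saxutils import escape  # escapes &, < and > for XML text

    # Keywords are separated by "; ", so split on ";" and strip the spaces
    subjList = [s.strip() for s in subj.split(";") if s.strip()]
    for subject in subjList:
        print('<subject authority="local"><topic>' + escape(subject) + '</topic></subject>')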
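
    For a slightly fuller picture, here is a minimal sketch of the same idea applied to every row of a CSV. The helper name subject_elements and the sample data are made up for illustration; only the column name subjectTopicsLocal comes from the GREL line in the question. It uses the standard-library csv module and xml.sax.saxutils.escape, which may not match GREL's escape("xml") character for character but does cover &, < and > in element text.

```python
import csv
import io
from xml.sax.saxutils import escape


def subject_elements(raw, sep=";"):
    """Return one <subject> element string per non-empty keyword in raw.

    Hypothetical helper: splits on sep and strips surrounding whitespace,
    so "a; b" and "a;b" are treated the same way.
    """
    terms = [t.strip() for t in raw.split(sep)]
    return [
        '<subject authority="local"><topic>' + escape(t) + '</topic></subject>'
        for t in terms
        if t  # skip empty entries such as a trailing ";"
    ]


# Made-up sample data standing in for the real CSV export.
sample_csv = io.StringIO(
    'title,subjectTopicsLocal\n'
    'Letter,"Farming; Cats & dogs"\n'
    'Photo,Railways\n'
)

# One line of output per row, containing all of that row's <subject> elements.
for row in csv.DictReader(sample_csv):
    print("".join(subject_elements(row["subjectTopicsLocal"])))
```

    With a real file you would pass open("yourfile.csv", newline="") to csv.DictReader instead of the StringIO object. An empty subject cell simply yields no <subject> elements, and the "&" in a keyword comes out as "&amp;". If the document grows beyond a few string fragments, building it with xml.etree.ElementTree is probably the safer route, since it handles escaping for you.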