I've got a directory full of Salesforce objects in XML format. I'd like to identify the <fullName> and parent file of all the custom <fields> where <required> is true. Here is some truncated sample data; let's call it "Custom_Object__c":
<?xml version="1.0" encoding="UTF-8"?>
<CustomObject xmlns="http://soap.sforce.com/2006/04/metadata">
    <deprecated>false</deprecated>
    <description>descriptiontext</description>
    <fields>
        <fullName>custom_field1</fullName>
        <required>false</required>
        <type>Text</type>
        <unique>false</unique>
    </fields>
    <fields>
        <fullName>custom_field2</fullName>
        <deprecated>false</deprecated>
        <visibleLines>5</visibleLines>
    </fields>
    <fields>
        <fullName>custom_field3</fullName>
        <required>false</required>
    </fields>
    <fields>
        <fullName>custom_field4</fullName>
        <deprecated>false</deprecated>
        <description>custom field 4 description</description>
        <externalId>true</externalId>
        <required>true</required>
        <scale>0</scale>
        <type>Number</type>
        <unique>false</unique>
    </fields>
    <fields>
        <fullName>custom_field5</fullName>
        <deprecated>false</deprecated>
        <description>Creator of this log message. Application-specific.</description>
        <externalId>true</externalId>
        <label>Origin</label>
        <length>255</length>
        <required>true</required>
        <type>Text</type>
        <unique>false</unique>
    </fields>
    <label>App Log</label>
    <nameField>
        <displayFormat>LOG-{YYYYMMDD}-{00000000}</displayFormat>
        <label>Entry ID</label>
        <type>AutoNumber</type>
    </nameField>
</CustomObject>
The desired output would be a dictionary with a format something like:
required_fields = {'Custom_Object__1': 'custom_field4', 'Custom_Object__1': 'custom_field5',... etc for all the required fields in all files in the folder}
or anything similar.
I've already gotten my list of objects through glob.glob, and I can get a list of all the children and their attributes with ElementTree but I'm struggling past there. I feel like I'm very close but I'd love a hand finishing this task off. Here is my code so far:
import os
import glob
import xml.etree.ElementTree as ET

os.chdir("/Users/paulsallen/workspace/fforce/FForce Dev Account/config/objects/")
objs = []
for file in glob.glob("*.object"):
    objs.append(file)
fields_dict = {}
for object in objs:
    # parse each file in turn, not the whole list
    root = ET.parse(object).getroot()
    ....
and once I get the XML data parsed I don't know where to take it from there.
You really want to switch to using lxml here, because then you can use an XPath query:
import glob
import os

from lxml import etree as ET

os.chdir("/Users/paulsallen/workspace/fforce/FForce Dev Account/config/objects/")
objs = glob.glob("*.object")
fields_dict = {}
for filename in objs:
    root = ET.parse(filename).getroot()
    # the document's default namespace is mapped to the prefix 'n' for XPath
    required = root.xpath('.//n:fullName[../n:required/text()="true"]/text()',
                          namespaces={'n': root.nsmap[None]})
    fields_dict[os.path.splitext(filename)[0]] = required
With that code you end up with a dictionary of lists; each key is a filename (without the extension), each value is a list of required fields.
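If you would rather have one (object, field) pair per required field, closer to the shape sketched in the question, the dictionary of lists can be flattened afterwards. This is only a sketch: the fields_dict contents below are written out by hand from the sample data, and 'Other_Object__c' is a made-up second object with no required fields.

```python
# Hand-written stand-in for the dictionary the loop would build.
# 'Other_Object__c' is hypothetical and only shows how an empty list behaves.
fields_dict = {
    'Custom_Object__c': ['custom_field4', 'custom_field5'],
    'Other_Object__c': [],
}

# One (object name, field name) tuple per required field.
pairs = [(obj, field)
         for obj, fields in fields_dict.items()
         for field in fields]

for obj, field in pairs:
    print(obj, field)
```

Objects without any required fields simply contribute no pairs.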
The XPath query looks for fullName elements in the default namespace that have a sibling required element containing the text 'true'. It then takes the text of each matching element, giving a list we can store in the dictionary.
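If installing lxml is not an option, the standard library's ElementTree supports a limited XPath subset that appears to be enough for this particular query. The sketch below is an alternative, not the lxml approach itself: it selects the fields elements that have a required child with the text 'true' and then steps down to fullName. The inline SAMPLE string is a cut-down copy of the question's data, and the helper name required_field_names is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the sample object from the question.
SAMPLE = """<CustomObject xmlns="http://soap.sforce.com/2006/04/metadata">
    <fields>
        <fullName>custom_field1</fullName>
        <required>false</required>
    </fields>
    <fields>
        <fullName>custom_field2</fullName>
    </fields>
    <fields>
        <fullName>custom_field4</fullName>
        <required>true</required>
    </fields>
    <fields>
        <fullName>custom_field5</fullName>
        <required>true</required>
    </fields>
</CustomObject>
"""

# ElementTree elements have no nsmap attribute, so the namespace URI
# is spelled out here and bound to the prefix 'n'.
NS = {'n': 'http://soap.sforce.com/2006/04/metadata'}


def required_field_names(root):
    """Return the fullName text of every field whose <required> is 'true'."""
    matches = root.findall(".//n:fields[n:required='true']/n:fullName", NS)
    return [el.text for el in matches]


root = ET.fromstring(SAMPLE)
required = required_field_names(root)
print(required)
```

For real files, ET.parse(filename).getroot() would replace ET.fromstring(SAMPLE), and the result could be stored under the file's base name as before.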