
Python lxml xpath not working on some elements


I'm having trouble extracting the text of one specific element from a SOAP response. Other elements seem to work fine.

I have tried the following:

Python 3.13.3 (main, Apr  8 2025, 13:54:08) [Clang 16.0.0 (clang-1600.0.26.6)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> xml = '''<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
...   <soap:Body>
...     <soap:Fault>
...       <soap:Code>
...         <soap:Value>soap:Sender</soap:Value>
...         <soap:Subcode>
...           <soap:Value xmlns:ns1="http://docs.oasis-open.org/wss/oasis-wss-wssecurity-secext-1.1.xsd">
...             ns1:unauthorized
...           </soap:Value>
...         </soap:Subcode>
...       </soap:Code>
...       <soap:Reason>
...         <soap:Text xml:lang="en">AccessResult: result: Access Denied | AuthenticationAsked: true |
...           ErrorCode: IDP_ERROR:
...           137 | ErrorReason: null</soap:Text>
...       </soap:Reason>
...       <soap:Detail>
...         <WebServiceFault xmlns="http://www.taleo.com/ws/integration/toolkit/2005/07">
...           <code>SystemError</code>
...           <message>AccessResult: result: Access Denied | AuthenticationAsked: true | ErrorCode:
...             IDP_ERROR: 137 |
...             ErrorReason: null</message>
...         </WebServiceFault>
...       </soap:Detail>
...     </soap:Fault>
...   </soap:Body>
... </soap:Envelope>'''
>>> root = etree.fromstring(xml)
>>> print(root)
<Element {http://www.w3.org/2003/05/soap-envelope}Envelope at 0x1032c4680>
>>> ns = { 'soap':'http://www.w3.org/2003/05/soap-envelope', 'ns1':'http://docs.oasis-open.org/wss/oasis-wss-wssecurity-secext-1.1.xsd' }
>>> print(root.xpath('//soap:Subcode/soap:Value',namespaces=ns)[0].text)

            ns1:unauthorized
          
>>> print(root.xpath('//soap:Reason/soap:Text',namespaces=ns)[0].text)
AccessResult: result: Access Denied | AuthenticationAsked: true |
          ErrorCode: IDP_ERROR:
          137 | ErrorReason: null
>>> print(root.xpath('//soap:Detail/WebServiceFault/message',namespaces=ns)[0].text)
Traceback (most recent call last):
  File "<python-input-7>", line 1, in <module>
    print(root.xpath('//soap:Detail/WebServiceFault/message',namespaces=ns)[0].text)
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

For some reason, I can't get the text of the message element.

I'd appreciate any help.


Solution

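  • WebServiceFault declares a default namespace (its xmlns attribute), so it and its child elements belong to that namespace even though they carry no prefix. XPath 1.0 treats an unprefixed name as being in no namespace, which is why the original expression matches nothing. Add the namespace to the ns dict under a prefix of your choice (ns2 here) and use that prefix in the expression.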

    
    <WebServiceFault xmlns="http://www.taleo.com/ws/integration/toolkit/2005/07">
    
    ns = { 'soap':'http://www.w3.org/2003/05/soap-envelope', 'ns1':'http://docs.oasis-open.org/wss/oasis-wss-wssecurity-secext-1.1.xsd', 'ns2': 'http://www.taleo.com/ws/integration/toolkit/2005/07' }
    
    print(root.xpath('//soap:Detail/ns2:WebServiceFault/ns2:message',namespaces=ns)[0].text)
    
    # Output
    AccessResult: result: Access Denied | AuthenticationAsked: true | ErrorCode:
             IDP_ERROR: 137 |
             ErrorReason: null
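
For comparison, here is a minimal self-contained sketch of the same idea. As an assumption made only so it runs without extra dependencies, it uses the standard library's xml.etree.ElementTree instead of lxml; lxml's root.xpath(..., namespaces=ns) takes the same kind of prefix-to-URI mapping. The XML is a trimmed copy of the fault from the question, and the names XML, NS, unprefixed, prefixed and message are illustrative. It also collapses the line breaks inside the message text, which may or may not be what you want.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the fault from the question; only the Detail part matters here.
XML = """<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <soap:Fault>
      <soap:Detail>
        <WebServiceFault xmlns="http://www.taleo.com/ws/integration/toolkit/2005/07">
          <code>SystemError</code>
          <message>AccessResult: result: Access Denied | AuthenticationAsked: true | ErrorCode:
            IDP_ERROR: 137 |
            ErrorReason: null</message>
        </WebServiceFault>
      </soap:Detail>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>"""

# "ns2" is an arbitrary prefix chosen here; only the URI has to match the document.
NS = {
    "soap": "http://www.w3.org/2003/05/soap-envelope",
    "ns2": "http://www.taleo.com/ws/integration/toolkit/2005/07",
}

root = ET.fromstring(XML)

# Without a prefix, "WebServiceFault" is looked up in no namespace: nothing matches.
unprefixed = root.findall(".//soap:Detail/WebServiceFault/message", NS)

# With the prefix bound to the default-namespace URI, the element is found.
prefixed = root.findall(".//soap:Detail/ns2:WebServiceFault/ns2:message", NS)

# Collapse the line breaks and indentation inside the element text.
message = " ".join(prefixed[0].text.split())

print(len(unprefixed))
print(message)
```

The prefix in the dict does not have to match anything in the document, since the element there has no prefix at all; only the namespace URI is compared.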