c++ xml qt parsing qxmlstreamreader

Cannot identify empty Elements with QXmlStreamReader


I am having trouble detecting empty elements with Qt's QXmlStreamReader (Qt 4.8.1). My XML file contains the following section:

<Groups Number="4">
  <Group Id="0" GroupName="Chambers">
    <MemberChannels>4,5,6,7,8,9,10,11</MemberChannels>
    <AverageShown>true</AverageShown>
  </Group>
  <Group Id="1" GroupName="Fluids">
    <MemberChannels>0,1,17,18</MemberChannels>
    <AverageShown>false</AverageShown>
  </Group>
  <Group Id="2"/>
  <Group Id="3"/>
</Groups>

As you can see, the elements with Id 2 and 3 are empty apart from the attribute. The attribute makes no difference: the problem also occurs without it.

This is the parsing code using QXmlStreamReader. I simplified it, so it may not compile; it is only meant to convey the basic idea.

[...]
QXmlStreamReader* m_poStreamReader = new QXmlStreamReader;
[...]

if(m_poStreamReader->readNextStartElement() && m_poStreamReader->name().toString() == "Group") {
  this->parseGroupElement();
}

[...]

bool CTempscanXmlParser::parseGroupElement( void ) {
  TGroupElement tElement;
  if(m_poStreamReader->isStartElement() && !m_poStreamReader->isEndElement()) { // not empty
    tElement = this->readGroupElement(); // assign to the outer tElement rather than shadowing it
  } else if(m_poStreamReader->isStartElement() && m_poStreamReader->isEndElement()) { // empty
    tElement.oGroupName = QString::null;
  }
  [...]
}

The documentation says:

Empty elements are also reported as StartElement, followed directly by EndElement.

Even after calling readNext() I still do not receive an end element. It seems like the parser is only able to detect

<tag></tag>

as an empty element but not

<tag/>

So, is it just me, or is this a problem in Qt? And if so, how can I detect empty elements that are not written as two separate tags (start/end)?

Edit: Huytard asked me for a working example, but his answer led me to a solution that almost answers my question. I have therefore put a clarified example in my answer.


Solution

  • Thanks to Huytard I realized that the culprit is simply the somewhat unexpected behaviour of QXmlStreamReader::readNextStartElement. I expected it to stop only at start elements, but it also stops (and returns false) when it reaches the end element of the current element. Originally I wanted to check up front whether an element is empty and then decide what to do with its content. That does not seem to be possible, and the documentation I quoted myself says as much: even a self-closing element such as <Group/> is reported as a start element followed directly by an end element, so emptiness only becomes visible once the next token has been read. That's bad, since you cannot go back in a stream.

    Based on his answer I wrote a small example that both clarifies (sorry) and answers my original question.

      #include <QDebug>
      #include <QString>
      #include <QXmlStreamReader>

      const QString XML_STR = "<Groups Number=\"4\">" \
                              "<Group Id=\"0\" GroupName=\"Chambers\">" \
                              "<MemberChannels>4,5,6,7,8,9,10,11</MemberChannels>" \
                              "<AverageShown>true</AverageShown>" \
                              "</Group>" \
                              "<Group Id=\"1\"/>" \
                              "<Group Id=\"2\"/>" \
                              "</Groups>";
    
      int main(int /* argc */, char** /* argv[] */)
      {
    
         qDebug() << "the way it would have made sense to me:";
         {
            QXmlStreamReader reader(XML_STR);
    
            while(!reader.atEnd())
            {
               reader.readNextStartElement();
               QString comment = (reader.isEndElement()) ? "is empty" : "has children";
               qDebug() << reader.name() << comment;
            }
         }
    
        qDebug() << "\napproximation to the way it should probably be done:";
        {
           QXmlStreamReader reader(XML_STR);
    
           bool gotoNext = true;
           while(!reader.atEnd())
           {
              if(gotoNext) {
                 reader.readNextStartElement();
              }
              QString output = reader.name().toString();
              reader.readNext();
              if(reader.isEndElement()) {
                 output += " is empty";
                 gotoNext = true;
              } else {
                 output += " has children";
                 gotoNext = false;
              }
              qDebug() << output;
           }
        }
    
        return 0;
      }
    

    which leads to the following output:

      #the way it would have made sense to me:
      "Groups" "has children" 
      "Group" "has children" 
      "MemberChannels" "has children" 
      "MemberChannels" "is empty" 
      "AverageShown" "has children" 
      "AverageShown" "is empty" 
      "Group" "is empty" 
      "Group" "has children" 
      "Group" "is empty" 
      "Group" "has children" 
      "Group" "is empty" 
      "Groups" "is empty" 
      "" "has children"
      # this has all been plain wrong
    
      #approximation to the way it should probably be done: 
      "Groups has children" 
      "Group has children" 
      "MemberChannels has children" 
      " is empty"           # ... but not a start element
      "AverageShown has children" 
      " is empty" 
      "Group has children"  # still wrong! this is an end element
      "Group is empty" 
      "Group is empty" 
      "Groups has children" # ditto
    

    I still don't like it. Done this way, I have to triple-check everything, which makes the code less readable.
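
One way to cut down on the checking might be to stop asking "is this element empty?" at the start tag and to decide only when the next token arrives. The sketch below models that idea on a plain vector of tokens rather than on QXmlStreamReader itself, so it compiles without Qt. `Token`, `Event`, `Classified` and `classifyElements` are hypothetical names made up for this illustration, not Qt API; `Token` is only a stand-in for the StartElement, EndElement and Characters token types.

```cpp
#include <string>
#include <vector>

// Stand-in for the token kinds a stream reader reports (assumption:
// only these three matter for the empty/non-empty decision).
enum class Token { StartElement, EndElement, Characters };

// One reported token. The name is empty for Characters tokens.
struct Event {
    Token type;
    std::string name;
};

// Result for one element, in the order the start tags were seen.
struct Classified {
    std::string name;
    bool empty;
};

// Walks the token sequence once. An element is marked empty only when
// its EndElement is the very next token after its StartElement.
std::vector<Classified> classifyElements(const std::vector<Event>& events)
{
    std::vector<Classified> result;
    bool previousWasStart = false;

    for (const Event& e : events) {
        if (e.type == Token::StartElement) {
            // Assume "has content" until the next token proves otherwise.
            result.push_back({e.name, false});
            previousWasStart = true;
            continue;
        }
        if (e.type == Token::EndElement && previousWasStart) {
            // The end tag directly follows the start tag just recorded.
            result.back().empty = true;
        }
        previousWasStart = false;
    }
    return result;
}
```

With a real reader, the events would presumably come from calling readNext() in a loop until atEnd(). Note that pretty-printed input yields whitespace-only Characters tokens between tags (isWhitespace() reports them), so a start tag followed by a line break and its end tag would not count as empty unless those tokens are skipped first.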