I'm trying to parse an XML file in an Android app. When I process it, the only event types I receive are START_DOCUMENT followed immediately by END_DOCUMENT, as if I were passing in an empty file. The strangest thing is that I tried this code on 5 different XML files, and it only worked (by "worked" I mean I got event types other than START_DOCUMENT and END_DOCUMENT) for a corrupted one that was missing some end tags. I thought I might have created the XML files incorrectly, but I even downloaded some sample XML files and those didn't work either.
Here is the code:
public static void parseXML(Activity activity) throws XmlPullParserException, IOException {
    XmlPullParserFactory parserFactory;
    parserFactory = XmlPullParserFactory.newInstance();
    parserFactory.setNamespaceAware(true);
    XmlPullParser parser = parserFactory.newPullParser();
    InputStream inputStream = activity.getAssets().open("XML_RENAME.xml");
    InputStreamReader isReader = new InputStreamReader(inputStream);
    BufferedReader reader = new BufferedReader(isReader);
    parser.setInput(reader);
    processParces(parser);
}

private static void processParces(XmlPullParser parser) throws XmlPullParserException, IOException {
    int eventType = parser.getEventType();
    String tagname = "";
    String text = "";
    while (eventType != XmlPullParser.END_DOCUMENT) {
        tagname = parser.getName();
        switch (eventType) {
            case XmlPullParser.START_TAG:
                if (tagname.equalsIgnoreCase(KEY_REGION)) {
                }
                break;
            case XmlPullParser.TEXT:
                // grab the current text so we can use it in END_TAG event
                text = parser.getText();
                Log.e("Text: ", text);
                break;
            case XmlPullParser.END_TAG:
                if (tagname.equalsIgnoreCase(KEY_SECTOR)) {
                    Log.e("XML ", KEY_SECTOR);
                } else if (tagname.equalsIgnoreCase(KEY_DIRECTIONS)) {
                    Log.e("XML ", KEY_DIRECTIONS);
                } else if (tagname.equalsIgnoreCase(KEY_CONDITIONS)) {
                    Log.e("XML ", KEY_CONDITIONS);
                } else if (tagname.equalsIgnoreCase(KEY_NEIGHBORS)) {
                    Log.e("XML ", KEY_NEIGHBORS);
                } else if (tagname.equalsIgnoreCase(KEY_CONTINUATIONS)) {
                    Log.e("XML ", KEY_CONTINUATIONS);
                } else if (tagname.equalsIgnoreCase(KEY_BLOCKS)) {
                    Log.e("XML ", KEY_BLOCKS);
                }
                break;
            default:
                break;
        }
        eventType = parser.next();
    }
}
I found the problem. Some XML files start with a few extra bytes, specifically "EF BB BF". This is the UTF-8 BOM (byte order mark). When the file contains these extra bytes, XmlPullParser doesn't work properly: it behaves as if there is no START_TAG event and goes straight to END_DOCUMENT.
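
One possible workaround is to strip the BOM before the stream reaches the parser. The sketch below is only an illustration, not a tested Android fix: the class name BomSkipper and the method skipUtf8Bom are made up for this example, and it handles only the UTF-8 BOM (EF BB BF), not the UTF-16 or UTF-32 variants.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper (not part of Android or XmlPull): drops a leading UTF-8 BOM.
final class BomSkipper {
    // The three bytes that make up the UTF-8 byte order mark.
    private static final int[] UTF8_BOM = {0xEF, 0xBB, 0xBF};

    private BomSkipper() {}

    // Returns a stream positioned after the BOM if one is present,
    // or at the very first byte if the input does not start with a BOM.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, UTF8_BOM.length);
        byte[] head = new byte[UTF8_BOM.length];

        // read() may return fewer bytes than requested, so loop until full or EOF.
        int read = 0;
        while (read < head.length) {
            int n = pushback.read(head, read, head.length - read);
            if (n < 0) {
                break;
            }
            read += n;
        }

        boolean hasBom = read == UTF8_BOM.length
                && (head[0] & 0xFF) == UTF8_BOM[0]
                && (head[1] & 0xFF) == UTF8_BOM[1]
                && (head[2] & 0xFF) == UTF8_BOM[2];

        // No BOM: push the bytes back so the caller sees the stream unchanged.
        if (!hasBom && read > 0) {
            pushback.unread(head, 0, read);
        }
        return pushback;
    }
}
```

In parseXML above, this would mean wrapping the asset stream, for example new InputStreamReader(BomSkipper.skipUtf8Bom(inputStream), "UTF-8"), before handing the reader to parser.setInput. Another option that may be worth testing is passing the raw stream with parser.setInput(inputStream, null) so that the parser detects the encoding itself; whether that also handles the BOM depends on the parser implementation.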