java android xml xmlpullparser android-xmlpullparser

XmlPullParser skips from START_DOCUMENT straight to END_DOCUMENT


I'm trying to parse an XML file in an Android app. When I process it, the only event types I receive are START_DOCUMENT and then END_DOCUMENT immediately after it, as if I had passed in an empty file. The strangest thing is that I tried this code on 5 different XML files and it worked (by "worked" I mean I got event types other than START_DOCUMENT and END_DOCUMENT) only for the corrupted one, which was missing some end tags. I thought I might have created the XML files incorrectly, but I also downloaded some sample XML files and it didn't work for those either.

Some code:

    public static void parseXML(Activity activity) throws XmlPullParserException, IOException {
        XmlPullParserFactory parserFactory;
        parserFactory = XmlPullParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XmlPullParser parser = parserFactory.newPullParser();
        InputStream inputStream = activity.getAssets().open("XML_RENAME.xml");
        InputStreamReader isReader = new InputStreamReader(inputStream);
        BufferedReader reader = new BufferedReader(isReader);
        parser.setInput(reader);
        processParces(parser);
    }

    private static void processParces(XmlPullParser parser) throws XmlPullParserException, IOException {
        int eventType = parser.getEventType();
        String tagname = "";
        String text = "";
        while(eventType != XmlPullParser.END_DOCUMENT)
        {
            tagname = parser.getName();
            switch(eventType)
            {
                case XmlPullParser.START_TAG:
                    if (tagname.equalsIgnoreCase(KEY_REGION)) {
                    }
                    break;

                case XmlPullParser.TEXT:
                    //grab the current text so we can use it in END_TAG event
                    text = parser.getText();
                    Log.e("Text: ", text);
                    break;

                case XmlPullParser.END_TAG:
                    if (tagname.equalsIgnoreCase(KEY_SECTOR)) {
                        Log.e("XML ",KEY_SECTOR);
                    } else if (tagname.equalsIgnoreCase(KEY_DIRECTIONS)) {
                        Log.e("XML ",KEY_DIRECTIONS );
                    } else if (tagname.equalsIgnoreCase(KEY_CONDITIONS)) {
                        Log.e("XML ",KEY_CONDITIONS );
                    } else if (tagname.equalsIgnoreCase(KEY_NEIGHBORS)) {
                        Log.e("XML ",KEY_NEIGHBORS );
                    } else if (tagname.equalsIgnoreCase(KEY_CONTINUATIONS)) {
                        Log.e("XML ",KEY_CONTINUATIONS );
                    } else if (tagname.equalsIgnoreCase(KEY_BLOCKS)) {
                        Log.e("XML ",KEY_BLOCKS );
                    }
                    break;

                default:
                    break;
            }
            eventType = parser.next();
        }
    }

Solution

  • I found the problem. Some XML files start with a few extra bytes, specifically EF BB BF. This is the UTF-8 BOM (byte order mark). When the XML contains these extra bytes, XmlPullParser doesn't work properly: it behaves as if there were no START_TAG event and goes straight to END_DOCUMENT.
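
One way to work around this is to drop the BOM before the parser ever sees it. The following is only a minimal sketch, not something from the original code: the class name BomSkipper and the method skipUtf8Bom are made up for illustration, and it assumes the file is UTF-8. It peeks at the first three bytes of the stream and pushes them back unless they are exactly EF BB BF.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper (name chosen for this example only).
final class BomSkipper {
    private BomSkipper() {}

    /**
     * Returns a stream positioned just after a leading UTF-8 BOM (EF BB BF).
     * If the stream does not start with a BOM, no bytes are lost.
     */
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int read = 0;
        // read() may return fewer bytes than requested, so loop until
        // we have 3 bytes or hit end of stream.
        while (read < 3) {
            int n = pushback.read(head, read, 3 - read);
            if (n < 0) {
                break;
            }
            read += n;
        }
        boolean hasBom = read == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && read > 0) {
            // Not a BOM: put the bytes back so the caller sees the full content.
            pushback.unread(head, 0, read);
        }
        return pushback;
    }
}
```

In parseXML above, this would mean wrapping the asset stream, e.g. BomSkipper.skipUtf8Bom(activity.getAssets().open("XML_RENAME.xml")), before building the InputStreamReader, ideally also passing an explicit UTF-8 charset to the reader. Another option that may be worth trying is handing the raw stream to the parser with parser.setInput(inputStream, null) so the parser detects the encoding itself; whether that copes with the BOM depends on the parser implementation, so treat it as something to verify rather than a guarantee.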