ios objective-c libxml2

libxml2 drops deeply nested nodes when parsing HTML


I generate HTML in which an img tag is wrapped in 255 nested div tags. I parse the HTML with libxml2 and dump the result, but the img tag is missing from the output. With 254 div tags the output is correct and the img tag appears in it. Why does this happen, and how can I fix it? Is the nesting too deep? The code is below:

    NSString* html = @"<img src='https://www.baidu.com/img/PCtm_d9c8750bed0b3c7d089fa7d55720d6cf.png'>";
    for (int i = 0; i < 255; i++) {
        html = [NSString stringWithFormat:@"<div>%@</div>", html];
    }
    NSData* data = [html dataUsingEncoding:NSUTF8StringEncoding];

    xmlDocPtr _doc = htmlReadMemory([data bytes], (int)[data length], "", NULL, HTML_PARSE_NOWARNING | HTML_PARSE_NOERROR);
    if (_doc == NULL) {
        NSLog(@"Unable to parse.");
        return;
    }
    
    xmlXPathContextPtr _context = xmlXPathNewContext(_doc);
    if(_context == NULL) {
        NSLog(@"Unable to create XPath context.");
        return;
    }
    
    xmlNodePtr rootNode = xmlDocGetRootElement(_doc);
    xmlBufferPtr buffer = xmlBufferCreate();
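    if (rootNode == NULL || buffer == NULL) {
        // Release whatever was allocated before bailing out.
        if (buffer != NULL) xmlBufferFree(buffer);
        xmlXPathFreeContext(_context);
        xmlFreeDoc(_doc);
        return;
    }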
    xmlNodeDump(buffer, rootNode->doc, rootNode, 0, 0);
    NSString *htmlContent = [NSString stringWithCString:(const char *)xmlBufferContent(buffer) encoding:NSUTF8StringEncoding];
    xmlBufferFree(buffer);

    xmlXPathFreeContext(_context);
    xmlFreeDoc(_doc);

    NSLog(@"%@", htmlContent);

I tried many different tags, with the same result. Is the nesting too deep?


Solution

  • By default, the maximum depth of the document tree is indeed limited to 256. Unless error reporting is suppressed (the code in the question passes HTML_PARSE_NOERROR), you should get an error message like:

    Excessive depth in document: 256 use XML_PARSE_HUGE option
    

    As the message explains, you can get around this limitation by adding XML_PARSE_HUGE to the parser options.