I am trying to use the DOMParser method .parseFromString to convert strings containing HTML, which I have in an array, into DOM elements.
Some of the strings are getting the following parse errors and I can't figure out why.
This is the loop I'm using to parse the strings and create the DOM elements (thanks to this stackoverflow post: Converting HTML string into DOM elements?)
var x = 0;
while (x < stringsArray.length) {
    var parser = new DOMParser();
    var doc = parser.parseFromString(stringsArray[x].html, "text/xml");
    outputDOMElements[x] = doc.firstChild;
    x++;
}
This is an example of a string that is successfully parsed:
"<div class="instagrampost"><span>Siamak Amini</span><p>#USA</p><span>Posted 1 month ago</span><a href="https://instagram.com/p/3zG3kDGeE8/"><img src="https://scontent.cdninstagram.com/hphotos-xaf1/t51.2885-15/s320x320/e15/11377935_1114448771906000_731563461_n.jpg" /></a></div>"
This is an example of a string that has a parse error:
"<div class="user">
<a href="https://twitter.com/theclarkofben" aria-label="Ben Clark (screen name: theclarkofben)" data-scribe="element:user_link" target="_blank">
<img alt="" src="https://pbs.twimg.com/profile_images/1877162520/199389_10150123771869463_502259462_6247107_944624_n_normal.jpg" data-src-2x="https://pbs.twimg.com/profile_images/1877162520/199389_10150123771869463_502259462_6247107_944624_n_bigger.jpg" data-scribe="element:avatar">
<span >
<span data-scribe="element:name">Ben Clark</span>
</span>
<span data-scribe="element:screen_name">@theclarkofben</span>
</a>
</div><p class="tweet">Just testing out the Twitter feed I just made. <a href="https://twitter.com/hashtag/halogenpeanut?src=hash" data-scribe="element:hashtag" target="_blank">#halogenpeanut</a> <a href="http://t.co/WtoznYSUGS" data-pre-embedded="true" data-scribe="" target="_blank">pic.twitter.com/WtoznYSUGS</a></p><p class="timePosted"><a href="https://twitter.com/theclarkofben/status/611514122509922304">Posted on 18 Jun</a></p><div class="media"><img src="https://pbs.twimg.com/media/CHyI2rqWEAAJRN-.jpg:large" alt="Image from tweet" /></div>"
The parse error for the above string states: error on line 10 at column 7: Opening and ending tag mismatch: img line 0 and a
And here is the full output from .parseFromString for the above string:
<div class="user"><parsererror xmlns="http://www.w3.org/1999/xhtml" style="display: block; white-space: pre; border: 2px solid #c77; padding: 0 1em 0 1em; margin: 1em; background-color: #fdd; color: black"><h3>This page contains the following errors:</h3><div style="font-family:monospace;font-size:12px">error on line 10 at column 7: Opening and ending tag mismatch: img line 0 and a
</div><h3>Below is a rendering of the page up to the first error.</h3></parsererror>
<a href="https://twitter.com/theclarkofben" aria-label="Ben Clark (screen name: theclarkofben)" data-scribe="element:user_link" target="_blank">
<img alt="" src="https://pbs.twimg.com/profile_images/1877162520/199389_10150123771869463_502259462_6247107_944624_n_normal.jpg" data-src-2x="https://pbs.twimg.com/profile_images/1877162520/199389_10150123771869463_502259462_6247107_944624_n_bigger.jpg" data-scribe="element:avatar">
<span>
<span data-scribe="element:name">Ben Clark</span>
</span>
<span data-scribe="element:screen_name">@theclarkofben</span></img></a></div>
Is anyone able to help me identify the cause and fix? Could it be the whitespace in the HTML string perhaps?
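You have a quoting problem: you are using double quotes inside string literals that are themselves delimited by double quotes.
var x = "<div class="instagrampost">
... is wrong. The quotes inside the HTML should be single quotes (or be escaped). Besides, a quoted string literal cannot contain raw line breaks, so the markup has to be written on one line (or concatenated) in order to fit in a variable.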
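As a rough illustration of the quoting point, here are three ways such markup can be stored in a variable. The variable names are made up for this sketch, and the template literal assumes an ES2015-capable browser.

```javascript
// Option 1: single quotes for the HTML attributes inside a double-quoted literal
var singleQuoted = "<div class='instagrampost'><span>Siamak Amini</span></div>";

// Option 2: keep the double quotes in the HTML, but escape each one
var escaped = "<div class=\"instagrampost\"><span>Siamak Amini</span></div>";

// Option 3 (assumes ES2015 support): a template literal may contain
// unescaped double quotes and raw line breaks
var multiline = `<div class="user">
  <span>Ben Clark</span>
</div>`;
```

If the strings come from an API response rather than from literals in your source, the quoting is not an issue and only the parser type matters.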
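Use doc.body.innerHTML to copy the parsed HTML into the target div; for a single string you do not need a loop. Also pass "text/html" and not "text/xml" as the content type to the parser. With "text/xml" the input must be well-formed XML, so the unclosed <img> tag in your string produces the tag mismatch error, whereas the "text/html" parser accepts it.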
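If you do need to stay with "text/xml", note that the parser does not throw on bad input; it returns a document containing a parsererror element, as in the output you posted. A minimal sketch of a check for that follows. The helper name hasParserError is hypothetical, and the demo part only runs where DOMParser exists (a browser).

```javascript
// Hypothetical helper: true when the parsed document contains a <parsererror> element
function hasParserError(doc) {
  return doc.getElementsByTagName("parsererror").length > 0;
}

// Browser-only demo: the same markup with and without a self-closed <img>
if (typeof DOMParser !== "undefined") {
  var xmlParser = new DOMParser();
  var unclosed = xmlParser.parseFromString("<div><img src='a.jpg'></div>", "text/xml");
  var selfClosed = xmlParser.parseFromString("<div><img src='a.jpg' /></div>", "text/xml");
  console.log("unclosed img has error:", hasParserError(unclosed));
  console.log("self-closed img has error:", hasParserError(selfClosed));
}
```

The exact shape of the error document differs between browsers, so checking for the element is safer than matching the error text.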
Below is a working example.
var html = "<div class='user'><a href='https://twitter.com/theclarkofben' aria-label='Ben Clark (screen name: theclarkofben)' data-scribe='element:user_link' target='_blank'><img alt='' src='https://pbs.twimg.com/profile_images/1877162520/199389_10150123771869463_502259462_6247107_944624_n_normal.jpg' data-src-2x='https://pbs.twimg.com/profile_images/1877162520/199389_10150123771869463_502259462_6247107_944624_n_bigger.jpg' data-scribe='element:avatar'/><span><span data-scribe='element:name'>Ben Clark</span></span><span data-scribe='element:screen_name'>@theclarkofben</span></a></div><p class='tweet'>Just testing out the Twitter feed I just made. <a href='https://twitter.com/hashtag/halogenpeanut?src=hash' data-scribe='element:hashtag' target='_blank'>#halogenpeanut</a> <a href='http://t.co/WtoznYSUGS' data-pre-embedded='true' data-scribe='' target='_blank'>pic.twitter.com/WtoznYSUGS</a></p><p class='timePosted'><a href='https://twitter.com/theclarkofben/status/611514122509922304'>Posted on 18 Jun</a></p><div class='media'><img src='https://pbs.twimg.com/media/CHyI2rqWEAAJRN-.jpg:large' alt='Image from tweet' /></div>";
// Parse with the HTML parser, then copy the result into the placeholder div
var parser = new DOMParser();
var doc = parser.parseFromString(html, "text/html");
document.getElementById("parsedHtml").innerHTML = doc.body.innerHTML;
<div id="parsedHtml"></div>
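If you still want DOM elements for every entry of the array, as in the question, the same idea can be applied per string. This is only a sketch: parseHtmlStrings is a made-up helper name, it assumes each entry has an html property as in your loop, and the parser is passed in as an argument so that a single DOMParser instance is reused.

```javascript
// Hypothetical helper: parse each { html: "..." } entry with the HTML parser.
// `parser` is any object with parseFromString(string, mimeType),
// normally `new DOMParser()` in a browser.
function parseHtmlStrings(items, parser) {
  return items.map(function (item) {
    var doc = parser.parseFromString(item.html, "text/html");
    // A string may hold several top-level elements, so keep all of them
    // instead of only doc.firstChild
    return Array.from(doc.body.children);
  });
}

// Browser-only usage with a shortened, made-up sample string
if (typeof DOMParser !== "undefined") {
  var sample = [{ html: "<div class='user'><img alt=''><span>Ben Clark</span></div><p class='tweet'>Hi</p>" }];
  var outputDOMElements = parseHtmlStrings(sample, new DOMParser());
  console.log(outputDOMElements[0]); // the top-level elements of the first string
}
```

Each entry of the result is an array because your Twitter string contains several sibling elements at the top level (a div, two p elements and another div), which a single root element cannot represent.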