I have a manipulated HTML text like this:
Lorem ipsum <a href="#">dolor</a> sit amet
<div>Consectetur adipisicing elit, sed do </div> eiusmod tempor
incididunt ut labore et dolore magna aliqua.
I need to convert these lines into this:
<p>Lorem ipsum <a href="#">dolor</a> sit amet</p>
<div>Consectetur adipisicing elit, sed do </div>
<p>eiusmod tempor</p>
<p>incididunt ut labore et dolore magna aliqua</p>
How can I convert this? Any ideas? Thank you!
This is a little lengthy, so please read carefully. I have annotated the important parts to give a general idea of the flow of the function.
function paragraphify(parent) {
    var i, str, node, p,
        workspace = document.createElement('div'),
        nodes = parent.childNodes; // Live NodeList: it shrinks as nodes are moved out.
    p = document.createElement('p');
    workspace.appendChild(p);
    // Filter nodes out of parent and into the workspace.
    while (nodes.length > 0) {
        // Get the first child node of the parent element.
        node = nodes[0];
        // Divs and paragraphs need not be processed; move them over as-is and
        // start a fresh paragraph after them so that document order is preserved.
        if (node.nodeName === 'P' || node.nodeName === 'DIV') {
            workspace.appendChild(node);
            p = document.createElement('p');
            workspace.appendChild(p);
            continue;
        }
        // Drop the node into the paragraph.
        p.appendChild(node);
        // Skip non-text nodes.
        if (node.nodeName !== '#text') { continue; }
        // We need to parse the text of the node for newlines.
        str = node.nodeValue;
        for (i = 0; i < str.length; i += 1) {
            if (str[i] === '\n') {
                // If text contains a newline ...
                if (i < (str.length - 1)) {
                    // ... and there's text after it, then split it. The remainder goes
                    // back to the front of the parent (null appends if parent is empty).
                    parent.insertBefore(document.createTextNode(str.slice(i + 1)), nodes[0] || null);
                    node.nodeValue = str.slice(0, i + 1);
                }
                // Create a new paragraph for holding elements, and add it to the workspace.
                p = document.createElement('p');
                workspace.appendChild(p);
                // Break here to return to the node-processing loop.
                // If the text was split on a newline, then that will be the next node to be processed.
                break;
            }
        }
    }
    // Pull the nodes back out of the workspace and into the parent element.
    nodes = workspace.children;
    while (nodes.length > 0) {
        node = nodes[0];
        // Skip empty paragraphs.
        if (node.nodeName === 'P' && node.textContent.replace(/\s/g, '').length === 0) {
            workspace.removeChild(node);
        }
        else {
            parent.appendChild(node);
        }
    }
}
This function does what your example specifies. paragraphify iterates through the child nodes of the parent argument, passing <div> and <p> elements through untouched, as those need not be formatted. It creates a paragraph node and moves the parent's nodes into it, one at a time, until it encounters a newline character within a text node, at which point it splits the text node at the newline and starts a new paragraph.
This repeats until the parent element is empty. The workspace elements are then transferred back into parent, and paragraphs containing only whitespace are dropped. Working in a separate workspace keeps the processing simple, as manipulating a collection while you are actively iterating over it can be messy.
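To see the grouping logic on its own, without a browser, here is a minimal sketch that works on a plain array of tokens instead of DOM nodes. The token shape ({ type, value }) and the name groupIntoParagraphs are assumptions made up for this illustration; they are not part of the function above, and this is a string-based stand-in rather than a replacement for the DOM version.

```javascript
// Hypothetical token shape: { type: 'text' | 'inline' | 'block', value: string }.
// This mirrors the DOM walk on plain data, so it can run in Node.
function groupIntoParagraphs(tokens) {
    var out = [], current = '', i, j, pieces;

    // Emit the current paragraph if it holds anything other than whitespace.
    function flush() {
        if (current.replace(/\s/g, '').length > 0) {
            out.push('<p>' + current.trim() + '</p>');
        }
        current = '';
    }

    for (i = 0; i < tokens.length; i += 1) {
        if (tokens[i].type === 'block') {
            // Block elements end the current paragraph and pass through untouched.
            flush();
            out.push(tokens[i].value);
        } else if (tokens[i].type === 'text') {
            // Every newline inside a text token closes the current paragraph.
            pieces = tokens[i].value.split('\n');
            for (j = 0; j < pieces.length; j += 1) {
                if (j > 0) { flush(); }
                current += pieces[j];
            }
        } else {
            // Inline elements (such as <a>) stay inside the current paragraph.
            current += tokens[i].value;
        }
    }
    flush();
    return out;
}

// The input from the question, tokenized by hand.
var result = groupIntoParagraphs([
    { type: 'text', value: 'Lorem ipsum ' },
    { type: 'inline', value: '<a href="#">dolor</a>' },
    { type: 'text', value: ' sit amet\n' },
    { type: 'block', value: '<div>Consectetur adipisicing elit, sed do </div>' },
    { type: 'text', value: ' eiusmod tempor\nincididunt ut labore et dolore magna aliqua.' }
]);
console.log(result.join('\n'));
```

Run on the question's input, this yields four entries: a paragraph, the untouched <div>, and two more paragraphs. Like the DOM version, it only looks at the top level and does not descend into inline elements.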
I should warn that this does not descend further than the parent's immediate children; if you need that, please let me know. Put simply, this function would not perform this translation:
<span>Hello
world</span>
... into ...
<p><span>Hello</span></p>
<p><span>world</span></p>
Even so, this should be a good example of the base functionality required for line processing in HTML with plain JavaScript.