htmldom

Finding an html element based on text content


I have a html code like

<div>
   <span>TV</span>
</div>

I want to find this span through documentObject having text 'TV', like getElementById etc ... something like getElementByText. I know that it's possible through XPath/JQuery/Regex.

But I need it to get through DOM object model only. As only DOM model is available in my context. I see couple of answers:

But these are not helpful to me, as I need to get it through DOM model only.


Solution

  • Assuming the document is well-formed enough to parse into a proper DOM object tree, you can iterate through the entire structure without using an external library. Depending on the structure, you may have to examine every node to find all matches, and this may be slow. If you have access to IDs of any sort, you may be able to reduce search scope and improve performance.

    The key property you will need is the childNodes collection on every DOM node. Starting with the BODY (or some other container), you can recurse through all the child nodes.

    This site is pretty basic but shows dependency-free methods for accessing DOM elements. See the section called "Tools to Navigate to a Certain Element".

    I noticed that you mentioned regular expressions as a means to find elements. Regexes are poor for parsing entire documents, but they can be very useful in evaluating the textual content of a single node (e.g. partial matches, pattern matches, case insensitivity, etc.) Regular expressions are part of the JavaScript language itself and have been so for well over a decade.