I need to remove specific data from a SQL column along with data, I can't get the right approach.
Example input:
<div></div><div><div><div><div><p>12345</p></div></div></div></div>
<div><div><span>45678<span></span></span></div></div>
<div><div>Test No.</div></div><div><div><div><div><p>4466</p></div></div></div></div>
<div>6789</div>
<p></p><p><i>As per the attachment, no provided because we were supposed to be onboarded.</i></p>
Desired output:
12345
45678
Test No.4466
6789
As per the attachment, no provided because we were supposed to be onboarded.
Column data type is varchar NULL and I have tried the below but not working
Select Col, REPLACE(col, SUBSTRING(col, CHARINDEX('<', col), LEN(col) - CHARINDEX('>', REVERSE(col)) - CHARINDEX('<', col) + 2), '')
from table
Just wondering don't we have straight code in SQL Server in order to remove <></>
angle brackets and it's value and keep the rest?
Assuming your HTML is also valid XML, you can just use the following code to get all text nodes and concatenate them together.
SELECT
t.id,
t.xhtml,
t.xhtml.query('//text()').value('text()[1]', 'nvarchar(max)') AS value
FROM t;
.query('//text()')
gets all the text nodes, you can use any XQuery filter to get whatever nodes you want. Then .value('text()[1]', 'nvarchar(max)')
concats them all together.