I'm trying to extract a value from an XML document that has been read into my script as a variable. The variable, $data, contains:
<item>
<title>15:54:57 - George:</title>
<description>Diane DeConn? You saw Diane DeConn!</description>
</item>
<item>
<title>15:55:17 - Jerry:</title>
<description>Something huh?</description>
</item>
and I wish to extract the first title value, so
15:54:57 - George:
I've been using the sed command:
title=$(sed -n -e 's/.*<title>\(.*\)<\/title>.*/\1/p' <<< $data)
but this only outputs the second title value:
15:55:17 - Jerry:
Does anyone know what I have done wrong? Thanks!
As Charles Duffey has stated, XML is best parsed with a proper XML parsing tool. For a one-time job, the following should work (it needs a grep with PCRE support for -P, such as GNU grep):
grep -oPm1 "(?<=<title>)[^<]+"
### Test:
$ echo "$data"
<item>
<title>15:54:57 - George:</title>
<description>Diane DeConn? You saw Diane DeConn!</description>
</item>
<item>
<title>15:55:17 - Jerry:</title>
<description>Something huh?</description>
</item>
$ title=$(grep -oPm1 "(?<=<title>)[^<]+" <<< "$data")
$ echo "$title"
15:54:57 - George:
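As for why the original sed printed the second title: a likely cause, assuming an older bash where an unquoted here-string (`<<< $data`) collapses the newlines into spaces, is that sed saw the whole document as one line, so the greedy leading `.*` ran up to the last `<title>`. If grep -P is not available, a rough sed-only sketch is to quote the variable and quit after the first matching line. The `data` assignment below is just the sample from the question, and this still assumes each title sits on its own line:

```shell
# Sample input copied from the question; quoting preserves the newlines.
data='<item>
<title>15:54:57 - George:</title>
<description>Diane DeConn? You saw Diane DeConn!</description>
</item>
<item>
<title>15:55:17 - Jerry:</title>
<description>Something huh?</description>
</item>'

# On the first line containing <title>, print the captured text, then quit
# so that later <title> lines are never read.
title=$(printf '%s\n' "$data" | sed -n '/<title>/{s/.*<title>\(.*\)<\/title>.*/\1/p;q;}')
echo "$title"
```

This should print `15:54:57 - George:`. Like the grep version, it is a text match rather than real XML parsing, so it will break on titles that span lines or contain nested markup.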