xmlbashshellawk

get specific value from xml file using bash script


I am not familiar with bash and strings, and need to get a value from xml file, after curling the xml I get:

<metadata modelVersion="1.1.0">
  <groupId>com.company.NYC</groupId>
  <artifactId>scripts</artifactId>
  <version>2.3.7-SNAPSHOT</version>
  <versioning>
    <snapshot>
      <timestamp>20201209.102124</timestamp>
      <buildNumber>38</buildNumber>
    </snapshot>
    <lastUpdated>20201209104133</lastUpdated>
    <snapshotVersions>
      <snapshotVersion>
        <classifier>NYC</classifier>
        <extension>zip</extension>
        <value>2.3.7-20201209.102124-38</value>
        <updated>20201209102124</updated>
      </snapshotVersion>
      <snapshotVersion>
        <classifier>tests</classifier>
        <extension>jar</extension>
        <value>2.3.7-20201209.102124-38</value>
        <updated>20201209102124</updated>
      </snapshotVersion>
      <snapshotVersion>
        <extension>pom</extension>
        <value>2.3.7-20201209.102124-38</value>
        <updated>20201209102124</updated>
      </snapshotVersion>
    </snapshotVersions>
  </versioning>
</metadata>

using this command, I get the version tags value:

#wget -q -N $urlversion=$(curl -s https://www.artifactoty.company.net/artifactory/maven-dev-local/com/google/api/framework/maven-metadata.xml --insecure | grep  -oP '(?<=<value>).*?(?=</value>))

And all what i need is value after the zip file:

extension>zip</extension>
<value>2.3.7-20201209.102124-38</value>

The 2.3.7-20201209.102124-38 value.

How can i get it alone.

Thanks friends.


Solution

  • Assuming OP has a reason for not using a XML-aware tool to parse the data ...

    NOTE: I've copied the sample xml data into file tag.xml for the purposes of this answer.

    One awk solution:

    awk -F'[<>]' '             # define multiple field delimiters as "<" and ">"
    $3=="zip" {getline         # if field #3 = "zip" then read the next line and ...
               print $3        # print field #3
               exit}           # at this point we have what we want so exit
    ' tag.xml
    

    Collapsing into a one-liner and storing in a variable:

    zipver=$(awk -F'[<>]' '$3=="zip" {getline; print $3; exit}' tag.xml)
    echo "${zipver}"
    

    Both of the above generate:

    2.3.7-20201209.102124-38
    

    OP could modify the current command as follows:

    wget -q -N $urlversion=$(curl -s ... | awk -F'[<>]' '$3=="zip" {getline; print $3; exit}')