awk sed grep gnu-sed

Re-index two digit strings based on occurrence of a common string


I have a urlwatch .yaml file that has this format:

name: 01_urlwatch update released
url: "https://github.com/thp/urlwatch/releases"
filter:
  - xpath:
      path: '(//div[contains(@class,"release-timeline-tags")]//h4)[1]/a'
  - html2text: re
---
name: 02_urlwatch webpage
url: "https://thp.io/2008/urlwatch/"
filter: 
  - html2text: re
  - grep: (?i)current\sversion  #\s Matches a whitespace character
  - strip # Strip leading and trailing whitespace 
---
name: 04_RansomWhere? Objective-See
url: "https://objective-see.com/products/ransomwhere.html"
filter:
  - html2text: re
  - grep: (?i)current\sversion #\s Matches a whitespace character
  - strip #Strip leading and trailing whitespace
---
name: 05_BlockBLock Objective-See
url: "https://objective-see.com/products/blockblock.html"
filter:
  - html2text: re
  - grep: (?i)current\sversion #(?i) \s 
  - strip #Strip leading and trailing whitespace
---

I need to "re-index" the two-digit number based on the occurrence of name:. In this example the first and second occurrences of name: are followed by the correct index numbers, but the third and fourth are not.

In the example above, the third and fourth occurrences of name: should be re-indexed so that 03_ and 04_ precede the text string. That is: a two-digit index number followed by an underscore.

Also, there are instances of the string #name: which should not be counted in the re-indexing. (Those lines have been commented out so that urlwatch does not act on them.)

I tried using sed but had trouble generating an index number based on the occurrence of the string. I don't have GNU sed, but I can install it if that is the only way.


Solution

  • awk '/^name/{sub(/[0-9]{2}/,sprintf("%02d", ++c))}1' file
    

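    For any line starting with "name", we replace the first two-digit number with our counter, which increments on every such line. Because the pattern is anchored with ^, commented-out #name: lines are neither matched nor counted. sprintf is a standard awk function (not specific to GNU awk) and prints the counter with a leading zero when needed. The trailing 1 prints every line, modified or not.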
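
    A slightly more defensive variant of the same idea is sketched below. It rests on a few assumptions that are not in the original answer: the function name reindex and the file name urls.yaml are hypothetical, every entry is assumed to follow the "name: NN_" layout shown in the question, and [0-9][0-9] is used in place of [0-9]{2} in case the installed awk lacks interval expressions (some older implementations do).

```shell
# reindex: hypothetical helper name, not part of the original answer.
# Renumbers the two-digit prefix on every uncommented "name: NN_" line.
reindex() {
  awk '
    # Match only lines that start with "name: " followed by two digits and "_".
    # Lines starting with "#name:" fail the ^ anchor, so they are not counted.
    /^name: [0-9][0-9]_/ {
      # ++c counts matching lines; sprintf zero-pads the counter to two digits.
      sub(/[0-9][0-9]/, sprintf("%02d", ++c))
    }
    # Print every line, whether or not it was modified.
    { print }
  ' "$@"
}

# awk writes to standard output, so go through a temporary file
# (urls.yaml is an assumed file name):
# reindex urls.yaml > urls.yaml.new && mv urls.yaml.new urls.yaml
```

    Matching the full "name: NN_" prefix, rather than just "name", keeps the substitution from touching a name line that happens to lack an index but contains two digits elsewhere.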