I need to parse urls in order to grab a value that comes after .com/
AND before the next /
character. My data looks like this:
url
https://www.delish.com/food-news/news/jdhgkjdf/100-years-of-christmas
https://www.delish.com/food-news/news/100-years-of-christmas
The desired output is:
new_string
food-news
food-news
I have tried the following:
SPLIT(url, '/')[SAFE_OFFSET(ARRAY_LENGTH(SPLIT(url, '/')) - 4)] AS new string
But because the URLs are not consistent, sometimes it grabs food_news
, sometimes it grabs www.delish.com
, that's why offset is not working in this particular case.
Use below
regexp_extract(url, net.host(url) || r'/([^/]+)')