regex ruby parsing regex-greedy rubular

How can I capture certain data using regex if it is dependent on another field?


I need help writing a regex for the following log line:

URLReputation: Risk unknown, URL: http://facebook.com

I wrote the following regex:

URLReputation\:\s*(.*?),\s*URL\:\s*(.*)

This works when both fields are present. But when the URL part is missing, the pattern does not match at all, so URLReputation is not captured either.

Please help.

Regards,

Mitesh Agrawal


Solution

  • You could replace the non-greedy .*? with a negated character class [^,]+, which matches one or more characters other than a comma. Then make the URL part optional by wrapping it in an optional non-capturing group (?:...)?

    You capture the URL value with .* but that can also match an empty string.

    You can make the pattern more specific by requiring at least one non-whitespace character with \S+, or by also specifying how the URL starts, for example https?://\S+

    URLReputation:\s*([^,]+)(?:,\s*URL:\s*(\S+))?
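
A minimal Ruby sketch of how this pattern behaves with and without the URL part. The helper name parse_reputation and the constants PATTERN and STRICT_PATTERN are hypothetical, chosen only for illustration. STRICT_PATTERN is the stricter https?://\S+ variant mentioned above.

```ruby
# Pattern from the answer: the URL part sits in an optional non-capturing group.
PATTERN = /URLReputation:\s*([^,]+)(?:,\s*URL:\s*(\S+))?/

# Stricter variant: only accept URLs that start with http:// or https://.
STRICT_PATTERN = %r{URLReputation:\s*([^,]+)(?:,\s*URL:\s*(https?://\S+))?}

# Hypothetical helper: returns [reputation, url], where url is nil when the
# URL part is absent. Returns nil if the line has no URLReputation field.
def parse_reputation(line, pattern = PATTERN)
  m = pattern.match(line)
  return nil unless m

  # [^,]+ can also swallow a trailing newline, so strip the reputation value.
  [m[1].strip, m[2]]
end

with_url    = parse_reputation("URLReputation: Risk unknown, URL: http://facebook.com")
without_url = parse_reputation("URLReputation: Risk unknown")

p with_url
p without_url
```

In both cases the first group holds the reputation. The second group is nil when the line has no URL part, so check it before using it.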
    
