How do I include symbols in awk's record separator? I know the basic syntax, like this:
awk 'BEGIN{RS="[:.!]"}{if (tolower($0) ~ "$" ) print $0 }'
which splits a single line into separate records on !, . and :, but I also want to split on symbols such as the green check mark ✅. I am having trouble understanding the syntax, so I tried this:
awk 'BEGIN{RS="[:.!\u2705]"}{if (tolower($0) ~ "$" ) print $0 }'
which doesn't seem to work.
Sample input is this:
✅ Team collaboration ✅ Project organisation✅ SSO support✅ API Access✅ Priority Support
You need to use a regex with an alternation operator (|) because the character you want to split on is encoded as three separate UTF-8 code units (bytes): E2, 9C and 85.
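If you want to confirm that byte sequence yourself, you can dump the character as hex. This is only a minimal sketch: it assumes the script is saved as UTF-8 and that the POSIX tools od and tr are available. The variable name bytes is purely illustrative.

```shell
# Dump the UTF-8 encoding of the check mark (U+2705) as hex digits.
# Assumption: this file is stored as UTF-8, so the literal below is 3 bytes.
bytes=$(printf '%s' '✅' | od -An -tx1 | tr -d ' \n')

# Show the bytes that have to appear in the record separator.
echo "$bytes"
```

The three byte pairs printed here are the ones that go into the \x escapes of the separator.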
You can use
awk 'BEGIN{RS="[:.!]|\xE2\x9C\x85"} tolower($0) ~ "$"'
See the online demo:
#!/bin/bash
s='✅ Team collaboration ✅ Project organisation✅ SSO support✅ API Access✅ Priority Support'
awk 'BEGIN{RS="[:.!]|\xE2\x9C\x85"} tolower($0) ~ "$"' <<< "$s"
Output:
Team collaboration
Project organisation
SSO support
API Access
Priority Support
Note that print $0 is the default action, so there is no need to write it explicitly.
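If you also want to drop the empty record that comes before the first ✅ and strip the spaces around each item, a variant along these lines may help. It is a sketch under assumptions, not part of the command above: it assumes an awk that accepts a regular expression as RS (gawk and mawk do), it passes the literal character through -v instead of using hex escapes, and the variable names s and out are illustrative.

```shell
s='✅ Team collaboration ✅ Project organisation✅ SSO support✅ API Access✅ Priority Support'

# RS is a regex: one of : . ! or the literal check mark.
# Each record is trimmed, and records that end up empty are skipped.
out=$(printf '%s\n' "$s" | awk -v RS='[:.!]|✅' '
{
    gsub(/^[ \t\n]+|[ \t\n]+$/, "")   # strip leading/trailing whitespace
    if ($0 != "") print               # skip the empty leading record
}')

printf '%s\n' "$out"
```

Writing the character literally keeps the separator readable, at the cost of requiring the script file and the input to use the same encoding.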