I have a list of inputs in google sheets,
Input | Desired Output | "To demonstrate only not an input" The repeated letters |
---|---|---|
Outdoors | Match | o |
dog | No Match | |
step | No Match | |
bee | Match | e |
Chessboard | Match | s |
Cookbooks | Match | o, k |
How do I verify if all letters are unique in a string without splitting it?
In other words if the string has one letter or more occurred twice or more, return TRUE
My process so far
I tried this solution in addition to splitting the string and dividing the length of the string on the COUNTA
of unique letters of the string, if = 1 "Match", else "No match"
Or using regex
I found a method to match a letter is occure in a string 2 times this demonstration with REGEXEXTRACT
But wait what needed is get TRUE when the letters are not unique in the string
=REGEXEXTRACT(A1,"o{2}?")
Returns:
oo
Something like this would do
=REGEXMATCH(Input,"(anyletter){2}?")
OR like this
=REGEXMATCH(lower(A6),"[a-zA-Z]{2}?")
Notes
TRUE
or FALSE
instead of Match
or No Match
to keep it simple.More examples
Input | Desired Output |
---|---|
Professionally | Match |
Attractiveness | Match |
Uncontrollably | Match |
disreputably | No Match |
Recommendation | Match |
Interrogations | Match |
Aggressiveness | Match |
doublethinks | No Match |
You are explicitly asking for an answer using a single regular expression. Unfortunately there is no such thing as a backreference to a former capture group using RE2. So if you'd spell out the answer to your problem it would look like:
=INDEX(IF(A2:A="","",REGEXMATCH(A2:A,"(?i)(?:a.*a|b.*b|c.*c|d.*d|e.*e|f.*f|g.*g|h.*h|i.*i|j.*j|k.*k|l.*l|m.*m|n.*n|o.*o|p.*p|q.*q|r.*r|s.*s|t.*t|u.*u|v.*v|w.*w|x.*x|y.*y|z.*z)")))
Since you are looking for case-insensitive matching (?i)
modifier will help to cut down the options to just the 26 letters of the alphabet. I suppose the above can be written a bit neater like:
=INDEX(IF(A2:A="","",REGEXMATCH(A2:A,"(?i)(?:"&TEXTJOIN("|",1,REPLACE(REPT(CHAR(SEQUENCE(26,1,65)),2),2,0,".*"))&")")))
EDIT 1:
The only other reasonable way to do this (untill I learned about the PREG supported syntax of the matches clause in QUERY()
by @DoubleUnary) with a single regex other than the above is to create your own UDF in GAS (AFAIK). It's going to be JavaScript based thus supporting a backreferences. GAS is not my forte, but a simple example could be:
function REGEXMATCH_JS(s) {
if (s.map) {
return s.map(REGEXMATCH_JS);
} else {
return /([a-z]).*?\1/gi.test(s);
}
}
The pattern ([a-z]).*?\1
means:
([a-z])
- Capture a single character in range a-z;.*?\1
- Look for 0+ (lazy) characters up to a copy of this 1st captured character with a backreference.The match is global and case-insensitive. You can now call:
=INDEX(IF(A2:A="","",REGEXMATCH_JS(A2:A)))
EDIT 2:
For those that are benchmarking speed, I am not testing this myself but maybe this would speed things up:
=INDEX(REGEXMATCH(A2:INDEX(A:A,COUNTA(A:A)),"(?i)(?:a.*a|b.*b|c.*c|d.*d|e.*e|f.*f|g.*g|h.*h|i.*i|j.*j|k.*k|l.*l|m.*m|n.*n|o.*o|p.*p|q.*q|r.*r|s.*s|t.*t|u.*u|v.*v|w.*w|x.*x|y.*y|z.*z)"))
Or:
=INDEX(REGEXMATCH(A2:INDEX(A:A,COUNTA(A:A)),"(?i)(?:"&TEXTJOIN("|",1,REPLACE(REPT(CHAR(SEQUENCE(26,1,65)),2),2,0,".*"))&")"))
Or:
=REGEXMATCH_JS(A2:INDEX(A:A,COUNTA(A:A)))
Respectively. Knowing there is a header in 1st row.