In a string like "30 x 40 x 900" or "3 x 4 x 90", I need to replace the space characters around the "x", if present, with a special non-breaking character. I came up with:
s = s:gsub('(%d)(%s*x%s*)(%d)', '%1\u{2009}x\u{2009}%3')
which turns out to work for the first, but not for the second string. In the second string, only the space characters around the first "x" will be replaced. Is this the case because the last item of my search pattern happens to be the first item at the same time? Or, in other words: If "3 x 4" were found as the first string part to be replaced, then the "4 x 9" doesn't seem to be recognized as the next string part. Correct? Am I missing something here, or is this simply the case?
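Yes, that is exactly what happens. gsub never produces overlapping matches: once a match is found and replaced, the search continues from the position immediately after the matched text, so a character consumed by one match cannot be part of the next. Here's what happens with your pattern: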
(%d)(%s*x%s*)(%d)
"3 x 4 x 90" - First match: "3 x 4" (positions 1-5)
Search resumes from position 6, which is " x 90"
The pattern needs a digit before the "x", but the "4" was already consumed by the first match, and position 6 is a space
No second match found
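The steps above can be checked directly. The following is only a small demonstration, assuming Lua 5.3 or later (for the \u{...} escape); the variable names are illustrative and not part of the original code. It uses string.find to show where the first match ends and that the remaining text contains no further match.

```lua
-- Demonstration only: where does the pattern match, and what is left over?
local s = "3 x 4 x 90"
local pattern = "(%d)(%s*x%s*)(%d)"

-- find returns the start and end index of the first match (captures ignored here)
local first_start, first_end = s:find(pattern)
print(first_start, first_end)

-- gsub resumes right after the previous match, so the "4" is no longer available
local rest = s:sub(first_end + 1)
print(("[%s]"):format(rest))
print(rest:find(pattern))

-- The second return value of gsub is the number of substitutions in one pass
local _, count = s:gsub(pattern, "%1\u{2009}x\u{2009}%3")
print(count)
```

A single pass therefore reports one substitution for this input, even though the string contains two "x" separators.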
To work around this, run gsub repeatedly until the string stops changing, so that each pass picks up the matches the previous pass skipped:
repeat
  local old_s = s
  s = s:gsub('(%d)(%s*x%s*)(%d)', '%1\u{2009}x\u{2009}%3')
until s == old_s
Alternatively, use the second return value of gsub, the number of substitutions made, and stop once a pass replaces nothing:
local count
repeat
  s, count = s:gsub('(%d)(%s*x%s*)(%d)', '%1\u{2009}x\u{2009}%3')
until count == 0
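Here is a complete example with the sample input. The loop terminates because, in the default C locale, %s does not match the UTF-8 bytes of U+2009, so separators that were already converted are not matched again: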
local s = "3 x 4 x 90"
local count
repeat
  -- each pass converts the separators that the previous pass skipped
  s, count = s:gsub('(%d)(%s*x%s*)(%d)', '%1\u{2009}x\u{2009}%3')
until count == 0
print(s) --> 3 x 4 x 90 (with U+2009 thin spaces around both x's)
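If you would rather avoid the loop, a single pass is possible by not consuming the trailing digit at all. The following is a minimal sketch of that alternative, not the approach shown above: it swaps the third capture for a frontier pattern, %f[%d], which only asserts that a digit starts at that position. It assumes Lua 5.3 or later (the \u{...} escape needs 5.3; %f is documented since 5.2), and thin_x is a hypothetical helper name.

```lua
-- Sketch: single-pass replacement using a frontier pattern.
-- thin_x is a hypothetical helper name, used here for illustration only.
local function thin_x(s)
  -- %f[%d] matches the empty position in front of a digit without consuming it,
  -- so that digit is still available as the leading (%d) of the next match.
  -- The outer parentheses drop gsub's second return value (the count).
  return (s:gsub("(%d)%s*x%s*%f[%d]", "%1\u{2009}x\u{2009}"))
end

print(thin_x("3 x 4 x 90"))
print(thin_x("30 x 40 x 900"))
print(thin_x("3x4"))
```

Because the digit after the "x" is never part of the match, consecutive separators such as "3 x 4 x 90" are all handled in one call, and text where no digit follows the "x" is left unchanged.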