I need to iterate over some pairs of strings in a program that I am writing. Instead of putting the string pairs in a big table-of-tables, I am putting them all in a single string, because I think the end result is easier to read:
function two_column_data(data)
return data:gmatch('%s*([^%s]+)%s+([^%s]+)%s*\n')
end
for a, b in two_column_data [[
Hello world
Olá hugomg
]] do
print( a .. ", " .. b .. "!")
end
The output is what you would expect:
Hello, world!
Olá, hugomg!
However, as the name indicates, the two_column_data
function only works if there are two exactly columns of data. How can I make it so it works on any number of columns?
for x in any_column_data [[
qwe
asd
]] do
print(x)
end
for x,y,z in any_column_data [[
qwe rty uio
asd dfg hjk
]] do
print(x,y,z)
end
I'm OK with using lpeg for this task if its necessary.
Here is an lpeg re version
function re_column_data(subj)
local t, i = re.compile([[
record <- {| ({| [ %t]* field ([ %t]+ field)* |} (%nl / !.))* |}
field <- escaped / nonescaped
nonescaped <- { [^ %t"%nl]+ }
escaped <- '"' {~ ([^"] / '""' -> '"')* ~} '"']], { t = '\t' }):match(subj)
return function()
local ret
i, ret = next(t, i)
if i then
return unpack(ret)
end
end
end
It basicly is a redo of the CSV sample and supports quoted fields for some nice use-cases: values with spaces, empty values (""), multi-line values, etc.
for a, b, c in re_column_data([[
Hello world "test
test"
Olá "hug omg"
""]].."\tempty a") do
print( a .. ", " .. b .. "! " .. (c or ''))
end