I am trying to write a custom Power Query function that converts a given string into a list of its bigrams.
For example, "Hi World" should output:
| | List |
|---|---|
| 1 | Hi |
| 2 | iW |
| 3 | Wo |
| 4 | or |
| 5 | rl |
| 6 | ld |
I stumbled across an implementation of a recursive function in M here, which led me to write this:
( String1 as text ) =>
let
    S1 = Text.Remove(String1, " "),
    String1_Tokens = Text.ToList(S1),
    N = List.Count(String1_Tokens),
    func = (x as list, n as number, y as list) =>
        let
            Calc = Text.Combine({x(n), x(n+1)}),
            Lister = List.Combine({y,{Calc}}),
            Check = if n = N-1 then Lister else @func(x, n+1, Lister)
        in Check,
    res = func(String1_Tokens, 1, {})
in res
The basic idea is that the string is stripped of white space and broken down into a list of its constituent symbols. That list is then passed to a function that takes the nth and (n+1)th symbols, concatenates them, and appends the result to the list of bigrams built so far. This repeats until the list of symbols is exhausted.
When trying to invoke this function with an input string, I get the following error:
An error occurred in the ‘’ query. Expression.Error: We cannot convert a value of type List to type Function. Details: Value=[List] Type=[Type]
What is it that I am missing? I am fairly new to M, so I feel I'm definitely missing something fundamental here.
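You almost got it! Two things need to change. First, x(n) with parentheses tries to invoke x as a function, which is what raises the "cannot convert a value of type List to type Function" error; list items are accessed with curly braces, x{n}. Second, lists in M are zero-based, so the recursion has to start at index 0 and stop at n = N-2: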
( String1 as text ) =>
let
    S1 = Text.Remove(String1, " "), // Remove spaces
    String1_Tokens = Text.ToList(S1), // Convert the string to a list of characters
    N = List.Count(String1_Tokens), // Get the length of the list
    func = (x as list, n as number, y as list) =>
        let
            // Combine the current and next characters
            Calc = Text.Combine({x{n}, x{n+1}}),
            // Append the result to the list
            Lister = List.Combine({y, {Calc}}),
            // Check if we are at the second to last element, stop recursion if we are
            Check = if n = N-2 then Lister else @func(x, n+1, Lister)
        in
            Check,
    // Start the recursion at index 0
    res = func(String1_Tokens, 0, {})
in
    res
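The difference between the two versions comes down to how M reads brackets: curly braces index into a list (starting at 0), while parentheses invoke a function. Here is a small sketch of item access that you can paste into a blank query; the names chars, first, bigram and outOfRange are only for illustration:

```powerquery
let
    // "Hi World" with the space removed, split into single characters
    chars = Text.ToList("HiWorld"),
    // Curly braces access an item; the first item is at index 0
    first = chars{0},
    // Two neighbouring items joined into one bigram
    bigram = chars{0} & chars{1},
    // A trailing ? makes the access optional: null instead of an error when out of range
    outOfRange = chars{99}?
in
    [First = first, Bigram = bigram, OutOfRange = outOfRange]
```

Writing chars(0) instead of chars{0} in this query should reproduce the List to Function conversion error from the question.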
Side note: I think you wrote a very elegant recursive function, but if I were to do it (not that I'm an expert, it's just what I like to do), I would avoid recursion, mostly because I always get lost in it:
( String1 as text ) =>
let
    S1 = Text.Remove(String1, " "), // Remove spaces
    charList = Text.ToList(S1), // Convert the string to a list of characters
    N = List.Count(charList), // Get the count of the list
    positions = List.Numbers(0, N-1), // N-1 positions starting at 0, i.e. 0 to N-2 (the second argument is a count)
    bigrams = List.Transform(positions, each if _ < N-1 then Text.Combine({charList{_}, charList{_+1}}) else null),
    // Safety net: the guard above never yields null for these positions, so this normally removes nothing
    cleanedBigrams = List.RemoveNulls(bigrams)
in
    cleanedBigrams
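One thing to watch for: the recursive version assumes at least two characters remain after the spaces are removed. If shorter input is possible, something like the following might work. This is only a sketch, assuming an empty list is the result you want in that case. The names Bigrams, input, cleaned and n are mine, and it uses Text.Middle to slice two characters at a time rather than building a character list first:

```powerquery
let
    // Hypothetical function name, not from the original post
    Bigrams = (input as text) as list =>
        let
            cleaned = Text.Remove(input, " "),   // Remove spaces
            n = Text.Length(cleaned),            // Number of remaining characters
            // Fewer than two characters means there are no bigrams to build
            result = if n < 2
                     then {}
                     else List.Transform({0..n-2}, each Text.Middle(cleaned, _, 2))
        in
            result,
    // Example call with the string from the question
    Example = Bigrams("Hi World")
in
    Example
```

For "Hi World" this should give the same six bigrams as in the question (Hi, iW, Wo, or, rl, ld).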