powerbi powerquery m

Bigrams of a string in Power Query M


I am trying to write a custom Power Query function to convert a given string into a list of its bigrams.

For example, "Hi World" should output

List
1 Hi
2 iW
3 Wo
4 or
5 rl
6 ld

I stumbled across an implementation of a recursive function in M here which led me to writing this:

( String1 as text ) =>
let
    S1 = Text.Remove(String1, " "),
    String1_Tokens = Text.ToList(S1),
    N = List.Count(String1_Tokens),
    func = (x as list, n as number, y as list) =>
        let 
            Calc = Text.Combine({x(n), x(n+1)}),
            Lister = List.Combine({y,{Calc}}),
            Check = if n = N-1 then Lister else @func(x, n+1, Lister)
        in Check,
    res = func(String1_Tokens, 1, {})
in res

The basic idea is the string is cleaned of white space and broken down into a list of its constituent symbols. That list is then passed into a function that takes the nth and n+1th symbol in the list and concatenates them, then appends them to the list of bigrams already there. This repeats until the list of symbols is exhausted.

When trying to invoke this function with an input string, I get the following error:

An error occurred in the ‘’ query. Expression.Error: We cannot convert a value of type List to type Function. Details: Value=[List] Type=[Type]

What is it that I am missing? I am fairly new to M, so I feel I'm definitely missing something fundamental here.


Solution

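  • You almost got it! Two things needed changing. First, M reads list items with curly braces, x{n}; writing x(n) tries to call the list as a function, which is exactly what the error message is complaining about. Second, lists in M are zero-based, so the recursion has to start at index 0 and stop once n reaches N-2, the last index that still has a neighbour to its right.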

    ( String1 as text ) =>
    let
        S1 = Text.Remove(String1, " "), // Remove spaces
        String1_Tokens = Text.ToList(S1), // Convert the string to a list of characters
        N = List.Count(String1_Tokens), // Get the length of the list
        func = (x as list, n as number, y as list) =>
            let 
                // Combine the current and next characters
                Calc = Text.Combine({x{n}, x{n+1}}),
                // Append the result to the list
                Lister = List.Combine({y, {Calc}}),
                // Check if we are at the second to last element, stop recursion if we are
                Check = if n = N-2 then Lister else @func(x, n+1, Lister)
            in 
                Check,
        // Start the recursion at index 0
        res = func(String1_Tokens, 0, {})
    in 
        res
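
    To see the indexing rule in isolation, here is a minimal sketch you can paste into a blank query. The names letters, first and missing are only illustrative and not part of the function above.

    let
        letters = {"H", "i", "W"},
        // Curly braces read an item by position; positions start at 0
        first = letters{0},
        // A trailing ? returns null instead of raising an error when the position does not exist
        missing = letters{5}?,
        // letters(0) would fail, because parentheses invoke a function and letters is a list
        result = {first, missing}
    in
        result

    This should give a list holding "H" and null.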
    

    Side note: I think you wrote a very elegant recursive function, but if I were to do it (not that I'm an expert, it's just what I like to do), I would avoid recursion, mostly because I always get lost in it:

    ( String1 as text ) =>
    let
        S1 = Text.Remove(String1, " "),    // Remove spaces
        charList = Text.ToList(S1),        // Convert the string to a list of characters
        N = List.Count(charList),          // Get the count of the list
        positions = List.Numbers(0, N-1),  // List.Numbers(start, count): N-1 positions, i.e. 0 through N-2
        bigrams = List.Transform(positions, each if _ < N-1 then Text.Combine({charList{_}, charList{_+1}}) else null),
                                           // Safety net only: with positions ending at N-2 the guard above never yields null
        cleanedBigrams = List.RemoveNulls(bigrams)
    in
        cleanedBigrams
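
    If you would rather skip the index arithmetic altogether, another option is to pair each character with the one after it. This is a different technique from the two versions above (zipping two shifted copies of the list), offered as a sketch rather than the definitive way to do it; the step names chars and pairs are my own.

    ( String1 as text ) as list =>
    let
        // Remove spaces, then split the text into single characters
        chars = Text.ToList(Text.Remove(String1, " ")),
        // Left list drops the last character, right list drops the first,
        // so zipping them lines up every character with its successor
        pairs = List.Zip({List.RemoveLastN(chars, 1), List.Skip(chars, 1)}),
        // Each pair is a two-item list; join it into one bigram
        bigrams = List.Transform(pairs, each Text.Combine(_))
    in
        bigrams

    For "Hi World" this should return Hi, iW, Wo, or, rl, ld. It also returns an empty list when fewer than two characters are left after removing spaces, a case the recursive version does not handle because it assumes there is always a next character to read.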