qlikviewqliksense

QlikView/QlikSense: Transforming Column with Lots of Strings


I have the following structure as input in Qlik, comprising two columns. The first column (index) contains sequentially increasing numbers, while the second column, which requires transformation, consists of a series of strings separated by a character (in my case, a space).

What I aim to achieve is a transformation where, for each index, every string in the second column is repeated.

To illustrate, consider the following example using chapters of a book and words. The second column contains all the words of a book:

Input:

enter image description here

Output:

enter image description here

I want to emphasize that the challenge lies in handling multiple strings in the second column during this transformation. Creating an intermediate table with columns corresponding to individual strings is not feasible because there can be hundreds of strings.


Solution

  • You can achieve this by using the SubField() function, which lets you split a string by a delimiter. Normally you would also specify which nth substring you want to be returned but in the Data Load Editor, you can actually ignore that parameter, which will return each substring in a new record.

    Here's the script you can use:

    [data]:
    Load
        [Index]
      , SubField([Col], ' ') as NewCol
    Resident [data load];
    

    That returns the data how you need it:

    Screenshot of Qlik Sense result table

    A similar way to achieve this would be to add in the SubStringCount() function and the IterNo() function with the use of the while clause. Using those elements in combination allowing you to iterate over a given row, creating a new row for each iteration.

    Given this example:

    [data]:
    Load
        [Index]
      , SubField([Col], ' ', IterNo()) as NewCol
    Resident [data load]
      While IterNo() <= SubStringCount([Col], ' ') + 1
    ;
    

    ...we're saying that we want to iterate over a given row as many times as the space character ' ' appears in the [Col] field for that row. We add 1 to that expression since we assume there will always be one more substring than spaces. The IterNo() function returns the current iteration number, so we use as our argument for the field_no parameter, which is the third parameter in our SubField() function (the one we omitted in our first example).

    So, since there are two spaces in We went tiptoeing, we loop over the row 2+1 times, each iteration looking like this:

    SubField( 'We went tiptoeing' , ' ', 1) = We

    SubField( 'We went tiptoeing' , ' ', 2) = went

    SubField( 'We went tiptoeing' , ' ', 3) = tiptoeing