python openrefine grel

OpenRefine: create a shifted copy of a column


I wonder whether OpenRefine lets you access data from other rows when creating a new column. I suspect it does not (which would be a sane design principle), but there might be a hack around that.

Here is an example of what one could want to do: shifting a column by one row.

I have the following table:

╔═════╦════════╗
║ row ║ Model  ║
╠═════╬════════╣
║   1 ║ Quest  ║
║   2 ║ DF     ║
║   3 ║ Waw    ║
║   4 ║ Strada ║
╚═════╩════════╝

And I want to obtain the following result:

╔═════╦════════╦══════════╗
║ row ║ Model  ║ Previous ║
╠═════╬════════╬══════════╣
║   1 ║ Quest  ║          ║
║   2 ║ DF     ║ Quest    ║
║   3 ║ Waw    ║ DF       ║
║   4 ║ Strada ║ Waw      ║
╚═════╩════════╩══════════╝

Looking at https://github.com/OpenRefine/OpenRefine/wiki/Variables it seems that there isn't any variable that would let you access information outside the current row or record, so I wonder if this sort of operation is possible.


Solution

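  • Unfortunately, there is no "column" variable in OpenRefine. A possible workaround is to turn the whole dataset into a single record (for example, by adding a first column that is filled only in the first row and switching to records mode), then apply a bit of Python/Jython.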

    Example:

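    # Jython expression on the "Model" column; assumes the whole dataset is one record.
    data = row['record']['cells']['Model']['value']  # all "Model" values in the record
    for i, el in enumerate(data):
        # Caveat: matching by value gives wrong results if "Model" contains duplicates.
        if value == el and i != 0:
            return data[i - 1]
    return None  # first row: nothing precedes it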
    

    I don't know if a solution in GREL is possible.
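
To check the logic outside OpenRefine, here is a minimal plain-Python sketch of the same idea. The function names `previous_by_value` and `shift_down` are made up for this illustration and are not part of OpenRefine. The first function mirrors the value-matching loop above; the second shifts by position instead, which is an alternative I am assuming you would prefer when the column can contain repeated values.

```python
def previous_by_value(values, current):
    """Mirror of the Jython loop: find `current` and return the entry before it."""
    for i, el in enumerate(values):
        if el == current and i != 0:
            return values[i - 1]
    return None  # first row, or value not found


def shift_down(values, fill=None):
    """Position-based shift: row i receives the value of row i - 1."""
    if not values:
        return []
    return [fill] + list(values[:-1])


models = ["Quest", "DF", "Waw", "Strada"]
by_value = [previous_by_value(models, m) for m in models]
by_index = shift_down(models)
```

With the table from the question, both approaches give an empty first cell followed by Quest, DF, Waw. They only differ when a value occurs more than once: the value-matching version returns the predecessor of the first match it accepts, so a repeated model can get the wrong "Previous" entry.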