openrefinegoogle-refine

How convert a table format or it structure with Google Refine


I have a table with the following format:

ID  Estation    Y   M   D   H   N   Nh     h   Cl
1   78357     2017  5   1   1   0   0   -9001   0
2   78357     2017  5   1   2   0   0   -9001   0
3   78357     2017  5   1   3   1   1    750    5

I want to convert the data in this table to the following format:

ID  Estation    Y   M   D   H   Var Value
1   78357     2017  5   1   1   N    0
2   78357     2017  5   1   2   N    0
3   78357     2017  5   1   3   N    1
4   78357     2017  5   1   1   Nh   0
5   78357     2017  5   1   2   Nh   0
6   78357     2017  5   1   3   Nh   1
7   78357     2017  5   1   1   h   -9001
8   78357     2017  5   1   2   h   -9001
9   78357     2017  5   1   3   h    750
10  78357     2017  5   1   1   Cl   0
11  78357     2017  5   1   2   Cl   0
12  78357     2017  5   1   3   Cl   5

Due to the amount of registration I must take from one format to another I want to do it using Google Refine. Someone has any idea how to do it?.


Solution

  • You can do this in Google Refine (now called OpenRefine) using the Transpose option.

    In the 'N' column click the drop down menu and choose "Transpose -> Transpose cells across columns into rows"

    In the screen shown choose "N" as the "From Column" and "(last column)" as the "To Column"

    Choose to Transpose into Two New Columns. Call the Key column "Var" and the Value column "Value"

    Check the box that says "Fill down in other columns"

    Click Transpose

    This should give you the various variables & values in a single column with multiple rows

    To sort in the order you give in your example maybe challenging. If you Sort the Var col in reverse alphabetical order it is close although not quite - not sure how important this is to you.

    Remember in OpenRefine you have to choose to Reorder Rows Permanently to commit the new sort order.

    You may have to transform the ID column to renumber with unique IDs. You can do this with the GREL rowIndex+1 once you have got the sort order correct