I'm working on some data in openrefine where many cells have multiple values that pair to a single value each in another multi-value cell in a different row. I want to turn that into one row with each paired data so turn something like this
Name | Account Number |
---|---|
John; Jennifer; JT | 0239; 2191; 1788 |
JT | 1788 |
Janet; John | 0591; 0239 |
into this:
Name | Account Number |
---|---|
John | 0239 |
Jennifer | 2191 |
JT | 1788 |
JT | 1788 |
Janet | 0591 |
John | 0239 |
Is this possible to do? If so any suggestions on how to do so?
I tried this but it didn't get me there.
EDIT: Here's what happens when I split each row using Edit Multi-Valued Cells and the semicolon separator. I'm unsure how to get the names next to the account numbers as fill down wouldn't work.
Name | Account Number |
---|---|
John | 0239 |
2191 | |
1788 | |
Jennifer | |
JT | |
JT | 1788 |
Janet | 0591 |
0239 | |
John |
The concept you are searching for is cell splitting for multi-valued cells. For details see the documentation about cell editing.
If you do not have a record structure yet I suggest you apply the cell splitting command from right to left.
You can forgo step 3 by activating the regular expression checkbox in the cell splitting dialog and use a regular expression like "\s*;\s*" as separator.