yaml yq

Extract and merge object contents from two different trees


I have the following structure in yaml:

root1:
  foo:
    prop1: val1
  bar:
    prop1: val2
root2:
  foo:
    prop2: val3
  bar:
    prop2: val4

I wish to split the tree into multiple docs along the foo/bar axis:

root1:
  foo:
    prop1: val1
root2:
  foo:
    prop2: val3
---
root1:
  bar:
    prop1: val2
root2:
  bar:
    prop2: val4

I see yq has the split_doc function, but how would I first select the components from my tree and merge them into an array that I can split into multiple docs?


Solution

  • I thought this would be trivial with yq, but I struggled quite a bit to transform the single input mapping (hash/dictionary) into two independent, separate result mappings.

    I first needed to merge the input mapping with an empty object to get an independent copy that I could then transform.

    Eventually, this is what I ended up with:

    yq '((. * {} | map_values(pick(["foo"]))), (. * {} | map_values(pick(["bar"])))) | split_doc'
    

    Output:

    root1:
      foo:
        prop1: val1
    root2:
      foo:
        prop2: val3
    ---
    root1:
      bar:
        prop1: val2
    root2:
      bar:
        prop2: val4
    

    To split based on the actual values in the second level, without knowing them in advance, you can build on @pmf's excellent answer to initially extract the keys from the first mapping, then split the result:

    yq '[(to_entries[0].value|keys[]) as $axis | . * {} | map_values(pick([$axis]))][] | split_doc'
    

    This assumes that all top-level mappings contain the same keys. If that's not the case, extract and deduplicate the keys from all mappings:

    yq '[([.[]|keys[]]|sort|unique[]) as $axis | . * {} | map_values(pick([$axis]))][] | split_doc'
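
If the last expression is hard to read, here is the same logic as a rough Python sketch, with plain dicts standing in for the parsed YAML. It is only an illustration of what the yq pipeline computes, not how yq works internally. The function name `split_by_axis` is made up for this example, and emitting an empty mapping for a root that lacks the key is my assumption about how `pick` treats missing keys.

```python
from copy import deepcopy


def split_by_axis(tree):
    """Return one document per second-level key found under any root."""
    # Union of second-level keys across all roots, sorted for stable output
    # (the counterpart of `[.[]|keys[]]|sort|unique`).
    axes = sorted({key for sub in tree.values() for key in sub})
    docs = []
    for axis in axes:
        # Build a fresh mapping per axis instead of mutating the input,
        # which is the role `. * {}` plays in the yq expression.
        # Assumption: a root without this key ends up as an empty mapping.
        doc = {
            root: ({axis: deepcopy(sub[axis])} if axis in sub else {})
            for root, sub in tree.items()
        }
        docs.append(doc)
    return docs


tree = {
    "root1": {"foo": {"prop1": "val1"}, "bar": {"prop1": "val2"}},
    "root2": {"foo": {"prop2": "val3"}, "bar": {"prop2": "val4"}},
}

for doc in split_by_axis(tree):
    print(doc)
```

Because the keys are sorted, the bar document comes out before the foo document here, unlike the hard-coded first command where foo is listed first.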