I am looking to retrieve in a BigQuery SQL query data from a table named products
and 2 attributes, the material
, and the material_color
.
My base table products
has only the the first material and material color.
The remaining materials are in a separate table named product_properties
, stored by key value (field_name and field_value). I need to use a field called sequence
to pair the material with its material color.
For example, in products
I only have:
product_id | material | material_color |
---|---|---|
product_1 | paper | white |
My second table, product_properties
has :
product_id | field_name | field_value | sequence |
---|---|---|---|
product_1 | material | metal | 1 |
product_1 | material_color | grey | 1 |
product_1 | material | wood | 2 |
product_1 | material_color | brown | 2 |
Expected Result:
product_id | material | material_color |
---|---|---|
product_1 | paper | white |
product_1 | metal | grey |
product_1 | wood | brown |
I managed to get the expected result by doing 2 self joins on product_properties
, one on field_name="material" and the other one on field_name="material_color", and then doing an UNION ALL
with the base table.
Is there a more optimized way to achieve this ?
Use below approach
select * from table_1
union all
select * except(sequence) from table_2 pivot (
max(field_value) for field_name in ('material', 'material_color')
)
if applied to sample data in your question - output is as below