I am new to querying in Google BigQuery and am attempting to flatten an ARRAY field in my query so that the array values are listed as a single result in a comma separated list. In my query "associations.associatedvids" is an array field in the deals table. My issue is really is a 2 step problem as I also need to match the associatedvids with the corresponding first and last name fields in another table called contacts.
First, for the contact ids, when I do the following
Select
CAST(property_hs_object_id.value AS String) AS deal_ID,
associations.associatedvids AS associated_contacts_ID
From hubspot_data.deals
I get a result like this:
Row deal_ID associated_contacts_ID.value
1 1814103617 3240001
3239951
...
but what I want is:
Row deal_ID associated_contacts_ID.value
1 1814103617 3240001,3239951
...
I've tried different ways of unnesting the array, but cannot seem to get it right. For instance the following attempt returns the error "Scalar subquery produced more than one element".
Select
CAST(property_hs_object_id.value AS String) AS deal_ID,
(select associations.associatedvids from unnest(associations.associatedvids)) AS associated_contacts_ID
From hubspot_data.deals
Second, what I ultimately want is:
Row deal_ID associated_contact_names
1 1814103617 John Doe,Jane Doe
...
The names fields are property_firstname.value and property_lastname.value, and associations.associatedvids (data type ARRAY<STRUCT>)=contacts.vids (data type INT64). I've tired the following, but since the data types are different I'm getting an error.
Select
CAST(property_hs_object_id.value AS String) AS deal_ID,
(
select concat(property_firstname.value, " ", property_lastname.value)
from hubspot_data.contacts
where contacts.vid=associations.associatedvids
) AS contact_name
From hubspot_data.deals
Any guidance would be much appreciated!
EDIT: Here's is my attempt at a minimal working example piece of code. I believe the field I'm trying to query is an ARRAY of STURCTs with the data type of the Struct element I want being INT64.
WITH deals AS (
Select "012345" as deal_ID,
[STRUCT(["abc"] as company_ID, [123,678,810] as contact_ID)]
AS associations)
SELECT
deal_ID,
contacts
FROM deals d
CROSS JOIN UNNEST(d.associations) as contacts
this give me:
Row deal_ID contacts.company_ID contacts.contact_ID
1 012345 abc 123
678
810
but what I want is
Row deal_ID contacts.contact_ID
1 012345 123, 678, 810
And ultimately, I need to replace the contact_IDs with the contact first and last names that are in a different table (but fortunately not in an array).
Below is for BigQuery Standard SQL
Based on limited info in your question - I guess you are missing STRING_AGG in the second query you presented in your question
It should be
SELECT
CAST(property_hs_object_id.value AS String) AS deal_ID,
(SELECT STRING_AGG(associations.associatedvids) FROM UNNEST(associations.associatedvids)) AS associated_contacts_ID
FROM hubspot_data.deals
Update: answer on updated question
#standardSQL
SELECT
deal_ID,
ARRAY(
SELECT AS STRUCT
company_ID,
( SELECT STRING_AGG(CAST(id AS STRING), ', ')
FROM t.contact_ID id
) AS contact_ID
FROM d.associations t
) AS contacts
FROM deals d