I have a Snowflake query that produces JSON output, one object per row — 29,000+ rows. I use this output to create records in an MDM system, but the ingestion process is inefficient and costly. I have since taken the final SELECT that builds the JSON, placed it in a CTE, and added a new final SELECT:
SELECT ARRAY_AGG(*) as clin_prov
FROM clin_prov
This overwhelmed the system with its size. Adding a LIMIT of 1000, or even 5000, inside the CTE produced results — one large JSON record, as expected and desired. I'd like the final output to be about 30 rows, each containing an array of <= 1000 JSON records. Is there a way to do that? Can I build arrays based on LIMIT and OFFSET and feed them into separate rows?
I tried with and without LIMIT in the clin_prov CTE. I also tried separate CTEs based on LIMIT and OFFSET combined with UNION ALL; that overloaded the system as well.
Rows can be grouped into "buckets" of a predefined size: assign each row a bucket number with ROW_NUMBER, then aggregate per bucket. The divisor inside CEIL sets the bucket size (30 rows per bucket here; use 1000 to get arrays of <= 1000 records). Example:
WITH cte AS (
    SELECT CEIL(ROW_NUMBER() OVER (ORDER BY NULL) / 30) AS grp, *
    FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF10.ORDERS
)
SELECT grp, ARRAY_AGG(OBJECT_CONSTRUCT(*)) AS records
FROM cte
GROUP BY grp;
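To see why this yields the row counts you want, here is a minimal Python sketch of the same arithmetic — CEIL(row_number / bucket_size) with 1-based row numbers — applied to your 29,000-row case. The function name `bucket` and the use of a plain range as stand-in rows are illustrative assumptions, not anything from Snowflake:

```python
import math

def bucket(rows, size):
    # Mimic CEIL(ROW_NUMBER() OVER (ORDER BY NULL) / size):
    # row numbers start at 1, so each group collects `size` consecutive rows.
    groups = {}
    for i, row in enumerate(rows, start=1):
        grp = math.ceil(i / size)
        groups.setdefault(grp, []).append(row)
    return groups

# 29,000 stand-in rows, buckets of 1,000 -> 29 groups of exactly 1,000 rows each
chunks = bucket(range(29_000), 1_000)
```

With a row count that is not a multiple of the bucket size, only the last group is smaller, so every array stays at or under the chosen limit.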