SNOWFLAKE - Query that gets a column sum and aggregates array column
23:12 06 Jul 2020

I have a table like:

|------|-------|----------------------|
|  id  |  qty  |      collection      |
|------|-------|----------------------|
| foo  |   2   |    ['foo', 'bar']    |
|------|-------|----------------------|
| foo  |   4   |    ['baz', 'qux']    |
|------|-------|----------------------|
| bar  |   8   |    ['beep', 'boop']  |
|------|-------|----------------------|

and I want an output like:

|------|-------|------------------------------------|
|  id  |  qty  |      collection                    |
|------|-------|------------------------------------|
| foo  |   6   |    ['foo', 'bar', 'baz', 'qux']    |
|------|-------|------------------------------------|
| bar  |   8   |    ['beep', 'boop']                |
|------|-------|------------------------------------|

My first attempt was to do something like

SELECT
    id, SUM(qty), ARRAY_AGG(collection)
FROM my_table  -- my_table is a placeholder name for the table above
GROUP BY id

which gives me the correct qty sum, but ARRAY_AGG returns a multidimensional (nested) array instead of a single flat one.

Doing something like a lateral flatten gives me the correct output array, but the sum is off because flattening the array creates an extra row per element, and each of those rows carries the original qty.
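
For reference, the flatten attempt looked roughly like the sketch below. The table name my_table and the aliases t and f are placeholders for illustration, not the real names. With the sample data above, every array has two elements, so each qty is counted twice: foo sums to 12 instead of 6, and bar to 16 instead of 8.

```sql
-- Sketch of the LATERAL FLATTEN attempt (my_table, t and f are placeholder names).
SELECT
    t.id,
    SUM(t.qty)         AS qty,        -- overcounts: qty repeats once per array element
    ARRAY_AGG(f.value) AS collection  -- flat array of elements, which is the shape I want
FROM my_table t,
    LATERAL FLATTEN(input => t.collection) f  -- emits one row per array element
GROUP BY t.id;
```

So the array comes out in the right shape, but the sum is computed over the exploded rows rather than the original ones.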

arrays group-by snowflake-cloud-data-platform