google-sheetsgoogle-sheets-formulagoogle-sheets-querygoogle-query-languagestdev

How to get average and standard deviation of grouped data?


I'm trying to work with a database using Google Sheets.

I am able to access the database using the importJson script from here:https://github.com/bradjasper/ImportJSON

using the query: =importJson("https://mywebsite.com/json-data");

It populates my Google Sheets just fine like this mockup:

+-----+--------+---------+
| Age | Weight | Country |
+-----+--------+---------+
| 10  | 120    | US      |
+-----+--------+---------+
| 10  | 140    | US      |
+-----+--------+---------+
| 10  | 160    | US      |
+-----+--------+---------+
| 11  | 180    | CA      |
+-----+--------+---------+
| 12  | 190    | GBR     |
+-----+--------+---------+
| 12  | 200    | GBR     |
+-----+--------+---------+

I'd like to be able to show the data grouped by age, with an average weight for that age, standard deviation for the weights in that age, like so...

+-----+----------------+----------+
| Age | Average weight | Std. Dev |
+-----+----------------+----------+
| 10  | 140            | 20       |
+-----+----------------+----------+
| 11  | 180            | 0        |
+-----+----------------+----------+
| 12  | 195            | 5        |
+-----+----------------+----------+

I've tried to play around with creating a pivot table in sheets, but I can't figure out how to get the average of all values for each age in a column or the std deviation.

When I try to add a column with weight values by average, I get a DIV/0 error.

Is this doable in Sheets?


Solution

  • 0

    paste in E2 cell:

    =QUERY(A2:B, 
     "select A,avg(B) 
      where A is not null 
      group by A 
      label avg(B)''", 0)
    

    paste in G2 cell and drag down:

    =IFERROR(STDEV(FILTER(B:B, A:A=E2)), 0)
    

    paste in H2 cell and drag down:

    =IFERROR(STDEVP(FILTER(B:B, A:A=E2)), 0)
    

    UPDATE:

    STDEV and STDEVP are now supported under array processing

    =BYROW(E2:E4, LAMBDA(xx, IFERROR(STDEV(FILTER(B:B, A:A=xx)), 0)))
    

    enter image description here

    all-in-one formula solution:

    ={QUERY(A2:B, 
     "select A,avg(B) where A is not null 
      group by A label avg(B)''", 0), 
     BYROW(UNIQUE(FILTER(A2:A, A2:A<>"")), 
     LAMBDA(xx, IFERROR(STDEV(  FILTER(B:B, A:A=xx)), 0))), 
     BYROW(UNIQUE(FILTER(A2:A, A2:A<>"")), 
     LAMBDA(xx, IFERROR(STDEV.P(FILTER(B:B, A:A=xx)), 0)))}
    

    enter image description here