google-sheetsaveragegoogle-sheets-formulastandard-deviationiqr

Calculating statistical data for data sets by using single values with multiple quantities in Google Sheets


In Google Sheets, I've been working on data sets of varying prices to compare them to actual costs and excluding values outside of the normal ranges. I've had help previously, but I am unable to design more formulas based on the original formulas I've had help with.

In my data sets, I have very large expanding data sets I've typed manually where multiple numbers occur over again. I've notated each value with multiple values as value (x) where as the X represents the quantity or total amount of occurrences within the data set to make it smaller; as I need to type each manually

The formula to calculate the Average is below:

=ARRAYFORMULA(SUM(QUERY(SPLIT(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(
 IF(ISNUMBER(A19:AP19), "♦"&A19:AP19&" 1", IF((A19:AP19<>"")*(NOT(ISNUMBER(A19:AP19))), "♦"&REGEXREPLACE(A19:AP19, "[()]", ), ))),,99^99)),,99^99), "♦")), " "), 
 "select Col1*Col2 label Col1*Col2 ''", 0))/SUM(QUERY(SPLIT(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(
 IF(ISNUMBER(A19:AP19), "♦"&A19:AP19&" 1", IF((A19:AP19<>"")*(NOT(ISNUMBER(A19:AP19))), "♦"&REGEXREPLACE(A19:AP19, "[()]", ), ))),,99^99)),,99^99), "♦")), " "), 
 "select Col2", 0)))

This works extremely well. I've also have a formula that calculates the Min and Max of the data set.

I've attempted to edit the code to give myself for mode (which will not read value (x) values), and IQR (Interquartile Range) with absolutely no success.

I want to be able to find the Standard Deviation, Mean Deviation, and IQR for datasets while using values that are notated using values (x)

Below is a link of a sample of a data set: https://docs.google.com/spreadsheets/d/15-UY7salQ8fI011twnIKSjEJQWCSVmfV1wErE28gvKw/edit?usp=sharing


Solution

  • base formula (which takes your range B14:B and expands all values which are in format value (x))

    =ARRAYFORMULA(TRANSPOSE(SPLIT(QUERY(REPT(
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,1)&"♦", 
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,2)),,999^99), "♦")))
    

    0


    AVERAGE (B3)

    =ARRAYFORMULA(SUM(QUERY(SPLIT(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(
     IF(ISNUMBER(B14:B), "♦"&B14:B&" 1", IF((B14:B<>"")*(NOT(ISNUMBER(B14:B))), 
     "♦"&REGEXREPLACE(B14:B, "[()]", ), ))),,99^99)),,99^99), "♦")), " "), 
     "select Col1*Col2 label Col1*Col2 ''", 0))/
     SUM(QUERY(SPLIT(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(
     IF(ISNUMBER(B14:B), "♦"&B14:B&" 1", IF((B14:B<>"")*(NOT(ISNUMBER(B14:B))), 
     "♦"&REGEXREPLACE(B14:B, "[()]", ), ))),,99^99)),,99^99), "♦")), " "), 
     "select Col2", 0)))
    

    RANGE (B4)

    =B7-B8
    

    MEDIAN (B5)

    =ARRAYFORMULA(MEDIAN(TRANSPOSE(SPLIT(QUERY(REPT(
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,1)&"♦", 
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,2)),,999^99), "♦"))))
    

    MODE (B6)

    =ARRAYFORMULA(MODE(TRANSPOSE(SPLIT(QUERY(REPT(
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,1)&"♦", 
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,2)),,999^99), "♦"))))
    

    MAX (B7)

    =ARRAYFORMULA(MAX(QUERY(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(B14:B)
    ,,99^99)),,99^99), " ")), "where not Col1 contains '-'", 0)))
    

    MIN (B8)

    =ARRAYFORMULA(MIN(QUERY(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(B14:B)
    ,,99^99)),,99^99), " ")), "where not Col1 contains '-'", 0)))
    

    IQR - INTERQUARTILE RANGE (B9)

    =ARRAYFORMULA(QUARTILE(TRANSPOSE(SPLIT(QUERY(REPT(
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,1)&"♦", 
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,2)),,999^99), "♦")), 3)-
     QUARTILE(TRANSPOSE(SPLIT(QUERY(REPT(
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,1)&"♦", 
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,2)),,999^99), "♦")), 1))
    

    MEAN DEVIATION (B10)

    =ARRAYFORMULA(AVEDEV(TRANSPOSE(SPLIT(QUERY(REPT(
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,1)&"♦", 
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,2)),,999^99), "♦"))))
    

    STANDARD DEVIATION (B11)

    =ARRAYFORMULA(STDEV(TRANSPOSE(SPLIT(QUERY(REPT(
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,1)&"♦", 
     INDEX(IF(B14:B="",,SPLIT(IF(ISNUMBER(B14:B), B14:B&" 1", B14:B), " ()"))*1,,2)),,999^99), "♦"))))