sqlgoogle-bigqueryhistogramhtmldomcontentloaded

What is the difference of "dom_content_loaded.histogram.bin.start/end" in Google's BigQuery?


I need to build a histogram, concerning DOMContentLoaded of a webpage. When I used BigQuery, I noticed that apart from density, there are 2 more attributes (start, end). In my head there should only be 1 attribute, the DOMContentLoaded event is only fired when the DOM has loaded.

Can anyone help clarify the difference of .start and .stop? These attributes always have 100 milliseconds difference between them (if start = X ms, then stop = X+100 ms. See a query example posted below.

I can not understand what these attributes represent exactly:

dom_content_loaded.histogram.bin.START

AND

dom_content_loaded.histogram.bin.END

Q: Which one of them represents the time that the DOMContentLoaded event is fired in a user's browser?

SELECT
    bin.START AS start,
    bin.END AS endd
FROM
    `chrome-ux-report.all.201809`,
    UNNEST(dom_content_loaded.histogram.bin) AS bin
WHERE
    origin = 'https://www.google.com'

Output:

Row |start | end    
1    0       100
2    100     200
3    200     300
4    300     400
[...]

Solution

  • Below explains meaning of bin.start, bin.end and bin.density

    Run below SELECT statement

    SELECT
      origin, 
      effective_connection_type.name type_name, 
      form_factor.name factor_name,
      bin.start AS bin_start,
      bin.end AS bin_end,
      bin.density AS bin_density
    FROM `chrome-ux-report.all.201809`,
      UNNEST(dom_content_loaded.histogram.bin) AS bin
    WHERE origin = 'https://www.google.com'  
    

    You will get 1550 rows in result

    below are first 5 rows

    Row origin                  type_name   factor_name bin_start   bin_end bin_density  
    1   https://www.google.com  4G          phone       0           100     0.01065  
    2   https://www.google.com  4G          phone       100         200     0.01065  
    3   https://www.google.com  4G          phone       200         300     0.02705  
    4   https://www.google.com  4G          phone       300         400     0.02705  
    5   https://www.google.com  4G          phone       400         500     0.0225    
    

    You can read them as: for phone with 4G load of dom_content was loaded within 100 milliseconds for 1.065% of loads; in between 100 and 200 milliseconds for 1.065%; in between 200 and 300 milliseconds for 2.705% and so on

    To summarize for each origin, type and factor you got histogram that is represented as a repeated record with start and end of each bin along with density which represents percentage of respective user experience

    Note: if you add up the dom_content_loaded densities across all dimensions for a single origin, you will get 1 (or a value very close to 1 due to approximations).

    For example

    SELECT SUM(bin.density) AS total_density
    FROM `chrome-ux-report.all.201809`,
      UNNEST(dom_content_loaded.histogram.bin) AS bin
    WHERE origin = 'https://www.google.com'
    

    returns

    Row total_density    
    1   0.9995999999999978   
    

    Hope this helped