sqlgoogle-bigquery

How to count a feature x if y condition is true (bigquery)


I am new in machine learning and I am running one of my first machine learning problems. I do a lot of research by myself but could not find the solution for this question.

I am creating new variables from my dataset in Bigquery. One of the variables consists in count a variable (x) if a condition in variable (y) is met. I cannot use WHERE or HAVING since the condition is only related to the calculation of this specific variable.

The code is more or less like this:

COUNT(DISTINCT sessionid IF(date > “2018-06-01” && date < “2018-06-30”)) 
AS sessions_lastmonth

I know that this code is not correct and could not be but is just a way to express more or less what I need. The goal is to know the number of sessions in a certain period. I cannot use the count for date because in one day you can have more than one session and not every row is a different session (a session can have several lines because the user can go through many pages in the same session).

Thank you!


Solution

  • Below for BigQuery Standard SQL

    COUNT(DISTINCT IF(date >= '2018-06-01' AND date <= '2018-06-30', sessionid, NULL)) AS sessions_lastmonth