I am auditing user details from my application using open Id login .If a first time a user is login a OPEN ID we consider as signup . I am generating audit signin report using this details . Sample Table Data.
+---------+----------+-----------+---------------+
| USER_ID | PROVIDER | OPERATION | TIMESTAMP |
+---------+----------+-----------+---------------+
| 120 | Google | SIGN_UP | 1347296347000 |
| 120 | Google | SIGN_IN | 1347296347000 |
| 121 | Yahoo | SIGN_IN | 1347296347000 |
| 122 | Yahoo | SIGN_IN | 1347296347000 |
| 120 | Google | SIGN_UP | 1347296347000 |
| 120 | FaceBook | SIGN_IN | 1347296347000 |
+---------+----------+-----------+---------------+
In this table I want to exclude already SIGN_UP ed "SIGN_IN" ed user count based on provider .
Show Create table
CREATE TABLE `signin_details` (
`USER_ID` int(11) DEFAULT NULL,
`PROVIDER` char(40) DEFAULT NULL,
`OPERATION` char(40) DEFAULT NULL,
`TIMESTAMP` bigint(20) DEFAULT NULL
) ENGINE=InnoDB
I am using this query .
select
count(distinct(USER_ID)) as signin_count,
PROVIDER from signin_details s1
where
s1.USER_ID NOT IN
(
select
USER_ID
from signin_details
where
signin_details.PROVIDER=s1.PROVIDER
and signin_details.OPERATION='SIGN_UP'
and signin_details.TIMESTAMP/1000 BETWEEN UNIX_TIMESTAMP(CURRENT_DATE()-INTERVAL 1 DAY) * 1000 AND UNIX_TIMESTAMP(CURRENT_DATE()) * 1000
)
AND OPERATION='SIGN_IN' group by PROVIDER;
Explain Output:
+----+--------------------+----------------+------+---------------+------+---------+------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------------+------+---------------+------+---------+------+------+-----------------------------+
| 1 | PRIMARY | s1 | ALL | NULL | NULL | NULL | NULL | 6 | Using where; Using filesort |
| 2 | DEPENDENT SUBQUERY | signin_details | ALL | NULL | NULL | NULL | NULL | 6 | Using where |
+----+--------------------+----------------+------+---------------+------+---------+------+------+-----------------------------+
Query Output :
+--------------+----------+
| signin_count | PROVIDER |
+--------------+----------+
| 1 | FaceBook |
| 2 | Yahoo |
+--------------+----------+
It takes more than 40 minutes to execute for 200k rows.
My assumption is it will check each row with total number of dependant subquery output.
My Assumption on this query.
A -> Dependant Outputs (B,C,D) .
A check with B
A check with C
A check with D
If dependant query output is larger it will take so long time to execute. How to improve this query?
If you use MySQL you have to know that sub queries performs awful slow.
IN
is slow...
EXISTS
is often faster then IN
JOIN
is mostly the fastest way do things like this.
SELECT DISTINCT
s1.PROVIDER,
COUNT(DISTINCT s1.USER_ID)
FROM
signin_details s1
LEFT JOIN
(
SELECT DISTINCT
USER_ID, PROVIDER
FROM
signin_details
WHERE
signin_details.OPERATION='SIGN_UP'
AND
signin_details.TIMESTAMP
BETWEEN
UNIX_TIMESTAMP(CURRENT_DATE()-INTERVAL 1 DAY) * 1000
AND UNIX_TIMESTAMP(CURRENT_DATE()) * 1000
) AS t USING (USER_ID, PROVIDER)
WHERE
t.USER_ID IS NULL
AND OPERATION='SIGN_IN'
GROUP BY s1.PROVIDER
http://sqlfiddle.com/#!2/122ac/12
NOTE: If you wonder about the sqlfiddle result consider here is a UNIX_TIMESTAMP
in the query.
Result:
| PROVIDER | COUNT(DISTINCT S1.USER_ID) |
-----------------------------------------
| FaceBook | 1 |
| Yahoo | 2 |
MySQL and the INTERSECT
story. You get all combinations of USER_ID
and PROVIDER
which you don't want to count. Then LEFT JOIN
them to your data. Now all the rows you want to count have no values from the LEFT JOIN
. You get them by t.USER_ID IS NULL
.
Input:
| rn° | USER_ID | PROVIDER | OPERATION | TIMESTAMP |
-------------------------------------------------------
| 1 | 120 | Google | SIGN_UP | 1347296347000 | -
| 2 | 120 | Google | SIGN_IN | 1347296347000 | - (see rn° 1)
| 3 | 121 | Yahoo | SIGN_IN | 1347296347000 | Y
| 4 | 122 | Yahoo | SIGN_IN | 1347296347000 | Y
| 5 | 120 | Google | SIGN_UP | 1347296347000 | -
| 6 | 120 | FaceBook | SIGN_IN | 1347296347000 | F
| 7 | 119 | FaceBook | SIGN_IN | 1347296347000 | - (see rn° 8)
| 8 | 119 | FaceBook | SIGN_UP | 1347296347000 | -