mysql mysql-python

Why does the MySQL IN keyword not consider NULL values?


I am using the following query:

select count(*) from Table1 where CurrentDateTime>'2012-05-28 15:34:02.403504' and Error not in ('Timeout','Connection Error');

Surprisingly, this statement does not include the rows whose Error value is NULL. My intention is to filter out only the rows whose Error value is 'Timeout' or 'Connection Error'. I have to add an extra condition (OR Error IS NULL) to get the correct result.

Why is MySQL filtering out rows with NULL values? I thought that the IN keyword would return a boolean result (1/0), and now I understand that some MySQL keywords don't return only boolean values; they might return NULL too. But why is NULL treated as special?


Solution

  • This:

    Error not in ('Timeout','Connection Error');
    

    is semantically equivalent to:

    Error <> 'Timeout' AND Error <> 'Connection Error'
    

    The rules about NULL comparison apply to IN too. So if the value of Error is NULL, each comparison yields NULL instead of true, and the database can't make the expression true.

    To fix, you could do this:

    COALESCE(Error,'') not in ('Timeout','Connection Error');
    

    Or better yet:

    Error IS NULL OR Error not in ('Timeout','Connection Error');
    

    Or even better:

    CASE WHEN Error IS NULL THEN 1
         WHEN Error not in ('Timeout','Connection Error') THEN 1
         ELSE 0
    END = 1
    

    OR is not guaranteed to short-circuit, whereas CASE evaluates its WHEN branches in order, so it can short-circuit the condition.
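
As a rough check that the rewrites above select the same rows, here is a minimal sketch in Python. It uses the standard-library sqlite3 module with an in-memory database as a stand-in for MySQL (an assumption; NOT IN treats NULL by the same SQL rules in both), the sample rows in Table1 are made up for illustration, and the CASE variant is written in the WHEN ... WHEN ... ELSE form.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Error TEXT NULL)")

# Hypothetical sample rows: two errors to exclude, one other error, two NULLs.
conn.executemany(
    "INSERT INTO Table1 (Error) VALUES (?)",
    [("Timeout",), ("Connection Error",), ("Disk Full",), (None,), (None,)],
)

conditions = {
    "plain NOT IN": "Error NOT IN ('Timeout','Connection Error')",
    "COALESCE": "COALESCE(Error,'') NOT IN ('Timeout','Connection Error')",
    "IS NULL OR": "Error IS NULL OR Error NOT IN ('Timeout','Connection Error')",
    "CASE": (
        "CASE WHEN Error IS NULL THEN 1 "
        "WHEN Error NOT IN ('Timeout','Connection Error') THEN 1 "
        "ELSE 0 END = 1"
    ),
}

# Count the rows each condition lets through.
counts = {}
for name, condition in conditions.items():
    sql = "SELECT COUNT(*) FROM Table1 WHERE " + condition
    counts[name] = conn.execute(sql).fetchone()[0]
    print(name, counts[name])
```

With this sample data the plain NOT IN keeps only the 'Disk Full' row, while the three rewrites also keep the two NULL rows.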


    Perhaps a concrete example will illustrate why a NOT IN expression on a NULL value returns nothing:

    Given this data: http://www.sqlfiddle.com/#!2/0d5da/11

    create table tbl
    (
      msg varchar(100) null,
      description varchar(100) not null
    );
    
    
    insert into tbl values
    ('hi', 'greet'),
    (null, 'nothing');
    

    And you do this expression:

    select 'hulk' as x, msg, description 
    from tbl where msg not in ('bruce','banner');
    

    That will output only the 'hi' row.

    The NOT IN is translated as:

    select 'hulk' as x, msg, description 
    from tbl where msg <> 'bruce' and msg <> 'banner';
    

    NULL <> 'bruce' can't be determined: it is neither true nor false.

    NULL <> 'banner' can't be determined either: it is neither true nor false.

    So for the row with the NULL value, the expression effectively resolves to:

    can't be determined AND can't be determined
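
To make the "can't be determined" reasoning concrete, here is a small Python model of SQL's three-valued logic. It is only a sketch: Python's None stands in for the unknown result, and the helper names (sql_not_equal, sql_and, sql_not_in) are hypothetical, not part of any database API.

```python
from typing import Optional

# None stands for SQL's UNKNOWN (the result of comparing with NULL).
Tri = Optional[bool]


def sql_not_equal(a: Optional[str], b: Optional[str]) -> Tri:
    """a <> b under SQL rules: UNKNOWN if either side is NULL."""
    if a is None or b is None:
        return None
    return a != b


def sql_and(x: Tri, y: Tri) -> Tri:
    """Three-valued AND: FALSE dominates, otherwise UNKNOWN propagates."""
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True


def sql_not_in(value: Optional[str], items: list) -> Tri:
    """value NOT IN (items), expanded as value <> i1 AND value <> i2 ..."""
    result: Tri = True
    for item in items:
        result = sql_and(result, sql_not_equal(value, item))
    return result


for msg in ("hi", "bruce", None):
    print(repr(msg), "->", sql_not_in(msg, ["bruce", "banner"]))
```

A WHERE clause keeps a row only when the condition is true, so the None result for the NULL value means that row is dropped.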
    

    In fact, if your RDBMS supports boolean expressions in SELECT (e.g. MySQL, PostgreSQL), you can see why: http://www.sqlfiddle.com/#!2/d41d8/828

    select null <> 'Bruce' 
    

    That returns null.

    This returns null too:

    select null <> 'Bruce' and null <> 'Banner'
    

    You are using NOT IN, which is basically an AND expression, so for the NULL row it becomes:

    NULL AND NULL
    

    That results in NULL. So it's like you are doing this: http://www.sqlfiddle.com/#!2/0d5da/12

    select * from tbl where null
    

    Nothing will be returned.
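
If you would rather try the example without the fiddle links, the following Python sketch replays it with the standard-library sqlite3 module. SQLite is used here only as a stand-in for MySQL (an assumption), and NULL comes back to Python as None.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (msg VARCHAR(100) NULL, description VARCHAR(100) NOT NULL)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [("hi", "greet"), (None, "nothing")],
)

# A comparison with NULL evaluates to NULL, which Python receives as None.
single = conn.execute("SELECT NULL <> 'Bruce'").fetchone()[0]
combined = conn.execute(
    "SELECT NULL <> 'Bruce' AND NULL <> 'Banner'"
).fetchone()[0]

# NOT IN drops the row whose msg is NULL.
not_in_rows = conn.execute(
    "SELECT msg, description FROM tbl WHERE msg NOT IN ('bruce','banner')"
).fetchall()

# A WHERE condition that is NULL matches no rows at all.
where_null_rows = conn.execute("SELECT * FROM tbl WHERE NULL").fetchall()

print(single, combined)
print(not_in_rows)
print(where_null_rows)
```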