While trying to isolate the 7 rows in the first query below, I got results that, as a novice, I was not expecting. I read over this SQLite document but do not understand why the 7 rows which are NULL are separated in the GROUP BY in the first query, but testing on != ''
, not in ('A','H')
, and =''
all exclude the NULL rows.
It seems as though the tests are exclusive, such that NULL is either =''
or !=''
, or in ('A','H')
or not in ('A','H')
. It appears to be ignored by all of these, yet is separated in the GROUP BY.
Would you please explain why it works this way?
Thank you.
sqlite> select substr(trim(grammarCode),1,1) as c, count(indexRow) as cnt
from tbl
group by c
order by cnt;
c cnt
- ------
7
A 4828
20046
H 300679
sqlite> select substr(trim(grammarCode),1,1) as c, count(indexRow) as cnt
from tbl
where c != ''
group by c
order by cnt;
c cnt
- ------
A 4828
H 300679
sqlite> select substr(trim(grammarCode),1,1) as c, count(indexRow) as cnt
from tbl
where c not in ('A', 'H')
group by c
order by cnt;
c cnt
- ------
20046
sqlite> select substr(trim(grammarCode),1,1) as c, count(indexRow) as cnt
from tbl
where c = ''
group by c
order by cnt;
c cnt
- ------
20046
sqlite> select substr(trim(grammarCode),1,1) as c, count(indexRow) as cnt
from tbl
where c is null
group by c
order by cnt;
c cnt
- ------
7
A boolean expression in SQL evaluates as true
or false
or null
.
The WHERE
clause of any SQL statement filters out all rows for which the boolean expression/condition is not true
, meaning false
and null
are filtered out.
All these boolean expressions:
null != ''
null = ''
null not in ('A', 'H')
are evaluated as null
(demo), because any comparison of/to null
without the use of the operator IS
returns null
.
This is why your 2nd, 3d and 4th queries filter out not only the rows that don't satisfy the condition in the WHERE
clause but also the rows where c
is null
.
If you want these rows where c
is null
you must explicitly mention that:
c != '' OR c IS NULL
c = '' OR c IS NULL
c not in ('A', 'H') OR c IS NULL
Or, for the first 2 cases you can use the operator IS
:
c IS NOT ''
c IS ''