Part of the data cleaning process involves understanding the quality of your data. NULL values are usually best avoided, so counting how often they occur in a column is a common operation. There are several ways to do it:
- `sum(iff(<column> is null, 1, 0))` - use the `IFF` function to return 1 if a value is NULL and 0 otherwise, then aggregate.
- `count(*) - count(<column>)` - use the two forms of the `count()` aggregation: `count(*)` counts every row, while `count(<column>)` skips NULLs, so the difference is the number of NULLs.
- `sum(case when <column> is null then 1 else 0 end)` - similar to the `IFF` method, but using a `CASE` expression instead.
with data as (
select * from (values (1), (2), (null), (null), (5)) as data (x)
)
select
sum(iff(x is null, 1, 0)) with_iff,
count(*) - count(x) with_count,
sum(case when x is null then 1 else 0 end) with_case
from data

| WITH_IFF | WITH_COUNT | WITH_CASE |
|---|---|---|
| 2 | 2 | 2 |
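
To see why the `count(*) - count(x)` form works, it can help to look at the two counts separately. The following is a minimal sketch, not part of the original snippet: it assumes an in-memory SQLite database (via Python's standard `sqlite3` module) in place of Snowflake, and the helper name `count_nulls` is hypothetical. The `IFF` variant is left out because `IFF` is not assumed to be available outside Snowflake; the `count` and `CASE` forms are plain SQL.

```python
import sqlite3

# Same five sample values as the query above, two of them NULL.
ROWS = [(1,), (2,), (None,), (None,), (5,)]

QUERY = """
select
  count(*)                                   as total_rows,     -- every row
  count(x)                                   as non_null_rows,  -- rows where x is not NULL
  count(*) - count(x)                        as with_count,
  sum(case when x is null then 1 else 0 end) as with_case
from data
"""


def count_nulls(rows):
    """Load rows into an in-memory SQLite table (an assumption standing in
    for Snowflake) and run the NULL-counting query, returning a dict."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table data (x integer)")
        conn.executemany("insert into data (x) values (?)", rows)
        cur = conn.execute(QUERY)
        names = [col[0] for col in cur.description]
        return dict(zip(names, cur.fetchone()))
    finally:
        conn.close()


result = count_nulls(ROWS)
print(result)
# {'total_rows': 5, 'non_null_rows': 3, 'with_count': 2, 'with_case': 2}
```

One difference worth knowing, at least in this SQLite sketch: on an empty table `count(*) - count(x)` returns 0, while the `sum(case ...)` form returns NULL, because `sum` over zero rows is NULL.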