Explore this snippet here.
Part of the data cleaning process involves understanding the quality of your data. NULL values are usually best avoided, so counting their occurrences is a common operation. There are several methods that can be used here:
sum(if(<column> is null, 1, 0)- use theIFfunction to return 1 or 0 if a value is NULL or not respectively, then aggregate.count(*) - count(<column>)- use the different forms of thecount()aggregation which include and exclude NULLs.sum(case when x is null then 1 else 0 end)- similar to the IF method, but using aCASEstatement instead.
with data as (
select * from unnest([1, 2, null, null, 5]) as x
)
select
sum(if(x is null, 1, 0)) with_if,
count(*) - count(x) with_count,
sum(case when x is null then 1 else 0 end) with_case
from data| with_if | with_count | with_case |
|---|---|---|
| 2 | 2 | 2 |