From 12f562d8fd59458d193b2e79dfc1517f94c1ca2a Mon Sep 17 00:00:00 2001
From: Bruno Calza
Date: Tue, 9 Jun 2026 12:13:50 -0300
Subject: [PATCH] document kurtosis and skewness

Signed-off-by: Bruno Calza
---
 documentation/changelog.mdx                  |   1 +
 documentation/query/functions/aggregation.md | 124 +++++++++++++++++++
 2 files changed, 125 insertions(+)

diff --git a/documentation/changelog.mdx b/documentation/changelog.mdx
index dc592f933..3f7c78cbe 100644
--- a/documentation/changelog.mdx
+++ b/documentation/changelog.mdx
@@ -26,6 +26,7 @@ This page tracks significant updates to the QuestDB documentation.
 - Documented custom root CA support for the replication object store: the `ca_cert_file` and `ca_builtin_roots` [connection-string parameters](/docs/high-availability/setup/#tls-with-a-private-or-self-signed-ca) trust a private, internal, or self-signed CA for S3, Azure Blob, and GCS stores
 - Added [`array_agg()`](/docs/query/functions/aggregation/#array_agg) aggregate function, covering scalar and array input forms, `SAMPLE BY` with FILL, and NULL handling
 - Documented the [`twap()`](/docs/query/functions/aggregation/#twap) designated-timestamp limitation: the timestamp argument must be the table's designated timestamp
+- Added [`kurtosis()`](/docs/query/functions/aggregation/#kurtosis--kurtosis_samp), [`kurtosis_pop()`](/docs/query/functions/aggregation/#kurtosis_pop), [`skewness()`](/docs/query/functions/aggregation/#skewness--skewness_samp), and [`skewness_pop()`](/docs/query/functions/aggregation/#skewness_pop) aggregate functions for distribution shape
 
 ### Updated
 
diff --git a/documentation/query/functions/aggregation.md b/documentation/query/functions/aggregation.md
index 093b458ea..fcf9e970a 100644
--- a/documentation/query/functions/aggregation.md
+++ b/documentation/query/functions/aggregation.md
@@ -38,7 +38,11 @@ calculations. Functions are organized by category below.
 | [corr](#corr) | Pearson correlation coefficient |
 | [covar_pop](#covar_pop) | Population covariance |
 | [covar_samp](#covar_samp) | Sample covariance |
+| [kurtosis / kurtosis_samp](#kurtosis--kurtosis_samp) | Sample excess kurtosis |
+| [kurtosis_pop](#kurtosis_pop) | Population excess kurtosis |
 | [mode](#mode) | Most frequent value |
+| [skewness / skewness_samp](#skewness--skewness_samp) | Sample skewness |
+| [skewness_pop](#skewness_pop) | Population skewness |
 | [stddev / stddev_samp](#stddev--stddev_samp) | Sample standard deviation |
 | [stddev_pop](#stddev_pop) | Population standard deviation |
 | [var_pop](#var_pop) | Population variance |
@@ -1306,6 +1310,66 @@ SELECT ksum(a)
 FROM (SELECT rnd_double() a FROM long_sequence(100));
 ```
 
+## kurtosis / kurtosis_samp
+
+`kurtosis_samp(value)` - Calculates the sample excess kurtosis of a set of
+values, ignoring missing data (e.g., NULL values). Kurtosis measures the
+tailedness of a distribution. Following Fisher's definition, the value is
+shifted so that a normal distribution has kurtosis 0: positive values indicate
+heavier tails than the normal distribution, negative values indicate lighter
+tails. The sample variant applies the Fisher-Pearson bias correction and returns
+`null` when fewer than four non-null values are present.
+
+`kurtosis` is an alias for `kurtosis_samp`.
+
+#### Parameters
+
+- `value` is any numeric value.
+
+#### Return value
+
+Return value type is `double`. Returns `null` when fewer than four non-null
+values are observed, or when all observed values are equal.
+
+#### Examples
+
+```questdb-sql demo title="Sample excess kurtosis"
+SELECT kurtosis_samp(value)
+FROM UNNEST(ARRAY[-10.0, -20.0, 100.0, 1000.0, 1000.0]);
+```
+
+| kurtosis_samp |
+| :----------------- |
+| -3.289971233898511 |
+
+## kurtosis_pop
+
+`kurtosis_pop(value)` - Calculates the population excess kurtosis of a set of
+values, ignoring missing data (e.g., NULL values). Like `kurtosis_samp` this
+follows Fisher's definition (a normal distribution has kurtosis 0), but applies
+no bias correction. Defined when at least one non-null value is observed and the
+values are not all equal; otherwise returns `null`.
+
+#### Parameters
+
+- `value` is any numeric value.
+
+#### Return value
+
+Return value type is `double`. Returns `null` when no non-null values are
+observed, or when all observed values are equal.
+
+#### Examples
+
+```questdb-sql demo title="Population excess kurtosis"
+SELECT kurtosis_pop(value)
+FROM UNNEST(ARRAY[-10.0, -20.0, 100.0, 1000.0, 1000.0]);
+```
+
+| kurtosis_pop |
+| :----------------- |
+| -1.822492808474628 |
+
 ## last
 
 - `last(column_name)` - returns the last value of a column.
@@ -1579,6 +1643,66 @@ SELECT nsum(a)
 FROM (SELECT rnd_double() a FROM long_sequence(100));
 ```
 
+## skewness / skewness_samp
+
+`skewness_samp(value)` - Calculates the sample skewness of a set of values,
+ignoring missing data (e.g., NULL values). Skewness measures the asymmetry of a
+distribution around its mean: positive values indicate a right-leaning (longer
+right tail) distribution, negative values indicate a left-leaning one, and zero
+indicates a symmetric distribution. The sample variant uses the Fisher-Pearson
+bias-corrected G1 estimator and returns `null` when fewer than three non-null
+values are present.
+
+`skewness` is an alias for `skewness_samp`.
+
+#### Parameters
+
+- `value` is any numeric value.
+
+#### Return value
+
+Return value type is `double`. Returns `null` when fewer than three non-null
+values are observed, or when all observed values are equal.
+
+#### Examples
+
+```questdb-sql demo title="Sample skewness"
+SELECT skewness_samp(value)
+FROM UNNEST(ARRAY[-10.0, -20.0, 100.0, 1000.0, 1000.0]);
+```
+
+| skewness_samp |
+| :----------------- |
+| 0.5745116147533554 |
+
+## skewness_pop
+
+`skewness_pop(value)` - Calculates the population skewness of a set of values,
+ignoring missing data (e.g., NULL values). Like `skewness_samp` this uses
+Fisher's moment-ratio definition, but applies no bias correction. Defined when
+at least one non-null value is observed and the values are not all equal;
+otherwise returns `null`.
+
+#### Parameters
+
+- `value` is any numeric value.
+
+#### Return value
+
+Return value type is `double`. Returns `null` when no non-null values are
+observed, or when all observed values are equal.
+
+#### Examples
+
+```questdb-sql demo title="Population skewness"
+SELECT skewness_pop(value)
+FROM UNNEST(ARRAY[-10.0, -20.0, 100.0, 1000.0, 1000.0]);
+```
+
+| skewness_pop |
+| :------------------ |
+| 0.38539410733550217 |
+
 ## stddev / stddev_samp
 
 `stddev_samp(value)` - Calculates the sample standard deviation of a set of