quantile regression implementation notes #30
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,69 @@ | ||
| # Quantile Regression | ||
|
|
||
| This discusses how we can include quantile regression in tidymodels. | ||
|
|
||
| The discussion below is organized by package | ||
|
|
||
| ## parsnip | ||
|
|
||
| Users will probably have to specify the quantiles at the specification time (see section on new models). | ||
|
|
||
| ### Engines and Modes | ||
|
|
||
| The main question is: Should we make new engines available via `set_engine()` or create a new mode of `"quantile regression"`? | ||
|
|
||
| Pros of `set_engine()`: | ||
|
|
||
| * Most (but not all) quantile regression predictions will be for numeric modes. | ||
| * There is not much that is different about quantile models that would require a new mode (unlike censored regression). For example, our yardstick functions to compute loss for quantile regression can ingest the list column of predictions (similar to dynamic survival models). | ||
| * Where do we specify the list of quantiles? It would be suboptimal to add a main argument to each model function. Users could set it in `set_engine()`, but that would require more specialized parsing of the return value of that function. | ||
|
|
||
| Cons of `set_engine()`: | ||
|
|
||
| * There could be confusion around the type of model and prediction being made. Suppose that `rpart` could make quantile predictions (it cannot). Would using an engine of `"rpart"` result in the regular CART model or one optimized via quantile loss? | ||
| * Since the list of quantiles would be specified upfront, it would be advantageous to do this with the new mode (`set_mode("quantreg", quantiles = (1:9) / 10)`). | ||
|
|
||
| To test using a new mode, there is a [parsnip branch](https://github.com/tidymodels/parsnip/tree/quantile-mode) that enables it. You can install it with: | ||
|
|
||
| ``` | ||
| pak::pak(c("tidymodels/parsnip@quantile-mode"), ask = FALSE) | ||
| ``` | ||
|
|
||
| ### New Models | ||
|
|
||
| Some packages already fit these models, notably quantreg and quantregForest. | ||
|
|
||
| We can also make our own (I might do this with brulee). If so, the model functions should try to emulate the main arguments that we have. So for a neural network, use `hidden_units`, `penalty`, etc. | ||
|
|
||
| Many models need a specific training run for each quantile value (since the loss depends on it). For this reason, we would have users specify the required quantiles when they specify the model. | ||
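As a rough illustration of why the quantiles need to be known at specification time, here is a minimal sketch using `quantreg::rq()` on `mtcars`. The data set, the formula, and the `quantile_levels` name are illustrative assumptions and not part of the proposal.

```r
library(quantreg)

# Quantile levels requested up front, as the notes propose.
quantile_levels <- c(0.1, 0.5, 0.9)

# One training run per level: the check (pinball) loss being minimized
# depends on tau, so each level gets its own fit.
fits <- lapply(quantile_levels, function(tau) {
  rq(mpg ~ wt, tau = tau, data = mtcars)
})
names(fits) <- paste0("tau_", quantile_levels)

new_cars <- data.frame(wt = c(2.5, 3.5))

# Rows are new observations, columns are quantile levels.
preds <- sapply(fits, predict, newdata = new_cars)
preds
```

`rq()` can also take a vector for `tau` and run the separate fits itself. The explicit loop is only there to make the one-fit-per-level cost visible.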
|
|
||
| We should make a general class for these models in an engine-specific package. | ||
|
|
||
| #### Predictions | ||
|
|
||
| We currently note in `?predict.model_fit`: | ||
|
|
||
| > For `type = "quantile"`, the tibble has a `.pred` column, which is a list-column. Each list element contains a tibble with columns `.pred` and `.quantile` (and perhaps other columns). | ||
|
|
||
| This should be changed so the list column is called `.pred_quantile` if we want `.pred` to contain the 50% quantile results. | ||
|
|
||
| The format of the list column would be | ||
|
|
||
| ``` | ||
| tibble(.quantile = numeric(), .pred_quantile = numeric()) | ||
|
I'll just note that I find the use of "quantile" here (and in other places, used similarly) a bit confusing. I think that you are referring to a "probability level" for the quantile, such as 0.5 or 0.9. This is also sometimes called the "quantile level". Somewhat confusingly, it's also sometimes called just "quantile". But then that can be easily confused with the associated quantile value itself. So I prefer something like "level" for that reason, to disambiguate.

We can use "level". That should be intuitive for people.

We've been using: for specificity. It may also be worth considering giving the a class or using
|
||
| ``` | ||
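To make the proposed format concrete, here is a small sketch of what a prediction tibble could look like for two new observations. The column names follow the notes above; the predicted values are made up for illustration only.

```r
library(tibble)

quantile_levels <- c(0.1, 0.5, 0.9)

# Hypothetical predicted quantile values for two new observations.
pred_values <- list(
  c(18.2, 21.0, 24.9),
  c(14.1, 16.5, 19.3)
)

# One row per observation; each list element is a tibble with one row
# per requested quantile.
quantile_preds <- tibble(
  .pred_quantile = lapply(pred_values, function(vals) {
    tibble(.quantile = quantile_levels, .pred_quantile = vals)
  })
)

quantile_preds$.pred_quantile[[1]]
```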
|
|
||
| ## yardstick | ||
|
|
||
| We need one or more model performance metrics for quantile loss that work across all quantiles. They will need a new class too (say `"quantile_metric"`). | ||
|
I think the most obvious metric is just to add up the quantile loss over all quantile levels. This is given in (for example) equation (10) of these notes: https://www.stat.berkeley.edu/~ryantibs/statlearn-s23/lectures/calibration.pdf. As the notes explain, this is also equivalent (up to a factor of 2) to weighted interval score (WIS). That is defined in equation (8) and the equivalence is stated in equation (11). Some people in the field do not know of this equivalence and I've even seen papers reporting both quantile loss and WIS as separate columns of a predictive performance metrics table. So it'd be good for this package to clearly state they are equivalent! Perhaps in the documentation.

WIS is nice because it offers a complementary view on the same equivalent metric. As explained in this paper: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008618, you can decompose the score by inspecting the two summands separately (summed up over all levels), and interpret these loosely as a spread/sharpness measure (first one) and calibration measure (second one). Some people like looking at those as constituents of error.

Other than quantile loss and equivalently WIS, I'm not aware of any other score based on predicted quantiles that is proper. People in forecast scoring care a lot about propriety. However, I'll just note that it would be reasonable in general to additionally look at absolute error (AE) using the predicted 50% quantile (predicted median) if it's available in the list of quantiles, and also to look at coverage (empirical coverage of each quantile versus the nominal coverage). I think people often do this in practice.

Finally, in an off-the-cuff remark, I'll note that I'm becoming more and more bothered by quantile loss/WIS as a general go-to metric. I think it can be too lenient for underdispersed forecasters. I have an ongoing project with students exploring this. I don't have anything concrete to recommend as an alternative at the moment; maybe we will at some future point.
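A minimal base R sketch of the quantities mentioned in this comment: quantile (pinball) loss summed over levels, and empirical coverage per level. The function names and the matrix layout of the predictions are assumptions made for illustration; they are not yardstick's API.

```r
# Pinball (check) loss for a single quantile level `tau`,
# averaged over observations.
pinball_loss <- function(truth, estimate, tau) {
  err <- truth - estimate
  mean(pmax(tau * err, (tau - 1) * err))
}

# Sum of the pinball loss across all levels.
# `estimates` is a matrix: one row per observation, one column per level.
total_quantile_loss <- function(truth, estimates, levels) {
  losses <- vapply(
    seq_along(levels),
    function(j) pinball_loss(truth, estimates[, j], levels[j]),
    numeric(1)
  )
  sum(losses)
}

# Share of observations at or below each predicted quantile,
# to be compared against the nominal levels.
empirical_coverage <- function(truth, estimates, levels) {
  out <- colMeans(truth <= estimates)
  names(out) <- levels
  out
}

truth <- c(10, 20)
levels <- c(0.1, 0.5, 0.9)
estimates <- matrix(c(8, 18, 10, 21, 13, 25), nrow = 2)

total_quantile_loss(truth, estimates, levels)
empirical_coverage(truth, estimates, levels)
```

The absolute error of the predicted median is the 0.5 column on its own, since the pinball loss at level 0.5 is half the absolute error.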
||
|
|
||
| ## tune | ||
|
|
||
| We will need an `estimate_quantiles()` (analogous to functions such as [`estimate_class_prob()`](https://github.com/tidymodels/tune/blob/main/R/grid_performance.R#L108C1-L108C20)). | ||
|
|
||
| Some specific callouts: | ||
|
|
||
| - https://github.com/tidymodels/tune/blob/main/R/grid_performance.R#L4 | ||
| - https://github.com/tidymodels/tune/blob/main/R/grid_performance.R#L98 | ||
|
|
||
|
|
||
I suspect this is the preferred way to do this, because of the cons listed above. Thanks for the testing branch. I'll try reimplementing some of our engines off this branch in the next few days and link back here describing any difficulties or benefits I encounter.