Post-processing write-up by topepo · Pull Request #27 · tidymodels/planning

topepo · 2023-05-12T18:46:49Z

No description provided.

DavisVaughan · 2023-05-15T13:05:05Z

+
+wflow_individ <- 
+  wflow_2 %>% 
+  add_prob_calibration(cal_object) %>% 


Maybe add a note about where cal_object comes from.

DavisVaughan · 2023-05-15T13:06:44Z

+# 'object' is a pre-made calibration object
+# Data Inputs: class probabilities
+# Data Outputs: class probabilities and recomputed class predictions
+add_prob_calibration(object, .priority = 1.0)


All of these also have x as a first argument, where x is the workflow.

DavisVaughan · 2023-05-15T13:09:04Z

+# Potentially tunable
+# Data Inputs: class probabilities
+# Data Outputs: class predictions
+add_prob_threshold(threshold = numeric(), .priority = 2.0)


Might need a levels argument and an ordered argument, like probably::make_two_class_pred()?
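To make the suggestion concrete, here is a minimal base R sketch of what a thresholding step with `levels` and `ordered` arguments might compute. The helper `threshold_factor()` is hypothetical and is not part of workflows or probably. It assumes the probability column refers to the first level, and that a probability equal to the threshold is assigned to that first level.

```r
# Hypothetical helper for illustration only; not an existing tidymodels function.
# Assumption: `prob` holds the predicted probability of `levels[1]`.
threshold_factor <- function(prob, levels, threshold = 0.5, ordered = FALSE) {
  stopifnot(is.numeric(prob), length(levels) == 2)

  # Assign the first level when the probability reaches the threshold.
  labels <- ifelse(prob >= threshold, levels[1], levels[2])

  # Always return a factor with the full set of levels, optionally ordered.
  factor(labels, levels = levels, ordered = ordered)
}

probs <- c(0.10, 0.62, 0.90)
pred <- threshold_factor(probs, levels = c("yes", "no"), threshold = 0.6)
print(pred)
```

Passing `levels` explicitly keeps both levels in the result even when every prediction lands on the same side of the threshold.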

DavisVaughan · 2023-05-15T13:12:12Z

+# Potentially tunable
+# Data Inputs: class probabilities
+# Data Outputs: class predictions
+add_cls_eq_zone(value = numeric(), threshold = numeric(), .priority = 3.0)


In probably, make_two_class_pred() has a buffer argument that creates a range of [threshold - buffer[1], threshold + buffer[2]], where anything inside the buffer range is marked equivocal. Maybe you could use the buffer arg here?

Maybe also name it something similar to add_prob_threshold() like add_prob_threshold_buffered() where:

add_prob_threshold() always returns a factor (maybe ordered)

add_prob_threshold_buffered() always returns a <class_pred> from probably
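As a rough illustration of the buffer idea, the base R sketch below marks any probability inside [threshold - buffer[1], threshold + buffer[2]] as equivocal. The helper `threshold_buffered()` and the `"[EQ]"` label are assumptions made for this example. A real implementation would presumably return a `<class_pred>` from probably rather than a character vector.

```r
# Hypothetical helper for illustration only; not part of probably or workflows.
# Assumption: `prob` holds the predicted probability of `levels[1]`.
threshold_buffered <- function(prob, levels, threshold = 0.5, buffer = c(0, 0)) {
  stopifnot(is.numeric(prob), length(levels) == 2)

  # Recycle a length-1 buffer to length 2: (below threshold, above threshold).
  buffer <- rep_len(buffer, 2)
  lower <- threshold - buffer[1]
  upper <- threshold + buffer[2]

  # Hard class prediction based on the threshold alone.
  out <- ifelse(prob >= threshold, levels[1], levels[2])

  # Anything inside the buffer range is reported as equivocal instead.
  out[prob >= lower & prob <= upper] <- "[EQ]"
  out
}

probs <- c(0.10, 0.47, 0.58, 0.62, 0.90)
res <- threshold_buffered(
  probs,
  levels = c("yes", "no"),
  threshold = 0.5,
  buffer = c(0.05, 0.10)
)
print(res)
```

With `buffer = c(0.05, 0.10)` and `threshold = 0.5`, the equivocal zone in this sketch is the closed interval from 0.45 to 0.60.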

DavisVaughan · 2023-05-15T13:16:18Z

+# User will have to always set the priority
+# Data Inputs: all predictions
+# Data Outputs: all predictions (only these columns are retained)
+add_post_mutate(..., .priority)


I would consider giving all of these functions a common prefix that differentiates them from the other add_*() functions, like add_post_*():

add_post_calibration() # do you need prob vs reg calibration? can you just "figure it out"? can it be an argument?
add_post_threshold()
add_post_threshold_buffered()
add_post_mutate()

Conflicted on this. I agree this would be nice for tab completion, but it is inconsistent with the naming convention for preprocessors: add_variables(), add_recipe(), add_formula().

DavisVaughan · 2023-05-15T13:20:52Z

+ - A container for the list of possible post-processors specified by the user.
+ - A validation system to resolve conflicts in type or priority. 
+ - An interface to apply the operations to the predicted values. 
+ - The requisite package dependencies (primarily the probably package)


I am mildly worried about the number of Imports that dev probably has. It is fairly high, and it might be worth going back to see whether some of them can be moved to Suggests as optional deps.

DavisVaughan · 2023-05-15T13:23:00Z

+
+ - `new_stage_post()`, and `new_action_post()` are existing constructors. 
+
+We will require a `.fit_post(workflow, data)` that will execute _only_ the post-processing operations; the `workflows` object will already have trained the `pre` and `fit` stages. 


Since this is really an "internal" function, I think we can just assume that the workflow has already trained the pre and fit stages, i.e. we don't need to do any checks to see whether that is true. It should only be used internally by workflows and by tune.

simonpcouch · 2023-05-16T16:49:49Z

Looking at this now; will merge some of Davis' suggestions and push some small edits. Will leave a review when finished!

Co-authored-by: Davis Vaughan <davis@rstudio.com>

simonpcouch

Few big-picture edits from me beyond Davis' comments!

simonpcouch · 2023-05-16T17:03:52Z

+ - An interface to apply the operations to the predicted values. 
+ - The requisite package dependencies (primarily the probably package).
+
+The dwai code may eventually make its way into workflows or probably. 


If this is our plan, I might argue we put this functionality in workflows or probably from the get-go. Having this live in its own package feels like it could eventually become technical debt.

topepo added 2 commits May 11, 2023 14:50

initial thoughts

c036f8e

render docs

a00afd6

topepo requested review from DavisVaughan and simonpcouch May 12, 2023 18:47

DavisVaughan reviewed May 15, 2023


simonpcouch and others added 2 commits May 16, 2023 12:50

apply suggestions from davis' code review

5918012

Co-authored-by: Davis Vaughan <davis@rstudio.com>

small copy edits

f5c00c1

simonpcouch approved these changes May 16, 2023


topepo wants to merge 4 commits into main from post-proc

3 participants

