Allowing user defined stratifications #273

@edward-burn

Is your feature request related to a problem? Please describe.
It would be nice if, as a user, I could define arbitrary stratifications for TreatmentPatterns. Other packages have a strata argument (for example CohortSurvival: https://darwin-eu-dev.github.io/CohortSurvival/articles/a01_Single_event_of_interest.html#with-stratification), which makes it easy to stratify on study-specific characteristics that we add as variables to the cohort table.

Describe the solution you'd like
A strata argument that allows extra variables added to the cohort table to be used for stratification.

Describe alternatives you've considered
Running the function multiple times, subsetting the cohort table each time to the people within one custom stratum. (This is rather inefficient and requires bespoke code.)
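
To illustrate why this workaround is clunky, here is a minimal sketch of the subset-and-rerun loop in base R. It uses a toy in-memory data frame rather than a real CDM, and `run_analysis()` is a hypothetical placeholder standing in for the actual analysis call (for example `computePathways()`); none of this is TreatmentPatterns API.

```r
# Toy stand-in for a cohort table that carries a user-defined strata column.
cohort_table <- data.frame(
  subject_id = 1:6,
  cohort_definition_id = c(1, 1, 2, 2, 1, 2),
  year_group = c("a", "b", "a", "a", "b", "b")
)

# Hypothetical placeholder for the real analysis; here it only counts rows.
run_analysis <- function(tbl) {
  nrow(tbl)
}

# Workaround: subset the table once per stratum level and rerun the analysis.
strata_levels <- sort(unique(cohort_table$year_group))
results <- lapply(strata_levels, function(level) {
  subset_tbl <- cohort_table[cohort_table$year_group == level, ]
  run_analysis(subset_tbl)
})
names(results) <- strata_levels
```

Every additional strata variable, or combination of variables, adds another loop like this and another pass over the data. That is the bookkeeping a built-in strata argument would take off the user.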

Additional context
It would be nice if code like the below worked:

library(CDMConnector)
#> Warning: package 'CDMConnector' was built under R version 4.4.2
library(TreatmentPatterns)
#> Warning: package 'TreatmentPatterns' was built under R version 4.4.2
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

cohortSet <- readCohortSet(
  path = system.file(package = "TreatmentPatterns", "exampleCohorts")
)
con <- DBI::dbConnect(
  drv = duckdb::duckdb(),
  dbdir = eunomiaDir())
cdm <- cdmFromCon(
  con = con,
  cdmSchema = "main",
  writeSchema = "main")
#> Note: method with signature 'DBIConnection#Id' chosen for function 'dbExistsTable',
#>  target signature 'duckdb_connection#Id'.
#>  "duckdb_connection#ANY" would also be valid
#> ! cdm name not specified and could not be inferred from the cdm source table
cdm <- generateCohortSet(
  cdm = cdm,
  cohortSet = cohortSet,
  name = "cohort_table",
  overwrite = TRUE
)
#> ℹ Generating 8 cohorts
#> ✔ Generating cohort (1/8) - acetaminophen [628ms]
#> ✔ Generating cohort (2/8) - amoxicillin [445ms]
#> ✔ Generating cohort (3/8) - aspirin [429ms]
#> ✔ Generating cohort (4/8) - clavulanate [409ms]
#> ✔ Generating cohort (5/8) - death [318ms]
#> ✔ Generating cohort (6/8) - doxylamine [340ms]
#> ✔ Generating cohort (7/8) - penicillinv [311ms]
#> ✔ Generating cohort (8/8) - viralsinusitis [432ms]

cohorts <- cohortSet %>%
  # Remove the 'cohort', 'json' and 'cohort_name_snakecase' columns
  select(-"cohort", -"json", -"cohort_name_snakecase") %>%
  mutate(type = c("event", "event", "event", "event", "exit", "event", "event", "target")) %>%
  rename(
    cohortId = "cohort_definition_id",
    cohortName = "cohort_name",
  )

cdm$cohort_table <- cdm$cohort_table |> 
  PatientProfiles::addDemographics() |> 
  mutate(year_group = if_else(clock::get_year(cohort_start_date) > 2000, 
                              "a", "b")) |> 
  compute(name = "cohort_table", temporary = FALSE)

computePathways(
  cohorts = cohorts,
  cohortTableName = "cohort_table",
  cdm = cdm, 
  strata = list(c("age"),
                c("sex"), 
                c("year_group")))  
#> Error in computePathways(cohorts = cohorts, cohortTableName = "cohort_table", : unused argument (strata = list(c("age"), c("sex"), c("year_group")))

Created on 2025-01-17 with reprex v2.1.0
