@aramisfacchinetti

This pull request introduces significant enhancements to the Pareto/NBD model with dynamic covariates in the CLVTools package. The changes include new functionality for calculating the probability mass function (PMF) with dynamic covariates, additional helper functions, C++ integration for computational efficiency, and improved data preparation for covariate-based modelling.

New functionality for dynamic covariates:

  • Added a new function pnbd_dyncov_pmf_per_customer to calculate the PMF for the Pareto/NBD model with dynamic covariates for a single customer. This function integrates dynamic transaction and lifetime covariates into the PMF computation. (R/RcppExports.R, src/RcppExports.cpp, src/pnbd_dyncov_pmf.h)

  • Introduced helper functions for calculating covariate effects (Bbar, Dbar, S1, S2_ij, etc.) and time boundaries (bu_i) used in the PMF computation. These functions support detailed modelling of customer behaviour over time. (R/RcppExports.R, src/pnbd_dyncov_pmf.h)

Data preparation and R integration:

  • Implemented pnbd_dyncov_prepare_data to preprocess transaction and covariate data for PMF computation. This function handles merging, transformations, and computation of time intervals and covariate effects. (R/pnbd_dyncov_pmf.R)

  • Added an R interface function pnbd_dyncov_pmf with an option to use either the R or C++ implementation for PMF calculation. This allows flexibility in balancing ease of use and computational performance. (R/pnbd_dyncov_pmf.R)

C++ integration for performance:

  • Introduced a C++ implementation of the dynamic covariate PMF computation to improve performance for large datasets. This includes the pnbd_dyncov_pmf_per_customer function and supporting components. (src/pnbd_dyncov_pmf.h, src/RcppExports.cpp)

  • Added a wrapper for the GSL hypergeometric function 2F1 with error handling to support numerical stability in PMF calculations. (R/RcppExports.R, src/pnbd_dyncov_pmf.h)
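The error-handling idea behind the 2F1 wrapper can be sketched as follows. This is not the code from this PR: to stay dependency-free it swaps the GSL call for a plain Gauss power series (valid only for |z| < 1), and the names `Hyp2F1Result` and `hyp2f1_checked` are hypothetical. It only illustrates the pattern of returning a status alongside the value, so the PMF code can react to a numerical failure instead of aborting.

```cpp
#include <cmath>
#include <limits>

// Hypothetical result type: the value plus a success flag.
struct Hyp2F1Result {
    double value;
    bool ok;
};

// Stand-in for a GSL-backed 2F1 (assumption: the real wrapper calls GSL).
// Evaluates the Gauss series sum_{n>=0} (a)_n (b)_n / (c)_n * z^n / n!,
// which converges only for |z| < 1.
inline Hyp2F1Result hyp2f1_checked(double a, double b, double c, double z,
                                   int max_terms = 10000,
                                   double tol = 1e-14) {
    const double nan = std::numeric_limits<double>::quiet_NaN();

    // Reject arguments outside the radius of convergence (also catches NaN).
    if (!(std::fabs(z) < 1.0)) return {nan, false};
    // Conservatively reject c = 0, -1, -2, ... where a denominator vanishes.
    if (c <= 0.0 && c == std::floor(c)) return {nan, false};

    double term = 1.0;  // n = 0 term
    double sum = 1.0;
    for (int n = 0; n < max_terms; ++n) {
        // Ratio of consecutive series terms.
        term *= (a + n) * (b + n) / ((c + n) * (n + 1.0)) * z;
        sum += term;
        if (!std::isfinite(sum)) return {nan, false};  // overflow or NaN
        if (std::fabs(term) <= tol * std::fabs(sum)) return {sum, true};
    }
    return {sum, false};  // no convergence within max_terms
}
```

A caller would check `ok` before using `value`, for example by falling back to a different evaluation route or returning `NA` to R when the flag is false.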

Refactoring and renaming:

  • Renamed the existing R implementation from pnbd_dyncov_pmf to pnbd_dyncov_pmf_r, so that the name pnbd_dyncov_pmf now refers to the new interface function. (R/pnbd_dyncov_pmf.R)

- Introduced a new header file `pnbd_dyncov_pmf.h` containing the declarations of the functions and classes that handle dynamic covariates in the Pareto/NBD model.
- Implemented the `DynamicCovariates` class to manage and compute cumulative sums of covariate data.
- Added multiple PMF functions to calculate probabilities based on dynamic covariates, including per-customer calculations and hypergeometric functions.
- Ensured compatibility with Armadillo and GSL libraries for efficient mathematical computations.
- Added a minimal working example for the Pareto/NBD dynamic covariates PMF in `minimal_pmf_example.R`.
- Created a comprehensive benchmark script `pmf_example_benchmark.R` to compare R and Rcpp implementations of PMF.
- Added quick test script `quick_test_pmf.R` to verify function availability and basic functionality.
- Developed a simple demonstration script `simple_pmf_demo.R` showcasing PMF functions and their usage.
- Updated documentation for internal functions related to PMF calculations, ensuring clarity on parameters and outputs.
- Enhanced internal functions for better performance and stability in PMF calculations.
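
As a rough illustration of the cumulative-sum bookkeeping mentioned for the `DynamicCovariates` class, here is a minimal sketch. It is not the class from this PR: the name `CovariateCumSum`, its interface, and the use of `std::vector` instead of Armadillo types are all assumptions made to keep the example self-contained. It shows how precomputed prefix sums let the sum of covariate effects over any range of periods be read off in constant time.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not the PR's DynamicCovariates class): stores prefix
// sums of per-period covariate effects for one customer.
class CovariateCumSum {
public:
    explicit CovariateCumSum(const std::vector<double>& effects)
        : prefix_(effects.size() + 1, 0.0) {
        // prefix_[i] holds the sum of effects[0 .. i-1].
        for (std::size_t i = 0; i < effects.size(); ++i) {
            prefix_[i + 1] = prefix_[i] + effects[i];
        }
    }

    // Number of periods stored.
    std::size_t size() const { return prefix_.size() - 1; }

    // Sum of effects over the half-open period range [from, to).
    double sum(std::size_t from, std::size_t to) const {
        if (from > to || to > size()) {
            throw std::out_of_range("CovariateCumSum::sum: invalid range");
        }
        return prefix_[to] - prefix_[from];
    }

private:
    std::vector<double> prefix_;
};
```

Building the prefix sums costs one pass over the data. After that, each range query is a single subtraction, which matters when the PMF evaluates many interval sums per customer.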