Skip to content

Conversation

@0xMuluh
Copy link
Contributor

@0xMuluh 0xMuluh commented May 6, 2025

ping #93

Signed-off-by: Daena Rys <rysdaena8@gmail.com>
@0xMuluh 0xMuluh requested review from TuomasBorman and antagomir May 6, 2025 07:43
@0xMuluh
Copy link
Contributor Author

0xMuluh commented May 6, 2025

This wrapper currently defaults to using pairwise log-ratio (PLR) transformations via coda4microbiome::coda_coxnet() , consistent with the approach described in Calle et al., 2024.

plr was selected by the authors over clr, ilr, or alr for reasons related to interpretability, sparsity, and model perfomance in high-dimensional microbiome survival analysis. While this default appears justified and performs well empirically, it may not be optimal in all cases.

If plr proves sufficient across expected use cases, we can add the unit tests around this assumption.

If user needs or new evidence suggest flexibility is needed, we should refactor to allow other transformation via a parameter (e.g., transform.method = "plr", "clr", "ilr", etc.). or mia::transformAssay()

Currently the method returns a list of objects including ggplots and dfs/matrices. We can consider addSurvival() and store in metadata?

Open to discussion on whether this needs to be prioritized now or flagged for future maintenance..

Copy link
Collaborator

@TuomasBorman TuomasBorman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this clear PR!

One thing to discuss is how much functionality we want to utilize from coda4microbiome package. If there are plans for broader interlinking, this PR might be fine. However, under the hood, coda_coxnet() is utilizing survival::coxph() and glmnet::cv.glmnet() and survminer for plotting.

The implementation in coda_coxnet() is not complex but includes some overhead that -- in my opinion -- does not necessarily fit well to our current design choices (transformation, printing, plotting).

Instead, this functionality for cox model could be implemented directly in miaTime package. We could take inspiration from the coda4microbiome package, but it seems that there is licence mismatch so this should be taken into account.

@antagomir
Copy link
Member

Thanks!

Indeed, I would not wrap coda4microbiome. We may like to test and run standard survival models (Cox) also with other transformation.

So let's keep separate:

  • data preparation (e.g. plr -> transformAssay)
  • statistical analysis / testing (e.g. Cox model, and glmnet if needed for feature selection)
  • summaries (e.g. tables)
  • visualizations (-> miaViz?)

This can take inspiration from coda4microbiome and cite it but Cox & GLM are more general and previously published methods. Moreover, the coda4microbiome paper did not include many comparisons to alternative transformations & methods. Our implementation should be modular and support different combinations of choices (e.g. transformation -> testing -> presentation).

The miaTime already includes data from the paper as example data, that could be used to verify the workflow (should yield the same result).

Signed-off-by: Daena Rys <rysdaena8@gmail.com>
@0xMuluh 0xMuluh requested a review from TuomasBorman August 22, 2025 11:44
Copy link
Contributor Author

@0xMuluh 0xMuluh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x <- .check_and_get_altExp(x, ...)
.check_input(time.col, list("character scalar"), colnames(colData(x)))
.check_input(status.col, list("character scalar"), colnames(colData(x)))
.check_assay_present(assay.type, x)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding pairwise logratios plr to transformAssay()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that mia PR #740 already provides this.

Signed-off-by: Daena Rys <rysdaena8@gmail.com>
Signed-off-by: Daena Rys <rysdaena8@gmail.com>
Copy link
Collaborator

@TuomasBorman TuomasBorman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Can you also add unit tests?

R/getSurvival.R Outdated
time.col,
status.col,
assay.type = "counts",
covar.cols = NULL,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check argument names that we are already using. I think we are using col.var which would suit also here

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could also be defined as a formula

Signed-off-by: Daena Rys <rysdaena8@gmail.com>
@0xMuluh
Copy link
Contributor Author

0xMuluh commented Sep 1, 2025

Thanks! Can you also add unit tests?

unit tests added after #107

Copy link
Collaborator

@TuomasBorman TuomasBorman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Comment on lines +9 to +12
#' @description
#' Fit a (penalized) Cox proportional hazards model on microbiome data contained
#' in a SummarizedExperiment object. Data transformations (e.g. pairwise
#' log-ratios) should be handled upstream
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not evident what this method does.

Can you add Detail section?

R/getSurvival.R Outdated
Comment on lines 285 to 286
coefficients = coefs,
risk_scores = as.numeric(risk_scores),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coefficients coul dbe added to rowData and risk_scores to colData in addSurvival

res <- coefs[ abs(coefs) > coef.threshold ]
# Rescale the remaining coefficients
if( length(res) > 0L ){
res <- 2*res / sum(abs(res))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intention of multiplying the values with 2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to have interpretable contrasts that live in space [-1, 1]. after the operation L1 norm is 2

@antagomir
Copy link
Member

In manpage description clearly cite the publication where the data is from (note that there is a special way to add citations in roxygen). Now you only refer to R package. But there is also formal publication to cite.

Cite all references in the manpage text so that it is clear why they are added on the reference list.

Running the manpage examples gives error

The assay contains already only strictly positive values. Pseudocount is not added.
Error in list(coefficients = coefs, risk_scores = risk_scores, c_index = c_index, :
argument 6 is empty

Either the function name should be getCox or something explicit, or if it is getSurvival then there should be a method argument that specific what survival method one wants to use.

@antagomir
Copy link
Member

These are time-to-event models. How about renaming status.col into event.col?

@antagomir
Copy link
Member

getSurvival has

A list with model summaries:

    • ‘coefficients’: estimated model coefficients

    • ‘selected_features’: nonzero features in penalized model

    • ‘risk_scores’: predicted risk scores

    • ‘cindex.apparent’: apparent concordance index

    • ‘cv.cindex.mean’: mean cross-validated C-index (if penalized)

    • ‘cv.cindex.sd’: SD of cross-validated C-index (if penalized)

    • ‘fit’: fitted model object (coxph or cv.glmnet)

-> Harmonize to use "_" as the separator for all (do not mix)

-> cindex.apparent could be C.apparent, or just C ?

-> For clarity I would rename cv.cindex.mean to C.mean.cv and same for the SD ting

-> should selected_features be just nonzero_features if these are the same thing

-> coefficients is usually just coef in many R methods -> harmonize?

@0xMuluh
Copy link
Contributor Author

0xMuluh commented Sep 12, 2025

In manpage description clearly cite the publication where the data is from (note that there is a special way to add citations in roxygen). Now you only refer to R package. But there is also formal publication to cite.

The data has not been used. Will use this data instead: b752170

@0xMuluh
Copy link
Contributor Author

0xMuluh commented Sep 12, 2025

Running the manpage examples gives error

The assay contains already only strictly positive values. Pseudocount is not added.
Error in list(coefficients = coefs, risk_scores = risk_scores, c_index = c_index, :
argument 6 is empty

Thanks, there was a typo , in the return list.

Signed-off-by: Daena Rys <rysdaena8@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants