Conversation
Zethson left a comment:
`edata.layers["tem_data"] = np.abs(edata.layers["tem_data"])` is ugly in the examples. Should this be a parameter of `blobs`?

Thank you!
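A minimal sketch of what such a parameter could look like. Note that the function name `blobs_with_layer` and the parameter name `non_negative` are hypothetical, not ehrapy's actual API; this only illustrates moving the manual `np.abs` call into the generator:

```python
import numpy as np

def blobs_with_layer(n_observations=100, n_variables=10, non_negative=False, rng=None):
    """Toy stand-in for a blobs-style generator; ``non_negative`` is the
    hypothetical parameter that would replace the manual ``np.abs`` call."""
    rng = np.random.default_rng(rng)
    layer = rng.normal(size=(n_observations, n_variables))
    if non_negative:
        # what the examples currently do by hand on edata.layers["tem_data"]
        layer = np.abs(layer)
    return layer

data = blobs_with_layer(non_negative=True, rng=0)
```

This keeps the examples free of the in-place `np.abs` assignment while still guaranteeing the non-negativity that NCP requires.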
We could then state that installing tensorly can speed this up, like pynndescent for scanpy neighbors.

Implementation speedup is now on par with tensorly (also adjusted the first comment):
Zethson left a comment:
The whole docstring formatting is off: dangling words at the start of new sentences, single-word lines, ...

What's your autoformatter? Or is this just AI bullshit?
> for each cluster it identifies which NCP component best represents that
> cluster, selects the top variables of that component, and visualises their
> mean trajectories over the time axis — all from the raw data, not the
> low-rank approximation.
>
> are selected.
> 3. Mean probability trajectories over the time axis are plotted for those
>    variables, averaged across all observations in the cluster.
>
> One panel is drawn per unique value in ``edata.obs[cluster_key]``,
>     key: Key under which NCP results are stored (matches ``key_added`` in
>         :func:`~ehrapy.tools.ncp`).
>     n_top_diseases: Number of top-loaded variables to show per cluster.
>     sigmoid_transform: Apply a sigmoid transformation to the layer values
>     random_state: Seed for initialisation.
>
>     Returns:
>         weights
Formatting is wrong here! Does this render without issues?
>         factors
>             List of non-negative factor matrices, one per tensor mode.
>     """
>     # Derive the array namespace from the tensor so the algorithm is
This is AI-generated useless shit. Please remove all useless AI comments - also below.
> Uses :func:`tensorly.decomposition.non_negative_parafac`.
>
> CP (CANDECOMP/PARAFAC) decomposition factorises a 3-way tensor
> ``X ∈ ℝ^{I×J×K}`` into a sum of ``rank`` outer products::
Just below you use a math tag to render it but not here. Why?
> .. math::
>
>    F_{\text{mode}} \leftarrow F_{\text{mode}} \odot
>    \frac{\mathcal{X}_{(\text{mode})} \, \mathrm{KR}(F_{-\text{mode}})}
>         {F_{\text{mode}} \, \mathrm{KR}(F_{-\text{mode}})^\top
>          \mathrm{KR}(F_{-\text{mode}}) + \varepsilon}
>
> where :math:`\mathcal{X}_{(\text{mode})}` is the mode-*n* matricisation of
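The multiplicative-update rule quoted above can be sketched in a few lines of numpy. This is my own minimal illustration of one update step for the mode-0 factor, not ehrapy's actual code; the helper names `khatri_rao` and `mu_step_mode0` are mine:

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product: (J*K, R) from (J, R) and (K, R)."""
    J, R = B.shape
    K = C.shape[0]
    return np.einsum("jr,kr->jkr", B, C).reshape(J * K, R)

def mu_step_mode0(X, A, B, C, eps=1e-12):
    """One multiplicative update of the mode-0 factor A: the numerator is
    X_(0) KR(B, C), the denominator A (KR^T KR) + eps, as in the rule above."""
    X0 = X.reshape(X.shape[0], -1)       # mode-0 matricisation (I, J*K)
    KR = khatri_rao(B, C)                # KR of the non-updated factors
    return A * (X0 @ KR) / (A @ (KR.T @ KR) + eps)

rng = np.random.default_rng(0)
I, J, K, R = 6, 5, 4, 3
A, B, C = (np.abs(rng.normal(size=(d, R))) for d in (I, J, K))
X = np.einsum("ir,jr,kr->ijk", A, B, C)  # exact non-negative rank-R tensor
A = mu_step_mode0(X, np.abs(rng.normal(size=(I, R))), B, C)
```

Because the update only multiplies by non-negative ratios, a non-negative initialisation stays non-negative, which is the point of the multiplicative scheme.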
>     init: Initialisation strategy passed to
>         :func:`~tensorly.decomposition.non_negative_parafac`
>         (``"random"`` or ``"svd"``).
>
>     All values must be non-negative (use ``sigmoid_transform=True`` for
>     logit layers, or ``np.abs`` / clipping beforehand).
>
>     rank: Number of components (rank of the decomposition). Each component
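The non-negativity requirement quoted above is easy to illustrate: logit-scale values can be negative, and a sigmoid maps them into (0, 1). A tiny self-contained example of that transformation:

```python
import numpy as np

# Logit-scale values can be negative; the sigmoid maps them into (0, 1),
# which satisfies the non-negativity requirement of the decomposition.
logits = np.array([-2.0, 0.0, 3.5])
probs = 1.0 / (1.0 + np.exp(-logits))
```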
|
Thank you for all the work and speeding it up! |
|
PR Checklist
- docs is updated

Description of changes

Fixes #1034.
Technical details
Removes the tensorly dependency and replaces it with a custom implementation in numpy.

In a nutshell
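A minimal sketch of how a tensorly-free non-negative CP could look, using the alternating multiplicative-update scheme. This is my own illustration under the assumption that the PR uses multiplicative updates; the actual ehrapy implementation may differ in details (initialisation, stopping criteria, normalisation):

```python
import numpy as np

def ncp(X, rank, n_iter=300, eps=1e-12, rng=None):
    """Non-negative CP of a 3-way tensor via multiplicative updates (sketch)."""
    rng = np.random.default_rng(rng)
    dims = X.shape
    factors = [np.abs(rng.normal(size=(d, rank))) for d in dims]
    for _ in range(n_iter):
        for mode in range(3):
            # unfold X along `mode`; rows of the Khatri-Rao product of the
            # remaining factors line up with the flattened remaining axes
            Xm = np.moveaxis(X, mode, 0).reshape(dims[mode], -1)
            others = [f for i, f in enumerate(factors) if i != mode]
            KR = np.einsum("jr,kr->jkr", *others).reshape(-1, rank)
            F = factors[mode]
            factors[mode] = F * (Xm @ KR) / (F @ (KR.T @ KR) + eps)
    return factors

# recover a synthetic non-negative rank-3 tensor
rng = np.random.default_rng(1)
true = [np.abs(rng.normal(size=(d, 3))) for d in (8, 7, 6)]
X = np.einsum("ir,jr,kr->ijk", *true)
A, B, C = ncp(X, rank=3, rng=0)
approx = np.einsum("ir,jr,kr->ijk", A, B, C)
rel_err = np.linalg.norm(X - approx) / np.linalg.norm(X)
```

On an exactly low-rank non-negative tensor like this, the relative reconstruction error drops well below the initial random level, which mirrors the signal-recovery comparison below.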
Compare ehrapy vs tensorly on synthetic data
Compare convergence
Convergence and reconstruction
Comment: tensorly converged in fewer iterations.
Compare runtime
Comment: tensorly's speed is matched to within an order of magnitude.
Compare ability to recover known signal
Factors recovered by the ehrapy implementation
Factors recovered by the tensorly implementation
Comment: both tensorly and ehrapy recover the known longitudinal structure.
Additional context