…processing accordingly
Making feature branch up to date with dependency package fixes
… omit some references to it
|
I made a branch off of this branch: https://github.com/nadeemlab/smprofiler/tree/merge_main_s3_etl_cache. In that branch I merged in |
|
Since this PR modifies the ETL, in local testing one needs to be sure to run make force-rebuild-data-loaded-images before running tests, in case any old images are present in your Docker cache. This will force the ETL to run on the test datasets. |
|
On that previously mentioned branch, all the data-loaded images build without errors (in my environment). |
Merge main s3 etl cache
…idation (#444)
* start cleaning up unused or unnecessary code in refactored areas, and do some related deprecations, and start getting builds and tests to pass
* Include all counts in initial etl
* start adding context docs
* Start deprecating old parsing loop, add docs, and break down new parsing loop into readable units
* continue refactor, adding docs
* More deprecations and notes
* Deprecate intermediate caching steps and start deprecating feature matrix extraction implementation, old
* Implement read operations on cache store
* Start reducing channel metadata passing to just feature order
* Reimplement feature matrix extractor to start from saved payloads
* Deprecate "autocomputed" squidpy metrics, just use normal accessor
* Further deprecations
* More deprecations
* move insert count / cache counts
* typing
* fix circular import
* Restore small bitwise operation
* fix typos
* linting
* some version bumps
* Update expressions matrix test
* Simplify feature matrix extraction output, add cache store exists check
* Version bump for major dependencies, expected versions list
* Update expected record counts after removing big tables
* Update some tests, typos etc
* Simplify headers test
* Fix 0 indexing
* Fix bug that included IDs as channel values subsampler
* Fix function signatures
* version bumps
* Deprecate obsolete test
* Deprecate obsolete test
* Deprecate obsolete test
* Deprecate obsolete test
* Fix functional call signature
* Update test for continuous intensity channel matrix integrity
* Deprecate portion of test
* Comment
* Linting
* Use new cache store in cells data accessor; put together additional byte-level ops into compressed matrix handling module; partially normalize the ondemand-workers access to cells data to use same system as feature matrix aggregation, less duplication
* Reduce duplication
* Linter
* Deprecate specialized blob type for umap, re-use existing
* Linter
* Linter
* Fix typo
* Version bump and include change log notes.
* Renaming
* Update version
* Deprecate constraint drop/recreate module
* Deprecate constraint drop/recreate module
* Add rough scale-down for continuous channel data in UMAP case, conform to other samples
* Deprecate recording/output of unused channel metadata
|
I just merged a large refactor into this branch, including many deprecations. After conforming this branch to main it will be ready to merge. |
Related to #419
initial attempt to create S3 cache alternative and slightly refactor on-demand processing accordingly