feat!: SPRAS revision #320
Merged
66 commits
b0327a2
feat: spras_revision
tristan-f-r 8cec738
style: fmt
tristan-f-r 5683392
test: summary
tristan-f-r af90ce0
docs(test_summary): mention preprocessing motivation
tristan-f-r 6141874
test(analysis/summary): use input from /input instead
tristan-f-r 440a2d4
docs(test/analysis): mention dual integration testing
tristan-f-r d9e852b
test(analysis/summary): use test/analysis provided gold standard
tristan-f-r abb0eb9
style: fmt
tristan-f-r 60185fc
chore: don't repeat docs inside analysis configs
tristan-f-r e6bd6a0
feat: get working with cytoscape
tristan-f-r f9a3081
style: fmt
tristan-f-r 77fc3b4
test: remove nondet from analysis
tristan-f-r 0592850
fix: get input pathways at runtime
tristan-f-r 0b6413d
Merge branch 'umain' into hash
tristan-f-r 1817157
fix: rm run
tristan-f-r c077d91
Merge branch 'main' into hash
tristan-f-r 50f2195
fix: correct for pydantic
tristan-f-r d3a088b
fix: attach spras revision inside gs_values
tristan-f-r 8e3b898
chore: drop re import
tristan-f-r 1ada504
Merge branch 'main' into hash
tristan-f-r 34a40ad
fix: correct tests
tristan-f-r 5d2c6d0
Merge branch 'main' into hash
tristan-f-r ef15781
Merge branch 'main' into hash
tristan-f-r 8d5019b
fix: correct Snakefile
tristan-f-r 9949572
fix: use correct gs variable
tristan-f-r 3cd25e8
Merge branch 'main' into hash
tristan-f-r 0965a68
test: correct config
tristan-f-r a169505
fix: correct name again
tristan-f-r eec09f2
Merge branch 'main' into hash
tristan-f-r a8d71bd
test: fix files
tristan-f-r e12fc75
apply suggestions
tristan-f-r 977bf5a
clean, fix: strip project_directory
tristan-f-r 8500bcb
fix: correct equality on not SPRAS pyproject.toml
tristan-f-r 112db39
chore: grammar
tristan-f-r c7262ed
chore: move attach_spras_revision out of Snakefile
tristan-f-r f69a0f3
Merge branch 'main' into hash
tristan-f-r 72e30bf
fix: properly resolve merge conflict
tristan-f-r c71b652
fix: undo mistaken merge conflict
tristan-f-r 6b941e0
chore: drop unnecessary self.datasets initialization
tristan-f-r fbf0ceb
feat: dynamic spras versioning
tristan-f-r edc0369
chore: error handling on setup.pu
tristan-f-r 3a1251d
docs: note on git commit hashes
tristan-f-r d330d6a
chore: drop git magic
tristan-f-r 5e31d06
feat: correctly parse RECORD
tristan-f-r dba2b45
style: fmt
tristan-f-r 90b4e1f
feat: optional spras revision
tristan-f-r fd5a490
docs: osdf_immutable info; ci: debug
tristan-f-r 210897b
ci: ??????
tristan-f-r 816dd28
fix: don't use distribution files, opt for purepath
tristan-f-r cd78a2a
style: fmt
tristan-f-r b025b7d
fix: tag iff osdf immutable, correct functools.partial sig
tristan-f-r 8ce8c31
apply suggestions
tristan-f-r 9bbf7cf
docs: info on spras revision, change names
tristan-f-r 9ce6241
docs: clarify confusing symbol choice
tristan-f-r f7cabd8
refactor: move revision out
tristan-f-r eddcf67
fix: spelling err
tristan-f-r 9ab902a
docs: on editable spras installs
tristan-f-r 4b37700
docs: design
tristan-f-r 46fff30
docs(design): notes about record files
tristan-f-r 39f6cbc
docs(design): flag typo
tristan-f-r 809dfb3
Merge remote-tracking branch 'upstream/main' into hash
tristan-f-r 0b57f8c
Merge branch 'umain' into hash
tristan-f-r d7bf7df
refactor(Snakefile): isolate algorithm assignment
tristan-f-r 2799cc1
docs(design): use correct parameter name
tristan-f-r 5250f6a
docs: osdf design clarification
tristan-f-r a42000e
chore(test/analysis): drop unused config settings
tristan-f-r
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| SPRAS Designs | ||
| ============= | ||
|
|
||
| SPRAS makes a few high-level design decisions. We motivate them here. | ||
|
|
||
| .. Right now, this only talks about immutable outputs. In the future, this may include, but is not limited to: | ||
| .. container-agnostic volumes, directionality, parameter tuning, and typed configs/algorithms. | ||
|
|
||
| Immutable Outputs | ||
| ----------------- | ||
|
|
||
| During benchmarking runs, SPRAS data is uploaded to the `Open Science | ||
| Data Federation <https://osg-htc.org/services/osdf>`__. OSDF enforces an | ||
| immutable file structure, where files can never be deleted or rewritten. | ||
| By default, SPRAS does not have immutable files. However, in SPRAS | ||
| configurations, the ``immutable_files`` parameter can be enabled to make | ||
| files fully immutable: no file with a given name will ever be | ||
| rewritten with different data. | ||
|
|
||
| To do this, SPRAS tags all datasets, gold standards, and algorithms with | ||
| a version hash, which effectively identifies the current version of the | ||
| SPRAS code that processes that data. | ||
|
|
||
| In implementation, this version hash is the hash of the `RECORD | ||
| <https://packaging.python.org/en/latest/specifications/recording-installed-packages/#the-record-file>`__ file, | ||
| which contains hashes of all 'installed' files. When SPRAS is not installed | ||
| in development mode (i.e. without the ``--editable`` flag), the ``RECORD`` file | ||
| hashes all Python source files, leading to the desired effect that | ||
| the version hash changes when the source code changes. In development mode, | ||
| the ``RECORD`` file does not change when source code is changed. | ||
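
The hashing step described above can be illustrated with a minimal sketch. It is not the SPRAS implementation: the helper name `short_file_hash`, the `demo` function, and the made-up ``RECORD`` rows are all assumptions for illustration, and it hashes the file bytes with `hashlib.sha256` directly instead of `hashlib.file_digest` so that it also runs on Python versions before 3.11.

```python
import hashlib
import tempfile
from pathlib import Path


def short_file_hash(path: Path, length: int = 8) -> str:
    """Return the first `length` hex characters of the file's SHA-256 digest."""
    return hashlib.sha256(path.read_bytes()).hexdigest()[:length]


def demo() -> tuple[str, str]:
    """Hash a made-up RECORD file before and after one of its rows changes."""
    with tempfile.TemporaryDirectory() as tmp:
        record = Path(tmp) / "RECORD"
        # Hypothetical RECORD row: path, hash, size (all values are invented).
        record.write_text("spras/config.py,sha256=aaaa,120\n")
        before = short_file_hash(record)
        # A source change alters that file's recorded hash, and so the RECORD bytes.
        record.write_text("spras/config.py,sha256=bbbb,124\n")
        after = short_file_hash(record)
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print(before, after, before != after)
```

Because the digest covers the whole ``RECORD`` file, a change to any recorded file hash produces a different short hash, which is the property the design relies on for non-editable installs.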
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,72 @@ | ||
| """ | ||
| The revision is an optional hash associated with all files in the designated output directory | ||
| to make sure that file _names_ are immutable. We attach the revision to three labels: | ||
|
|
||
| - Datasets | ||
| - Gold standards | ||
| - Algorithms | ||
|
|
||
| In the future, the SPRAS revision may change depending on which files are affected (e.g., specific algorithms | ||
| will have specific revisions that change as they get updated) to avoid unnecessary reruns in the | ||
| Reed-CompBio/spras-benchmarking repository. | ||
|
|
||
| This is an optional feature, as the `spras_revision` function below is dependent on a RECORD file | ||
| (described in the docstring associated with `spras_revision`.) | ||
|
|
||
| We provide the convenience function `attach_spras_revision`, used in ./config.py, and `detach_spras_revision`, used to | ||
| strip the revision from algorithm labels specifically. | ||
| """ | ||
|
|
||
| import functools | ||
| import hashlib | ||
| import importlib.metadata | ||
| import sysconfig | ||
| from pathlib import Path | ||
|
|
||
|
|
||
| @functools.cache | ||
| def spras_revision() -> str: | ||
| """ | ||
| Gets the current revision of SPRAS. | ||
|
|
||
| Note: This depends on neither the SPRAS release version number nor the git commit, but solely on the PyPA RECORD file | ||
| (https://packaging.python.org/en/latest/specifications/recording-installed-packages/#the-record-file), which contains | ||
| hashes of all of the installed SPRAS files [excluding RECORD itself], and is also included in the package distribution. | ||
| This means that, when developing SPRAS, `spras_revision` is updated when SPRAS is installed. However, for editable | ||
| pip installs (e.g. from `pip install -e .`), `spras_revision` is not updated when the source code changes, | ||
| as the RECORD file only contains metadata: https://setuptools.pypa.io/en/latest/userguide/development_mode.html. | ||
| """ | ||
| try: | ||
| site_packages_path = sysconfig.get_path("purelib") # where .dist-info is located. | ||
|
|
||
| record_path = Path( | ||
| site_packages_path, | ||
| f"spras-{importlib.metadata.version('spras')}.dist-info", | ||
| "RECORD" | ||
| ) | ||
| with open(record_path, 'rb', buffering=0) as f: | ||
| # Truncated to 8 hex characters, comparable in length to an abbreviated git revision. | ||
| return hashlib.file_digest(f, 'sha256').hexdigest()[:8] | ||
| except importlib.metadata.PackageNotFoundError as err: | ||
| raise RuntimeError('spras is not an installed pip-module: did you forget to install SPRAS as a module?') from err | ||
|
|
||
|
|
||
| def attach_spras_revision(immutable_files: bool, label: str) -> str: | ||
| """ | ||
| Attaches the SPRAS revision to a label. | ||
| This function signature may become more complex as specific labels get versioned. | ||
|
|
||
| @param label: The label to attach the SPRAS revision to. | ||
| @param immutable_files: if False, this function returns `label` unchanged (the identity function). | ||
| """ | ||
| if immutable_files is False: return label | ||
| # We use the `_` separator here instead of `-` as summary, analysis, and gold standard parts of the | ||
| # Snakemake workflow process file names by splitting on hyphens to produce new jobs. | ||
| # If this was separated with a hyphen, we would mess with that string manipulation logic. | ||
| return f"{label}_{spras_revision()}" | ||
|
|
||
| def detach_spras_revision(immutable_files: bool, attached_label: str) -> str: | ||
| """The inverse of `attach_spras_revision`.""" | ||
| if immutable_files is False: return attached_label | ||
| # `rpartition` starts at the end: detach_spras_revision(b, attach_spras_revision(b, s)) = s for all b, s. | ||
| return attached_label.rpartition("_")[0] | ||
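
The round-trip property and the reason for the `_` separator can be seen in a small standalone sketch. This is an illustration, not the code from the diff: `attach_revision`, `detach_revision`, and the fixed `REVISION` string are hypothetical stand-ins (the real functions read the revision from `spras_revision()`), and the `dataset-algorithm` file name shape is an assumption about how the workflow joins labels.

```python
def attach_revision(immutable_files: bool, label: str, revision: str) -> str:
    """Append `_<revision>` to the label when immutable files are enabled."""
    if not immutable_files:
        return label
    return f"{label}_{revision}"


def detach_revision(immutable_files: bool, attached_label: str) -> str:
    """Drop everything from the last underscore onward, undoing attach_revision."""
    if not immutable_files:
        return attached_label
    return attached_label.rpartition("_")[0]


REVISION = "0a1b2c3d"  # hypothetical 8-character hex revision


def hyphen_parts(dataset: str, algorithm: str) -> list[str]:
    """Join two tagged labels with a hyphen (assumed naming shape) and split them again."""
    name = f"{attach_revision(True, dataset, REVISION)}-{attach_revision(True, algorithm, REVISION)}"
    return name.split("-")


if __name__ == "__main__":
    tagged = attach_revision(True, "my_data_set", REVISION)
    print(tagged)                         # my_data_set_0a1b2c3d
    print(detach_revision(True, tagged))  # my_data_set
    print(hyphen_parts("data0", "pathlinker"))
```

Since a hex revision never contains an underscore, `rpartition("_")` always splits at the separator that `attach_revision` added, even when the label itself contains underscores. Splitting the joined name on hyphens still yields exactly one part per label, which a hyphen separator would not.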