Feature/add vospace file management for canfar SRCnet nodes#83
Feature/add vospace file management for canfar SRCnet nodes#83DrWhatson wants to merge 13 commits into
Conversation
shinybrar
left a comment
There was a problem hiding this comment.
In general @DrWhatson this is great work and I really want to thank you for your contributions to the project.
Here are some general outstanding issues,
-
File
uv.lockhas been deleted, which seems to be either a workaround of merge conflicts or an error, but it needs to be in the package. -
There is a lot of complexity introduced to make a remote server appear to be a posix system. We had lengthy discussions and decided that any reference to remote will require a server identify URI. Here is the general flow,
# Remote file listing
canfar storage ls can:/
canfar storage ls swe:/
canfar storage ls esp:/
canfar storage ls uk-ral:
# Copy will always require a uri for remote, and absolute / referenced path nomenclature for local files
canfar storage cp swe:/path/to/file.py /path/to/file.py
canfar storage rm can:/path/to/file.py
canfar storage rm -R esp:/path/
canfar storage mkdir -p uk-ral:/path/to/new/directory
Except the cp command, all commands should require a prefix URI identifier.
canfar storage mv swe:/path/to/file.py ./file.py should always be a cp -> rm, when doing from local machine to the platform, essentially performing an upload does it delete the local copies?
- I recogonize there needs to be some work down to configure the URI Identifier, e.g.
cavern:/etc., but for now can you add some documentation on how things are configure.
While canfar storage is valuable, but the current implementation is too monolithic and will be expensive to maintain. I recommend convernt the storage CLI into a a package split by command / use case, e.g. under canfar/cli/storage/
- commands.py (Typer Wiring)
- [copy | ls | rm | mkdir | mv].py
- helpers.py for shared logic
- policies.py for retry, include, exclude, cutoff etc.
- types.py for typed models
- adapters.py for performing local FS and VOSpace FS operations.
While these are big changes, the split will improve readability, testability, and long-term safety without changing user-facing behavior.
|
|
||
| storage_cli = typer.Typer( | ||
| name="storage", | ||
| help="Manage files in Cavern storage", |
There was a problem hiding this comment.
Should be Manage files in CANFAR
| The `canfar vos` command group provides comprehensive file management capabilities for VOSpace, CANFAR's persistent storage system. These commands use the same authentication context as session management, providing a unified experience. | ||
|
|
||
| !!! info "VOSpace Paths" | ||
| VOSpace paths use the `vos:` prefix, e.g., `vos:/data/file.txt` or `vos:` for the root directory. | ||
|
|
||
| ### `canfar vos ls` |
There was a problem hiding this comment.
Implementation exposes canfar storage, but docs still instruct canfar vos
| ### `canfar vos cat` | ||
|
|
||
| Display contents of a VOSpace file. | ||
|
|
||
| ```bash | ||
| canfar vos cat [OPTIONS] URI | ||
| ``` | ||
|
|
||
| **Arguments:** | ||
|
|
||
| - `URI` (required): VOSpace file to display | ||
|
|
||
| #### Options | ||
|
|
||
| | Option | Description | | ||
| |--------|-------------| | ||
| | `--head` | Display only the header (useful for FITS files) | | ||
| | `--debug` | Enable debug logging | | ||
|
|
||
| !!! example "Display File Contents" | ||
| ```bash | ||
| canfar vos cat vos:/data/readme.txt | ||
| ``` | ||
|
|
||
| !!! example "Display FITS Header" |
There was a problem hiding this comment.
Neither canfar vos cat or canfar storage cat are registered in the main CLI. Is this a missing functionality or error?
| canfar vos cat --head vos:/data/image.fits | ||
| ``` | ||
|
|
||
| ### `canfar vos ln` |
| canfar vos ln https://example.com/data.fits vos:/data/external_link | ||
| ``` | ||
|
|
||
| ### `canfar vos lock` |
| - Context: client/context.md | ||
| - Overview: client/overview.md | ||
| - HTTPClient: client/client.md | ||
| - VOSpaceClient: client/vospace.md |
There was a problem hiding this comment.
Should be called StorageClient
| if not node_path.endswith('/'): | ||
| if client.get_node(node_path).islink(): | ||
| log.info(f"Deleting link {node_path}") | ||
| client.delete(node_path) | ||
| console.print(f"[green]Deleted link {node_path}[/green]") | ||
| elif client.isfile(node_path): | ||
| log.info(f"Deleting {node_path}") | ||
| client.delete(node_path) | ||
| console.print(f"[green]Deleted {node_path}[/green]") | ||
| elif client.isdir(node_path): |
There was a problem hiding this comment.
rm branches on a trailing slash. Is it correct to assume that vos:/dir may pass without deletion / error but vos:/dir/ will error out correctly? This is a destructive command and should be deterministic..
| for source_pattern in source: | ||
| if head and not client.is_remote_file(source_pattern): | ||
| log.error("--head only works for source files in Cavern") | ||
| continue |
| @storage_cli.command("cp") | ||
| def copy_files( |
There was a problem hiding this comment.
This function is entirely too complex from a maintainability and readability POV. Its mixing parsing, path resolution, traversals, retries, transfer execution and also print presentation in the same function. This needs to be refactored into simpler functions, e.g. under /canfar/cli/storage/helpers.py while most of the top level functions can be canfar/cli/storage/copy.py.
| original_get_endpoints = self._vos_client.get_endpoints | ||
| # Track which endpoints have been authenticated | ||
| authenticated_endpoints: set = set() | ||
|
|
||
| def wrapped_get_endpoints(uri): | ||
| """Wrapper that configures Bearer-only auth on endpoints.""" | ||
| endpoint = original_get_endpoints(uri) | ||
|
|
||
| if endpoint.resource_id not in authenticated_endpoints: | ||
| try: | ||
| # Set token on subject for capability URL lookups | ||
| endpoint.conn.subject.token = token | ||
|
|
||
| # Set token on session for Bearer auth header | ||
| session = endpoint.conn.ws_client._get_session() | ||
| session.token = token |
There was a problem hiding this comment.
This seems a bit brittle and most likely needs to be isolated behind a adapter contract.
Added vostools wrapped under "storage" sub command. Automatically uses token stored in the config context
Tests and documentation also added