Merged
4 changes: 4 additions & 0 deletions .github/workflows/doc.yaml
@@ -17,6 +17,10 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
# https://github.com/actions/checkout/issues/1471
# Clobber local unannotated tag
- name: Fetch tags
run: git fetch --prune --unshallow --tags --force
- uses: mamba-org/setup-micromamba@v2
with:
environment-name: docs
2 changes: 2 additions & 0 deletions .gitignore
@@ -10,3 +10,5 @@ dist
coverage.xml
jupyter_execute
docs/implementations/data
docs/implementations/data_l2_lr_ssh
docs/implementations/data_l3_lr_ssh
1 change: 1 addition & 0 deletions docs/advanced.md
@@ -48,6 +48,7 @@ Remote file system listing can be quite long. Implementations are usually
shipped with layouts for an improved listing speed. See the
{ref}`Layout <layout>` introduction if listing performance becomes an issue.

(disable-layouts)=

## Disable layouts

61 changes: 54 additions & 7 deletions docs/getting_started.md
@@ -48,6 +48,7 @@ given criterias

```{code-cell}
from fcollections.implementations import NetcdfFilesDatabaseSwotLRL2

fc = NetcdfFilesDatabaseSwotLRL2(path)
fc.list_files(cycle_number=1)
```
@@ -83,6 +84,8 @@ ds = fc.query(selected_variables=['ssha'])
list(ds.variables)
```

### Filter types

Each implementation has its own filters. By order of availability, the user
should consult:

@@ -95,23 +98,42 @@ should consult:
fc.query?
```

### Filter values

The possible values for a given filter can be displayed:

```{code-cell}
fc.filter_values('version')
```

Only filters whose values are encoded in the intermediate folders can be
scanned quickly; any other filter triggers a full scan. To ensure optimal
performance, call this method with the layouts enabled, with files organized
in folders (see the [advanced section](#disable-layouts)), and on filters
whose information is encoded in the folder names.
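
The difference between the two scans can be sketched with the standard library
alone. This is only an illustration under assumptions: the
`<version>/cycle_<cycle>/<file>` layout, the file names and both helper
functions are hypothetical, and none of it reflects how `filter_values` is
implemented.

```python
import tempfile
from pathlib import Path


def values_from_folders(root, depth):
    """Collect distinct folder names at a given depth, without going deeper."""
    level = [root]
    for _ in range(depth):
        level = [child for parent in level
                 for child in parent.iterdir() if child.is_dir()]
    return {folder.name for folder in level}


def values_from_full_scan(root, field):
    """Collect distinct values by visiting every file and splitting its name."""
    return {file.stem.split("_")[field] for file in root.rglob("*.nc")}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical layout, NOT the real SWOT one: <version>/cycle_<cycle>/<file>
    for version in ("v1", "v2"):
        for cycle in ("001", "002"):
            folder = root / version / f"cycle_{cycle}"
            folder.mkdir(parents=True)
            for pass_number in ("010", "011"):
                (folder / f"demo_{cycle}_{pass_number}.nc").touch()

    # The version is encoded in the first folder level: one directory listing
    versions = values_from_folders(root, depth=1)
    # The pass number only appears in file names: every file must be visited
    passes = values_from_full_scan(root, field=2)
```

On a remote file system, each directory listing is a network round trip, which
is why a filter readable from the first folder levels is much cheaper to scan.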

## Access metadata

The database can display information about the variables and attributes
contained in the files' collection using the ``variables_info`` method

```{code-cell}
fc.variables_info(subset='Expert')
# Use the enumeration name for filtering a specific subset
fc.variables_info()
```

It displays a simple collapsible tree view, with multiple levels of nesting
depending on the data you manipulate.

In order to return consistent metadata, the method ensures that only one
homogeneous subset is selected. In case you handle unmixable data (for example
Expert and Unsmoothed datasets), you must give proper filters on the subset
partitioning keys ``fc.unmixer.partition_keys``. If these filters are missing,
an error with the possible choices will be raised.
## Subsets

### Errors on mixed subsets

To return consistent results, most methods must work on a homogeneous subset
of data. When multiple subsets are mixed (for example Expert and Unsmoothed
datasets), filters matching the partitioning keys must be given. If these
filters are missing, an error listing the possible choices is raised.

```{code-cell}
:tags: [raises-exception]
@@ -123,7 +145,32 @@ ds.to_netcdf(f'{path}/SWOT_L2_LR_SSH_Unsmoothed_001_012_20240101T030000_20240101
fc.variables_info()
```
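
The kind of check behind this error can be sketched as follows. It is a
minimal sketch, not the fcollections implementation: `require_homogeneous`,
the record dictionaries and the error message are hypothetical and only
illustrate rejecting a selection that spans several values of a partitioning
key.

```python
from collections import defaultdict


def require_homogeneous(records, partition_keys, **filters):
    """Return the records matching the filters, or raise if subsets are mixed.

    Hypothetical helper: it illustrates the check only.
    """
    selected = [record for record in records
                if all(record[key] == value for key, value in filters.items())]

    # Gather the distinct values of each partitioning key in the selection
    choices = defaultdict(set)
    for record in selected:
        for key in partition_keys:
            choices[key].add(record[key])

    # More than one value for a partitioning key means the subsets are mixed
    ambiguous = {key: sorted(values)
                 for key, values in choices.items() if len(values) > 1}
    if ambiguous:
        raise ValueError(f"Mixed subsets, filter on one of: {ambiguous}")
    return selected


records = [
    {"file": "a.nc", "subset": "Expert"},
    {"file": "b.nc", "subset": "Unsmoothed"},
]

try:
    require_homogeneous(records, ["subset"])
except ValueError as error:
    message = str(error)  # lists both 'Expert' and 'Unsmoothed'

# Filtering on the partitioning key makes the selection homogeneous
expert_only = require_homogeneous(records, ["subset"], subset="Expert")
```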

### Compatibility matrix

The following table summarizes which methods can work on mixed data. Most
methods need homogeneous data and will require filtering the subset.

| Method | Works on mixed data? |
|--------------------|-----------------------|
| ``list_files`` | Yes |
| ``variables_info`` | No |
| ``filter_values`` | No |
| ``query`` | No |
| ``map`` | No |

### Listing subsets

Subsets present on the file system can be listed using the
{meth}`subsets <fcollections.core.FilesDatabase.subsets>` property.

```{code-cell}
fc.subsets
```

One of the returned choices must be selected and used as a filter to work on a
homogeneous dataset.

```{code-cell}
# Use the enumeration name for filtering
# Use the enumeration name for filtering a specific subset
fc.variables_info(subset='Expert')
```
2 changes: 1 addition & 1 deletion docs/implementations/l2_lr_ssh.md
@@ -140,7 +140,7 @@ There are currently three modes for stacking the half orbits
are cropped and we need an additional dimension to reflect the spatial jump

```{code-cell}
fc = NetcdfFilesDatabaseSwotLRL2("data")
fc = NetcdfFilesDatabaseSwotLRL2("data_l2_lr_ssh")
ds = fc.query(stack='CYCLES', cycle_number=[9, 10, 11], pass_number=10, subset='Basic')
ds.ssha_karin_2.data
```
2 changes: 1 addition & 1 deletion docs/implementations/l3_lr_ssh.md
@@ -127,7 +127,7 @@ There are currently three modes for stacking the half orbits
are cropped and we need an additional dimension to reflect the spatial jump

```{code-cell}
fc = NetcdfFilesDatabaseSwotLRL3("data")
fc = NetcdfFilesDatabaseSwotLRL3("data_l3_lr_ssh")
ds = fc.query(stack='CYCLES', version='2.0.1', cycle_number=[1, 2, 3], pass_number=10, subset='Basic')
ds.ssha_filtered.data
```
2 changes: 1 addition & 1 deletion docs/implementations/scripts/pull_data_l2_lr_ssh.py
@@ -6,7 +6,7 @@
logging.basicConfig()
logging.getLogger("altimetry_downloader_aviso").setLevel("INFO")

DATA_DIR = Path(__file__).resolve().parent.parent / "data"
DATA_DIR = Path(__file__).resolve().parent.parent / "data_l2_lr_ssh"
DATA_DIR.mkdir(exist_ok=True)

if __name__ == "__main__":
2 changes: 1 addition & 1 deletion docs/implementations/scripts/pull_data_l3_lr_ssh.py
@@ -6,7 +6,7 @@
logging.basicConfig()
logging.getLogger("altimetry_downloader_aviso").setLevel("INFO")

DATA_DIR = Path(__file__).resolve().parent.parent / "data"
DATA_DIR = Path(__file__).resolve().parent.parent / "data_l3_lr_ssh"
DATA_DIR.mkdir(exist_ok=True)

if __name__ == "__main__":