tonkintaylor · harell · Sep 12, 2025
diff --git a/.github/ISSUE_TEMPLATE/bug-report.md b/.github/ISSUE_TEMPLATE/bug-report.md
@@ -0,0 +1,20 @@
+---
+name: Bug Report
+about: Describe a bug or anything else that seems wrong
+title: ''
+labels: 'bug'
+
+---
+
+**Background**
+Give some context for how you encountered the problem. What were you doing?
+
+**Summary of the problem**
+A clear and concise description of the problem that seems to be happening.
+
+**Expected behaviour**
+What did you expect to happen instead?
+
+**Reproduction**
+Are you able to reproduce the error or is it intermittent? Which version of YOUR_PACKAGE_NAME
+are you using? Which operating system are you using?
diff --git a/.github/ISSUE_TEMPLATE/enhancement.md b/.github/ISSUE_TEMPLATE/enhancement.md
@@ -0,0 +1,12 @@
+---
+name: Enhancement Request
+about: A high-level description of a feature or other enhancement you'd like
+title: ''
+labels: ''
+---
+
+**Motivation**
+A clear and concise description of what benefit you would get from this enhancement.
+
+**Summary**
+A clear and concise description of what you want to happen, at a high level.
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
@@ -0,0 +1,95 @@
+# Custom Instructions
+
+## General Guidelines
+
+- We are using Windows for development and Linux for deployment/CI. When running commands in terminal, use Windows commands for local development (like ";" instead of "&&" to separate commands) and Linux commands when running in CI environments.
+- Every time I ask you to fix linter errors and provide the error messages, update the Linter section in `.github/copilot-instructions.md` accordingly. Use concise, one-line instructions. Ensure your future responses avoid repeating the same errors.
+- Never create notebooks (ipynb files) unless asked explicitly.
+
+## Testing  
+
+- Write unit tests using `pytest` inside `tests/`, structured based on `src/`.  
+  - Example: `src/x/y/z` → `tests/x/y/test_z.py`  
+- Use test fixtures and group tests into classes when appropriate.  
+- After modifying a function, run its unit tests using pytest.
+- Run tests in virtual environment: `.\.venv\Scripts\activate; python -m pytest tests/path/to/test_file.py -v`
+
+## Python
+
+- We use Python 3.12, so ensure that the code is compatible and up to date with this version.
+- When adding a new package that requires installation, list it under dependencies in `pyproject.toml`, then run `tasks\dev_sync.ps1`.
+- Limit line length to 100 characters.
+- We are using uv to install packages.
+- Never create functions that return more than one output value.
+- Never use entry points via CLI tooling (e.g., click, argparse entry scripts) unless explicitly instructed.
+- Never return tuples; use dictionaries for multiple return values.
+- Do not add exceptions to functions unless explicitly requested.
+- Prefer strict type hints such as `Literal["a", "b"]` over broader types like `str`. This lets the constraints on a function's input arguments reside in the type annotation rather than the docstring. Consider `@validate_call` (`from pydantic import validate_call`) to avoid boilerplate case-checking in such cases.
+- Do not double backslashes in raw strings (e.g., `r"\\path"` should be `r"\path"`).
+- When writing scripts, always use "Scripting Style" (Top-Level Code) unless stated otherwise. Write code directly at the module level instead of wrapping in functions or `if __name__ == "__main__":` blocks.
+- We do not use `if __name__ == "__main__":` guards in this codebase.
+- For scripts that need to access package files (e.g., templates, data files):
+
+  ```python
+  from importlib.resources import files
+  from pathlib import Path
+
+  PACKAGE_PATH = files('package.subpackage.module')  # module-level constant
+
+  # Use joinpath for accessing files
+  file_path = Path(str(PACKAGE_PATH.joinpath('filename')))
+  ```
+
+  This ensures consistent path resolution in both interactive and script modes.
+
+## Documenting Functions
+
+- Remove dtype specifications from all `Args:` sections (e.g., `text (str):` → `text:`)
+- Use "Args:" instead of "Parameters:" for consistency
+
+## Code Structure & Data Handling
+
+- Use value objects stored in `src/<package_name>/domain/value_object.py`, implemented using Pandera > v0.2 and Pydantic. Prefer passing DataFrame when possible.  
+- Use Pandera > v0.2 syntax, such as `DataFrameModel` when possible.
+- When importing internal modules, do not include the `src` folder in the import path, as it is already configured in `pyproject.toml`.
+
+## Linter
+
+- For file-level linter suppressions, use `# ruff: noqa: RULE1, RULE2` format (not `# ruff noqa:`)
+- For line-level suppressions, use `# noqa: RULE1, RULE2` format
+- Use `pathlib.Path` for all filesystem operations instead of `os.path`. Path objects provide a more readable and maintainable object-oriented interface (e.g., `Path('dir') / 'file.txt'` instead of `os.path.join()`, `path.exists()` instead of `os.path.exists()`, etc.)
+- Exception messages must not use string literals directly; assign the message to a variable first
+- Try-except patterns: Use bare `raise` in except blocks to preserve the original traceback, put return statements in the `else` block when using try-except (e.g., `try: result = process() except Exception: raise else: return result`)
+- Remove trailing whitespace from blank lines (W293)
+- Move statements after try blocks into else blocks when the statements depend on the try block's success (TRY300)
+- Use logging.exception instead of logging.error in except blocks (TRY400)
+- Do not include the exception object in logging.exception calls (TRY401)
+- Use keyword arguments for boolean parameters (FBT003) instead of positional arguments
+- Make boolean default arguments keyword-only using `*` to prevent positional passing (FBT002)
+- Avoid variable names that shadow Python builtins (A001)
+- Use snake_case for all variables in global scope (N816), not camelCase or mixedCase
+- Remove unused code instead of commenting it out (ERA001)
+- Break long docstring lines at logical points to stay under the 88 character limit (E501)
+- Include a blank line between docstring summary and description (D205)
+- Add type annotations to function signatures (ANN201, ANN204)
+- Add `# noqa: N806` comment for scikit-learn convention of uppercase X in tests
+- Use `dict` instead of `Dict` in type annotations (UP006)
+- Avoid importing deprecated types like `typing.Dict` (UP035)
+- Remove unused code instead of commenting it out (ERA001)
+- Never use `from __future__ import annotations` in any file; Python 3.11+ does not require it.
+
+## Testing
+
+- Use `pytest` for testing, and ensure all tests are passing before committing changes.
+- Do not perform equality checks with floating point values; instead, use `pytest.approx`.
+- Use only one assert statement per test function to ensure clarity and simplicity.
+
+## Documentation & Workflow Management
+
+- Document reusable knowledge (e.g., library versions, fixes, corrections) in the `Lessons` section of `.github/scratchpad.md`.  
+- Use `.github/scratchpad.md` to organise tasks:  
+  - Clear old tasks when starting a new one.  
+  - Plan steps and track progress using TODO markers:  
+    - [X] Task 1  
+    - [ ] Task 2  
+  - Update task progress, especially after milestones.
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -0,0 +1,125 @@
+name: CI
+permissions:
+  contents: read
+  pull-requests: write
+on:
+  workflow_dispatch:
+  push:
+    branches: ['master', 'develop']
+    paths-ignore:
+      - 'docs/**'
+      - '**/*.md'
+      - 'mkdocs.yml'
+  pull_request:
+    branches: ['master', 'develop']
+    paths-ignore:
+      - 'docs/**'
+      - '**/*.md'
+      - 'mkdocs.yml'
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+jobs:
+  tests:
+    runs-on: ${{ matrix.os }}
+    env:
+      PYTHONIOENCODING: utf-8
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+
+      - name: Setup git user config
+        run: |
+          git config --global user.name placeholder
+          git config --global user.email placeholder@example.com
+
+      - name: Set up uv
+        uses: astral-sh/setup-uv@f94ec6bedd8674c4426838e6b50417d36b6ab231 # v5.3.1
+        with:
+          version: "0.8.3" # Sync with pyproject.toml
+          enable-cache: true
+
+      - name: Set up Python
+        uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5.3.0
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Setup dependencies
+        run: |
+          uv python pin ${{ matrix.python-version }}
+          uv export --no-managed-python --no-group doc --resolution ${{ matrix.resolution }} > ci-requirements.txt
+          uv pip install --system -r ci-requirements.txt
+
+      - name: Run pre-commit
+        if: matrix.pre-commit
+        run: |
+          uv run --frozen pre-commit run --all-files
+
+      - name: Run pytest
+        uses: pavelzw/pytest-action@510c5e90c360a185039bea56ce8b3e7e51a16507 # v2.2.0
+        if: matrix.pytest
+        with:
+          custom-arguments: --cov --junitxml=junit.xml -o junit_family=legacy --cov-report=xml
+
+      - name: Create test reports directory
+        if: matrix.pytest && matrix.os == 'ubuntu-latest' && matrix.python-version == '3.13'
+        run: mkdir -p ./test-reports && mv coverage.xml ./test-reports/coverage.xml
+
+      - name: Upload coverage reports
+        if: matrix.pytest && matrix.os == 'ubuntu-latest' && matrix.python-version == '3.13'
+        uses: actions/upload-artifact@b4b15b8c7c6ac21ea08fcf65892d2ee8f75cf882 # v4.4.3
+        with:
+          name: coverage-reports
+          path: test-reports/coverage.xml
+
+    strategy:
+      matrix:
+        os: ["ubuntu-latest", "macos-latest", "windows-latest"]
+        python-version: ["3.10", "3.11", "3.12", "3.13"]
+        resolution: ["highest"]
+        pre-commit: [true]
+        pytest: [true]
+        include:
+          - os: "ubuntu-latest"
+            python-version: "3.10"
+            resolution: "lowest-direct"
+            pre-commit: false
+            pytest: true
+
+  code-analysis:
+    name: Analyse Code Quality
+    runs-on: ubuntu-latest
+    needs: tests
+    if: always() && needs.tests.result == 'success'
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+        with:
+          fetch-depth: 0  # Shallow clones should be disabled for better relevancy of analysis
+
+      - name: Download coverage reports
+        uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16 # v4.1.8
+        with:
+          name: coverage-reports
+          path: test-reports/
+        continue-on-error: true
+
+      - name: Create SonarQube properties
+        run: |
+          cat > sonar-project.properties << EOF
+          sonar.projectKey=${{ vars.SONAR_PROJECT_KEY }}
+          sonar.language=py
+          sonar.python.version=3.13
+          sonar.sources=./src
+          sonar.tests=./tests
+          sonar.python.coverage.reportPaths=./test-reports/coverage.xml
+          sonar.exclusions=**/Dockerfile,**/notebooks/**,**/scripts/**
+          sonar.verbose=false
+          EOF
+
+      - name: Run SonarQube analysis
+        uses: SonarSource/sonarqube-scan-action@884b79409bbd464b2a59edc326a4b77dc56b2195 # v3.1.0
+        env:
+          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
+          SONAR_HOST_URL: ${{ vars.SONAR_HOST_URL }}
diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
@@ -0,0 +1,32 @@
+
+name: Release to PyPI
+permissions:
+  contents: read
+on:
+  push:
+    tags:
+      - 'v*'
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    environment: release
+    permissions:
+      id-token: write
+    steps:
+        - name: Checkout code
+          uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+
+        - name: Set up uv
+          uses: astral-sh/setup-uv@3b9817b1bf26186f03ab8277bab9b827ea5cc254 # v3.2.0
+          with:
+            version: "0.8.3" # Sync with pyproject.toml
+
+        - name: "Set up Python"
+          uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5.3.0
+          with:
+            python-version: "3.13"
+
+        - name: Release
+          run: |
+            uv build
+            uv publish --trusted-publishing always
diff --git a/.gitignore b/.gitignore
@@ -205,3 +205,7 @@ cython_debug/
 marimo/_static/
 marimo/_lsp/
 __marimo__/
+.github/scratchpad.md
+
+# Auto-generated version file
+src/**/_version.py