Open
25 commits
807f8f6
CU-869ddh1jv: Move test models to a separate central folder
github-actions[bot] May 22, 2026
f9d11f0
CU-869ddh1jv: Add pooch dependency in preparation for centralised fet…
github-actions[bot] May 22, 2026
64f4fd7
CU-869ddh1jv: Add resource fetcher
github-actions[bot] May 22, 2026
befa9f8
CU-869ddh1jv: Add some defensiveness to resource fetcher
github-actions[bot] May 22, 2026
23ba05b
CU-869ddh1jv: Create enum for defined resources
github-actions[bot] May 22, 2026
a6b960c
CU-869ddh1jv: Allow using defined resource name
github-actions[bot] May 22, 2026
7578151
CU-869ddh1jv: Change logic when using defined resource name
github-actions[bot] May 22, 2026
e0241b9
CU-869ddh1jv: Use centralised test paths
github-actions[bot] May 22, 2026
204916c
CU-869ddh1jv: Use centralised path for models in conversion tests
github-actions[bot] May 22, 2026
158ae94
CU-869ddh1jv: Add comment regarding duplicate files
github-actions[bot] May 22, 2026
679514c
CU-869ddh1jv: Remove local test-time model packs
github-actions[bot] May 22, 2026
370afae
CU-869ddh1jv: Add duplicate resource fetch to medcat-den
github-actions[bot] May 22, 2026
1efc53a
CU-869ddh1jv: Propagate project name in resource fetcher
github-actions[bot] May 22, 2026
fe6dd1f
CU-869ddh1jv: Add small workflow to check test-time resource fetcher …
github-actions[bot] May 22, 2026
c282dca
CU-869ddh1jv: Use resource fetch at test time for medcat-den
github-actions[bot] May 22, 2026
54d47ad
CU-869ddh1jv: Add workflow to add test models to releases
github-actions[bot] May 22, 2026
f3f331f
CU-869ddh1jv: Add workflow job to add test models to medcat release
github-actions[bot] May 22, 2026
95fbd70
CU-869ddh1jv: Add workflow job to add test models to medcat-den release
github-actions[bot] May 22, 2026
4a623e9
CU-869ddh1jv: Fix issue with test utils sync check in workflow
github-actions[bot] May 22, 2026
5391667
CU-869ddh1jv: Fix test upload step in release workflow
github-actions[bot] May 22, 2026
9eb5594
CU-869ddh1jv: Fix test upload step comment in medcat-den release work…
github-actions[bot] May 22, 2026
3dde906
CU-869ddh1jv: Fix test upload step in medcat release workflow
github-actions[bot] May 22, 2026
5f21cab
CU-869ddh1jv: Fix workflow (add runs-on) for medcat-den
mart-r May 22, 2026
18641a5
CU-869ddh1jv: Fix workflow for medcat to upload test models (add runs…
mart-r May 22, 2026
d0d2058
CU-869ddh1jv: Add missing dev-time dependency of pooch to medcat-den
github-actions[bot] May 22, 2026
10 changes: 10 additions & 0 deletions .github/workflows/medcat-den_release.yml
Original file line number Diff line number Diff line change
@@ -49,3 +49,13 @@ jobs:
         uses: pypa/gh-action-pypi-publish@release/v1
         with:
           packages_dir: medcat-den/dist
+
+  # test-time models for download
+  # NOTE: a reusable workflow is called at the job level (not as a step),
+  # so this job has no `runs-on` or `steps`; the path must match the
+  # file name of the workflow being called
+  upload-test-models:
+    needs: test-and-publish-to-PyPI
+    uses: ./.github/workflows/upload-test-models-to-release.yml
+    with:
+      tag: ${{ github.ref_name }}
7 changes: 7 additions & 0 deletions .github/workflows/medcat-v2_main.yml
@@ -11,7 +11,14 @@
   run:
     working-directory: ./medcat-v2
 jobs:
+  test-resource-utils:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v6
+      - name: Check test utils are in sync
+        run: diff tests/resource_fetch.py ../medcat-den/tests/resource_fetch.py
+
   base-install-imports:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v6

Check warning (Code scanning / CodeQL, Medium; shown on the `base-install-imports:` line): Workflow does not contain permissions. Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}
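The sync check in this workflow relies on `diff` exiting non-zero when the two copies of `resource_fetch.py` differ. For running an equivalent check locally without a shell, a minimal Python sketch might look like the following. The helper names `files_in_sync` and `_write` are hypothetical and not part of the PR, and the demonstration uses temporary files rather than the real test utilities.

```python
import filecmp
import os
import tempfile


def files_in_sync(path_a: str, path_b: str) -> bool:
    """Return True if both files have identical contents.

    Hypothetical helper (not part of the PR). shallow=False forces a
    content comparison instead of trusting os.stat() signatures.
    """
    return filecmp.cmp(path_a, path_b, shallow=False)


def _write(folder: str, name: str, text: str) -> str:
    # small utility for the demonstration below
    path = os.path.join(folder, name)
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return path


with tempfile.TemporaryDirectory() as tmp:
    a = _write(tmp, "a.py", "x = 1\n")
    b = _write(tmp, "b.py", "x = 1\n")
    c = _write(tmp, "c.py", "x = 2\n")
    same = files_in_sync(a, b)
    different = files_in_sync(a, c)
```

In a real check the two arguments would be the `medcat-v2` and `medcat-den` copies of the file, and a `False` result would fail the job.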
10 changes: 10 additions & 0 deletions .github/workflows/medcat-v2_release.yml
@@ -205,3 +205,13 @@ jobs:
         uses: pypa/gh-action-pypi-publish@release/v1
         with:
           packages-dir: medcat-v2/dist
+
+  # test-time models for download
+  # NOTE: a reusable workflow is called at the job level (not as a step),
+  # so this job has no `runs-on` or `steps`; the path must match the
+  # file name of the workflow being called
+  upload-test-models:
+    needs: release
+    uses: ./.github/workflows/upload-test-models-to-release.yml
+    with:
+      tag: ${{ github.ref_name }}
22 changes: 22 additions & 0 deletions .github/workflows/upload-test-models-to-release.yml
@@ -0,0 +1,22 @@
# .github/workflows/upload-test-models-to-release.yml
name: Upload test models to release

on:
  workflow_call:
    inputs:
      tag:
        required: true
        type: string

jobs:
  upload-test-models:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Upload test models to release
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          gh release upload ${{ inputs.tag }} medcat-test-models/* --clobber

1 change: 1 addition & 0 deletions medcat-den/pyproject.toml
@@ -41,6 +41,7 @@ dev = [
     "ruff",
     "mypy",
     "diskcache-stubs",
+    "pooch",
 ]

 [project.urls]
8 changes: 3 additions & 5 deletions medcat-den/tests/__init__.py
@@ -4,12 +4,10 @@

 from medcat.cat import CAT

+from .resource_fetch import get_resource

-MODEL_PATH = os.path.join(
-    os.path.dirname(__file__), "resources", "mct2_model_pack.zip")
-V1_MODEL_PATH = os.path.join(
-    os.path.dirname(MODEL_PATH), "mct_v1_model_pack.zip"
-)
+MODEL_PATH = get_resource("mct2_model_pack.zip", 'medcat_den')
+V1_MODEL_PATH = get_resource("mct_v1_model_pack.zip", 'medcat_den')


 # unpack
78 changes: 78 additions & 0 deletions medcat-den/tests/resource_fetch.py
@@ -0,0 +1,78 @@
# NOTE: this file is designed to be copied across the following sub-folders
# 1. medcat-v2/tests/resource_fetch.py
# 2. medcat-den/tests/resource_fetch.py
# So if you make changes here, copy them over to the others as well.
#
# NB! This does mean we have duplicate code. But to me the alternatives
# are not better:
# a) keep and install a separate local project - not portable
# b) publish and install from PyPI - extra maintenance burden


import os
import pooch
import importlib
from enum import Enum


_REPO_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..'))
_CENTRAL_RESOURCES = os.path.join(_REPO_ROOT, 'medcat-test-models')

class DefinedResource(Enum):
    v1_model = "mct_v1_model_pack.zip"
    v2_model = "mct2_model_pack.zip"


def _get_version(project_name: str = 'medcat') -> str:
    # NOTE: plan to use this for medcat-den as well
    try:
        pkg = importlib.import_module(project_name)
        ver = getattr(pkg, '__version__', None)
        if ver is None:
            raise ImportError(f"'{project_name}' has no '__version__'")
        return "%2F".join((project_name, f"v{ver}"))
    except ImportError as e:
        raise RuntimeError(
            f"Could not determine version for '{project_name}'. "
            f"Is the package installed?"
        ) from e


def _download_resource(version: str, relative_path: str) -> str:
    url = f"https://github.com/CogStack/cogstack-nlp/releases/download/{version}/{relative_path}"
    try:
        return pooch.retrieve(
            url=url,
            known_hash=None,
            path=pooch.os_cache('medcat_tests'),
            fname=relative_path,
        )
    except Exception as e:
        raise FileNotFoundError(
            f"Test resource '{relative_path}' not found locally in '{_CENTRAL_RESOURCES}' "
            f"and could not be fetched from release {version!r}. "
            f"If developing locally, ensure 'medcat-test-models/' exists at the repo root. "
            f"Original error: {e}"
        ) from e


def get_resource(relative_path: str | DefinedResource, project_name: str = 'medcat') -> str:
    """
    Returns a local path to the requested test resource.
    Prefers the central repo location (medcat-test-models/) if available,
    falls back to downloading from the corresponding release via pooch.
    """
    # allow passing string version of defined resource (e.g. v1_model)
    try:
        relative_path = DefinedResource[relative_path]
    except KeyError:
        pass  # treat as a literal path
    if isinstance(relative_path, DefinedResource):
        relative_path = relative_path.value
    central_path = os.path.join(_CENTRAL_RESOURCES, relative_path)

    if os.path.exists(central_path):
        return central_path

    version = _get_version(project_name)
    return _download_resource(version, relative_path)
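The lookup order used by `get_resource` (a defined name or literal path, then the central folder, then a download) can be sketched in isolation as below. This is an illustrative reduction under stated assumptions, not the PR's code: `resolve_resource` and `fake_download` are hypothetical names, the downloader is injected so no network or `pooch` is needed, and a temporary directory stands in for `medcat-test-models/`.

```python
import os
import tempfile
from enum import Enum
from typing import Callable, Union


class DefinedResource(Enum):
    v1_model = "mct_v1_model_pack.zip"
    v2_model = "mct2_model_pack.zip"


def resolve_resource(
    name: Union[str, DefinedResource],
    local_dir: str,
    download: Callable[[str], str],
) -> str:
    """Local-first lookup with a download fallback (illustrative only)."""
    # a string matching a member name (e.g. "v1_model") maps to that member
    if isinstance(name, str) and name in DefinedResource.__members__:
        name = DefinedResource[name]
    if isinstance(name, DefinedResource):
        name = name.value
    local_path = os.path.join(local_dir, name)
    if os.path.exists(local_path):
        return local_path
    return download(name)


def fake_download(file_name: str) -> str:
    # stand-in for the pooch-based download; performs no network access
    return "downloaded:" + file_name


with tempfile.TemporaryDirectory() as tmp:
    # only the v2 model pack exists "locally"
    open(os.path.join(tmp, "mct2_model_pack.zip"), "wb").close()
    local_hit = resolve_resource("v2_model", tmp, fake_download)
    fallback = resolve_resource(DefinedResource.v1_model, tmp, fake_download)
    literal = resolve_resource("other.bin", tmp, fake_download)
```

Injecting the downloader keeps the resolution logic testable offline, which is one possible way to exercise the fallback branch without publishing a release first.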
1 change: 1 addition & 0 deletions medcat-v2/pyproject.toml
@@ -86,6 +86,7 @@ dev = [
     "types-tqdm",
     "types-setuptools",
     "types-PyYAML",
+    "pooch",
 ]
 spacy = [
     "spacy",
12 changes: 6 additions & 6 deletions medcat-v2/tests/__init__.py
@@ -2,14 +2,14 @@
 import os
 import shutil

+from .resource_fetch import get_resource
+

 RESOURCES_PATH = os.path.join(os.path.dirname(__file__), "resources")
-EXAMPLE_MODEL_PACK_ZIP = os.path.join(RESOURCES_PATH, "mct2_model_pack.zip")
-UNPACKED_EXAMPLE_MODEL_PACK_PATH = os.path.join(
-    RESOURCES_PATH, "mct2_model_pack")
-V1_MODEL_PACK_PATH = os.path.join(RESOURCES_PATH, "mct_v1_model_pack.zip")
-UNPACKED_V1_MODEL_PACK_PATH = os.path.join(
-    RESOURCES_PATH, "mct_v1_model_pack")
+EXAMPLE_MODEL_PACK_ZIP = get_resource("mct2_model_pack.zip")
+UNPACKED_EXAMPLE_MODEL_PACK_PATH = EXAMPLE_MODEL_PACK_ZIP.removesuffix(".zip")
+V1_MODEL_PACK_PATH = get_resource("mct_v1_model_pack.zip")
+UNPACKED_V1_MODEL_PACK_PATH = V1_MODEL_PACK_PATH.removesuffix(".zip")


 # unpack model pack at start so we can access stuff like Vocab
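The unpacked-folder constants in this hunk are derived from the zip path with `str.removesuffix` (available from Python 3.9), which strips only an exact trailing match. A small sketch with made-up paths, contrasting it with `str.rstrip`, which strips a set of characters and can eat into the file name:

```python
zip_path = "/tmp/cache/mct2_model_pack.zip"  # made-up example path

# exact suffix removal: only a trailing ".zip" is dropped
unpacked = zip_path.removesuffix(".zip")

# rstrip treats ".zip" as a set of characters, so it is not a safe substitute
risky = "/tmp/cache/quiz.zip".rstrip(".zip")

# a path without the suffix is returned unchanged
unchanged = "/tmp/cache/model_dir".removesuffix(".zip")
```

Here `unpacked` is `/tmp/cache/mct2_model_pack`, while the `rstrip` variant truncates `quiz.zip` down to `qu`.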
78 changes: 78 additions & 0 deletions medcat-v2/tests/resource_fetch.py
@@ -0,0 +1,78 @@
# NOTE: this file is designed to be copied across the following sub-folders
# 1. medcat-v2/tests/resource_fetch.py
# 2. medcat-den/tests/resource_fetch.py
# So if you make changes here, copy them over to the others as well.
#
# NB! This does mean we have duplicate code. But to me the alternatives
# are not better:
# a) keep and install a separate local project - not portable
# b) publish and install from PyPI - extra maintenance burden


import os
import pooch
import importlib
from enum import Enum


_REPO_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..'))
_CENTRAL_RESOURCES = os.path.join(_REPO_ROOT, 'medcat-test-models')

class DefinedResource(Enum):
    v1_model = "mct_v1_model_pack.zip"
    v2_model = "mct2_model_pack.zip"


def _get_version(project_name: str = 'medcat') -> str:
    # NOTE: plan to use this for medcat-den as well
    try:
        pkg = importlib.import_module(project_name)
        ver = getattr(pkg, '__version__', None)
        if ver is None:
            raise ImportError(f"'{project_name}' has no '__version__'")
        return "%2F".join((project_name, f"v{ver}"))
    except ImportError as e:
        raise RuntimeError(
            f"Could not determine version for '{project_name}'. "
            f"Is the package installed?"
        ) from e


def _download_resource(version: str, relative_path: str) -> str:
    url = f"https://github.com/CogStack/cogstack-nlp/releases/download/{version}/{relative_path}"
    try:
        return pooch.retrieve(
            url=url,
            known_hash=None,
            path=pooch.os_cache('medcat_tests'),
            fname=relative_path,
        )
    except Exception as e:
        raise FileNotFoundError(
            f"Test resource '{relative_path}' not found locally in '{_CENTRAL_RESOURCES}' "
            f"and could not be fetched from release {version!r}. "
            f"If developing locally, ensure 'medcat-test-models/' exists at the repo root. "
            f"Original error: {e}"
        ) from e


def get_resource(relative_path: str | DefinedResource, project_name: str = 'medcat') -> str:
    """
    Returns a local path to the requested test resource.
    Prefers the central repo location (medcat-test-models/) if available,
    falls back to downloading from the corresponding release via pooch.
    """
    # allow passing string version of defined resource (e.g. v1_model)
    try:
        relative_path = DefinedResource[relative_path]
    except KeyError:
        pass  # treat as a literal path
    if isinstance(relative_path, DefinedResource):
        relative_path = relative_path.value
    central_path = os.path.join(_CENTRAL_RESOURCES, relative_path)

    if os.path.exists(central_path):
        return central_path

    version = _get_version(project_name)
    return _download_resource(version, relative_path)
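The download URL in `_download_resource` embeds a release tag of the form `<project>/v<version>`, with the slash percent-encoded as `%2F` by `_get_version`. As a rough illustration of that string-building step only, the same result can be produced with `urllib.parse.quote`. The helper name `release_asset_url` and the version `2.0.0` are made up for this example, and nothing here checks that such a release exists.

```python
from urllib.parse import quote

BASE = "https://github.com/CogStack/cogstack-nlp/releases/download"


def release_asset_url(project: str, version: str, asset: str) -> str:
    """Build a release download URL (hypothetical helper, for illustration)."""
    tag = f"{project}/v{version}"
    # safe="" makes quote() encode "/" as %2F instead of leaving it as-is
    return f"{BASE}/{quote(tag, safe='')}/{asset}"


url = release_asset_url("medcat", "2.0.0", "mct2_model_pack.zip")
```

Using `quote` rather than a hand-written `"%2F".join(...)` would also cover any other characters in a tag that need escaping, at the cost of one extra import.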
Binary file removed medcat-v2/tests/resources/mct2_model_pack.zip
Binary file not shown.
Binary file removed medcat-v2/tests/resources/mct_v1_model_pack.zip
Binary file not shown.
4 changes: 2 additions & 2 deletions medcat-v2/tests/utils/legacy/test_conversion_all.py
@@ -7,11 +7,11 @@
 import unittest.mock

 from .test_convert_vocab import TESTS_PATH
+from ... import V1_MODEL_PACK_PATH


 class ConversionFromZIPTests(unittest.TestCase):
-    MODEL_FOLDER = os.path.join(TESTS_PATH, "resources",
-                                "mct_v1_model_pack.zip")
+    MODEL_FOLDER = V1_MODEL_PACK_PATH

     @classmethod
     def setUpClass(cls):