Maintaining the MHub Segmentation Skill

This document explains how to update the skill's cached data to reflect changes in MHub models, SegDB segments, or workflow configurations.

When to Update

Consider updating the skill when:

  • New models are added to MHub (check mhub.ai/models)
  • SegDB adds new anatomical segments
  • Model configurations change significantly
  • You notice missing or outdated information

MHub typically adds 5-10 new models per year, so quarterly updates are usually sufficient.
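The quarterly cadence can be checked mechanically. The following is a minimal sketch, assuming the `cache_date` field in `YYYY-MM-DD` form that the update scripts in this document write. The function name `cache_is_stale` and the 90-day threshold are illustrative choices, not part of the skill.

```python
from datetime import date, datetime

def cache_is_stale(cache_date: str, today: date, max_age_days: int = 90) -> bool:
    """Return True if cache_date (YYYY-MM-DD) is more than max_age_days before today."""
    cached = datetime.strptime(cache_date, "%Y-%m-%d").date()
    return (today - cached).days > max_age_days

# Fixed dates keep the example reproducible; in practice pass date.today().
print(cache_is_stale("2025-01-29", date(2025, 6, 1)))  # True
print(cache_is_stale("2025-01-29", date(2025, 3, 1)))  # False
```

In practice the first argument would come from the `cache_date` key of `data/models_summary.json` or `data/segdb_cache.json`.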

Quick Update (Cache Refresh Only)

If you just need to refresh the model list from the MHub API:

cd /path/to/mhub-segmentation
python scripts/mhub_helper.py refresh

This updates:

  • data/models_cache.json (raw API response)
  • data/models_summary.json (processed index)

Requirements: Network access, requests package (pip install requests)

Full Update Process

For a comprehensive update including all cached data:

1. Update MHub Models Cache

# Fetch latest from MHub API
curl -s "https://mhub.ai/api/v2/models/detailed" \
  -H "accept: */*" \
  -H "origin: https://mhub.ai" \
  -H "referer: https://mhub.ai/" \
  > data/models_cache.json

# Rebuild the summary index
python3 << 'EOF'
import json
from datetime import datetime

with open('data/models_cache.json') as f:
    raw = json.load(f)

models = raw['data']

summary = {
    "cache_date": datetime.now().strftime("%Y-%m-%d"),
    "source": "https://mhub.ai/api/v2/models/detailed",
    "model_count": len(models),
    "models": {},
    "by_modality": {},
    "by_segment": {},
    "all_segments": []
}

all_segments = set()
for m in models:
    name = m['name']
    segments = m.get('segmentations', [])
    modalities = m.get('modalities', [])
    
    summary["models"][name] = {
        "label": m.get('label', name),
        "description": m.get('description', ''),
        "modalities": modalities,
        "segments": segments,
        "segment_count": len(segments),
        "predictions": m.get('predictions', []),
        # Fall back to 'Unknown' when 'categories' is missing or an empty list
        "category": (m.get('categories') or ['Unknown'])[0],
        "license_code": m.get('licence', {}).get('model', 'Unknown'),
        "license_weights": m.get('licence', {}).get('weights', 'Unknown'),
        "cite": m.get('cite', ''),
        "inputs": m.get('inputs', []),
        "docker_image": f"mhubai/{name}:latest"
    }
    
    for mod in modalities:
        if mod not in summary["by_modality"]:
            summary["by_modality"][mod] = []
        summary["by_modality"][mod].append(name)
    
    for seg in segments:
        all_segments.add(seg)
        if seg not in summary["by_segment"]:
            summary["by_segment"][seg] = []
        summary["by_segment"][seg].append(name)

summary["all_segments"] = sorted(all_segments)

with open('data/models_summary.json', 'w') as f:
    json.dump(summary, f, indent=2)

print(f"Updated: {len(models)} models, {len(all_segments)} segments")
EOF
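
The `by_segment` and `by_modality` indexes built above make reverse lookups cheap. As a rough illustration of how they might be queried, here is a sketch that uses a hypothetical helper `models_for` and made-up model names. It is not the helper script's own lookup code.

```python
from typing import List, Optional

def models_for(summary: dict, segment: str, modality: Optional[str] = None) -> List[str]:
    """Return model names that produce `segment`, optionally limited to one modality."""
    names = summary.get("by_segment", {}).get(segment, [])
    if modality is None:
        return sorted(names)
    allowed = set(summary.get("by_modality", {}).get(modality, []))
    return sorted(n for n in names if n in allowed)

# Hand-written stand-in for data/models_summary.json (entries are invented)
example = {
    "by_segment": {"LIVER": ["model_a", "model_b"], "LUNG": ["model_c"]},
    "by_modality": {"CT": ["model_a", "model_c"], "MR": ["model_b"]},
}
print(models_for(example, "LIVER", "CT"))  # ['model_a']
print(models_for(example, "LIVER"))        # ['model_a', 'model_b']
```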

2. Update Default Workflow Configs

Fetch the default YAML config for each model:

# Read model names and fetch their configs
python3 << 'EOF'
import json
import subprocess
from pathlib import Path

with open('data/models_summary.json') as f:
    models = json.load(f)['models']

config_dir = Path('assets/workflow-templates/defaults')
config_dir.mkdir(parents=True, exist_ok=True)

for name in models:
    url = f"https://raw.githubusercontent.com/MHubAI/models/main/models/{name}/config/default.yml"
    output_path = config_dir / f"{name}.yml"
    
    result = subprocess.run(
        ['curl', '-s', '-f', url],
        capture_output=True,
        text=True
    )
    
    if result.returncode == 0 and result.stdout.strip():
        output_path.write_text(result.stdout)
        print(f"✓ {name}")
    else:
        print(f"✗ {name} (no config found)")

print(f"\nUpdated {len(list(config_dir.glob('*.yml')))} configs")
EOF
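
After the fetch it can help to see which models ended up without a config file. The sketch below compares model names against the `*.yml` files in a directory. The function name `missing_configs` is hypothetical, and the demonstration uses a temporary directory rather than the real `assets/workflow-templates/defaults`.

```python
import tempfile
from pathlib import Path

def missing_configs(model_names, config_dir):
    """Return the sorted model names that have no <name>.yml file in config_dir."""
    have = {p.stem for p in Path(config_dir).glob("*.yml")}
    return sorted(set(model_names) - have)

# Demonstration with a throwaway directory and invented model names
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "model_a.yml").write_text("placeholder: true\n")
    print(missing_configs(["model_a", "model_b"], tmp))  # ['model_b']
```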

3. Update SegDB Cache

pip install segdb --upgrade --break-system-packages

python3 << 'EOF'
import json
from datetime import datetime
from segdb.lookup import db

# Get type codes and categories
types_dict = db.types.to_dict('index')
categories_dict = db.categories.to_dict('index')

segments = {}
for seg_id, row in db.segmentations.iterrows():
    color_str = row.get('color', '128,128,128')
    color_parts = [int(c) for c in str(color_str).split(',')]
    
    anat_region = row.get('anatomic_region', '')
    type_info = types_dict.get(anat_region, {})
    
    cat_id = row.get('category', '')
    cat_info = categories_dict.get(cat_id, {})
    
    segments[seg_id] = {
        "name": row.get('name', seg_id),
        "category": cat_id,
        "category_code": cat_info.get('CodeValue', ''),
        "category_scheme": cat_info.get('CodingSchemeDesignator', ''),
        "category_meaning": cat_info.get('CodeMeaning', ''),
        "type_id": anat_region,
        "type_code": type_info.get('CodeValue', ''),
        "type_scheme": type_info.get('CodingSchemeDesignator', ''),
        "type_meaning": type_info.get('CodeMeaning', ''),
        "modifier": row.get('modifier', None),
        "color_rgb": color_parts
    }

summary = {
    "cache_date": datetime.now().strftime("%Y-%m-%d"),
    "source": "segdb Python package",
    "segment_count": len(segments),
    "segments": segments,
    "categories": db.categories.to_dict('index'),
    "modifiers": db.modifiers.to_dict('index')
}

with open('data/segdb_cache.json', 'w') as f:
    json.dump(summary, f, indent=2, default=str)

print(f"Updated: {len(segments)} SegDB segments")
EOF
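
The script above assumes every `color` value is a well-formed `R,G,B` string. If a row has a missing or malformed value, the `int()` conversion raises. A more defensive parser could look like the sketch below. The name `parse_color` is hypothetical, and the grey fallback mirrors the `128,128,128` default used above.

```python
def parse_color(value, default=(128, 128, 128)):
    """Parse an 'R,G,B' string into a list of three ints, or return the default."""
    try:
        parts = [int(c) for c in str(value).split(",")]
    except ValueError:
        # Covers NaN, None, empty strings and other non-numeric content
        return list(default)
    if len(parts) != 3 or any(not 0 <= p <= 255 for p in parts):
        return list(default)
    return parts

print(parse_color("255,0,0"))     # [255, 0, 0]
print(parse_color(float("nan")))  # [128, 128, 128]
```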

4. Update Cache Dates in Documentation

After updating caches, update the date references in:

  1. SKILL.md - Update the "Cache date" line:

    **Cache date:** 2025-01-29  <!-- Update this -->

  2. README.md - Update the cache table:

    | MHub Models | 30 | mhub.ai API | 2025-01-29 |  <!-- Update -->
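
The date edit in SKILL.md could also be scripted. This is a minimal sketch that assumes the line keeps the exact `**Cache date:** YYYY-MM-DD` form shown above. The function name `bump_cache_date` is hypothetical, and the README table row would need its own pattern.

```python
import re

def bump_cache_date(text: str, new_date: str) -> str:
    """Replace the date that follows '**Cache date:**' with new_date."""
    pattern = r"(\*\*Cache date:\*\*\s*)\d{4}-\d{2}-\d{2}"
    # A function replacement avoids ambiguity between the group reference and the date digits
    return re.sub(pattern, lambda m: m.group(1) + new_date, text)

print(bump_cache_date("**Cache date:** 2025-01-29", "2025-04-30"))
# **Cache date:** 2025-04-30
```

To apply it to the file, read `SKILL.md` with `Path.read_text()`, pass the text through the function, and write the result back.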

Verification

After updating, verify the caches are valid:

python scripts/mhub_helper.py info

Expected output:

Models cache:
  Date: 2025-XX-XX
  Models: XX
  Segments tracked: XXX

SegDB cache:
  Date: 2025-XX-XX
  Segments: XXX

Test key functionality:

# Model lookup
python scripts/mhub_helper.py model totalsegmentator

# Segment lookup
python scripts/mhub_helper.py segment LIVER

# Config generation
python scripts/mhub_helper.py config lungmask --pattern flat
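
Beyond the helper commands, the summary file can be checked for internal consistency. The sketch below assumes the `models_summary.json` layout produced by the rebuild script in this document. The function name `check_summary` is hypothetical, and the sample data is invented.

```python
def check_summary(summary: dict) -> list:
    """Return a list of human-readable problems found in a models summary."""
    problems = []
    models = summary.get("models", {})
    if summary.get("model_count") != len(models):
        problems.append("model_count does not match number of models")
    # Every name listed in an index should exist in the models table
    for index_name in ("by_modality", "by_segment"):
        for key, names in summary.get(index_name, {}).items():
            for name in names:
                if name not in models:
                    problems.append(f"{index_name}[{key}] lists unknown model {name}")
    return problems

good = {
    "model_count": 1,
    "models": {"model_a": {}},
    "by_modality": {"CT": ["model_a"]},
    "by_segment": {"LIVER": ["model_a"]},
}
print(check_summary(good))  # []
```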

Automated Update Script

For convenience, here's a complete update script:

#!/bin/bash
# update_skill.sh - Update all MHub skill caches

set -e
cd "$(dirname "$0")"

echo "=== Updating MHub Segmentation Skill ==="
echo ""

# 1. Models cache
echo "1. Fetching MHub models..."
curl -s "https://mhub.ai/api/v2/models/detailed" \
  -H "accept: */*" \
  -H "origin: https://mhub.ai" \
  -H "referer: https://mhub.ai/" \
  > data/models_cache.json

# 2. Refresh via helper (rebuilds summary)
echo "2. Rebuilding model summary..."
python scripts/mhub_helper.py refresh

# 3. Fetch default configs
echo "3. Fetching default workflow configs..."
python3 -c "
import json
import subprocess
from pathlib import Path

with open('data/models_summary.json') as f:
    models = json.load(f)['models']

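config_dir = Path('assets/workflow-templates/defaults')
config_dir.mkdir(parents=True, exist_ok=True)  # create the target directory on first run
for name in models:
    url = f'https://raw.githubusercontent.com/MHubAI/models/main/models/{name}/config/default.yml'
    result = subprocess.run(['curl', '-s', '-f', url], capture_output=True, text=True)
    # Write only non-empty successful responses; report models without a config
    if result.returncode == 0 and result.stdout.strip():
        (config_dir / f'{name}.yml').write_text(result.stdout)
        print(f'  ✓ {name}')
    else:
        print(f'  ✗ {name} (no config found)')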
"

# 4. Update SegDB (requires pip install segdb)
echo "4. Updating SegDB cache..."
pip install -q segdb --upgrade --break-system-packages 2>/dev/null || pip install -q segdb --upgrade
python3 -c "
import json
from datetime import datetime
from segdb.lookup import db

types_dict = db.types.to_dict('index')
categories_dict = db.categories.to_dict('index')
segments = {}

for seg_id, row in db.segmentations.iterrows():
    color_parts = [int(c) for c in str(row.get('color', '128,128,128')).split(',')]
    anat_region = row.get('anatomic_region', '')
    type_info = types_dict.get(anat_region, {})
    cat_id = row.get('category', '')
    cat_info = categories_dict.get(cat_id, {})
    
    segments[seg_id] = {
        'name': row.get('name', seg_id),
        'category': cat_id,
        'category_code': cat_info.get('CodeValue', ''),
        'category_scheme': cat_info.get('CodingSchemeDesignator', ''),
        'category_meaning': cat_info.get('CodeMeaning', ''),
        'type_id': anat_region,
        'type_code': type_info.get('CodeValue', ''),
        'type_scheme': type_info.get('CodingSchemeDesignator', ''),
        'type_meaning': type_info.get('CodeMeaning', ''),
        'modifier': row.get('modifier', None),
        'color_rgb': color_parts
    }

with open('data/segdb_cache.json', 'w') as f:
    json.dump({
        'cache_date': datetime.now().strftime('%Y-%m-%d'),
        'source': 'segdb Python package',
        'segment_count': len(segments),
        'segments': segments,
        'categories': db.categories.to_dict('index'),
        'modifiers': db.modifiers.to_dict('index')
    }, f, indent=2, default=str)

print(f'  ✓ {len(segments)} segments')
"

# 5. Verify
echo ""
echo "=== Update Complete ==="
python scripts/mhub_helper.py info

echo ""
echo "Remember to update cache dates in SKILL.md and README.md!"

Save this as update_skill.sh in the skill root and run with:

chmod +x update_skill.sh
./update_skill.sh

Data Sources Reference

| Data | Source | API/Package |
|------|--------|-------------|
| Model list | MHub | https://mhub.ai/api/v2/models/detailed |
| Model configs | GitHub | https://raw.githubusercontent.com/MHubAI/models/main/models/{name}/config/default.yml |
| Segment codes | SegDB | pip install segdb; from segdb.lookup import db |
| Documentation | GitHub | https://github.com/MHubAI/documentation |

Troubleshooting

API returns empty or error

The MHub API requires specific headers. Ensure you include:

-H "origin: https://mhub.ai" -H "referer: https://mhub.ai/"
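
For Python callers, the same headers can be attached to a request object. This sketch uses the standard library's `urllib.request` rather than the `requests` package the helper relies on, so that it is self-contained. The function name `build_request` is hypothetical, and the network call is left as a comment.

```python
import urllib.request

MHUB_URL = "https://mhub.ai/api/v2/models/detailed"

def build_request(url: str = MHUB_URL) -> urllib.request.Request:
    """Build a GET request carrying the headers used by the curl command in this guide."""
    return urllib.request.Request(url, headers={
        "accept": "*/*",
        "origin": "https://mhub.ai",
        "referer": "https://mhub.ai/",
    })

req = build_request()
print(req.get_header("Origin"))   # https://mhub.ai
print(req.get_header("Referer"))  # https://mhub.ai/
# To fetch for real: urllib.request.urlopen(req, timeout=30).read()
```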

New model missing from configs

Check if the model exists in the MHub models repo:

curl -s "https://api.github.com/repos/MHubAI/models/contents/models" | grep '"name"'
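
The same check can be done on the parsed JSON instead of with `grep`. This sketch assumes the GitHub contents endpoint returns a list of entries with `name` and `type` fields. The function name `model_dirs` is hypothetical, and the sample listing is invented.

```python
def model_dirs(listing: list) -> list:
    """Return the sorted names of directory entries in a GitHub contents listing."""
    return sorted(e["name"] for e in listing if e.get("type") == "dir")

# Hand-written sample shaped like the API response (names are made up)
sample = [
    {"name": "model_b", "type": "dir"},
    {"name": "README.md", "type": "file"},
    {"name": "model_a", "type": "dir"},
]
dirs = model_dirs(sample)
print(dirs)               # ['model_a', 'model_b']
print("model_c" in dirs)  # False
```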

SegDB import fails

Update to the latest version:

pip install segdb --upgrade

Segment ID not in SegDB

Some MHub models use composite IDs (e.g., LIVER+NEOPLASM_MALIGNANT_PRIMARY) that aren't directly in SegDB. The helper script handles these by splitting on + and looking up components separately.
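
The splitting approach described above might look roughly like the following. This illustrates the idea and is not the helper script's actual code. The function names and the contents of `known` are hypothetical.

```python
def split_composite(segment_id: str) -> list:
    """Split a composite ID such as 'A+B' into its non-empty components."""
    return [part.strip() for part in segment_id.split("+") if part.strip()]

def lookup_components(segment_id: str, known_segments: dict) -> dict:
    """Map each component to its entry in known_segments, or None if absent."""
    return {part: known_segments.get(part) for part in split_composite(segment_id)}

# Stand-in for the 'segments' table of data/segdb_cache.json (values invented)
known = {"LIVER": {"name": "Liver"}}
print(split_composite("LIVER+NEOPLASM_MALIGNANT_PRIMARY"))
# ['LIVER', 'NEOPLASM_MALIGNANT_PRIMARY']
print(lookup_components("LIVER+NEOPLASM_MALIGNANT_PRIMARY", known))
# {'LIVER': {'name': 'Liver'}, 'NEOPLASM_MALIGNANT_PRIMARY': None}
```

Returning `None` for unknown components, rather than raising, lets a caller report exactly which part of a composite ID is missing from the cache.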