@marcbal77 commented Aug 5, 2025

Description

This PR introduces a minimal framework for integrating third-party API models into biolearn, with Hurdle Bio's Inflammage model as the first implementation.

Key Changes

  • Add HurdleAPIModel class to model.py for API-based predictions
  • Integrate HurdleAPIModel into ModelGallery for consistent usage
  • Implement privacy-first approach with explicit user consent before sending data
  • Add automatic imputation for missing CpG sites (uses 0.5 for missing values)
  • Include comprehensive error handling with user-friendly messages
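
The 0.5 fill for missing CpG sites mentioned in the list above could look roughly like the following. This is a minimal sketch, not the PR's code: the function name `impute_missing_sites` and the dict-based sample layout are assumptions made for illustration. Note that a later commit in this PR replaces silent imputation with an explicit error.

```python
from typing import Dict, Iterable


def impute_missing_sites(
    sample: Dict[str, float], required: Iterable[str], fill: float = 0.5
) -> Dict[str, float]:
    """Return values for the required sites, using `fill` where a site is absent.

    Hypothetical helper: `sample` maps CpG site IDs to methylation values
    for a single sample. Sites not listed in `required` are dropped.
    """
    return {site: sample.get(site, fill) for site in required}


if __name__ == "__main__":
    observed = {"cg0001": 0.81, "cg0002": 0.12}
    print(impute_missing_sites(observed, ["cg0001", "cg0003"]))
```

A fixed fill value keeps the request shape stable, at the cost of hiding how much of the input was actually measured.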

Security & Privacy Features

  • API keys managed via environment variables (HURDLE_API_KEY) or direct input
  • Explicit consent required before sending data to external servers
  • Updated .gitignore to exclude credential files
  • Clear warnings about third-party data sharing
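
As an illustration of the "environment variable or direct input" key handling listed above, here is a minimal sketch. The function name `resolve_api_key` and the error wording are assumptions; only the variable name `HURDLE_API_KEY` comes from this PR's usage example.

```python
import os
from typing import Optional

ENV_VAR = "HURDLE_API_KEY"  # name taken from the usage example in this PR


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Return the explicitly passed key, else fall back to the environment.

    Hypothetical helper, not the PR's implementation.
    """
    key = api_key or os.environ.get(ENV_VAR)
    if not key:
        raise ValueError(
            f"No API key found. Set {ENV_VAR} or pass api_key explicitly."
        )
    return key
```

Giving the explicit argument priority lets a caller override the environment in a notebook or test without touching global state.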

Documentation

  • User guide in docs/hurdle_api_guide.md with setup instructions
  • Working example in examples/hurdle_api_example.py
  • Placeholder file for Hurdle CpG sites list

Testing

  • Comprehensive test suite in biolearn/test/test_hurdle_model.py
  • Mock API calls to avoid requiring credentials in tests
  • Tests cover consent flow, error handling, and API integration
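
The idea of mocking API calls so tests need no credentials can be sketched with the standard library alone. Everything here is an assumption for illustration: `fetch_prediction`, the URL, and the response shape are hypothetical, and the PR may use a different HTTP client.

```python
import json
import urllib.request
from unittest import mock


def fetch_prediction(url: str, payload: dict) -> dict:
    """Hypothetical helper: POST JSON to an API and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def run_mocked_call() -> dict:
    """Call fetch_prediction with urlopen patched, so no network is used."""
    fake_response = mock.MagicMock()
    fake_response.read.return_value = b'{"prediction": 42.0}'
    # urlopen is used as a context manager, so __enter__ must yield the response.
    fake_response.__enter__.return_value = fake_response
    with mock.patch(
        "urllib.request.urlopen", return_value=fake_response
    ) as fake_open:
        result = fetch_prediction(
            "https://example.invalid/predict", {"cg0001": 0.5}
        )
        # Exactly one outbound call was attempted, and it hit the mock.
        assert fake_open.call_count == 1
    return result
```

Patching at the HTTP boundary keeps the request-building and response-parsing code under test while guaranteeing that no data leaves the machine.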

Usage Example

# Set API key (shell)
export HURDLE_API_KEY="your_key"

# Use the model (Python)
from biolearn.model_gallery import ModelGallery

gallery = ModelGallery()
model = gallery.get("HurdleInflammage")
# methylation_data must already be loaded by the caller
predictions = model.predict(methylation_data)

@marcbal77 changed the title from "feat: Add support for third-party API models with Hurdle implementation" to "feat: Add support for third-party API models" on Aug 5, 2025
@marcbal77 force-pushed the feature/third-party-api-models branch from 2a46602 to 4be828e on August 16, 2025 02:33
This commit introduces a minimal framework for integrating third-party API models
into biolearn, with Hurdle Bio's Inflammage model as the first implementation.

Key changes:
- Add HurdleAPIModel class to model.py for API-based predictions
- Integrate HurdleAPIModel into ModelGallery for consistent usage
- Implement privacy-first approach with explicit user consent
- Add automatic imputation for missing CpG sites
- Include comprehensive error handling with user-friendly messages

Security & Privacy:
- API keys managed via environment variables or direct input
- Explicit consent required before sending data to external servers
- Updated .gitignore to exclude credential files

Documentation:
- Add user guide in docs/hurdle_api_guide.md
- Include working example in examples/hurdle_api_example.py
- Placeholder for Hurdle CpG sites list

Testing:
- Add test suite for HurdleAPIModel functionality
- Mock API calls to avoid requiring credentials in tests

This implementation minimizes changes to the existing codebase while providing
a robust foundation for future API model integrations.
- Add actual CpG sites to Hurdle_CpGs.csv.example file (first 100 sites)
- Fix test_consent_denied decorator issue (missing mock_input parameter)
- Skip HurdleAPIModel in main test_model.py suite (requires API credentials)
- Update test data to include proper CpG site indices for Hurdle tests
- Fix file path to use Hurdle_CpGs.csv.example instead of .csv
- Format biolearn/test/test_hurdle_model.py
- Format biolearn/test/test_model.py
- Add real CpG sites to example file, improve error handling and validation
- Simplify consent mechanism, reduce test redundancy, add type hints
- Move documentation to doc/ folder, clean up examples
- Update HurdleAPIModel implementation to match latest codebase
- Fix test imports and references after rebase
- Maintain compatibility with existing model architecture
- Add self._consent_given flag to track consent state
- Only ask for consent if not already given
- Remove duplicate base_url initialization
- Fixes failing test test_consent_only_asked_once
- Format base_url assignment as multi-line for consistency
- Fixes CI formatting check failure
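
The consent-tracking change described in the commits above (a `_consent_given` flag, with the prompt shown only if consent has not yet been given) can be sketched as follows. The class name `ConsentGate`, the prompt text, and the injectable `ask` callable are assumptions for illustration; only the flag name comes from the commit message.

```python
from typing import Callable


class ConsentGate:
    """Hypothetical helper that asks for consent at most once per instance."""

    def __init__(self, ask: Callable[[str], str] = input) -> None:
        self._ask = ask
        self._consent_given = False  # flag name taken from the commit message

    def ensure_consent(self) -> None:
        """Prompt on first use; raise if the user declines."""
        if self._consent_given:
            return
        answer = self._ask(
            "Send methylation data to an external server? [y/N] "
        )
        if answer.strip().lower() not in ("y", "yes"):
            raise PermissionError("Consent denied; no data was sent.")
        self._consent_given = True
```

Passing the prompt function in as a parameter makes the flow testable without patching `input`, although patching `builtins.input` with a mock works equally well.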
…ntation

- Rename HurdleInflammage to HurdleInflammAge
- Switch to production API (use_production=True)
- Update registration URL to https://dashboard.hurdle.bio/register
- Add non-commercial use disclaimer to documentation and docstring
- Rename Hurdle_CpGs.csv.example to Hurdle_CpGs.csv
- Document 0.5 imputation for missing CpG sites
- Update example script with better error handling
- Update all references in tests and documentation
@marcbal77 force-pushed the feature/third-party-api-models branch from 0c2c5ef to ad8b326 on October 14, 2025 07:19
- Model now errors if any required CpG sites are missing
- Provides informative error with count and examples of missing sites
- Directs users to use ModelGallery imputation methods
- Updated documentation to explain missing data handling
- Added test for missing CpG error
- All tests passing (143 passed, 4 skipped)
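
The stricter behavior in this commit (fail when required CpG sites are missing, reporting a count and a few examples) might look like the sketch below. The function name `check_required_sites`, the `max_examples` parameter, and the message wording are assumptions, not the PR's actual code.

```python
from typing import Iterable, List


def check_required_sites(
    available: Iterable[str], required: Iterable[str], max_examples: int = 5
) -> None:
    """Raise ValueError naming how many required sites are absent.

    Hypothetical helper: both arguments are collections of CpG site IDs.
    """
    available_set = set(available)
    missing: List[str] = [
        site for site in required if site not in available_set
    ]
    if missing:
        examples = ", ".join(missing[:max_examples])
        raise ValueError(
            f"{len(missing)} required CpG sites are missing "
            f"(e.g. {examples}). Impute the missing sites before predicting."
        )
```

Failing loudly leaves the choice of imputation method to the user instead of silently altering the data sent to the external API.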
@marcbal77 self-assigned this on Oct 17, 2025
@marcbal77 added the enhancement label on Oct 17, 2025