Last updated: 2026-02-18
Priority: HIGH | Effort: Medium | Status: Done
Add ability to download results as CSV/HTML report from the Results tab.
- CSV exports: coefficients, population, jury, generation tracking
- HTML report: self-contained with embedded charts and metrics
- Backend endpoints for CSV sections and full HTML report
- Frontend: Export dropdown in Results tab header
Priority: HIGH | Effort: Medium | Status: Done
Enhance the existing Comparative sub-tab with detailed side-by-side analysis.
- Side-by-side metrics tables for 2+ selected jobs
- Diff highlighting for changed parameters
- Config diff viewer showing only differing parameters
- Performance delta visualization
Priority: HIGH | Effort: High | Status: Done
Live progress tracking during training.
- ConsolePanel with ANSI-rendered log output and HTTP polling
- Real-time progress bar: current generation / max_epochs
- Generation, k-value, and language display
- Minimizable console with status badge
Priority: HIGH | Effort: Medium | Status: Done
Generate downloadable Python/R notebooks from completed job results.
- Python notebook (.ipynb): loads data, runs gpredomicspy, displays results with matplotlib
- R notebook (.Rmd): loads data with gpredomicsR, generates ggplot2 visualizations
- Pre-filled with actual parameter values from completed jobs
- Download buttons in Results tab export dropdown
Priority: MEDIUM | Effort: Low | Status: Done
Interactive first-use walkthrough highlighting key features.
- Lightweight modal-based tour (OnboardingTour.vue) — no external dependencies
- 6 steps: Welcome → Create project → Upload data → Configure → Launch → Results
- "Don't show again" checkbox stored in localStorage
- Reset tour option in Profile → Preferences
- Auto-shows for first-time users after login
Priority: MEDIUM | Effort: Low | Status: Done
Notify user when a long-running job completes or fails.
- Uses browser Notification API with permission request on project dashboard load
- Triggers on job status transition: running → completed/failed
- Shows AUC and k in notification body for completed jobs
- Enable/disable toggle in Profile → Preferences
- Auto-close after 10 seconds, click-to-focus
Priority: MEDIUM | Effort: Medium | Status: Done
Launch multiple analysis jobs with parameter sweeps.
- Batch Mode toggle in Parameters tab with sweep parameter grid builder
- 7 sweepable params: seed, algo, language, data_type, population_size, max_epochs, k_max
- Cartesian product of sweep values (max 50 jobs per batch)
- `batch_id` column on jobs with migration v10
- Backend: `POST /batch` creates N jobs, `GET /batches` returns batch summaries with best AUC
- Jobs named `[Batch] param=value ...` for easy identification
- Real-time job count preview and max-50 validation
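
The sweep expansion described above can be sketched in a few lines of Python. This is only an illustration of a Cartesian product with a 50-job cap, not the application's actual code; the names `expand_sweep`, `job_name`, and `MAX_JOBS_PER_BATCH` are hypothetical.

```python
from itertools import product

MAX_JOBS_PER_BATCH = 50  # cap stated in the roadmap; the constant name is hypothetical


def expand_sweep(sweep: dict[str, list]) -> list[dict]:
    """Return one parameter dict per combination of sweep values."""
    names = list(sweep)
    combos = [
        dict(zip(names, values))
        for values in product(*(sweep[name] for name in names))
    ]
    if len(combos) > MAX_JOBS_PER_BATCH:
        raise ValueError(
            f"{len(combos)} jobs exceed the batch limit of {MAX_JOBS_PER_BATCH}"
        )
    return combos


def job_name(params: dict) -> str:
    """Build a '[Batch] param=value ...' style job name."""
    return "[Batch] " + " ".join(f"{key}={value}" for key, value in params.items())


jobs = expand_sweep({"seed": [1, 2, 3], "k_max": [10, 20]})
print(len(jobs))          # 6
print(job_name(jobs[0]))  # [Batch] seed=1 k_max=10
```

The same count (the product of the value-list lengths) is what a live job count preview would display before submission.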
Priority: MEDIUM | Effort: Low | Status: Done
Organize datasets with tags and enable search/filter.
- Added `tags: JSON` field to Dataset model with migration v9
- Tag CRUD endpoints (create, update, suggestions, filter)
- Search bar and tag filter dropdown in Dataset Library
- Clickable tag chips, inline tag editor with datalist autocomplete
- Pre-defined tags: benchmark, clinical, metagenomic, 16S, shotgun, WGS, etc.
Priority: LOW | Effort: Medium | Status: Done
Interactive CSS animation showing the Predomics pipeline on the landing page.
- 5-step pipeline: Data Input → Feature Selection → Evolutionary Search → Model Evaluation → Jury Voting
- CSS keyframe fadeSlideIn animation with staggered delays
- Arrow connectors between steps, responsive layout
- Each step has icon, label, and description
Priority: LOW | Effort: Low | Status: Done
Real-world use cases with results on the landing page.
- 4 use cases: Cirrhosis (Qin 2014), Cancer (Zeller 2014), Treatment Response (Gopalakrishnan 2018), Metabolic Disease (Karlsson 2013)
- Each card has icon, description, key metrics (AUC, features, samples), and publication reference
- Hover effects, responsive grid layout
Priority: MEDIUM | Effort: High | Status: Done
Expanded test suite to cover critical paths — 86% coverage on testable backend code.
- Backend: 207 tests (69 new) covering all 11 routers — datasets (94%), projects (97%), sharing (99%), admin (98%), auth (100%), export (93%), health (100%), analysis (73%), data_explore (90%), samples (97%)
- Frontend: 59 tests (all new) — Vitest 4 + Vue Test Utils + jsdom: parameter definitions (13), notification utility (13), Pinia stores (14), Vue components (14), router config (5)
- Edge cases: concurrent jobs, batch runs, error recovery, export helpers
- Coverage config: `pyproject.toml` with `concurrency = ["greenlet", "thread"]` for accurate async tracking
- Remaining uncovered: DB migrations (main.py, PostgreSQL-specific), subprocess worker (worker.py), gpredomicspy internals
Priority: LOW | Effort: Low | Status: Done
Auto-generate OpenAPI documentation.
- Fixed SPA catch-all to not intercept `/docs`, `/redoc`, `/openapi.json` paths
- FastAPI Swagger UI available at `/docs`, ReDoc at `/redoc`
- OpenAPI JSON schema at `/openapi.json`
Priority: MEDIUM | Effort: Low | Status: Done
Comprehensive DEPLOYMENT.md covering all deployment scenarios.
- Docker Compose single-server deployment with production overrides
- Full environment variable reference table
- NGINX reverse proxy configuration with security headers
- SSL/TLS setup with Let's Encrypt and certbot
- PostgreSQL configuration and managed database migration
- Backup & restore procedures (database + files) with automated cron script
- Kubernetes deployment manifests with Ingress and cert-manager
- Health check and monitoring guidance
- Troubleshooting section
Priority: HIGH | Effort: Medium | Status: Done
Full system backup and restore via portable tar.gz archives.
- Backend service: JSON export of all DB tables in FK dependency order
- Archive includes: database JSON, dataset files, job results, admin defaults, manifest
- Restore modes: replace (wipe + restore) or merge (skip existing records)
- Admin UI: create backup, list/download/delete backups, upload & restore
- 5 new admin API endpoints
Priority: HIGH | Effort: Medium | Status: Done
Real-time log streaming via WebSocket with HTTP polling fallback.
- WebSocket endpoint: `/ws/jobs/{project_id}/{job_id}?token=JWT`
- Tail-f style log streaming with 0.5s refresh
- JWT authentication via query parameter
- Automatic fallback to HTTP polling if WebSocket fails
- Status change broadcasts (running → completed/failed)
Priority: MEDIUM | Effort: Low | Status: Done
Preview dataset file contents without downloading.
- Backend: CSV/TSV parsing with first N rows, column types, basic stats (min/max/mean/std)
- Frontend modal with scrollable table, sticky headers, row numbers
- Stats footer row, file metadata badges
- Preview buttons in Dataset Library file rows
Priority: MEDIUM | Effort: Low | Status: Done
Improve response times and initial load performance.
- In-memory TTL cache decorator for expensive backend endpoints
- GZip response compression middleware (min 1KB)
- Dynamic import for Plotly.js (~3MB) — loaded on demand in Results tab
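
As an illustration of the in-memory TTL cache idea, here is a minimal decorator sketch for synchronous functions. It is an assumption about how such a decorator might look, not the project's implementation; `ttl_cache` and `expensive_summary` are hypothetical names, and an async endpoint would need an awaiting variant.

```python
import time
from functools import wraps


def ttl_cache(ttl_seconds: float):
    """Cache results in memory and recompute once an entry is older than the TTL."""
    def decorator(func):
        store = {}  # key -> (expiry time, cached value)

        @wraps(func)
        def wrapper(*args, **kwargs):
            key = (args, tuple(sorted(kwargs.items())))
            now = time.monotonic()
            hit = store.get(key)
            if hit is not None and hit[0] > now:
                return hit[1]
            value = func(*args, **kwargs)
            store[key] = (now + ttl_seconds, value)
            return value

        wrapper.cache_clear = store.clear  # manual invalidation hook
        return wrapper
    return decorator


calls = []


@ttl_cache(ttl_seconds=60)
def expensive_summary(project_id: str) -> dict:
    calls.append(project_id)  # stands in for a slow computation
    return {"project": project_id, "n_jobs": 3}


expensive_summary("p1")
expensive_summary("p1")  # served from the cache
print(len(calls))        # 1
```

Arguments must be hashable for the cache key to work; unhashable inputs would need a custom key function.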
Priority: MEDIUM | Effort: Low | Status: Done
Structured error responses and user-facing notifications.
- Backend: structured `{error: {code, message}}` JSON responses for all HTTP errors
- Generic catch-all 500 handler with server-side logging
- Toast notification system: composable + ToastContainer component
- Axios response interceptor for global error display
- Exponential backoff retry utility for transient failures
Priority: HIGH | Effort: Low | Status: Done
Richer charts in the Best Model sub-tab for exploring feature contributions.
- Coefficient direction chart: horizontal bars colored by sign (green=positive, red=negative)
- Feature contribution waterfall: cumulative sorted contributions using Plotly waterfall trace
- Per-sample contribution heatmap: button-triggered, computes coefficient × feature values matrix
- All data sourced from existing results JSON — no backend changes needed
Priority: MEDIUM | Effort: Low | Status: Done
Admin-managed parameter presets available to all users.
- File-based JSON storage at `data/templates.json` (same pattern as admin defaults)
- CRUD endpoints: GET/POST/PUT/DELETE + public GET (no auth)
- Admin UI: create/delete templates from current default config
- Parameters tab: "Load Template" dropdown applies preset config values
Priority: HIGH | Effort: Medium | Status: Done
Track all user actions with timestamps, queryable by admins.
- `AuditLog` model: user_id, action, resource_type, resource_id, details (JSON), ip_address
- 14 action constants: login, register, job.launch/delete, dataset.upload/delete, project.create/delete, share.create/revoke, admin operations
- Instrumented across 6 routers (auth, analysis, datasets, projects, sharing, admin)
- Admin UI: paginated audit log table with action filter
- Migration v11
Priority: HIGH | Effort: Medium | Status: Done
Self-service password reset with optional SMTP email delivery.
- `PasswordResetToken` model with bcrypt-hashed tokens and 1-hour expiry
- `email_verified` field on User model (migration v12)
- aiosmtplib integration (optional dependency with graceful ImportError handling)
- Dev mode: returns reset token directly when no SMTP configured
- Frontend: ForgotPasswordView + ResetPasswordView with router guard
- Admin: direct password reset for any user
Priority: MEDIUM | Effort: Medium | Status: Done
Programmatic access via API keys as alternative to JWT Bearer tokens.
- `ApiKey` model with bcrypt-hashed keys, 8-char prefix, last_used_at tracking
- Keys shown only once on creation
- Dual auth: `get_current_user` accepts both `Authorization: Bearer` and `X-API-Key` header
- Frontend: create/list/revoke API keys in Profile view
- Migration v13
Priority: MEDIUM | Effort: Medium | Status: Done
HTTP POST callbacks to external URLs when jobs complete or fail.
- `Webhook` model with HMAC-SHA256 signing via secret
- Delivery with configurable retries and exponential backoff (httpx)
- CRUD + test endpoint (send test payload)
- Fired from background job runner on completion/failure
- Frontend: create/list/delete/test webhooks in Profile view
- Migration v14
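
For readers integrating a receiver, HMAC-SHA256 signing and verification can be sketched with the standard library as follows. The payload fields, the hex encoding of the signature, and the doubling retry schedule are assumptions made for illustration; the roadmap does not specify them.

```python
import hashlib
import hmac
import json


def sign_payload(secret: str, body: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of the raw request body."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, signature: str) -> bool:
    """Constant-time comparison, as a webhook receiver would do."""
    return hmac.compare_digest(sign_payload(secret, body), signature)


def retry_delays(max_retries: int, base: float = 1.0) -> list[float]:
    """Exponential backoff schedule: base, 2*base, 4*base, ..."""
    return [base * 2 ** attempt for attempt in range(max_retries)]


# Hypothetical payload; the real event names and fields may differ.
body = json.dumps({"event": "job.completed", "job_id": "abc"}, sort_keys=True).encode()
signature = sign_payload("s3cret", body)

print(verify_signature("s3cret", body, signature))         # True
print(verify_signature("s3cret", body + b" ", signature))  # False
print(retry_delays(3))                                     # [1.0, 2.0, 4.0]
```

Signing the exact bytes that are sent matters: re-serializing the JSON on the receiving side can change whitespace or key order and invalidate the signature.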
Priority: MEDIUM | Effort: Medium | Status: Done
Automatic snapshots on file changes with restore capability.
- `DatasetVersion` model: version_number, files_snapshot (JSON), created_by, note
- Auto-snapshot on file upload and file delete
- Version history endpoint with restore to any previous version
- Frontend: "History" button per dataset with version list and restore
- Migration v15
Priority: HIGH | Effort: Low | Status: Done
Per-user and per-IP rate limits using slowapi (in-memory, no Redis required).
- User-or-IP key extraction from JWT, API key prefix, or client IP
- Configurable limits: auth (10/min), API (100/min), upload (20/min), admin (30/min)
- Limiter decorators on auth, upload, and admin endpoints
- 429 toast handling in frontend Axios interceptor
- Global enable/disable via config setting
Priority: HIGH | Effort: Medium | Status: Done
Score new samples against trained models without re-training.
- Shared prediction service (`services/prediction.py`): coefficient extraction, data_type transform, score computation, threshold classification
- Upload X_valid.tsv + optional Y_valid.tsv via multipart form
- Validation metrics: AUC, accuracy, sensitivity, specificity, confusion matrix
- Per-sample prediction table with scores and classifications
- Backend: `POST /api/analysis/{id}/jobs/{jid}/validate` endpoint
- Frontend: ValidateModal in Best Model sub-tab with file upload and results display
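
The validation metrics listed above can be computed from scores and true labels as in the following sketch. It is a plain-Python illustration rather than the code in `services/prediction.py`; in particular, treating a score at or above the threshold as class 1 is an assumption, and all function names are hypothetical.

```python
def classify(scores, threshold):
    # Assumption: a score at or above the threshold is called class 1.
    return [1 if s >= threshold else 0 for s in scores]


def confusion_matrix(y_true, y_pred):
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["fn" if truth == 1 else "tn"] += 1
    return counts


def rank_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def validation_metrics(y_true, scores, threshold):
    cm = confusion_matrix(y_true, classify(scores, threshold))
    return {
        "auc": rank_auc(y_true, scores),
        "accuracy": (cm["tp"] + cm["tn"]) / len(y_true),
        "sensitivity": cm["tp"] / (cm["tp"] + cm["fn"]),
        "specificity": cm["tn"] / (cm["tn"] + cm["fp"]),
        "confusion_matrix": cm,
    }


metrics = validation_metrics(
    y_true=[1, 1, 0, 0, 1, 0],
    scores=[0.9, 0.4, 0.35, 0.1, 0.8, 0.6],
    threshold=0.5,
)
print(metrics["confusion_matrix"])  # {'tp': 2, 'fp': 1, 'tn': 2, 'fn': 1}
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formulation scales better for large validation sets.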
Priority: HIGH | Effort: Medium | Status: Done
Serve trained models as live prediction endpoints.
- `POST /api/predict/{job_id}`: JSON body with `{features: {name: [values]}, sample_names: [...]}`
- Returns scores, predicted_classes, threshold, matched/missing features
- Reuses shared `prediction.py` service from F27
- API key authentication (`X-API-Key` header) for programmatic access
- Frontend: "Prediction API" section in Best Model sub-tab with endpoint URL, curl example, and copy button
Priority: HIGH | Effort: Medium | Status: Done
Auto-generate publication-ready PDF summarizing analysis findings.
- 3-page PDF report using reportlab (`services/pdf_report.py`)
- Page 1: Performance metrics (AUC, accuracy, sensitivity, specificity), jury summary
- Page 2: Feature table with coefficients, taxonomy annotations, functional properties
- Page 3: Configuration summary, generation tracking highlights
- Backend: `GET /api/export/{pid}/jobs/{jid}/pdf` endpoint
- Frontend: "PDF Biomarker Report" option in Export dropdown
Priority: HIGH | Effort: High | Status: Done
Compare models trained on different datasets for the same phenotype.
- Backend: `GET /api/meta-analysis/searchable-jobs`: cross-project completed job search
- Backend: `POST /api/meta-analysis/compare`: feature overlap, concordance, meta-AUC
- Frontend: new "Meta-Analysis" top-level view with job picker (chip-based, 2-10 jobs)
- Metrics comparison table with best-value highlighting
- Feature overlap chart (horizontal bars by cohort count)
- Concordance matrix: feature × job grid colored by coefficient sign (green/red/grey)
- Meta-AUC card (weighted average across cohorts)
- "Meta-Analysis" link in navbar
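
The three comparison quantities (feature overlap by cohort count, sign concordance, and a weighted meta-AUC) can be illustrated as below. The roadmap does not say which weights the meta-AUC uses, so weighting by cohort sample size is an assumption here, and the function names and toy coefficients are hypothetical.

```python
from collections import Counter


def feature_overlap(models: dict[str, dict[str, float]]) -> Counter:
    """Count in how many jobs each feature appears."""
    return Counter(feature for coefs in models.values() for feature in coefs)


def sign_concordance(models: dict[str, dict[str, float]]) -> dict[str, bool]:
    """True when a feature has the same coefficient sign in every job that uses it."""
    signs: dict[str, set] = {}
    for coefs in models.values():
        for feature, coef in coefs.items():
            signs.setdefault(feature, set()).add(coef > 0)
    return {feature: len(seen) == 1 for feature, seen in signs.items()}


def meta_auc(aucs: list[float], weights: list[float]) -> float:
    """Weighted average of per-cohort AUCs (weights assumed to be sample sizes)."""
    return sum(a * w for a, w in zip(aucs, weights)) / sum(weights)


models = {
    "jobA": {"f1": 1.0, "f2": -1.0},
    "jobB": {"f1": 1.0, "f3": 1.0},
    "jobC": {"f1": -1.0, "f2": -1.0},
}
overlap = feature_overlap(models)
concordant = sign_concordance(models)
combined = meta_auc([0.8, 0.9, 0.7], [100, 200, 100])

print(overlap["f1"], overlap["f2"], overlap["f3"])  # 3 2 1
print(concordant["f1"], concordant["f2"])           # False True
```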
Priority: MEDIUM | Effort: Medium | Status: Done
Per-sample feature contribution breakdown for model interpretability.
- `useShapValues` composable: `computeShapMatrix()` computes SHAP values (feature × coef × sample value)
- Beeswarm plot: horizontal strip per feature, x=SHAP value, color=feature value (Viridis)
- Force plot: waterfall for single sample showing cumulative feature contributions
- Dependence plot: scatter of feature value vs SHAP value, colored by class
- Feature importance ordering by mean |SHAP|
- Entirely client-side using existing barcode-data API
- "Feature Explanations" section in Best Model sub-tab with 3 switchable views
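
The per-sample contribution and the mean |SHAP| ordering can be sketched as follows. This is a Python transcription of the idea only (coefficient times sample value, as stated above), not the `useShapValues` composable itself; the function names and the toy data are hypothetical.

```python
def contribution_matrix(
    coefs: dict[str, float],
    samples: dict[str, dict[str, float]],
) -> dict[str, dict[str, float]]:
    """feature -> sample -> coefficient * feature value (missing values count as 0)."""
    return {
        feature: {name: coef * values.get(feature, 0.0) for name, values in samples.items()}
        for feature, coef in coefs.items()
    }


def rank_by_mean_abs(matrix: dict[str, dict[str, float]]) -> list[str]:
    """Order features by mean absolute contribution, largest first."""
    def mean_abs(row: dict[str, float]) -> float:
        return sum(abs(v) for v in row.values()) / len(row) if row else 0.0

    return sorted(matrix, key=lambda feature: mean_abs(matrix[feature]), reverse=True)


coefs = {"msp_A": 1.0, "msp_B": -1.0}
samples = {
    "s1": {"msp_A": 0.2, "msp_B": 0.5},
    "s2": {"msp_A": 0.1, "msp_B": 0.7},
}
matrix = contribution_matrix(coefs, samples)
print(rank_by_mean_abs(matrix))  # ['msp_B', 'msp_A']
```

One row of this matrix is what a beeswarm strip plots for a feature, and one column is what a single-sample force plot accumulates.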
Priority: MEDIUM | Effort: Low | Status: Done
Threaded notes/discussion per project.
- `ProjectComment` model: id, project_id, user_id, content (Text), created_at, updated_at
- CRUD endpoints: create, list, update (author only), delete (author or project owner)
- CommentsSidebar component: slide-out panel with comment list, add/edit/delete
- User initials avatar, timestamps, Ctrl+Enter submit shortcut
- "Notes" button in project dashboard header
- Migration v17
Priority: MEDIUM | Effort: Low | Status: Done
Share results via read-only public links (no login required).
- `PublicShare` model: id, project_id, token (64-char, unique indexed), created_by, expires_at, is_active
- Authenticated endpoints: create, list, revoke links (project owner only)
- Unauthenticated endpoints: `GET /api/public/{token}` for project info + jobs, `GET /api/public/{token}/jobs/{jid}/results` for full results
- PublicShareModal: create links with expiry options (7/30/90 days or never), copy URL, revoke
- PublicShareView: guest-accessible page with project summary, job cards, metrics grid, feature display
- "Public Link" button in project dashboard header
- Router guard allows both authenticated and unauthenticated access
- Migration v18
Priority: MEDIUM | Effort: Low | Status: Done
Global view of all projects with summary statistics and recent activity.
- Backend: `GET /api/dashboard/`: aggregates counts (projects, datasets, running/completed/failed jobs, shared)
- Active jobs section with status badges and links to project console
- Recent completions mini-table (project name, AUC, k, date)
- Activity feed from audit_logs with action icons, resource links, timeAgo formatting
- Summary cards grid: Projects, Datasets, Running, Completed, Failed, Shared
- "Dashboard" link in navbar (before Projects)
Priority: MEDIUM | Effort: Medium | Status: Done
Automated end-to-end testing using Playwright.
- `playwright.config.mjs` at repo root with dark theme, 1440×900 viewport
- 10 E2E test cases in `tests/e2e/e2e.spec.mjs`:
  - Landing page loads and shows brand
  - User registration flow
  - Login with credentials
  - Create project
  - Upload dataset files
  - Dashboard displays summary cards
  - Meta-analysis page accessible
  - Public share page loads for guests
  - Health API endpoint returns ok
  - Swagger docs accessible
- Root `package.json` with `test:e2e` script and `@playwright/test` dependency
- CI integration: `e2e-test` job in GitHub Actions after Docker build
Priority: LOW | Effort: Low | Status: Done
Auto-deploy on tag push with container registry publishing.
- `.github/workflows/release.yml` triggered on `v*` tags
- Multi-repo checkout (predomicsapp, gpredomics, gpredomicspy)
- Login to GHCR, build and push with `docker/build-push-action`
- Tags: `ghcr.io/{owner}/predomicsapp:latest` + `:{version}`
- GitHub Actions cache (`type=gha`) for faster rebuilds
- OCI metadata labels in Dockerfile (title, description, source, license)
Priority: LOW | Effort: High | Status: Done
Multi-language support starting with French.
- `vue-i18n@9` integration with JSON locale files
- `frontend/src/i18n/` module with `createI18n()` setup
- English (`en.json`) and French (`fr.json`) locale files (~100 strings each)
- Namespaces: nav, home, login, dashboard, projects, results, meta, common
- Language selector button (EN/FR toggle) in navbar next to theme toggle
- `localStorage.locale` persistence across sessions
- Navbar links translated via `$t('nav.dashboard')` etc.
- Backend: `Accept-Language` header parsing in `core/errors.py` for translated error messages
Priority: MEDIUM | Effort: Medium | Status: Done
Add t-SNE and UMAP as alternative dimensionality reduction methods alongside PCoA.
- Unified `/api/data-explore/{pid}/ordination` endpoint with `method` param (pcoa, tsne, umap)
- Backend: `compute_tsne()` and `compute_umap()` in `data_analysis.py` using precomputed distance matrices
- Method-specific parameters: perplexity (t-SNE), n_neighbors/min_dist (UMAP)
- Shared `_load_and_prepare()` helper with subsampling (max 1000 samples)
- Dynamic axis labels: "PCo1 (X%)" for PCoA, "t-SNE 1" / "UMAP 1" for others
- PERMANOVA and 95% confidence ellipses for all methods
- Added to DataTab, ResultsTab (Population), and DataExploreTab
- Dependencies: `scikit-learn>=1.0`, `umap-learn>=0.5`
Priority: MEDIUM | Effort: Low | Status: Done
Add the Blaise confidence interval method as an alternative to the standard Wald CI for FBM filtering.
- Dropdown in Population and Co-Presence tabs to choose between Standard CI and Blaise CI
- Blaise formula: `threshold = r - (0.5/n + 1.96 * sqrt(r*(1-r)/n))`, which adds a continuity correction
- Shared `fbmMethod` ref between Population and Co-Presence tabs
- i18n: "Standard CI" / "Blaise CI" in EN/FR
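
The two thresholds can be compared numerically with a short sketch. It assumes that `r` is a proportion in [0, 1] (for example the best model's accuracy) and `n` the number of samples, which the roadmap does not spell out; the function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def standard_threshold(r: float, n: int) -> float:
    """Lower bound of the Wald interval around r."""
    return r - Z_95 * math.sqrt(r * (1 - r) / n)


def blaise_threshold(r: float, n: int) -> float:
    """Same bound with the extra 0.5/n continuity correction from the formula above."""
    return r - (0.5 / n + Z_95 * math.sqrt(r * (1 - r) / n))


r, n = 0.9, 100
print(round(standard_threshold(r, n), 4))  # 0.8412
print(round(blaise_threshold(r, n), 4))    # 0.8362
```

The correction lowers the threshold by exactly 0.5/n, so the Blaise variant admits slightly more models into the family, and the difference shrinks as the sample size grows.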
Priority: MEDIUM | Effort: Medium | Status: Done
Enhance co-presence analysis with model fitness weighting (Shasha Cui internship, 2017).
- Toggle checkbox in Co-Presence controls for accuracy-weighted mode
- Per-node mean accuracy: `meanAccuracy[feature] = sum(model.fit) / count`
- Per-edge accuracy-weighted counts: `wObs = sum(model.fit for co-occurring pairs)`
- Weighted expected: `wExp = (accSum_i * accSum_j) / totalAccSum`
- Weighted ratio: `wRatio = wObs / wExp`
- Prevalence chart: secondary axis with mean accuracy diamonds
- Network: node size by accuracy, edge width by weighted ratio, accuracy in hover
- Stats table: "W. Obs" and "W. Ratio" columns when weighted mode is on
- Hypergeometric test unchanged (stays on binary integer counts for statistical validity)
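
The weighted quantities above can be worked through on a toy population. The sketch assumes that `accSum_i` is the sum of `fit` over models containing feature i and that `totalAccSum` is the sum of `fit` over all models; the second point is an interpretation, not something the roadmap states. The function name and data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def weighted_copresence(models: list[tuple[float, set[str]]]):
    """models: (fit, features) pairs. Returns (mean accuracy per feature, edge stats)."""
    acc_sum = defaultdict(float)  # accSum_i: summed fit of models containing feature i
    count = defaultdict(int)
    w_obs = defaultdict(float)    # summed fit of models where both features co-occur
    total_acc = 0.0               # assumed meaning of totalAccSum

    for fit, features in models:
        total_acc += fit
        for feature in features:
            acc_sum[feature] += fit
            count[feature] += 1
        for a, b in combinations(sorted(features), 2):
            w_obs[(a, b)] += fit

    mean_accuracy = {f: acc_sum[f] / count[f] for f in acc_sum}
    edges = {}
    for (a, b), obs in w_obs.items():
        expected = acc_sum[a] * acc_sum[b] / total_acc if total_acc else 0.0
        if expected > 0:
            edges[(a, b)] = {"w_obs": obs, "w_exp": expected, "w_ratio": obs / expected}
    return mean_accuracy, edges


population = [
    (0.9, {"A", "B"}),
    (0.8, {"A", "C"}),
    (0.7, {"B", "C"}),
    (0.6, {"A", "B"}),
]
mean_accuracy, edges = weighted_copresence(population)
print(round(edges[("A", "B")]["w_obs"], 3))    # 1.5
print(round(edges[("A", "B")]["w_ratio"], 3))  # 0.889
```

A ratio below 1 means the pair co-occurs in accurate models less than its individual accuracy sums would suggest; significance still comes from the unweighted hypergeometric test.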
Priority: MEDIUM | Effort: Medium | Status: Done
Bookmark individual models from any job's population into a persistent curated collection.
- `useModelBasket.js` composable with localStorage persistence per project (max 50 models)
- Star toggle (★/☆) in Population table to add/remove models
- Basket sub-tab with count badge in Results tab nav bar
- Bookmarked models table with expandable feature chips, remove button
- Metrics comparison grouped bar chart (AUC, Accuracy, Sensitivity, Specificity per model)
- Feature overlap horizontal stacked bar chart with positive/negative direction
- Consensus features display (features common to all basket models)
- Feature coefficient heatmap (features × models, blue-white-red colorscale)
- Cross-job support: basket items store data snapshots, survive job deletion
Priority: MEDIUM | Effort: Low | Status: Done
Add dropdown to choose between community detection methods in the Ecosystem network.
- Backend: `_detect_communities(G, method, seed)` dispatcher in `coabundance.py`
- Supported algorithms: Louvain (default), Greedy modularity maximization, Label propagation
- `community_method` parameter added to `compute_coabundance_network()` and cache key
- Endpoint validation: rejects unknown methods with HTTP 400
- Frontend: dropdown in EcosystemTab controls, auto-refetch on change
- Graceful fallback to singleton communities on algorithm failure
- i18n: "Community" / "Greedy modularity" / "Label propagation" in EN/FR
Priority: MEDIUM | Effort: Medium | Status: Done
Select a network module and zoom into a higher-resolution sub-network for detailed niche analysis.
- Zoom button per module in the sidebar (magnifying glass icon)
- Extracts member species, re-calls `/coabundance-network` with `features=members` and halved thresholds
- `displayedNetwork` computed property for seamless switching between full and zoomed views
- Breadcrumb navigation bar showing module ID, member count, and "Back to full network" button
- `zoomOut()` restores the full network; parameter changes auto-clear zoom state
- Updated `renderNetwork()` to use `displayedNetwork.value` throughout
- Stats bar, chart visibility, and taxonomy legend all react to zoom state
- i18n: "Zoom into this module" / "Back to full network" in EN/FR
Priority: MEDIUM | Effort: Medium | Status: Done
QC scatter plot comparing per-class Spearman correlations to detect prevalence filtering artifacts.
- Backend: `compute_aberrant_correlations()` in `data_analysis.py`: computes full Spearman correlation matrices per class, extracts upper-triangle pairs
- Prevalence filtering (default 30%), subsampling to 500 features if needed, max 5000 pairs
- Returns `{pairs: [{f1, f2, r0, r1}], n_features, n_pairs, feature_names}`
- New endpoint: `GET /{project_id}/aberrant-correlations` with `min_prevalence_pct` param
- Frontend: new "Aberrant Correlation Diagnostic" sub-tab in DataExploreTab
- Plotly scatter: x = Class 0 rho, y = Class 1 rho, continuous colorscale by deviation from diagonal
- Dashed y=x reference line, equal aspect ratio axes [-1.05, 1.05]
- i18n: aberrant correlation labels in EN/FR
Priority: HIGH | Effort: Medium | Status: Done
Side-by-side patient vs. control co-abundance ecosystems highlighting common and condition-specific interactions.
- New endpoint: `GET /{project_id}/dual-network`: computes both class networks in one call
- Edge comparison: identifies common edges (present in both), class-0-specific, and class-1-specific
- Each edge annotated with `shared` boolean flag for coloring
- Frontend: "Dual network" checkbox toggle in EcosystemTab controls
- Side-by-side Plotly panels in a CSS grid (responsive: stacks on mobile)
- Comparison stats header: common edge count + specific counts per class
- Edge coloring: shared = gray (subtle), class-0-specific = teal, class-1-specific = orange
- Auto-refetch on parameter changes when dual mode is active
- i18n: "Dual network" / "Common edges" / "Controls" / "Patients" in EN/FR
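
The edge comparison reduces to set operations once edges are normalized. The sketch below assumes undirected edges given as name pairs; the function names and the output field names other than `shared` are hypothetical.

```python
def edge_key(a: str, b: str) -> tuple[str, str]:
    """Normalize an undirected edge so (a, b) and (b, a) compare equal."""
    return (a, b) if a <= b else (b, a)


def compare_networks(edges0, edges1):
    """Split the edges of two class networks into common and class-specific sets."""
    set0 = {edge_key(a, b) for a, b in edges0}
    set1 = {edge_key(a, b) for a, b in edges1}
    return {
        "common": set0 & set1,
        "class0_only": set0 - set1,
        "class1_only": set1 - set0,
    }


def annotate(edges, common):
    """Attach the shared flag used for edge coloring."""
    return [
        {"source": a, "target": b, "shared": edge_key(a, b) in common}
        for a, b in edges
    ]


controls = [("sp1", "sp2"), ("sp2", "sp3")]
patients = [("sp2", "sp1"), ("sp3", "sp4")]
result = compare_networks(controls, patients)

print(sorted(result["common"]))                             # [('sp1', 'sp2')]
print([e["shared"] for e in annotate(patients, result["common"])])  # [True, False]
```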
Priority: MEDIUM | Effort: Medium | Status: Done
Visualize how network modules reorganize between patient and control ecosystems via an alluvial/Sankey diagram.
- Backend: module correspondence computation added to `get_dual_network` endpoint in `data_explore.py`
- Species→module mapping per class, intersection of common species, flow counting between module pairs
- `sankey_links` array of `{source_module, target_module, value}` in dual-network response
- Frontend: Plotly Sankey trace in `EcosystemTab.vue` below dual-network panels
- Node labels combine class label, module ID, and dominant phylum
- Link colors: semi-transparent source module color (handles both rgb() and hex)
- Rendered automatically when dual mode is active and sankey_links has data
- i18n: "Module Correspondence" / "Correspondance des modules" in EN/FR
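
The flow counting behind the Sankey links can be sketched as follows: map each species to its module in each class, keep the species present in both, and count how many move between each module pair. Only the link field names come from the description above; the function name and the toy module assignments are hypothetical.

```python
from collections import Counter


def sankey_links(modules0: dict[str, int], modules1: dict[str, int]) -> list[dict]:
    """Count species shared by both networks for each (class-0 module, class-1 module) pair."""
    common_species = modules0.keys() & modules1.keys()
    flows = Counter((modules0[s], modules1[s]) for s in common_species)
    return [
        {"source_module": src, "target_module": dst, "value": n}
        for (src, dst), n in sorted(flows.items())
    ]


controls = {"sp1": 0, "sp2": 0, "sp3": 1, "sp4": 1}
patients = {"sp1": 0, "sp2": 1, "sp3": 1, "sp5": 0}
links = sankey_links(controls, patients)
for link in links:
    print(link)
# {'source_module': 0, 'target_module': 0, 'value': 1}
# {'source_module': 0, 'target_module': 1, 'value': 1}
# {'source_module': 1, 'target_module': 1, 'value': 1}
```

Species found in only one of the two networks (`sp4` and `sp5` here) contribute no flow, which is why the link values can sum to less than either network's size.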
Priority: MEDIUM | Effort: Medium | Status: Done
Upload and display external network JSON files (e.g. from SCAPIS) with FBM signature annotation.
- Backend: 4 new endpoints in `data_explore.py`:
  - `POST /{project_id}/external-networks`: upload JSON (max 10MB, validates nodes/edges arrays)
  - `GET /{project_id}/external-networks`: list uploaded networks
  - `GET /{project_id}/external-networks/{network_id}`: retrieve full network data
  - `DELETE /{project_id}/external-networks/{network_id}`: delete (editor role)
- Storage: `data/projects/{project_id}/networks/{uuid}.json`
- Frontend: external network section in EcosystemTab with select dropdown, upload button, delete button
- Rendering: uses fixed positions (x, y from JSON) if available, organic layout fallback
- FBM annotation: species matching FBM population features highlighted with gold color, diamond shape, orange border
- Edge styling: positive = solid gray, negative = dashed red, width scaled by |weight|
- i18n: "External Network" / "Réseau externe" in EN/FR
Priority: MEDIUM | Effort: Medium | Status: Done
Filter the Family of Best Models by network module membership to keep only ecologically coherent models.
- Backend: `POST /{project_id}/fbm-module-filter` endpoint in `data_explore.py`
- Loads job results population, computes co-abundance network to get module assignments
- Per-model metrics: module coverage (fraction in selected modules), module coherence (fraction in same module)
- Filters models with ≥50% features in selected modules, sorted by coverage then fitness
- Frontend: "FBM Module Filter" section in EcosystemTab with module checkboxes, apply/clear buttons
- Results table with rank, fit, k, language, coverage bar, coherence, in-module count
- CSV export of filtered models
- Event integration: emits `module-filter-applied` to parent ResultsTab
- Population table: blue highlight + "M" badge for filtered models
- i18n: 12 keys in EN/FR (filter title, description, coverage, coherence, export, etc.)
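
The coverage and coherence metrics and the 50% filter can be sketched as below. Two points are interpretations rather than facts from the roadmap: coherence is taken as the share of a model's features that fall in its most common module, and sorting is descending on coverage and then fitness. All names are hypothetical.

```python
from collections import Counter


def module_metrics(features, feature_module, selected):
    """Coverage: share of features in selected modules. Coherence: share in the top module."""
    n = len(features)
    in_selected = sum(1 for f in features if feature_module.get(f) in selected)
    per_module = Counter(feature_module[f] for f in features if f in feature_module)
    coherence = max(per_module.values()) / n if per_module else 0.0
    return {"coverage": in_selected / n, "coherence": coherence, "in_module": in_selected}


def filter_models(models, feature_module, selected, min_coverage=0.5):
    kept = []
    for model in models:
        if not model["features"]:
            continue  # an empty model has no defined coverage
        metrics = module_metrics(model["features"], feature_module, selected)
        if metrics["coverage"] >= min_coverage:
            kept.append({**model, **metrics})
    # Assumed ordering: highest coverage first, ties broken by fitness.
    kept.sort(key=lambda m: (m["coverage"], m["fit"]), reverse=True)
    return kept


feature_module = {"a": 1, "b": 1, "c": 2, "d": 3}
models = [
    {"fit": 0.90, "features": ["a", "b", "c"]},
    {"fit": 0.95, "features": ["c", "d"]},
    {"fit": 0.80, "features": ["a", "b"]},
]
kept = filter_models(models, feature_module, selected={1})
print([m["fit"] for m in kept])  # [0.8, 0.9]
```

Note that the best-fitting model is dropped in this toy example because none of its features belong to the selected module, which is the intended effect of the filter.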
Priority: MEDIUM | Effort: High | Status: Done
Curated, searchable database of published biomarker signatures for cross-study comparison and reuse.
- Backend: new `signature_zoo.py` router with 7 endpoints:
  - `GET /api/signature-zoo/`: list all signatures with optional disease/method/search filters
  - `GET /api/signature-zoo/compare`: compare 2+ signatures (Jaccard overlap, performance, common features)
  - `GET /api/signature-zoo/{id}`: get single signature
  - `POST /api/signature-zoo/`: create signature (auth required)
  - `PUT /api/signature-zoo/{id}`: update (auth required)
  - `DELETE /api/signature-zoo/{id}`: delete (admin only)
  - `POST /api/signature-zoo/import-from-job`: create from completed job's best model
- File-based JSON storage at `data/signature_zoo.json` with fcntl file locking
- 4 seed signatures: Cirrhosis (Qin 2014), CRC (Zeller 2014), Obesity (Le Chatelier 2013), T2D (Karlsson 2013)
- Frontend: `SignatureZooView.vue` with card grid, filters (disease/method/text), detail modal, compare mode
- Compare mode: Plotly performance bars, Jaccard overlap matrix, common features, feature presence chart
- Route `/signature-zoo` in router.js, navbar link in App.vue
- 35 i18n keys in EN/FR
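
The Jaccard overlap matrix and the common-feature set used in compare mode can be illustrated with plain sets. The signature names and feature identifiers below are made up for the example, and the function names are hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by the size of the union (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def compare_signatures(signatures: dict[str, set[str]]):
    """Return the pairwise Jaccard matrix and the features common to all signatures."""
    names = sorted(signatures)
    matrix = {
        (x, y): jaccard(signatures[x], signatures[y])
        for x in names
        for y in names
    }
    common = set.intersection(*signatures.values()) if signatures else set()
    return matrix, common


# Made-up feature sets, for illustration only.
signatures = {
    "sig_1": {"msp_01", "msp_02", "msp_03"},
    "sig_2": {"msp_02", "msp_03", "msp_04"},
    "sig_3": {"msp_03", "msp_05"},
}
matrix, common = compare_signatures(signatures)
print(matrix[("sig_1", "sig_2")])  # 0.5
print(common)                      # {'msp_03'}
```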
Updated: 2026-03-31
50. Regression Support ✅ gpredomics#15, gpredomics#79
Priority: HIGH | Effort: High | Status: Done
Data.y changed from Vec to Vec. 4 regression fitness functions (spearman, pearson, rmse, mutual_information). Regression-aware display, test metrics, CV fold splitting, voting skip.
51. 4 New Optimization Algorithms ✅ gpredomics#61, #62, #65, #54
Priority: HIGH | Effort: High | Status: Done
SA (Simulated Annealing), ILS (Iterated Local Search), LASSO/Elastic Net, ACO (Ant Colony Optimization). All integrated in web app.
52. MCMC Gibbs Variable Selection ✅ gpredomics#73, #70
Priority: HIGH | Effort: High | Status: Done
Joint feature+coefficient sampling replacing SBS. LASSO prescreen, parallel chains, golden section optimizer. MCMC runs went from OOM or minutes to 5-11 seconds.
53. Code Audit & Fixes ✅ gpredomics#75
Priority: HIGH | Effort: Medium | Status: Done
12 critical + 14 medium issues fixed. HashMap→BTreeMap for determinism. ClassificationMetrics refactor. 280 lines of dead code removed.
54. Metadata Upload & Variable Selection ✅ predomicsapp#1
Priority: HIGH | Effort: Medium | Status: Done
Upload metadata TSV, select numeric column as regression y, auto-switch to regression mode. Backend APIs for metadata column parsing and y extraction.
Priority: MEDIUM | Effort: Low | Status: Done
Exposed prevalence, adj_pvalue, selection method in Parameters tab. Documented adaptive FDR relaxation.
Priority: MEDIUM | Effort: Low | Status: Done
When strict FDR selects < 10 features, alpha relaxes progressively (0.05→0.1→0.2→0.5) with warnings. Fallback to top 10 by raw p-value.
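
The relaxation schedule can be sketched as a loop over alpha levels. The sketch assumes adjusted p-values are already available and that a feature passes when its adjusted p-value is at or below alpha; both the comparison and the function name are assumptions, not taken from the gpredomics code.

```python
import warnings

ALPHAS = (0.05, 0.1, 0.2, 0.5)  # relaxation schedule from the description above
MIN_FEATURES = 10


def select_features(raw_p: dict[str, float], adj_p: dict[str, float]) -> list[str]:
    """Relax alpha until at least MIN_FEATURES pass, else fall back to raw p-values."""
    for alpha in ALPHAS:
        selected = [feature for feature, p in adj_p.items() if p <= alpha]
        if len(selected) >= MIN_FEATURES:
            return selected
        warnings.warn(f"only {len(selected)} features at alpha={alpha}; relaxing")
    # Fallback: the top MIN_FEATURES features ranked by raw p-value.
    return sorted(raw_p, key=raw_p.get)[:MIN_FEATURES]


raw = {f"f{i}": 0.001 * (i + 1) for i in range(12)}
adj = {name: 0.15 for name in raw}  # nothing passes at 0.05 or 0.1

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    chosen = select_features(raw, adj)

print(len(chosen), len(caught))  # 12 2
```

Here two warnings are emitted (at 0.05 and 0.1) before all 12 features pass at alpha 0.2.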
57. Wetlab Protocol Dataset ✅ gpredomics#60
Priority: MEDIUM | Effort: Low | Status: Done
Paired study (459 subjects, 2 extraction protocols). 1,981 MSPs. Subject-level train/test split. Metadata with age, sex, BMI, Gram+/- counts, gene_count.
Priority: MEDIUM | Effort: Low | Status: Done
Bundled demos in samples/ (baked in image). User workspace in data/ (persistent volume for K8s).
Priority: MEDIUM | Effort: Medium | Status: Done
17-page PDF documentation. Vignette tutorial. All 7 algorithm docs with references. Fully documented param.yaml.
60. Multi-user Workspace Management predomicsapp#4
Priority: HIGH | Effort: High | Status: Open
Job concurrency limits (semaphore), per-user disk quotas, job timeouts, dataset deduplication, admin dashboard.
61. Data Scanning Console Feedback predomicsapp#5
Priority: MEDIUM | Effort: Low | Status: Open
Show scanning progress during dataset load: feature count, class distribution, prevalence stats, warnings.
62. Optuna Hyperparameter Optimization gpredomics#77
Priority: MEDIUM | Effort: Medium | Status: Open
Bayesian hyperparameter tuning via Optuna in gpredomicspy. Search space for k_penalty, population_size, algorithm choice, etc.
63. Multiclass Classification (OVO/OVA) gpredomics#52
Priority: HIGH | Effort: High | Status: Open
One-vs-All and One-vs-One strategies for multi-class problems. Orchestrate K binary gpredomics runs.
64. Clinical Data Integration gpredomics#51
Priority: MEDIUM | Effort: Medium | Status: Open
Stacking, calibration, stratification of omics scores with clinical variables.
65. MCMC SBS Fix gpredomics#81
Priority: MEDIUM | Effort: Medium | Status: Open
SBS Bayesian evaluation produces inverted AUC after optimization changes.
66. Auto-discover Feature Weights gpredomics#78
Priority: LOW | Effort: Medium | Status: Open
Compare feature distributions between folds to auto-discover sampling weights.
67. More Heuristics gpredomics#63, #64, #66
Priority: LOW | Effort: Medium | Status: Open
PSO (Particle Swarm), EDA/UMDA, Bayesian Optimization.
68. Island Model GA gpredomics#33
Priority: LOW | Effort: High | Status: Open
Separate environments with periodic migration for diversity preservation.