Releases: madgik/exaflow
1.0.0
Exaflow 1.0.0 Release Notes
Project & Architecture
- Exareme2 → Exaflow rename:
  The project has been renamed to Exaflow, reflecting its evolution into an umbrella orchestration engine. Exaflow now orchestrates federated workflows across Exareme3 and Flower, clearly separating orchestration from the execution and learning backends.
- Ecosystem restructuring:
  Established a modular architecture where:
  - Exaflow acts as the orchestration layer
  - Exareme3 provides the framework for federated learning with the aggregation-server paradigm
  - Flower operates in parallel to Exareme3 as a standalone framework
  This restructuring improves separation of concerns and long-term extensibility.
Framework Migration (Exareme2 → Exareme3)
- Database engine migration:
  Deprecated MonetDB in favor of DuckDB. DuckDB simplifies deployment, reduces operational overhead, and provides better support for embedded analytical workloads in federated nodes.
- Messaging layer replacement:
  Removed Celery + RabbitMQ in favor of a gRPC-based communication layer. This introduces strongly typed APIs, lower-latency communication, and a more maintainable distributed architecture.
- Federated topology redesign:
  Replaced the local-global paradigm with a dedicated aggregation server model. This clarifies node responsibilities, improves scalability, and aligns the platform with modern federated learning deployment patterns.
Federated Algorithm Library
- Core algorithm extraction:
  Extracted all core federated algorithm logic into a standalone federated learning library. This decouples algorithm implementation from the orchestration and execution layers.
- sklearn-style interface:
  The new federated library follows a scikit-learn-like API (`fit`, `predict`, `transform`), enabling:
  - Easier algorithm development
  - Cleaner testing and benchmarking
  - Reusability outside Exaflow
  - Independent versioning and evolution
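A minimal sketch of what such a scikit-learn-style contract could look like for a federated estimator. The class name and the partial-aggregate input shape are illustrative assumptions, not the library's actual API; only the `fit`/`predict`/`transform` convention comes from the notes:

```python
class FederatedMean:
    """Illustrative estimator following the sklearn fit/transform convention.

    `fit` consumes per-node columns and reduces them to (sum, count)
    partials, mirroring how a federated library can stay sklearn-compatible
    while never pooling raw records in one place.
    """

    def fit(self, node_columns, y=None):
        # Each node contributes only privacy-preserving partial aggregates.
        partials = [(sum(col), len(col)) for col in node_columns]
        total = sum(s for s, _ in partials)
        count = sum(c for _, c in partials)
        self.mean_ = total / count  # trailing underscore: learned attribute
        return self

    def transform(self, values):
        # Center new values around the fitted global mean.
        return [v - self.mean_ for v in values]
```

Following the scikit-learn convention of returning `self` from `fit` and storing learned state in trailing-underscore attributes keeps such estimators easy to test and benchmark in isolation.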
- Improved modularity:
  The separation between the orchestration, execution framework, and algorithm layers significantly improves the maintainability and extensibility of the platform.
0.28.0
Exareme2 0.28.0 Release Notes
Controller & API
- Datasets → Variables endpoint: Added an endpoint that exposes the available variables per dataset. This helps clients/UX discover columns/fields dynamically and simplifies validation of algorithm parameters that depend on dataset schema. (PR #528)
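As a sketch of how a client could consume such a per-dataset variables payload to validate algorithm parameters, assuming a mapping of dataset name to available variables (the function and payload shape below are hypothetical, not the actual endpoint contract):

```python
def validate_algorithm_variables(requested, dataset_variables, datasets):
    """Report variables that are missing from any of the selected datasets.

    `dataset_variables` mimics a variables-per-dataset payload:
    dataset name -> set of available variable names.
    """
    missing = {
        ds: sorted(set(requested) - dataset_variables.get(ds, set()))
        for ds in datasets
    }
    # Keep only datasets that actually lack something.
    return {ds: vars_ for ds, vars_ in missing.items() if vars_}

# Hypothetical schemas: 'updrs' exists in 'ppmi' but not in 'edsd'.
schemas = {
    "ppmi": {"age", "gender", "updrs"},
    "edsd": {"age", "gender"},
}
errors = validate_algorithm_variables(["age", "updrs"], schemas, ["ppmi", "edsd"])
```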
Deployment / Invoke
- Faster, more resilient data loading: Reworked the `tasks.py` data-loading flow used by `inv load-data`. Improves logging and robustness when preparing/combining datasets across workers; handles partial failures more gracefully. (PR #525)
CI / Release Tooling
0.27.0
Exareme2 0.27.0 Release Notes
Kubernetes / Deployment
- Managed clusters: Integrated Submariner and refreshed managed-cluster configs. Updated the `StorageClass` from `ceph-corbo-cephfs` to `ceph-corbo-cephfs-retain` to preserve data on restarts. Simplified Helm/K8s templates (cleaner `if` usage). The controller workers' DNS field now supports multiple entries. The Aggregation Server can be enabled/disabled via a flag. (PR #520)
Controller & API
- Toggle deployments:
  Introduced a `ComponentType` enum (`AGGREGATION_SERVER`, `MONETDB`, `FLOWER`) to centralize supported components.
  Controllers/clients now filter available algorithms based on the active component(s).
  (PR #518)
Algorithms
- Deployment targeting: Algorithm specs must now explicitly declare which `ComponentType`(s) they support. Specs without this field are considered to support none and won't be exposed. (PR #518)
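A rough reconstruction of the described behavior. Only the enum members come from the notes; the spec shape and the exact filtering rule (all declared components must be active) are assumptions for illustration:

```python
from enum import Enum

class ComponentType(Enum):
    """Centralizes the components a deployment may enable."""
    AGGREGATION_SERVER = "aggregation_server"
    MONETDB = "monetdb"
    FLOWER = "flower"

def available_algorithms(specs, active_components):
    """Expose only algorithms whose declared components are all active.

    Specs that declare no component support none (per the release note)
    and are never exposed.
    """
    active = set(active_components)
    return [
        name for name, required in specs.items()
        if required and set(required) <= active
    ]

# Hypothetical spec registry.
specs = {
    "pca": [ComponentType.MONETDB],
    "xgboost": [ComponentType.FLOWER],
    "anova": [],  # missing declaration -> never exposed
}
```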
Exaflow Algorithm Engine
- Aggregation Server can be toggled on/off at deploy time to fit the cluster topology (managed/meshed networking via Submariner). (PR #520)
CI / Tooling
0.26.0
Exareme2 0.26.0 Release Notes
Controller & API
- Refactored controller execution: generic (non-algorithm) pieces stay in the controller; algorithm-specific logic moved into dedicated strategies chosen by a factory.
  `algorithm_type` was removed from the algorithm request payload; update any clients accordingly.
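The strategy/factory split might be sketched like this. Class names and the algorithm-to-strategy registry are invented for illustration; the notes only establish that a factory picks an algorithm-specific strategy and that clients no longer send `algorithm_type`:

```python
class ExecutionStrategy:
    """Base for algorithm-specific execution logic (illustrative)."""
    def run(self, request):
        raise NotImplementedError

class ExaflowStrategy(ExecutionStrategy):
    def run(self, request):
        return f"exaflow:{request['algorithm']}"

class FlowerStrategy(ExecutionStrategy):
    def run(self, request):
        return f"flower:{request['algorithm']}"

STRATEGIES = {"exaflow": ExaflowStrategy, "flower": FlowerStrategy}

def strategy_factory(algorithm_name, registry=None):
    """Resolve the strategy from server-side knowledge of the algorithm,
    instead of trusting a client-supplied algorithm_type."""
    registry = registry or {"xgboost": "flower", "pca": "exaflow"}
    return STRATEGIES[registry[algorithm_name]]()
```

Resolving the strategy server-side keeps the request payload smaller and prevents a client from routing an algorithm to the wrong execution backend.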
Exaflow Algorithm Engine
- Integrated an Aggregation Server into Exaflow: new `ExaflowWithAggregationServerStrategy`, creation/cleanup of aggregation state, and passing an `AggregationClient` with a unique request ID through the steps.
Algorithms
- Added XGBoost implemented on Flower, with spec, client/server code, and tests.
Runtime & Dependencies
- Upgraded Python to 3.10 and bumped `mipdb` to 3.0.9, along with other dependency updates.
Worker / DB Naming
- Dropped the strict alphanumeric check on `worker_id`; non-alphanumeric characters are sanitized via `_to_alnum()`, and the resulting `alphanumeric_worker_id` is used in final table names.
CI / Tooling
- Replaced deprecated CodeClimate configuration with Qlty.
0.25.0
Exareme2 0.25.0 Release Notes
Kubernetes & Helm
- Refactor storage paths to use worker-specific locations: Unified storage configuration with new worker-specific keys and updated Helm charts and templates.
- Fix for MonetDB volume mount (#511)
  - Corrected `bootstrap.sh` and Kubernetes mount configurations to ensure proper mounting of the MonetDB volume.
- Managed-cluster deployment support (#513, #514)
  - Added support for specifying different storage classes in Kubernetes manifests.
  - Converted both globalnode and localnode components from Deployments to StatefulSets for stable stateful workloads.
  - Tweaked controller and globalnode manifests to ensure compatibility in managed-cluster environments.
SMPC Test Suite
- Disable standalone SMPC tests
  - Fully disabled the standalone SMPC test suite and removed the related coverage setup.
- Remove SMPC tests' `coverageLocations`
  - Cleaned up residual coverage path references now unused by the SMPC tests.
- Suppress controller logs during SMPC tests
  - Turned off verbose logging in the controller to reduce noise when running SMPC tests.
Exaflow Algorithm Engine
- Initial implementation of Exaflow (#507)
  - Introduced the Exaflow algorithm execution flow.
  - Added `steps_service.py` and `steps_api.py` for Celery-backed step execution.
  - Implemented the core execution engine and task handler within the controller.
  - Provided base algorithm definitions and a sample `compute_average` algorithm.
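The `compute_average` sample maps naturally onto a local/global step split; a toy version of the idea, with function names that are illustrative rather than the actual Exaflow API:

```python
def local_step(rows):
    """Runs on each worker: emit privacy-preserving partial aggregates."""
    return {"sum": sum(rows), "count": len(rows)}

def global_step(partials):
    """Runs on the controller/aggregator: combine worker partials."""
    total = sum(p["sum"] for p in partials)
    count = sum(p["count"] for p in partials)
    return total / count

def compute_average(workers_data):
    # In a real deployment, local steps execute remotely on the workers
    # (e.g. as Celery-backed step tasks) and only the partials travel.
    partials = [local_step(rows) for rows in workers_data]
    return global_step(partials)
```

The point of the split is that raw rows never leave a worker; only the `(sum, count)` partials cross the network, regardless of dataset size.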
Data Ingestion & CSV Handling
- Refactor dataset CSV path handling (#508)
  - Removed CSV path logic from the Worker Landscape Aggregator component.
  - Delegated CSV retrieval to a new `worker_info_service` Celery task.
  - Updated the Flower and Exaflow components to leverage the shared `inputdata_utils` module.
0.24.0
0.23.1
0.23.0
Release notes - 0.23.0, September 2024 release
Task
MIP-941 Flower - Create test to validate proper parallel execution
MIP-946 Flower - (Spike) Test algorithms validity
MIP-956 Logs - Find a service to aggregate logs
MIP-977 Logs - Integrate ELK stack to consume portal-backend logs
MIP-979 Logs - Integrate ELK stack to consume exareme2 logs
0.22.0
Changelog
0.22.0 - 22/7/2024
General System Refactoring
- Renamed NODE to WORKER and restructured WORKER packages (#477)
  - Worker is packaged by feature, including `exareme2/flower` packages.
  - Fixed version incompatibility issues between the `pre-commit` config and `poetry`.
  - Resolved a bug in Kubernetes templates.
- System tables are loaded from SQLite (#488)
- Refactored Task Handlers (#476)
- Restructured the tests to align with the project structure. (#479)
- Refactored Algorithm Specifications (#480)
  - Specifications are now loaded via JSON files.
  - Added type specifications to algorithms in preparation for integrating Flower.
Algorithms
- Dev/mip 897/add log transformation (#487)
  - Added PCA with transformations.
Flower Integration
- Integrated Flower Federated Learning Framework (#478)
  - New Controller for Flower Execution:
    - Introduced a controller with modules for managing Flower workflow and algorithm execution.
  - Added Flower-Compatible Algorithms:
    - Logistic Regression with MIP Data: Utilizes MIP data with parameter customization.
    - Logistic Regression with MNIST Data: Tailored for MNIST data operations.
  - Robust Process Management Module:
    - Enhanced process control with functions for sending signals, checking status, managing zombie processes, and process termination with retry capabilities.
    - The controller uses this module to initiate, monitor, and safely terminate Flower execution processes, ensuring better oversight of Flower's client and server components.
  - Lowered local workers in production tests due to increased resource usage by Flower and the controller.
  - Implemented process garbage collection at the start of each Flower algorithm.
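Termination-with-retry of the kind described can be sketched as follows. The actual module's API is not shown in the notes, so the function name and defaults are illustrative:

```python
import subprocess
import sys

def terminate_with_retry(proc, retries=3, grace_seconds=1.0):
    """Ask a process to exit politely, escalating to a forced kill.

    Waiting on the process after every signal also reaps it, which is
    what prevents it from lingering as a zombie.
    """
    for _ in range(retries):
        proc.terminate()  # polite request (SIGTERM on POSIX)
        try:
            proc.wait(timeout=grace_seconds)
            return proc.returncode
        except subprocess.TimeoutExpired:
            continue  # still alive; try again
    proc.kill()  # forceful stop (SIGKILL on POSIX)
    return proc.wait()

# Example: stop a child that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
rc = terminate_with_retry(child)
```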
- Error Handling (#482)
- Dev/mip 902/flower logging (#486)
  - Added a Flower logger similar to the worker's.
- Retry mechanism for Flower client-server connections (#484)
- Implemented DataFrame filter for Flower inputdata processing (#485)
- Test Data Loaded on GlobalWorker (#489)
  - Integrated Mipdb into GlobalWorker for dataset monitoring by the Worker Landscape Aggregator (WLA).
  - Added test datasets, excluded from the Exareme2 flow.
- Inputdata: Split datasets into training datasets and validation datasets (#490)
- Updated deployment logic:
  - Kubernetes: Ensure GlobalWorker deployment even in single-worker setups.
  - Flower: Configure the server to run on GlobalWorker.
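A client-server retry mechanism like the one in #484 typically guards against clients starting before the server is listening. A generic sketch with exponential backoff, not the actual implementation:

```python
import time

def connect_with_retry(connect, attempts=5, base_delay=0.1):
    """Retry a connection callable with exponential backoff.

    `connect` is any zero-argument callable that raises ConnectionError
    while the server is not yet accepting connections.
    """
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # exhausted: surface the failure to the caller
            time.sleep(base_delay * (2 ** attempt))

# Simulate a server that accepts only the third attempt.
calls = {"n": 0}
def flaky_connect():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("server not ready")
    return "connected"
```

Exponential backoff keeps early retries fast for the common race at startup while avoiding a tight loop when the server is genuinely down.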
0.21.2
Changelog:
- Relaxed the validation of the filter values.