Releases: madgik/exaflow

1.0.0

03 Mar 13:46

Exaflow 1.0.0 Release Notes

Project & Architecture

  • Exareme2 → Exaflow rename:
    The project has been renamed to Exaflow, reflecting its evolution into an umbrella orchestration engine. Exaflow now orchestrates federated workflows across Exareme3 and Flower, clearly separating orchestration from execution and learning backends.

  • Ecosystem restructuring:
    Established a modular architecture where:

    • Exaflow acts as the orchestration layer
    • Exareme3 provides the framework for federated learning with the aggregation server paradigm
    • Flower operates alongside Exareme3 as a standalone framework

    This restructuring improves separation of concerns and long-term extensibility.


Framework Migration (Exareme2 → Exareme3)

  • Database engine migration:
    Deprecated MonetDB in favor of DuckDB. DuckDB simplifies deployment, reduces operational overhead, and provides better support for embedded analytical workloads in federated nodes.
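
    As a minimal sketch of what "embedded analytical workloads" means in practice (table names and schema here are illustrative, not Exaflow's actual schema):

    ```python
    import duckdb

    # DuckDB runs embedded in the worker process: no database daemon to
    # deploy or operate, just a file (or in-memory catalog) per node.
    con = duckdb.connect(":memory:")  # a real node would point this at its local data file
    con.execute("CREATE TABLE measurements (subject VARCHAR, value DOUBLE)")
    con.execute("INSERT INTO measurements VALUES ('s1', 1.5), ('s2', 2.5), ('s3', 3.5)")

    # Analytical aggregation executes in-process over columnar storage.
    avg_value = con.execute("SELECT AVG(value) FROM measurements").fetchone()[0]
    print(avg_value)  # 2.5
    ```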

  • Messaging layer replacement:
    Removed Celery + RabbitMQ in favor of a gRPC-based communication layer. This introduces strongly typed APIs, lower latency communication, and a more maintainable distributed architecture.
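
    To illustrate what "strongly typed APIs" looks like under gRPC, a hypothetical service definition (service and message names are invented for illustration, not Exaflow's actual API):

    ```proto
    // Hypothetical sketch: the .proto contract makes every request and
    // response field explicitly typed and versioned.
    syntax = "proto3";

    service WorkerService {
      rpc RunStep (StepRequest) returns (StepResult);
    }

    message StepRequest {
      string request_id = 1;
      string step_name = 2;
      bytes payload = 3;
    }

    message StepResult {
      bool success = 1;
      bytes payload = 2;
    }
    ```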

  • Federated topology redesign:
    Replaced the local-global paradigm with a dedicated aggregation server model. This clarifies node responsibilities, improves scalability, and aligns the platform with modern federated learning deployment patterns.
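
    The aggregation-server step can be sketched as a weighted combination of per-node updates (a generic federated-averaging sketch, not Exaflow's actual implementation):

    ```python
    import numpy as np

    def aggregate(updates, weights):
        """Weighted average of client model updates: the aggregation-server step."""
        total = sum(weights)
        return sum((w / total) * u for u, w in zip(updates, weights))

    # Three nodes report local parameter vectors over differently sized cohorts.
    updates = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
    weights = [10, 10, 20]
    global_update = aggregate(updates, weights)
    print(global_update)  # [3.5 4.5]
    ```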


Federated Algorithm Library

  • Core algorithm extraction:
    Extracted all core federated algorithm logic into a standalone federated learning library. This decouples algorithm implementation from orchestration and execution layers.

  • sklearn-style interface:
    The new federated library follows a scikit-learn–like API (fit, predict, transform), enabling:

    • Easier algorithm development
    • Cleaner testing and benchmarking
    • Reusability outside Exaflow
    • Independent versioning and evolution
  • Improved modularity:
    The separation between orchestration, execution framework, and algorithm layer significantly improves maintainability and extensibility of the platform.
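
The sklearn-style surface can be sketched as follows (class, method, and attribute names are illustrative, not the library's actual API):

```python
# Hypothetical sketch of a federated estimator with a scikit-learn-like
# interface: fit() consumes per-node aggregates, transform() applies them.
import numpy as np

class FederatedStandardScaler:
    """Scaler whose statistics are pooled from per-node sums and counts."""

    def fit(self, node_stats):
        # node_stats: per-node (sum, sum_of_squares, count) tuples,
        # the only values each node would need to share.
        total = sum(c for _, _, c in node_stats)
        mean = sum(s for s, _, _ in node_stats) / total
        var = sum(sq for _, sq, _ in node_stats) / total - mean**2
        self.mean_, self.scale_ = mean, np.sqrt(var)
        return self

    def transform(self, X):
        return (X - self.mean_) / self.scale_

# Two nodes share aggregates of [1, 2, 3] and [4, 5] respectively.
stats = [(6.0, 14.0, 3), (9.0, 41.0, 2)]
scaler = FederatedStandardScaler().fit(stats)
print(scaler.mean_)  # 3.0
```

Because the interface mirrors scikit-learn, such estimators can be tested and benchmarked with standard tooling, independently of the orchestration layer.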


0.28.0

31 Oct 13:56
c1d4eb3

Exareme2 0.28.0 Release Notes

Controller & API

  • Datasets → Variables endpoint: Added an endpoint that exposes the available variables per dataset. This lets clients discover columns/fields dynamically and simplifies validation of algorithm parameters that depend on the dataset schema. (PR #528)

Deployment / Invoke

  • Faster, more resilient data loading: Reworked the tasks.py data-loading flow used by inv load-data. Improves logging and robustness when preparing and combining datasets across workers, and handles partial failures more gracefully. (PR #525)

CI / Release Tooling

  • Fix image publishing: Adjusted the release workflow so images are correctly pushed when tagging (tested against the branch/tag in Actions). This unblocks automated image publication for new releases. (PR #526)

0.27.0

10 Oct 08:02
fe75cab

Exareme2 0.27.0 Release Notes

Kubernetes / Deployment

  • Managed clusters (PR #520):

    • Integrated Submariner and refreshed managed-cluster configs.
    • Updated StorageClass from ceph-corbo-cephfs to ceph-corbo-cephfs-retain to preserve data on restarts.
    • Simplified Helm/K8s templates (cleaner conditional usage).
    • Controller workers’ DNS field now supports multiple entries.
    • Aggregation Server can be enabled/disabled via a flag.

Controller & API

  • Toggle deployments:
    Introduced a ComponentType enum (AGGREGATION_SERVER, MONETDB, FLOWER) to centralize supported components.
    Controllers/clients now filter available algorithms based on the active component(s).
    (PR #518)
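
A minimal sketch of this pattern (the enum members come from the release notes; the spec structure and function names are illustrative):

```python
from enum import Enum, auto

class ComponentType(Enum):
    AGGREGATION_SERVER = auto()
    MONETDB = auto()
    FLOWER = auto()

def available_algorithms(specs, active_components):
    """Expose only algorithms whose declared components are all active."""
    return [
        name for name, required in specs.items()
        if required and required <= active_components
    ]

specs = {
    "pca": {ComponentType.MONETDB},
    "xgboost": {ComponentType.FLOWER},
    "logistic_regression": set(),  # declares nothing -> never exposed
}
active = {ComponentType.MONETDB, ComponentType.AGGREGATION_SERVER}
print(available_algorithms(specs, active))  # ['pca']
```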

Algorithms

  • Deployment targeting: Algorithm specs must now explicitly declare which ComponentType(s) they support. Specs without this field are considered to support none and won’t be exposed. (PR #518)
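
A spec fragment might declare its supported components like this (the field name and spec layout here are hypothetical, shown only to illustrate the idea):

```json
{
  "name": "pca",
  "label": "Principal Component Analysis",
  "supported_components": ["AGGREGATION_SERVER"]
}
```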

Exaflow Algorithm Engine

  • Aggregation Server can be toggled on/off at deploy time to fit the cluster topology (managed/meshed networking via Submariner). (PR #520)

CI / Tooling

  • Reduced noise in prod_env_tests by removing redundant kubectl get pods and the dedicated Aggregation Server readiness check. (PR #518)

0.26.0

28 Jul 11:26

Exareme2 0.26.0 Release Notes

Controller & API

  • Refactored controller execution: generic (non‑algorithm) pieces stay in the controller, while algorithm‑specific logic moved into dedicated strategies chosen by a factory.
  • Removed algorithm_type from the algorithm request payload; update any clients accordingly.
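
The strategy/factory split can be sketched as follows (class and key names are illustrative, not the controller's actual classes):

```python
# Generic concerns (validation, logging, cleanup) stay in the controller;
# each strategy encapsulates one execution path's algorithm-specific logic.
class ExecutionStrategy:
    def run(self, request):
        raise NotImplementedError

class ExaflowStrategy(ExecutionStrategy):
    def run(self, request):
        return f"exaflow:{request['algorithm']}"

class FlowerStrategy(ExecutionStrategy):
    def run(self, request):
        return f"flower:{request['algorithm']}"

_STRATEGIES = {"exaflow": ExaflowStrategy, "flower": FlowerStrategy}

def strategy_for(request):
    # The factory chooses the algorithm-specific strategy from the request.
    return _STRATEGIES[request["framework"]]()

result = strategy_for({"framework": "flower", "algorithm": "xgboost"}).run(
    {"algorithm": "xgboost"})
print(result)  # flower:xgboost
```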

Exaflow Algorithm Engine

  • Integrated an Aggregation Server into Exaflow: a new ExaflowWithAggregationServerStrategy, creation/cleanup of aggregation state, and an AggregationClient with a unique request ID passed through the steps.

Algorithms

  • Added XGBoost implemented on Flower, with spec, client/server code, and tests.

Runtime & Dependencies

  • Upgraded Python to 3.10 and bumped mipdb to 3.0.9 along with other dependency updates.

Worker / DB Naming

  • Dropped the strict alphanumeric check on worker_id; non‑alphanumeric chars are sanitized via _to_alnum(), and alphanumeric_worker_id is used in final table names.

CI / Tooling

  • Replaced deprecated CodeClimate configuration with Qlty.

0.25.0

20 Jun 08:18
7d5a24b

Exareme2 0.25.0 Release Notes

Kubernetes & Helm

  • Refactor storage paths to use worker-specific locations: Unified storage configuration with new worker-specific keys and updated Helm charts and templates.

  • Fix for MonetDB volume mount (#511)

    • Corrected bootstrap.sh and Kubernetes mount configurations to ensure proper mounting of the MonetDB volume.
  • Managed-cluster deployment support (#513, #514)

    • Added support for specifying different storage classes in Kubernetes manifests.
    • Converted both globalnode and localnode components from Deployments to StatefulSets for stable stateful workloads.
    • Tweaked controller and globalnode manifests to ensure compatibility in managed-cluster environments.
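
A StatefulSet with a per-replica volume claim and a configurable storage class might look roughly like this (an illustrative fragment only; names, images, and sizes are not the actual chart values):

```yaml
# Hypothetical manifest fragment: StatefulSets give each replica a stable
# identity and its own PersistentVolumeClaim, unlike Deployments.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: localnode
spec:
  serviceName: localnode
  replicas: 1
  selector:
    matchLabels:
      app: localnode
  template:
    metadata:
      labels:
        app: localnode
    spec:
      containers:
        - name: localnode
          image: example/worker:latest   # placeholder image
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: ceph-corbo-cephfs-retain
        resources:
          requests:
            storage: 10Gi
```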

SMPC Test Suite

  • Disable standalone SMPC tests

    • Fully disabled the standalone SMPC test suite and removed related coverage setup.
  • Remove SMPC tests’ coverageLocations

    • Cleaned up residual coverage path references now unused by the SMPC tests.
  • Suppress controller logs during SMPC tests

    • Turned off verbose logging in the controller to reduce noise when running SMPC tests.

Exaflow Algorithm Engine

  • Initial implementation of Exaflow (#507)

    • Introduced the Exaflow algorithm execution flow.
    • Added steps_service.py and steps_api.py for Celery-backed step execution.
    • Implemented the core execution engine and task handler within the controller.
    • Provided base algorithm definitions and a sample compute_average algorithm.
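
A compute_average-style algorithm decomposes into a local step per worker and a global combining step; a minimal sketch of that shape (function names are illustrative, not the sample's actual code):

```python
# Each worker shares only (sum, count); the controller combines partials,
# so raw values never leave the node.
def local_step(values):
    return sum(values), len(values)

def global_step(partials):
    total, count = map(sum, zip(*partials))
    return total / count

partials = [local_step([1.0, 2.0]), local_step([3.0, 4.0, 5.0])]
print(global_step(partials))  # 3.0
```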

Data Ingestion & CSV Handling

  • Refactor dataset CSV path handling (#508)

    • Removed CSV path logic from the Worker Landscape Aggregator component.
    • Delegated CSV retrieval to a new worker_info_service Celery task.
    • Updated Flower and Exaflow components to leverage the shared inputdata_utils module.

0.24.0

19 Mar 08:20

Release notes - 0.24.0 March 2025 Release

Bug

  • Fixes for Kubernetes single-node deployment

Task

  • Updated all Python Docker base images to 3.8.19-slim-bullseye
  • Added a dynamic way of setting the namespace in k8s deployments

0.23.1

28 Nov 15:01

Release notes - 0.23.1 September 24 Patchfix Release

Changelog:

  • Minor fix to the Celery logger.

0.23.0

01 Nov 17:37

Release notes - 0.23 September 24 Release

Task

MIP-941 Flower - Create test to validate proper parallel execution

MIP-946 Flower - (Spike) Test algorithms validity

MIP-956 Logs - Find a service to aggregate logs

MIP-977 Logs - Integrate ELK stack to consume portal-backend logs

MIP-979 Logs - Integrate ELK stack to consume exareme2 logs

0.22.0

22 Jul 11:19
fcb3188

Changelog

0.22.0 - 22/7/2024

General System Refactoring

  • Renamed NODE to WORKER and restructured WORKER packages (#477)
    • Worker is packaged by feature, including exareme2/flower packages.
    • Fixed version incompatibility issues between pre-commit config and poetry.
    • Resolved a bug in Kubernetes templates.
  • System tables are loaded from SQLite (#488)
  • Refactored Task Handlers (#476)
    • Restructured the tests to align with the project structure. (#479)
  • Refactored Algorithm Specifications (#480)
    • Specifications are now loaded via JSON files.
    • Added type specifications to algorithms in preparation for integrating Flower.

Algorithms

  • Added log transformation (MIP-897) (#487)
    • Added PCA with transformations

Flower Integration

  • Integrated Flower Federated Learning Framework (#478)

    • New Controller for Flower Execution:
      • Introduced a controller with modules for managing Flower workflow and algorithm execution.
    • Added Flower-Compatible Algorithms:
      • Logistic Regression with MIP Data: Utilizes MIP data with parameter customization.
      • Logistic Regression with MNIST Data: Tailored for MNIST data operations.
    • Robust Process Management Module:
      • Enhanced process control with functions for sending signals, checking status, managing zombie processes, and process termination with retry capabilities.
      • The controller uses this module to initiate, monitor, and safely terminate Flower execution processes, ensuring better oversight of Flower's client and server components.
      • Lowered the number of local workers in production tests due to increased resource usage by Flower and the controller.
      • Implemented process garbage collection at the start of each Flower algorithm.
  • Error handling improvements (#482)

  • Flower logging (MIP-902) (#486)

    • Added a Flower logger similar to the worker.
  • Retry mechanism for Flower client-server connections. (#484)

  • Implemented DataFrame filter for Flower inputdata processing. (#485)
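
The idea behind the inputdata filter can be sketched as restricting a node's DataFrame to the requested datasets and variables before handing it to the Flower client (column and function names here are illustrative):

```python
import pandas as pd

def filter_inputdata(df, datasets, variables):
    # Keep only rows from the requested datasets, then project onto the
    # requested variables (columns).
    keep = df[df["dataset"].isin(datasets)]
    return keep[variables]

df = pd.DataFrame({
    "dataset": ["d1", "d1", "d2"],
    "age": [34, 51, 29],
    "score": [0.1, 0.7, 0.4],
})
filtered = filter_inputdata(df, datasets=["d1"], variables=["age"])
print(len(filtered))  # 2
```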

  • Test Data Loaded on GlobalWorker (#489)

    • Integrated Mipdb into GlobalWorker for dataset monitoring by Worker Landscape Aggregator (WLA).
    • Added test datasets, excluded from Exareme2 flow.
  • Inputdata: Split datasets into training and validation datasets. (#490)

  • Updated deployment logic:

    • Kubernetes: Ensure GlobalWorker deployment even in single-worker setups.
    • Flower: Configure server to run on GlobalWorker.

0.21.2

16 Apr 10:27

Changelog:

  • Relaxed the validation of the filter values.