Add federated mixed effects algorithms and tests#570

Closed
AndreasKtenidis wants to merge 51 commits into master from feature/mixed-effects

Conversation

@AndreasKtenidis
Collaborator

No description provided.

…flow into feature/mixed-effects

# Conflicts:
#	tests/standalone_tests/federated_algorithms/mixed_effects/test_glmm_binary.py
#	tests/standalone_tests/federated_algorithms/mixed_effects/test_glmm_ordinal.py
#	tests/standalone_tests/federated_algorithms/mixed_effects/test_lmm.py
@AndreasKtenidis force-pushed the feature/mixed-effects branch from 1767074 to 6136ef5 on April 3, 2026 12:00
KFilippopolitis and others added 19 commits April 6, 2026 11:08
…pecs

Nominal variables should not be treated as integers. This commit updates the JSON specifications for various algorithms (Naive Bayes, Logistic Regression, etc.) to ensure strict typing for nominal inputs.
…pecs

Updates K-Means and Linear Regression specifications to accept both integer and real values for numerical input fields, ensuring broader compatibility with input data.
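
The specification files themselves are not shown in this conversation, so the following is only a minimal sketch of what "accept both integer and real values" could mean at validation time. The type names (`"int"`, `"real"`, `"text"`) and the function `matches_spec_types` are assumptions made for illustration, not the project's actual schema or API.

```python
# Hypothetical type names; the real specification vocabulary may differ.
TYPE_CHECKS = {
    # bool is a subclass of int in Python, so it is excluded explicitly.
    "int": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "real": lambda v: isinstance(v, float),
    "text": lambda v: isinstance(v, str),
}


def matches_spec_types(value, allowed_types):
    """Return True if value satisfies at least one of the allowed type names."""
    return any(TYPE_CHECKS[type_name](value) for type_name in allowed_types)


numerical_field = ["int", "real"]  # a numerical input accepting both kinds
nominal_field = ["text"]           # a nominal input that must not be an integer

accepts_int = matches_spec_types(3, numerical_field)
accepts_real = matches_spec_types(3.5, numerical_field)
nominal_rejects_int = not matches_spec_types(3, nominal_field)
```

Listing both type names on a numerical field lets integer-valued columns pass without first being cast to floats, while a nominal field stays strictly non-numeric.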
Enhance logistic_regression.py to gracefully handle invalid input data.
…DuckDB loader

Replaces generic categorical coercion with explicit type mapping in `DuckDBCSVLoader`. This improves type safety and removes the need for `udf_service.py` workarounds. Also adds comprehensive tests for type loading and removes obsolete coercion tests.
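
The loader code is not part of this conversation. As a rough sketch of what an explicit type mapping might look like, the example below maps metadata type names to SQL column types and fails loudly on anything unknown instead of coercing it. The metadata type names, the `SQL_TYPE_BY_METADATA_TYPE` table, and the standalone `build_column_types` function are all assumptions for illustration; they are not taken from `DuckDBCSVLoader`.

```python
# Hypothetical mapping from metadata type names to DuckDB column types.
SQL_TYPE_BY_METADATA_TYPE = {
    "nominal": "VARCHAR",
    "text": "VARCHAR",
    "int": "BIGINT",
    "real": "DOUBLE",
}


def build_column_types(metadata):
    """Map each column name to an explicit SQL type, rejecting unknown types."""
    column_types = {}
    for column_name, metadata_type in metadata.items():
        if metadata_type not in SQL_TYPE_BY_METADATA_TYPE:
            raise ValueError(
                f"Unsupported type {metadata_type!r} for column {column_name!r}"
            )
        column_types[column_name] = SQL_TYPE_BY_METADATA_TYPE[metadata_type]
    return column_types


column_types = build_column_types(
    {"age": "int", "gender": "nominal", "volume": "real"}
)
```

Declaring the type of every column up front means a nominal code such as `"0"` is loaded as a string rather than being inferred as an integer.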
Enhances `DuckDBCSVLoader` to correctly handle various CSV delimiters. Adds `test_csv_delimiters.py` to verify delimiter handling.
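
How the loader handles delimiters is not shown here. For illustration only, delimiter detection can be sketched with Python's standard-library `csv.Sniffer`, which is a stand-in technique and not the loader's own mechanism. The helper name `detect_delimiter` and the candidate set are assumptions.

```python
import csv


def detect_delimiter(sample, candidates=",;\t|"):
    """Guess the delimiter of a CSV text sample from a fixed set of candidates."""
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter


semicolon_sample = "a;b;c\n1;2;3\n4;5;6\n"
tab_sample = "a\tb\tc\n1\t2\t3\n4\t5\t6\n"

semicolon_delimiter = detect_delimiter(semicolon_sample)
tab_delimiter = detect_delimiter(tab_sample)
```

A test along the lines of `test_csv_delimiters.py` could feed the same table written with each delimiter and assert that the loaded rows are identical.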
- Move _build_column_types inside _create_primary_data_table for better encapsulation
- Update unit tests to reflect API change
- Modify aggregation server template to conditionally apply nodeSelector based on managed_cluster value
Use dict-based lookups to prevent pandas from treating string
keys like "0" as positional indices. Fixes duplicated counts.
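
The affected code is not shown in the commit message. The general idea can be sketched in plain Python, without pandas: counts are stored in a dict keyed by the string label, so a lookup for `"0"` can only ever match the label `"0"`. The function name `count_levels` and the sample data are assumptions for illustration.

```python
from collections import Counter


def count_levels(values, levels):
    """Count occurrences of each expected level, looking labels up by key only."""
    counts = dict(Counter(str(value) for value in values))
    # dict.get matches on the exact key; "0" is a label here, never a position.
    return {level: counts.get(level, 0) for level in levels}


observed = ["1", "0", "1", "1", "0"]
result = count_levels(observed, levels=["0", "1", "2"])
```

Levels that never occur get an explicit count of zero instead of raising or silently picking up another row's value.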
ThanKarab and others added 26 commits April 6, 2026 11:08
Fix the duckdb connection conflicts due to non read-only connections.
Added where clause, group_by and aggregate.
The controller will no longer have knowledge of the actual metadata,
only the worker. This was changed to be compatible with future
pipelines.
Hard-coded column names have been centralized in variable file.
Remove preprocessing step compatible algorithms logic.
@ThanKarab closed this on Apr 6, 2026
5 participants