Commit c94f6da

Clustering Parameter Refinements & Unified Slot Assignment (#552)
* Improve documentation and improve CHANGELOG.md
* Fix changelog and change to v6.0.0
* Fix changelog and change to v6.0.0
* Fix changelog and change to v6.0.0
* Enhanced Clustering Control

  New Parameters Added to cluster() Method

  | Parameter | Type | Default | Purpose |
  |-------------------------|---------------------------------|----------------------|---------|
  | cluster_method | Literal[...] | 'k_means' | Clustering algorithm ('k_means', 'hierarchical', 'k_medoids', 'k_maxoids', 'averaging') |
  | representation_method | Literal[...] | 'meanRepresentation' | How clusters are represented ('meanRepresentation', 'medoidRepresentation', 'distributionAndMinMaxRepresentation') |
  | extreme_period_method | Literal[...] | 'new_cluster_center' | How peaks are integrated ('None', 'append', 'new_cluster_center', 'replace_cluster_center') |
  | rescale_cluster_periods | bool | True | Rescale clusters to match original means |
  | random_state | int \| None | None | Random seed for reproducibility |
  | predef_cluster_order | np.ndarray \| list[int] \| None | None | Manual clustering assignments |
  | **tsam_kwargs | Any | - | Pass-through for any tsam parameter |

  Clustering Quality Metrics

  Access via fs.clustering.metrics after clustering - returns a DataFrame with RMSE, MAE, and other accuracy indicators per time series.

  Files Modified

  1. flixopt/transform_accessor.py - Updated cluster() signature and tsam call
  2. flixopt/clustering/base.py - Added metrics field to Clustering class
  3. tests/test_clustering/test_integration.py - Added tests for new parameters
  4. docs/user-guide/optimization/clustering.md - Updated documentation
* Dimension renamed: original_period → original_cluster
  Property renamed: n_original_periods → n_original_clusters
* Problem: Expanded FlowSystem from clustering didn't have the extra timestep that regular FlowSystems have.
  Root Cause: In expand_solution(), the solution was only indexed by original_timesteps (n elements) instead of original_timesteps_extra (n+1 elements).

  Fix in flixopt/transform_accessor.py:

  1. Reindex solution to timesteps_extra (lines 1296-1298):
     - Added expanded_fs._solution.reindex(time=original_timesteps_extra) for consistency with non-expanded FlowSystems
  2. Fill extra timestep for charge_state (lines 1300-1333):
     - Added special handling to properly fill the extra timestep for storage charge_state variables using the last cluster's extra timestep value
  3. Updated intercluster storage handling (lines 1340-1388):
     - Modified to work with original_timesteps_extra instead of just original_timesteps
     - The extra timestep now correctly gets the final SOC boundary value with proper decay applied

  Tests updated in tests/test_cluster_reduce_expand.py:
  - Updated 4 assertions that check solution time coordinates to expect 193 (192 + 1 extra) instead of 192
* - 'variable' is treated as a special valid facet value (since it exists in the melted DataFrame from data_var names, not as a dimension)
  - When facet_row='variable' or facet_col='variable' is passed, it's passed through directly
  - In line(), when faceting by variable, it's not also used for color (avoids double encoding)
* Add variable and color to auto resolving in fxplot
* Added 'variable' to both priority lists and updated the logic to treat it consistently:

  flixopt/config.py:
    'extra_dim_priority': ('variable', 'cluster', 'period', 'scenario'),
    'x_dim_priority': ('time', 'duration', 'duration_pct', 'variable', 'period', 'scenario', 'cluster'),

  flixopt/dataset_plot_accessor.py:
  - _get_x_dim: Now takes n_data_vars parameter; 'variable' is available when > 1
  - _resolve_auto_facets: 'variable' is available when len(data_vars) > 1 and respects exclude_dims

  Behavior:
  - 'variable' is treated like any other dimension in the priority system
  - Only available when there are multiple data_vars
  - Properly excluded when already used (e.g., for x-axis)
* Improve plotting, especially for clustering
* Drop cluster index when expanding
* Fix storage expansion
* Improve clustering
* fix scatter plot faceting
* Fixed the documentation in the notebook:

  1. Cell 32 (API Reference table): Updated defaults to 'hierarchical', 'medoidRepresentation', and None
  2. Cell 16: Swapped the example to show k_means as the alternative (since hierarchical is now default)
  3. Cell 17: Updated variable names to match
  4. Cell 33 (Key Takeaways): Clarified that random_state is only needed for non-deterministic methods like 'k_means'

  The code review
* 1. Error handling for accuracyIndicators() - Added try/except with warning log and empty DataFrame fallback, plus handling empty DataFrames when building the metrics Dataset
  2. Random state to tsam - Replaced global np.random.seed() with passing seed parameter directly to tsam's TimeSeriesAggregation
  3. tsam_kwargs conflict validation - Added validation that raises ValueError if user tries to override explicit parameters via **tsam_kwargs (including seed)
  4. predef_cluster_order validation - Added dimension validation for DataArray inputs, checking they match the FlowSystem's period/scenario structure
  5. Out-of-bounds fix - Clamped last_original_cluster_idx to n_original_clusters - 1 to handle partial clusters at the end
* 1. DataFrame truth ambiguity - Changed non_empty_metrics.get(first_key) or next(...) to explicit if metrics_df is None: check
  2. removed random state
* Fix pie plot animation frame and add warnings for unassigned dims
* Change logger warning to regular warning
* The centralized slot assignment system is now complete. Here's a summary of the changes made:

  Changes Made

  1. flixopt/config.py
     - Replaced three separate config attributes (extra_dim_priority, dim_slot_priority, x_dim_priority) with a single unified dim_priority tuple
     - Updated CONFIG.Plotting class docstring and attribute definitions
     - Updated to_dict() method to use the new attribute
     - The new priority order: ('time', 'duration', 'duration_pct', 'variable', 'cluster', 'period', 'scenario')
  2. flixopt/dataset_plot_accessor.py
     - Created new assign_slots() function that centralizes all dimension-to-slot assignment logic
     - Fixed slot fill order: x → color → facet_col → facet_row → animation_frame
     - Updated all plot methods (bar, stacked_bar, line, area, heatmap, scatter, pie) to use assign_slots()
     - Removed old _get_x_dim() and _resolve_auto_facets() functions
     - Updated docstrings to reference dim_priority instead of x_dim_priority
  3. flixopt/statistics_accessor.py
     - Updated _resolve_auto_facets() to use the new assign_slots() function internally
     - Added import for assign_slots from dataset_plot_accessor

  Key Design Decisions
  - Single priority list controls all auto-assignment
  - Slots are filled in fixed order based on availability
  - None means a slot is not available for that plot type
  - 'auto' triggers auto-assignment from priority list
  - Explicit string values override auto-assignment
* Add slot_order to config
* Add new assign_slots() method
* Add new assign_slots() method
* Fix heatmap and convert all to use fxplot
* Fix heatmap
* Fix heatmap
* Fix heatmap
* Fix heatmap
* Squeeze singleton dims in heatmap()
1 parent 7dcacfd commit c94f6da
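The slot-assignment rules listed under "Key Design Decisions" in the commit message can be illustrated with a small standalone sketch. This is not the flixopt implementation: the function name `assign_slots_sketch`, its signature, and the choice to ignore dimensions missing from the priority list are assumptions made for illustration. Only the priority tuple and the slot order are taken from the commit message.

```python
# Values quoted in the commit message; everything else here is hypothetical.
DIM_PRIORITY = ('time', 'duration', 'duration_pct', 'variable', 'cluster', 'period', 'scenario')
SLOT_ORDER = ('x', 'color', 'facet_col', 'facet_row', 'animation_frame')


def assign_slots_sketch(available_dims, requested, dim_priority=DIM_PRIORITY, slot_order=SLOT_ORDER):
    """Map dimensions to plot slots.

    `requested` maps each slot to 'auto', None, or an explicit dimension name.
    A slot missing from `requested` is treated like None (not available).
    """
    # Explicit names override auto-assignment, so reserve them first.
    used = {value for value in requested.values() if value not in (None, 'auto')}
    # Remaining candidates, ordered by the single priority list.
    pool = [dim for dim in dim_priority if dim in available_dims and dim not in used]

    result = {}
    for slot in slot_order:  # slots are filled in a fixed order
        value = requested.get(slot)
        if value is None:
            result[slot] = None  # slot not available for this plot type
        elif value == 'auto':
            result[slot] = pool.pop(0) if pool else None
        else:
            result[slot] = value  # explicit string passed through
    return result
```

With dimensions `{'time', 'period', 'scenario'}` and every slot except `color` set to `'auto'`, this sketch assigns `time` to `x`, `period` to `facet_col`, and `scenario` to `facet_row`, leaving `animation_frame` empty.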

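The commit message also mentions validation that raises `ValueError` when `**tsam_kwargs` tries to override an explicitly handled parameter. A minimal sketch of that kind of check follows; the helper name `merge_tsam_kwargs` and the example parameter names are hypothetical and do not come from flixopt or tsam.

```python
def merge_tsam_kwargs(explicit, tsam_kwargs):
    """Combine explicit parameters with pass-through kwargs, rejecting overlaps.

    Hypothetical helper: `explicit` holds the parameters the caller sets itself,
    `tsam_kwargs` holds whatever the user passed through.
    """
    conflicts = sorted(set(explicit) & set(tsam_kwargs))
    if conflicts:
        raise ValueError(f'tsam_kwargs must not override explicit parameters: {conflicts}')
    return {**explicit, **tsam_kwargs}


# Illustrative names only; the real reserved set is defined by the library.
merged = merge_tsam_kwargs({'cluster_method': 'hierarchical'}, {'some_option': 3})
```

Failing early in this way keeps a pass-through argument from silently replacing a value the method signature already controls.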
24 files changed

Lines changed: 1798 additions & 1022 deletions

CHANGELOG.md

Lines changed: 130 additions & 2 deletions
@@ -51,9 +51,12 @@ If upgrading from v2.x, see the [v3.0.0 release notes](https://github.com/flixOp
 
 Until here -->
 
-## [5.1.0] - Upcoming
+## [6.0.0] - Upcoming
 
-**Summary**: Time-series clustering for faster optimization with configurable storage behavior across typical periods. Improved weights API with always-normalized scenario weights.
+**Summary**: Major release introducing time-series clustering with storage inter-cluster linking, the new `fxplot` accessor for universal xarray plotting, and removal of deprecated v5.0 classes. Includes configurable storage behavior across typical periods and improved weights API.
+
+!!! warning "Breaking Changes"
+    This release removes `ClusteredOptimization` and `ClusteringParameters` which were deprecated in v5.0.0. Use `flow_system.transform.cluster()` instead. See [Migration](#migration-from-clusteredoptimization) below.
 
 ### ✨ Added
 
@@ -121,6 +124,44 @@ charge_state = fs_expanded.solution['SeasonalPit|charge_state']
 Use `'cyclic'` for short-term storage like batteries or hot water tanks where only daily patterns matter.
 Use `'independent'` for quick estimates when storage behavior isn't critical.
 
+**FXPlot Accessor**: New global xarray accessors for universal plotting with automatic faceting and smart dimension handling. Works on any xarray Dataset, not just flixopt results.
+
+```python
+import flixopt as fx  # Registers accessors automatically
+
+# Plot any xarray Dataset with automatic faceting
+dataset.fxplot.bar(x='component')
+dataset.fxplot.area(x='time')
+dataset.fxplot.heatmap(x='time', y='component')
+dataset.fxplot.line(x='time', facet_col='scenario')
+
+# DataArray support
+data_array.fxplot.line()
+
+# Statistics transformations
+dataset.fxstats.to_duration_curve()
+```
+
+**Available Plot Methods**:
+
+| Method | Description |
+|--------|-------------|
+| `.fxplot.bar()` | Grouped bar charts |
+| `.fxplot.stacked_bar()` | Stacked bar charts |
+| `.fxplot.line()` | Line charts with faceting |
+| `.fxplot.area()` | Stacked area charts |
+| `.fxplot.heatmap()` | Heatmap visualizations |
+| `.fxplot.scatter()` | Scatter plots |
+| `.fxplot.pie()` | Pie charts with faceting |
+| `.fxstats.to_duration_curve()` | Transform to duration curve format |
+
+**Key Features**:
+
+- **Auto-faceting**: Automatically assigns extra dimensions (period, scenario, cluster) to `facet_col`, `facet_row`, or `animation_frame`
+- **Smart x-axis**: Intelligently selects x dimension based on priority (time > duration > period > scenario)
+- **Universal**: Works on any xarray Dataset/DataArray, not limited to flixopt
+- **Configurable**: Customize via `CONFIG.Plotting` (colorscales, facet columns, line shapes)
+
 ### 💥 Breaking Changes
 
 - `FlowSystem.scenario_weights` are now always normalized to sum to 1 when set (including after `.sel()` subsetting)
@@ -132,12 +173,94 @@ charge_state = fs_expanded.solution['SeasonalPit|charge_state']
 
 ### 🗑️ Deprecated
 
+The following items are deprecated and will be removed in **v7.0.0**:
+
+**Classes** (use FlowSystem methods instead):
+
+- `Optimization` class → Use `flow_system.optimize(solver)`
+- `SegmentedOptimization` class → Use `flow_system.optimize.rolling_horizon()`
+- `Results` class → Use `flow_system.solution` and `flow_system.statistics`
+- `SegmentedResults` class → Use segment FlowSystems directly
+
+**FlowSystem methods** (use `transform` or `topology` accessor instead):
+
+- `flow_system.sel()` → Use `flow_system.transform.sel()`
+- `flow_system.isel()` → Use `flow_system.transform.isel()`
+- `flow_system.resample()` → Use `flow_system.transform.resample()`
+- `flow_system.plot_network()` → Use `flow_system.topology.plot()`
+- `flow_system.start_network_app()` → Use `flow_system.topology.start_app()`
+- `flow_system.stop_network_app()` → Use `flow_system.topology.stop_app()`
+- `flow_system.network_infos()` → Use `flow_system.topology.infos()`
+
+**Parameters:**
+
 - `normalize_weights` parameter in `create_model()`, `build_model()`, `optimize()`
 
+**Topology method name simplifications** (old names still work with deprecation warnings, removal in v7.0.0):
+
+| Old (v5.x) | New (v6.0.0) |
+|------------|--------------|
+| `topology.plot_network()` | `topology.plot()` |
+| `topology.start_network_app()` | `topology.start_app()` |
+| `topology.stop_network_app()` | `topology.stop_app()` |
+| `topology.network_infos()` | `topology.infos()` |
+
+Note: `topology.plot()` now renders a Sankey diagram. The old PyVis visualization is available via `topology.plot_legacy()`.
+
+### 🔥 Removed
+
+**Clustering classes removed** (deprecated in v5.0.0):
+
+- `ClusteredOptimization` class - Use `flow_system.transform.cluster()` then `optimize()`
+- `ClusteringParameters` class - Parameters are now passed directly to `transform.cluster()`
+- `flixopt/clustering.py` module - Restructured to `flixopt/clustering/` package with new classes
+
+#### Migration from ClusteredOptimization
+
+=== "v5.x (Old - No longer works)"
+    ```python
+    from flixopt import ClusteredOptimization, ClusteringParameters
+
+    params = ClusteringParameters(hours_per_period=24, nr_of_periods=8)
+    calc = ClusteredOptimization('model', flow_system, params)
+    calc.do_modeling_and_solve(solver)
+    results = calc.results
+    ```
+
+=== "v6.0.0 (New)"
+    ```python
+    # Cluster using transform accessor
+    fs_clustered = flow_system.transform.cluster(
+        n_clusters=8,  # was: nr_of_periods
+        cluster_duration='1D',  # was: hours_per_period=24
+    )
+    fs_clustered.optimize(solver)
+
+    # Results on the clustered FlowSystem
+    costs = fs_clustered.solution['costs'].item()
+
+    # Expand back to full resolution if needed
+    fs_expanded = fs_clustered.transform.expand_solution()
+    ```
+
 ### 🐛 Fixed
 
 - `temporal_weight` and `sum_temporal()` now use consistent implementation
 
+### 📝 Docs
+
+**New Documentation Pages:**
+
+- [Time-Series Clustering Guide](https://flixopt.github.io/flixopt/latest/user-guide/optimization/clustering/) - Comprehensive guide to clustering workflows
+
+**New Jupyter Notebooks:**
+
+- **08c-clustering.ipynb** - Introduction to time-series clustering
+- **08c2-clustering-storage-modes.ipynb** - Comparison of all 4 storage cluster modes
+- **08d-clustering-multiperiod.ipynb** - Clustering with periods and scenarios
+- **08e-clustering-internals.ipynb** - Understanding clustering internals
+- **fxplot_accessor_demo.ipynb** - Demo of the new fxplot accessor
+
 ### 👷 Development
 
 **New Test Suites for Clustering**:
@@ -147,6 +270,11 @@ charge_state = fs_expanded.solution['SeasonalPit|charge_state']
 - `TestMultiPeriodClustering`: Tests for clustering with periods and scenarios dimensions
 - `TestPeakSelection`: Tests for `time_series_for_high_peaks` and `time_series_for_low_peaks` parameters
 
+**New Test Suites for Other Features**:
+
+- `test_clustering_io.py` - Tests for clustering serialization roundtrip
+- `test_sel_isel_single_selection.py` - Tests for transform selection methods
+
 ---
 
 ## [5.0.4] - 2026-01-05

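The CHANGELOG diff above states that the old topology method names "still work with deprecation warnings". One common way to get that behaviour is a thin alias that warns and then delegates to the new method. The sketch below is a generic illustration using only the standard library; `TopologySketch` is a hypothetical stand-in, not flixopt's accessor, and the return value is a placeholder.

```python
import warnings


class TopologySketch:
    """Hypothetical stand-in for a topology accessor (not flixopt code)."""

    def plot(self):
        # Placeholder result; the changelog says the real method renders a Sankey diagram.
        return 'sankey'

    def plot_network(self):
        # Old name kept as an alias: warn the caller, then delegate to the new name.
        warnings.warn(
            'plot_network() is deprecated and will be removed in v7.0.0; use plot() instead.',
            DeprecationWarning,
            stacklevel=2,
        )
        return self.plot()


# Capture the warning to show that the old name still returns the new result.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    old_result = TopologySketch().plot_network()
```

`stacklevel=2` makes the warning point at the caller's line rather than at the alias itself, which is usually what a user migrating their code needs.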
docs/notebooks/01-quickstart.ipynb

Lines changed: 11 additions & 12 deletions
@@ -34,6 +34,7 @@
 "outputs": [],
 "source": [
 "import pandas as pd\n",
+"import plotly.express as px\n",
 "import xarray as xr\n",
 "\n",
 "import flixopt as fx\n",
@@ -58,7 +59,8 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"timesteps = pd.date_range('2024-01-15 08:00', periods=4, freq='h')"
+"timesteps = pd.date_range('2024-01-15 08:00', periods=4, freq='h')\n",
+"print(f'Optimizing from {timesteps[0]} to {timesteps[-1]}')"
 ]
 },
 {
@@ -86,8 +88,9 @@
 "    name='Heat Demand [kW]',\n",
 ")\n",
 "\n",
-"# Visualize the demand with fxplot accessor\n",
-"heat_demand.to_dataset().fxplot.bar(title='Heat Demand')"
+"# Visualize the demand with plotly\n",
+"fig = px.bar(x=heat_demand.time.values, y=heat_demand.values, labels={'x': 'Time', 'y': 'Heat Demand [kW]'})\n",
+"fig"
 ]
 },
 {
@@ -200,18 +203,14 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"total_costs = flow_system.solution['costs'].item()\n",
 "total_heat = float(heat_demand.sum())\n",
 "gas_consumed = total_heat / 0.9  # Account for boiler efficiency\n",
 "\n",
-"pd.DataFrame(\n",
-"    {\n",
-"        'Total heat demand [kWh]': total_heat,\n",
-"        'Gas consumed [kWh]': gas_consumed,\n",
-"        'Total costs [EUR]': flow_system.solution['costs'].item(),\n",
-"        'Average cost [EUR/kWh_heat]': flow_system.solution['costs'].item() / total_heat,\n",
-"    },\n",
-"    index=['Value'],\n",
-").T"
+"print(f'Total heat demand: {total_heat:.1f} kWh')\n",
+"print(f'Gas consumed: {gas_consumed:.1f} kWh')\n",
+"print(f'Total costs: {total_costs:.2f} €')\n",
+"print(f'Average cost: {total_costs / total_heat:.3f} €/kWh_heat')"
 ]
 },
 {

docs/notebooks/02-heat-system.ipynb

Lines changed: 48 additions & 25 deletions
@@ -32,7 +32,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"import numpy as np\n",
 "import pandas as pd\n",
+"import plotly.express as px\n",
 "import xarray as xr\n",
 "\n",
 "import flixopt as fx\n",
@@ -57,12 +59,33 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from data.tutorial_data import get_heat_system_data\n",
+"# One week, hourly resolution\n",
+"timesteps = pd.date_range('2024-01-15', periods=168, freq='h')\n",
 "\n",
-"data = get_heat_system_data()\n",
-"timesteps = data['timesteps']\n",
-"heat_demand = data['heat_demand']\n",
-"gas_price = data['gas_price']"
+"# Create realistic office heat demand pattern\n",
+"hours = np.arange(168)\n",
+"hour_of_day = hours % 24\n",
+"day_of_week = (hours // 24) % 7\n",
+"\n",
+"# Base demand pattern (kW)\n",
+"base_demand = np.where(\n",
+"    (hour_of_day >= 7) & (hour_of_day <= 18),  # Office hours\n",
+"    80,  # Daytime\n",
+"    30,  # Night setback\n",
+")\n",
+"\n",
+"# Reduce on weekends (days 5, 6)\n",
+"weekend_factor = np.where(day_of_week >= 5, 0.5, 1.0)\n",
+"heat_demand = base_demand * weekend_factor\n",
+"\n",
+"# Add some random variation\n",
+"np.random.seed(42)\n",
+"heat_demand = heat_demand + np.random.normal(0, 5, len(heat_demand))\n",
+"heat_demand = np.clip(heat_demand, 20, 100)\n",
+"\n",
+"print(f'Time range: {timesteps[0]} to {timesteps[-1]}')\n",
+"print(f'Peak demand: {heat_demand.max():.1f} kW')\n",
+"print(f'Total demand: {heat_demand.sum():.0f} kWh')"
 ]
 },
 {
@@ -72,13 +95,15 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# Visualize the demand pattern with fxplot\n",
-"demand_ds = xr.Dataset(\n",
-"    {\n",
-"        'Heat Demand [kW]': xr.DataArray(heat_demand, dims=['time'], coords={'time': timesteps}),\n",
-"    }\n",
+"# Visualize the demand pattern with plotly\n",
+"demand_series = xr.DataArray(heat_demand, dims=['time'], coords={'time': timesteps}, name='Heat Demand [kW]')\n",
+"fig = px.line(\n",
+"    x=demand_series.time.values,\n",
+"    y=demand_series.values,\n",
+"    title='Office Heat Demand Profile',\n",
+"    labels={'x': 'Time', 'y': 'kW'},\n",
 ")\n",
-"demand_ds.fxplot.line(title='Office Heat Demand Profile')"
+"fig"
 ]
 },
 {
@@ -98,13 +123,15 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# Visualize gas price with fxplot\n",
-"price_ds = xr.Dataset(\n",
-"    {\n",
-"        'Gas Price [EUR/kWh]': xr.DataArray(gas_price, dims=['time'], coords={'time': timesteps}),\n",
-"    }\n",
+"# Time-of-use gas prices (€/kWh)\n",
+"gas_price = np.where(\n",
+"    (hour_of_day >= 6) & (hour_of_day <= 22),\n",
+"    0.08,  # Peak: 6am-10pm\n",
+"    0.05,  # Off-peak: 10pm-6am\n",
 ")\n",
-"price_ds.fxplot.line(title='Gas Price')"
+"\n",
+"fig = px.line(x=timesteps, y=gas_price, title='Gas Price [€/kWh]', labels={'x': 'Time', 'y': '€/kWh'})\n",
+"fig"
 ]
 },
 {
@@ -282,16 +309,12 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"total_costs = flow_system.solution['costs'].item()\n",
 "total_heat = heat_demand.sum()\n",
 "\n",
-"pd.DataFrame(\n",
-"    {\n",
-"        'Total operating costs [EUR]': flow_system.solution['costs'].item(),\n",
-"        'Total heat delivered [kWh]': total_heat,\n",
-"        'Average cost [ct/kWh]': flow_system.solution['costs'].item() / total_heat * 100,\n",
-"    },\n",
-"    index=['Value'],\n",
-").T"
+"print(f'Total operating costs: {total_costs:.2f} €')\n",
+"print(f'Total heat delivered: {total_heat:.0f} kWh')\n",
+"print(f'Average cost: {total_costs / total_heat * 100:.2f} ct/kWh')"
 ]
 },
 {
