Conversation
…sion mismatch bug

The NormalOptimizer (default) randomly selects a subset of features during training via PrepareAndEvaluateSolution, but AiModelResult.Predict() never applied the same feature selection to new input data. This caused the model to receive all columns while it was trained on fewer, throwing "Number of columns must equal length of vector" errors.

Changes:
- Add SelectedFeatureIndices (List<int>) to OptimizationStepData, OptimizationResult, and ModelResult to store actual column indices
- Propagate indices through the optimizer flow: PrepareAndEvaluateSolution → UpdateBestSolution → CreateOptimizationResult
- Apply feature selection in AiModelResult.Predict() after preprocessing but before Model.Predict(), using OptimizerHelper.SelectFeatures()
- Add 12 integration tests verifying the full BuildAsync → Predict pipeline

Closes #927

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
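The failure mode and the fix can be illustrated with a small, language-agnostic sketch (Python here; `select_features` is a hypothetical stand-in for `OptimizerHelper.SelectFeatures`, not the actual C# implementation):

```python
# Sketch of the bug: the model is trained on a column subset chosen by the
# optimizer, so prediction inputs must be reduced with the SAME indices.
# `selected` stands in for the stored SelectedFeatureIndices.

def select_features(rows, selected):
    """Keep only the columns whose indices were chosen during training."""
    return [[row[i] for i in selected] for row in rows]

train_x = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]  # 4 features
selected = [0, 2]                                        # optimizer kept columns 0 and 2

train_subset = select_features(train_x, selected)        # the model only ever sees 2 columns
assert all(len(row) == 2 for row in train_subset)

# At prediction time the same indices must be applied; otherwise a model
# trained on 2 columns receives 4 and throws a dimension mismatch.
new_x = [[9.0, 10.0, 11.0, 12.0]]
pred_input = select_features(new_x, selected)
assert pred_input == [[9.0, 11.0]]
```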
Walkthrough

Tracks and persists numeric selected-feature indices through optimizers and results, applies those indices at prediction time (including tensor/1D handling), updates helper/optimizer APIs to propagate indices, and adds extensive integration tests validating end-to-end feature-selection behavior.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Builder as "AiModelBuilder"
    participant Optimizer as "OptimizerBase"
    participant Helper as "OptimizerHelper"
    participant Result as "OptimizationResult"
    participant Predictor as "AiModelResult"
    participant Model as "IFullModel"
    Builder->>Optimizer: Optimize(trainingX, y)
    Optimizer->>Optimizer: RandomlySelectFeatures(totalFeatures)
    Optimizer->>Helper: SelectFeatures(X, selectedIndices)
    Optimizer->>Model: Train(X_subset, y)
    Optimizer->>Helper: CreateOptimizationResult(bestStepData, bestSelectedFeatureIndices)
    Helper-->>Result: OptimizationResult (with SelectedFeatureIndices)
    Predictor->>Predictor: PreprocessingInfo.TransformFeatures(newX)
    Predictor->>Result: read SelectedFeatureIndices
    alt SelectedFeatureIndices non-empty
        Predictor->>Helper: SelectFeatures(normalizedX, SelectedFeatureIndices)
        Helper-->>Model: Model.Predict(X_subset)
    else
        Predictor-->>Model: Model.Predict(normalizedX)
    end
    Model-->>Predictor: predictions
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Pull request overview
This PR fixes a critical bug where AiModelResult.Predict() did not apply the same feature selection used during training by the NormalOptimizer (and all OptimizerBase-derived optimizers), causing dimension mismatch errors ("Number of columns must equal length of vector") for all models built through AiModelBuilder. The fix introduces a new SelectedFeatureIndices (List<int>) property that stores the actual column indices selected during training and applies them to new input data in Predict().
Changes:
- Added `SelectedFeatureIndices` property to `OptimizationStepData`, `OptimizationResult`, and `ModelResult` to carry feature selection indices through the optimizer pipeline
- Propagated `SelectedFeatureIndices` through `OptimizerBase.PrepareAndEvaluateSolution` → `UpdateBestSolution` → `CreateOptimizationResult`, and updated `OptimizerHelper.CreateOptimizationResult` to accept the indices
- Applied feature selection in `AiModelResult.Predict()` using the stored indices, plus 12 new integration tests covering the full `BuildAsync` → `Predict` pipeline
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/Models/OptimizationStepData.cs | Added SelectedFeatureIndices property with default initializer |
| src/Models/Results/ModelResult.cs | Added SelectedFeatureIndices property to the struct |
| src/Models/Results/OptimizationResult.cs | Added SelectedFeatureIndices property; updated constructor, DeepCopy(), and WithParameters() |
| src/Helpers/OptimizerHelper.cs | Added optional bestSelectedFeatureIndices parameter to CreateOptimizationResult |
| src/Optimizers/OptimizerBase.cs | Stores and propagates SelectedFeatureIndices in PrepareAndEvaluateSolution, UpdateBestSolution, and CreateOptimizationResult |
| src/Models/Results/AiModelResult.cs | Core fix: applies feature selection using SelectedFeatureIndices in Predict() after preprocessing but before Model.Predict() |
| tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs | 12 new integration tests covering the bug fix and related scenarios |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/Models/Results/AiModelResult.cs`:
- Around line 1833-1845: Add a single helper method
ApplySelectedFeaturesForPrediction(TInput input) in AiModelResult that checks
OptimizationResult?.SelectedFeatureIndices, uses
Helpers.InputHelper<T,TInput>.GetInputSize(input) and when needed calls
Helpers.OptimizerHelper<T,TInput,TOutput>.SelectFeatures(input, featureIndices)
and returns the (possibly) reduced input; then replace the inline
feature-selection block in AiModelResult.Predict() with a call to this new
method and update InferenceSequence.Predict() (the path that currently
preprocesses full input and calls _result.Model.Predict(...)) to call
AiModelResult.ApplySelectedFeaturesForPrediction(...) before invoking
_result.Model.Predict so both prediction paths reuse the same centralized logic.
In `@src/Models/Results/ModelResult.cs`:
- Around line 166-170: Add full XML documentation for the SelectedFeatureIndices
property in the ModelResult struct/class to match the rest of the file: expand
the <summary> to describe the property, add a <remarks> section explaining how
indices map to feature columns and when/why they are populated, and include a
"For Beginners" paragraph that gives a simple example of using
SelectedFeatureIndices (e.g., selecting columns 0,2,4 after optimization).
Ensure you update the XML tags immediately above the SelectedFeatureIndices
property so it follows the same documentation pattern as the other properties in
ModelResult.
In `@src/Models/Results/OptimizationResult.cs`:
- Line 512: WithParameters currently assigns SelectedFeatureIndices by reference
which is inconsistent with DeepCopy; update the WithParameters implementation in
OptimizationResult (method/class name: WithParameters on OptimizationResult) to
make a defensive copy of SelectedFeatureIndices (e.g., construct a new List<int>
from the existing SelectedFeatureIndices) so the new instance does not share the
same list reference and matches DeepCopy semantics.
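The defensive-copy requirement described above can be sketched in a few lines (a language-agnostic Python sketch, not the actual C# `OptimizationResult`):

```python
# Sketch of the defensive-copy fix: the instance returned by WithParameters
# must own its own list, so mutating the original does not leak into it
# (matching DeepCopy semantics).

class OptimizationResult:
    def __init__(self, selected_feature_indices):
        self.selected_feature_indices = selected_feature_indices

    def with_parameters(self):
        # list(...) constructs a new list instead of sharing the reference
        return OptimizationResult(list(self.selected_feature_indices))

original = OptimizationResult([1, 3])
updated = original.with_parameters()
original.selected_feature_indices.append(99)   # mutate the original afterwards
assert updated.selected_feature_indices == [1, 3]  # the copy is unaffected
```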
In
`@tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs`:
- Around line 23-24: Remove the unused constant LooseTolerance from the
AiModelBuilderPredictIntegrationTests class to eliminate dead code; locate the
field declaration "private const double LooseTolerance = 0.5;" and delete it (or
alternatively, if a looser numeric comparison is intended, replace usages of
Tolerance with LooseTolerance where appropriate in methods like
BuildAsync_PredictTwice_ReturnsSameResults), but prefer simply removing the
unused LooseTolerance constant.
- Around line 361-375: The test only checks value equality but not independence;
update the test
OptimizationResult_WithParameters_PreservesSelectedFeatureIndices to also verify
mutation isolation by mutating original.SelectedFeatureIndices after calling
WithParameters and asserting updated.SelectedFeatureIndices remains unchanged
(e.g., original.SelectedFeatureIndices.Add(99); Assert.Equal(new List<int>{1,3},
updated.SelectedFeatureIndices)); if you prefer to fix the implementation
instead, ensure OptimizationResult.WithParameters returns a new
OptimizationResult that clones SelectedFeatureIndices (copy the list) so the
returned instance does not share the same list reference.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 33f37fd1-2716-46a7-938d-6ff40840bb46
📒 Files selected for processing (7)
- src/Helpers/OptimizerHelper.cs
- src/Models/OptimizationStepData.cs
- src/Models/Results/AiModelResult.cs
- src/Models/Results/ModelResult.cs
- src/Models/Results/OptimizationResult.cs
- src/Optimizers/OptimizerBase.cs
- tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs
…, docs

- Remove unused LooseTolerance constant
- Fix off-by-one: use != instead of > for feature index count check
- Apply feature selection in InferenceSession.Predict() (critical gap)
- Add comprehensive docs to SelectedFeatureIndices property
- Fix WithParameters to deep-copy collections (consistent with DeepCopy)
- Add copy independence assertion to WithParameters test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 2
♻️ Duplicate comments (1)
src/Models/Results/AiModelResult.cs (1)
2197-2207: 🛠️ Refactor suggestion | 🟠 Major — Extract this into the shared prediction helper.

This is the same feature-selection block as in `AiModelResult.Predict()`. Keep one private helper for both paths so the selection predicate and validation cannot drift again.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Models/Results/AiModelResult.cs` around lines 2197 - 2207, Extract the duplicated feature-selection block into a single private helper method and call it from both AiModelResult.Predict() and the other path using the same logic: read _result.OptimizationResult?.SelectedFeatureIndices, call Helpers.InputHelper<T, TInput>.GetInputSize(normalizedNewData) for validation, and call Helpers.OptimizerHelper<T, TInput, TOutput>.SelectFeatures(normalizedNewData, featureIndices) when sizes differ; preserve the same null/Count checks and return (or set) the normalizedNewData from the helper so both callers use the identical selection/validation logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/Models/Results/AiModelResult.cs`:
- Around line 1836-1844: Replace the current heuristic that skips feature
selection based on counts with a proper identity-check: create a helper (e.g.,
ApplySelectedFeaturesForPrediction) that reads
OptimizationResult.SelectedFeatureIndices, returns input immediately if
null/empty, computes inputSize via Helpers.InputHelper<T,
TInput>.GetInputSize(input), then determines identity mapping by checking
featureIndices.Count == inputSize AND
featureIndices.SequenceEqual(Enumerable.Range(0, inputSize)); only skip
SelectFeatures when that identity check passes; otherwise validate indices are
in-range and call Helpers.OptimizerHelper<T, TInput,
TOutput>.SelectFeatures(input, featureIndices). Apply the same helper in the
session prediction path so both branches use identical logic.
In
`@tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs`:
- Around line 234-237: The test FeatureSelectionReducesColumns is misleading
because it only checks selectedCount > 0 and <= 10 but not that reduction
occurred; update the assertions in AiModelBuilderPredictIntegrationTests.cs to
match intent: if you want to assert actual column reduction change the second
assertion to Assert.True(optimizationResult.SelectedFeatureIndices.Count < 10,
"Should select fewer features than the original 10"), otherwise rename the test
(and update the inline comment) to something like
FeatureSelectionHandlesSelectionGracefully and adjust the comment to say the
optimizer may select any subset (including all) and we only assert non-empty
selection via selectedCount/optimizationResult.SelectedFeatureIndices.Count.
---
Duplicate comments:
In `@src/Models/Results/AiModelResult.cs`:
- Around line 2197-2207: Extract the duplicated feature-selection block into a
single private helper method and call it from both AiModelResult.Predict() and
the other path using the same logic: read
_result.OptimizationResult?.SelectedFeatureIndices, call Helpers.InputHelper<T,
TInput>.GetInputSize(normalizedNewData) for validation, and call
Helpers.OptimizerHelper<T, TInput, TOutput>.SelectFeatures(normalizedNewData,
featureIndices) when sizes differ; preserve the same null/Count checks and
return (or set) the normalizedNewData from the helper so both callers use the
identical selection/validation logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: db2dd249-403e-49db-9b42-e06f49bc8a89
📒 Files selected for processing (4)
- src/Models/Results/AiModelResult.cs
- src/Models/Results/ModelResult.cs
- src/Models/Results/OptimizationResult.cs
- tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs
…test

SelectFeaturesTensor previously threw for 1D tensors. Now handles them by directly selecting elements at the specified indices. Added integration test proving Predict applies SelectedFeatureIndices correctly with a real FeedForwardNeuralNetwork.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
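The 1D fallback described in this commit can be sketched as follows (a Python sketch using nested lists in place of `Tensor<T>`; `select_features_tensor` is a hypothetical stand-in for the C# helper):

```python
# Sketch of the 1D fix: a rank-1 tensor is a flat vector of features, so
# selection means picking elements directly rather than matrix columns.

def select_features_tensor(data, indices):
    if data and isinstance(data[0], list):       # 2D case: select columns per row
        return [[row[i] for i in indices] for row in data]
    return [data[i] for i in indices]            # 1D case: select elements directly

# 2D: column selection per row
assert select_features_tensor([[1, 2, 3], [4, 5, 6]], [0, 2]) == [[1, 3], [4, 6]]
# 1D: element selection (this shape previously threw)
assert select_features_tensor([10, 20, 30], [2, 0]) == [30, 10]
```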
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs`:
- Around line 386-442: The test method
InferenceSession_Predict_AppliesFeatureSelection is misnamed and has a weak
assertion; rename it to AiModelResult_Predict_AppliesTensorFeatureSelection to
reflect that it constructs AiModelResult and calls AiModelResult.Predict, and
replace or augment the Assert.NotNull(prediction) with a stronger assertion that
verifies the prediction shape/size (e.g., assert prediction length or output
dimension equals outputSize) using the existing outputSize variable so the test
ensures feature selection produced an output of the expected shape.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3339c2b3-a58b-415b-8b33-8753b1ce71b8
📒 Files selected for processing (2)
- src/Helpers/OptimizerHelper.cs
- tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs
Replaced inputSize == featureIndices.Count heuristic with proper identity-mapping check using SequenceEqual. A reordered selection like [2, 0, 1] has the same count but still needs reordering. Also validates indices are in-range before calling SelectFeatures. Renamed misleading test names and added shape assertion. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
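The identity-mapping check can be sketched like this (Python; `apply_selected_features` is a hypothetical stand-in for the C# helper, with `indices == list(range(n))` playing the role of `SequenceEqual(Enumerable.Range(0, n))`):

```python
# Sketch of the identity check: only skip selection when the indices are
# exactly [0, 1, ..., n-1]. A same-length permutation like [2, 0, 1] must
# still be applied, and out-of-range indices must fail fast.

def apply_selected_features(row, indices):
    if not indices:
        return row                                       # nothing stored: pass through
    n = len(row)
    if len(indices) == n and indices == list(range(n)):  # SequenceEqual-style check
        return row                                       # true identity mapping only
    if any(i < 0 or i >= n for i in indices):
        raise IndexError("selected feature index out of range")
    return [row[i] for i in indices]

assert apply_selected_features([10, 20, 30], [0, 1, 2]) == [10, 20, 30]  # skipped
assert apply_selected_features([10, 20, 30], [2, 0, 1]) == [30, 10, 20]  # reordered
```

A count-only heuristic would wrongly treat `[2, 0, 1]` as identity because it has the same length as the input, which is exactly the bug this commit removes.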
Actionable comments posted: 2
♻️ Duplicate comments (1)
src/Models/Results/AiModelResult.cs (1)
2222-2232: ⚠️ Potential issue | 🔴 Critical — Blocking: session inference still bypasses the new selection helper.

`InferenceSequence.Predict()` still uses the old `inputSize != featureIndices.Count` shortcut. Line 2227 will skip valid reorderings like `[2, 0, 1]`, and this path never gets the bounds validation now implemented in `ApplySelectedFeaturesForPrediction(...)`. That leaves `BeginInferenceSession().CreateSequence().Predict(...)` behavior inconsistent with `AiModelResult.Predict(...)` and can return incorrect predictions through a public inference path.

Suggested fix:

```diff
-// Apply feature selection if the optimizer selected a subset of features during training.
-var featureIndices = _result.OptimizationResult?.SelectedFeatureIndices;
-if (featureIndices != null && featureIndices.Count > 0)
-{
-    int inputSize = Helpers.InputHelper<T, TInput>.GetInputSize(normalizedNewData);
-    if (inputSize != featureIndices.Count)
-    {
-        normalizedNewData = Helpers.OptimizerHelper<T, TInput, TOutput>
-            .SelectFeatures(normalizedNewData, featureIndices);
-    }
-}
+normalizedNewData = _result.ApplySelectedFeaturesForPrediction(normalizedNewData);
```

Based on learnings and coding guidelines, "Every PR must contain production-ready code" and "Incomplete features: Half-implemented patterns where some code paths work but others silently do nothing" are blocking.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Models/Results/AiModelResult.cs` around lines 2222 - 2232, InferenceSequence.Predict() is still using the old fast-path check "inputSize != featureIndices.Count" which skips valid reorderings (e.g. [2,0,1]) and bypasses the bounds validation in ApplySelectedFeaturesForPrediction(...); update InferenceSequence.Predict() (the method invoked via BeginInferenceSession().CreateSequence().Predict(...)) to call the same ApplySelectedFeaturesForPrediction(normalizedNewData, featureIndices) helper (or copy its exact validation + selection logic) instead of relying on inputSize comparison so reorderings are applied and SelectedFeatureIndices bounds are validated consistently with AiModelResult.Predict().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs`:
- Around line 387-444: The test
AiModelResult_Predict_AppliesTensorFeatureSelection only checks a
length-reducing ascending subset; add a deterministic permutation case using
SelectedFeatureIndices = new List<int> {2,0,3,1} (or similar full-length
reordering) inside the same test (or a new test) to feed a known fullInput
Tensor<float> and assert that result.Predict(...) returns values in the
reordered order (check actual prediction values, not just shape) to ensure
AiModelResult.Predict applies reordering (referencing SelectedFeatureIndices,
AiModelResult.Predict, OptimizationResult and Tensor<float>/prediction.Shape to
locate where to change).
- Around line 312-334: The test
BuildAsync_DirectModelPredict_WithoutBuilder_StillWorks uses a collinear fixture
(second column = 2*first) and an incorrect formula comment (says y = x1 + 2*x2 +
1 while the data corresponds to a different mapping), and the assertions use
overly wide ranges; fix by replacing the input matrix created via CreateMatrix
so columns are non-collinear (e.g., choose independent values for the second
feature) or explicitly construct x so the intended mapping y = x1 + 2*x2 + 1
holds, then compute the exact expected predictions for newX and assert those
values with a tight tolerance (use Assert.Equal with a delta or narrow
Assert.InRange) against model.Predict; update the comment to match the fixture
and keep references to RidgeRegression<double>,
BuildAsync_DirectModelPredict_WithoutBuilder_StillWorks, CreateMatrix,
CreateVector, and Predict so reviewers can find the changes.
---
Duplicate comments:
In `@src/Models/Results/AiModelResult.cs`:
- Around line 2222-2232: InferenceSequence.Predict() is still using the old
fast-path check "inputSize != featureIndices.Count" which skips valid
reorderings (e.g. [2,0,1]) and bypasses the bounds validation in
ApplySelectedFeaturesForPrediction(...); update InferenceSequence.Predict() (the
method invoked via BeginInferenceSession().CreateSequence().Predict(...)) to
call the same ApplySelectedFeaturesForPrediction(normalizedNewData,
featureIndices) helper (or copy its exact validation + selection logic) instead
of relying on inputSize comparison so reorderings are applied and
SelectedFeatureIndices bounds are validated consistently with
AiModelResult.Predict().
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 181238a3-54fb-4a33-80ea-6b541a37df36
📒 Files selected for processing (2)
- src/Models/Results/AiModelResult.cs
- tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/Models/Results/AiModelResult.cs:1844
- The guard `if (inputSize != featureIndices.Count)` skips feature selection when the input already has the same number of features as selected indices. This is intended to handle the case where the user has already pre-selected features, but it creates a silent correctness risk: if the user passes a matrix with the same number of columns as selected features but those columns are not the correct selected features (e.g., the original training data had 5 features and 2 were selected, and the user happens to pass a 2-column matrix of different features), no feature selection is applied and the model receives the wrong columns without error. While this is a design tradeoff, there's no documentation of this assumption and no validation that the input size equals the selected count for valid reasons. Consider either documenting this behavior clearly or storing the original full feature count alongside `SelectedFeatureIndices` so that you can distinguish "pre-selected" vs. "wrong-size" inputs.
```csharp
// INFERENCE OPTIMIZATION PATH: apply configured inference optimizations for neural network models
if (InferenceOptimizationConfig != null &&
    Model is NeuralNetworkBase<T> neuralModel &&
    normalizedNewData is Tensor<T> inputTensor)
```
- Fixed direct-model test to use non-collinear features with correct formula comment (y = 2*x1 + 3*x2 + 1) and tighter assertion ranges
- Added permutation test proving [2, 0, 1] selection is not treated as identity — verifies the SequenceEqual check works correctly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Callers that create OptimizationResult via object initializer (bypassing the constructor) leave CoefficientLowerBounds, CoefficientUpperBounds, FitnessHistory, SelectedFeatures, and SelectedFeatureIndices as null. Deep-copy calls in Clone/WithParameters would throw NullReferenceException. Now falls back to empty defaults when properties are null. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
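The null-fallback pattern this commit describes can be sketched as (Python, with `None` standing in for an unset C# collection property; `deep_copy_indices` is a hypothetical illustration, not the library API):

```python
# Sketch of the null-fallback fix: when a collection property was never set
# (object-initializer construction bypassed the constructor), deep-copy paths
# substitute an empty default instead of dereferencing null.

def deep_copy_indices(selected_feature_indices):
    # `x or []` maps None to an empty list; list(...) still makes a fresh copy
    return list(selected_feature_indices or [])

assert deep_copy_indices(None) == []        # previously a NullReferenceException
assert deep_copy_indices([1, 3]) == [1, 3]  # normal path: independent copy
```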
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/Models/Results/OptimizationResult.cs (1)
477-494: ⚠️ Potential issue | 🟠 Major — `DeepCopy()` and `WithParameters()` still return inconsistent nested state.

`TrainingResult`, `ValidationResult`, `TestResult`, and `FitDetectionResult` are still reused by reference. That breaks isolation in `DeepCopy()`, and `WithParameters()` also leaves predictions/statistics from the old model attached to a different `BestSolution`. Deep-clone those nested results in `DeepCopy()`, and recompute or clear them in `WithParameters()` when the parameters change.

Also applies to: 502-519
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Models/Results/OptimizationResult.cs` around lines 477 - 494, DeepCopy() currently shallow-copies nested result objects (TrainingResult, ValidationResult, TestResult, FitDetectionResult), causing shared mutable state; update OptimizationResult<T, TInput, TOutput>.DeepCopy() to deep-clone those nested result objects (call their DeepCopy/Clone methods or construct new instances copying all fields) instead of reusing references, and ensure BestSolution is deep-copied before attaching any nested results. Likewise, in WithParameters(...) detect when parameters change and either recompute dependent predictions/statistics for TrainingResult/ValidationResult/TestResult/FitDetectionResult based on the new BestSolution or explicitly clear/reset those fields so stale predictions aren’t carried to a different solution; update the WithParameters method to create fresh result objects or null/empty them when parameters change.
♻️ Duplicate comments (2)
tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs (2)
508-523: ⚠️ Potential issue | 🔴 Critical — This permutation test still always passes on the broken branch.

`predictionsMatch` is computed and then ignored, so the only real assertion here is the output shape. A regression that skips reordering entirely still passes. Use a manual oracle instead: permute the input tensor yourself, run it through the identity-path model, and assert that output matches `result.Predict(input)`.

🔧 Assert the actual permutation result:

```diff
-var identityPrediction = identityResult.Predict(input);
+var manuallyPermutedInput = new Tensor<float>(new[] { featureCount });
+manuallyPermutedInput[0] = 30.0f;
+manuallyPermutedInput[1] = 10.0f;
+manuallyPermutedInput[2] = 20.0f;
+var expectedPrediction = identityResult.Predict(manuallyPermutedInput);
@@
-bool predictionsMatch = Math.Abs(permutedPrediction[0] - identityPrediction[0]) < 1e-6f;
-// Note: predictions *might* match if all weights happen to be equal (unlikely with random init),
-// but in practice DenseLayer initializes weights differently per position.
-// The key thing is the test doesn't throw — proving permutation is applied, not skipped.
 Assert.Equal(outputSize, permutedPrediction.Shape[0]);
+Assert.Equal(outputSize, expectedPrediction.Shape[0]);
+Assert.Equal(expectedPrediction[0], permutedPrediction[0], 1e-6f);
```

As per coding guidelines, "Tests MUST be production-quality" and "Good tests should assert specific expected values, not just non-null."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs` around lines 508 - 523, The test currently computes predictionsMatch but never uses it, so it only asserts shape and misses verifying permutation; fix by explicitly applying the same feature permutation to the input tensor and using that permuted input as the oracle for the identity-path model: create permutedInput by reordering the original input according to the permutation used by the model (the same mapping that should have been applied in result), then call var oraclePrediction = identityResult.Predict(permutedInput) and assert that oraclePrediction equals permutedPrediction (element-wise equality within an epsilon) and that shapes match; use the existing variables result.Predict, identityResult.Predict, permutedPrediction and input to locate where to insert this check.
331-337: ⚠️ Potential issue | 🔴 Critical — These bounds are still too wide for a fixed regression fixture.

The dataset encodes a concrete mapping, but `39..43` and `44..48` still allow materially wrong outputs to pass. Either switch this direct-path regression to a non-regularized model and assert `41`/`46` tightly, or keep `RidgeRegression<double>` and narrow the tolerance to a bound that is justified by its default regularization.

🔧 One deterministic way to tighten this:

```diff
-var model = new RidgeRegression<double>();
+var model = new MultipleRegression<double>();
@@
-Assert.InRange(predictions[0], 39, 43);
-Assert.InRange(predictions[1], 44, 48);
+Assert.Equal(41.0, predictions[0], 1e-6);
+Assert.Equal(46.0, predictions[1], 1e-6);
```

As per coding guidelines, "Good tests should assert specific expected values, not just non-null."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs` around lines 331 - 337, The test currently allows overly wide tolerances for a deterministic regression; in AiModelBuilderPredictIntegrationTests update the model or the assertions so the expected outputs are tight: either switch the model creation from RidgeRegression<double> to a non-regularized estimator (e.g., use OrdinaryLeastSquares / the equivalent non-regularized builder) so the mapping produces exact values 41 and 46, then replace the loose Assert.InRange checks with exact equality assertions for predictions[0] == 41 and predictions[1] == 46 (or use Assert.Equal with an appropriate minimal precision), or keep RidgeRegression<double> but reduce the ranges to a narrowly-justified tolerance based on its default regularization (e.g., ±1 or use Assert.Equal with a small delta); update the assertions accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs`:
- Around line 28-56: The test never guarantees the optimizer actually reduced
features; after BuildAsync() assert that the trained model selected a strict
subset by checking SelectedFeatureIndices.Count < 5 (i.e. add
Assert.True(result.SelectedFeatureIndices.Count < 5) or the equivalent accessor
on the built model), or instead configure the optimizer (NormalOptimizer)
deterministically to force feature selection; ensure this unconditional
assertion runs before calling result.Predict(newData).
---
Outside diff comments:
In `@src/Models/Results/OptimizationResult.cs`:
- Around line 477-494: DeepCopy() currently shallow-copies nested result objects
(TrainingResult, ValidationResult, TestResult, FitDetectionResult), causing
shared mutable state; update OptimizationResult<T, TInput, TOutput>.DeepCopy()
to deep-clone those nested result objects (call their DeepCopy/Clone methods or
construct new instances copying all fields) instead of reusing references, and
ensure BestSolution is deep-copied before attaching any nested results.
Likewise, in WithParameters(...) detect when parameters change and either
recompute dependent predictions/statistics for
TrainingResult/ValidationResult/TestResult/FitDetectionResult based on the new
BestSolution or explicitly clear/reset those fields so stale predictions aren’t
carried to a different solution; update the WithParameters method to create
fresh result objects or null/empty them when parameters change.
---
Duplicate comments:
In
`@tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs`:
- Around line 508-523: The test currently computes predictionsMatch but never
uses it, so it only asserts shape and misses verifying permutation; fix by
explicitly applying the same feature permutation to the input tensor and using
that permuted input as the oracle for the identity-path model: create
permutedInput by reordering the original input according to the permutation used
by the model (the same mapping that should have been applied in result), then
call var oraclePrediction = identityResult.Predict(permutedInput) and assert
that oraclePrediction equals permutedPrediction (element-wise equality within an
epsilon) and that shapes match; use the existing variables result.Predict,
identityResult.Predict, permutedPrediction and input to locate where to insert
this check.
- Around line 331-337: The test currently allows overly wide tolerances for a
deterministic regression; in AiModelBuilderPredictIntegrationTests update the
model or the assertions so the expected outputs are tight: either switch the
model creation from RidgeRegression<double> to a non-regularized estimator
(e.g., use OrdinaryLeastSquares / the equivalent non-regularized builder) so the
mapping produces exact values 41 and 46, then replace the loose Assert.InRange
checks with exact equality assertions for predictions[0] == 41 and
predictions[1] == 46 (or use Assert.Equal with an appropriate minimal
precision), or keep RidgeRegression<double> but reduce the ranges to a
narrowly-justified tolerance based on its default regularization (e.g., ±1 or
use Assert.Equal with a small delta); update the assertions accordingly.
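Why ridge cannot hit the exact values while an unregularized fit can: a small closed-form comparison in pure Python (illustrative numbers only; this is not the AiDotNet estimator):

```python
# 1-feature, no-intercept closed forms:
#   OLS:   w = Σxy / Σx²
#   Ridge: w = Σxy / (Σx² + λ)
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]                        # exactly y = 2x
sxy = sum(x * y for x, y in zip(xs, ys))    # 28
sxx = sum(x * x for x in xs)                # 14

w_ols = sxy / sxx                           # 2.0, exact
w_ridge = sxy / (sxx + 1.0)                 # 28/15 ≈ 1.867 with λ = 1

assert w_ols == 2.0                         # exact equality is justified
assert abs(w_ridge - 2.0) > 0.1            # shrinkage: exact asserts would fail
```

This is the trade-off behind the comment: switch to the unregularized fit and assert exact values, or keep ridge and justify the tolerance from its shrinkage.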
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1a135915-d41d-4cc6-bfd5-e56f1b28ceec
📒 Files selected for processing (2)
src/Models/Results/OptimizationResult.cs
tests/AiDotNet.Tests/IntegrationTests/Regression/AiModelBuilderPredictIntegrationTests.cs
…eck, robust assertions
- InferenceSequence.Predict() now delegates to ApplySelectedFeaturesForPrediction instead of using a count-based check that skipped permuted selections
- Move bounds validation to one-time check (cached _featureIndicesValidated flag) instead of scanning on every Predict() call
- Assert permuted predictions actually differ from identity predictions
- Replace fragile selectedCount < 10 with valid range assertion
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add assertion verifying optimizer selected a feature subset before testing Predict() with full-width input, proving the test exercises the reduced-column code path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
…s in test
Replace per-call SequenceEqual + Enumerable.Range allocation with cached volatile int state (0=unknown, 1=identity, 2=needs-selection). Uses a simple loop instead of LINQ for zero-allocation identity check. Thread-safe via volatile read/write semantics.
Set explicit non-uniform weights [1, 2, 3] in permuted feature selection test to ensure permutation deterministically produces different output.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Refactor tests that called result.Predict() (which fails due to the feature selection bug fixed in PR #928) to verify preprocessing pipeline state directly via PreprocessingInfo.TransformFeatures() instead. Tests now verify the data leakage fix independently. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…mismatch bug (#928)

* fix: apply feature selection in AiModelResult.Predict() — fixes dimension mismatch bug

The NormalOptimizer (default) randomly selects a subset of features during training via PrepareAndEvaluateSolution, but AiModelResult.Predict() never applied the same feature selection to new input data. This caused the model to receive all columns while it was trained on fewer, throwing "Number of columns must equal length of vector" errors.

Changes:
- Add SelectedFeatureIndices (List<int>) to OptimizationStepData, OptimizationResult, and ModelResult to store actual column indices
- Propagate indices through optimizer flow: PrepareAndEvaluateSolution → UpdateBestSolution → CreateOptimizationResult
- Apply feature selection in AiModelResult.Predict() after preprocessing but before Model.Predict(), using OptimizerHelper.SelectFeatures()
- Add 12 integration tests verifying the full BuildAsync → Predict pipeline

Closes #927
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve 7 pr review comments - feature selection, copy semantics, docs
- Remove unused LooseTolerance constant
- Fix off-by-one: use != instead of > for feature index count check
- Apply feature selection in InferenceSession.Predict() (critical gap)
- Add comprehensive docs to SelectedFeatureIndices property
- Fix WithParameters to deep-copy collections (consistent with DeepCopy)
- Add copy independence assertion to WithParameters test
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: support 1D tensor feature selection and add predict integration test
SelectFeaturesTensor previously threw for 1D tensors. Now handles them by directly selecting elements at the specified indices. Added integration test proving Predict applies SelectedFeatureIndices correctly with a real FeedForwardNeuralNetwork.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use identity-check instead of count heuristic for feature selection
Replaced inputSize == featureIndices.Count heuristic with proper identity-mapping check using SequenceEqual. A reordered selection like [2, 0, 1] has the same count but still needs reordering. Also validates indices are in-range before calling SelectFeatures. Renamed misleading test names and added shape assertion.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: fix collinear test fixture and add permuted feature selection test
- Fixed direct-model test to use non-collinear features with correct formula comment (y = 2*x1 + 3*x2 + 1) and tighter assertion ranges
- Added permutation test proving [2, 0, 1] selection is not treated as identity — verifies the SequenceEqual check works correctly
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add null guards to OptimizationResult Clone and WithParameters
Callers that create OptimizationResult via object initializer (bypassing the constructor) leave CoefficientLowerBounds, CoefficientUpperBounds, FitnessHistory, SelectedFeatures, and SelectedFeatureIndices as null. Deep-copy calls in Clone/WithParameters would throw NullReferenceException. Now falls back to empty defaults when properties are null.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: unify feature selection in InferenceSequence, one-time bounds check, robust assertions
- InferenceSequence.Predict() now delegates to ApplySelectedFeaturesForPrediction instead of using a count-based check that skipped permuted selections
- Move bounds validation to one-time check (cached _featureIndicesValidated flag) instead of scanning on every Predict() call
- Assert permuted predictions actually differ from identity predictions
- Replace fragile selectedCount < 10 with valid range assertion
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: assert feature selection actually occurred in regression test
Add assertion verifying optimizer selected a feature subset before testing Predict() with full-width input, proving the test exercises the reduced-column code path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: cache identity check with volatile int, set deterministic weights in test
Replace per-call SequenceEqual + Enumerable.Range allocation with cached volatile int state (0=unknown, 1=identity, 2=needs-selection). Uses a simple loop instead of LINQ for zero-allocation identity check. Thread-safe via volatile read/write semantics. Set explicit non-uniform weights [1, 2, 3] in permuted feature selection test to ensure permutation deterministically produces different output.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------
Co-authored-by: franklinic <franklin@ivorycloud.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
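The core fix this merge describes — re-applying the training-time column selection (with bounds validation) before the model sees new rows — can be sketched language-agnostically. A Python illustration with hypothetical names, not the AiDotNet API:

```python
def predict_with_selection(model_predict, selected_indices, matrix):
    """Apply training-time feature selection before prediction.

    Without this step, a model trained on a column subset receives
    full-width rows and fails with a dimension mismatch.
    """
    n_cols = len(matrix[0])
    # validate indices once, before selecting (mirrors the bounds check)
    if any(i < 0 or i >= n_cols for i in selected_indices):
        raise ValueError("selected feature index out of range")
    reduced = [[row[i] for i in selected_indices] for row in matrix]
    return [model_predict(row) for row in reduced]

# toy model "trained" on columns [0, 2] with weights [2.0, 3.0]
weights = [2.0, 3.0]
model = lambda row: sum(w * x for w, x in zip(weights, row))

X_new = [[1.0, 99.0, 2.0],    # middle column was never seen in training
         [3.0, 99.0, 4.0]]
preds = predict_with_selection(model, [0, 2], X_new)
assert preds == [8.0, 18.0]
```

Feeding the full-width rows straight into `model` would silently pair the wrong columns with the wrong weights, which is the quieter cousin of the hard dimension-mismatch error the PR fixes.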
…934)

* fix: prevent data leakage by fitting preprocessing only on training data

In BuildSupervisedInternalAsync(), the preprocessing pipeline (e.g., StandardScaler) was fitted on the ENTIRE dataset before the train/test split. This leaked test/validation statistics (mean, std dev) into the training pipeline, causing artificially inflated metrics.

The fix restructures the method:
1. Split data into train/val/test FIRST using DataSplitter.Split()
2. FitTransform preprocessing pipeline on training data ONLY
3. Transform (not FitTransform) validation and test data

The federated learning path is unchanged — it correctly uses all data as training data, so FitTransform on everything remains correct.

Fixes #929
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve 12 pr review comments - leakage tests, pragma scope, registry sync
- Scope #pragma warning disable CS8600/CS8604 with matching restore
- Restore PreprocessingRegistry.Current in all ConfigurePreprocessing overloads
- Update PreprocessingRegistry docs to reflect thread-safety and usage pattern
- Rewrite tests to actually detect leakage: verify scaler produces different output than all-data scaler, verify transform shape and non-identity
- Add end-to-end predict test with preprocessing
- Add test verifying no-preprocessing path still works with predict
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: move fitresample after train/test split to prevent data leakage
FitResample (SMOTE, outlier removal, augmentation) was applied to the full dataset before splitting, allowing synthetic samples derived from test/validation data to leak into training. Now FitResample runs after the split on training data only in both standard and AutoML paths. Also fixed AutoML preprocessing to FitTransform on training only and Transform on validation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update registry docs and use consistent shuffle policy for splits
Updated PreprocessingRegistry docs to clearly warn about concurrent build safety issues. Applied same time-series non-shuffle policy to the final supervised training split that AutoML already uses.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use post-preparation training data in no-preprocessing else branch
When _dataPreparationPipeline runs (resampling/outlier removal) it modifies XTrain/yTrain. The else branch (no preprocessing) was incorrectly using preparedX/preparedY (pre-preparation) instead of XTrain/yTrain, losing training-only preparation results for downstream logic.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: make data leakage tests self-sufficient without PR 928
Refactor tests that called result.Predict() (which fails due to the feature selection bug fixed in PR #928) to verify preprocessing pipeline state directly via PreprocessingInfo.TransformFeatures() instead. Tests now verify the data leakage fix independently.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: make data leakage test deterministic with widely-spaced unique values
Replace clustered bimodal data with linearly-spaced unique values (0, 10, 20, ..., 990) to guarantee any 70/30 shuffle split produces significantly different scaler statistics from the full dataset, eliminating flaky test failures.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove null-forgiving operators from default!, null!, and remaining field! patterns
Replaces default! with proper initialization (NumOps.Zero, throw, or nullable types), null! with nullable parameters or removed dead code, and field! with extracted local variables with null-coalescing throw expressions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address pr review comments for aimodelbuilder and totem
- Fix exception text to reference ConfigureModel() instead of WithModel()
- Remove nullable T? from TOTEM._lastCommitmentLoss field, initialize with NumOps.Zero in both constructors to avoid ambiguous nullability
- Fix incorrect comment about Matrix<T>/Vector<T> being value types
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use numops.equals instead of equalitycomparer for net471 compat
Replace EqualityComparer<T>.Default.Equals with _numOps.Equals to fix CS8604 null reference error on net471 and follow codebase patterns.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: narrow pragma scope and snapshot transform input in tests
- Narrow CS8600/CS8604 pragma scopes to specific declaration and call sites instead of spanning 800+ lines
- Clarify StatisticsHelper comment about default(T) vs NumOps.Zero
- Snapshot original values before Transform in test to prevent in-place mutation from invalidating the comparison
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------
Co-authored-by: franklinic <franklin@ivorycloud.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
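The split-first discipline from this merge in miniature: fit the scaler on training data only, then only transform the held-out data. A Python sketch with a toy scaler (not the AiDotNet API):

```python
import statistics

def fit_scaler(rows):
    """Learn standardization statistics from the given rows only."""
    mean = statistics.mean(rows)
    std = statistics.pstdev(rows) or 1.0
    return mean, std

def transform(rows, mean, std):
    return [(x - mean) / std for x in rows]

data = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0]

# 1) split FIRST
train, test = data[:7], data[7:]

# 2) fit on training data ONLY
mean, std = fit_scaler(train)          # (30.0, 20.0)

# 3) transform (not fit_transform) the held-out data
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)

# leaked statistics (fitted on everything) differ from the honest ones
leaked_mean, leaked_std = fit_scaler(data)   # (45.0, ...)
assert (mean, std) != (leaked_mean, leaked_std)
```

Fitting on `data` before splitting would quietly fold the test set's mean and spread into the training pipeline, which is exactly the inflated-metrics leakage the commit removes.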
…ining directories (#933) (#941)

* fix: prevent data leakage by fitting preprocessing only on training data
* fix: resolve 12 pr review comments - leakage tests, pragma scope, registry sync
* fix: move fitresample after train/test split to prevent data leakage
* fix: update registry docs and use consistent shuffle policy for splits
* fix: use post-preparation training data in no-preprocessing else branch
* fix: make data leakage tests self-sufficient without PR 928

* fix: preserve preprocessing pipeline state during serialization (#936)

* fix: add serialization support to preprocessing pipeline fitted state
- Add [JsonProperty] attributes to private fields across all preprocessing transformers (StandardScaler, MinMaxScaler, RobustScaler, MaxAbsScaler, DecimalScaler, GlobalContrastScaler, LogScaler, LogMeanVarianceScaler, LpNormScaler, Normalizer, SimpleImputer) so Newtonsoft.Json serializes fitted state (means, std devs, min/max, etc.)
- Add [JsonProperty] to TransformerBase.IsFitted and ColumnIndices for proper deserialization of base class state
- Add [JsonConstructor] to MinMaxScaler and RobustScaler to resolve ambiguous constructor selection during deserialization
- Replace ValueTuple in PreprocessingPipeline._steps with a new PipelineStep<T, TInput> class that serializes reliably with Newtonsoft.Json (ValueTuples lose field names during serialization)
- Add [JsonProperty(TypeNameHandling = TypeNameHandling.Auto)] to PreprocessingInfo.Pipeline and TargetPipeline for polymorphic deserialization of pipeline step transformers
- Add 16 integration tests verifying serialization round-trips for all modified transformers, pipelines, and inverse transforms
Fixes #930
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve 13 pr review comments - serialization safety, tests, perf
- Remove null-forgiving operator from PipelineStep (use JsonConstructor)
- Make PipelineStep.Transformer setter private (immutable after construction)
- Fix ColumnIndices deserialization (add private setter for Json.NET)
- Fix Steps property to return cached read-only list (no alloc per call)
- Add explicit range assertions to MinMaxScaler custom range test
- Add transform equivalence check to SimpleImputer test
- Add LpNormScaler serialization round-trip test
- Add Normalizer serialization round-trip test
- Add LogMeanVarianceScaler serialization round-trip test
- TypeNameHandling.Auto is safe: SafeSerializationBinder restricts types
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: make pipelinestep immutable, validate constructor, add missing tests
- Made PipelineStep.Name private set to prevent mutation after construction
- PipelineStep constructor now validates name (non-whitespace) and transformer (non-null), failing fast on invalid serialized steps
- Added ColumnIndices round-trip test proving non-default indices survive serialization and untouched columns remain unchanged
- Added TargetPipeline round-trip tests proving both null and non-null target pipelines survive serialization with transform equivalence
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: prevent JSON duplicate serialization and remove per-member TypeNameHandling
Add [JsonIgnore] to public getter-only properties that mirror [JsonProperty] backing fields in all transformers (StandardScaler, RobustScaler, SimpleImputer, DecimalScaler, GlobalContrastScaler, MinMaxScaler, Normalizer, MaxAbsScaler, LogScaler) to prevent Newtonsoft emitting both field and property. Remove per-member TypeNameHandling.Auto from PipelineStep.Transformer and _finalTransformer — rely on serializer-level settings with SafeSerializationBinder instead, preventing unsafe TypeNameHandling bypass. Add Deconstruct method to PipelineStep for backward-compatible tuple destructuring.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------
Co-authored-by: franklinic <franklin@ivorycloud.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove static PreprocessingRegistry — prevents concurrent build race conditions (#935)

* fix: remove static PreprocessingRegistry usage from AiModelBuilder (#931)
Remove the 3 `PreprocessingRegistry<T, TInput>.Current = _preprocessingPipeline` assignments from AiModelBuilder.ConfigurePreprocessing() and the 3 analogous PostprocessingRegistry assignments from ConfigurePostprocessing(). The pipeline is already stored as an instance field on the builder and flows to AiModelResult via PreprocessingInfo — the global static registry was unnecessary and caused race conditions when multiple builders ran concurrently.
Changes:
- Remove 6 static registry assignments from AiModelBuilder.cs
- Add [Obsolete] attribute to PreprocessingRegistry<T, TInput> class
- Replace PreprocessingRegistry usage in DocumentNeuralNetworkBase with an instance-level PreprocessingTransformer property
- Add 4 integration tests verifying concurrent builds don't cross-contaminate
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve 7 pr review comments - registry deprecation, thread safety, docs
- Fix PreprocessingTransformer docs: clarify protected accessibility
- Add PostprocessingTransformer instance property to DocumentNeuralNetworkBase
- Replace PostprocessingRegistry static usage with instance-based transformer
- Mark PostprocessingRegistry as [Obsolete] (same treatment as PreprocessingRegistry)
- Update PreprocessingRegistry.Current docs to reflect no longer auto-set
- Fix concurrent test to use separate data loaders (thread-safety)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: wire preprocessing transformer onto document models in buildasync
Changed PreprocessingTransformer and PostprocessingTransformer from protected to protected internal so AiModelBuilder can set them. BuildSupervisedInternalAsync now wires the preprocessing pipeline onto DocumentNeuralNetworkBase models, replacing the static registry approach that caused race conditions during concurrent builds.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: narrow transformer surface, fail fast on unfitted, wire postprocessing
- Changed PreprocessingTransformer/PostprocessingTransformer back to protected and added internal ConfigureTransformers() method for AiModelBuilder wiring
- PreprocessDocument/PostprocessOutput now throw if transformer is set but not fitted instead of silently falling back to defaults
- Wired both preprocessing AND postprocessing onto DocumentNeuralNetworkBase
- Updated PreprocessingRegistry docs to reflect deprecated/removed status
- Added [Collection] to registry tests and try/finally cleanup
- Fixed "override" wording to "assign" in docs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: centralize document transformer injection and strengthen test assertions
Extract transformer wiring into ConfigureDocumentTransformers() helper method that always passes current pipeline values (clearing prior state on reuse). Called from both supervised build path and BuildProgramSynthesisInferenceOnlyResult(). Strengthen PreprocessingRegistryIntegrationTests to verify PreprocessingInfo is non-null and fitted, not just Assert.NotNull(result).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove unfitted check from PostprocessOutput, update docs, strengthen tests
PostprocessOutput no longer throws on unfitted transformers since postprocessing transforms (Softmax, LabelDecoder) are often stateless. Updated XML docs to reflect instance-level transformer priority. Strengthen concurrent and sequential test assertions to verify PreprocessingInfo is non-null and fitted on each result.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------
Co-authored-by: franklinic <franklin@ivorycloud.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove null-forgiving operators from video Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove null-forgiving operators from neural network layers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: warn on empty attributions and use guarded locals in jwt attestation verifier Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: include gain layer in backward pass and parameter updates in deepfilternet Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: rebuild streaming buffers after deserializing fft/hop size in neural noise reducer Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove null-forgiving operators from federated learning Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove null-forgiving operators from federated learning Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: respect bootstrap flag and align estimator probabilities to ensemble label space Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove null-forgiving operators from clustering modules Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: throw on invalid ensemble members and cold model in adaptive random forest Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: unify trained-state error messages in ordinal regression Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove unused alpha_s and use guarded local consistently in unipc scheduler Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add guard for latent variance in compute elbo Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: hoist cached gradient lookup out of per-sample loops Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: replace console.writeline with
debug.writeline for checkpoint reporting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: cache compressed momentum after resampling projection matrices Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: show total sharded memory independently of per-device memory in tostring Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address 21 pr review comments for code quality improvements
- CLIPScore: throw on embedding/weight dimension mismatch instead of Math.Min
- Adam8BitOptimizer: remove redundant _vScales null check
- BFGSOptimizer: move _inverseHessian null check before UpdateInverseHessian call
- FTRLOptimizer: use post-update accumulator (nPlusGradSq) in both paths
- MomentumOptimizer: snapshot _velocity before serialization loop
- OptimizerBase: use RequireModel() instead of Model!
- SymbolicPhysicsLearner: Clone throws on null children like Evaluate/Format
- IncrementalPCA: fix error message to "internal state inconsistent"
- ContradictionDetector: wrap JToken.Value<bool>() in try-catch
- CriticModel: wrap JToken.Value<double>() in try-catch
- LogNormalAFT: hoist Coefficients null check outside loops (3 methods)
- WeibullAFT: hoist Coefficients null check outside loops (3 methods)
- NHiTSModel: remove redundant ContainsKey before TryGetValue
- DiffusionTools: support both 1D and 2D audio_data deserialization
- GMMDetector: validate feature count before scoring a point
- TimeSeriesForestClassifier: reject NumTrees <= 0 during training
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use guarded sigmaTs local variable consistently in dpm solver methods Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: apply feature selection in Predict() — fixes critical dimension mismatch bug (#928)

---------
Co-authored-by: franklinic <franklin@ivorycloud.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve first batch of pr #941 review comments (10 of 54)
- InterpretabilityDashboard: filter empty attributions instead of using console output, use validCount for averaging
- BaggingClassifier: validate sampleSize vs data size when bootstrap=false
- Denclue: hoist _attractors and _attractorDensities guards out of loops
- BIRCH: use Labels ?? Predict(x) instead of Labels!
in FitPredict - SelfOrganizingMap: hoist _weights and _neuronLabels guards out of loops - CLIQUE: hoist Labels guard out of accumulation loop - SUBCLU: hoist _trainingData and Labels guards out of loops Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve second batch of pr #941 review comments (20 of 54) - GraphIsomorphismLayer: add local variable aliases for _lastMlpHiddenPreRelu in Forward and BackwardManual for clearer null safety - PagedAttentionKernel: add braces to ToFloat/FromFloat if blocks, replace null-forgiving with explicit boxing guards - KnowledgeDistillationTrainerBase: remove Console.WriteLine from library checkpoint loading code - DiffusionMemoryManager: separate PerDeviceMemory and TotalShardedMemory formatting into independent checks - MixtureOfExpertsLayer: add null guard before ApplyActivationToGraph instead of null-forgiving operator - KVCache: replace null-forgiving with explicit boxing guard for Half conversion - CachedMultiHeadAttention: hoist _cache null guard before Append call - CachedGroupedQueryAttention: hoist _cache null guard before Append call - DeformableConvolutionalLayer: stale comments (variables already renamed) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve build errors from linter's automated null-forgiving removal Fix syntax and compilation errors introduced by the automated linter that removed null-forgiving operators without proper replacements: - Fix missing closing parentheses in foreach statements (COPKMeans, AgglomerativeClustering) - Fix duplicate variable declarations (AntColonyOptimization, CachedGroupedQueryAttention, CachedMultiHeadAttention) - Fix dropped underscore prefixes on private field references (KVCache, ConvLSTMLayer, PrincipalNeighbourhoodAggregationLayer) - Add validated accessor properties for nullable cache arrays (KVCache) - Fix variable ordering issues (FloraAdapter) - Fix nullable reference handling (VectorModel, UncertaintyCalibrationData, 
CrossValidationResult, AiModelResult.Uncertainty) - Fix Debug.WriteLine() call with no arguments (KnowledgeDistillationTrainerBase) - Add null guard for ClassLabels in PredictProbabilities (BaggingClassifier) - Fix nullable permutation array access (ConcatenateLayer) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: hoist null guards for lambdas and alphaTs in DPMSolverMultistepScheduler Replace null-forgiving operators with proper null-coalescing throws for _lambdas and _alphaTs in FirstOrderUpdate, SecondOrderUpdate, and ThirdOrderUpdate methods. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove redundant null guard in LeafSent140FederatedDatasetLoader The testFilePath is already validated by IsNullOrWhiteSpace check in the enclosing if statement. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: capture PreprocessingInfo in local variable to prevent TOCTOU Snapshot PreprocessingInfo once before accessing its members to avoid potential race conditions with concurrent access. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: dispose pre-permutation tensor to prevent memory leak in ConcatenateLayer Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: accumulate projector gradients from both views in BarlowTwins, SimCLR, and iBOT BarlowTwins: capture view1 projector grads before view2 backward clears them, then combine both views' grads for the projector update. SimCLR: replay projector forward per-view before backward to use correct caches, and accumulate projector gradients from both views. iBOT: backpropagate the full objective (clsLoss1 + clsLoss2 + lambda*(patchLoss1 + patchLoss2)) instead of only clsLoss1. Add ComputeMaskedPatchGradient for MSE gradient on masked patches. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: correct backward pass in RIFE, FILM, and RAFT video models RIFE: use cached block input shape instead of previous block output shape for gradient splitting to correctly match the fused input width. FILM: fail fast on all required backward-pass caches instead of silently skipping fusion and flow estimator backprop when caches are null. RAFT: replace incorrect upsampleConv.Backward with bilinear UpsampleFlowBackward that mirrors the actual forward upsampling, and hoist invariant layer guards out of the refinement loop. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: strengthen DataLeakagePrevention tests with behavioral assertions ScalerStatisticsReflectTrainingDataOnly: add positive assertion that the pipeline transformation is not identity, confirming it was fitted on data. StillWorksCorrectly: exercise the model with actual predictions and assert outputs are finite numbers, not just that objects are non-null. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address major PR review comments across multiple files KNNEvaluator: hoist _trainLabels null check outside the voting loop. JointMutualInformation: use local miScores instead of field on line 212. ContradictionDetector: treat malformed contradictory fields as false instead of falling through to text scan that matches JSON key names. BFGSOptimizer: guard inverse Hessian update with curvature condition (y*s > 0) and use validated locals instead of null-forgiving operators. IncrementalPCA: validate all fitted state upfront in PartialFit and remove null-forgiving on _explainedVariance. RealGatedLinearRecurrenceLayer: hoist all cached tensor validation to top of Backward to prevent partial gradient corruption. 
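The volatile tri-state identity cache from the first commit above can be sketched as follows. This is an illustrative Python analogue, not the C# code: it is single-threaded, so it only demonstrates the 0/1/2 state machine and the allocation-free comparison loop, not the volatile read/write semantics; all names here are made up for the sketch.

```python
UNKNOWN, IDENTITY, NEEDS_SELECTION = 0, 1, 2

class SelectionCache:
    """Caches whether the stored indices are the identity mapping [0..n-1]."""

    def __init__(self, indices):
        self.indices = indices
        self._state = UNKNOWN          # stands in for the cached volatile int

    def is_identity(self, n_features):
        state = self._state
        if state == UNKNOWN:
            # plain loop instead of SequenceEqual(Enumerable.Range(...)):
            # no intermediate sequence is allocated
            ok = len(self.indices) == n_features
            if ok:
                for i, idx in enumerate(self.indices):
                    if idx != i:
                        ok = False
                        break
            state = IDENTITY if ok else NEEDS_SELECTION
            self._state = state        # later calls skip the loop entirely
        return state == IDENTITY
```

The first call pays for the comparison once; every later call is a single cached-state read.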
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address remaining major PR review comments

  PreprocessingRegistry: align documentation — clarify get/set is thread-safe but concurrent builds still overwrite each other's pipeline.

  InMemoryFederatedTrainer: fail fast when HE is enabled but no parameter indices are resolved instead of silently downgrading. Fix DP accounting: FedAsync uses 1/totalClients per iteration, FedBuff uses bufferSize/totalClients.

  AiModelBuilder: extract IsTimeSeriesTask() helper that checks both AutoML options and model type name for chronological split decisions. Reject row-changing data preparation in natural-client federated mode.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address trivial PR review comments across multiple files

  - DeformableConvolutionalLayer: rename misleading variable names (mwShape/mbShape to maskWeights/maskBias, mw2/mwg/mbg to maskW/maskWeightGrad/maskBiasGrad).
  - S4DLayer: hoist _lastHiddenStatesReal/Imag guards out of the BPTT loop.
  - OnlineFeatureSelector: pass resolved running arrays into ComputeOnlineCorrelation instead of re-checking nullable fields inside the hot loop.
  - SSLPretrainingPipeline: extract encoder to a local variable to avoid duplicate null-check expressions.
  - SubpixelConvolutionalLayer: hoist ScalarActivation out of the per-element loop.
  - BaggingClassifier: replace OrderBy(random.Next()) with Fisher-Yates partial shuffle for O(sampleSize) unbiased sampling without replacement.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: hoist gradient tensor validations in HedgehogLayer.FeatureMapBackward

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address 12 pr review comments across multiple files

  - FloraAdapter: fix momentum reset bug (was zeroed every step)
  - KnowledgeDistillationTrainerBase: remove dead checkpoint branch
  - KVCache: hoist loop-invariant lookups outside inner loops
  - AiModelResult: capture stable model reference in explainer delegate
  - VectorModel: guard empty coefficients, remove dead BoxOrDefault
  - ConvLSTMLayer: use local lastInput instead of field reference
  - PrincipalNeighbourhoodAggregationLayer: hoist invariant tensors, remove null-forgiving operators in fallback path
  - BarlowTwins: remove redundant projector null-check
  - FILM: use local-cache pattern for cached features
  - DeepFilterNet: propagate gain layer backward gradient

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: net471 build errors in baggingclassifier and leafsent140loader

  - Replace range syntax pool[..sampleSize] with Array.Copy for net471 compatibility (RuntimeHelpers.GetSubArray not available)
  - Add explicit null check before string.IsNullOrWhiteSpace for net471 compatibility (missing NotNullWhen attribute)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use correct inner class name and remove null-forgiving in dashboard

  Replace InterpretabilityExplanation with Explanation (the actual inner class name) and replace null-forgiving operators with proper null guards.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address pr #941 review comments

  - KDTree: use single guarded local for _data null check in Compare
  - BallTree: use single guarded local for _data null check in Sort
  - DiffusionMemoryManager: remove duplicate unconditional total sharded line
  - InfluenceFunctionExplainer: hoist cachedGradients null check outside loop

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: franklinic <franklin@ivorycloud.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
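One of the commits above replaces `OrderBy(random.Next())` in BaggingClassifier with a partial Fisher-Yates shuffle for O(sampleSize) unbiased sampling without replacement. A minimal Python sketch of that sampling scheme (the function name is illustrative, not from the codebase):

```python
import random

def sample_without_replacement(n, sample_size, rng=None):
    """Partial Fisher-Yates: only sample_size swaps, unbiased, no replacement."""
    rng = rng or random.Random()
    pool = list(range(n))
    for i in range(sample_size):
        j = rng.randrange(i, n)          # uniform over the not-yet-fixed suffix
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:sample_size]            # net471-friendly analogue: Array.Copy
```

Each position `i` receives a uniformly random element of the remaining suffix, so every size-`sample_size` subset is equally likely, without the bias and O(n log n) cost of sorting by random keys.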
## Summary

Fixes a critical bug where `AiModelResult.Predict()` did not apply the same feature selection that was used during training, causing dimension mismatch errors for all models built through `AiModelBuilder` with the default optimizer.

## The Bug
- `NormalOptimizer` (the default optimizer) randomly selects a subset of features during training via `PrepareAndEvaluateSolution`
- `OptimizationResult.SelectedFeatures` stores column vectors, but `Predict()` never applied this selection to new input data
- When `Predict()` passed all N columns to a model trained on M < N columns, it threw: `"Number of columns must equal length of vector"`
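The failure mode is easy to reproduce in miniature. Below is a hypothetical NumPy sketch (the real code is C#; `selected` stands in for the optimizer's chosen columns):

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(size=(8, 5))             # 5 candidate features
selected = [2]                                # optimizer kept only column 2
w, *_ = np.linalg.lstsq(X_train[:, selected], rng.normal(size=8), rcond=None)

X_new = rng.normal(size=(3, 5))               # predict-time input: all 5 columns
try:
    X_new @ w                                 # (3, 5) x (1,): dimension mismatch
except ValueError:
    pass                                      # this is the reported failure

y = X_new[:, selected] @ w                    # applying the stored indices fixes it
```

The model's weight vector has length M (here 1), so multiplying it against all N raw columns can never line up unless the same column indices are applied first.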
## Impact

This affects every model built through `AiModelBuilder` using the default optimizer (which is the primary use case). The default `MaximumFeatures = 0` setting means the optimizer always selects just 1 feature, making this bug trigger 100% of the time for multi-feature datasets.

## The Fix
1. Added `SelectedFeatureIndices` (`List<int>`) to `OptimizationStepData`, `OptimizationResult`, and `ModelResult` — stores actual column indices (the existing `SelectedFeatures` stores column vectors, which require fragile heuristic-based index recovery)
2. Propagated indices through the optimizer flow: `PrepareAndEvaluateSolution` → `UpdateBestSolution` → `CreateOptimizationResult`
3. Applied feature selection in `AiModelResult.Predict()` — after preprocessing but before `Model.Predict()`, using `OptimizerHelper.SelectFeatures()` with the stored indices
## Files Changed

| File | Change |
| --- | --- |
| `OptimizationStepData.cs` | Added `SelectedFeatureIndices` property |
| `OptimizationResult.cs` | Added `SelectedFeatureIndices`; updated `DeepCopy`/`WithParameters` |
| `ModelResult.cs` | Added `SelectedFeatureIndices` property |
| `OptimizerHelper.cs` | Updated `CreateOptimizationResult` to accept indices |
| `OptimizerBase.cs` | Updated `PrepareAndEvaluateSolution`, `UpdateBestSolution`, `CreateOptimizationResult` |
| `AiModelResult.cs` | Applied feature selection in `Predict()` |

## Test plan
- 12 new integration tests covering the full `BuildAsync` → `Predict` pipeline:
  - `BuildAsync_DefaultOptimizer_PredictDoesNotThrowDimensionMismatch` — the exact bug repro
  - `BuildAsync_SelectedFeatureIndices_AreStoredInOptimizationResult` — verifies indices are stored
  - `BuildAsync_FeatureSelectionReducesColumns_PredictHandlesIt` — 10 features, verifies selection
  - `BuildAsync_PredictTwice_ReturnsSameResults` — determinism check
  - `OptimizationResult_DeepCopy_PreservesSelectedFeatureIndices` — deep copy safety
  - `OptimizationResult_WithParameters_PreservesSelectedFeatureIndices` — parameter update safety
  - `BuildAsync_DirectModelPredict_WithoutBuilder_StillWorks` — no regression for direct use
- Existing `LinearRegressionIntegrationTests` still pass

Closes #927
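The two `OptimizationResult` preservation tests guard against a classic pitfall: a shallow copy would share the index list, so mutating the copy would corrupt the original result. A toy Python illustration of the property being asserted (the dict is a stand-in for the result object, not the real type):

```python
import copy

# toy analogue of an OptimizationResult carrying its selected indices
result = {"parameters": [0.5, -2.0], "selected_feature_indices": [1, 4]}

shallow = copy.copy(result)
deep = copy.deepcopy(result)

deep["selected_feature_indices"].append(7)
assert result["selected_feature_indices"] == [1, 4]      # deep copy is safe

shallow["selected_feature_indices"].append(9)
assert result["selected_feature_indices"] == [1, 4, 9]   # shallow copy is not
```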
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Tests