bug: misleading CV consistency warning message — blames CV>40% when actual failure is consistency<70% #149

@coderabbitai

Bug Report

Reported in: PR #145 post-merge console output review
Reporter: @nerdCopter


Symptom

Console output shows ⚠ Low consistency (CV=XX%) — unreliable (>40%) for CV values that are below 40%, e.g.:

⚠ Low consistency (CV=38.9%) — unreliable (>40%)
⚠ Low consistency (CV=35.0%) — unreliable (>40%)
⚠ Low consistency (CV=31.9%) — unreliable (>40%)

The message implies the flag was triggered because CV > 40%, even though every CV value shown is below that limit.


Root Cause

In src/data_analysis/optimal_p_estimation.rs, TdStatistics::is_consistent() combines three independent conditions, all of which must hold for it to return true:

pub fn is_consistent(&self) -> bool {
    self.num_samples >= TD_SAMPLES_MIN_FOR_STDDEV          // (1) enough samples
        && self.consistency >= TD_CONSISTENCY_MIN_THRESHOLD // (2) fraction within ±1 SD >= 0.70
        && self.coefficient_of_variation
            .map_or(true, |cv| cv <= TD_COEFFICIENT_OF_VARIATION_MAX) // (3) CV <= 0.40
}

The warning is emitted whenever is_consistent() returns false, but the message in format_console_output always attributes the failure to the CV threshold, regardless of which condition actually failed:

if !self.td_stats.is_consistent() {
    let cv_percent = self.td_stats.coefficient_of_variation
        .map_or(0.0, |cv| cv * 100.0);
    output.push_str(&format!(
        "  ⚠ Low consistency (CV={:.1}%) — unreliable (>{:.0}%)\n",
        cv_percent,
        TD_COEFFICIENT_OF_VARIATION_MAX * 100.0  // always prints ">40%"
    ));
}

When CV is 31.9% (below 40%) but consistency, the fraction of samples within ±1 standard deviation, is below 0.70, the flag fires correctly but the message blames CV > 40%. The same mislabeling occurs when only the sample-count condition fails.
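The mismatch can be reproduced in isolation. The following is a minimal sketch, not the project's actual code: the field types on TdStatistics, the value of TD_SAMPLES_MIN_FOR_STDDEV, and the current_warning helper are assumptions made for illustration. The 0.70 and 0.40 thresholds are taken from the comments in is_consistent() above.

```rust
// Stand-in constants. 0.70 and 0.40 come from the comments in is_consistent();
// the minimum sample count is an ASSUMED placeholder value.
const TD_SAMPLES_MIN_FOR_STDDEV: usize = 3;
const TD_CONSISTENCY_MIN_THRESHOLD: f64 = 0.70;
const TD_COEFFICIENT_OF_VARIATION_MAX: f64 = 0.40;

// Assumed shape of the statistics struct (field types are guesses).
pub struct TdStatistics {
    pub num_samples: usize,
    pub consistency: f64,
    pub coefficient_of_variation: Option<f64>,
}

impl TdStatistics {
    // Same three-way check as quoted in the issue.
    pub fn is_consistent(&self) -> bool {
        self.num_samples >= TD_SAMPLES_MIN_FOR_STDDEV
            && self.consistency >= TD_CONSISTENCY_MIN_THRESHOLD
            && self
                .coefficient_of_variation
                .map_or(true, |cv| cv <= TD_COEFFICIENT_OF_VARIATION_MAX)
    }
}

// Hypothetical helper mirroring the current message logic: it always prints
// the CV threshold, whichever condition failed. Indentation and the trailing
// newline of the real output are omitted here.
pub fn current_warning(stats: &TdStatistics) -> Option<String> {
    if stats.is_consistent() {
        return None;
    }
    let cv_percent = stats.coefficient_of_variation.map_or(0.0, |cv| cv * 100.0);
    Some(format!(
        "⚠ Low consistency (CV={:.1}%) — unreliable (>{:.0}%)",
        cv_percent,
        TD_COEFFICIENT_OF_VARIATION_MAX * 100.0
    ))
}
```

With a CV of 0.319 and a consistency of 0.60, only condition (2) fails, yet the message produced by this sketch still cites the 40% CV limit.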


Fix Suggestion

The warning message should identify which condition(s) actually failed. Two viable approaches:

Option A — report both metrics when either fails:

if !self.td_stats.is_consistent() {
    let cv_percent = self.td_stats.coefficient_of_variation
        .map_or(0.0, |cv| cv * 100.0);
    let consistency_percent = self.td_stats.consistency * 100.0;
    output.push_str(&format!(
        "  ⚠ Low consistency (CV={:.1}%, Consistency={:.0}%) — unreliable (CV>{:.0}% or Consistency<{:.0}%)\n",
        cv_percent,
        consistency_percent,
        TD_COEFFICIENT_OF_VARIATION_MAX * 100.0,
        TD_CONSISTENCY_MIN_THRESHOLD * 100.0,
    ));
}

Option B — conditionally show only the failing reason:

if !self.td_stats.is_consistent() {
    let cv_percent = self.td_stats.coefficient_of_variation
        .map_or(0.0, |cv| cv * 100.0);
    let consistency_percent = self.td_stats.consistency * 100.0;
    let cv_failed = self.td_stats.coefficient_of_variation
        .map_or(false, |cv| cv > TD_COEFFICIENT_OF_VARIATION_MAX);
    let consistency_failed = self.td_stats.consistency < TD_CONSISTENCY_MIN_THRESHOLD;
    let reason = match (cv_failed, consistency_failed) {
        (true, true)  => format!("CV={:.1}% (>{:.0}%) and Consistency={:.0}% (<{:.0}%)",
                                  cv_percent, TD_COEFFICIENT_OF_VARIATION_MAX * 100.0,
                                  consistency_percent, TD_CONSISTENCY_MIN_THRESHOLD * 100.0),
        (true, false) => format!("CV={:.1}% (>{:.0}%)",
                                  cv_percent, TD_COEFFICIENT_OF_VARIATION_MAX * 100.0),
        (false, true) => format!("Consistency={:.0}% (<{:.0}%)",
                                  consistency_percent, TD_CONSISTENCY_MIN_THRESHOLD * 100.0),
        // Neither metric failed, so the sample-count check is the only remaining cause.
        (false, false) => format!("insufficient samples (need >= {})", TD_SAMPLES_MIN_FOR_STDDEV),
    };
    output.push_str(&format!("  ⚠ Low consistency ({}) — unreliable\n", reason));
}

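Option B is more precise and avoids printing a threshold that was not the cause of the failure. Option A also never mentions the sample-count condition, so a failure caused only by too few samples would still be unexplained.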
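One limitation of Option B as written is that its match reports insufficient samples only when both metric checks pass, so a run with too few samples and a high CV would mention only the CV. A possible variant, sketched below under assumptions, collects every failing condition into a list instead. The failure_reasons and warning_line helpers, the field types, and the sample-count value are hypothetical and not part of the existing code.

```rust
// Stand-in constants. 0.70 and 0.40 come from the comments in is_consistent();
// the minimum sample count is an ASSUMED placeholder value.
const TD_SAMPLES_MIN_FOR_STDDEV: usize = 3;
const TD_CONSISTENCY_MIN_THRESHOLD: f64 = 0.70;
const TD_COEFFICIENT_OF_VARIATION_MAX: f64 = 0.40;

// Assumed shape of the statistics struct (field types are guesses).
pub struct TdStatistics {
    pub num_samples: usize,
    pub consistency: f64,
    pub coefficient_of_variation: Option<f64>,
}

// Hypothetical helper: one entry per failed condition, in a fixed order.
// NaN inputs are not handled specially in this sketch.
pub fn failure_reasons(stats: &TdStatistics) -> Vec<String> {
    let mut reasons = Vec::new();
    if stats.num_samples < TD_SAMPLES_MIN_FOR_STDDEV {
        reasons.push(format!(
            "samples={} (<{})",
            stats.num_samples, TD_SAMPLES_MIN_FOR_STDDEV
        ));
    }
    if stats.consistency < TD_CONSISTENCY_MIN_THRESHOLD {
        reasons.push(format!(
            "Consistency={:.0}% (<{:.0}%)",
            stats.consistency * 100.0,
            TD_CONSISTENCY_MIN_THRESHOLD * 100.0
        ));
    }
    // A missing CV is treated as "not failed", matching map_or(true, ..) in is_consistent().
    if let Some(cv) = stats.coefficient_of_variation {
        if cv > TD_COEFFICIENT_OF_VARIATION_MAX {
            reasons.push(format!(
                "CV={:.1}% (>{:.0}%)",
                cv * 100.0,
                TD_COEFFICIENT_OF_VARIATION_MAX * 100.0
            ));
        }
    }
    reasons
}

// Hypothetical helper: builds the console line, or None when nothing failed.
pub fn warning_line(stats: &TdStatistics) -> Option<String> {
    let reasons = failure_reasons(stats);
    if reasons.is_empty() {
        None
    } else {
        Some(format!(
            "  ⚠ Low consistency ({}) — unreliable\n",
            reasons.join(", ")
        ))
    }
}
```

Because each reason is derived from the same comparisons that is_consistent() performs, the message cannot drift from the flagging logic for ordinary (non-NaN) inputs, and a CV value is only printed when it is actually present and over the limit.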


Impact

  • Correctness: The warning fires correctly (the flagging logic in is_consistent() is sound); only the diagnostic message is wrong.
  • User confusion: Pilots seeing CV=31.9% — unreliable (>40%) cannot reconcile the displayed value with the stated threshold, eroding trust in the tool output.
  • No data impact: No flight data or recommendations are silently wrong; only the explanation text is misleading.

References: PR #145 comment thread (post-merge bug report)
