Skip to content

fix(qa): Resolve 9 NutriRange calorie back-calculation outliers — adjust EU FIC tolerance #780

@ericsocrat

Description

@ericsocrat

Problem Statement

QA Suite 11 (QA__nutrition_ranges.sql) has 9 persistent calorie back-calculation outliers that have been flagged as known failures across multiple sessions. These are products where the sum of macronutrient calories (fat×9 + carbs×4 + protein×4 + fiber×2) deviates significantly from the declared calorie value.

Root cause: Open Food Facts source data contains nutrition values exactly as printed on European food labels. Label rounding rules (EU FIC Regulation 1169/2011) allow manufacturers to round nutrition values, which creates back-calculation mismatches. This is not a data entry error — it's inherent to how food labeling works in the EU.

Current state:

  • 9 products consistently fail the calorie back-calculation check
  • All other 11+ nutrition range checks pass
  • These failures are documented as pre-existing/non-blocking in CURRENT_STATE.md

User impact: None directly — scoring uses the declared calorie value (which is correct per the label). However, the persistent QA failures create noise that masks real regressions and erodes confidence in the QA suite.


Architectural Evaluation

Approach Verdict Rationale
A. Override OFF calorie values with back-calculated values ❌ Rejected Violates data integrity — declared values on the label are the legal reference.
B. Widen tolerance threshold for back-calculation check ✅ Chosen EU rounding rules allow ±10-20% deviation. Current check may use too tight a tolerance. Widen to match EU regulatory tolerance (±20% for vitamins/minerals, ±8kcal for energy).
C. Remove the back-calculation check entirely ❌ Rejected The check catches genuine data quality issues. Should remain but with appropriate tolerance.
D. Exclude known outliers by product_id ⚠️ Last resort Fragile — hardcodes product IDs. Only if tolerance adjustment doesn't resolve.

Implementation Plan

Phase 1 — Analyze the 9 Outliers

-- Identify the 9 failing products and their deviation %
SELECT p.product_name, p.brand, p.category, p.country,
       nf.calories_kcal AS declared,
       ROUND((nf.total_fat_g * 9 + nf.carbs_g * 4 + nf.protein_g * 4 + COALESCE(nf.fiber_g, 0) * 2)::numeric, 0) AS calculated,
       ROUND(ABS(nf.calories_kcal - (nf.total_fat_g * 9 + nf.carbs_g * 4 + nf.protein_g * 4 + COALESCE(nf.fiber_g, 0) * 2)) / NULLIF(nf.calories_kcal, 0) * 100, 1) AS deviation_pct
FROM products p
JOIN nutrition_facts nf ON p.product_id = nf.product_id
WHERE p.is_deprecated = false
ORDER BY deviation_pct DESC
LIMIT 20;

Phase 2 — Adjust Tolerance

Update the QA check in QA__nutrition_ranges.sql to use EU FIC-compliant tolerance:

  • Energy value tolerance: ±20% (per EU FIC Regulation 1169/2011 guidance)
  • Current tolerance is likely ±10% or absolute deviation

Phase 3 — Validate

.\RUN_QA.ps1  # Verify 9 outliers now pass with adjusted tolerance

Test Requirements

  • QA__nutrition_ranges.sql — all 20 checks pass (including previously-failing 9)
  • No legitimate data quality issues are masked by the wider tolerance
  • Tolerance value documented in the QA check comment

Verification Checklist

  • Analysis of 9 outliers completed (deviation % documented)
  • Tolerance adjusted with EU FIC regulatory justification
  • All 20 nutrition range checks pass
  • CURRENT_STATE.md updated (remove from Known Issues)
  • copilot-instructions.md updated if QA check count changes
  • CHANGELOG.md updated under [Unreleased]

Metadata

Metadata

Assignees

No one assigned

    Labels

    P2Medium: normal prioritydatabaseDatabase / SQL / migrationsqa

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions