Conversation
Add HELMLAB, a data-driven analytical color space family optimized for UI design systems, trained on 64,000+ human color-difference judgments (arXiv:2602.23010). Four spaces: - helmlab / helmlch: MetricSpace (72 params, 13-stage pipeline). Optimized for perceptual distance and color specification. - helmgen / helmgenlch: GenSpace (21 params, cube-root compression). Optimized for interpolation: gradients, palettes, color-mix.
|
I'm not quite following what the NC Error stuff is doing, can you elaborate? Is it some sort of bridge between the two white points being used? It seems we are using D65 as specified in our library, but the Helmlab calculations use a slightly different version. I'm not sure if this is compensating for that, or is there for some other reason.
Also, the inverse GAMMA in Helmlab seems off, is this intentional?
> const GAMMA = [0.38922300523380954, 0.4163225224600994, 0.424136411390728];
undefined
> const INV_GAMMA = [2.569249468498498, 2.4020099225454073, 2.3577020539355013];
undefined
> GAMMA.map(x => {return 1 / x})
[ 2.5692212088010873, 2.401983909231912, 2.3577320247536306 ] |
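One way to avoid this class of porting error is to derive the inverse exponents from `GAMMA` at load time rather than hardcoding a second table. The following is a minimal sketch using the values quoted above; the `inverseMatches` helper and its tolerance are illustrative assumptions, not code from the PR.

```js
// Forward exponents, as quoted in the comment above.
const GAMMA = [0.38922300523380954, 0.4163225224600994, 0.424136411390728];

// Derive the inverse exponents so the two tables cannot drift apart.
const INV_GAMMA = GAMMA.map(g => 1 / g);

// Illustrative check (hypothetical helper): a table of inverse exponents
// must satisfy gamma * invGamma = 1 to within `tol`.
function inverseMatches (gamma, invGamma, tol = 1e-12) {
	return gamma.every((g, i) => Math.abs(g * invGamma[i] - 1) <= tol);
}

// The hardcoded values from the port, as quoted above, fail the check.
const PORTED = [2.569249468498498, 2.4020099225454073, 2.3577020539355013];
console.log(inverseMatches(GAMMA, INV_GAMMA)); // true
console.log(inverseMatches(GAMMA, PORTED)); // false
```

A check like this could also live in a test, so that a hardcoded table is still allowed but cannot silently disagree with `GAMMA`.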
|
Good catch on the inverse gamma. You're right, those values are off.
const INV_GAMMA = [2.5692212088010873, 2.401983909231912, 2.3577320247536306];
This was a hardcoding error when porting. In the original helmlab-js package,
Regarding NC (Neutral Correction): it has nothing to do with white point bridging. Both use standard D65. The problem it solves: when a perfectly neutral gray passes through the pipeline (M1 → γ → M2), the output should have a=0, b=0. But in practice, the optimized matrix transformations leave small residual chromatic errors on the achromatic axis. NC is a pre-computed lookup table (254 lightness levels) that stores these residuals and subtracts them in the forward transform. This drives gray chromaticity from ~0.02 down to < 10⁻⁶. The inverse simply adds the error back.
The main practical benefit is eliminating color artifacts in gradients that pass through or near neutral. Without NC, a black-to-white gradient can pick up slight color tints. The LUT is computed once from the pipeline itself (feed 254 neutral XYZ values through M1→γ→M2, measure the a,b residuals), so it's self-consistent and doesn't depend on any external white point definition. |
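The idea can be sketched in a few lines. Everything below is illustrative: `rawNeutral` is a hypothetical stand-in for the uncorrected M1 → γ → M2 pipeline, with a made-up residual tint, and the helper names are assumptions rather than identifiers from helmlab.js. The toy table starts at L = 0, so no special handling below the first entry is needed here.

```js
// Hypothetical stand-in for the uncorrected pipeline applied to a neutral
// gray of relative luminance Y. Returns [L, a, b] with a small made-up tint.
function rawNeutral (Y) {
	const L = Math.cbrt(Y);
	return [L, 0.02 * L * (1 - L), -0.01 * L];
}

// Build the table once by feeding neutral grays through the pipeline itself.
function buildNcLut (levels) {
	const lut = { L: [], a: [], b: [] };
	for (let i = 0; i < levels; i++) {
		const [L, a, b] = rawNeutral(i / (levels - 1));
		lut.L.push(L);
		lut.a.push(a);
		lut.b.push(b);
	}
	return lut;
}

// Linearly interpolate the stored residual at lightness L (held at both ends).
function residualAt (lut, L) {
	const n = lut.L.length;
	if (L <= lut.L[0]) return [lut.a[0], lut.b[0]];
	if (L >= lut.L[n - 1]) return [lut.a[n - 1], lut.b[n - 1]];
	let i = 1;
	while (lut.L[i] < L) i++;
	const t = (L - lut.L[i - 1]) / (lut.L[i] - lut.L[i - 1]);
	return [
		lut.a[i - 1] + t * (lut.a[i] - lut.a[i - 1]),
		lut.b[i - 1] + t * (lut.b[i] - lut.b[i - 1]),
	];
}

// Forward subtracts the residual; the inverse adds it back.
function applyNc (lut, [L, a, b]) {
	const [ea, eb] = residualAt(lut, L);
	return [L, a - ea, b - eb];
}
function removeNc (lut, [L, a, b]) {
	const [ea, eb] = residualAt(lut, L);
	return [L, a + ea, b + eb];
}

const lut = buildNcLut(254);
const raw = rawNeutral(0.18);
const gray = applyNc(lut, raw); // a and b are now close to zero
const back = removeNc(lut, gray); // recovers the uncorrected values
```

Because the correction depends only on L, and L itself is left untouched, the inverse can look up exactly the same residual that the forward step subtracted.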
|
You mention standard D65, but "standard" is different depending on where you are looking. It seems the calculations for D65 you use are |
|
Helmlab uses I've updated both helmlab.js and helmgen.js to specify this explicitly:
white: [0.95047, 1, 1.08883],
Pushed. |
|
Sorry, you should update the LCh variants as well. I figured that was implied, but I should have been explicit 😅. |
|
Updated helmlch.js and helmgenlch.js as well. All four spaces now use the explicit white point. Pushed. |
facelessuser
left a comment
I've added a couple of comments. I may take a deeper look later (there's a lot of code here 🙃). I'm sure others will also take a look. Generally, it looks good, just a couple small things.
|
Done. All four spaces updated:
Thanks for the thorough review, really appreciate it 😊 Pushed. |
|
That asymmetric indentation in Helmlab is the chroma-dependent lightness function doing its work — the space intentionally warps around the neutral axis differently per hue region based on training data. Helmgen is the lightweight version, so yes, much more conventional shape. |
|
It seems in Rec. 2020 the warping in the blue region is far more prominent. Very interesting. |
|
Yes, the blue region warping reflects two things learned from training data: strong Helmholtz-Kohlrausch correction (H-K factor peaks at +2.595 around 210°, with ~30° hue shift at 240°), and chroma-dependent enrichment stages amplifying at Rec. 2020's extended blue chroma range. sRGB clips before those chroma levels are reached, so the warping stays subtle there. It's working as intended: blue is perceptually the hardest region (largest MacAdam ellipses, different S-cone distribution), and the model learned that from COMBVD. |
|
Cool, thanks for taking the time to explain aspects of the model. I'll let others nitpick linting, types, and such. I'm excited to play with the model more and explore its strengths and weaknesses. Thanks for sharing! |
|
Thanks for the thorough review, really appreciated! 😊 |
|
Was thinking about this a bit more. There should at least be some sanity tests: basic round trips and such. I haven't looked into any of that, but we should probably have some basic tests to ensure there haven't been any hiccups when porting to this library. |
|
Added conversion tests and round-trip checks for both Helmlab and HelmGen. 8 forward conversions + 4 round-trips each, all passing. Also fixed the generated type declarations that CI was complaining about. 👍 |
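For readers who want a feel for what a round-trip check measures, here is a minimal, library-independent sketch. It is not the test code added to the PR: `maxRoundTripError` is a hypothetical helper, and the forward/inverse pair is a toy per-channel cube root standing in for a real color space conversion.

```js
// Largest absolute per-channel error after forward → inverse over all samples.
function maxRoundTripError (forward, inverse, samples) {
	let worst = 0;
	for (const input of samples) {
		const output = inverse(forward(input));
		input.forEach((v, i) => {
			worst = Math.max(worst, Math.abs(v - output[i]));
		});
	}
	return worst;
}

// Hypothetical stand-in pair: a per-channel cube root and its inverse.
const toToy = xyz => xyz.map(v => Math.cbrt(v));
const fromToy = lab => lab.map(v => v ** 3);

// Black, a mid gray, and the D65 white quoted earlier in the thread.
const samples = [[0, 0, 0], [0.18, 0.18, 0.18], [0.95047, 1, 1.08883]];
console.log(maxRoundTripError(toToy, fromToy, samples) < 1e-12); // true
```

Including black and white among the samples is worthwhile, since edge cases at the ends of the lightness range are easy to miss.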
|
@Grkmyldz148 I did find one oddity. It's with the color black:
> new Color("black").to('helmlab').coords
[0, -0.12382799138859511, -0.009597774906363314] |
|
Good catch. The NC LUT starts at L≈0.07, so anything below that was clamping to the first entry instead of interpolating toward zero. Since black produces a=b=0 before NC by definition, the correct behavior at L=0 is [0, 0]. Fix is to linearly interpolate between [0, 0] at L=0 and the first LUT entry:
if (L <= 0) return [0, 0];
if (L < NC_L[0]) {
  const t = L / NC_L[0];
  return [NC_A[0] * t, NC_B[0] * t];
}
Pushing now, covers both helmlab and helmgen. |
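To see the fix in context, here is a self-contained sketch of a complete lookup with the low-lightness ramp included. The array names follow the snippet above, but the three-entry table is made up for illustration, and holding the last value above the table is an assumption of this sketch.

```js
// Hypothetical three-entry table; the real NC_L / NC_A / NC_B arrays are much longer.
const NC_L = [0.07, 0.5, 1.0];
const NC_A = [0.004, 0.010, 0.002];
const NC_B = [-0.002, -0.006, -0.001];

function ncResidual (L) {
	// Black has no residual by definition.
	if (L <= 0) return [0, 0];
	// Below the first entry: ramp linearly from [0, 0] up to the first entry.
	if (L < NC_L[0]) {
		const t = L / NC_L[0];
		return [NC_A[0] * t, NC_B[0] * t];
	}
	// Above the last entry: hold the last value (assumption).
	const last = NC_L.length - 1;
	if (L >= NC_L[last]) return [NC_A[last], NC_B[last]];
	// Otherwise interpolate between the two neighbouring entries.
	let i = 1;
	while (NC_L[i] < L) i++;
	const t = (L - NC_L[i - 1]) / (NC_L[i] - NC_L[i - 1]);
	return [
		NC_A[i - 1] + t * (NC_A[i] - NC_A[i - 1]),
		NC_B[i - 1] + t * (NC_B[i] - NC_B[i - 1]),
	];
}

console.log(ncResidual(0)); // [0, 0]
```

The ramp meets the first table entry exactly at `L = NC_L[0]`, so the correction stays continuous across the boundary.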
|
Cool, I was implementing over at https://github.com/facelessuser/coloraide, and I have more extensive round-trip testing, and this popped up. Glad it sounds like there is an easy fix. |
Opening an issue is fine. The only reason I didn't already implement it is that I hadn't really received feedback on whether it was desired here or not. I'd probably still wait for more input from others in the Color.js team before heading down this road to make sure it is wanted here. It would cause the implementation to deviate more from the pure CSS spec. Currently, it follows the CSS spec very closely as a reference, except for minor considerations for processing HDR spaces (which are not in the Color Level 4 spec). I do think it does well enough in most perceptual spaces, especially if you are keeping lightness constant. I think it is a better option for computationally heavy spaces like HCT, as you don't need to approximate out of HCT nearly as many times as MINDE can require. |
Reference values recalculated from Python implementation with: - depcubic_alpha: 0.021 (was 0.020) - chroma_power: 0.978 (new stage) - enrichment_amp: 0.058 (was 0.055)
|
The reference ranges for helmgenlch
The reference ranges for helmgen
My values are just suggestions, so if anyone thinks there are better values, feel free to suggest them. |
|
One other thing to note is that white is now |
|
@lloydk Thanks for flagging this. I scanned the actual sRGB gamut ranges for v0.11.1: Following OKLab's convention (sRGB coverage with ~1.2–1.4x headroom, symmetric round numbers), I'd suggest:
Regarding white L = 0.9996: this comes from the α change (0.020 → 0.021) while M2's L-row was normalized for the original α. The refRange [0, 1] is still correct as the intended range. I can renormalize the L-row to restore L(white) = 1.0 exactly, but that would change all reference values — happy to do it if you think it's worth it. Let me know if these values work for you before I commit. |
|
I think a lightness reference range of [0, 1] would likely be preferable to the average user, if possible. |
|
Also, the CSS range for Oklab was selected based on Display p3. Just an FYI. The world is slowly moving past sRGB. |
|
@facelessuser Thanks. I'll renormalize M2's L-row so L(white) = 1.0 exactly and update all test values accordingly. For the reference ranges, I scanned both sRGB and P3 corners:
sRGB: a ∈ [-0.29, 0.35], b ∈ [-0.49, 0.32], C max ≈ 0.57
For comparison, OKLab P3 corners reach a = -1.62, C = 1.67, well outside its [-0.4, 0.4] refRange. Following the same convention (reasonable display range covering sRGB, P3 extremes may exceed):
Will commit both changes (L-row renormalization + refRange update) together. |
|
@facelessuser I tested the L-row renormalization — scaling by 1.00034 to get L(white) = 1.0. It causes two regressions:
The M2 rows were jointly optimized by CMA-ES with the original α — scaling only the L-row breaks the balance and accumulates error over 1000 round-trips. Two options:
Which do you prefer? |
|
I'm still catching up on things. I'm not sure what the right answer is yet. |
|
Alright, I’ll wait to hear from you. |
|
I'm not immediately sure what the real-world implications of the regression are. While they seem like maybe they aren't that big, I'm really not sure. So, I'd leave it up to your personal preference on what is the more important goal of your space. This is your space after all. I'll let others chime in if they have strong feelings. |
|
I do feel like new chroma ranges need to get bumped to about 0.6 for a, b, and c to at least cover wide gamut chroma. |
M2 L-row recalibrated directly for Color.js D65 (same approach as OKLab, no Bradford CAT bridge needed). This makes helmgen.js and helmlab-js npm give identical values for the same XYZ input. Changes:
- helmgen.js: Remove Bradford CAT; recalibrate M2 for Color.js D65 directly → L(white) = 1.0 exactly; pipeline comment order fixed (PW_L before enrichment)
- helmgen a/b refRange: [-0.4,0.4] → [-0.6,0.6] (P3 blue corner b=-0.508)
- helmgenlch c refRange: [0,0.4] → [0,0.65] (Rec.2020 green C=0.641)
- helmlch c refRange: [0,0.4] → [0,1.5] (MetricSpace Rec.2020 C=1.384)
- helmlch l refRange: [0,1] → [0,1.144] (consistent with helmlab.js)
- helmlab a/b refRange: [-1,1] → [-1,1.5] (Rec.2020 b=1.322)
- test/conversions.js: update expected values for new M2 calibration (white: 0.9996→1.0, other colors: <0.001 change)
…s 219/219
- helmlab.js: restore correct v20b parameters (M1, GAMMA, M2; these were accidentally overwritten in the previous commit with non-matching values) + update a/b refRange [-1,1] → [-1,1.5] (Rec.2020 b=1.322)
- helmgen.js: normalize M2 L-row so L(white)=1.0 exactly (was 1.000000944, 9e-7 rounding from CMA-ES; scale factor 0.9999990554)
- test/conversions.js: update helmgen round-trip coords for new M2, add epsilon:0.001 to helmgen forward group (sRGB white a/b ≈ 7e-5 due to chromaticity vs ASTM E308 D65 white point difference)
- All 219/219 conversion tests pass
|
Thanks for the input. I've pushed two commits (b5c7a1a, bc3675a) addressing both open points: L(white) = 1.0: Chose to renormalize — M2 L-row is now scaled so that D65 white maps to L = 0.9999993 (< 1 ppm error). This costs one ColorBench win (P3 cliff max: WIN → TIE, 59–8 instead of 60–8), but keeps the lightness range clean at [0, 1] as intended. Chroma refRanges (addressing @facelessuser): Updated to cover wide-gamut chroma:
All 219 tests passing. |
- Update core parameters (M1, M2, GAMMA) to v21 optimized values
- Extend NC LUT from 254 to 384 points, covering L up to 2.59 (constant clamping beyond gray-axis peak at L≈1.29)
- Fix sign-preserving power: sign(x)*|x|^γ for negative LMS (v21 M1 maps sRGB blue → negative LMS component)
- Update all 68 enrichment parameters to v21 values
- Update test expected values and round-trip coords for v21
- STRESS scores: COMBVD=22.48★, MacAdam=19.51★, HF=23.26★
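The sign-preserving power mentioned in this commit can be illustrated as follows. This is a sketch rather than the code from helmlab.js: `signedPow` is a hypothetical name and the exponent is an arbitrary illustrative value, not one of the v21 parameters.

```js
// Sign-preserving power: sign(x) * |x|^gamma, defined for negative inputs too.
function signedPow (x, gamma) {
	return Math.sign(x) * Math.abs(x) ** gamma;
}

// A plain power returns NaN for a negative base with a fractional exponent.
console.log(Number.isNaN((-0.2) ** 0.4)); // true

// The signed form stays finite and inverts with the reciprocal exponent.
const gamma = 0.4; // illustrative exponent, not one of the v21 values
const y = signedPow(-0.2, gamma);
const x = signedPow(y, 1 / gamma);
console.log(y < 0, Math.abs(x + 0.2) < 1e-12); // true true
```

Because the function is odd, the same helper serves as its own inverse when called with `1 / gamma`, which keeps round trips well defined for negative LMS components.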
The formula g = coeff * L * (1-L)² was designed for L ∈ [0,1] where it peaks near L≈1/3 and goes to zero at L=1. For L>1 (wide-gamut colors outside the sRGB training domain), (1-L)² grows again causing catastrophic expansion (e.g. Rec.2020 magenta → L=24). Fix: replace (1-L)² with max(0, 1-L)² so dark_L is identity for L≥1. This has zero effect on COMBVD/MacAdam/HF benchmark scores (all training data has L<1). sRGB and Display P3 results are unchanged; Rec.2020 now gives finite, physically reasonable values. Also update refRange L from [0, 1.144] → [0, 1.6] to cover Display P3 magenta (L≈1.56 after the fix). Update NC LUT and test expected values.
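The difference between the two forms of the term is easy to see numerically. The sketch below evaluates only the term g itself; how g feeds into the lightness stage is not shown in this thread, and the function names and the coefficient are illustrative assumptions.

```js
// Original term: coeff * L * (1 - L)^2, which grows again for L > 1.
function darkTermOriginal (L, coeff) {
	return coeff * L * (1 - L) ** 2;
}

// Fixed term: the (1 - L) factor is clamped at zero, so the term vanishes for L >= 1.
function darkTermClamped (L, coeff) {
	return coeff * L * Math.max(0, 1 - L) ** 2;
}

const coeff = 1; // illustrative value, not the fitted coefficient
for (const L of [1 / 3, 1, 2, 5]) {
	console.log(L, darkTermOriginal(L, coeff), darkTermClamped(L, coeff));
}
```

For L ≤ 1 the two functions agree exactly, which is consistent with the benchmark scores being unaffected. For L = 2 and L = 5 the original term evaluates to 2 and 80 with this coefficient, while the clamped term is 0.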
|
It's possible I've done something wrong, but it seems like the removal of CAT has caused poor achromatic values:
>>> Color('white').convert('helmgen').coords()
[0.9999999999999999, -1.939001285792858e-05, -7.003990773532003e-05]
I admit, there's a possibility that I've done something incorrectly, but I suspect that the recent change is causing it. I am surprised that the approach was to patch M2. I would have thought patching M1 to ensure a good LMS transformation for the white point would have been the target, or maybe both, but I suspect just changing M2 might be what is causing this. |
|
I would probably revert the removal of CAT, combine the M1 and CAT matrix if the idea is to eliminate that extra step, or we'd have to correct whatever is going wrong with this change. |
Merge Bradford CAT (Color.js D65 → Helmlab D65) directly into M1: M1_eff = M1 @ CAT_TO_HELM Result: white now converts to [1, ~0, ~0] with machine-precision zeros (a ≈ -1e-16, b ≈ -7e-16) instead of [-2e-05, -7e-05]. Previously the CAT was removed and only M2's L-row was renormed, which left the a/b channels slightly off for Color.js D65 white. Merging the CAT into M1 restores the original mathematical equivalence without adding an extra transformation step. Round-trip precision is now at floating-point epsilon (~1e-13).
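Folding the adaptation into M1 relies on nothing more than associativity of matrix multiplication: M1 · (CAT · xyz) equals (M1 · CAT) · xyz. The sketch below demonstrates this with placeholder matrices; the numbers are made up for illustration and are not the fitted M1 or the actual Bradford-derived CAT, and `matMul` / `matVec` are hypothetical helpers.

```js
// Multiply two 3x3 matrices (row-major nested arrays).
function matMul (A, B) {
	return A.map(row => B[0].map((_, j) => row.reduce((sum, a, k) => sum + a * B[k][j], 0)));
}

// Apply a 3x3 matrix to a 3-vector.
function matVec (A, v) {
	return A.map(row => row.reduce((sum, a, k) => sum + a * v[k], 0));
}

// Placeholder matrices for illustration only; not the fitted M1 or the real CAT.
const M1 = [
	[0.8, 0.3, -0.1],
	[-0.2, 1.1, 0.1],
	[0.0, 0.1, 0.9],
];
const CAT_TO_HELM = [
	[1.001, 0.002, -0.001],
	[0.001, 0.999, 0.000],
	[0.000, 0.001, 1.002],
];

// Fold the adaptation into M1 once, ahead of time.
const M1_EFF = matMul(M1, CAT_TO_HELM);

// Both paths give the same LMS up to floating-point rounding.
const xyz = [0.95047, 1, 1.08883];
const twoStep = matVec(M1, matVec(CAT_TO_HELM, xyz));
const oneStep = matVec(M1_EFF, xyz);
```

The merged matrix removes a per-color multiplication at run time while leaving the mathematical result unchanged, which is why the white point lands on the neutral axis again.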
MetricSpace (helmlab) is designed for deltaE, not interpolation. Having a cylindrical form risks confusing users into using it for gradients instead of HelmGenLCh. Removing it keeps the API focused: helmlab → distance, helmgen/helmgenlch → interpolation. Raised by @[reviewer]; agreed, no strong reason to keep it.
- delta.js: update deltaEHelmlab expected values to v21 params (were v23) and fix description 'v23' → 'v21' - conversions.js: fix helmlab black expect [0,,] → [0,0,0] (sparse array)
|
I'm getting reasonable results in Helmgen now. Again, I don't know if I just had something wrong on my end, but the baked-in CAT seems to be working for me. I'm still behind on Helmlab. I guess it has changed a couple of times. |
|
This issue came up again: |
lloydk
left a comment
LGTM unless there are more changes coming.
I've added helmlab to my chart library.




Summary
Adds HELMLAB, a data-driven analytical color space family optimized for
UI design systems, trained on 64,000+ human color-difference judgments
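(arXiv:2602.23010). Four spaces:
- helmlab / helmlch: MetricSpace (72 params, 13-stage pipeline). Optimized for perceptual distance and color specification.
- helmgen / helmgenlch: GenSpace (21 params, cube-root compression). Optimized for interpolation: gradients, palettes, color-mix.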
Performance
COMBVD: 6 psychophysical datasets, 3,813 pairs, 64,000+ human judgments.
Bootstrap 95% CI non-overlapping, p < 10⁻⁴.
Recent update: Softened cube root GenSpace (v0.10.0)
GenSpace now uses a softened cube root transfer function: f(x) = (x+ε)^(1/3) - ε^(1/3) with ε=0.001. This gives finite derivative at zero, resulting in 360/360/360 smooth cusps across sRGB, P3, and Rec.2020 gamuts — no gamut boundary cliffs. Combined with CMA-ES optimized M1/M2 matrices and piecewise-linear L correction, GenSpace achieves 27-7 vs OKLab in head-to-head ColorBench benchmarks.
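The softened cube root and its inverse follow directly from the formula above. The sketch below is an illustration, not the GenSpace source: the function names are hypothetical, and extending the function to negative inputs by odd symmetry is an assumption of this sketch.

```js
const EPS = 0.001;
const EPS_CBRT = Math.cbrt(EPS);

// f(x) = (x + ε)^(1/3) - ε^(1/3) for x >= 0.
// Negative inputs are handled by odd symmetry, which is an assumption of this sketch.
function softCbrt (x) {
	return Math.sign(x) * (Math.cbrt(Math.abs(x) + EPS) - EPS_CBRT);
}

// Inverse: x = (y + ε^(1/3))^3 - ε, with the same odd symmetry.
function softCbrtInverse (y) {
	return Math.sign(y) * ((Math.abs(y) + EPS_CBRT) ** 3 - EPS);
}

// Slope at zero is (1/3) * ε^(-2/3), which is finite;
// a plain cube root has an infinite slope there.
const slopeAtZero = (1 / 3) * EPS ** (-2 / 3);
```

With ε = 0.001 the slope at zero works out to about 33.3. The offset term ε^(1/3) keeps f(0) = 0, so black still maps to zero.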
Key changes from v0.9.2:
Implementation notes
npm install helmlab, pip install helmlab
References