Commit 228919e

fangchenli and claude committed
BUG: fix bins normalization in value_counts after refactoring
The previous change to avoid unnecessary to_numpy conversion broke normalization when bins is used. Bins normalization should divide by the total input length, not the sum of counts in bins.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 4facd99 commit 228919e

1 file changed: +9 -3 lines changed

pandas/core/algorithms.py

Lines changed: 9 additions & 3 deletions
@@ -897,7 +897,11 @@ def value_counts_internal(
         if dropna and (result._values == 0).all():
             result = result.iloc[0:0]

+        # normalizing is by len of all (regardless of dropna)
+        normalize_denominator = len(ii)
+
     else:
+        normalize_denominator = None
         if is_extension_array_dtype(values):
             # handle Categorical and sparse,
             result = Series(values, copy=False)._values.value_counts(dropna=dropna)
@@ -925,8 +929,7 @@ def value_counts_internal(
             idx = Index(keys, dtype=keys.dtype, name=index_name)

             if (
-                bins is None
-                and not sort
+                not sort
                 and isinstance(values, (DatetimeIndex, TimedeltaIndex))
                 and idx.equals(values)
                 and values.inferred_freq is not None
@@ -940,7 +943,10 @@ def value_counts_internal(
         result = result.sort_values(ascending=ascending, kind="stable")

     if normalize:
-        result = result / result.sum()
+        if normalize_denominator is not None:
+            result = result / normalize_denominator
+        else:
+            result = result / result.sum()

     return result
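
The commit message says binned normalization should divide by the total input length rather than by the sum of the per-bin counts. The two denominators differ whenever some inputs land in no bin. The following is a minimal standalone sketch of that difference in plain Python; it does not call pandas, and `bin_counts` and `normalize` are hypothetical helpers that only loosely mirror the `normalize_denominator` branch in the diff.

```python
from bisect import bisect_left


def bin_counts(values, edges):
    """Count values per right-closed bin, with the lowest edge included.

    Hypothetical helper for illustration only (not pandas code).
    Values outside [edges[0], edges[-1]] are not counted in any bin.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # falls outside every bin
        # bisect_left finds the first edge >= v, so bin i covers (edges[i], edges[i + 1]];
        # max(..., 0) places a value equal to the lowest edge in the first bin.
        i = max(bisect_left(edges, v) - 1, 0)
        counts[i] += 1
    return counts


def normalize(counts, denominator=None):
    """Divide by an explicit denominator if given, else by the sum of counts."""
    total = denominator if denominator is not None else sum(counts)
    return [c / total for c in counts]


values = [1, 2, 3, 4, 10, 20, 30, 40]  # four of the eight values exceed the last edge
edges = [0, 2, 4]

counts = bin_counts(values, edges)           # [2, 2]
by_len = normalize(counts, len(values))      # [0.25, 0.25], fractions of all inputs
by_sum = normalize(counts)                   # [0.5, 0.5], fractions of binned inputs only
```

With the input length as denominator the fractions sum to 0.5, reflecting that half the inputs fell outside the bins; dividing by the sum of counts hides that and always sums to 1.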
