201 changes: 201 additions & 0 deletions 560_subarray_sum_equals_k_medium/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,201 @@
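# Problem link
[560. Subarray Sum Equals K](https://leetcode.com/problems/subarray-sum-equals-k/)

# Language
Python


# My approach
- If `nums` contained no negative numbers, two pointers would solve this in `O(n)` time / `O(1)` space, but since negative numbers are allowed, two pointers cannot be used.
- Binary search over the prefix sums gives `O(n log n)`, but it needs the prefix sums to be sorted, which again requires non-negative `nums`.
- I also considered adding a large constant (such as `-min(nums)`) to every element so that the prefix sums become non-negative, but this shifts each subarray sum by "number of elements × shift", so it does not work.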
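
To illustrate the last point, here is a minimal sketch. The helper `all_subarray_sums` and the sample input are illustrative assumptions, not part of the submission. It checks that adding a constant `shift` to every element moves each subarray sum by `shift * length`, so no single adjusted target `k` works for all subarray lengths.

```python
def all_subarray_sums(nums: list[int]) -> dict[tuple[int, int], int]:
    """Map each half-open range (i, j) to sum(nums[i:j])."""
    return {
        (i, j): sum(nums[i:j])
        for i in range(len(nums))
        for j in range(i + 1, len(nums) + 1)
    }


nums = [1, -1, 1]
shift = -min(nums)  # 1, which makes every element non-negative
shifted = [num + shift for num in nums]

original_sums = all_subarray_sums(nums)
shifted_sums = all_subarray_sums(shifted)
# Each subarray sum moves by shift * (its length), not by a fixed amount.
offsets = {key: shifted_sums[key] - original_sums[key] for key in original_sums}
```

Here `offsets[(0, 1)]` and `offsets[(0, 3)]` differ, which is exactly why a single shifted target cannot be used.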

## step1
Nested-loop approach
```python
class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        num_subarrays = 0
        # cumsums[i] = sum(nums[:i])
        # sum(nums[i:j]) = cumsums[i] - cumsums[j]
        # [Review comment] Shouldn't this be sum(nums[i:j]) = cumsums[j] - cumsums[i]?
        # The code itself gets it right, so I assume it is just a typo.
        cumsums = [0] * (len(nums) + 1)
        # [Review comment] I found the word "cumsum" a little hard to follow.
        # Spelling it out, e.g. prefix_sum or cumulative_sum, would be clearer.
        for i in range(len(nums)):
            cumsums[i + 1] = cumsums[i] + nums[i]

        for i in range(len(nums)):
            for j in range(i + 1, len(nums) + 1):
                sum_from_i_to_j = cumsums[j] - cumsums[i]
                if sum_from_i_to_j == k:
                    num_subarrays += 1

        return num_subarrays
```

Let `n` be the length of `nums`:
- Time complexity: `O(n^2)`
- Space complexity: `O(n)`

- The space complexity can be reduced to `O(1)`.

- The cumulative sums can also be built as follows:
```python
cumsums = [0]
for num in nums:
    cumsums.append(cumsums[-1] + num)
```
(ref: https://github.com/tokuhirat/LeetCode/pull/16/files?short_path=d4900f9#diff-d4900f989c6f9680b8e8144658ef8f10d6025523b2c0c63bed653dcdcc4fc290)
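
As another option, the standard library can build the same list. The sketch below assumes Python 3.8 or later, where `itertools.accumulate` accepts the `initial` keyword. The sample `nums` is only an illustration.

```python
import itertools

nums = [1, 2, 3, -3, 3]
# initial=0 prepends the empty-prefix sum, so cumsums[i] == sum(nums[:i]).
cumsums = list(itertools.accumulate(nums, initial=0))
```

The result has `len(nums) + 1` elements, matching the `cumsums[i] = sum(nums[:i])` definition used in step1.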

## step2
Reducing the space complexity to `O(1)`
```python
class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        num_subarrays = 0
        for i in range(len(nums)):
            subarray_sum = 0  # sum(nums[i:j+1])
            for j in range(i, len(nums)):
                subarray_sum += nums[j]
                if subarray_sum == k:
                    num_subarrays += 1
        return num_subarrays
```

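- If `cumsums` is defined as `cumsums[i] = sum(nums[:i])`, it is one element longer than the original array, so index management gets fiddly. In particular, when a solution makes only a single pass with one for loop, choosing the bounds of `range` is the key point and an easy place to introduce bugs.
- If it is defined as `cumsums[i] = sum(nums[:i+1])`, `cumsums` has the same length as `nums`, but `cumsums[i] - cumsums[i-1] = nums[i]` does not hold at `i=0`, so an extra branch is needed.
- A cumulative sum is also called a prefix sum.
- cf. https://en.wikipedia.org/wiki/Prefix_sum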
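
The two definitions above can be compared side by side. This is a minimal sketch: the names `exclusive`, `inclusive`, and the two helper functions are hypothetical and chosen only for illustration.

```python
nums = [3, -1, 4]

# Definition 1: exclusive[i] == sum(nums[:i]); one element longer than nums.
exclusive = [0]
for num in nums:
    exclusive.append(exclusive[-1] + num)

# Definition 2: inclusive[i] == sum(nums[:i + 1]); same length as nums.
inclusive = []
running_sum = 0
for num in nums:
    running_sum += num
    inclusive.append(running_sum)


def range_sum_exclusive(i: int, j: int) -> int:
    """Return sum(nums[i:j]) with no special case."""
    return exclusive[j] - exclusive[i]


def range_sum_inclusive(i: int, j: int) -> int:
    """Return sum(nums[i:j + 1]); i == 0 needs its own branch."""
    if i == 0:
        return inclusive[j]
    return inclusive[j] - inclusive[i - 1]
```

Note that in Python, forgetting the `i == 0` branch does not raise an error: `inclusive[-1]` silently reads the last element, which makes this bug easy to miss.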



## step3

`step3_1.py` (20 min)
```python
from collections import defaultdict


class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        """
        Find the number of pairs (i, j) such that sum(nums[i:j]) == k (0 <= i < j <= n).
        This can be found as below:
        for each j = 1, ..., len(nums), find the number of i (0 <= i < j) that satisfies sum(nums[:j]) - k == sum(nums[:i]).
        Note that sum(nums[:0]) is defined as 0.
        """
        # [Review comment on the docstring body] This describes the implementation. It is of no use
        # to callers of the function and can be read off the code below, so I don't think it needs
        # to be written in the docstring.
        num_subarrays = 0
        # sum(nums[:i]) -> count
        subarray_sum_count: dict[int, int] = defaultdict(int)
        sum_to_j = 0
        # [Review comment] I wondered "what is j?" here, so placing this right before
        # `for j in ...` might reduce the cognitive load.
        # Add sum(nums[:0]) = 0
        subarray_sum_count[0] = 1
        for j in range(1, len(nums) + 1):
            # [Review comment] It feels odd for j to appear first when i is not used as a loop
            # counter. (By convention, j usually denotes a second loop counter.)
            # If it carries a special meaning, the variable name should convey that first.
            sum_to_j += nums[j - 1]  # sum(nums[:j])
            count = subarray_sum_count[sum_to_j - k]
            num_subarrays += count
            subarray_sum_count[sum_to_j] += 1
        return num_subarrays
```
- `sum_to_j` might be better named `prefix_sum`.
- Likewise, `subarray_sum_count` might be better named `prefix_sum_count`.

Review comment: I would use something like prefix_sum_to_count or cumsum_to_frequency.


Test cases
- Single element: `nums=[1]`, `k=1` -> `1`; `nums=[1]`, `k=0` -> `0`
- Repeated elements: `nums=[1, 1, 1]`, `k=3` -> `1`; `nums=[1, 1, 1]`, `k=2` -> `2`
- Typical case: `nums=[1, 2, 3, -3, 3]`, `k=3` -> `5`
- k=0: `nums=[1, -1, 1, -1]`, `k=0` -> `4`
- Contains zeros: `nums=[0, 1, 0]`, `k=1` -> `4`
  - With addition, 0 is often a special case (because it is the identity element)
  - With multiplication, it is 1
- Negative k: `nums=[-1, -1, -1]`, `k=-2` -> `2`
- Empty array: `nums=[]`, `k=0` -> `0`; `nums=[]`, `k=1` -> `0`
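
The cases above can be checked mechanically. The sketch below is one possible harness, not the submitted code: `count_subarrays` is a one-pass prefix-sum counter written for this example, and `count_subarrays_brute_force` is a hypothetical reference implementation used to cross-check the expected values. For `nums=[1, 2, 3, -3, 3]`, `k=3`, both return `5`.

```python
def count_subarrays(nums: list[int], k: int) -> int:
    """Count subarrays summing to k with a prefix-sum -> count mapping."""
    prefix_sum_to_count: dict[int, int] = {0: 1}  # the empty prefix
    prefix_sum = 0
    total = 0
    for num in nums:
        prefix_sum += num
        total += prefix_sum_to_count.get(prefix_sum - k, 0)
        prefix_sum_to_count[prefix_sum] = prefix_sum_to_count.get(prefix_sum, 0) + 1
    return total


def count_subarrays_brute_force(nums: list[int], k: int) -> int:
    """Reference answer: try every non-empty subarray."""
    return sum(
        1
        for i in range(len(nums))
        for j in range(i + 1, len(nums) + 1)
        if sum(nums[i:j]) == k
    )


TEST_CASES = [
    ([1], 1, 1),
    ([1], 0, 0),
    ([1, 1, 1], 3, 1),
    ([1, 1, 1], 2, 2),
    ([1, 2, 3, -3, 3], 3, 5),
    ([1, -1, 1, -1], 0, 4),
    ([0, 1, 0], 1, 4),
    ([-1, -1, -1], -2, 2),
    ([], 0, 0),
    ([], 1, 0),
]

for nums, k, expected in TEST_CASES:
    assert count_subarrays(nums, k) == expected
    assert count_subarrays_brute_force(nums, k) == expected
```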


```python
from collections import defaultdict
class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        num_subarrays = 0
        prefix_sum_counts = defaultdict(int)
        prefix_sum = 0
        for index in range(len(nums)+1):
            # [Review comment] As a rule, binary operators such as + and - should have a single
            # space on each side.
            # https://peps.python.org/pep-0008/#other-recommendations
            # https://peps.python.org/pep-0008/#whitespace-in-expressions-and-statements
            # [Review comment] If you initialize with prefix_sum_counts[0] += 1, index no longer
            # needs to start from 0, which I find more readable.
            # Also, index is only used to access nums, so `for num in nums:` should be enough.
            if index > 0:
                prefix_sum += nums[index-1]
            count_end_with_index = prefix_sum_counts[prefix_sum-k]
            num_subarrays += count_end_with_index
            # [Review comment on the two lines above] It is only about two lines and the value is
            # not reused afterward, so I think `count` would convey the intent well enough.
            prefix_sum_counts[prefix_sum] += 1

        return num_subarrays
```

## step4 (FB)



# Alternative and model solutions
## Hash map approach
This brings the time complexity down to `O(n)`.
```python
from collections import defaultdict


class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        # k = sum(nums[i:j]) = cumsums[j] - cumsums[i]
        # if cumsums[j]-k in cumsum_hashmap for j in i+1, ..., then OK
        num_subarrays = 0
        # cumsums[i] = sum(nums[:i])
        cumsums = [0] * (len(nums) + 1)
        for i in range(len(nums)):
            cumsums[i + 1] = cumsums[i] + nums[i]
        # [Review comment on the three lines above] This can also be written as follows.
        #     import itertools
        #     cumsums = [0] + list(itertools.accumulate(nums))
        # https://docs.python.org/ja/3.13/library/itertools.html#itertools.accumulate

        hashmap: dict[int, int] = defaultdict(int)
        # [Review comment] Naming a variable hashmap is essentially the same as naming it dict,
        # so it does not help the reader much. (Given the type hint, it carries no information.)
        # I would use something like cumsum_to_count. (For dict variables I usually use
        # {key}_to_{value}.)
        for i in range(len(nums) + 1):
            num_subarrays += hashmap[cumsums[i] - k]
            hashmap[cumsums[i]] += 1
        return num_subarrays
```

Review comment: This is written as two passes, but it can also be written in one pass.

from collections import defaultdict


class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        cumsum_to_count = defaultdict(int)
        cumsum_to_count[0] += 1
        cumsum = 0
        total_count = 0
        for num in nums:
            cumsum += num
            total_count += cumsum_to_count.get(cumsum - k, 0)
            cumsum_to_count[cumsum] += 1
        return total_count


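- The key point is that the subarrays themselves are not needed, only their count. If only counts are needed, keeping them in a hash map gives `O(1)` access.
- The `cumsums`-based solution, by contrast, identifies the subarrays themselves.
- However, compared with the `O(1)`-space version of the `cumsums` solution, this is a trade-off between time and space complexity, so which one is better depends on the situation.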
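
If the subarrays themselves were needed rather than just the count, one possible variation is to store the positions of each prefix sum instead of a counter. This is only a sketch under that assumption: `find_subarrays` and `prefix_sum_to_ends` are hypothetical names, and the output can hold `O(n^2)` pairs in the worst case, so the `O(n)` bound no longer applies.

```python
from collections import defaultdict


def find_subarrays(nums: list[int], k: int) -> list[tuple[int, int]]:
    """Return every half-open range (i, j) with sum(nums[i:j]) == k."""
    # prefix sum value -> list of positions i where sum(nums[:i]) has that value
    prefix_sum_to_ends: dict[int, list[int]] = defaultdict(list)
    prefix_sum_to_ends[0].append(0)  # the empty prefix
    prefix_sum = 0
    ranges = []
    for j, num in enumerate(nums, start=1):
        prefix_sum += num
        # .get avoids inserting an empty list for missing keys.
        for i in prefix_sum_to_ends.get(prefix_sum - k, []):
            ranges.append((i, j))
        prefix_sum_to_ends[prefix_sum].append(j)
    return ranges
```

For example, `find_subarrays([1, 1, 1], 2)` returns `[(0, 2), (1, 3)]`, i.e. `nums[0:2]` and `nums[1:3]`.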

- Time complexity: `O(1)`
- Space complexity: `O(n)`

Review comment (on the two lines above): These are swapped...

Review comment: Aren't both time and space `O(n)`?

Review comment: Exhaustive search with a plain nested loop is `O(n^2)` time and `O(1)` space.

The hashmap approach is `O(n)` time and `O(n)` space. It is a single loop, so there are n iterations. The `O(n)` work of enumerating candidates in each iteration is sped up to `O(1)` by the hashmap. However, storing those candidates requires `O(n)` space, and that is the trade-off. (You can picture the hashmap as converting a factor of n in time into a factor of n in space.)

The two pointers approach is `O(n)` time and `O(1)` space, because left and right each sweep the array linearly, for roughly 2n steps in total. It corresponds to pruning the plain nested loop.

On the two axes of speed and space, the latter two are the Pareto-optimal choices, and two pointers feels like the best all-rounder.


## When `nums` contains only positive numbers (two pointers)
```python
class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        """
        Count subarrays whose sum equals k in O(n), assuming every element of nums is positive.
        """
        if not nums:
            return 0
        # [Review comment on the two lines above] This can be omitted: without it the for loop is
        # simply skipped and 0 is returned anyway. (That said, I would not call it unnatural to
        # keep it.)
        num_subarrays = 0
        subarray_sum = 0
        left = 0
        for right in range(len(nums)):
            subarray_sum += nums[right]
            while left < right and subarray_sum > k:
                subarray_sum -= nums[left]
                left += 1

            if subarray_sum == k:
                num_subarrays += 1

        return num_subarrays
```
- Note, however, that this gets considerably more complicated when `nums` contains 0.
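
One known way to handle zeros is to count windows with sum at most `k` and subtract those with sum at most `k - 1`. It is a different technique from the code above. The sketch below assumes every element of `nums` is a non-negative integer. The function names are hypothetical.

```python
def count_sum_at_most(nums: list[int], limit: int) -> int:
    """Count non-empty subarrays with sum <= limit (nums must be non-negative)."""
    if limit < 0:
        return 0
    count = 0
    window_sum = 0
    left = 0
    for right, num in enumerate(nums):
        window_sum += num
        # Shrink until the window fits; an empty window has sum 0 <= limit.
        while window_sum > limit:
            window_sum -= nums[left]
            left += 1
        # Every subarray ending at right and starting in [left, right] fits.
        count += right - left + 1
    return count


def count_sum_equal(nums: list[int], k: int) -> int:
    """Count subarrays with sum == k for non-negative integer nums."""
    return count_sum_at_most(nums, k) - count_sum_at_most(nums, k - 1)
```

This keeps `O(n)` time and `O(1)` extra space at the cost of two sweeps. For `nums=[0, 1, 0]`, `k=1` it returns `4`, matching the test case listed earlier.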


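# Anticipated follow-up questions
- What is the worst-case space usage of this implementation? How could the actual memory usage be reduced?
  - Using `dict.get` instead of indexing a `defaultdict` may reduce memory usage, because indexing a `defaultdict` with a missing key inserts that key.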
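
The difference between the two lookups can be observed directly. This is a small illustrative sketch. The variable names are made up for the example, and the worst case of `O(n)` distinct prefix sums applies either way.

```python
from collections import defaultdict

counts_via_index: dict[int, int] = defaultdict(int)
counts_via_get: dict[int, int] = defaultdict(int)

for missing_key in range(1000):
    _ = counts_via_index[missing_key]       # inserts missing_key with value 0
    _ = counts_via_get.get(missing_key, 0)  # leaves the dict untouched

sizes = (len(counts_via_index), len(counts_via_get))
```

In the prefix-sum solution this matters for the `prefix_sum - k` lookup: with `[]`, every looked-up value that was never seen still ends up stored as a key.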

# Next problems to solve
- [String to Integer (atoi) - LeetCode](https://leetcode.com/problems/string-to-integer-atoi/description/)
- [Number of Islands - LeetCode](https://leetcode.com/problems/number-of-islands/description/)
28 changes: 28 additions & 0 deletions 560_subarray_sum_equals_k_medium/step1.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
#
# @lc app=leetcode id=560 lang=python3
#
# [560] Subarray Sum Equals K
#

# @lc code=start


class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        num_subarrays = 0
        # cumsums[i] = sum(nums[:i])
        # sum(nums[i:j]) = cumsums[j] - cumsums[i]
        cumsums = [0] * (len(nums) + 1)
        for i in range(len(nums)):
            cumsums[i + 1] = cumsums[i] + nums[i]

        for i in range(len(nums)):
            for j in range(i + 1, len(nums) + 1):
                sum_from_i_to_j = cumsums[j] - cumsums[i]
                if sum_from_i_to_j == k:
                    num_subarrays += 1

        return num_subarrays


# @lc code=end
28 changes: 28 additions & 0 deletions 560_subarray_sum_equals_k_medium/step1_hashmap.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
#
# @lc app=leetcode id=560 lang=python3
#
# [560] Subarray Sum Equals K
#

# @lc code=start
from collections import defaultdict


class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        # k = sum(nums[i:j]) = cumsums[j] - cumsums[i]
        # if cumsums[j]-k in cumsum_hashmap for j in i+1, ..., then OK
        num_subarrays = 0
        # cumsums[i] = sum(nums[:i])
        cumsums = [0] * (len(nums) + 1)
        for i in range(len(nums)):
            cumsums[i + 1] = cumsums[i] + nums[i]

        hashmap: dict[int, int] = defaultdict(int)
        for i in range(len(nums) + 1):
            num_subarrays += hashmap[cumsums[i] - k]
            hashmap[cumsums[i]] += 1
        return num_subarrays


# @lc code=end
20 changes: 20 additions & 0 deletions 560_subarray_sum_equals_k_medium/step2.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
#
# @lc app=leetcode id=560 lang=python3
#
# [560] Subarray Sum Equals K
#

# @lc code=start
class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        num_subarrays = 0
        for i in range(len(nums)):
            subarray_sum = 0  # sum(nums[i:j+1])
            for j in range(i, len(nums)):
                subarray_sum += nums[j]
                if subarray_sum == k:
                    num_subarrays += 1
        return num_subarrays


# @lc code=end
25 changes: 25 additions & 0 deletions 560_subarray_sum_equals_k_medium/step2_hashmap.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
#
# @lc app=leetcode id=560 lang=python3
#
# [560] Subarray Sum Equals K
#

# @lc code=start
from collections import defaultdict


class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        # find the number of pairs of (i,j) s.t. 0<=i<j<=n and k = sum(nums[:j]) - sum(nums[:i])
        num_subarrays = 0
        cumsum_counts: dict[int, int] = defaultdict(int)
        cumsum = 0
        cumsum_counts[cumsum] = 1
        for num in nums:
            cumsum += num
            num_subarrays += cumsum_counts[cumsum - k]
            cumsum_counts[cumsum] += 1
        return num_subarrays


# @lc code=end
23 changes: 23 additions & 0 deletions 560_subarray_sum_equals_k_medium/step3.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
#
# @lc app=leetcode id=560 lang=python3
#
# [560] Subarray Sum Equals K
#

# @lc code=start
from collections import defaultdict
class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        num_subarrays = 0
        prefix_sum_counts = defaultdict(int)
        prefix_sum = 0
        for index in range(len(nums)+1):
            if index > 0:
                prefix_sum += nums[index-1]
            count_end_with_index = prefix_sum_counts[prefix_sum-k]
            num_subarrays += count_end_with_index
            prefix_sum_counts[prefix_sum] += 1

        return num_subarrays

# @lc code=end
31 changes: 31 additions & 0 deletions 560_subarray_sum_equals_k_medium/step3_1.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
#
# @lc app=leetcode id=560 lang=python3
#
# [560] Subarray Sum Equals K
#

# @lc code=start
from collections import defaultdict


class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        """
        Count subarrays with sum == k using prefix sums:
        For each prefix S_j, add how many prior prefixes S_i satisfy S_j - S_i = k.
        """
        num_subarrays = 0
        # sum(nums[:i]) -> count
        subarray_sum_count: dict[int, int] = defaultdict(int)
        sum_to_j = 0
        # Add sum(nums[:0]) = 0
        subarray_sum_count[0] = 1
        for j in range(1, len(nums) + 1):
            sum_to_j += nums[j - 1]  # sum(nums[:j])
            count = subarray_sum_count[sum_to_j - k]
            num_subarrays += count
            subarray_sum_count[sum_to_j] += 1
        return num_subarrays


# @lc code=end
29 changes: 29 additions & 0 deletions 560_subarray_sum_equals_k_medium/two_pointers.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
#
# @lc app=leetcode id=560 lang=python3
#
# [560] Subarray Sum Equals K
#
# @lc code=start
class Solution:
    def subarraySum(self, nums: list[int], k: int) -> int:
        """
        Count subarrays whose sum equals k in O(n), assuming every element of nums is positive.
        """
        if not nums:
            return 0
        num_subarrays = 0
        subarray_sum = 0
        left = 0
        for right in range(len(nums)):
            subarray_sum += nums[right]
            while left < right and subarray_sum > k:
                subarray_sum -= nums[left]
                left += 1

            if subarray_sum == k:
                num_subarrays += 1

        return num_subarrays


# @lc code=end