347. Top K Frequent Elements #15
Implement topKFrequent method to find top k frequent elements in a list.
| @@ -0,0 +1,36 @@ | |||
| # step1 | |||
| Through metacognition, I figured I should use a priority queue. | |||
I take it you mean that you picked up a hint from meta-information, but personally I would not call that metacognition.
You're right, my wording was off, lol.
Thanks for pointing it out.
| However, I could not come up with a good approach, so I copied out the code from the solutions as a reference. | ||
| # step2 | ||
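In the standard flow of this study group, Step 2 is where you look at other participants' code and at the section of the comment collection for the problem you are solving.
It would help reviewers if you listed what you looked at (as URLs), each paired with your impressions of it.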
I have added notes on the other participants' approaches that I looked at.
I also tried a solution using QuickSelect!
https://github.com/t-ooka/leetcode/blob/question/Top-K-Frequent-Elements/Top%20K%20Frequent%20Elements%20(retry)/step2-using-quickselect.py
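For readers who have not seen the idea, here is a minimal sketch of a QuickSelect-style selection on frequencies. It is not the code at the URL above; the function name top_k_frequent, the helper select, and the list-based (not in-place) partitioning are assumptions made for illustration.

```python
import random
from typing import List


def top_k_frequent(nums: List[int], k: int) -> List[int]:
    # Count occurrences of each number.
    num_to_frequency = {}
    for num in nums:
        num_to_frequency[num] = 1 + num_to_frequency.get(num, 0)

    def select(candidates: List[int], count: int) -> List[int]:
        # Return `count` of the most frequent numbers in `candidates`, in any order.
        if count <= 0:
            return []
        if count >= len(candidates):
            return list(candidates)
        # Partition around the frequency of a randomly chosen pivot.
        pivot_frequency = num_to_frequency[random.choice(candidates)]
        higher = [n for n in candidates if num_to_frequency[n] > pivot_frequency]
        equal = [n for n in candidates if num_to_frequency[n] == pivot_frequency]
        lower = [n for n in candidates if num_to_frequency[n] < pivot_frequency]
        if count <= len(higher):
            return select(higher, count)
        if count <= len(higher) + len(equal):
            return higher + equal[: count - len(higher)]
        return higher + equal + select(lower, count - len(higher) - len(equal))

    return select(list(num_to_frequency), k)
```

Only the side of the partition that still contains the boundary is recursed into, which is what distinguishes this from a full sort.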
| result = [] | ||
| while heap: | ||
| result.append(heapq.heappop(heap)[1]) | ||
| return result |
The problem statement says the answer may be returned in any order, so return heap would also be fine.
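One caveat when returning the heap directly: in the snippet under review each heap entry is a (frequency, num) tuple, so the numbers still have to be unpacked. A minimal sketch, assuming the same counting and heap-building steps as the reviewed code (the function name top_k_frequent is made up for illustration):

```python
import heapq


def top_k_frequent(nums, k):
    # Count occurrences of each number.
    counts = {}
    for num in nums:
        counts[num] = 1 + counts.get(num, 0)

    # Keep a min-heap of at most k (frequency, num) pairs.
    heap = []
    for num, frequency in counts.items():
        heapq.heappush(heap, (frequency, num))
        if len(heap) > k:
            heapq.heappop(heap)

    # Order does not matter, so read the numbers straight out of the heap list
    # instead of popping them one by one.
    return [num for _, num in heap]
```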
| counts = {} | ||
| for num in nums: | ||
| counts[num] = 1 + counts.get(num, 0) |
When doing this in Python, the Counter class from the standard collections module is handy.
import collections
counts = collections.Counter(nums)
https://docs.python.org/ja/3.13/library/collections.html#collections.Counter
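As a quick illustration of that suggestion, the sketch below builds the same tally both ways on a made-up sample list and then uses Counter.most_common to read off the top entries. The variable names are illustrative only.

```python
import collections

nums = [1, 1, 1, 2, 2, 3]

# Manual counting with a plain dict, as in the reviewed code.
manual_counts = {}
for num in nums:
    manual_counts[num] = 1 + manual_counts.get(num, 0)

# The same tally built by collections.Counter.
counts = collections.Counter(nums)
same_tally = dict(counts) == manual_counts

# most_common(k) returns the k most frequent (element, count) pairs.
top_two = [num for num, _ in counts.most_common(2)]
```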
Thank you!
I was aware of Counter, but after looking at the discussion around here I decided to implement it without Counter for now 🙇
| counts = {} | ||
| for num in nums: | ||
| counts[num] = 1 + counts.get(num, 0) | ||
| freq_num_pairs = [(cnt, num) for num, cnt in counts.items()] |
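When a variable name is a shortened English word, like freq, readers have to guess the original word, which can raise cognitive load. As a rule, I recommend spelling names out in full.
As for num and cnt, they are throwaway temporaries, and the keys and values of counts can be read off the code just above, so personally I find them acceptable.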
| num_to_frequency = {} | ||
| frequency_to_nums = [ [] for i in range(len(nums) + 1)] | ||
| for num in nums: | ||
| num_to_frequency[num] = 1 + num_to_frequency.get(num, 0) | ||
| for num, frequency in num_to_frequency.items(): | ||
| frequency_to_nums[frequency].append(num) |
Ordering the steps as: initialize num_to_frequency -> build num_to_frequency -> initialize frequency_to_nums -> build frequency_to_nums would be easier on the reader's working memory and easier to read.
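The suggested ordering could look roughly like this. It is a sketch, not the author's final code: the collection loop at the end is assumed from the other hunks in this review, and the standalone function name top_k_frequent is made up.

```python
from typing import List


def top_k_frequent(nums: List[int], k: int) -> List[int]:
    # 1. Initialize and build num_to_frequency.
    num_to_frequency = {}
    for num in nums:
        num_to_frequency[num] = 1 + num_to_frequency.get(num, 0)

    # 2. Initialize and build frequency_to_nums (index = frequency).
    frequency_to_nums = [[] for _ in range(len(nums) + 1)]
    for num, frequency in num_to_frequency.items():
        frequency_to_nums[frequency].append(num)

    # 3. Collect numbers from the highest frequency down until k are gathered.
    result = []
    for frequency in range(len(frequency_to_nums) - 1, 0, -1):
        for num in frequency_to_nums[frequency]:
            result.append(num)
            if len(result) == k:
                return result
    return result
```

Each structure is declared immediately before the loop that fills it, so the reader never has to hold an unused variable in mind.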
| class Solution: | ||
| def topKFrequent(self, nums: List[int], k: int) -> List[int]: | ||
| num_to_frequency = {} | ||
| frequency_to_nums = [ [] for i in range(len(nums) + 1)] |
If you store elements that appear freq times in frequency_to_nums[freq - 1], this array only needs length len(nums).
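A small sketch of that offset indexing, using a made-up sample list. A frequency always lies between 1 and len(nums), which is why the index frequency - 1 fits into len(nums) buckets.

```python
nums = [4, 4, 4, 5, 5, 6]

num_to_frequency = {}
for num in nums:
    num_to_frequency[num] = 1 + num_to_frequency.get(num, 0)

# len(nums) buckets suffice when a number seen `frequency` times
# is stored at index frequency - 1.
frequency_to_nums = [[] for _ in range(len(nums))]
for num, frequency in num_to_frequency.items():
    frequency_to_nums[frequency - 1].append(num)
```

The trade-off is that every later read has to remember the offset, so the direct index = frequency layout may still be the more readable choice.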
| frequency_to_nums[frequency].append(num) | ||
| res = [] | ||
| for i in range(len(frequency_to_nums) - 1, 0, -1): |
I would rename i -> freq myself, but either is fine.
| heap = [] | ||
| for num, freq in counts.items(): | ||
| heapq.heappush(heap, (freq, num)) | ||
| if len(heap) > k: | ||
| heapq.heappop(heap) | ||
| result = [] | ||
| while heap: | ||
| result.append(heapq.heappop(heap)[1]) | ||
| return result |
heapq.nlargest is equivalent to this and more concise.
return heapq.nlargest(k, counts, key=counts.get)
https://docs.python.org/ja/3/library/heapq.html#heapq.nlargest
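To see the suggested call in isolation, here is a small sketch on a made-up sample list. Note that nlargest returns the most frequent element first, whereas the pop loop above yields the least frequent of the top k first; the problem accepts either order.

```python
import collections
import heapq

nums = [1, 1, 1, 2, 2, 3]
counts = collections.Counter(nums)

# Iterating over counts yields its keys; key=counts.get ranks each key by its count.
top_two = heapq.nlargest(2, counts, key=counts.get)
```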
I did not know about this, so it really helps. Thank you!
| for frequency_value in range(len(frequency_buckets) - 1, 0, -1): | ||
| for num in frequency_buckets[frequency_value]: | ||
| result.append(num) | ||
| if len(result) == k: |
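It is worth thinking one step beyond just solving the problem.
For example, once this function is widely used, nums may contain fewer than k unique elements. In that case if len(result) == k is never reached and the function implicitly returns None. Is that what you want?
Some robustness against unexpected input is also handy. In the future nums might be [] or even None, and writing if not nums: return [] at the top at least avoids raising an error.
I would like you to picture a real job for a moment.
You have joined a team and just about finished setting up. The tech lead looks extremely busy.
They want to add a feature that shows the top K of something on a web page. Writing it would be quick for the tech lead, but going back and forth with reviewers and taking it to production is a lot of work, so you want to take it off their hands. That is the situation.
Even if ties cannot happen right now, circumstances may change, so I think it is better to avoid the behavior where many items tied for K-th place are all returned.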
https://discord.com/channels/1084280443945353267/1367399154200088626/1371325723612151918
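The concerns raised in this comment (an implicit None when there are fewer than k distinct values, empty or None input, and ties at the K-th place) can be sketched as follows. This is one possible arrangement under those assumptions, not the author's code; the standalone function name and the Optional type hint are illustrative.

```python
from typing import List, Optional


def top_k_frequent(nums: Optional[List[int]], k: int) -> List[int]:
    # Guard: an empty list or None yields an empty result instead of an error.
    if not nums:
        return []

    num_to_frequency = {}
    for num in nums:
        num_to_frequency[num] = 1 + num_to_frequency.get(num, 0)

    frequency_buckets = [[] for _ in range(len(nums) + 1)]
    for num, frequency in num_to_frequency.items():
        frequency_buckets[frequency].append(num)

    result = []
    for frequency_value in range(len(frequency_buckets) - 1, 0, -1):
        for num in frequency_buckets[frequency_value]:
            # Stop at exactly k items, even in the middle of a tie.
            if len(result) == k:
                return result
            result.append(num)
    # Reached when there are at most k distinct values: return what was found,
    # never an implicit None.
    return result
```

Whether to return fewer than k items or raise an error in that case is a design decision for the caller's context; the sketch simply picks the former.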
| @@ -0,0 +1,14 @@ | |||
| class Solution: | |||
| def topKFrequent(self, nums: List[int], k: int) -> List[int]: | |||
| counts = {} | |||
Given how it is used here, counts conveys too little information, so I would name it something like num_to_frequency.
| class Solution: | ||
| def topKFrequent(self, nums: List[int], k: int) -> List[int]: | ||
| num_to_frequency = {} | ||
| frequency_to_nums = [ [] for i in range(len(nums) + 1)] |
I don't think it is common to put a space between [ and [.
Corrected terminology from 'メタ認知' (metacognition) to 'メタ情報' (meta-information) in memo.
| def topKFrequent(self, nums: List[int], k: int) -> List[int]: | ||
| num_to_frequency = {} | ||
| for num in nums: | ||
| num_to_frequency[num] = 1 + num_to_frequency.get(num, 1) |
Shouldn't the default value be 0?
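A small sketch of why the default matters, on a made-up sample list. With a default of 1 every frequency comes out one too high, which for a bucket array of length len(nums) + 1 can also push the index out of range (for example nums = [7] would yield a frequency of 2).

```python
nums = [7, 7, 8]

# With a default of 1, the first occurrence is already counted as 2.
wrong = {}
for num in nums:
    wrong[num] = 1 + wrong.get(num, 1)

# With a default of 0, the first occurrence is counted as 1.
right = {}
for num in nums:
    right[num] = 1 + right.get(num, 0)
```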
| for num in nums: | ||
| num_to_frequency[num] = 1 + num_to_frequency.get(num, 1) | ||
| frequency_to_num_array = [] |
You can also write this neatly with sorted() and a lambda.
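The reviewer does not spell the idea out, so the following is only one plausible reading: sort the distinct numbers by frequency in descending order and take the first k. The sample list and variable names are made up.

```python
nums = [1, 1, 1, 2, 2, 3]
k = 2

num_to_frequency = {}
for num in nums:
    num_to_frequency[num] = 1 + num_to_frequency.get(num, 0)

# Sort the distinct numbers by their frequency, most frequent first.
nums_by_frequency = sorted(
    num_to_frequency,
    key=lambda num: num_to_frequency[num],
    reverse=True,
)
top_k = nums_by_frequency[:k]
```

This sorts all distinct values, so it costs O(n log n) in the worst case, compared with the bucket approach's linear pass.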
Problem: https://leetcode.com/problems/top-k-frequent-elements/description/