Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
40 changes: 40 additions & 0 deletions arai60/unique-email-addresses/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
## 考察
- 初見の問題
- 方針
- 文字列操作してHashSetに追加すればよさそう
- C++にはたしかsplitがなかったはずなので、Pythonでやる
- 文字列操作の代わりに正規表現でも出来そう
- ソラでは書ける自信がない(書けるようになっておいた方がいい?)
- 後ほど、調べながらやる
- time: O(nm), space: O(nm)
- n: emails.length, m: max_i(emails[i].length)
- あとは実装

## Step1
- splitとreplaceを使った
- time: O(nm), space: O(nm)

## Step2
- まずは正規表現でもやってみる -> `step2_re.py`
- 他の人のPRを検索
- splitしてからreplaceした方が効率が良い
- Ref. https://github.com/Mike0121/LeetCode/pull/31#discussion_r1642845600
- 各emailに対する処理は関数化した方がわかりやすいかも
- Ref. https://github.com/SuperHotDogCat/coding-interview/pull/30/files#r1641769186
- 正規表現について
- Ref. https://github.com/fhiyo/leetcode/pull/17#discussion_r1629618002

- 上記を元に各操作を関数化して組み合わせる書き方に修正した
- local_nameとdomain_nameの組み合わせをtupleで表現した

## Step3
- 1回目: 2m19s
- 2回目: 2m15s
- 3回目: 2m02s

## Step4
- レビューを元に修正
- privateメソッド相当のものには先頭に _ を付与
- メソッド名の見直し
- `claen_local_name` -> `_normalize_local_name`
- `remove_after_plus` -> `_remove_plus_and_after`
9 changes: 9 additions & 0 deletions arai60/unique-email-addresses/step1.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
class Solution:
def numUniqueEmails(self, emails: List[str]) -> int:
unique_emails = set()
for email in emails:
local_name, domain_name = email.split('@')
formatted_local_name = local_name.replace('.', '').split('+')[0]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 行に多くの要素を詰め込み過ぎているように感じました。

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ありがとうございます。2行に分けた方が良さそうですね。

unique_emails.add(formatted_local_name + '@' + domain_name)

return len(unique_emails)
20 changes: 20 additions & 0 deletions arai60/unique-email-addresses/step2.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
class Solution:
def numUniqueEmails(self, emails: List[str]) -> int:
unique_emails = set()
for email in emails:
local_name, domain_name = email.split('@')
unique_emails.add((
self.clean_local_name(local_name),
domain_name,
))

return len(unique_emails)

def clean_local_name(self, local_name):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private メソッド相当ですので、関数名を _ で始めると良いかもしれません。

また、local_name をきれいにするというよりは、正規化するといったほうがしっくりきます。 _normalize_local_name() はいかがでしょうか? _canonicalize_local_name() でも良いと思います。

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

たしかにクレンジングより正規化の方がしっくり来ました。_normalize_local_name() いいですね、ありがとうございます。
underscoreの話はこの辺りですかね。
Use one leading underscore only for non-public methods and instance variables.
https://peps.python.org/pep-0008/#method-names-and-instance-variables

return self.remove_all_dots(self.remove_after_plus(local_name))

def remove_all_dots(self, text):
return text.replace('.', '')

def remove_after_plus(self, text):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

細かい点となりますが、 after plus ですと、 plus より後を表すように思います。 remove_plus_and_after() はいかがでしょうか?

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ありがとうございます。remove_plus_and_after()pick_before_plus() が良さそうです。

return text.split('+')[0]
10 changes: 10 additions & 0 deletions arai60/unique-email-addresses/step2_re.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
class Solution:
def numUniqueEmails(self, emails: List[str]) -> int:
unique_emails = set()
for email in emails:
local_name, domain_name = email.split('@')
local_name = re.sub(r'\+.*', '', local_name)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

個人的な感覚ですが、組み込み関数と正規表現で、同じ処理を粉う場合、組み込み関数を使ったほうが、読んでいて認知負荷が低く読みやすく感じます。

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

参考になります。こちら自分も同じ感覚です。(正規表現にあまり慣れていないこともありますが...)

local_name = re.sub(r'\.', '', local_name)
unique_emails.add(local_name + '@' + domain_name)

return len(unique_emails)
19 changes: 19 additions & 0 deletions arai60/unique-email-addresses/step3.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
class Solution:
def numUniqueEmails(self, emails: List[str]) -> int:
unique_emails = set()
for email in emails:
local_name, domain_name = email.split('@')
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

split, replace あたりの標準マニュアルは一回見ておきましょう。
https://docs.python.org/3.12/library/stdtypes.html#str.split
意外な引数や機能があったりするからです。

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

いつもありがとうございます。
一通り目を通しておきます。

unique_emails.add((
self.clean_local_name(local_name),
domain_name,
))
return len(unique_emails)

def clean_local_name(self, local_name):
return self.remove_all_dots(self.remove_after_plus(local_name))

def remove_all_dots(self, text):
return text.replace('.', '')

def remove_after_plus(self, text):
return text.split('+')[0]
19 changes: 19 additions & 0 deletions arai60/unique-email-addresses/step4.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
class Solution:
def numUniqueEmails(self, emails: List[str]) -> int:
unique_emails = set()
for email in emails:
local_name, domain_name = email.split('@')
unique_emails.add((
self._normalize_local_name(local_name),
domain_name,
))
return len(unique_emails)

def _normalize_local_name(self, local_name):
return self._remove_all_dots(self._remove_plus_and_after(local_name))

def _remove_all_dots(self, text):
return text.replace('.', '')

def _remove_plus_and_after(self, text):
return text.split('+')[0]
Comment on lines +15 to +19
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1行のみの処理なら関数化しなくてもいいのでは、と思いました。複数箇所で使われていたり、よっぽど分かりづらい内容なら関数名がつくことで理解が捗ることもあるかも知れませんが、どちらの関数の処理も非常に明快に感じます。

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_normalize_local_name に両方の処理があると良いかなと思います

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

たしかにここは好みの問題かと思います。ありがとうございます。