
Commit dc0c819

📖train: fine-tune model on 800,000 data records
1 parent dff897c commit dc0c819

2 files changed

Lines changed: 2 additions & 1 deletion


data/data.csv

Lines changed: 1 addition & 0 deletions
@@ -4226,3 +4226,4 @@ id,content,category
 4237,힘-쓰다 : to put effort into doing work.,dictionary
 4238,힘없-이 : without energy or motivation or the like.,dictionary
 4239,힘-차다 : to be strong and spirited.,dictionary
+4240,"Today, on my way home, a magpie suddenly flew around above my head. It was the first time I had really seen a magpie fly over my head, so I was startled, but at the same time it was fascinating. At that moment it even felt as if my head were being scratched a little, but it was a spine-tingling experience. I wonder how often this kind of thing happens. Why did the magpie fly over my head? Could it be a sign of good luck? Even though I take the same route every day, it is truly fascinating that something new always happens.",
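
As shown, the added row 4240 ends with a trailing comma, which would leave its `category` field empty, unlike the `dictionary` rows above it. A minimal sketch of how such rows could be flagged with Python's standard `csv` module follows; the helper name `rows_missing_category` and the inline sample are illustrative assumptions, not part of this commit.

```python
import csv
import io


def rows_missing_category(csv_text):
    """Return the ids of rows whose 'category' field is empty or absent."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["id"] for row in reader
            if not (row.get("category") or "").strip()]


# Illustrative sample shaped like data/data.csv (abbreviated, not the real rows).
SAMPLE = (
    "id,content,category\n"
    "4239,some dictionary entry,dictionary\n"
    '4240,"a diary entry, with commas inside the quotes",\n'
)

if __name__ == "__main__":
    print(rows_missing_category(SAMPLE))
```

The quoted `content` field may contain commas without shifting columns, so only the truly empty third field is reported.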

finetune_doc2vec.py

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ def main():
     model = Doc2Vec.load("models/memo_doc2vec.model")
 
     # 2. Load new data
-    path = "data/raw/setA_memos.txt"
+    path = "data/raw/merged.txt"
     new_docs = [line.strip() for line in open(path, encoding="utf-8") if line.strip()]
     tagged_new = [TaggedDocument(words=tokenize_ko(s), tags=[f"new_{i}"])
                   for i, s in enumerate(new_docs)]
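
The loading step in this hunk can be sketched without gensim as follows. This is a minimal illustration, not the repository's code: `TaggedDoc` is a stand-in namedtuple for gensim's `TaggedDocument`, `simple_tokenize` is a hypothetical whitespace tokenizer standing in for `tokenize_ko` (whose definition is not shown in the diff), and `load_tagged_docs` is a made-up helper name. Unlike the line in the diff, it reads the file inside a `with` block so the handle is closed promptly.

```python
from collections import namedtuple
from pathlib import Path
import tempfile

# Stand-in for gensim's TaggedDocument (assumption: same two fields, words and tags).
TaggedDoc = namedtuple("TaggedDoc", ["words", "tags"])


def simple_tokenize(text):
    """Hypothetical tokenizer; the real tokenize_ko is not shown in the diff."""
    return text.split()


def load_tagged_docs(path, tokenize=simple_tokenize, prefix="new"):
    """Read one document per line, skip blank lines, tag each as '<prefix>_<i>'."""
    with open(path, encoding="utf-8") as fh:
        docs = [line.strip() for line in fh if line.strip()]
    return [TaggedDoc(words=tokenize(s), tags=[f"{prefix}_{i}"])
            for i, s in enumerate(docs)]


def demo():
    """Write a tiny sample file to a temp directory and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "merged.txt"
        sample.write_text("first memo\n\n  second memo here  \n", encoding="utf-8")
        return load_tagged_docs(sample)


if __name__ == "__main__":
    for doc in demo():
        print(doc.tags, doc.words)
```

Because blank lines are dropped before enumeration, the tags stay contiguous (`new_0`, `new_1`, ...) regardless of gaps in the input file.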
