Skip to content

Commit 1399388

Browse files
SonAIengineclaude
andcommitted
revert: embedding 벤치마크 비활성화 — qwen3-embedding:0.6b 품질 부족
qwen3-embedding:0.6b로 embedding 활성화 시 전 데이터셋 하락: - Allganize 0.395→0.158, KLUE-MRC 0.717→0.563, AutoRAG 0.639→0.460 - 0.6B 모델이 한국어 QA 의미 매칭에 부적합 - 더 큰 embedding 모델 (multilingual-e5-large 등) 필요 hybrid weight도 원복 (alpha=0.5) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 49aba41 commit 1399388

File tree

2 files changed

+3
-3
lines changed

2 files changed

+3
-3
lines changed

src/synaptic/search.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -108,7 +108,7 @@ async def search(
108108
vec_scores[node.id] = vec_score
109109

110110
# FTS + vector hybrid score aggregation
111-
alpha = 0.5 # FTS vs vector weight (0.5 = equal)
111+
alpha = 0.5 # FTS vs vector weight
112112
for nid, node in {n.id: n for n in vec_nodes}.items():
113113
fts_s = fts_scores.get(nid, 0.0)
114114
vec_s = vec_scores.get(nid, 0.0)
@@ -118,7 +118,7 @@ async def search(
118118
all_nodes[nid] = (all_nodes[nid][0], min(1.0, hybrid))
119119
else:
120120
# vector only
121-
all_nodes[nid] = (node, vec_s * 0.9) # slight decay when no FTS match
121+
all_nodes[nid] = (node, vec_s * 0.9)
122122

123123
# Stage 2: Synonym expansion (if insufficient results)
124124
if len(all_nodes) < limit:

tests/benchmark/test_external_datasets.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -43,7 +43,7 @@ async def _build_graph(
4343
*,
4444
max_docs: int = 0,
4545
) -> tuple[SynapticGraph, dict[str, str]]:
46-
"""corpus를 SynapticGraph에 인덱싱. id_map 반환."""
46+
"""corpus를 SynapticGraph에 인덱싱. FTS only (embedding은 모델 품질에 의존)."""
4747
backend = MemoryBackend()
4848
await backend.connect()
4949
graph = SynapticGraph(backend)

0 commit comments

Comments
 (0)