Commit ddb2530

docs: add bilingual README and update configs

Author: shijiashuai
1 parent 8aa5a7a commit ddb2530

4 files changed

Lines changed: 110 additions & 5 deletions

README.en.md

Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
+# CleanBook — Smart Bookmark Cleaning & Classification
+
+[![Docs](https://img.shields.io/badge/docs-GitHub%20Pages-blue)](https://lessup.github.io/bookmarks-cleaner/)
+
+[简体中文](README.md) | English
+
+KISS: Rules + ML + optional LLM, offline-ready by default. Unified title emoji cleanup, powerful deduplication, outputs HTML/Markdown/JSON.
+
+## Features
+
+- Rules first, ML/semantic assist, optional LLM integration (auto-fallback on failure)
+- Unified title cleaning to avoid stacked emoji prefixes
+- Always-on deduplication for stable cross-browser export merging
+- Output classification limited to two levels for cleaner results
+
+## Installation (pipx Recommended)
+
+```powershell
+python -m pip install --user pipx
+python -m pipx ensurepath
+pipx install .
+```
+
+Two commands available after installation:
+
+- `cleanbook`: Command-line processing (equivalent to `python main.py`)
+- `cleanbook-wizard`: Interactive wizard experience
+
+## Quick Example
+
+```powershell
+cleanbook -i examples/demo_bookmarks.html -o output
+cleanbook -i "tests/input/*.html" --train
+cleanbook-wizard
+```
+
+Common flags: `--workers` parallel, `--train` train ML, `--no-ml` disable ML, `--health-check` reachability check.
+
+## LLM (Optional)
+
+Edit `config.json` to enable:
+
+```json
+"llm": {
+  "enable": true,
+  "provider": "openai",
+  "base_url": "https://api.openai.com",
+  "model": "gpt-4o-mini",
+  "api_key_env": "OPENAI_API_KEY"
+}
+```
+
+Set environment variable:
+
+```powershell
+$env:OPENAI_API_KEY = "your_api_key"
+```
+
+Falls back to offline classification when key is unset or API fails.
+
+With `organizer.enable`, a secondary LLM pass clusters, sorts and summarizes categories after classification.
+
+## Project Structure
+
+```
+.
+├─ src/
+│  ├─ cleanbook/              # Unified CLI wrapper
+│  │  └─ cli.py
+│  ├─ ai_classifier.py        # Rules + ML + semantic + user profile + LLM
+│  ├─ enhanced_classifier.py
+│  ├─ enhanced_clean_tidy.py
+│  ├─ bookmark_processor.py
+│  ├─ emoji_cleaner.py        # Title emoji cleaning
+│  └─ ...
+├─ models/                    # Models & cache
+├─ examples/
+├─ docs/
+├─ config.json
+├─ main.py                    # Top-level entry
+├─ pyproject.toml             # Packaging & CLI entry points
+└─ changelog/
+```
+
+## Distribution
+
+- **Local/Team**: `pipx install .` for isolated global commands
+- **Open Source**: GitHub Release with example data; optionally publish to PyPI
+- **Windows standalone**: Optional PyInstaller single-file EXE
+
+## License
+
+MIT — see `LICENSE`.
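
Reviewer note on the LLM section of `README.en.md`: the README states that classification falls back to offline mode when the key is unset or the API fails, but the diff does not show how. The sketch below is one way such a fallback could be wired, not the project's actual code. The function names `resolve_llm_settings` and `classify`, and the two callables passed in, are hypothetical; only the `llm` config keys are taken from the README. The README shows the `"llm"` object as a fragment, so it is wrapped in braces here to make it a complete JSON document.

```python
import json
import os

# Config keys copied from the README; the outer braces are added so it parses.
EXAMPLE_CONFIG = json.loads("""
{
  "llm": {
    "enable": true,
    "provider": "openai",
    "base_url": "https://api.openai.com",
    "model": "gpt-4o-mini",
    "api_key_env": "OPENAI_API_KEY"
  }
}
""")


def resolve_llm_settings(config, environ=None):
    """Return usable LLM settings, or None when the LLM path should be skipped.

    Hypothetical helper: returns None if the section is missing, disabled,
    or the environment variable named by `api_key_env` is unset or empty.
    """
    environ = os.environ if environ is None else environ
    llm = config.get("llm") or {}
    if not llm.get("enable"):
        return None
    api_key = environ.get(llm.get("api_key_env", "OPENAI_API_KEY"))
    if not api_key:
        return None
    return {
        "provider": llm.get("provider"),
        "base_url": llm.get("base_url"),
        "model": llm.get("model"),
        "api_key": api_key,
    }


def classify(title, config, llm_call, offline_call, environ=None):
    """Try the LLM classifier first and fall back to the offline one.

    `llm_call(title, settings)` and `offline_call(title)` are placeholders
    for whatever classifiers the project really uses.
    """
    settings = resolve_llm_settings(config, environ)
    if settings is None:
        return offline_call(title)
    try:
        return llm_call(title, settings)
    except Exception:
        # Any API failure (network, auth, quota) routes to the offline path.
        return offline_call(title)


if __name__ == "__main__":
    # With no key in the environment, the offline classifier is used.
    result = classify(
        "Python docs",
        EXAMPLE_CONFIG,
        llm_call=lambda title, settings: "llm-category",
        offline_call=lambda title: "offline-category",
        environ={},
    )
    print(result)
```

Passing `environ` explicitly keeps the sketch testable without touching the real process environment; a real implementation would probably also log why the fallback was taken.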

README.md

Lines changed: 3 additions & 1 deletion
@@ -1,7 +1,9 @@
-# CleanBook: Smart Bookmark Cleaning & Classification (Chinese)
+# CleanBook: Smart Bookmark Cleaning & Classification
 
 [![Docs](https://img.shields.io/badge/docs-GitHub%20Pages-blue)](https://lessup.github.io/bookmarks-cleaner/)
 
+简体中文 | [English](README.en.md)
+
 KISS: rules + machine learning + optional LLM, usable offline by default. Unified cleanup of title emoji, strong deduplication, output as HTML/Markdown/JSON.
 
 ## Features

docs/.vitepress/config.mts

Lines changed: 12 additions & 2 deletions
@@ -63,13 +63,18 @@ export default defineConfig({
       },
     ],
 
+    editLink: {
+      pattern: 'https://github.com/LessUp/bookmarks-cleaner/edit/master/docs/:path',
+      text: '在 GitHub 上编辑此页', // "Edit this page on GitHub"
+    },
+
     socialLinks: [
       { icon: "github", link: "https://github.com/LessUp/bookmarks-cleaner" },
     ],
 
     footer: {
-      message: "MIT License",
-      copyright: "© 2024-present LessUp",
+      message: "基于 MIT 许可发布", // "Released under the MIT License"
+      copyright: "Copyright © 2025-2026 LessUp",
     },
 
     outline: {
@@ -86,6 +91,11 @@ export default defineConfig({
       text: "最后更新", // "Last updated"
     },
 
+    returnToTopLabel: '返回顶部', // "Back to top"
+    sidebarMenuLabel: '菜单', // "Menu"
+    darkModeSwitchLabel: '主题', // "Theme"
+    externalLinkIcon: true,
+
     search: {
       provider: "local",
       options: {

docs/guides/development_guide.md

Lines changed: 2 additions & 2 deletions
@@ -30,8 +30,8 @@
 
 ```bash
 # Clone the project
-git clone <repository-url>
-cd CleanBookmarks
+git clone https://github.com/LessUp/bookmarks-cleaner.git
+cd bookmarks-cleaner
 
 # Create a virtual environment
 python -m venv venv
