Commit 1982bba
20250728_00 - Release
- Switched from Unidecode to ftfy: Replaced aggressive Unicode-to-ASCII conversion with intelligent text fixing
- Preserves Extended ASCII: Now correctly preserves 8-bit extended ASCII characters (code points 128-255) such as é, ñ, and ü
- Smarter Unicode Handling: Only converts problematic Unicode characters while preserving intentional extended ASCII usage
- Updated Dependencies: Replaced Unidecode dependency with ftfy in requirements.txt
- Maintains AI Artifact Removal: Still removes smart quotes, EM/EN dashes, and other "AI tells" as designed

1 parent 4e31d08 commit 1982bba
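
The behavior described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the names `AI_TELLS` and `clean_text` and the exact replacement table are assumptions. It uses `ftfy.fix_text` when ftfy is installed, then maps a handful of typographic characters to ASCII while leaving characters such as é, ñ, and ü untouched.

```python
# Hypothetical sketch: names and the replacement table are illustrative,
# not taken from the commit.
try:
    import ftfy  # third-party; repairs mojibake and other encoding damage
except ImportError:  # keep the sketch runnable without the dependency
    ftfy = None

# Assumed table of typographic "AI tells" and their plain ASCII stand-ins.
AI_TELLS = str.maketrans({
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
})


def clean_text(text: str) -> str:
    """Fix encoding damage (if ftfy is available), then replace listed characters."""
    if ftfy is not None:
        text = ftfy.fix_text(text)
    # Characters not in the table, including code points 128-255, pass through.
    return text.translate(AI_TELLS)


if __name__ == "__main__":
    sample = "\u201cHola\u201d \u2013 caf\u00e9, ni\u00f1o \u2014 \u00fcber\u2026"
    print(clean_text(sample))
```

Unlike a blanket Unicode-to-ASCII transliteration, a targeted table like this only touches the characters it lists, which is why accented letters survive.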
3 files changed, +28 -11 lines changed