feat: Add Unicode font detection and enhanced error handling #1563
otreci4sgelt0nas wants to merge 3 commits into py-pdf:master
- Add UnicodeFontManager class for automatic font detection and recommendations
- Enhance FPDFUnicodeEncodingException with helpful font suggestions
- Add comprehensive Unicode script detection (Cyrillic, Arabic, Chinese, etc.)
- Provide system-specific font path detection (macOS, Linux, Windows)
- Add convenience functions for quick font recommendations
- Include comprehensive tests and tutorial examples
- Improve error messages with specific font recommendations and usage instructions

This addresses common Unicode encoding issues, especially with Cyrillic characters, by providing automatic font detection and helpful error messages that guide users to appropriate Unicode fonts.
Hi @otreci4sgelt0nas — thanks for this PR! At first glance the code looks really solid, and turning cryptic encoding errors into actionable messages sounds really good. The tutorials are helpful too. A couple of quick notes/questions:
I have limited time this weekend, but I can take a deeper look early next week (unless @Lucas-C beats me to it). Thanks again for the thoughtful contribution!
I fully agree with @andersonhc's feedback. Moreover, the GitHub Actions CI pipeline is failing due to the
andersonhc left a comment
Thank you very much for your contribution.
Please:
- Add a CHANGELOG entry
- Format the code with black and check with pylint to pass the lint steps
- Fix the 4 tests failing due to incorrect error asserts.
Hi @otreci4sgelt0nas 🙂 👋
- Fix failing tests in test_unicode_font_utils.py
- Format code with black
- Add CHANGELOG entry for Unicode font detection feature

Changes:
- Fixed detect_script_in_text() to return None for Latin-only text
- Fixed detect_script_in_text() to prioritize non-Latin scripts in mixed text
- Applied black formatting to errors.py and unicode_font_utils.py
- Added comprehensive CHANGELOG entry documenting the new feature
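For readers following along, the two detect_script_in_text() fixes listed above can be illustrated with a minimal sketch. This is not the PR's implementation: the function name detect_script_sketch, the prefix table and the returned labels are all assumptions made for illustration. It relies only on the standard unicodedata module and returns the first non-Latin script it meets, or None when no such character is found.

```python
import unicodedata

# Hypothetical mapping from Unicode character-name prefixes to script labels.
# The labels are illustrative and not taken from the PR.
_SCRIPT_PREFIXES = {
    "CYRILLIC": "cyrillic",
    "ARABIC": "arabic",
    "CJK": "chinese",
    "HEBREW": "hebrew",
    "GREEK": "greek",
}


def detect_script_sketch(text):
    """Return the first non-Latin script found in text, or None.

    Latin letters, digits, spaces and punctuation never match a prefix,
    so Latin-only text yields None and mixed text yields the non-Latin
    script.
    """
    for char in text:
        # unicodedata.name() raises ValueError for unnamed characters
        # unless a default is supplied, hence the empty-string default.
        name = unicodedata.name(char, "")
        for prefix, script in _SCRIPT_PREFIXES.items():
            if name.startswith(prefix):
                return script
    return None
```

With this sketch, Latin-only input gives None, while "Hello Привет" is reported as Cyrillic even though it starts with Latin letters.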
Hi @andersonhc and @Lucas-C ! Sorry for the long delay, but I've finally circled back to finish this. 🙂
andersonhc left a comment
Please run all formatting (black), linting (pylint) and typing (mypy and pyright) tools this project requires, or the lint job will fail.
Please review the failing tests.
        return f"{base_message}\n\n{self.suggestion}"
    else:
        return f"{base_message} Please consider using a Unicode font."
You changed the messages but didn't update the tests.
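To make the mismatch concrete, here is a small sketch of how a test that asserts on the exact message has to track the new format. The class EncodingErrorSketch and the sample strings are hypothetical stand-ins, not the PR's FPDFUnicodeEncodingException or its real test suite. Only the two return formats mirror the snippet above.

```python
class EncodingErrorSketch(Exception):
    """Hypothetical stand-in for an exception with an optional suggestion."""

    def __init__(self, base_message, suggestion=None):
        super().__init__(base_message)
        self.base_message = base_message
        self.suggestion = suggestion

    def __str__(self):
        # Same two message shapes as in the reviewed snippet.
        if self.suggestion:
            return f"{self.base_message}\n\n{self.suggestion}"
        return f"{self.base_message} Please consider using a Unicode font."


def test_messages_match_new_format():
    # Without a suggestion, the generic hint is appended on the same line.
    plain = EncodingErrorSketch("Cannot encode text.")
    assert str(plain) == "Cannot encode text. Please consider using a Unicode font."

    # With a suggestion, it follows the base message after a blank line.
    hinted = EncodingErrorSketch("Cannot encode text.", "Try a Unicode font.")
    assert str(hinted) == "Cannot encode text.\n\nTry a Unicode font."


test_messages_match_new_format()
```

Any assertion still written against the old wording would fail in the same way as the tests in CI.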
    self.system = platform.system().lower()
    self.font_paths = self._get_system_font_paths()
    self.available_fonts = self._scan_available_fonts()
You are importing UnicodeFontManager in fpdf.py and doing this system font scan on init, so every time fpdf2 is loaded we'll do this scan - that's a considerable performance hit.
Please move the scan out of __init__() and only perform it when needed. You can just set a flag on init, like _available_fonts_loaded = False, and then do the scan and set the flag only when the fonts are first needed.
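A minimal sketch of the lazy scan described above, under stated assumptions: LazyFontManager, font_dirs and scan_count are hypothetical names used for illustration, and the directory walk is a simplified placeholder for whatever the PR's _scan_available_fonts() actually does. Only the _available_fonts_loaded flag comes from the review comment.

```python
import platform
from pathlib import Path


class LazyFontManager:
    """Hypothetical stand-in for UnicodeFontManager with a deferred scan."""

    def __init__(self, font_dirs=()):
        # Cheap work only: nothing here touches the filesystem.
        self.system = platform.system().lower()
        self.font_dirs = tuple(font_dirs)
        self._available_fonts = {}
        self._available_fonts_loaded = False  # flag suggested in the review
        self.scan_count = 0  # illustration only, counts real scans

    def _scan_available_fonts(self):
        # Simplified placeholder for the expensive scan: collect .ttf files
        # from the configured directories, skipping ones that do not exist.
        self.scan_count += 1
        fonts = {}
        for directory in self.font_dirs:
            path = Path(directory)
            if path.is_dir():
                for font_file in path.rglob("*.ttf"):
                    fonts[font_file.stem] = str(font_file)
        return fonts

    @property
    def available_fonts(self):
        # The scan runs on first access only; later reads reuse the result.
        if not self._available_fonts_loaded:
            self._available_fonts = self._scan_available_fonts()
            self._available_fonts_loaded = True
        return self._available_fonts
```

Creating the manager costs nothing beyond the platform lookup, and repeated reads of available_fonts trigger a single scan. functools.cached_property from the standard library would be an alternative to the explicit flag.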
feat: Add Unicode font detection and enhanced error handling
This fixes common Unicode encoding issues, especially with Cyrillic characters, by providing automatic font detection and helpful error messages that guide users to appropriate Unicode fonts. Users can now successfully generate PDFs with text in any Unicode script.