Fix tokenising when using more than just a-zA-Z #37
Open
robotdana wants to merge 1 commit into myint:master
57da098 to c8bd64d
Previously, `Händler` would be tokenized as `ndler` or `ändler`, depending on the Python version, rather than the expected `händler`.
Solution: use `regex` rather than `re`. This gives us the ability to use Unicode character classes such as `[[:upper:]]` and `[[:lower:]]`.
Fixes myint#35
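For illustration, here is a minimal sketch of the failure mode and of a Unicode-aware alternative. The patterns and the `tokenize` helper are hypothetical and are not taken from this project's source. The standard-library pattern `[^\W\d_]+` stands in as one way to match runs of Unicode letters with `re` on Python 3. The optional part assumes the third-party `regex` package is installed and is skipped otherwise.

```python
import re

# Hypothetical patterns for illustration only; not the project's actual ones.
ASCII_WORD = re.compile(r"[a-zA-Z]+")    # ASCII letters only
UNICODE_WORD = re.compile(r"[^\W\d_]+")  # word characters minus digits and underscore


def tokenize(text, pattern):
    """Return the lowercased matches of `pattern` found in `text`."""
    return [match.group(0).lower() for match in pattern.finditer(text)]


WORD = "H\u00e4ndler"  # "Händler" with a precomposed a-umlaut

ascii_tokens = tokenize(WORD, ASCII_WORD)      # the umlaut splits the word
unicode_tokens = tokenize(WORD, UNICODE_WORD)  # the word stays in one piece

# Optional: the third-party `regex` package accepts POSIX-style classes.
try:
    import regex
    posix_tokens = [
        token.lower()
        for token in regex.findall(r"[[:upper:]]?[[:lower:]]+", WORD)
    ]
except ImportError:
    posix_tokens = None  # `regex` is not installed; skip this variant

print(ascii_tokens)
print(unicode_tokens)
```

The ASCII-only pattern breaks the word at `ä`, while the Unicode-aware pattern keeps it whole.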
c8bd64d to 08b4eff
Owner
Thanks! I haven't tried the
Author
If you're interested, I took the really long way round to fixing this by creating my own spell checker: https://github.com/robotdana/spellr
I'm usually a Ruby developer, not a Python developer. I don't know how to get the regex library working on 2.7, or how to compare the test strings in a Unicode-aware way (they're different on my Mac vs. on Travis; if one passes, the other fails).
But it mostly works.
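One possible way to compare such strings is to normalize both sides to the same Unicode form first. This is only a sketch: the PR does not establish why the strings differ between machines, so differing normalization forms (precomposed vs. decomposed `ä`) are an assumption here, and `same_text` is a hypothetical helper name.

```python
import unicodedata


def same_text(left, right):
    """Compare two strings after normalizing both to NFC."""
    return (unicodedata.normalize("NFC", left)
            == unicodedata.normalize("NFC", right))


composed = "H\u00e4ndler"     # a-umlaut as a single code point
decomposed = "Ha\u0308ndler"  # "a" followed by a combining diaeresis

# The two spellings render the same but are different code point sequences.
naive_equal = composed == decomposed
normalized_equal = same_text(composed, decomposed)

print(naive_equal, normalized_equal)
```

If the mismatch turns out to have another cause, such as byte strings vs. text strings on 2.7, normalization alone would not help.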