Merged
106 changes: 72 additions & 34 deletions README.md
Expand Up @@ -5,47 +5,81 @@

# Case Conversion

This is a port of the Sublime Text 3 plugin [CaseConversion](https://github.com/jdc0589/CaseConversion), by [Davis Clark's](https://github.com/jdc0589), to a regular python package. I couldn't find any other python packages on PyPI at the time (Feb 2016) that could seamlessly convert from any case to any other case without having to specify from what type of case I was converting. This plugin worked really well, so I separated the (non-sublime) python parts of the plugin into this useful python package. I also added Unicode support via python's `unicodedata`.
This is a port of the Sublime Text 3 plugin [CaseConversion](https://github.com/jdc0589/CaseConversion), by [Davis Clark](https://github.com/jdc0589), to a regular python package. I couldn't find any other python packages on PyPI at the time (Feb 2016) that could seamlessly convert from any case to any other case without having to specify from what type of case I was converting. This plugin worked really well, so I separated the (non-sublime) python parts of the plugin into this useful python package. I also added Unicode support via python's `unicodedata` and extended the interface some.

## Features

- Autodetection of case *(no need to specify explicitly which case you are converting from!)*
- Auto-detection of case *(no need to specify explicitly which case you are converting from!)*
- Acronym detection *(no funky splitting on every capital letter of an all caps acronym like `HTTPError`!)*
- Unicode supported (non-ASCII characters are first class citizens!)
- Dependency free!
- Supports Python 3.6+
- Over 95 percent test coverage and full type annotation.
- Supports Python 3.10+
- Every case conversion you will ever need:
- `camelCase`
- `PascalCase`
- `snake_case`
- `dash-case` (aka `kebap-case`, `spinal-case` or `slug-case`)
- `CONST_CASE` (aka `SCREAMING_SNAKE_CASE`)
- `dot.case`
- `separate words`
- `slash/case`
- `backslash\\case`
- `Ada_Case`
- `Http-Header-Case`
- `camel` -> "camelCase"
- `pascal` / `mixed` -> "PascalCase" / "MixedCase"
- `snake` -> "snake_case"
- `dash` / `kebab` / `spinal` / `slug` -> "dash-case" / "kebab-case" / "spinal-case" / "slug-case"
- `const` / `screaming_snake` -> "CONST_CASE" / "SCREAMING_SNAKE_CASE"
- `dot` -> "dot.case"
- `separate_words` -> "separate words"
- `slash` -> "slash/case"
- `backslash` -> "backslash\case"
- `ada` -> "Ada_Case"
- `http_header` -> "Http-Header-Case"

## Usage

Normal use is self-explanatory.

### Converter Class

Basic

```python
>>> from case_conversion import Converter
>>> converter = Converter()
>>> converter.camel("FOO_BAR_STRING")
'fooBarString'
```

Initialize text when needing to convert the same text to multiple different cases.
```python
>>> from case_conversion import Converter
>>> converter = Converter(text="FOO_BAR_STRING")
>>> converter.camel()
'fooBarString'
>>> converter.pascal()
'FooBarString'
```

Initialize custom acronyms
```python
>>> from case_conversion import Converter
>>> converter = Converter(acronyms=["BAR"])
>>> converter.camel("FOO_BAR_STRING")
'fooBARString'
```

### Convenience Functions

For backwards compatibility and convenience, all converters are available as top level functions. They are all shorthand for:

`Converter(text, acronyms).converter_function()`

```python
>>> import case_conversion
>>> case_conversion.dash("FOO_BAR_STRING")
'foo-bar-string'
```

To use acronym detection simply pass in a list of `acronyms` to detect as whole words.
Simple acronym detection comes included, by treating strings of capital letters as a single word instead of several single letter words.

Custom acronyms can be supplied when needing to separate them from each other.
```python
>>> import case_conversion
>>> case_conversion.snake("fooBarHTTPError")
'foo_bar_h_t_t_p_error' # ewwww :(
>>> case_conversion.snake("fooBarHTTPError", acronyms=['HTTP'])
'foo_bar_http_error' # pretty :)
>>> case_conversion.snake("fooBADHTTPError")
'foo_badhttp_error' # we wanted BAD and HTTP to be separate!
>>> case_conversion.snake("fooBADHTTPError", acronyms=['BAD', 'HTTP'])
'foo_bad_http_error' # custom acronyms achieved!
```
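
The built-in detection described above can be pictured as merging each run of consecutive capital letters into one word. The following is a minimal sketch of that idea, not the package's actual code: `group_capital_runs` is a hypothetical helper, and it assumes the input has already been split into single capital letters and lowercase segments.

```python
def group_capital_runs(segments: list[str]) -> list[str]:
    """Merge adjacent single upper-case letters into one word (hypothetical helper)."""
    words: list[str] = []
    run = ""  # upper-case letters collected so far
    for segment in segments:
        if len(segment) == 1 and segment.isupper():
            run += segment
            continue
        if run:
            # The run ended: emit it as a single word.
            words.append(run)
            run = ""
        words.append(segment)
    if run:
        words.append(run)
    return words


print(group_capital_runs(["foo", "B", "A", "D", "H", "T", "T", "P", "error"]))
# ['foo', 'BADHTTP', 'error']
```

This is why `BAD` and `HTTP` end up fused unless they are passed in as custom acronyms.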

Unicode is fully supported - even for acronyms.
Expand All @@ -66,30 +100,34 @@ FÓÓ_BAR_STRING
pip install case-conversion
```



## Contribute

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

This package is being developed with [poetry](https://python-poetry.org/) (-> [docs](https://python-poetry.org/docs/)).

Before opening a pull request, please make sure to:

- update tests as appropriate

- `flake8`, `mypy` and `pytest` are happy
This package is being developed with [uv](https://github.com/astral-sh/uv) (-> [docs](https://docs.astral.sh/uv/)).

CI will run tests and lint checks.
Locally you can run them with:
```bash
# runs tests with coverage
make test
# Runs linter (using ruff)
make lint
# Auto-fix linter errors (using ruff --fix)
make format
# run type check (using ty)
make tc
```



## Credits

Credit goes to [Davis Clark](https://github.com/jdc0589) as the author of the original plugin, and to its contributors (Scott Bessler, Curtis Gibby, Matt Morrison). Thanks for their work on making such a robust case converter.

Further credit goes to @olsonpm for making this package dependency-free.

Further thanks and credit to [@olsonpm](https://github.com/olsonpm) for making this package dependency-free and encouraging package maintenance and best practices.


## Licence
## License

Using [MIT licence](LICENSE.txt) with Davis Clark's Copyright
Using [MIT license](LICENSE.txt) with Davis Clark's Copyright
19 changes: 14 additions & 5 deletions case_conversion/__init__.py
@@ -1,9 +1,14 @@
from .converter import (
from case_conversion.converter import (
camel,
pascal,
mixed,
snake,
dash,
kebab,
spinal,
slug,
const,
screaming_snake,
dot,
separate_words,
slash,
Expand All @@ -16,15 +21,20 @@
http_header,
Converter,
)
from .parser import parse_case
from .types import Case, InvalidAcronymError
from case_conversion.parser import parse_into_words
from case_conversion.acronym import InvalidAcronymError

__all__ = [
"camel",
"pascal",
"mixed",
"snake",
"dash",
"kebab",
"spinal",
"slug",
"const",
"screaming_snake",
"dot",
"separate_words",
"slash",
Expand All @@ -35,8 +45,7 @@
"upper",
"capitalize",
"http_header",
"parse_case",
"Case",
"parse_into_words",
"InvalidAcronymError",
"Converter",
]
154 changes: 154 additions & 0 deletions case_conversion/acronym.py
@@ -0,0 +1,154 @@
from typing import Iterator, TypeGuard

from case_conversion.unicode_char import is_separator_char


class InvalidAcronymError(Exception):
    """Raise when acronym fails validation."""

    def __init__(self, acronym: str) -> None:  # noqa: D107
        msg = f"Case Conversion: acronym '{acronym}' is invalid."
        super().__init__(msg)


def find_substring_ranges(string: str, substring: str) -> Iterator[tuple[int, int]]:
    """Finds (start, end) ranges of all occurrences of substring in string.

    >>> list(find_substring_ranges("foo_bar_bar", "bar"))
    [(4, 7), (8, 11)]
    """
    start = 0
    sub_len = len(substring)
    while True:
        start = string.find(substring, start)
        if start == -1:
            return
        yield (start, start + sub_len)
        start += 1
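
Note on the loop above: because the search resumes one character after each hit (`start += 1`) rather than after the whole match, overlapping occurrences are reported as well. A standalone copy of the function, repeated here only so the example runs on its own:

```python
from typing import Iterator


def find_substring_ranges(string: str, substring: str) -> Iterator[tuple[int, int]]:
    start = 0
    sub_len = len(substring)
    while True:
        start = string.find(substring, start)
        if start == -1:
            return
        yield (start, start + sub_len)
        start += 1  # step by one, so overlapping hits are found too


print(list(find_substring_ranges("foo_bar_bar", "bar")))  # [(4, 7), (8, 11)]
print(list(find_substring_ranges("aaaa", "aa")))  # [(0, 2), (1, 3), (2, 4)]
```

Callers that need non-overlapping matches have to filter the results themselves, which is what the overlap check in `advanced_acronym_detection` does.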


def is_str_list(a_list: list[str | None]) -> TypeGuard[list[str]]:
    return all(isinstance(item, str) for item in a_list)


def advanced_acronym_detection(
    s: int, i: int, words: list[str | None], acronyms: list[str]
) -> int:
    """Detect acronyms by checking against a list of acronyms.

    Arguments:
        s (int): Index of first word in run
        i (int): Index of current word
        words (list of str): Segmented input string
        acronyms (list of str): List of acronyms

    Returns:
        int: Index of last word in run
    """
    # Combine each letter into single string.
    words_to_join = words[s:i]
    assert is_str_list(words_to_join)
    acr_str = "".join(words_to_join)

    # List of ranges representing found acronyms.
    range_list: list[tuple[int, int]] = []
    # Set of remaining letters.
    not_range = set(range(len(acr_str)))

    # Search for each acronym in acr_str.
    for acr in acronyms:
        for start, end in find_substring_ranges(acr_str, acr):
            # Make sure found acronym doesn't overlap with others.
            for r in range_list:
                if start < r[1] and end > r[0]:
                    break
            else:
                range_list.append((start, end))
                for j in range(start, end):
                    not_range.remove(j)

    # Add remaining letters as ranges.
    if not_range:
        not_range = sorted(not_range)
        start_nr = not_range[0] if not_range else -1
        prev_nr = start_nr - 1
        for nr in sorted(not_range):
            if nr > prev_nr + 1:
                range_list.append((start_nr, prev_nr + 1))
                start_nr = nr
            prev_nr = nr
        range_list.append((start_nr, prev_nr + 1))

    # No ranges will overlap, so it's safe to sort by lower bound,
    # which sort() will do by default.
    range_list.sort()

    # Remove original letters in word list.
    for _ in range(s, i):
        del words[s]

    # Replace them with new word grouping.
    for j in range(len(range_list)):
        r = range_list[j]
        words.insert(s + j, acr_str[r[0] : r[1]])

    return s + len(range_list) - 1
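
To make the partitioning step easier to follow, here is a condensed sketch of the same idea that works on the joined string and returns the pieces instead of mutating `words` in place. `split_run` is a hypothetical name used only for illustration; as in the function above, an acronym listed earlier wins when two candidates overlap.

```python
def split_run(run: str, acronyms: list[str]) -> list[str]:
    """Split an upper-case run into known acronyms plus leftover pieces."""
    ranges: list[tuple[int, int]] = []
    uncovered = set(range(len(run)))  # indices not yet claimed by an acronym

    for acr in acronyms:
        start = run.find(acr)
        while start != -1:
            end = start + len(acr)
            # Keep the hit only if it does not overlap an earlier one.
            if all(end <= lo or start >= hi for lo, hi in ranges):
                ranges.append((start, end))
                uncovered -= set(range(start, end))
            start = run.find(acr, start + 1)

    # Turn each maximal stretch of uncovered indices into its own range.
    gap_start = None
    for idx in range(len(run) + 1):
        if idx in uncovered:
            if gap_start is None:
                gap_start = idx
        elif gap_start is not None:
            ranges.append((gap_start, idx))
            gap_start = None

    # Ranges never overlap, so sorting by lower bound restores reading order.
    return [run[lo:hi] for lo, hi in sorted(ranges)]


print(split_run("BADHTTP", ["BAD", "HTTP"]))  # ['BAD', 'HTTP']
print(split_run("XHTTPY", ["HTTP"]))  # ['X', 'HTTP', 'Y']
```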


def simple_acronym_detection(s: int, i: int, words: list[str | None], *args) -> int:
    """Detect acronyms based on runs of upper-case letters.

    Arguments:
        s (int): Index of first letter in run
        i (int): Index of current word
        words (list of str): Segmented input string
        args: Placeholder to conform to signature of
            advanced_acronym_detection

    Returns:
        int: Index of last letter in run
    """
    # Combine each letter into a single string.
    words_to_join = words[s:i]
    assert is_str_list(words_to_join)
    acr_str = "".join(words_to_join)

    # Remove original letters in word list.
    for _ in range(s, i):
        del words[s]

    # Replace them with new word grouping.
    words.insert(s, "".join(acr_str))

    return s


def is_valid_acronym(a_string: str) -> bool:
    if not a_string:
        return False

    for a_char in a_string:
        if is_separator_char(a_char):
            return False

    return True


def normalize_acronyms(unsafe_acronyms: list[str]) -> list[str]:
    """Validates and normalizes acronyms to upper-case.

    Arguments:
        unsafe_acronyms (list of str): Acronyms to be sanitized

    Returns:
        list of str: Sanitized acronyms

    Raises:
        InvalidAcronymError: Upon encountering an invalid acronym
    """
    acronyms = []
    for acr in unsafe_acronyms:
        if not is_valid_acronym(acr):
            raise InvalidAcronymError(acr)
        acronyms.append(acr.upper())
    return acronyms
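
A small standalone illustration of the validation and normalization behaviour may help reviewers. It is a sketch under one loud assumption: `is_separator_char` lives in `case_conversion.unicode_char`, which is not part of this diff, so the stand-in below simply treats every non-alphanumeric character as a separator. The real helper may classify characters differently.

```python
class InvalidAcronymError(Exception):
    """Stand-in for the exception defined above."""

    def __init__(self, acronym: str) -> None:
        super().__init__(f"Case Conversion: acronym '{acronym}' is invalid.")


def is_separator_char(char: str) -> bool:
    # ASSUMPTION: placeholder for case_conversion.unicode_char.is_separator_char,
    # which is not shown here; anything non-alphanumeric counts as a separator.
    return not char.isalnum()


def normalize_acronyms(unsafe_acronyms: list[str]) -> list[str]:
    acronyms = []
    for acr in unsafe_acronyms:
        # Empty strings and strings containing a separator are rejected.
        if not acr or any(is_separator_char(c) for c in acr):
            raise InvalidAcronymError(acr)
        acronyms.append(acr.upper())
    return acronyms


print(normalize_acronyms(["http", "Api"]))  # ['HTTP', 'API']

try:
    normalize_acronyms(["HT_TP"])
except InvalidAcronymError as exc:
    print(exc)  # Case Conversion: acronym 'HT_TP' is invalid.
```

Validation stops at the first invalid entry, so a single bad acronym rejects the whole list.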