number_to_words() with symbols #230

@PipWasHere

If the functionality of number_to_words() is expected to expand, I suggest including some awareness of symbols, especially with those symbols most commonly used with numbers, such as '$', '%', 'º', '/', ':', and the quotes.

Positive/Negative value fixes

Negative values should have a way to alter the prefix and/or add a suffix:
>>> number_to_words(-33)
'minus thirty-three'
>>> number_to_words("-33")
'minus thirty-three'
Optional returns could include 'negative thirty-three' or 'thirty-three down'.

Similarly, for the leading 'plus':
>>> number_to_words("+33")
'plus thirty-three'
Optional returns could include 'positive thirty-three', 'thirty-three up', or nothing.
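
As a rough sketch of what configurable sign wording could look like, the wrapper below takes templates for the negative and positive cases. Everything here is hypothetical: `signed_words`, its `negative`/`positive` parameters, and the `to_words` callable (any function that turns an unsigned numeric string into words, for instance the library's own `number_to_words`) are names made up for illustration, not existing API.

```python
import re

# Hypothetical helper, not part of the library.
# Captures an optional leading sign and the rest of the value.
_SIGNED = re.compile(r"^\s*([+-]?)\s*(.+?)\s*$")


def signed_words(text, to_words, negative="minus {}", positive="plus {}"):
    """Render a signed number using caller-supplied templates.

    Each template holds one '{}' placeholder for the number words, so a
    prefix ('negative {}'), a suffix ('{} down'), or nothing ('{}') all work.
    `to_words` is an assumed callable: unsigned numeric string -> words.
    """
    match = _SIGNED.match(str(text))
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    sign, magnitude = match.groups()
    words = to_words(magnitude)
    if sign == "-":
        return negative.format(words)
    if sign == "+":
        return positive.format(words)
    return words
```

With this shape, `signed_words("-33", to_words, negative="{} down")` would give the suffix form, and `positive="{}"` would drop the leading 'plus' entirely.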

Common symbol translations other than 'plus' or 'minus'

Values with unit symbols simply lose them:
>>> number_to_words("33%")
'thirty-three' ... 'percent' suffix lost
>>> number_to_words("$33")
'thirty-three' ... 'dollars' qualifier lost
>>> number_to_words("33º")
'thirty-three' ... 'degrees' suffix lost
>>> number_to_words("33#")
'thirty-three' ... 'pounds' suffix lost
>>> number_to_words("#33")
'thirty-three' ... 'number' or 'hashtag' prefix lost
>>> number_to_words("+/-33")
'plus thirty-three' ... 'or minus' portion of prefix lost
>>> number_to_words("33'")
'thirty-three' ... 'feet' or 'minutes' suffix lost
>>> number_to_words('33"')
'thirty-three' ... 'inches' or 'seconds' suffix lost
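
To make the idea concrete, here is a minimal sketch of symbol awareness for a single value with one leading or trailing symbol. The `UNITS` table, the `words_with_unit` name, and the `to_words` callable (number string to words) are all assumptions for illustration; the table is deliberately tiny and picks one reading per symbol ('feet' rather than 'minutes', for example).

```python
import re

# Illustrative table (an assumption, not library data):
# symbol -> (singular, plural, where the symbol is written)
UNITS = {
    "$": ("dollar", "dollars", "prefix"),
    "%": ("percent", "percent", "suffix"),
    "º": ("degree", "degrees", "suffix"),
    "'": ("foot", "feet", "suffix"),
    '"': ("inch", "inches", "suffix"),
    "¢": ("cent", "cents", "suffix"),
}

# Optional leading symbol, a plain decimal number, optional trailing symbol.
_PATTERN = re.compile(r"^([^\d\s]?)\s*(\d+(?:\.\d+)?)\s*([^\d\s]?)$")


def words_with_unit(text, to_words, units=UNITS):
    """Return the number in words followed by its unit word, if recognized."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported input: {text!r}")
    lead, number, trail = match.groups()
    words = to_words(number)
    for symbol, where in ((lead, "prefix"), (trail, "suffix")):
        entry = units.get(symbol)
        if entry and entry[2] == where:
            singular, plural, _ = entry
            unit = singular if float(number) == 1 else plural
            return f"{words} {unit}"
    # Unknown symbols are silently dropped in this sketch; a real
    # implementation might prefer to raise or pass them through.
    return words
```

The singular/plural pair is what makes "$1" come out as 'one dollar' while "$33" becomes 'thirty-three dollars'.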

Although not as critical, since they involve separate numbers, these situations can also arise:
>>> number_to_words("3:2")
'thirty-two' ... 'to', possible 'ratio' or 'odds' suffix, and individuality of numbers are lost
>>> number_to_words("3 + 2")
'thirty-two' ... 'plus' and individuality of numbers are lost
>>> number_to_words("33º22'11\"", andword="")  # andword="" drops the 'and' for the en-US form
'three hundred thirty-two thousand, two hundred eleven'
... 'degrees', 'minutes', 'seconds' as well as individuality of numbers are lost

I understand why it ignores spaces and commas, but apart from '+', '-', and '.', it treats every symbol as meaningless, which they are not. It also does not handle repeated or multiple symbols, such as those used in dates:

>>> number_to_words("22.11.33")
'twenty-two point one one three three' ... second '.' is ignored
>>> number_to_words("11/22/2033")
'eleven million, two hundred and twenty-two thousand and thirty-three' ... slashes are ignored
Dates themselves, of course, are a whole other area of complication, but the symbols are still being lost.
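
One possible way to keep the individual numbers and the symbols between them is to tokenize the input before converting anything. The sketch below is only that: `split_numbers_and_symbols`, `words_by_token`, and the `SYMBOL_WORDS` table are hypothetical names, and `to_words` stands for any callable that converts a digit string to words.

```python
import re

# A run of digits, or any single non-digit, non-space character.
_TOKEN = re.compile(r"\d+|[^\d\s]")

# Illustrative readings only (assumption); the right word depends on context.
SYMBOL_WORDS = {":": "to", "+": "plus", "/": "slash", ".": "dot"}


def split_numbers_and_symbols(text):
    """Return digit runs and individual symbols in order, dropping whitespace."""
    return _TOKEN.findall(text)


def words_by_token(text, to_words, symbol_words=SYMBOL_WORDS):
    """Convert each number separately and keep a word for each symbol."""
    parts = []
    for token in split_numbers_and_symbols(text):
        if token.isdecimal():
            parts.append(to_words(token))
        else:
            # Symbols without a known reading are passed through unchanged.
            parts.append(symbol_words.get(token, token))
    return " ".join(parts)
```

For example, `split_numbers_and_symbols("11/22/2033")` yields `['11', '/', '22', '/', '2033']`, so nothing is merged into a single eleven-million value and the caller can decide how to read the slashes.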

Other usages lost

>>> number_to_words("2a12")
'two hundred and twelve' ... hex digits are lost
>>> number_to_words("0010.0011.0100.0111")
'ten point zero zero one one zero one zero zero zero one one one' ... leading zeros and other 'dots' are lost
>>> number_to_words("2/3")
'twenty-three' ... 'two thirds' is misinterpreted or could have been a mathematical division
>>> number_to_words("1-1/2")
'one hundred and twelve' ... whole and fractional values misinterpreted
>>> number_to_words("4!")
'four' ... 'factorial' suffix lost

Final thoughts

The symbols that might benefit most from this library are those with pluralized written forms: dollars, feet, inches, degrees, seconds, minutes, pounds, etc., where the numeric value would be significant, e.g., "33º", '12"', "1¢", "$1", and so on.

The translation that might be most useful would be the option to translate fractional values or ratios, e.g. "2/3", "1-1/2", "3:2", "1000:1", etc.
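
A minimal sketch of how fractions and ratios might be handled, under stated assumptions: `fraction_words`, `ratio_words`, and the small `_DENOMINATORS` table are hypothetical, `to_words` is any callable turning a digit string into words, and only halves, thirds, and quarters are covered here. A fuller version would presumably derive denominator words from an ordinal converter.

```python
import re

# Optional whole part joined by '-' or a space, then numerator/denominator.
_MIXED = re.compile(r"^(?:(\d+)[-\s])?(\d+)/(\d+)$")
_RATIO = re.compile(r"^(\d+)\s*:\s*(\d+)$")

# Illustrative denominators only (assumption).
_DENOMINATORS = {
    2: ("half", "halves"),
    3: ("third", "thirds"),
    4: ("quarter", "quarters"),
}


def fraction_words(text, to_words):
    """Read '2/3' or '1-1/2' as a (mixed) fraction."""
    match = _MIXED.match(text.strip())
    if match is None:
        raise ValueError(f"not a fraction: {text!r}")
    whole, numerator, denominator = match.groups()
    if int(denominator) not in _DENOMINATORS:
        raise ValueError(f"denominator not covered by this sketch: {denominator}")
    singular, plural = _DENOMINATORS[int(denominator)]
    unit = singular if int(numerator) == 1 else plural
    part = f"{to_words(numerator)} {unit}"
    return f"{to_words(whole)} and {part}" if whole else part


def ratio_words(text, to_words):
    """Read '3:2' as 'three to two'."""
    match = _RATIO.match(text.strip())
    if match is None:
        raise ValueError(f"not a ratio: {text!r}")
    left, right = match.groups()
    return f"{to_words(left)} to {to_words(right)}"
```

Whether "2/3" means 'two thirds' or a division cannot be decided from the string alone, so an explicit function (or option) per interpretation seems safer than guessing.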

Translating mathematical symbolism could open up another whole toolbox.
