
Corrupted file when a field has non-ascii characters #6

@bcharron


When I try to create an mmdb containing non-ASCII characters, the resulting file cannot be read. It looks as if the offsets are wrong.

I think it's because the offset written to the file assumes that the length of the Python string is the same as the number of bytes produced when the string is encoded as UTF-8. That only holds for pure-ASCII strings.
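
A quick way to see the mismatch, as a minimal sketch (the sample string is only an illustration, not data from the failing file):

```python
# len() on a str counts code points; len() on the encoded bytes counts bytes.
value = "Montr\u00e9al"  # "Montréal", with a precomposed é (U+00E9)

char_count = len(value)                  # number of characters
byte_count = len(value.encode("utf-8"))  # number of bytes actually written

print(char_count, byte_count)  # 8 9
```

The é takes two bytes in UTF-8, so the byte count is one larger than the character count.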

Setting the length from the encoded string at https://github.com/cloudflare/py-mmdb-encoder/blob/master/mmdbencoder/__init__.py#L346 seems to produce the correct result:

length = len(value.encode('utf-8'))
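
To illustrate why a character-count length would make a file unreadable, here is a small sketch using a deliberately simplified record layout (one length byte followed by the payload). This is not the real MMDB layout and not the py-mmdb-encoder code; `write_field` and `read_fields` are hypothetical helpers made up for this example.

```python
def write_field(value, use_byte_length):
    """Hypothetical writer: one length byte, then the UTF-8 payload."""
    payload = value.encode("utf-8")
    # The suspected bug corresponds to use_byte_length=False.
    length = len(payload) if use_byte_length else len(value)
    return bytes([length]) + payload


def read_fields(buf):
    """Hypothetical reader: trusts each length byte to locate the next field."""
    fields, pos = [], 0
    while pos < len(buf):
        length = buf[pos]
        chunk = buf[pos + 1 : pos + 1 + length]
        fields.append(chunk.decode("utf-8", errors="replace"))
        pos += 1 + length
    return fields


values = ["Montr\u00e9al", "Canada"]
good = b"".join(write_field(v, use_byte_length=True) for v in values)
bad = b"".join(write_field(v, use_byte_length=False) for v in values)

good_fields = read_fields(good)  # round-trips to the original values
bad_fields = read_fields(bad)    # first field truncated, reader then desynchronized
```

With the character-count length, the first field comes back one byte short, and the reader then treats the leftover byte as the next length, so everything after it is misread. That matches the "offsets are wrong" symptom, assuming the real encoder fails in a similar way.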
