escape_chars: add support for nonprintable #201
I think we can do better even than this, but this is so much more useful to me that I've started here. I'm going to think about just what I want. But I think it's something like a mapping from character sets to how to escape them. When I dump things, I want (Maybe the current patch should use

https://github.com/garu/Data-Printer/pull/101/files looks like it is at least similar in concept to what I was thinking.

I like this PR. It solves a real problem in the simplest way possible. |

The current options for `escape_chars` are not enough for my needs. Lately, I'm dumping strings like this:

Why? None of your business! (Well, actually, it's part of the db format in the Cyrus IMAP server.)

Anyway, the `nonascii` and `nonlatin1` escape rules are very permissive. I would say that they're almost never what somebody wants. They always show the NUL byte as `\0`, but other control characters are passed through, meaning that they mostly become invisible without a hex dumper. Ugh! Also, both `nonascii` and `nonlatin1` are annoying if I am dumping something with CJK, when I want to see `방탄소년단` dumped correctly, but I still want `\x18` for CANCEL.

This introduces the `nonprintable` option for `escape_chars`, which will escape the 67 Unicode codepoints that are not `Print`, of which 65 are in the Latin-1 space. This will do much, much better at dealing with data with control characters. This is actually exactly "all the Control characters plus two extremely rare whitespace characters."
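
To make the intended behavior concrete, here is a minimal standalone sketch of the idea in plain Perl. It is not the code in this PR and not Data::Printer's implementation: `escape_nonprintable` is a hypothetical helper, and the `\x{..}` output format is an assumption chosen for illustration. It relies only on Perl's built-in `\P{Print}` property to find characters that are not printable and leaves everything else, CJK included, untouched.

```perl
use strict;
use warnings;

# Hypothetical helper (an illustration, not Data::Printer's code):
# replace every character that lacks the Unicode Print property
# with a \x{..} hex escape, and pass printable characters through.
sub escape_nonprintable {
    my ($string) = @_;
    $string =~ s/(\P{Print})/sprintf('\x{%02x}', ord $1)/ge;
    return $string;
}

# U+BC29 is the first syllable of the Korean example above; it is
# printable, so only CANCEL (0x18) and NUL (0x00) get escaped.
my $escaped = escape_nonprintable("\x{BC29}\x18\0");

binmode STDOUT, ':encoding(UTF-8)';
print "$escaped\n";
```

With a rule like this, the Korean text stays readable while CANCEL and NUL show up as visible escapes, which is the combination that `nonascii` and `nonlatin1` cannot give.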