A UTF-8 aware hex dump utility for modern terminals.
uhexdump extends traditional hex dump tools with Unicode awareness, visual whitespace rendering, and multiple display modes designed for debugging text/binary streams.
It is especially useful when inspecting:
- UTF-8 encoded data
- serial protocols
- mixed binary/text logs
- whitespace-sensitive formats
- Python indentation
- corrupted data streams
Features:
- UTF-8 aware decoding
- Unicode Control Pictures for control characters
- visible whitespace visualization
- Python indentation detection
- UTF-8 highlighting
- grouped hex output
- multiple display modes
- ANSI colored output
- pipe-friendly CLI tool
Classic mode: traditional hex dump layout.
Columns: offset, hex bytes, text column
Example:
./uhexdump.py utf8_test.txt
Features visible:
- UTF-8 characters rendered correctly
- invalid sequences marked
- control characters visualized
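The control-character rendering listed above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the helper name `control_picture` and the `.` placeholder for non-ASCII bytes are assumptions. It relies only on the fact that the Unicode Control Pictures block (U+2400 to U+241F) mirrors the C0 control codes one to one.

```python
def control_picture(byte: int) -> str:
    """Return a printable stand-in for a single byte (hypothetical helper)."""
    if byte < 0x20:
        # U+2400..U+241F mirror the C0 controls 0x00..0x1F one to one.
        return chr(0x2400 + byte)
    if byte == 0x7F:
        return "\u2421"  # SYMBOL FOR DELETE
    if byte < 0x7F:
        return chr(byte)  # printable ASCII is shown unchanged
    return "."  # assumption: non-ASCII bytes are handled by a separate decoder


if __name__ == "__main__":
    print("".join(control_picture(b) for b in b"OK\r\n\x00"))
```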
Dual mode: two aligned rows per block.
hex bytes
aligned characters
Example:
lsb_release -a | ./uhexdump.py --mode dual --show-space
Advantages:
- exact byte alignment
- easy to see UTF-8 continuation bytes
- whitespace clearly visible
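The alignment idea behind the two-row layout can be sketched like this. The function name `dual_rows` and the two-column cell width are assumptions for illustration; the real tool may pad cells differently, especially for wide characters.

```python
from typing import Tuple


def dual_rows(chunk: bytes) -> Tuple[str, str]:
    """Build a hex row and a text row whose cells line up byte for byte."""
    # Every byte occupies a two-column cell in both rows.
    hex_cells = [f"{b:02x}" for b in chunk]
    text_cells = [
        (chr(b) if 0x20 <= b < 0x7F else ".").ljust(2)  # '.' for non-printables
        for b in chunk
    ]
    return " ".join(hex_cells), " ".join(text_cells)


if __name__ == "__main__":
    hex_row, text_row = dual_rows(b"Hi\n")
    print(hex_row)
    print(text_row)
```

Because both rows use the same cell width and separator, the character for byte N always sits directly under its hex value.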
Stacked mode: compact stacked representation.
hex row
text row
Example:
date | ./uhexdump.py -m stacked -w 42
Useful for quick stream inspection.
UTF-8 sequences are decoded and displayed as a single character.
Continuation bytes are shown using filler symbols.
Example:
e2 90 8a
Displayed as:
␊ · ·
This allows easy detection of:
- UTF-8 start bytes
- continuation bytes
- broken sequences
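A minimal sketch of the one-cell-per-byte idea described above: the decoded character goes in the cell of the start byte, and each continuation byte gets a filler. The function name `utf8_cells` and the `?` marker for invalid bytes are assumptions; the tool's own marker for broken sequences is not specified here.

```python
from typing import List


def utf8_cells(data: bytes, filler: str = "·", invalid: str = "?") -> List[str]:
    """Return one display cell per input byte (hypothetical helper)."""
    cells: List[str] = []
    i = 0
    while i < len(data):
        lead = data[i]
        # Expected sequence length, derived from the start byte.
        if lead < 0x80:
            n = 1
        elif 0xC2 <= lead <= 0xDF:
            n = 2
        elif 0xE0 <= lead <= 0xEF:
            n = 3
        elif 0xF0 <= lead <= 0xF4:
            n = 4
        else:
            n = 0  # stray continuation byte or invalid start byte
        char = ""
        if n:
            try:
                char = data[i:i + n].decode("utf-8")
            except UnicodeDecodeError:
                n = 0  # truncated, overlong, or otherwise broken sequence
        if n == 0:
            cells.append(invalid)
            i += 1
            continue
        cells.append(char)
        cells.extend([filler] * (n - 1))  # one filler per continuation byte
        i += n
    return cells


if __name__ == "__main__":
    print(" ".join(utf8_cells(bytes.fromhex("e2908a"))))
```

Since the result always has exactly one cell per byte, it can be laid out directly under a hex row.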
Option:
--show-space
Spaces appear as:
␠
Tabs appear as:
⇥
This makes trailing spaces and tab/space mix-ups visible at a glance.
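The substitution itself is simple. The sketch below assumes a plain character mapping; the names `VISIBLE` and `show_space` are hypothetical and not taken from uhexdump.py.

```python
# Assumed mapping, using the same symbols as shown above.
VISIBLE = {
    " ": "\u2420",   # ␠ SYMBOL FOR SPACE
    "\t": "\u21e5",  # ⇥ RIGHTWARDS ARROW TO BAR
}


def show_space(text: str) -> str:
    """Replace spaces and tabs with visible symbols; leave other characters alone."""
    return "".join(VISIBLE.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(show_space("key:\tvalue "))
```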
Option:
--indent-mode python
Highlights indentation characters.
Mixed indentation (tabs + spaces) is flagged with a warning marker.
Example:
! 00000010 ...
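One way to detect mixed indentation is to look only at the leading whitespace of each line. This sketch is an assumption about the approach, not the tool's implementation; `mixed_indent_lines` is a hypothetical name, and it reports line numbers rather than byte offsets.

```python
from typing import List


def mixed_indent_lines(source: str) -> List[int]:
    """Return 1-based numbers of lines whose indentation mixes tabs and spaces."""
    flagged: List[int] = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Leading run of spaces and tabs only.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if " " in indent and "\t" in indent:
            flagged.append(lineno)
    return flagged


if __name__ == "__main__":
    sample = "if x:\n\t  y = 1\n    z = 2\n"
    print(mixed_indent_lines(sample))
```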
Clone repository:
git clone https://github.com/kimmiikki/uhexdump.git
cd uhexdump
Make executable:
chmod +x uhexdump.py
Optional dependency for correct character width handling:
pip install wcwidth
Usage:
uhexdump.py [options] [file]
Input sources:
- file
- "-" (stdin)
- pipe
Examples:
cat file.bin | ./uhexdump.py
./uhexdump.py file.bin
./uhexdump.py - < file.bin
| Option | Description |
|---|---|
| -m, --mode | Output format: classic, dual, stacked |
| -w, --width | Bytes per row |
| --show-space | Show spaces as ␠ |
| --indent-mode python | Visualize Python indentation |
| --start-offset | Start dumping from byte offset |
| --length | Limit number of bytes |
| --color | ANSI color mode (auto, always, never) |
| --no-text | Hide text column |
| --group | Group hex bytes into blocks |
| --highlight utf8 | Highlight UTF-8 sequences |
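How --start-offset, --length, and --width might combine can be sketched as below. This is a guess at the semantics: the name `iter_rows`, the default width of 16, and the choice to report absolute offsets are all assumptions, not confirmed behavior of uhexdump.py.

```python
from typing import Iterator, Optional, Tuple


def iter_rows(
    data: bytes,
    width: int = 16,
    start_offset: int = 0,
    length: Optional[int] = None,
) -> Iterator[Tuple[int, bytes]]:
    """Yield (absolute offset, row bytes) for the selected window of data."""
    end = len(data) if length is None else min(len(data), start_offset + length)
    for pos in range(start_offset, end, width):
        # The last row may be shorter than width.
        yield pos, data[pos:min(pos + width, end)]


if __name__ == "__main__":
    for offset, row in iter_rows(b"abcdefgh", width=3, start_offset=2, length=5):
        print(f"{offset:08x}  {row.hex(' ')}")
```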
Dump file:
./uhexdump.py file.bin
Pipe input:
cat log.txt | ./uhexdump.py
Highlight UTF-8 sequences:
./uhexdump.py --highlight utf8 file.txt
Large rows:
./uhexdump.py -w 32
Whitespace visualization:
./uhexdump.py --show-space script.py
Python indentation debugging:
./uhexdump.py --indent-mode python script.py
| Feature | hexdump | xxd | uhexdump |
|---|---|---|---|
| UTF-8 decoding | ✗ | ✗ | ✓ |
| control pictures | ✗ | ✗ | ✓ |
| whitespace visualization | ✗ | ✗ | ✓ |
| indentation detection | ✗ | ✗ | ✓ |
| UTF-8 highlighting | ✗ | ✗ | ✓ |
uhexdump/
├─ uhexdump.py
├─ README.md
├─ LICENSE
└─ images/
   ├─ classic.png
   ├─ dual.png
   └─ combo.png
MIT License
Kim Miikki
Possible future improvements:
- automatic terminal width detection
- binary diff mode
- protocol decoding helpers
- pip install package
- man page


