Bidirectional encoding of integer identifiers into RFC 9562 Version 8 UUIDs.
This repository contains the algorithm specification, canonical test vectors, and implementation guidance for IntToUuid, an encoding scheme that converts a non-negative integer identifier (0 to 2⁶³ − 1, the non-negative range of a signed 64-bit integer) and an optional 32-bit namespace into a valid, deterministic, reversible UUID.
IntToUuid encodes auto-incrementing database IDs as non-sequential UUIDs for use in public-facing contexts (APIs, URLs, external references) where exposing raw integer identifiers would be undesirable. The encoding is performed at runtime and is fully reversible — the UUID does not need to be stored or indexed alongside the original integer.
The algorithm uses XXH3 (64-bit) as its hash primitive, XOR-based obfuscation for the id and namespace values, and a deterministic seed that doubles as a checksum for decode-time validation. The output is a standard RFC 9562 Version 8 UUID, recognized by any compliant UUID parser.
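To illustrate what "Version 8" means at the byte level, the sketch below stamps the RFC 9562 version and variant bits onto 16 arbitrary bytes and confirms that Python's standard `uuid` module accepts the result. The helper name `as_uuid_v8` and the placeholder payload are assumptions made for this example; it shows only the generic v8 framing, not the IntToUuid field layout, which is defined in SPECIFICATION.md.

```python
import uuid


def as_uuid_v8(payload: bytes) -> uuid.UUID:
    """Stamp RFC 9562 version 8 and variant bits onto 16 arbitrary bytes.

    Hypothetical helper: generic v8 framing only, not the IntToUuid layout.
    """
    if len(payload) != 16:
        raise ValueError("payload must be exactly 16 bytes")
    raw = bytearray(payload)
    raw[6] = (raw[6] & 0x0F) | 0x80  # high nibble of byte 6: version = 8
    raw[8] = (raw[8] & 0x3F) | 0x80  # top two bits of byte 8: variant = 0b10
    return uuid.UUID(bytes=bytes(raw))


# Placeholder payload for demonstration; not output of the algorithm.
example = as_uuid_v8(bytes(range(16)))
print(example)          # 00010203-0405-8607-8809-0a0b0c0d0e0f
print(example.version)  # 8
```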
- Deterministic: The same (id, namespace) pair always produces the same UUID.
- Reversible: Any UUID produced by the algorithm decodes to the original values.
- Non-sequential: Consecutive IDs produce visually unrelated UUIDs.
- Self-validating: Decoding detects UUIDs not produced by this algorithm.
The integer values are encoded, not encrypted. A party with knowledge of this specification can recover the original values. This scheme mitigates casual user-enumeration attacks but does not provide cryptographic security.
The full algorithm specification is in SPECIFICATION.md.
It covers input constraints, the encoding and decoding algorithms, UUID field
layout, seed computation, validation, error conditions, and security
considerations.
The test-vectors.json file contains 36 canonical test vectors generated by the PHP reference implementation.
Conforming implementations MUST reproduce these vectors exactly.
The vectors cover zero values, small identifiers, namespace variations, 32-bit boundary values, large identifiers, and maximum value combinations. See Appendix A of the specification for the full annotated listing.
The following recommendations are intended for authors of new implementations in any language.
Rather than passing raw integer scalars into encode/decode functions, wrap the id and namespace in a dedicated value object (or struct, record, data class, etc.) that enforces the input constraints at construction time:
- id MUST be a non-negative integer, maximum 9,223,372,036,854,775,807 (2⁶³ − 1).
- namespace MUST be a non-negative integer, maximum 4,294,967,295 (2³² − 1).
- namespace defaults to 0 when not specified.
This ensures invalid values are rejected early with a clear error, regardless of
whether the caller intends to encode immediately or pass the value around first.
It also simplifies the return value when decoding a UUID object (or string) into
id and namespace integers.
The PHP reference implementation uses an IntegerId class for this purpose.
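A minimal sketch of such a value object in Python might look like the following. The class name echoes the PHP IntegerId class, but the field names, constants, and exception types here are assumptions for illustration and are not taken from the reference implementation.

```python
from dataclasses import dataclass

MAX_ID = 2**63 - 1         # 9,223,372,036,854,775,807
MAX_NAMESPACE = 2**32 - 1  # 4,294,967,295


@dataclass(frozen=True)
class IntegerId:
    """Immutable (id, namespace) pair validated at construction time."""

    id: int
    namespace: int = 0  # defaults to 0 when not specified

    def __post_init__(self) -> None:
        for name, value, maximum in (
            ("id", self.id, MAX_ID),
            ("namespace", self.namespace, MAX_NAMESPACE),
        ):
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(f"{name} must be an integer")
            if not 0 <= value <= maximum:
                raise ValueError(f"{name} must be between 0 and {maximum}")
```

Because the dataclass is frozen and compares by value, two instances built from the same id and namespace are equal, which is convenient when asserting round-trip behavior in tests.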
The single most common source of cross-language incompatibility is the byte order of the XXH3 digest. The 64-bit hash integer MUST be serialized as big-endian bytes before use in XOR operations. Verify your implementation against the reference value before running the full test vector suite:
XXH3_64bits("hello, world") = 0x302cd5fba73d006c
As 8 big-endian bytes: 30 2c d5 fb a7 3d 00 6c
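The byte-order check can be sketched in Python without any third-party dependency by hard-coding the reference digest quoted above. In a real implementation the integer would come from an XXH3 library; the function name `digest_to_bytes` is an assumption for this example.

```python
# Reference value for XXH3_64bits("hello, world"), copied from the text above.
REFERENCE_DIGEST = 0x302CD5FBA73D006C


def digest_to_bytes(digest: int) -> bytes:
    """Serialize a 64-bit hash integer as 8 big-endian bytes."""
    return digest.to_bytes(8, byteorder="big", signed=False)


big = digest_to_bytes(REFERENCE_DIGEST)
little = REFERENCE_DIGEST.to_bytes(8, byteorder="little", signed=False)

print(big.hex(" "))     # 30 2c d5 fb a7 3d 00 6c
print(little.hex(" "))  # 6c 00 3d a7 fb d5 2c 30 (wrong order for this scheme)
```

If an implementation's first hash byte for this input is 6c rather than 30, the digest is being serialized little-endian and needs to be reversed.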
When encoding the namespace, the hash output is 8 bytes but the namespace is only 4 bytes. The XOR MUST use only the first 4 bytes of the hash. Some languages (like PHP) handle this implicitly; most require explicit slicing.
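The explicit slice could be written as follows in Python. This is a sketch of the truncated XOR step only: the function name `xor_namespace` is hypothetical, and the hash bytes are a placeholder (the reference digest quoted earlier), since the actual hash input for this step is defined by the specification.

```python
def xor_namespace(namespace: int, hash_bytes: bytes) -> bytes:
    """XOR a 32-bit namespace with the first 4 bytes of an 8-byte hash."""
    if len(hash_bytes) != 8:
        raise ValueError("expected an 8-byte hash")
    ns_bytes = namespace.to_bytes(4, byteorder="big", signed=False)
    mask = hash_bytes[:4]  # explicit slice: only the leading 4 bytes are used
    return bytes(n ^ h for n, h in zip(ns_bytes, mask))


# Placeholder hash bytes for illustration only.
hash_bytes = bytes.fromhex("302cd5fba73d006c")
obfuscated = xor_namespace(1, hash_bytes)
print(obfuscated.hex())  # 302cd5fa

# XOR is its own inverse, so applying the same mask recovers the namespace.
recovered = int.from_bytes(
    bytes(o ^ h for o, h in zip(obfuscated, hash_bytes[:4])), byteorder="big"
)
print(recovered)  # 1
```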
All integer-to-bytes conversions use unsigned big-endian (network byte order) encoding. This applies to both the id (64-bit) and namespace (32-bit), as well as the seed and intermediate hash extractions.
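For the fixed-width conversions, one option in Python is the standard `struct` module, whose `>Q` and `>I` formats are unsigned big-endian 64-bit and 32-bit integers. The helper names below are assumptions for this sketch; note that `>Q` accepts values up to 2⁶⁴ − 1, so the 2⁶³ − 1 limit on id still has to be enforced separately.

```python
import struct


def id_to_bytes(value: int) -> bytes:
    """64-bit unsigned big-endian; raises struct.error on negative input."""
    return struct.pack(">Q", value)


def namespace_to_bytes(value: int) -> bytes:
    """32-bit unsigned big-endian; raises struct.error if out of range."""
    return struct.pack(">I", value)


def bytes_to_int(data: bytes) -> int:
    """Inverse conversion for either width."""
    return int.from_bytes(data, byteorder="big", signed=False)


print(id_to_bytes(1).hex())                 # 0000000000000001
print(id_to_bytes(2**63 - 1).hex())         # 7fffffffffffffff
print(namespace_to_bytes(2**32 - 1).hex())  # ffffffff
```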
The decode function should return the same value object type used for encoding
input, establishing a symmetry where decode(encode(integerId)) returns an
equivalent value object.
| Language | Package | Status |
|---|---|---|
| PHP | wickedbyte/int-to-uuid | Reference Implementation |
If you have written a conforming implementation and would like it listed here,
please open a pull request adding it to the table above. Implementations should
pass all test vectors in test-vectors.json and note any
deviations from the specification (e.g., extended id range for unsigned 64-bit
languages).
Contributions to the specification and test vectors are welcome. Please open an issue for discussion before submitting changes to the algorithm itself, as any modification requires updating all existing implementations.