wickedbyte/int-to-uuid-spec

IntToUuid Specification

Bidirectional encoding of integer identifiers into RFC 9562 Version 8 UUIDs.

This repository contains the algorithm specification, canonical test vectors, and implementation guidance for IntToUuid — an encoding scheme that converts a non-negative 64-bit integer and an optional 32-bit namespace into a valid, deterministic, reversible UUID.

Overview

IntToUuid encodes auto-incrementing database IDs as non-sequential UUIDs for use in public-facing contexts (APIs, URLs, external references) where exposing raw integer identifiers would be undesirable. The encoding is performed at runtime and is fully reversible — the UUID does not need to be stored or indexed alongside the original integer.

The algorithm uses XXH3 (64-bit) as its hash primitive, XOR-based obfuscation for the id and namespace values, and a deterministic seed that doubles as a checksum for decode-time validation. The output is a standard RFC 9562 Version 8 UUID, recognized by any compliant UUID parser.

Key Properties

  • Deterministic: The same (id, namespace) pair always produces the same UUID.
  • Reversible: Any UUID produced by the algorithm decodes to the original values.
  • Non-sequential: Consecutive IDs produce visually unrelated UUIDs.
  • Self-validating: Decoding detects UUIDs not produced by this algorithm.

What This Is Not

The integer values are encoded, not encrypted. A party with knowledge of this specification can recover the original values. This scheme mitigates casual user-enumeration attacks but does not provide cryptographic security.

Specification

The full algorithm specification is in SPECIFICATION.md. It covers input constraints, the encoding and decoding algorithms, UUID field layout, seed computation, validation, error conditions, and security considerations.

Test Vectors

The test-vectors.json file contains 36 canonical test vectors generated by the PHP reference implementation. Conforming implementations MUST reproduce these vectors exactly.

The vectors cover zero values, small identifiers, namespace variations, 32-bit boundary values, large identifiers, and maximum value combinations. See Appendix A of the specification for the full annotated listing.

Implementation Guidance

The following recommendations are intended for authors of new implementations in any language.

Use a Value Object for the Integer ID

Rather than passing raw integer scalars into encode/decode functions, wrap the id and namespace in a dedicated value object (or struct, record, data class, etc.) that enforces the input constraints at construction time:

  • id MUST be a non-negative integer, maximum 9,223,372,036,854,775,807 (2⁶³ − 1).
  • namespace MUST be a non-negative integer, maximum 4,294,967,295 (2³² − 1).
  • namespace defaults to 0 when not specified.

This ensures invalid values are rejected early with a clear error, regardless of whether the caller intends to encode immediately or pass the value around first. It also simplifies the return value when decoding a UUID object (or string) into id and namespace integers.

The PHP reference implementation uses an IntegerId class for this purpose.
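As an illustration of the constraints listed above, a value object could look like the following minimal Python sketch. The class name mirrors the PHP reference implementation's IntegerId, but the field names, constants, and exception types here are assumptions chosen for this example, not part of the specification.

```python
from dataclasses import dataclass

# Upper bounds taken from the constraints listed above.
MAX_ID = 2**63 - 1         # 9,223,372,036,854,775,807
MAX_NAMESPACE = 2**32 - 1  # 4,294,967,295


@dataclass(frozen=True)
class IntegerId:
    """Immutable (id, namespace) pair validated at construction time."""

    id: int
    namespace: int = 0  # defaults to 0 when not specified

    def __post_init__(self) -> None:
        for name, value, maximum in (
            ("id", self.id, MAX_ID),
            ("namespace", self.namespace, MAX_NAMESPACE),
        ):
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(f"{name} must be an integer")
            if not 0 <= value <= maximum:
                raise ValueError(f"{name} must be between 0 and {maximum}")
```

Because the dataclass is frozen and compares by value, two instances built from the same id and namespace are equal, which is convenient when comparing a decoded result against the original input.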

XXH3 Byte Order

The single most common source of cross-language incompatibility is the byte order of the XXH3 digest. The 64-bit hash integer MUST be serialized as big-endian bytes before use in XOR operations. Verify your implementation against the reference value before running the full test vector suite:

XXH3_64bits("hello, world") = 0x302cd5fba73d006c

As 8 big-endian bytes: 30 2c d5 fb a7 3d 00 6c
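The byte-order check can be sketched in Python without any hashing library by starting from the reference digest quoted above. The helper name `digest_to_bytes` is hypothetical. In a real implementation the integer would come from whichever XXH3 binding is in use.

```python
# Reference value from the text above: XXH3_64bits("hello, world").
REFERENCE_DIGEST = 0x302CD5FBA73D006C


def digest_to_bytes(digest: int) -> bytes:
    """Serialize a 64-bit hash integer as 8 big-endian bytes."""
    return digest.to_bytes(8, byteorder="big")


big = digest_to_bytes(REFERENCE_DIGEST)
little = REFERENCE_DIGEST.to_bytes(8, byteorder="little")

print(big.hex(" "))     # 30 2c d5 fb a7 3d 00 6c
print(little.hex(" "))  # 6c 00 3d a7 fb d5 2c 30 (reversed; not what this scheme expects)
```

If a hashing library returns raw digest bytes rather than an integer, it is worth comparing them against the big-endian sequence directly, since libraries differ in the byte order they emit.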

XOR Truncation

When encoding the namespace, the hash output is 8 bytes but the namespace is only 4 bytes. The XOR MUST use only the first 4 bytes of the hash. Some languages (like PHP) handle this implicitly; most require explicit slicing.
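A minimal Python sketch of the explicit slice follows. The function name `xor_namespace` is hypothetical, and the hash bytes used in the example are simply the "hello, world" reference digest standing in for a real value. Which input is actually hashed at this step is defined in SPECIFICATION.md and is not reproduced here.

```python
def xor_namespace(namespace_bytes: bytes, hash_bytes: bytes) -> bytes:
    """XOR a 4-byte namespace with the first 4 bytes of an 8-byte hash."""
    if len(namespace_bytes) != 4:
        raise ValueError("namespace must be exactly 4 bytes")
    if len(hash_bytes) != 8:
        raise ValueError("hash must be exactly 8 bytes")
    mask = hash_bytes[:4]  # explicit truncation to the namespace width
    return bytes(n ^ m for n, m in zip(namespace_bytes, mask))


# Placeholder hash bytes (the reference digest), for illustration only.
hash_bytes = bytes.fromhex("302cd5fba73d006c")
namespace = (0xFFFFFFFF).to_bytes(4, byteorder="big")

obfuscated = xor_namespace(namespace, hash_bytes)
print(obfuscated.hex())                             # cfd32a04
print(xor_namespace(obfuscated, hash_bytes).hex())  # ffffffff
```

XOR is its own inverse, so applying the same 4-byte mask a second time recovers the original namespace bytes.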

All Integers Are Big-Endian

All integer-to-bytes conversions use unsigned big-endian (network byte order) encoding. This applies to both the id (64-bit) and namespace (32-bit), as well as the seed and intermediate hash extractions.
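In Python, the fixed-width conversions described above can be sketched with the standard `struct` module. The helper names are hypothetical. `>Q` and `>I` are the `struct` format codes for unsigned big-endian 64-bit and 32-bit integers.

```python
import struct


def id_to_bytes(value: int) -> bytes:
    """Encode an id as 8 unsigned big-endian bytes."""
    return struct.pack(">Q", value)


def namespace_to_bytes(value: int) -> bytes:
    """Encode a namespace as 4 unsigned big-endian bytes."""
    return struct.pack(">I", value)


def bytes_to_int(data: bytes) -> int:
    """Decode unsigned big-endian bytes of any width back to an integer."""
    return int.from_bytes(data, byteorder="big", signed=False)


print(id_to_bytes(2**63 - 1).hex())          # 7fffffffffffffff
print(namespace_to_bytes(1).hex())           # 00000001
print(bytes_to_int(bytes.fromhex("0102")))   # 258
```

`struct.pack` raises `struct.error` for negative or out-of-range values, which gives a second line of defense behind the value object's own validation.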

Return Type for Decode

The decode function should return the same value object type used for encoding input, establishing a symmetry where decode(encode(integerId)) returns an equivalent value object.
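The symmetry can be checked with a small round-trip helper, sketched here in Python. Everything in this block is illustrative: `IntegerId` is a simplified stand-in for the value object, `check_round_trip` is a hypothetical helper, and the placeholder codec is NOT the IntToUuid algorithm. It only packs the two integers into a UUID so the helper has something to exercise.

```python
import uuid
from typing import Callable, Iterable, NamedTuple


class IntegerId(NamedTuple):
    """Simplified stand-in for the value object (no validation here)."""

    id: int
    namespace: int = 0


def check_round_trip(
    encode: Callable[[IntegerId], uuid.UUID],
    decode: Callable[[uuid.UUID], IntegerId],
    samples: Iterable[IntegerId],
) -> None:
    """Raise AssertionError unless decode(encode(x)) == x for every sample."""
    for original in samples:
        decoded = decode(encode(original))
        if decoded != original:
            raise AssertionError(f"{original!r} decoded as {decoded!r}")


# Placeholder codec, NOT the IntToUuid algorithm: plain bit packing.
def placeholder_encode(value: IntegerId) -> uuid.UUID:
    return uuid.UUID(int=(value.id << 32) | value.namespace)


def placeholder_decode(encoded: uuid.UUID) -> IntegerId:
    return IntegerId(encoded.int >> 32, encoded.int & 0xFFFFFFFF)


check_round_trip(
    placeholder_encode,
    placeholder_decode,
    [IntegerId(0), IntegerId(1, 2), IntegerId(2**63 - 1, 2**32 - 1)],
)
```

Swapping the placeholder functions for a real encode/decode pair turns this into a quick property check to run alongside the canonical test vectors.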

Implementations

| Language | Package | Status |
| --- | --- | --- |
| PHP | wickedbyte/int-to-uuid | Reference Implementation |

If you have written a conforming implementation and would like it listed here, please open a pull request adding it to the table above. Implementations should pass all test vectors in test-vectors.json and note any deviations from the specification (e.g., extended id range for unsigned 64-bit languages).

Contributing

Contributions to the specification and test vectors are welcome. Please open an issue for discussion before submitting changes to the algorithm itself, as any modification requires updating all existing implementations.
