
Why is RAPIDJSON_48BITPOINTER_OPTIMIZATION x86-64-specific when ARM64 also has 48-bit virtual addresses? #2354

@adjgh

Description


I've recently been studying the RapidJSON source code and am particularly interested in the RAPIDJSON_48BITPOINTER_OPTIMIZATION macro. I've done some thinking on it myself, but unfortunately, the original author Milo Yip has deactivated his Zhihu account, so I can no longer reach him. I'm posting my question here—hopefully some experts can help clarify my confusion.
I understand that this macro is primarily designed to save memory per node. For example, on 32-bit systems, the Data structure takes 16 bytes, while on standard 64-bit systems it takes 24 bytes—due to an extra 8 bytes needed to store metadata like flags. With this optimization enabled, however, 8 bytes are saved.
I noticed that RAPIDJSON_48BITPOINTER_OPTIMIZATION is only enabled for x86-64 architectures. My understanding is that although x86-64 uses 64-bit pointers, mainstream x86-64 processors implement only 48 bits of virtual addressing (with canonical-form addresses), leaving the upper 16 bits of a user-space pointer effectively unused. RapidJSON leverages this by storing metadata (such as flags) in those otherwise-unused high bits.
For example, in the source code, there's a structure like this:
```cpp
struct Flag {
#if RAPIDJSON_48BITPOINTER_OPTIMIZATION
    char payload[sizeof(SizeType) * 2 + 6];                 // 2 x SizeType + lower 48-bit pointer
#elif RAPIDJSON_64BIT
    char payload[sizeof(SizeType) * 2 + sizeof(void*) + 6]; // 6 padding bytes
#else
    char payload[sizeof(SizeType) * 2 + sizeof(void*) + 2]; // 2 padding bytes
#endif
    uint16_t flags;
};
```
And another structure:
```cpp
struct String {
    SizeType length;
    SizeType hashcode;  //!< reserved
    const Ch* str;
};  // 12 bytes in 32-bit mode, 16 bytes in 64-bit mode
```
These two are part of a union Data, meaning they share the same memory layout:
```cpp
union Data {
    String s;
    ShortString ss;
    Number n;
    ObjectData o;
    ArrayData a;
    Flag f;
};  // 16 bytes in 32-bit mode, 24 bytes in 64-bit mode,
    // 16 bytes in 64-bit with RAPIDJSON_48BITPOINTER_OPTIMIZATION
```
Because x86-64 is little-endian, the bytes in Flag::payload (the two SizeType fields plus the lower 48 bits of the pointer) line up exactly with the layout of the str pointer in the String struct, whose high 16 bits go unused. The uint16_t flags member then overlays what would be the upper 16 bits of the pointer, so no explicit bit-shifting is needed: the layout "just works" thanks to the union and the structs' field offsets.
However, here's my question: ARM64 (AArch64) also uses only 48-bit virtual addresses (in the standard case), leaving the upper 16 bits of a pointer free, and it's also little-endian. So why isn't this optimization enabled for ARM64?
Is this limitation due to assumptions about:

  • ABI differences?

  • Lack of guaranteed canonical addressing on ARM64?

  • Portability concerns (e.g., some ARM64 implementations using 52-bit addresses with LVA)?

  • Or simply because the optimization hasn't been implemented/tested for ARM yet?

I'd really appreciate any insights into the reasoning behind this x86-64-specific restriction.
