Description
Occasionally, the mmap syscall will crash with:
CR0: 0x80050033 CR3: 0xA000
CR2: 0x0 CR4: 0x350E20
RAX: 0x21403000 RBX: 0x0 RCX: 0x1006338
RDX: 0x3 RSI: 0x3F000 RDI: 0x0
RIP: 0x21AA RBP: 0x1198A10 RSP: 0x11989E8
SS: 0x10 CS: 0x8 DS: 0x23 FS: 0x0 GS: 0x0
FS BASE: 0x0 GS BASE: 0x6050
Machine exception: page_at: pt entry not user writable Data: 0x21403000
terminate called after throwing an instance of 'tinykvm::MemoryException'
what(): page_at: pt entry not user writable
with a callstack pointing to tinykvm/lib/tinykvm/machine_utils.cpp, line 33 (at fe757a7):
auto* page = memory.get_writable_page(addr & ~PageMask(), memory.expectedUsermodeFlags(), true, false);
The following:
collect_state_guest = master_vm.mmap_allocate(0x1000, 0x7, false);
tinykvm::page_at(master_vm.main_memory(), collect_state_guest, [] (uint64_t addr, uint64_t& entry, size_t size) {
// Make the page executable by the user (There is probably a better way to do this?)
entry = (entry & ~PDE64_NX) | PDE64_DIRTY;
});
// Emulate the relevant mmap
auto new_page = master_vm.mmap_allocate(258048, 3);
master_vm.memzero(new_page, 258048);
is a reduced reproducer. The original symptom came from a userspace program executing mmap(0x0, 258048, prot=3, flags=22, vfd=-1) = 0x21403000. The collect_state page is an executable memory page that I'm allocating from the VMM. The issue "goes away" if you don't set PDE64_DIRTY in page_at; however, removing all of the PDE64_DIRTY flags from the original program still causes a crash in a (slightly later) mmap call instead.
I believe the issue is that the page_at call above resolves to a hugepage that also contains the newly mmap'd region. The mmap path then sees that the entry covering collect_state_guest has the dirty bit set, and thus must_be_zeroed = true. The later get_writable_page then gets that same hugepage, which has PDE64_NX cleared, and fails the flag check against vMemory::expectedUsermodeFlags.
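To make that hypothesis concrete, here is a minimal standalone model of it. This is not tinykvm's code: `expected_usermode_flags`, `entry_matches`, `same_huge_page` and `make_executable` are hypothetical stand-ins I wrote for illustration, and I'm assuming the expected flags are present + writable + user, plus NX when executable_heap is off. Only the bit positions are architectural x86-64 values.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

// Architectural x86-64 page-table entry bits.
constexpr u64 PDE64_PRESENT = 1ull << 0;
constexpr u64 PDE64_RW      = 1ull << 1;
constexpr u64 PDE64_USER    = 1ull << 2;
constexpr u64 PDE64_DIRTY   = 1ull << 6;
constexpr u64 PDE64_PS      = 1ull << 7;   // 2 MiB page when set in a PD entry
constexpr u64 PDE64_NX      = 1ull << 63;

constexpr u64 HUGE_PAGE_SIZE = 2ull << 20; // 2 MiB

// ASSUMPTION: stand-in for vMemory::expectedUsermodeFlags(); the real
// function may differ. NX is only expected when executable_heap is off.
constexpr u64 expected_usermode_flags(bool executable_heap) {
    return PDE64_PRESENT | PDE64_RW | PDE64_USER
         | (executable_heap ? 0 : PDE64_NX);
}

// ASSUMPTION: model of the check that throws "pt entry not user writable".
constexpr bool entry_matches(u64 entry, u64 expected) {
    return (entry & expected) == expected;
}

// True when both addresses are covered by the same 2 MiB hugepage entry.
constexpr bool same_huge_page(u64 a, u64 b) {
    return (a & ~(HUGE_PAGE_SIZE - 1)) == (b & ~(HUGE_PAGE_SIZE - 1));
}

// What the page_at callback in the reproducer does to the entry it is given.
constexpr u64 make_executable(u64 entry) {
    return (entry & ~PDE64_NX) | PDE64_DIRTY;
}

// Walks through the suspected sequence of events.
inline void check_hypothesis() {
    // A hugepage entry as I'd expect it before the callback runs.
    const u64 huge = PDE64_PRESENT | PDE64_RW | PDE64_USER | PDE64_PS | PDE64_NX;
    assert(entry_matches(huge, expected_usermode_flags(false)));

    // The callback edits the shared hugepage entry, not a 4 KiB leaf.
    const u64 patched = make_executable(huge);

    // The whole mmap'd range from the trace sits inside one hugepage.
    assert(same_huge_page(0x21403000, 0x21403000 + 258048 - 1));

    // With executable_heap off, the NX-less entry no longer matches.
    assert(!entry_matches(patched, expected_usermode_flags(false)));
}
```

If this model is right, the same patched entry would pass the check with executable_heap enabled, which would explain why that option hides the problem.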
This is maybe a case of me holding tinykvm wrong, and executable pages should somehow be allocated separately from non-executable pages? But mmap_allocate throws away prot, so I'm not sure how else I'm supposed to allocate code pages from the VMM. It seems like the executable_heap MachineOption configures !NX everywhere via vMemory::expectedUsermodeFlags, and so would either cause or hide this issue depending on whether it is turned on (e.g. because you have a dynamic ELF) or off. But even in the non-executable case, it seems like you could get unlucky: the .text page that the machine initially maps for your ELF could coalesce with the first user-serviced mmap and hit the same failure.