[DYNAREC] Save temporary registers on the stack before calling PrintT…#3876

Merged: ptitSeb merged 1 commit into ptitSeb:main from zengdage:save-restore-temp-register on May 20, 2026

@zengdage
Contributor

…race

PrintTrace may overwrite temporary registers, so they must be saved on the stack before the call and restored after it returns.

For example, in the RV64 backend, register `t3` holds the result of `test rax, rax`. PrintTrace may overwrite it, so the following `jz`, emitted as `BEQ t3, zero`, branches on stale data. In the trace below, only `t0` is saved around the call:

```
[BOX64] 0x3f0000239b: 48 85 C0 test rax, rax
[BOX64] 0x3ff7af34d4: 53 emitted opcodes, inst=2, barrier=0 state=3/1(0), set=3F/80, use=0, need=0/80, fuse=1/0, sm=0(0/0), sew@entry=7, sew@exit=7, pred=1
        03f00e13        ADDI            t3, zero, 0x3f(63)
        45ccaa23        SW              t3, emu_s9, 0x454(1108)
        01087e33        AND             t3, rax_a6, rax_a6
        47ccb423        SD              t3, emu_s9, 0x468(1128)
[BOX64] New Instruction x64:0x3f0000239e, native:0x3ff7af35a8
[BOX64] TRACE ----
        01f80b37        LUI             rip_s6, 0x1f80000(33030144)
        001b0b1b        ADDIW           rip_s6, rip_s6, 0x1(1)
        00db1b13        SLLI            rip_s6, rip_s6, 0xd(13)
        39eb0b13        ADDI            rip_s6, rip_s6, 0x39e(926)
        000b0313        MV              t1, rip_s6
        018cbc23        SD              rbx_s8, emu_s9, 0x18(24)
        029cb023        SD              rsp_s1, emu_s9, 0x20(32)
        028cb423        SD              rbp_s0, emu_s9, 0x28(40)
        05acb823        SD              r10_s10, emu_s9, 0x50(80)
        05bcbc23        SD              r11_s11, emu_s9, 0x58(88)
        072cb023        SD              r12_s2, emu_s9, 0x60(96)
        073cb423        SD              r13_s3, emu_s9, 0x68(104)
        074cb823        SD              r14_s4, emu_s9, 0x70(112)
        075cbc23        SD              r15_s5, emu_s9, 0x78(120)
        0010039b        ADDIW           t2, zero, 0x1(1)
        ffffffb7        LUI             t6, 0xfffff000(-4096)
        7dff8f9b        ADDIW           t6, t6, 0x7df(2015)
        01fbffb3        AND             t6, flags_s7, t6
        020bfb93        ANDI            flags_s7, flags_s7, 0x20(32)
        006b9b93        SLLI            flags_s7, flags_s7, 0x6(6)
        01fbebb3        OR              flags_s7, flags_s7, t6
        097cb023        SD              flags_s7, emu_s9, 0x80(128)
        ff010113        ADDI            sp, sp, 0xfffffff0(-16)
        00513023        SD              t0, sp, 0x0(0)
        02acbc23        SD              rdi_a0, emu_s9, 0x38(56)
        02bcb823        SD              rsi_a1, emu_s9, 0x30(48)
        00ccb823        SD              rdx_a2, emu_s9, 0x10(16)
        00dcb423        SD              rcx_a3, emu_s9, 0x8(8)
        04ecb023        SD              r8_a4, emu_s9, 0x40(64)
        04fcb423        SD              r9_a5, emu_s9, 0x48(72)
        010cb023        SD              rax_a6, emu_s9, 0x0(0)
        096cb423        SD              rip_s6, emu_s9, 0x88(136)
[BOX64]   Table64: 0x5d
        00000f97        AUIPC           t6, 0x0(0)
        3a8fbf83        LD              t6, t6, 0x3a8(936)
        00030593        MV              rsi_a1, t1
        00038613        MV              rdx_a2, t2
        000c8513        MV              rdi_a0, emu_s9
        000f80e7        JALR            ra, t6, 0x0(0)
        00013283        LD              t0, sp, 0x0(0)
        01010113        ADDI            sp, sp, 0x10(16)
        038cb503        LD              rdi_a0, emu_s9, 0x38(56)
        030cb583        LD              rsi_a1, emu_s9, 0x30(48)
        010cb603        LD              rdx_a2, emu_s9, 0x10(16)
        008cb683        LD              rcx_a3, emu_s9, 0x8(8)
        040cb703        LD              r8_a4, emu_s9, 0x40(64)
        048cb783        LD              r9_a5, emu_s9, 0x48(72)
        000cb803        LD              rax_a6, emu_s9, 0x0(0)
        088cbb03        LD              rip_s6, emu_s9, 0x88(136)
        080cbb83        LD              flags_s7, emu_s9, 0x80(128)
        fdfbfb93        ANDI            flags_s7, flags_s7, 0xffffffdf(-33)
        006bdf93        SRLI            t6, flags_s7, 0x6(6)
        020fff93        ANDI            t6, t6, 0x20(32)
        01fbebb3        OR              flags_s7, flags_s7, t6
[BOX64] ----------
[BOX64] 0x3f0000239e: 74 02 jz 0x0000003F000023A2
[BOX64] 0x3ff7af35a8: 55 emitted opcodes, inst=3, barrier=0 state=0/3(0), set=0/0, use=0, need=80/80, fuse=1/0, sm=0(0/0), sew@entry=7, sew@exit=7, pred=2, jmp=5
        140e0463        BEQ             t3, zero, 0x148(328) # +82i(0x3ff7af37c4)
        00000013        NOP
```
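
The failure in the trace above can be modelled in a few lines of C. This is only an illustrative sketch, not box64 code: `temp_regs`, `clobbering_call` and the two `t3_after_*` functions are hypothetical names, and the callee is assumed to overwrite every temporary, which the RISC-V calling convention permits because `t0` to `t6` are caller-saved.

```c
#include <assert.h>
#include <stdint.h>

/* RISC-V temporaries t0..t6; the calling convention treats them as caller-saved. */
enum { T0, T1, T2, T3, T4, T5, T6, NUM_TEMPS };

typedef struct {
    uint64_t t[NUM_TEMPS];
} temp_regs;

/* Bounds-checked read of one temporary. */
static uint64_t read_temp(const temp_regs *r, int idx) {
    assert(idx >= 0 && idx < NUM_TEMPS);
    return r->t[idx];
}

/* Hypothetical stand-in for a C callee such as PrintTrace: it is free to
 * overwrite every caller-saved temporary. */
static void clobbering_call(temp_regs *r) {
    for (int i = 0; i < NUM_TEMPS; ++i)
        r->t[i] = 0xDEADBEEFu;
}

/* Value of t3 seen by the branch when the call is emitted unprotected. */
uint64_t t3_after_unprotected_call(uint64_t rax) {
    temp_regs r = {{0}};
    r.t[T3] = rax & rax;            /* AND t3, rax, rax         */
    clobbering_call(&r);            /* JALR ra, t6, 0           */
    return read_temp(&r, T3);       /* BEQ t3, zero reads this  */
}

/* Same sequence with t3 pushed before the call and popped afterwards. */
uint64_t t3_after_protected_call(uint64_t rax) {
    temp_regs r = {{0}};
    uint64_t stack_slot;
    r.t[T3] = rax & rax;            /* AND t3, rax, rax         */
    stack_slot = read_temp(&r, T3); /* SD t3, sp, 0             */
    clobbering_call(&r);            /* JALR ra, t6, 0           */
    r.t[T3] = stack_slot;           /* LD t3, sp, 0             */
    return read_temp(&r, T3);
}
```

With `rax == 0` the unprotected variant returns the clobbered value, so a `jz` modelled on it would fall through instead of jumping; the protected variant returns 0 as expected.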

@ptitSeb
Owner

ptitSeb commented May 20, 2026

Are you sure it's useful to save all 6 temp registers here? That adds a lot of opcodes in this case!

@ksco
Collaborator

ksco commented May 20, 2026

Nice find, but yeah, you should use get_free_scratch() here.

@zengdage zengdage force-pushed the save-restore-temp-register branch 3 times, most recently from 950cfb9 to 117c0f3 Compare May 20, 2026 15:18
@zengdage zengdage force-pushed the save-restore-temp-register branch from 117c0f3 to 10aae81 Compare May 20, 2026 15:39
@ptitSeb ptitSeb merged commit ee42c71 into ptitSeb:main May 20, 2026
28 checks passed
@ptitSeb
Owner

ptitSeb commented May 20, 2026

Thanks

zengdage added a commit to zengdage/box64 that referenced this pull request May 21, 2026
Fix scratch register corruption that occurs when the flags producer and
consumer are not consecutive and BOX64_DYNAREC_TRACE is enabled, which was
introduced by ptitSeb#3876.

For example, the `test rax, rax` flags producer stores its flag calculation
operands in scratch register `t3`. The next instruction, `mov r14, rax`, does
not use scratch registers, but its associated trace code can still overwrite
`t3`. This means we need to look at the flags consumer, here
`jz 0x0000003F0000ABC0`, to identify which registers require saving.

```
[BOX64] 0x3f00009f68: 48 85 C0 test rax, rax
[BOX64] 0x3ff7afaef0: 53 emitted opcodes, inst=14, barrier=0 state=3/1(0), set=3F/80, use=0, need=0/80, fuse=1/0, sm=0(0/0), sew@entry=7, sew@exit=7, pred=13
	03f00e13	ADDI            t3, zero, 0x3f(63)
	45ccaa23	SW              t3, emu_s9, 0x454(1108)
	01087e33	AND             t3, rax_a6, rax_a6
	47ccb423	SD              t3, emu_s9, 0x468(1128)
[BOX64] New Instruction x64:0x3f00009f6b, native:0x3ff7afafc4
[BOX64] TRACE ----
[BOX64]  n1:0 n2:0 ----
	01f80b37	LUI             rip_s6, 0x1f80000(33030144)
	005b0b1b	ADDIW           rip_s6, rip_s6, 0x5(5)
	00db1b13	SLLI            rip_s6, rip_s6, 0xd(13)
	f6bb0b13	ADDI            rip_s6, rip_s6, 0xffffff6b(-149)
...............................................................
...............................................................
...............................................................
	048cb783	LD              r9_a5, emu_s9, 0x48(72)
	000cb803	LD              rax_a6, emu_s9, 0x0(0)
	088cbb03	LD              rip_s6, emu_s9, 0x88(136)
	080cbb83	LD              flags_s7, emu_s9, 0x80(128)
	fdfbfb93	ANDI            flags_s7, flags_s7, 0xffffffdf(-33)
	006bdf93	SRLI            t6, flags_s7, 0x6(6)
	020fff93	ANDI            t6, t6, 0x20(32)
	01fbebb3	OR              flags_s7, flags_s7, t6
[BOX64] ----------
[BOX64] 0x3f00009f6b: 49 89 C6 mov r14, rax
[BOX64] 0x3ff7afafc4: 54 emitted opcodes, inst=15, barrier=0 state=0/3(0), set=0/0, use=0, need=80/80, fuse=0/1, sm=0(0/0), sew@entry=7, sew@exit=7, pred=14
	00080a13	MV              r14_s4, rax_a6
[BOX64] New Instruction x64:0x3f00009f6e, native:0x3ff7afb09c
[BOX64] TRACE ----
[BOX64]  n1:28 n2:0 ----
	ff010113	ADDI            sp, sp, 0xfffffff0(-16)
	01c13023	SD              t3, sp, 0x0(0)
	01f80b37	LUI             rip_s6, 0x1f80000(33030144)
	01fbebb3	OR              flags_s7, flags_s7, t6
...............................................................
...............................................................
...............................................................
	00013e03	LD              t3, sp, 0x0(0)
	01010113	ADDI            sp, sp, 0x10(16)
[BOX64] ----------
[BOX64] 0x3f00009f6e: 0F 84 4C 0C 00 00 jz 0x0000003F0000ABC0
[BOX64] 0x3ff7afb09c: 67 emitted opcodes, inst=16, barrier=2 state=0/3(0), set=0/0, use=0, need=80/80, fuse=1/0, sm=0(0/0), sew@entry=7, sew@exit=7, pred=15, jmp=out
	020e1463	BNE             t3, zero, 0x28(40) # +10i(0x3ff7afb1a8)
	00000013	NOP

```
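
The gap between producer and consumer can be expressed as a small liveness check. The sketch below is a toy model under stated assumptions, not the box64 implementation: `insn_info`, `example_block`, `naive_needs_save` and `live_needs_save` are hypothetical names, and the "naive" rule (protect only the trace call emitted right before a consumer) is a guess at the kind of check that misses this case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool produces_flags; /* leaves flag data in a scratch register */
    bool consumes_flags; /* reads that scratch register            */
} insn_info;

/* test rax, rax ; mov r14, rax ; jz ... */
const insn_info example_block[3] = {
    { true,  false },
    { false, false },
    { false, true  },
};

/* Naive rule (assumed): only the trace call right before a consumer
 * preserves the scratch register. */
bool naive_needs_save(const insn_info *code, size_t n, size_t k) {
    assert(k < n);
    return code[k].consumes_flags;
}

/* Liveness rule: the trace call emitted before instruction k must preserve
 * the scratch register if some earlier instruction produced it and an
 * instruction at or after k consumes it before it is produced again. */
bool live_needs_save(const insn_info *code, size_t n, size_t k) {
    assert(k < n);
    bool produced = false;
    for (size_t i = 0; i < k; ++i)
        if (code[i].produces_flags)
            produced = true;
    if (!produced)
        return false;
    for (size_t i = k; i < n; ++i) {
        if (code[i].consumes_flags)
            return true;   /* value is still needed */
        if (code[i].produces_flags)
            return false;  /* value is overwritten first */
    }
    return false;
}
```

For the trace call before `mov r14, rax` (index 1), the naive rule reports that nothing needs saving while the liveness rule reports that the scratch register is still live, which is the corruption described in the commit message.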
zengdage added a commit to zengdage/box64 that referenced this pull request May 22, 2026
…ratch registers

1. Rename the macro to SPILL_NF_REGISTERS and add implementations for LA64 and PPC64LE.
2. Change the nat flag register spill logic to save all scratch registers.

Fix scratch register corruption that occurs when the flags producer and
consumer are not consecutive and BOX64_DYNAREC_TRACE is enabled, which was
introduced by ptitSeb#3876.

For example, the `test rax, rax` flags producer stores its flag calculation
operands in scratch register `t3`. The next instruction, `mov r14, rax`, does
not use scratch registers, but its associated trace code can still overwrite
`t3`. This means we would need to look at the flags consumer, here
`jz 0x0000003F0000ABC0`, to identify which registers require saving. That is
too complicated, so we went with the simpler approach of saving all scratch
registers; this does not add noticeable performance overhead in trace mode.

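
The cost of the save-everything approach is easy to estimate. The helpers below are a rough sketch with hypothetical names (`spill_frame_bytes`, `spill_opcode_count`), assuming 64-bit registers, a 16-byte aligned stack pointer as the RISC-V calling convention requires, and one store plus one load per register with two `ADDI` adjustments of `sp`; the real SPILL_NF_REGISTERS macro may be laid out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stack bytes needed to spill nregs 64-bit registers, rounded up so that
 * sp stays 16-byte aligned. */
size_t spill_frame_bytes(size_t nregs) {
    assert(nregs <= (SIZE_MAX - 15) / 8); /* guard against overflow */
    return (nregs * 8 + 15) & ~(size_t)15;
}

/* Extra opcodes per trace call: one SD and one LD per register, plus one
 * ADDI to reserve the frame and one to release it. */
size_t spill_opcode_count(size_t nregs) {
    return nregs == 0 ? 0 : 2 * nregs + 2;
}
```

For a single register this gives a 16-byte frame and 4 opcodes, matching the `ADDI sp, sp, -16` / `SD t3` / `LD t3` / `ADDI sp, sp, 16` sequence in the trace; for six registers it gives 48 bytes and 14 opcodes per trace call.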
ptitSeb pushed a commit that referenced this pull request May 22, 2026
…ratch registers (#3880)

@zengdage zengdage deleted the save-restore-temp-register branch May 22, 2026 08:45