@ryanbreen
Owner

Summary

  • Fix scheduler desync in yield_current() that was corrupting fork return values
  • Fix register preservation bug in timer interrupt causing RAX=3 on userspace entry
  • Add proper FD cleanup on process termination for correct pipe EOF/EPIPE behavior
  • Add dup2 syscall and comprehensive pipe IPC tests

Changes

Kernel Fixes

Scheduler desync (f59bccd)

  • yield_current() called schedule(), which updated current_thread even though no context switch took place
  • The scheduler therefore believed Thread B was running while Thread A was still on the CPU
  • The next timer interrupt saved Thread A's registers into Thread B's context, corrupting fork return values
  • Fix: yield_current() now only calls set_need_resched(); the actual scheduling decision and context switch happen at interrupt return
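
The desync can be reproduced in a toy model. The sketch below is not the kernel's code: the `Scheduler` struct, its `ready_queue` field, `yield_current_buggy()`, and `context_saved_to()` are hypothetical names introduced for illustration, and thread IDs stand in for real register contexts.

```rust
use std::collections::VecDeque;

/// Toy scheduler that tracks thread IDs only (hypothetical, not the kernel's type).
pub struct Scheduler {
    pub current_thread: u64,
    pub ready_queue: VecDeque<u64>,
    pub need_resched: bool,
}

impl Scheduler {
    pub fn new(current: u64, ready: &[u64]) -> Self {
        Scheduler {
            current_thread: current,
            ready_queue: ready.iter().copied().collect(),
            need_resched: false,
        }
    }

    /// Picks the next thread and records it as current. Returns (old, new).
    /// The caller must follow up with a real context switch, otherwise
    /// `current_thread` no longer matches what the CPU is executing.
    pub fn schedule(&mut self) -> Option<(u64, u64)> {
        let next = self.ready_queue.pop_front()?;
        let old = self.current_thread;
        self.ready_queue.push_back(old);
        self.current_thread = next;
        Some((old, next))
    }

    /// Buggy variant: updates the bookkeeping, but nothing switches the CPU.
    pub fn yield_current_buggy(&mut self) {
        let _ = self.schedule();
    }

    /// Fixed variant: only requests a reschedule at the next interrupt return.
    pub fn yield_current(&mut self) {
        self.need_resched = true;
    }
}

/// Simulates "thread A yields, then the timer fires" and returns the ID of the
/// thread whose saved context would receive the live (thread A) registers.
pub fn context_saved_to(buggy: bool) -> u64 {
    const A: u64 = 1;
    const B: u64 = 2;
    const C: u64 = 3;
    let mut sched = Scheduler::new(A, &[B, C]);
    if buggy {
        sched.yield_current_buggy();
    } else {
        sched.yield_current();
    }
    // Timer interrupt: in both cases the CPU is still executing thread A.
    let (old, _new) = sched.schedule().expect("ready queue is non-empty");
    old // the interrupt path saves the live registers into `old`'s context
}
```

With the buggy yield the function returns thread B's ID, so A's registers land in B's context; with the fixed yield it returns A's ID.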

Register initialization (8df81c0)

  • timer_entry.asm overwrote RCX with the CS value for the privilege check after the registers had already been restored
  • This caused RAX=3 (from 0x33 & 3) instead of 0 on first userspace entry
  • Fix: Save/restore RCX around the privilege level check
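
The value 3 is the requested privilege level (RPL) encoded in the low two bits of the CS selector. A minimal sketch of that arithmetic follows; the helper names `rpl()` and `from_userspace()` are hypothetical, and the kernel selector 0x08 in the usage note is an illustrative assumption rather than a value taken from this PR.

```rust
/// Requested Privilege Level: the low two bits of an x86 segment selector.
pub fn rpl(selector: u16) -> u16 {
    selector & 3
}

/// True when the selector belongs to ring 3 (userspace).
pub fn from_userspace(cs: u16) -> bool {
    rpl(cs) == 3
}
```

For the userspace selector 0x33 named above, `rpl(0x33)` is 3, which is the stray value that leaked into the register state. A ring 0 selector such as 0x08 would give 0.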

Process FD cleanup (8df81c0)

  • Added close_all_fds() to Process::terminate()
  • Ensures pipe reader/writer counts decrement properly when processes exit
  • Fixes pipe EOF detection in concurrent scenarios
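
Why the cleanup matters for EOF and EPIPE can be shown with a simplified model. Everything below is a sketch under assumptions: `PipeState`, `Fd`, `ReadResult`, `WriteResult`, and the free functions `read()` and `write()` are hypothetical stand-ins, and only `close_all_fds()` and `terminate()` borrow their names from this PR.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared pipe bookkeeping (hypothetical layout).
#[derive(Debug, Default)]
pub struct PipeState {
    pub readers: usize,
    pub writers: usize,
    pub buffered: usize,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ReadResult { Data(usize), Eof, WouldBlock }

#[derive(Debug, PartialEq, Eq)]
pub enum WriteResult { Written(usize), BrokenPipe }

pub enum Fd {
    PipeRead(Rc<RefCell<PipeState>>),
    PipeWrite(Rc<RefCell<PipeState>>),
}

pub struct Process {
    pub fds: Vec<Option<Fd>>,
}

impl Process {
    /// Closes every open descriptor, decrementing the pipe end counts.
    pub fn close_all_fds(&mut self) {
        for slot in self.fds.iter_mut() {
            match slot.take() {
                Some(Fd::PipeRead(p)) => { p.borrow_mut().readers -= 1; }
                Some(Fd::PipeWrite(p)) => { p.borrow_mut().writers -= 1; }
                None => {}
            }
        }
    }

    pub fn terminate(&mut self) {
        self.close_all_fds();
    }
}

/// An empty pipe reports EOF only once no writers remain.
pub fn read(pipe: &PipeState) -> ReadResult {
    if pipe.buffered > 0 {
        ReadResult::Data(pipe.buffered)
    } else if pipe.writers == 0 {
        ReadResult::Eof
    } else {
        ReadResult::WouldBlock
    }
}

/// Writing to a pipe with no readers fails with a broken pipe (EPIPE).
pub fn write(pipe: &PipeState, len: usize) -> WriteResult {
    if pipe.readers == 0 { WriteResult::BrokenPipe } else { WriteResult::Written(len) }
}

/// A reader polls an empty pipe before and after the only writer exits.
pub fn reader_view_around_writer_exit() -> (ReadResult, ReadResult) {
    let pipe = Rc::new(RefCell::new(PipeState { readers: 1, writers: 1, buffered: 0 }));
    let mut writer = Process { fds: vec![None, Some(Fd::PipeWrite(Rc::clone(&pipe)))] };
    let before = read(&pipe.borrow());
    writer.terminate();
    let after = read(&pipe.borrow());
    (before, after)
}
```

If `terminate()` skipped `close_all_fds()`, the writer count would stay at 1 and the reader would keep seeing `WouldBlock` instead of `Eof`.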

Heap size (8df81c0)

  • Increased from 1 MiB to 4 MiB to support concurrent process tests
  • The bump allocator needs headroom because it reclaims memory only once ALL allocations have been freed
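
The headroom requirement follows from how a bump allocator reclaims memory. The sketch below is a simplified model, not the kernel's allocator: `BumpAllocator` and its fields are hypothetical, sizes are plain offsets, and alignment is ignored.

```rust
/// Toy bump allocator over a heap of `heap_size` bytes (hypothetical model).
pub struct BumpAllocator {
    heap_size: usize,
    next: usize, // offset of the next free byte
    live: usize, // number of outstanding allocations
}

impl BumpAllocator {
    pub fn new(heap_size: usize) -> Self {
        BumpAllocator { heap_size, next: 0, live: 0 }
    }

    /// Returns the offset of the new allocation, or None when the heap is exhausted.
    pub fn alloc(&mut self, size: usize) -> Option<usize> {
        let end = self.next.checked_add(size)?;
        if end > self.heap_size {
            return None;
        }
        let offset = self.next;
        self.next = end;
        self.live += 1;
        Some(offset)
    }

    /// Frees one allocation. Space is reclaimed only when none remain live.
    pub fn dealloc(&mut self) {
        debug_assert!(self.live > 0, "dealloc without a matching alloc");
        self.live -= 1;
        if self.live == 0 {
            self.next = 0;
        }
    }

    pub fn used(&self) -> usize {
        self.next
    }
}
```

Freeing one of two allocations leaves `used()` unchanged, so a long-lived allocation pins everything allocated before and after it until it is freed too. Concurrent process tests keep many allocations live at once, which is consistent with the need for a larger heap.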

New Syscall

  • Added dup2 (syscall 33) for file descriptor duplication
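
For reference, the POSIX semantics of dup2 can be modelled with a small descriptor table. This is a sketch of the standard behavior, not the kernel's implementation: `FdTable` here is a hypothetical toy whose slots hold a shared `Rc<str>` in place of a real file object, and whether the kernel matches every edge case is an assumption.

```rust
use std::rc::Rc;

#[derive(Debug, PartialEq, Eq)]
pub enum Errno { Ebadf }

/// Toy descriptor table: each open slot shares a reference-counted "file".
pub struct FdTable {
    slots: Vec<Option<Rc<str>>>,
}

impl FdTable {
    pub fn new(capacity: usize) -> Self {
        FdTable { slots: vec![None; capacity] }
    }

    /// Test helper: installs a named file at `fd` (panics if out of range).
    pub fn open(&mut self, fd: usize, name: &str) {
        self.slots[fd] = Some(Rc::from(name));
    }

    /// POSIX-style dup2: make `new_fd` refer to the same file as `old_fd`.
    pub fn dup2(&mut self, old_fd: usize, new_fd: usize) -> Result<usize, Errno> {
        // old_fd must be open.
        let file = self.slots.get(old_fd).and_then(|s| s.clone()).ok_or(Errno::Ebadf)?;
        if new_fd >= self.slots.len() {
            return Err(Errno::Ebadf);
        }
        // Duplicating a descriptor onto itself is a no-op.
        if old_fd == new_fd {
            return Ok(new_fd);
        }
        // Overwriting the slot drops (closes) whatever new_fd referred to before.
        self.slots[new_fd] = Some(file);
        Ok(new_fd)
    }

    /// Number of descriptors sharing the file behind `fd` (0 if closed).
    pub fn refcount(&self, fd: usize) -> usize {
        self.slots
            .get(fd)
            .and_then(|s| s.as_ref())
            .map_or(0, |rc| Rc::strong_count(rc))
    }
}
```

Because both descriptors share one underlying object, a pipe end duplicated this way stays open until every copy is closed, which is the interaction with reference counting that pipe_refcount_test.rs exercises.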

New Tests

  • pipe_fork_test.rs - Pipe IPC across fork with EOF detection
  • pipe_concurrent_test.rs - 4 concurrent writers to single pipe
  • pipe_refcount_test.rs - Reference counting, EOF, EPIPE, dup2 behavior

Test plan

  • All 78/78 boot stages pass
  • pipe_fork_test validates parent/child communication and EOF
  • pipe_concurrent_test validates 4 writers, 12 messages, 384 bytes
  • pipe_refcount_test validates 10 test cases including dup2
  • Register initialization test confirms all GP registers = 0

🤖 Generated with Claude Code

ryanbreen and others added 4 commits December 22, 2025 14:18

yield_current() was calling schedule(), which updates self.current_thread,
but no actual context switch happened. This caused the scheduler to get
out of sync with reality:

1. Thread A runs, calls yield (via sys_yield or pipe blocking)
2. yield_current() calls schedule(), returns (A, B)
3. schedule() sets self.current_thread = B
4. No actual context switch - thread A continues running
5. Timer fires; schedule() returns (B, C) because it thinks B is current
6. Context save stores thread A's registers to thread B's context
7. Thread B's fork return value (RAX) is corrupted with thread A's RAX

The fix is to just set need_resched flag in yield_current(). The actual
scheduling decision and context switch will happen at the next interrupt
return via check_need_resched_and_switch(), which properly:
- Identifies the currently running thread from interrupt context
- Saves that thread's registers
- Restores the target thread's context

This was causing the pipe_fork_test parent to run child code because
the parent's RAX (fork return value = child PID) was being overwritten
with 0 from another thread's register state.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

get_pending_switch() had the same bug as yield_current(): it called
schedule(), which mutates self.current_thread. Calling get_pending_switch()
"just to peek" at the pending scheduling decision would corrupt the scheduler
state and cause the same context corruption bug.

Since the function was unused (#[allow(dead_code)]), removing it is the
safest fix. If a peek function is needed in the future, it must be
implemented without calling schedule().
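
One way to keep a future peek function side-effect free is to have it take `&self`, so the compiler rejects any attempt to mutate scheduler state. The sketch below is hypothetical: `Scheduler`, `ready_queue`, and `peek_next_switch()` are illustrative names, not code from this repository.

```rust
use std::collections::VecDeque;

/// Toy scheduler state (hypothetical field names).
pub struct Scheduler {
    current_thread: u64,
    ready_queue: VecDeque<u64>,
}

impl Scheduler {
    pub fn new(current: u64, ready: &[u64]) -> Self {
        Scheduler {
            current_thread: current,
            ready_queue: ready.iter().copied().collect(),
        }
    }

    pub fn current(&self) -> u64 {
        self.current_thread
    }

    /// Reports the switch schedule() would make, as (old, new), without making it.
    /// Taking `&self` means this cannot pop the queue or change `current_thread`.
    pub fn peek_next_switch(&self) -> Option<(u64, u64)> {
        self.ready_queue
            .front()
            .map(|&next| (self.current_thread, next))
    }
}
```

Calling `peek_next_switch()` repeatedly returns the same answer and leaves `current()` unchanged.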

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

This commit addresses multiple interconnected issues in process forking,
pipe handling, and resource cleanup:

Kernel Fixes:
- Fix register preservation in timer_entry.asm: RCX was being overwritten
  with CS value during privilege check, causing RAX=3 on userspace entry
- Add close_all_fds() to Process::terminate() to properly decrement pipe
  reader/writer counts when processes exit
- Add FdTable::drop() as safety net for fd cleanup
- Increase HEAP_SIZE from 1 MiB to 4 MiB to support concurrent process tests
- Add scheduler unit test for yield_current() to prevent regression
- Add architectural constraint comment warning against schedule() for peeking
- Add dup2 syscall (number 33) for file descriptor duplication

New Tests:
- pipe_fork_test.rs: Tests pipe IPC across fork with EOF detection
- pipe_concurrent_test.rs: Tests 4 concurrent writers to single pipe
- pipe_refcount_test.rs: Tests reference counting, EOF, EPIPE, dup2

Test Improvements:
- Add EAGAIN retry loops to pipe tests for proper non-blocking handling
- Update register_init_test to validate all GP registers are zeroed
- Add boot stages 77-78 for pipe fork and concurrent tests
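
An EAGAIN retry loop of the kind mentioned above might look like the following. This is a sketch under assumptions: the syscall is modelled as a closure returning a byte count or a negated errno, `read_with_retry()` is a hypothetical helper, and the value 11 for EAGAIN follows the Linux convention, which this kernel may or may not share.

```rust
/// Assumed errno value (Linux convention); the kernel's constant may differ.
pub const EAGAIN: i64 = 11;

/// Retries `try_read` while it reports EAGAIN, up to `max_attempts` times.
/// `try_read` returns a non-negative byte count or a negated errno.
/// Returns Ok(bytes) on success or Err(errno) on failure or exhaustion.
pub fn read_with_retry<F>(mut try_read: F, max_attempts: usize) -> Result<usize, i64>
where
    F: FnMut() -> i64,
{
    for _ in 0..max_attempts {
        let ret = try_read();
        if ret >= 0 {
            return Ok(ret as usize);
        }
        if ret != -EAGAIN {
            // Any other error is final; do not retry.
            return Err(-ret);
        }
        // A real test would yield to the scheduler here before retrying.
    }
    Err(EAGAIN)
}
```

Bounding the number of attempts keeps a test from spinning forever if the writer never produces data.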

All 78/78 boot stages now pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>