Networking Architecture

1. Overview of the TCP Server

The Redis-lite server utilizes a POSIX-compliant TCP networking stack. It operates fundamentally as a daemon that binds to a specific IP address and port (default: 127.0.0.1:6379), listening for inbound connections. The networking layer is deliberately decoupled from the data execution layer, functioning exclusively to negotiate transport and stream bytes back and forth over the wire.

2. Why TCP Was Chosen

  • Reliability: TCP guarantees ordered, lossless delivery of data packets. For a key-value store, losing bytes in a SET command or reordering tokens would silently corrupt the database state.
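  • Stream-Oriented: TCP abstracts network packets into a continuous byte stream, so the internal Parser never has to deal with MTU limits or manual packet fragmentation. The stream carries no message boundaries, however: a single recv() may return part of a command or several commands at once, so input must be buffered and split on the \n terminator before it is tokenized.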
  • Connection State: Unlike UDP, TCP maintains persistent sessions, which aligns perfectly with our Thread Pool architecture where a worker handles a client for the duration of its session.
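
Because the stream has no message boundaries, something has to reassemble complete lines from whatever chunks recv() happens to return. The following is a minimal sketch of that framing step, not the server's actual parser: line_buf, lb_feed, lb_next_line, and the LB_CAP size are hypothetical names and values chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

#define LB_CAP 512  /* assumed per-client buffer size */

struct line_buf {
    char   data[LB_CAP];
    size_t len;          /* number of bytes currently buffered */
};

/* Append a chunk as it arrives from recv().
 * Returns 0 on success, -1 if the chunk does not fit. */
int lb_feed(struct line_buf *lb, const char *chunk, size_t n)
{
    if (n > LB_CAP - lb->len)
        return -1;
    memcpy(lb->data + lb->len, chunk, n);
    lb->len += n;
    return 0;
}

/* Copy the next complete line (without its '\n') into out.
 * Returns 1 if a line was extracted, 0 if no complete line is
 * buffered yet, -1 if out is too small to hold the line. */
int lb_next_line(struct line_buf *lb, char *out, size_t out_cap)
{
    char *nl = memchr(lb->data, '\n', lb->len);
    if (nl == NULL)
        return 0;

    size_t line_len = (size_t)(nl - lb->data);
    if (line_len + 1 > out_cap)
        return -1;

    memcpy(out, lb->data, line_len);
    out[line_len] = '\0';

    /* Drop the consumed line and keep any bytes that follow it. */
    size_t consumed = line_len + 1;
    memmove(lb->data, lb->data + consumed, lb->len - consumed);
    lb->len -= consumed;
    return 1;
}
```

Feeding "SET user:1 al" and then "ice\nGET user:1\n" yields nothing after the first chunk and two complete commands after the second, which is the behavior a tokenizer sitting on top of a TCP stream needs.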

3. Socket Lifecycle

The POSIX socket API dictates a strict lifecycle for server initialization and client management:

       SERVER                          CLIENT
       ======                          ======
    1. socket()
         |
    2. bind()
         |
    3. listen()
         |
    4. accept() <------------------- socket()
         |                              |
         v                              v
      (Blocks) <-------------------- connect()
         |                              |
         v                              v
       recv() <--------------------- send()  [Command]
         |                              |
         v                              v
       send() ---------------------> recv()  [Response]
         |                              |
         v                              v
    5. close() <-------------------- close()
  • socket(): Requests a file descriptor from the OS kernel for IPv4 TCP communication.
  • bind(): Binds the file descriptor to the specific network interface and port (6379).
  • listen(): Instructs the kernel to begin queuing inbound TCP handshakes (up to the BACKLOG limit).
  • accept(): Dequeues a fully established TCP connection, returning a distinct, new socket descriptor dedicated solely to that client.
  • recv() / send(): Streams bytes back and forth over the established client descriptor.
  • close(): Tears down the connection and frees the descriptor back to the OS.
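
The lifecycle above can be exercised end to end inside a single process. This is an illustrative sketch only: lifecycle_demo is a hypothetical function, the backlog of 16 is an assumed value, and port 0 is used so the kernel picks a free port instead of 6379. Running client and server in one thread works here because the kernel completes the handshake in the listen backlog before accept() is called.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: one command/response exchange over loopback.
 * Returns 0 on success, -1 on any failure. */
int lifecycle_demo(char *reply, size_t reply_cap)
{
    static const char command[] = "SET user:1 alice\n";
    int rc = -1, lfd = -1, cfd = -1, sfd = -1;
    struct sockaddr_in addr;
    socklen_t alen = sizeof addr;
    char inbuf[64];
    ssize_t n;

    if (reply_cap == 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);            /* 0 = kernel chooses a free port */

    /* Server side: socket() -> bind() -> listen() */
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0) goto out;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0) goto out;
    if (listen(lfd, 16) < 0) goto out;
    /* Learn which port the kernel assigned so the client can reach it. */
    if (getsockname(lfd, (struct sockaddr *)&addr, &alen) < 0) goto out;

    /* Client side: socket() -> connect() */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0) goto out;
    if (connect(cfd, (struct sockaddr *)&addr, sizeof addr) < 0) goto out;

    /* accept() returns a new descriptor dedicated to this client. */
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0) goto out;

    /* Command travels client -> server, response travels back. */
    if (send(cfd, command, sizeof command - 1, 0) != (ssize_t)(sizeof command - 1))
        goto out;
    n = recv(sfd, inbuf, sizeof inbuf, 0);
    if (n <= 0) goto out;
    if (send(sfd, "OK\n", 3, 0) != 3) goto out;
    n = recv(cfd, reply, reply_cap - 1, 0);
    if (n <= 0) goto out;
    reply[n] = '\0';
    rc = 0;

out:
    /* close() every descriptor that was opened, on success or failure. */
    if (sfd >= 0) close(sfd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return rc;
}
```

Note that the listening descriptor and the accepted descriptor are closed separately: closing a client connection never affects the listener.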

4. Server Startup Sequence

During startup (server_start), the server synchronously executes socket(), sets the SO_REUSEADDR socket option, executes bind(), and finally listen(). SO_REUSEADDR keeps bind() from failing with EADDRINUSE ("Address already in use") when the server is restarted while connections from the previous run are still in the TIME_WAIT state. Crucially, the server registers a SIGINT handler (via sigaction) before entering the accept() loop.
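
A minimal sketch of the socket portion of that sequence follows. It is not the project's server_start: open_listener is a hypothetical helper, the BACKLOG value of 128 is assumed, and signal registration is left out to keep the example focused on the socket calls.

```c
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define BACKLOG 128  /* assumed value; the real limit is project-specific */

/* Create a loopback listening socket on the given port.
 * Returns the listening descriptor, or -1 after reporting the error. */
int open_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return -1;
    }

    /* Must be set before bind() to have any effect on it. */
    int yes = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes) < 0) {
        perror("setsockopt(SO_REUSEADDR)");
        close(fd);
        return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* 127.0.0.1 */
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        perror("bind");
        close(fd);
        return -1;
    }
    if (listen(fd, BACKLOG) < 0) {
        perror("listen");
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would typically treat a -1 return as fatal, for example `if (open_listener(6379) < 0) exit(EXIT_FAILURE);`.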

5. Client Connection Lifecycle (Integration with Thread Pool)

The architecture deliberately isolates the blocking accept() loop from request processing.

[ Kernel TCP Backlog ]
          |
          v
[ Main Thread: accept() ] ---> Returns Client Socket FD (e.g., FD 5)
          |
          v
[ Enqueue to Thread Pool ] ---> Signal Worker Thread
          |
          v
[ Main Thread loops back ] ---> Instantly ready for next accept()

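By handing the accepted socket descriptor immediately to the Producer-Consumer Job Queue, the Main Thread never performs parsing or data retrieval. A slow client therefore cannot stall the accept() loop, which removes Head-of-Line blocking at the accept stage. Queued connections still wait whenever every worker is busy (see Section 11).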
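
The hand-off between the Main Thread and the workers can be sketched as a bounded FIFO of descriptors guarded by a mutex and a condition variable. This is one common way to build such a queue, offered as an assumption about the design rather than the project's actual code; job_queue, jq_init, jq_push, jq_pop, and QUEUE_CAP are hypothetical.

```c
#include <pthread.h>

#define QUEUE_CAP 64  /* assumed capacity */

struct job_queue {
    int fds[QUEUE_CAP];        /* ring buffer of client descriptors */
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
};

void jq_init(struct job_queue *q)
{
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Producer side (Main Thread, right after accept()).
 * Returns 0 on success, -1 if the queue is full. */
int jq_push(struct job_queue *q, int fd)
{
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAP) {
        q->fds[q->tail] = fd;
        q->tail = (q->tail + 1) % QUEUE_CAP;
        q->count++;
        pthread_cond_signal(&q->not_empty);  /* wake one idle worker */
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Consumer side (Worker Thread). Blocks until a descriptor is available. */
int jq_pop(struct job_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)                    /* loop guards against spurious wakeups */
        pthread_cond_wait(&q->not_empty, &q->lock);
    int fd = q->fds[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return fd;
}
```

In this sketch a full queue makes jq_push fail instead of blocking, so the accept() loop stays responsive; the caller would then have to decide whether to close the new connection or retry.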

6. Request-Response Protocol

The server speaks a strict plaintext protocol designed for parsing efficiency. Commands are expected in the format: COMMAND ARG1 ARG2\n. Responses are returned directly as strings:

  • Success: OK\n
  • Value: <value>\n
  • Missing: (nil)\n

7. Command Flow (End-to-End)

Client Application
       | (TCP Send)
       v
Worker Thread: recv()  -->  "SET user:1 alice\n"
       |
       v
Parser Module          -->  tokens: ["SET", "user:1", "alice"]
       |
       v
Executor Module        -->  Routes to store_set("user:1", "alice")
       |
       v
Store Module           -->  Acquires Write Lock, inserts into Hash Table
       |
       v
Executor Module        -->  Formats "OK\n"
       |
       v
Worker Thread: send()  -->  Flushes "OK\n" back through the socket
       |
       v
Client Application
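
The Parser and Executor stages of this flow can be sketched in a few lines. Everything here is illustrative: tokenize and execute are hypothetical names, the single-entry "store" stands in for the real hash table and its locking, the GET command is inferred from the Value and Missing responses, and the ERR reply is an assumption because the document does not define an error response. The input line is assumed to have had its trailing \n removed already.

```c
#include <stdio.h>
#include <string.h>

#define MAX_TOKENS 3

/* Split line in place on spaces. Returns the number of tokens found
 * (at most max_tokens); anything after the last token is ignored. */
int tokenize(char *line, char *tokens[], int max_tokens)
{
    int n = 0;
    char *p = line;
    while (n < max_tokens) {
        p += strspn(p, " ");       /* skip leading spaces */
        if (*p == '\0')
            break;
        tokens[n++] = p;
        p += strcspn(p, " ");      /* advance to the end of the token */
        if (*p == '\0')
            break;
        *p++ = '\0';               /* terminate the token in place */
    }
    return n;
}

/* Toy single-entry store standing in for the hash table. */
static char stored_key[64], stored_val[64];
static int  has_entry;

/* Route a command line and write the protocol response into reply. */
void execute(char *line, char *reply, size_t cap)
{
    char *tok[MAX_TOKENS];
    int n = tokenize(line, tok, MAX_TOKENS);

    if (n == 3 && strcmp(tok[0], "SET") == 0) {
        snprintf(stored_key, sizeof stored_key, "%s", tok[1]);
        snprintf(stored_val, sizeof stored_val, "%s", tok[2]);
        has_entry = 1;
        snprintf(reply, cap, "OK\n");
    } else if (n == 2 && strcmp(tok[0], "GET") == 0) {
        if (has_entry && strcmp(stored_key, tok[1]) == 0)
            snprintf(reply, cap, "%s\n", stored_val);
        else
            snprintf(reply, cap, "(nil)\n");
    } else {
        snprintf(reply, cap, "ERR\n");  /* assumed error reply */
    }
}
```

Tokenizing in place avoids allocating per-token copies, at the cost of modifying the receive buffer, so the line must not be reused after it has been executed.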

8. Error Handling Strategy

System programming demands aggressive error checking. Every POSIX networking call reports failure through its return value (-1, with the cause stored in errno), so each return value must be checked.

  • Fatal Errors: If socket(), bind(), or listen() fails (e.g., the port is already in use), the server reports the error with perror() and exits with EXIT_FAILURE. The server cannot operate without a listening port.
  • Transient Errors: If send() or recv() fails (e.g., ECONNRESET when a client process crashes and its connection is reset), the failure is isolated to that specific Worker Thread. The worker logs the error, calls close(fd), and returns to the Thread Pool to service the next healthy client.
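
For the transient case, a write helper along the following lines keeps a failed client from affecting anything beyond its own worker. It is a sketch under stated assumptions: send_all is a hypothetical name, and MSG_NOSIGNAL (available on Linux and specified in POSIX.1-2008) is assumed so that writing to a disconnected peer returns an error instead of raising SIGPIPE, which would otherwise terminate the whole process by default.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send exactly len bytes on fd.
 * Returns 0 on success, -1 on failure with errno set by send(). */
int send_all(int fd, const char *buf, size_t len)
{
    size_t sent = 0;
    while (sent < len) {
        /* send() may accept fewer bytes than requested, so loop. */
        ssize_t n = send(fd, buf + sent, len - sent, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;             /* e.g. ECONNRESET or EPIPE */
        }
        sent += (size_t)n;
    }
    return 0;
}
```

On a -1 return the worker would log the error, close the descriptor, and go back to the pool, as described in the Transient Errors bullet.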

9. Graceful Client Disconnect

When a client application gracefully terminates or calls close(), the recv() call on the server side returns 0 bytes. The Worker Thread detects this 0, breaks out of its execution loop, and explicitly executes close(client_fd) to prevent file descriptor leaks (FD exhaustion), before yielding back to the idle pool.
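
The disconnect check reduces to distinguishing three recv() outcomes: a positive byte count, 0 for an orderly shutdown, and -1 for an error. A minimal sketch, assuming a hypothetical serve_until_disconnect function in place of the real worker loop (which would parse and execute each command where this one only counts bytes):

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Read from client_fd until the peer disconnects.
 * Returns the total number of bytes received, or -1 on error.
 * The descriptor is closed on every path. */
long serve_until_disconnect(int client_fd)
{
    char buf[256];
    long total = 0;

    for (;;) {
        ssize_t n = recv(client_fd, buf, sizeof buf, 0);
        if (n == 0)
            break;                 /* peer closed the connection */
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            total = -1;
            break;
        }
        total += n;                /* a real worker would handle the command here */
    }

    close(client_fd);              /* single exit point prevents descriptor leaks */
    return total;
}
```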

10. Graceful Server Shutdown (Signals)

If a user presses Ctrl+C, the OS sends SIGINT.

  1. The registered sigaction handler flips a global atomic shutdown_flag.
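  2. The blocked accept() call is interrupted by the signal (it returns -1 with errno set to EINTR, provided the handler was installed without SA_RESTART); the loop then evaluates the flag and breaks.
  3. The listening socket (port 6379) is closed so that no new connections are accepted.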
  4. The Thread Pool is instructed to destroy itself, allowing active workers to finish their current in-flight requests.
  5. All memory (Hash table, Threads, Queues) is freed cleanly.

11. Scalability and Limitations

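  • Current Architecture: The Thread Pool dedicates one worker to each connection for the duration of its session. This model scales well up to hundreds of concurrent connections; the number of clients served at once is bounded by the number of worker threads, and overall throughput by the CPU core count.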
  • Limitations: A worker stays bound to its client for as long as the client remains connected, even if the client sends nothing. If 10,000 clients connect simultaneously and 9,990 of them sit idle, the idle connections can occupy every worker in the pool (e.g., 100 workers), leaving active clients queued indefinitely until a worker becomes free.
  • Future Improvements: To achieve Cloudflare-level internet scale (millions of concurrent, mostly-idle websockets), the networking layer must transition from a blocking Thread Pool to an Asynchronous Event Loop using epoll (Linux) or kqueue (macOS), completely decoupling active TCP connections from OS threads.
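
To make the epoll direction concrete, the sketch below shows the two primitive operations such an event loop is built from: registering a descriptor and waiting for readiness. It is Linux-only and purely illustrative; watch_fd and wait_readable are hypothetical helpers, not a proposed design for the server.

```c
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Ask the kernel to report when fd becomes readable.
 * Returns 0 on success, -1 on failure. */
int watch_fd(int epfd, int fd)
{
    struct epoll_event ev;
    memset(&ev, 0, sizeof ev);
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Wait up to timeout_ms for one registered descriptor to become readable.
 * Returns 1 and stores the descriptor in *ready_fd, 0 on timeout,
 * or -1 on error. */
int wait_readable(int epfd, int timeout_ms, int *ready_fd)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    if (n == 1)
        *ready_fd = ev.data.fd;
    return n;
}
```

With this model an idle connection costs one registration in the epoll set instead of one blocked thread, so a handful of threads can watch thousands of mostly-idle clients and only do work for the descriptors that are actually readable.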