|
| 1 | +From your **tcpdump output**, the **8-byte UDP packet** from the client (port 50290) to the server (port 13400) **is arriving** at the loopback interface: |
| 2 | + |
| 3 | +``` |
| 4 | +17:31:49.091339 lo In IP localhost.50290 > localhost.13400: UDP, length 8 |
| 5 | + 0x0020: 04fb 0001 0000 0000 |
| 6 | +``` |
| 7 | +This is a **DoIP Vehicle Identification Request** (type `0x0001`), and it is **reaching the kernel**. |
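The eight bytes in the dump can be decoded by hand, and a small parser makes the same check repeatable. The sketch below assumes the DoIP generic header layout (protocol version, inverted version, 16-bit payload type, 32-bit payload length, all big-endian); `DoipHeader` and `parseDoipHeader` are hypothetical names, not part of your server.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Decoded fields of the 8-byte DoIP generic header.
struct DoipHeader {
    std::uint8_t  version;         // byte 0
    std::uint16_t payload_type;    // bytes 2-3, big-endian
    std::uint32_t payload_length;  // bytes 4-7, big-endian
};

// Hypothetical helper: returns std::nullopt if the buffer is shorter than
// 8 bytes or if byte 1 is not the bitwise inverse of byte 0.
inline std::optional<DoipHeader> parseDoipHeader(const std::uint8_t* data,
                                                 std::size_t len) {
    if (data == nullptr || len < 8) return std::nullopt;
    if (static_cast<std::uint8_t>(~data[0]) != data[1]) return std::nullopt;

    DoipHeader h{};
    h.version = data[0];
    h.payload_type =
        static_cast<std::uint16_t>((data[2] << 8) | data[3]);
    h.payload_length = (std::uint32_t{data[4]} << 24) |
                       (std::uint32_t{data[5]} << 16) |
                       (std::uint32_t{data[6]} << 8)  |
                        std::uint32_t{data[7]};
    return h;
}
```

Fed the captured bytes `04 fb 00 01 00 00 00 00`, this yields version `0x04`, payload type `0x0001`, and payload length 0, which matches a request with no payload.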
| 8 | + |
| 9 | +--- |
| 10 | + |
| 11 | +### **Root Cause Analysis** |
| 12 | +Since the packet is visible in `tcpdump` but **not logged by your server**, the issue is **most likely in your server's socket setup**, with packet filtering as a less likely alternative. Here’s what’s happening: |
| 13 | + |
| 14 | +#### **1. The Server Socket is Not Receiving the Packet** |
| 15 | +- The server socket is bound to port 13400, but **something is preventing `recvfrom` from returning the packet**. |
| 16 | +- Possible reasons: |
| 17 | + - The socket is **not in the expected state** (e.g., closed, re-bound, or filtered). |
| 18 | + - The socket is **not the same as the one bound to port 13400** (e.g., `m_udp_sock` was overwritten or closed). |
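| 19 | + - The packet is **dropped by a firewall rule or SELinux** after capture (unlikely for loopback, but `tcpdump` sees inbound packets before netfilter filtering, so the capture does not rule this out). |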
| 20 | + - The socket is **not actually bound to port 13400** (e.g., `bind` failed silently). |
| 21 | + |
| 22 | +#### **2. The Server is Stuck in a Timeout Loop** |
| 23 | +- Your logs show `recvfrom returned: -1 (errno: 11)` repeatedly. |
| 24 | +- On Linux, errno 11 is `EAGAIN`: each `recvfrom` call is **timing out** (or returning immediately on a non-blocking socket) because **no datagram is queued on that socket**. |
| 25 | +- This suggests the socket is **not properly bound** or is **not the same socket** as the one receiving the packet. |
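To keep a timeout from being mistaken for a hard failure in the logs, the return value and errno can be classified explicitly. This is a minimal sketch; `RecvOutcome` and `classifyRecv` are hypothetical names, and the caller is assumed to pass the `errno` value captured immediately after `recvfrom` returned.

```cpp
#include <cerrno>
#include <sys/types.h>

// Possible results of one recvfrom() call on a UDP socket.
enum class RecvOutcome { Data, EmptyDatagram, Timeout, Interrupted, Error };

// Hypothetical helper: n is the recvfrom() return value, err is errno as
// captured right after the call (only consulted when n is negative).
inline RecvOutcome classifyRecv(ssize_t n, int err) {
    if (n > 0) return RecvOutcome::Data;
    if (n == 0) return RecvOutcome::EmptyDatagram;  // zero-length UDP datagram
    if (err == EAGAIN || err == EWOULDBLOCK) {
        return RecvOutcome::Timeout;  // SO_RCVTIMEO expired or non-blocking read
    }
    if (err == EINTR) return RecvOutcome::Interrupted;  // signal; retry
    return RecvOutcome::Error;  // e.g. EBADF if the descriptor was closed
}
```

A steady stream of `Timeout` with no `Error` points at delivery or binding, while `Error` with `EBADF` would point at a closed or overwritten descriptor.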
| 26 | + |
| 27 | +--- |
| 28 | + |
| 29 | +### **Debugging Steps** |
| 30 | +#### **1. Verify the Socket is Bound Correctly** |
| 31 | +Add this **right after `bind()`** in `setupUdpSocket()`: |
| 32 | +```cpp |
| 33 | +// After bind() |
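| 34 | +sockaddr_in bound_addr{}; // zero-initialized so a failed lookup logs 0.0.0.0:0 |
| 35 | +socklen_t bound_addr_len = sizeof(bound_addr); |
| 36 | +if (getsockname(m_udp_sock, (sockaddr*)&bound_addr, &bound_addr_len) != 0) LOG_UDP_DEBUG("getsockname failed (errno: {})", errno); |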
| 37 | +LOG_UDP_DEBUG( |
| 38 | + "Socket {} bound to {}:{}", |
| 39 | + m_udp_sock, |
| 40 | + inet_ntoa(bound_addr.sin_addr), |
| 41 | + ntohs(bound_addr.sin_port) |
| 42 | +); |
| 43 | +``` |
| 44 | +- If this logs `0.0.0.0:13400` or `127.0.0.1:13400`, the socket is bound correctly. |
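| 45 | +- If it logs port `0` or a different address, `bind()` failed, was not called, or bound to an address that traffic sent to `localhost` does not match. |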
| 46 | + |
| 47 | +#### **2. Check for Socket Descriptor Leaks or Overwrites** |
| 48 | +- Log `m_udp_sock` in `receiveUdpMessage()`: |
| 49 | + ```cpp |
| 50 | + LOG_UDP_DEBUG("Using socket fd: {}", m_udp_sock); |
| 51 | + ``` |
| 52 | +- If the value changes between `setupUdpSocket()` and `receiveUdpMessage()`, something is overwriting it. |
| 53 | + |
| 54 | +#### **3. Test with a Minimal UDP Server** |
| 55 | +Replace your `receiveUdpMessage()` with this **minimal test**: |
| 56 | +```cpp |
| 57 | +ssize_t DoIPServer::receiveUdpMessage() { |
| 58 | + char buf[1024]; |
| 59 | + sockaddr_in client_addr; |
| 60 | + socklen_t client_len = sizeof(client_addr); |
| 61 | + ssize_t read_bytes = recvfrom( |
| 62 | + m_udp_sock, buf, sizeof(buf), 0, |
| 63 | + (sockaddr*)&client_addr, &client_len |
| 64 | + ); |
| 65 | + LOG_UDP_DEBUG( |
| 66 | + "recvfrom returned: {} (errno: {})", |
| 67 | + read_bytes, read_bytes < 0 ? errno : 0 // errno is only meaningful when recvfrom fails |
| 68 | + ); |
| 69 | + if (read_bytes > 0) { |
| 70 | + LOG_UDP_INFO( |
| 71 | + "RX {} bytes from {}:{}", |
| 72 | + read_bytes, |
| 73 | + inet_ntoa(client_addr.sin_addr), |
| 74 | + ntohs(client_addr.sin_port) |
| 75 | + ); |
| 76 | + } |
| 77 | + return read_bytes; |
| 78 | +} |
| 79 | +``` |
| 80 | +- If this works, the issue is in your buffer or processing logic. |
| 81 | +- If it still times out, the socket is **not receiving packets**. |
| 82 | + |
| 83 | +#### **4. Check for Firewall/SELinux** |
| 84 | +Even for loopback, run: |
| 85 | +```bash |
| 86 | +sudo iptables -L -n -v # Check firewall rules |
| 87 | +sudo setenforce 0 # Switch SELinux to permissive mode (until reboot or `setenforce 1`) |
| 88 | +``` |
| 89 | +Retest. If it works, adjust your firewall/SELinux rules. |
| 90 | + |
| 91 | +#### **5. Use `strace` to Confirm `recvfrom` Behavior** |
| 92 | +Run the server with: |
| 93 | +```bash |
| 94 | +strace -e trace=recvfrom ./your_server |
| 95 | +``` |
| 96 | +- If `recvfrom` is **never called**, your loop is broken. |
| 97 | +- If it returns `-1 EAGAIN` repeatedly, the socket is **not receiving packets**. |
| 98 | + |
| 99 | +#### **6. Check for Multiple Sockets** |
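| 100 | +- If your code creates **multiple UDP sockets**, ensure the one passed to `recvfrom` is the one bound to port 13400. If two sockets share the port via `SO_REUSEADDR` or `SO_REUSEPORT`, a unicast datagram is delivered to only one of them. |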
| 101 | +- Log all socket creation and `bind()` calls. |
| 102 | + |
| 103 | +--- |
| 104 | + |
| 105 | +### **Most Likely Fixes** |
| 106 | +1. **The socket is not bound to port 13400.** |
| 107 | + - Confirm `bind()` succeeds and logs the correct port. |
| 108 | +2. **`m_udp_sock` is overwritten or closed.** |
| 109 | + - Log its value in `setupUdpSocket()` and `receiveUdpMessage()`. |
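| 110 | +3. **The socket has a receive timeout (`SO_RCVTIMEO`) or is non-blocking, so `recvfrom` gives up before a datagram arrives.** |
| 111 | + - Temporarily remove `SO_RCVTIMEO` (and any `O_NONBLOCK` flag) so `recvfrom` blocks until a packet is delivered. |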
| 112 | + |
| 113 | +--- |
| 114 | + |
| 115 | +### **Summary** |
| 116 | +- The packet **is arriving at the kernel** (confirmed by `tcpdump`). |
| 117 | +- The server **is not receiving it**, likely due to: |
| 118 | + - Socket not bound correctly. |
| 119 | + - Wrong socket file descriptor used in `recvfrom`. |
| 120 | + - Firewall/SELinux blocking delivery to userspace. |
| 121 | + |
| 122 | +**Next step:** |
| 123 | +Add the debug logs for `m_udp_sock` and `getsockname()` as shown above. If the socket is correct, test with the minimal `recvfrom` code. If it still fails, the issue is **outside your code** (firewall, SELinux, or kernel networking stack). |