Co-authored-by: deacon-mp <61169193+deacon-mp@users.noreply.github.com>
Pull Request Overview
This PR implements comprehensive HTTP retry logic with exponential backoff to resolve network resiliency issues with the sandcat agent when running in environments with Zscaler or similar network interference tools.
Key changes include:
- Addition of retry configuration with a maximum of 3 retries (4 attempts in total) and exponential backoff (2s → 4s → 8s, plus jitter)
- Smart error classification to distinguish between retryable and non-retryable errors
- Implementation of retry logic across all HTTP operations (beacon, payload downloads, file uploads, execution results)
Reviewed Changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| gocat/contact/api.go | Core implementation of HTTP retry logic with exponential backoff, error classification functions, and retry loops for all HTTP operations |
| gocat/contact/contact_test.go | Test cases for retry functionality including status code classification, error handling, and delay calculation |
requestBody := bytes.Buffer{}
contentType, err := createUploadForm(&requestBody, data, uploadName)
if err != nil {
    return err
This line was changed from `return nil` to `return err`, but the function signature indicates it should return an error. However, the original `return nil` suggests this might have been intended to return a nil error on success, not the error itself.
    delay = maxRetryDelay
}
// Add jitter to prevent thundering herd
jitter := time.Duration(rand.Intn(1000)) * time.Millisecond
Using `math/rand` without seeding can produce predictable sequences. Consider using `crypto/rand` for better randomness or seed `math/rand` with `rand.Seed(time.Now().UnixNano())` to ensure different jitter patterns across agent instances.
Addressing Copilot feedback: The math/rand usage should use crypto/rand for jitter to ensure non-predictable sequences across agent instances.


Problem
When the sandcat agent runs on devices with Zscaler ZIA or similar network interference tools, occasional network errors cause operations to freeze indefinitely without retry or failure reporting. This manifests as:
Failed to decode HTTP response: illegal base64 data at input byte 0
Failed to perform HTTP request: Post "https://xxxxx/beacon": read tcp 10.XX.XXX.XXX:58794->XXX.XXX.XX.X:443: wsarecv: A connection attempt failed...
Solution
This PR implements comprehensive HTTP retry logic with exponential backoff to handle temporary network failures gracefully.
Key Changes
1. Retry Configuration
2. Smart Error Classification
3. Comprehensive Coverage
All HTTP operations now include retry logic:
- Beacon requests (GetBeaconBytes)
- Payload downloads (GetPayloadBytes)
- File uploads (UploadFileBytes)
- Execution results (SendExecutionResults)
4. Enhanced Logging
[!] HTTP request failed (attempt 1/4): connection refused. Retrying in 2.322s
[+] HTTP request succeeded on attempt 3
Testing
The implementation was tested against a mock server that simulates network failures.
Impact
This resolves the core issue where network instability would cause sandcat operations to freeze indefinitely, improving overall agent reliability in enterprise environments with network security tools.
Warning
Firewall rules blocked me from connecting to one or more addresses (expand for details)
I tried to connect to the following addresses, but was blocked by firewall rules:
224.0.0.251
If you need me to access, download, or install something from one of these locations, you can either: