Firmware JSON document memory management#117
Open
brian-r-calder wants to merge 17 commits into
Open
Conversation
We ran into issues with allocation failures when attempting to set up SSL connections. After a lot of searching and trial-and-error, we realized that as the number of log files on the SD card grew, the size of the JSON document returned by GenerateFilelist() grew to consume a significant chunk of heap memory. This document was resident while the upload loop ran, which is the exact time we needed the memory most. Once this was realized, we almost immediately notices that we didn't need this JSON at all, as the only thing we were pulling from it was the file numbers, something we can easily get from a CountLogFiles() call.
While dealing with heap shortages, I realized that we were always holding onto 4 KiB array, even if only the first few entries were actually populated. Four kibibytes isn't a massive amount, but when running near capacity, it might be enough to push us over the edge into SSL allocation failures. To do this shrink-to-fit operation consistently, I replaced one overload of logging::Manager::CountLogFiles() with a new GetLogFileNumbers() method that returns a vector, which provides much nicer ergonomics to the caller.
Previously any missing field would cause the whole row to be dropped. Given that MD5 hashes aren't guaranteed to be present, that felt like overkill.
My goal is to pull the file list out of the status JSON document, which will require calculating the total file count and size separately from the full file list.
I'm running into situations where a large JSON document is being successfully generated, but then allocation fails when trying to copy it into ESP32WiFiAdapter::m_messages. Move semantics to the rescue!
Keeping the document itself on the heap only introduced complexity which we didn't really need. Also store the default buffer size in a named constant.
This document's capacity increased every time a large message was sent, but that memory was never recovered. So if we succeeded in sending a giant file list, ESP32WiFiAdapter would hold onto that memory forever.
…ument This is a subtle dance that was repeated a few times. All of those copies were missing a std::move() call that significantly reduces the peak memory usage of the operation. To me that justifies abstracting it out.
Given that our large JSON documents can grow to fill temporarily fill all available memory, a more conservate growth factor seems appropriate. 1. Less aggressive growth means we are more likely to "just" fit in memory. 2. A growth factor less than the golden ratio allows the possibility of reusing memory blocks the document previously occupied. A growth factor of 2 prevents this.
There was a lot of unnecessary repetition between commands. Other improvements: * Serializing directly to Serial instead of allocating a large temporary string, like some jobs were doing. * Using Serial.println() to print the trailing new line instead of appending a single character to a large string, potentially resulting in another large allocation and copy. * Avoid some unnecessary JsonDocument->String->JsonDocument round-trips.
Just include the total file count and total file size. The file list JSON just isn't guaranteed to fit in memory, so it doesn't make sense to try to cram it inside the status JSON. A failed allocation means no status is sent, which is much more problematic than an incomplete file list. Moving "catalog" out into its own command means the WIBL can serve the status, release memory, then serve the file list. This approach also lets us handle a partial file list by appending a row of ellipses to the file table.
This merges a PR to address memory usage requirements in the logger firmware which can cause it to hold on to too much memory for JSON documents (and messages being sent back to the WiFi client), which has the knock-on effect of stopping the system from setting up SSL blocks for upload. This address issue #107
…ure on check-in. Bumped version to 1.6.2.
selimnairb
requested changes
May 26, 2026
Member
selimnairb
left a comment
There was a problem hiding this comment.
Some minor changes in the firmware code. I have some concerns about the changes to the UploadServer README, as detailed in my comments.
…st) to v.4.8.1 so that we can run without having to use a license.
selimnairb
requested changes
May 28, 2026
Member
selimnairb
left a comment
There was a problem hiding this comment.
One minor change leftover from the last round, else I think this is ready.
| uint32_t CountLogFiles(uint32_t filenumbers[MaxLogFiles]); | ||
| /// \brief Count the number of log files on the system | ||
| uint32_t CountLogFiles(void); | ||
| uint32_t CountLogFiles(uint64_t * fileSize = nullptr); |
Member
There was a problem hiding this comment.
Should be *fileSize instead of * fileSize.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR includes the code contributed in PR #108 from Spatialnetics, which will close issue #107 when merged.
The problem here is that the JSON documents being constructed for status information can become large if there are large numbers of files on the logger for any reason, and the firmware can either run out of memory to hold the whole document, or might eat up enough of the limited heap space that you can't get a block of memory to generate the SSL data for a secure upload connection. It appears that this was also the thing that was slowing down the web server implementation (or, at least, part of it, since it's much snappier afterwards).
The modifications here beyond PR #108 are to bump the firmware version number, and to fix up the parsing of the new status message at the upload server end of the connection so that check-ins still work as expected.