@freemans13
Contributor

Add Streaming Serialization Methods for Memory-Efficient Large Subtree Processing

Summary

Adds two new methods to SubtreeData that enable memory-efficient streaming serialization and deserialization of transaction data. This allows applications to process very large subtrees (1M+ transactions) without loading all transactions into memory at once.

Motivation

When processing large subtrees with millions of transactions, the existing Serialize() and serializeFromReader() methods require all transactions to be in memory simultaneously. For production workloads with 1M-transaction subtrees, this results in multi-GB memory usage per subtree.

With multiple subtrees being processed concurrently, memory consumption becomes prohibitive, leading to OOM issues in production.

Changes

New Methods

WriteTransactionsToWriter(w io.Writer, startIdx, endIdx int) error

  • Writes a specific range of transactions directly to a writer
  • Enables streaming serialization without buffering all transactions in memory
  • Transactions are written sequentially in the specified range
  • Validates that transactions are non-nil before writing
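
The behavior listed above can be illustrated with a minimal sketch. This is not the library's implementation: the `tx` type, the `writeRange` and `writeRangeToString` functions, and the choice of a half-open range `[startIdx, endIdx)` are all assumptions made for the example.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// tx is a hypothetical stand-in for a transaction; only its serialized bytes matter here.
type tx struct {
	raw []byte
}

var errNilTx = errors.New("transaction is nil")

// writeRange writes txs[startIdx:endIdx] to w, one transaction after another.
// Assumption: the range is half-open; the real method may define it differently.
func writeRange(w io.Writer, txs []*tx, startIdx, endIdx int) error {
	if startIdx < 0 || startIdx > endIdx || endIdx > len(txs) {
		return fmt.Errorf("invalid range [%d, %d) for %d transactions", startIdx, endIdx, len(txs))
	}
	for i := startIdx; i < endIdx; i++ {
		// Reject missing transactions before anything is written for this index.
		if txs[i] == nil {
			return fmt.Errorf("index %d: %w", i, errNilTx)
		}
		if _, err := w.Write(txs[i].raw); err != nil {
			return fmt.Errorf("index %d: write failed: %w", i, err)
		}
	}
	return nil
}

// writeRangeToString is a demo helper that returns whatever writeRange produced.
func writeRangeToString(txs []*tx, startIdx, endIdx int) (string, error) {
	var buf bytes.Buffer
	err := writeRange(&buf, txs, startIdx, endIdx)
	return buf.String(), err
}

func main() {
	txs := []*tx{{raw: []byte("aa")}, {raw: []byte("bb")}, nil, {raw: []byte("dd")}}

	out, err := writeRangeToString(txs, 0, 2)
	fmt.Println(out, err)

	// This range covers the nil entry at index 2, so an error is returned.
	_, err = writeRangeToString(txs, 1, 4)
	fmt.Println(err)
}
```

Because the function only touches the requested range, entries outside it can stay unloaded (nil) without causing an error.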

ReadTransactionsFromReader(r io.Reader, startIdx, endIdx int) (int, error)

  • Reads a specific range of transactions from a reader
  • Enables chunked deserialization by reading only part of the data at a time
  • Validates that transaction hashes match the expected values from the subtree structure
  • Returns the number of transactions successfully read
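
A rough sketch of range-based reading with hash validation follows. It is illustrative only: the length-prefixed record format, the use of a single SHA-256 as the hash, and the names `readRange`, `encodeRecords`, `hashAll`, and `readFromBytes` are assumptions for this example and are not taken from the library, which works with real transaction encoding and transaction hashes.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"io"
)

// maxPayload caps a single record so a corrupt length prefix cannot force a huge allocation.
const maxPayload = 1 << 20

// readRange reads the records for indices [startIdx, endIdx) from r into dst and
// checks each payload's SHA-256 against expected[i]. It returns the number of
// records read and validated before the first error, if any.
func readRange(r io.Reader, expected [][32]byte, dst [][]byte, startIdx, endIdx int) (int, error) {
	if startIdx < 0 || startIdx > endIdx || endIdx > len(expected) || endIdx > len(dst) {
		return 0, fmt.Errorf("invalid range [%d, %d)", startIdx, endIdx)
	}
	var lenBuf [4]byte
	read := 0
	for i := startIdx; i < endIdx; i++ {
		if _, err := io.ReadFull(r, lenBuf[:]); err != nil {
			return read, fmt.Errorf("index %d: reading length: %w", i, err)
		}
		size := binary.BigEndian.Uint32(lenBuf[:])
		if size > maxPayload {
			return read, fmt.Errorf("index %d: payload of %d bytes exceeds limit", i, size)
		}
		payload := make([]byte, size)
		if _, err := io.ReadFull(r, payload); err != nil {
			return read, fmt.Errorf("index %d: reading payload: %w", i, err)
		}
		if sha256.Sum256(payload) != expected[i] {
			return read, fmt.Errorf("index %d: hash mismatch", i)
		}
		dst[i] = payload
		read++
	}
	return read, nil
}

// encodeRecords builds the hypothetical stream: 4-byte big-endian length, then payload.
func encodeRecords(payloads [][]byte) []byte {
	var buf bytes.Buffer
	var lenBuf [4]byte
	for _, p := range payloads {
		binary.BigEndian.PutUint32(lenBuf[:], uint32(len(p)))
		buf.Write(lenBuf[:])
		buf.Write(p)
	}
	return buf.Bytes()
}

// hashAll returns the SHA-256 of each payload, standing in for the hashes a subtree would hold.
func hashAll(payloads [][]byte) [][32]byte {
	hashes := make([][32]byte, len(payloads))
	for i, p := range payloads {
		hashes[i] = sha256.Sum256(p)
	}
	return hashes
}

// readFromBytes wraps readRange for in-memory data.
func readFromBytes(data []byte, expected [][32]byte, dst [][]byte, startIdx, endIdx int) (int, error) {
	return readRange(bytes.NewReader(data), expected, dst, startIdx, endIdx)
}

func main() {
	payloads := [][]byte{[]byte("tx0"), []byte("tx1"), []byte("tx2")}
	r := bytes.NewReader(encodeRecords(payloads))
	expected := hashAll(payloads)
	dst := make([][]byte, len(payloads))

	// Two consecutive calls on the same reader consume the stream chunk by chunk.
	n1, err1 := readRange(r, expected, dst, 0, 2)
	n2, err2 := readRange(r, expected, dst, 2, 3)
	fmt.Println(n1, err1, n2, err2)
}
```

Returning the count alongside the error lets a caller tell how far a chunk got before a truncated stream or a hash mismatch stopped it.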

Use Case

These methods enable a chunked processing pattern:

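// Writing (serialization); the load/process/release helpers are placeholders for caller code
for _, chunk := range chunks {
    loadTransactionsIntoChunk(chunk)
    if err := subtreeData.WriteTransactionsToWriter(writer, chunk.start, chunk.end); err != nil {
        return err
    }
    processAndReleaseChunk(chunk) // free memory before the next chunk
}

// Reading (deserialization)
for _, chunk := range chunks {
    if _, err := subtreeData.ReadTransactionsFromReader(reader, chunk.start, chunk.end); err != nil {
        return err
    }
    processChunk(chunk)
    releaseChunk(chunk) // free memory before the next chunk
}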
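
One way the chunk boundaries and the load/emit/release cycle could be driven is sketched below. The names `chunkRange`, `chunkRanges`, and `streamChunks` are hypothetical, and the `emit` callback merely stands in for a call such as WriteTransactionsToWriter; none of this is part of the library.

```go
package main

import "fmt"

// chunkRange is a half-open index range [start, end).
type chunkRange struct{ start, end int }

// chunkRanges splits [0, total) into consecutive ranges of at most size items.
func chunkRanges(total, size int) []chunkRange {
	if total <= 0 || size <= 0 {
		return nil
	}
	ranges := make([]chunkRange, 0, (total+size-1)/size)
	for start := 0; start < total; start += size {
		end := start + size
		if end > total {
			end = total
		}
		ranges = append(ranges, chunkRange{start, end})
	}
	return ranges
}

// streamChunks walks slots chunk by chunk: load fills slots[start:end], emit
// consumes them, and the slots are then cleared so the garbage collector can
// reclaim that chunk before the next one is loaded.
func streamChunks(slots [][]byte, size int, load, emit func(start, end int) error) error {
	for _, c := range chunkRanges(len(slots), size) {
		if err := load(c.start, c.end); err != nil {
			return fmt.Errorf("loading [%d, %d): %w", c.start, c.end, err)
		}
		if err := emit(c.start, c.end); err != nil {
			return fmt.Errorf("emitting [%d, %d): %w", c.start, c.end, err)
		}
		for i := c.start; i < c.end; i++ {
			slots[i] = nil // drop the reference to release the chunk
		}
	}
	return nil
}

func main() {
	slots := make([][]byte, 10)
	load := func(start, end int) error {
		for i := start; i < end; i++ {
			slots[i] = []byte(fmt.Sprintf("tx%d", i))
		}
		return nil
	}
	emit := func(start, end int) error {
		fmt.Printf("emitting [%d, %d)\n", start, end)
		return nil
	}
	if err := streamChunks(slots, 4, load, emit); err != nil {
		fmt.Println("error:", err)
	}
}
```

With this shape, at most one chunk of transactions is referenced at a time, so peak memory is governed by the chunk size rather than the subtree size.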

@freemans13 freemans13 requested a review from mrz1836 as a code owner December 12, 2025 23:14
@github-actions github-actions bot added the fork-pr and requires-manual-review labels Dec 12, 2025
@github-actions
Contributor

👋 Thanks, @freemans13!

This pull request comes from a fork. For security, our CI runs in a restricted mode.
A maintainer will triage this shortly and run any additional checks as needed.

  • 🏷️ Labeled: fork-pr, requires-manual-review
  • 👀 We'll review and follow up here if anything else is needed.

Thanks for contributing to bsv-blockchain/go-subtree! 🚀

@freemans13 freemans13 changed the title from "Stu/subtree streaming" to "Streaming Serialization Methods for Memory-Efficient Large Subtree Processing" Dec 12, 2025
Collaborator

@mrz1836 mrz1836 left a comment

Woot!

@mrz1836 mrz1836 enabled auto-merge (squash) December 12, 2025 23:17
@mrz1836 mrz1836 self-requested a review December 12, 2025 23:20
Collaborator

@mrz1836 mrz1836 left a comment

Few linter issues, see CI

@mrz1836 mrz1836 added the feature label Dec 12, 2025
@mrz1836 mrz1836 requested a review from galt-tr December 12, 2025 23:21
auto-merge was automatically disabled December 12, 2025 23:21

Head branch was pushed to by a user without write access

@mrz1836 mrz1836 self-requested a review December 12, 2025 23:33
Collaborator

@mrz1836 mrz1836 left a comment

LGTM - however since this was a fork, tests and dep audit checks were not run. They will be run upon merging this branch.

@mrz1836 mrz1836 merged commit dd70219 into bsv-blockchain:master Dec 13, 2025
37 checks passed
Labels

  • feature: Any new significant addition
  • fork-pr: PR originated from a forked repository
  • requires-manual-review: PR or issue requires manual review by a maintainer or security team

2 participants