Skip to content

fix: make streaming chunk sizes consistent across filesystem and IPFS#979

Open
odesenfans wants to merge 2 commits into
mainfrom
od/harmonize-streaming-chunk-sizes
Open

fix: make streaming chunk sizes consistent across filesystem and IPFS#979
odesenfans wants to merge 2 commits into
mainfrom
od/harmonize-streaming-chunk-sizes

Conversation

@odesenfans
Copy link
Copy Markdown
Collaborator

Explain what problem this PR is resolving

Related Clickup or Jira tickets : ALEPH-XXX

Self proofreading checklist

  • Is my code clear enough and well documented
  • Are my files well typed
  • New translations have been added or updated if new strings have been introduced in the frontend
  • Database migrations file are included
  • Are there enough tests
  • Documentation has been included (for new feature)

Changes

Explain the changes that were made. The idea is not to list exhaustively all the changes made (GitHub already provides a full diff), but to help the reviewers better understand:

  • which specific file changes go together, e.g: when creating a table in the front-end, there usually is a config file that goes with it
  • the reasoning behind some changes, e.g: deleted files because they are now redundant
  • the behaviour to expect, e.g: tooltip has purple background color because the client likes it so, changed a key in the API response to be consistent with other endpoints

How to test

Explain how to test your PR.
If a specific config is required explain it here (account, data entry, ...)

Print screen / video

Upload here print screens or videos showing the changes if relevant.

Notes

Things that the reviewers should know: known bugs that are out of the scope of the PR, other trade-offs that were made.
If the PR depends on a PR in another repo, or merges into another PR (i.o. main), it should also be mentioned here

Copy link
Copy Markdown

@foxpatch-aleph foxpatch-aleph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR successfully harmonizes streaming chunk sizes to 128 KB across both the filesystem and IPFS backends, and also cleans up the questionable P2P non-streaming fallback in get_hash_content_iterator. The core changes are correct and the simplification is welcome. However, removing the P2P fallback left three parameters (use_network, timeout, tries) in get_hash_content_iterator's signature that are now completely dead — they are accepted but never consulted. This makes the API actively misleading and needs to be addressed. There are also no tests for either the new chunk-size propagation or the changed error path.

src/aleph/storage.py (line 239): After removing the P2P fallback, the parameters use_network, timeout, and tries are no longer used anywhere in get_hash_content_iterator. They are still part of the method signature, so callers passing them will silently get no effect. These should either be removed from the signature (with callers updated) or, if backwards compatibility is needed in the short term, at least decorated with a deprecation warning. Leaving dead parameters in place is actively misleading.

src/aleph/storage.py (line 39): The choice of 128 KB is not explained. Previously the filesystem engine defaulted to 1 MB and IPFS to 16 KB. A short comment explaining the rationale (e.g. a balance between memory pressure and number of syscalls/round-trips) would help future maintainers.

src/aleph/services/ipfs/service.py (line 21): fetch_raw_cid_streamed is a module-level function whose signature changed: chunk_size is now required and appears before the optional params. This is the correct Python convention (required before optional), but any caller outside this module that was passing chunk_size by position or relying on its default will break silently. Consider adding a leading underscore (_fetch_raw_cid_streamed) to signal it is an internal helper, or at least verify there are no external callers.

src/aleph/storage.py (line 236): No tests are included for the changed behavior: (1) chunk size is now propagated correctly to both the filesystem and IPFS iterators, and (2) when content is absent from both DB and IPFS, ContentCurrentlyUnavailable is raised directly instead of attempting a P2P non-streaming fallback. Both paths are worth covering with unit tests, especially since the P2P fallback removal is a behavior change that could affect callers of the streaming endpoint.

Copy link
Copy Markdown

@foxpatch-aleph foxpatch-aleph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The chunk-size harmonization is correct and the constant STREAM_CHUNK_SIZE is a clear improvement. However, the PR silently removes the P2P fallback path in get_hash_content_iterator, which is a functional regression for content only reachable via P2P. This behavioral change is not mentioned in the PR description, and there are no tests covering the streaming path before or after the change. The source-assignment ordering also has a subtle issue worth addressing.

src/aleph/storage.py (line 269): The P2P fallback was removed without explanation. Previously, if content was not in DB or IPFS, the code fell back to get_hash_content (which supports P2P) and wrapped the bytes in an async generator. Now, content that is only reachable via P2P will immediately raise ContentCurrentlyUnavailable. The removed comment even noted 'However, for GET /storage/raw/ we really want streaming' and 'keeps it working'. If dropping P2P support here is intentional, it must be documented in the PR description and the callers (e.g. GET /storage/raw/) should be audited for the impact. If it is not intentional, the fallback needs to be restored.

src/aleph/storage.py (line 267): source = ContentSource.IPFS is set unconditionally before checking whether content_iterator is None. In practice get_ipfs_content_iterator never returns None (it returns an async generator object whose errors only surface on iteration), so the current if content_iterator is None guard below is never triggered for the IPFS branch. This is harmless today but fragile: if get_ipfs_content_iterator is ever changed to return None on error, source will already be wrong. Moving the assignment inside a guard (if content_iterator is not None: source = ContentSource.IPFS) would be cleaner and match the style of the DB branch above.

src/aleph/services/ipfs/service.py (line 21): Making chunk_size a required positional argument (removing the = 16 * 1024 default) is a breaking change for any caller outside this module. The single internal caller in this PR passes it correctly, so the immediate impact is zero, but if this function is part of a broader internal API surface, consider whether removing the default is appropriate or if the old default should be replaced by a reference to STREAM_CHUNK_SIZE from storage.py. At minimum, this change should be called out explicitly.

src/aleph/storage.py (line 36): The chunk size increased 8× (16 KB → 128 KB). For the IPFS streaming path this directly controls how much memory is buffered per in-flight request. With many concurrent streaming requests the aggregate RSS could grow significantly. This may be fine, but please confirm the expected concurrency level and add a comment explaining the rationale for 128 KB (e.g., alignment with filesystem block size, benchmarking result, etc.).

tests/ (line 1): There are no tests for get_hash_content_iterator before or after this change. Given that the P2P fallback was removed and chunk sizes were modified, at minimum the following cases should be covered: (1) content found in DB returns a streaming iterator with correct chunks, (2) content found in IPFS returns a streaming iterator, (3) content not found in either raises ContentCurrentlyUnavailable. The PR checklist item 'Are there enough tests' is unchecked.

Copy link
Copy Markdown

@foxpatch-aleph foxpatch-aleph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The chunk-size harmonization is clean and correct — moving from an inconsistent 16 KB IPFS default to an explicit 128 KB constant shared by both backends is a straightforward improvement. The parameter reordering and making chunk_size required are sensible API hygiene. However, the PR silently removes the P2P/network fallback path in get_hash_content_iterator, which is a meaningful behavioral regression that is not mentioned anywhere in the description. Any content reachable only via P2P will now unconditionally raise ContentCurrentlyUnavailable on the streaming endpoint instead of being fetched and wrapped. If this is intentional (e.g., P2P streaming is being phased out), it needs a comment and a note in the PR; if it is a side-effect of the refactor, it should be reverted or handled separately.

src/aleph/storage.py (line 271): The P2P/network fallback (get_hash_content_iterator()) has been deleted with no explanation. Previously, content that was unavailable from DB and IPFS but reachable via P2P would still be served; now the caller immediately gets ContentCurrentlyUnavailable. This is a silent behavioral regression. If dropping P2P fallback is intentional, please (a) add a comment explaining why, (b) mention it explicitly in the PR description, and (c) verify that no callers of get_hash_content_iterator relied on this path. If it is unintentional, restore the fallback block.

src/aleph/services/ipfs/service.py (line 23): Reordering chunk_size before params (and dropping its default) is fine for an internal function, but worth noting: any future caller that passed params positionally would silently break. Since both parameters are now positional-or-keyword with no defaults, consider adding a * to force keyword-only usage: async def fetch_raw_cid_streamed(aioipfs_client, *, chunk_size, params=None). This makes the intent clearer and prevents accidental positional misuse.

src/aleph/services/ipfs/service.py (line 216): Minor pre-existing issue (not introduced here, but worth noting while touching this function): get_ipfs_content_iterator is async def and returns the result of calling fetch_raw_cid_streamed(...) without await. If fetch_raw_cid_streamed is a regular async def (not an async generator), the caller receives a coroutine object rather than an AsyncIterable[bytes], which would silently fail at iteration time. Worth verifying the runtime type being returned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants