Use raw format to GET data from IPFS gateway #24

Open
karolk91 wants to merge 2 commits into main from kk-content-mismatch-fix

@karolk91

Content can fail to load via the IPFS-gateway backend with: Content hash mismatch for — refusing tampered content (via ipfs-gateway)

The gateway sits behind Cloudflare, which rewrites text/html responses in-flight — Email Address Obfuscation, Auto Minify, Rocket Loader, HTTPS rewrites, beacon injection, and similar. Once the body is modified, the delivered bytes no longer hash to the CID, and content verification correctly rejects them.

In the observed case the trigger was Email Obfuscation stripping the <!--email_off-->…<!--/email_off--> markers around a mailto: link.
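
To make the failure mode concrete, here is a minimal sketch of hash-based verification, assuming a CIDv1 with the raw codec (0x55) and a sha2-256 multihash. The helper names (`rawCidBytes`, `matchesCid`) and the sample HTML are hypothetical and are not taken from the dot.li code; the only point is that any in-flight rewrite of the body changes the digest.

```typescript
import { createHash } from "node:crypto";

// Binary CIDv1 prefix: version 1, raw codec (0x55), sha2-256 (0x12), 32-byte digest (0x20).
const CID_V1_RAW_SHA256_PREFIX = Uint8Array.from([0x01, 0x55, 0x12, 0x20]);

/** Hypothetical helper: build the binary CIDv1 (raw, sha2-256) for a block. */
function rawCidBytes(block: Uint8Array): Uint8Array {
  const digest = createHash("sha256").update(block).digest();
  const cid = new Uint8Array(CID_V1_RAW_SHA256_PREFIX.length + digest.length);
  cid.set(CID_V1_RAW_SHA256_PREFIX, 0);
  cid.set(digest, CID_V1_RAW_SHA256_PREFIX.length);
  return cid;
}

/** Hypothetical helper: true when the delivered bytes hash to the expected CID. */
function matchesCid(expectedCid: Uint8Array, delivered: Uint8Array): boolean {
  const actual = rawCidBytes(delivered);
  return (
    actual.length === expectedCid.length &&
    actual.every((byte, i) => byte === expectedCid[i])
  );
}

const encoder = new TextEncoder();
// Made-up page content, as published to IPFS.
const published = encoder.encode(
  '<!--email_off--><a href="mailto:a@example.org">mail</a><!--/email_off-->',
);
// Simulated proxy rewrite: the comment markers are stripped from the body.
const rewritten = encoder.encode('<a href="mailto:a@example.org">mail</a>');

const expectedCid = rawCidBytes(published);
console.log(matchesCid(expectedCid, published)); // true
console.log(matchesCid(expectedCid, rewritten)); // false
```

Even a rewrite that leaves the rendered page visually identical fails the check, which is why the verifier reports tampering rather than a transport error.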

This only affects the IPFS-gateway path. The P2P/bitswap path returns the raw block untouched.

The fix

These transforms only apply to text/html, so we request the raw block as a binary type instead of a content-negotiable GET:

const url = `${gateway}/ipfs/${cid}?format=raw`;
const response = await fetch(url, { headers: { Accept: "application/vnd.ipld.raw" } });

The gateway then returns application/vnd.ipld.raw, which Cloudflare passes through verbatim.
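
As an optional complement, a client could check that the response really carries the raw media type before hashing it, so that a misbehaving intermediary is reported early. The sketch below is an assumption about how such a guard might look; `isRawBlockContentType` is a hypothetical name and is not part of this PR.

```typescript
const RAW_BLOCK_MEDIA_TYPE = "application/vnd.ipld.raw";

/**
 * Hypothetical guard: true when a Content-Type header value names the raw
 * block media type, ignoring case, surrounding whitespace, and parameters.
 */
function isRawBlockContentType(header: string | null): boolean {
  if (header === null) return false;
  // Drop any parameters such as "; charset=utf-8".
  const [mediaType = ""] = header.split(";");
  return mediaType.trim().toLowerCase() === RAW_BLOCK_MEDIA_TYPE;
}

console.log(isRawBlockContentType("application/vnd.ipld.raw")); // true
console.log(isRawBlockContentType("text/html; charset=utf-8")); // false
```

With the `fetch` call above, this would be invoked as `isRawBlockContentType(response.headers.get("content-type"))`. The hash check remains the actual integrity guarantee; the header check only improves the error message.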

Scope and caveats

  • Touches only fetchFromIpfs (single raw-codec 0x55 blocks). dag-pb already uses ?format=car, which is also binary and unaffected.
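
The split between the two codecs can be sketched as a small lookup. This is an illustration only: `gatewayRequestFor` and the `GatewayRequest` shape are hypothetical, the gateway host and CID string are placeholders, and the Accept value for CAR (`application/vnd.ipld.car`) is assumed from the IPFS trustless gateway specification rather than taken from this PR. The dot.li code may structure this differently.

```typescript
const CODEC_RAW = 0x55; // multicodec code for raw blocks
const CODEC_DAG_PB = 0x70; // multicodec code for dag-pb

interface GatewayRequest {
  url: string;
  accept: string;
}

/** Hypothetical helper: pick a binary response format based on the CID codec. */
function gatewayRequestFor(gateway: string, cid: string, codec: number): GatewayRequest {
  switch (codec) {
    case CODEC_RAW:
      // Single block: ask for the raw bytes.
      return { url: `${gateway}/ipfs/${cid}?format=raw`, accept: "application/vnd.ipld.raw" };
    case CODEC_DAG_PB:
      // Multi-block DAG: ask for a CAR archive.
      return { url: `${gateway}/ipfs/${cid}?format=car`, accept: "application/vnd.ipld.car" };
    default:
      throw new Error(`Unsupported codec 0x${codec.toString(16)}`);
  }
}

// Placeholder gateway host and CID, for illustration only.
const request = gatewayRequestFor("https://gateway.example", "bafkreiexample", CODEC_RAW);
console.log(request.url); // https://gateway.example/ipfs/bafkreiexample?format=raw
```

Both branches request a binary media type, so neither response is a `text/html` body that the proxy would rewrite.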

Example affected site
shawn-hphq24.dot — e.g. https://shawn-hphq24.paseo.li/?chainBackend=rpc-gateway

@github-actions

github-actions Bot commented Jun 15, 2026

Contributor

Bundle Size Report

Chunks over 500 KB:

| File | Raw | Brotli | Gzip |
| --- | --- | --- | --- |
| host/assets/paseo.smol-DboPaEh1.json | 1.84 MB | 941.7 KB | 1019.4 KB |
| host/assets/paseo-people-next.smol.json | 3.36 MB | 1.68 MB | 1.82 MB |
| host/assets/previewnet.smol.json | 810.7 KB | 116.0 KB | 149.3 KB |
| host/assets/smoldot_worker.js | 2.92 MB | 2.19 MB | 2.20 MB |
| Total | 10.35 MB (-244 B) | 5.38 MB (-561 B) (-48%) | 5.72 MB (-676 B) |

All files

| File | Raw | Brotli | Gzip |
| --- | --- | --- | --- |
| host/.well-known/apple-app-site-association | 738 B | 738 B | 738 B |
| host/.well-known/assetlinks.json | 986 B | 986 B | 986 B |
| host/assets/auth.js | 461.4 KB | 175.4 KB (+57 B) | 221.4 KB |
| host/assets/blake2.js | 10.2 KB | 3.2 KB (-8 B) | 3.7 KB |
| host/assets/bridge.js | 5.4 KB | 1.8 KB (+3 B) | 2.1 KB (+3 B) |
| host/assets/browser.js | 22.9 KB | 7.6 KB | 8.6 KB |
| host/assets/client.js | 93.0 KB | 27.3 KB (+10 B) | 30.1 KB |
| host/assets/container.js | 52.5 KB (+57 B) | 13.1 KB (+34 B) | 14.8 KB (+35 B) |
| host/assets/dist.js | 28.8 KB (+3.9 KB) | 10.0 KB (+2.3 KB) | 11.1 KB (+2.5 KB) |
| host/assets/dist.js | 24.8 KB | 7.7 KB | 8.6 KB |
| host/assets/dist.js | 20.6 KB (-4.3 KB) | 4.8 KB (-2.9 KB) | 5.4 KB (-3.2 KB) |
| host/assets/dotli-debug-bus.js | 495 B | 495 B | 495 B |
| host/assets/index.js | 102.9 KB | 27.8 KB (-11 B) | 32.3 KB (+13 B) |
| host/assets/index.css | 46.6 KB | 7.2 KB | 8.1 KB |
| host/assets/log.js | 972 B | 972 B | 972 B |
| host/assets/manifest.js | 22.6 KB | 7.3 KB (-37 B) | 8.0 KB (-1 B) |
| host/assets/mode.js | 1.7 KB | 600 B | 664 B |
| host/assets/nanoevents.js | 215 B | 215 B | 215 B |
| host/assets/network.js | 3.7 KB | 1.4 KB | 1.6 KB |
| host/assets/nova-scale.js | 6.7 KB | 2.6 KB | 2.9 KB |
| host/assets/panel.js | 84.9 KB | 23.0 KB (-28 B) | 26.3 KB (+1 B) |
| host/assets/paseo.smol-DboPaEh1.json | 1.84 MB | 941.7 KB | 1019.4 KB |
| host/assets/paseo-people-next.smol.json | 3.36 MB | 1.68 MB | 1.82 MB |
| host/assets/paseo.smol.json | 129.8 KB | 22.9 KB | 31.8 KB |
| host/assets/previewnet.smol.json | 810.7 KB | 116.0 KB | 149.3 KB |
| host/assets/resolve.js | 154 B | 154 B | 154 B |
| host/assets/rolldown-runtime.js | 694 B | 694 B | 694 B |
| host/assets/rpc-resolve.js | 2.5 KB | 1.1 KB (+12 B) | 1.2 KB (-1 B) |
| host/assets/shared-mode.js | 1.9 KB | 793 B (-7 B) | 895 B |
| host/assets/smoldot_worker.js | 2.92 MB | 2.19 MB | 2.20 MB |
| host/assets/spans.js | 1.5 KB | 763 B | 907 B |
| host/assets/src.js | 1.9 KB | 889 B (+1 B) | 999 B (+1 B) |
| host/assets/styles.css | 15.1 KB | 3.2 KB | 3.8 KB |
| host/assets/substrate-client.js | 7.2 KB | 2.7 KB (-2 B) | 3.0 KB (-2 B) |
| host/assets/ws.js | 25.8 KB | 8.4 KB | 9.1 KB (-1 B) |
| host/dotli.png | 11.5 KB | 11.5 KB | 11.5 KB |
| host/favicon.svg | 1.8 KB | 1.8 KB | 1.8 KB |
| host/host-sw.js | 2.9 KB | 1.1 KB (-14 B) | 1.3 KB (+5 B) |
| host/icon-192.png | 12.5 KB | 12.5 KB | 12.5 KB |
| host/icon-512.png | 42.8 KB | 42.8 KB | 42.8 KB |
| host/index.html | 20.3 KB | 4.5 KB (-8 B) | 5.5 KB |
| host/manifest.webmanifest | 441 B | 441 B | 441 B |
| host/workbox.js | 14.8 KB | 4.6 KB | 5.1 KB |
| sandbox/app-sw.js | 8.7 KB | 2.8 KB (+1 B) | 3.2 KB (+1 B) |
| sandbox/assets/bitswap-bridge.js | 840 B | 840 B | 840 B |
| sandbox/assets/fetch.js | 3.4 KB (+57 B) | 1.2 KB (+4 B) | 1.4 KB (+6 B) |
| sandbox/assets/index.js | 118.1 KB | 33.7 KB (+11 B) | 39.7 KB |
| sandbox/assets/index.css | 46.6 KB | 7.2 KB | 8.1 KB |
| sandbox/favicon.svg | 1.8 KB | 1.8 KB | 1.8 KB |
| sandbox/index.html | 1.7 KB | 581 B (+3 B) | 786 B (+1 B) |
| Total | 10.35 MB (-244 B) | 5.38 MB (-561 B) (-48%) | 5.72 MB (-676 B) |

Commit: 06401f7

@github-actions

github-actions Bot commented Jun 15, 2026

Contributor

⚡ Performance Report

⚠️ No baseline found on main. This PR's results are recorded but cannot be compared.
Merge to main to establish a baseline.

@github-actions

github-actions Bot commented Jun 15, 2026

Contributor

E2E Product suite failed on e575b4b45a5eefe9efcbd6ee8c2e21f81ca3c3e7 — 0 passed, 0 failed, 0 skipped.

Failed tests:

Logs: https://github.com/paritytech/dotli-community/actions/runs/27548193199
Artifacts: e2e-product-results (uploaded above) — open the failed test's trace.zip with npx playwright show-trace.

@karolk91 marked this pull request as ready for review June 15, 2026 13:05
@KarimJedda

Collaborator

Looks sensible, thank you @karolk91. Let's deploy it, then we can merge. Sharing with the team.

@leonardocustodio left a comment

Member

We should evaluate whether this might break other things.

We request the .car file to use the trustless gateway option, as stated here.

Removing that would "remove" this feature. That would not be a problem for applications uploaded through, e.g., bulletin-deploy, as they are already uploaded as .car, so fetching them as raw would simply return a .car for those. This would actually be very good, as it would get rid of the double-wrap.

The problem is with data that is not uploaded through bulletin-deploy, where it would indeed come as a file rather than a .car.

We have a mechanism that verifies the .car file against the CID to ensure the integrity of the fetched files. I'm not sure if this change will break it.

Were you able to fully test end to end, with both a multi-file upload and a .car file, to confirm that everything keeps working correctly?

@karolk91

Author

> We should evaluate whether this might break other things.
>
> We request the .car file to use the trustless gateway option, as stated here.
>
> Removing that would "remove" this feature. That would not be a problem for applications uploaded through, e.g., bulletin-deploy, as they are already uploaded as .car, so fetching them as raw would simply return a .car for those. This would actually be very good, as it would get rid of the double-wrap.
>
> The problem is with data that is not uploaded through bulletin-deploy, where it would indeed come as a file rather than a .car.
>
> We have a mechanism that verifies the .car file against the CID to ensure the integrity of the fetched files. I'm not sure if this change will break it.

As far as I can tell from the code, this change doesn't modify anything related to CAR handling. The two codecs are handled in separate branches:

if (cid.code === CODEC_RAW) {
  onStatus?.("Fetching content via IPFS gateway...");
  log.warn(`[dot.li fetch] Gateway: plain GET (codec raw)...`);
  // ...
}

and

if (cid.code === CODEC_DAG_PB) {
  onStatus?.("Fetching archive from IPFS gateway...");
  log.warn(`[dot.li fetch] Gateway: requesting CAR (codec dag-pb)...`);
  // ...
}

This change is also in line with the trustless gateway design you linked: https://docs.ipfs.tech/reference/http/gateway/#trusted-vs-trustless

I don't expect this to affect the "double-wrap" in any way.

> Were you able to fully test end to end, with both a multi-file upload and a .car file, to confirm that everything keeps working correctly?

I tested this change by checking that various apps under Playground->Apps load properly. I confirmed that some of them are "single file" (raw codec) and some are "multi file" (dag-pb codec). Is there a specific user story you have in mind that I could go through to confirm everything is OK?
