Skip to content

feat: add Safelist overloads to Html component#24528

Open
adraarda23 wants to merge 3 commits into
vaadin:mainfrom
adraarda23:feat/23610-html-sanitization-api
Open

feat: add Safelist overloads to Html component#24528
adraarda23 wants to merge 3 commits into
vaadin:mainfrom
adraarda23:feat/23610-html-sanitization-api

Conversation

@adraarda23

Copy link
Copy Markdown
Contributor

Fixes #23610

Untrusted HTML passed to the Html component can lead to cross-site
scripting, and developers may forget to sanitize it. Add an overload of
every HTML-accepting member that additionally takes a jsoup Safelist and
runs the content through Jsoup.clean before using it:

- Html(InputStream, Safelist)
- Html(String, Safelist)
- Html(Signal<String>, Safelist)
- setHtmlContent(String, Safelist)
- bindHtmlContent(Signal<String>, Safelist)

The signal-based overloads sanitize both the initial value and every
subsequent update. The existing methods are left unchanged.

Fixes vaadin#23610

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@adraarda23

Copy link
Copy Markdown
Contributor Author

Note: this is still a work in progress — I haven't had the chance to review it as carefully as I'd like, so there may be a few rough edges. I'll go over it again with fresh eyes shortly. Feedback is very welcome in the meantime.

@mcollovati mcollovati added the Contribution PRs coming from the community or external to the team label Jun 5, 2026
@mshabarov mshabarov requested a review from mcollovati June 8, 2026 11:42

@mcollovati mcollovati left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, but there's an incompatibility with the Javadoc.
Javadoc says

Any heading or trailing whitespace is removed while parsing but any whitespace inside the root tag is preserved.

However, it looks like that Jsoup.clean(...) seems to pretty-print the output.

For example, the following test fails

    @Test
    void stringWithSafelist_whitespacePreserved() {
        Html html = new Html("    <div><pre> text </pre> <b>bold</b>    <b>b2</b>  </div>  ", Safelist.basic().addTags("div"));
        assertEquals("<pre> text </pre> <b>bold</b>    <b>b2</b>  ", html.getInnerHtml());
    }
Image

We should probably introduce a method in HTML class, that performs clean disabling pretty printing.

Comment on lines +326 to +333
getElement().getNode()
.getFeatureIfInitialized(SignalBindingFeature.class)
.ifPresent(feature -> {
if (feature.hasBinding(SignalBindingFeature.HTML_CONTENT)) {
throw new BindingActiveException(
"setHtmlContent is not allowed while a binding for HTML content exists.");
}
});

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could be extracted in a method and reused to avoid duplication.

Jsoup.clean pretty-prints its output, which reformatted whitespace
inside the root element. Sanitize with Cleaner directly and disable
pretty-printing so inner whitespace is preserved as documented.

Also extract the duplicated HTML content binding check into a helper.
@adraarda23 adraarda23 marked this pull request as ready for review June 10, 2026 14:45
@adraarda23 adraarda23 requested a review from mcollovati June 10, 2026 14:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Contribution PRs coming from the community or external to the team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add sanitization API to the Html component

2 participants