feat: add Safelist overloads to Html component #24528
Untrusted HTML passed to the Html component can lead to cross-site scripting, and developers may forget to sanitize it. Add an overload of every HTML-accepting member that additionally takes a jsoup Safelist and runs the content through Jsoup.clean before using it:

- Html(InputStream, Safelist)
- Html(String, Safelist)
- Html(Signal<String>, Safelist)
- setHtmlContent(String, Safelist)
- bindHtmlContent(Signal<String>, Safelist)

The signal-based overloads sanitize both the initial value and every subsequent update. The existing methods are left unchanged.

Fixes vaadin#23610

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Note: this is still a work in progress. I haven't had the chance to review it as carefully as I'd like, so there may be a few rough edges. I'll go over it again with fresh eyes shortly. Feedback is very welcome in the meantime.
mcollovati
left a comment
Overall looks good, but there's an incompatibility with the Javadoc.
Javadoc says
Any heading or trailing whitespace is removed while parsing but any whitespace inside the root tag is preserved.
However, Jsoup.clean(...) appears to pretty-print its output, which changes the whitespace inside the root tag.
For example, the following test fails:
@Test
void stringWithSafelist_whitespacePreserved() {
Html html = new Html(" <div><pre> text </pre> <b>bold</b> <b>b2</b> </div> ", Safelist.basic().addTags("div"));
assertEquals("<pre> text </pre> <b>bold</b> <b>b2</b> ", html.getInnerHtml());
}
We should probably introduce a method in the Html class that performs the clean with pretty-printing disabled.
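A minimal sketch of what such a method might look like, assuming jsoup 1.14.2 or later is on the classpath (the version that introduced Safelist). The class and method names here are hypothetical and not taken from the PR; only the jsoup calls (`Jsoup.parseBodyFragment`, `Cleaner`, `Document.OutputSettings.prettyPrint`) are real API.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Cleaner;
import org.jsoup.safety.Safelist;

// Hypothetical helper class; the name is illustrative only.
final class WhitespacePreservingSanitizer {

    // Sanitizes untrusted markup without letting jsoup reformat whitespace.
    static String sanitize(String html, Safelist safelist) {
        // Parse the untrusted markup as a body fragment.
        Document dirty = Jsoup.parseBodyFragment(html);
        // Copy over only the elements and attributes the safelist allows.
        Document clean = new Cleaner(safelist).clean(dirty);
        // Serialize without pretty-printing so inner whitespace is kept.
        clean.outputSettings().prettyPrint(false);
        return clean.body().html();
    }

    public static void main(String[] args) {
        Safelist safelist = Safelist.basic().addTags("div");
        String input = "<div><pre> text </pre> <b>bold</b> <b>b2</b>"
                + "<script>alert(1)</script></div>";
        System.out.println(sanitize(input, safelist));
    }
}
```

jsoup also offers a `Jsoup.clean` overload that accepts a `Document.OutputSettings` argument, which may be a shorter route to the same result.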
getElement().getNode()
        .getFeatureIfInitialized(SignalBindingFeature.class)
        .ifPresent(feature -> {
            if (feature.hasBinding(SignalBindingFeature.HTML_CONTENT)) {
                throw new BindingActiveException(
                        "setHtmlContent is not allowed while a binding for HTML content exists.");
            }
        });
nit: this check could be extracted into a method and reused to avoid duplication.
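The shape of that extraction could look roughly like the sketch below. Every type in it is a simplified, hypothetical stand-in for the Flow classes of the same name, and the helper name `ensureNoHtmlContentBinding` is an assumption; it only illustrates how both setter overloads can share one guard.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;
import java.util.function.UnaryOperator;

// Simplified stand-ins for the Flow classes; not the real implementation.
final class HtmlBindingCheckSketch {

    static final class BindingActiveException extends IllegalStateException {
        BindingActiveException(String message) {
            super(message);
        }
    }

    static final class SignalBindingFeature {
        static final String HTML_CONTENT = "htmlContent";
        private final Set<String> bindings = new HashSet<>();

        void bind(String key) {
            bindings.add(key);
        }

        boolean hasBinding(String key) {
            return bindings.contains(key);
        }
    }

    private SignalBindingFeature feature; // stays null until a binding is created
    private String content = "";

    private Optional<SignalBindingFeature> getFeatureIfInitialized() {
        return Optional.ofNullable(feature);
    }

    // Extracted guard shared by every setter overload (hypothetical name).
    private void ensureNoHtmlContentBinding() {
        getFeatureIfInitialized().ifPresent(f -> {
            if (f.hasBinding(SignalBindingFeature.HTML_CONTENT)) {
                throw new BindingActiveException(
                        "setHtmlContent is not allowed while a binding for HTML content exists.");
            }
        });
    }

    void setHtmlContent(String html) {
        ensureNoHtmlContentBinding();
        content = html;
    }

    // Stand-in for the Safelist overload: the sanitizer is passed as a function.
    void setHtmlContent(String html, UnaryOperator<String> sanitizer) {
        ensureNoHtmlContentBinding();
        content = sanitizer.apply(html);
    }

    // Simulates creating a signal binding for the HTML content.
    void bindHtmlContent() {
        if (feature == null) {
            feature = new SignalBindingFeature();
        }
        feature.bind(SignalBindingFeature.HTML_CONTENT);
    }

    String getContent() {
        return content;
    }
}
```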
Jsoup.clean pretty-prints its output, which reformatted whitespace inside the root element. Sanitize with Cleaner directly and disable pretty-printing so inner whitespace is preserved as documented. Also extract the duplicated HTML content binding check into a helper.
Fixes #23610