I'm using the OWASP Java HTML Sanitizer to sanitize HTML input. The problem is that the "rel" attribute values "noopener" and "noreferrer" are either duplicated or removed by the library.
I have an input like this:
<a href="https://stackoverflow.com" rel="noopener noreferrer" target="_blank">StackOverflow</a>
and I want the sanitized HTML to stay the same, because the input is already valid.
This is an example class to reproduce the problem:
import org.owasp.html.HtmlPolicyBuilder;
import org.owasp.html.PolicyFactory;

public class HtmlSanitizerError {
    public static void main(String[] args) {
        PolicyFactory policyFactory = new HtmlPolicyBuilder()
                .allowElements("a")
                .allowUrlProtocols("https")
                // With this line, the "rel" attribute is removed;
                // without it, the "rel" attribute values are duplicated.
                .skipRelsOnLinks("noopener", "noreferrer")
                .allowAttributes("href", "rel", "target").onElements("a")
                .toFactory();

        String html = "<a href=\"https://stackoverflow.com\" rel=\"noopener noreferrer\" target=\"_blank\">StackOverflow</a>";
        String safeHTML = policyFactory.sanitize(html);
        System.out.println(safeHTML);
    }
}
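To make "duplicated" concrete, here is a minimal sketch of how the sanitized output could be checked for repeated "rel" tokens. It uses only the standard library. The class `RelTokenCheck` and the method `duplicatedRelTokens` are hypothetical names made up for illustration and are not part of the OWASP library. The regex is a simplification that assumes a single double-quoted rel attribute.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the OWASP sanitizer.
public class RelTokenCheck {
    // Simplified pattern: finds the first double-quoted rel="..." attribute.
    private static final Pattern REL = Pattern.compile("\\brel=\"([^\"]*)\"");

    /** Returns the rel tokens that occur more than once, in first-seen order. */
    public static List<String> duplicatedRelTokens(String html) {
        Matcher m = REL.matcher(html);
        if (!m.find()) {
            return List.of(); // no rel attribute at all
        }
        Set<String> seen = new LinkedHashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String token : m.group(1).trim().split("\\s+")) {
            // Set.add returns false when the token was already present.
            if (!token.isEmpty() && !seen.add(token)) {
                duplicates.add(token);
            }
        }
        return new ArrayList<>(duplicates);
    }

    public static void main(String[] args) {
        // The original input from the question: each token appears once.
        String input = "<a href=\"https://stackoverflow.com\" rel=\"noopener noreferrer\" target=\"_blank\">StackOverflow</a>";
        System.out.println(duplicatedRelTokens(input)); // prints []
    }
}
```

Passing `safeHTML` from the class above to `duplicatedRelTokens` should return an empty list for the output I expect, and a non-empty list when the values are duplicated.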
I have already seen this similar question on Stack Overflow, but it didn't solve my problem: Java - Owasp Html Sanitizer shows double noopener noreferrer for url