I am trying to parse the following string with DOMDocument:
$html1 = "<script>document.write('<scr'+'ipt>alert(123);</scr'+'ipt>')</script>";
$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadHTML($html1);
$html2 = $dom->saveHTML();
but $html2 contains the string
<html><head><script>document.write('\<scr'+'ipt\>alert(123);')</script></head></html>
which is missing the </scr'+'ipt> part.
I expect to receive the same string between script tags as in the input.
If you try to load partial HTML into DOMDocument, it gets confused because it doesn't know how to parse it, and sometimes it ends up parsing it as XML. To avoid this, always make sure that the minimum of an HTML5 document is present before loading the document.
Also, anything inside the <script> tags cannot contain any </ sequences; those need to be escaped before being loaded. In your case, that means satisfying the minimum requirements and writing the inner closing tag as <\/scr'+'ipt> instead of </scr'+'ipt>.
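A minimal sketch of that idea, not a definitive fix: it assumes the default libxml2-based DOMDocument::loadHTML(), and the HTML5 skeleton and the variable names $script and $node are my own choices for illustration. In JavaScript the string '<\/scr' is the same as '</scr', so the script still behaves as before, but the parser no longer sees a closing tag in the middle of the script body.

```php
<?php
// Escape "</" inside the script body as "<\/" so the HTML parser
// does not treat it as the start of a closing tag.
// ("\\/" in a double-quoted PHP string yields the two characters "\/".)
$script = "document.write('<scr'+'ipt>alert(123);<\\/scr'+'ipt>')";

// Wrap the fragment in a minimal HTML5 document (assumed skeleton).
$html1 = '<!DOCTYPE html><html><head><meta charset="utf-8"><title></title>'
       . '<script>' . $script . '</script>'
       . '</head><body></body></html>';

$dom = new DOMDocument('1.0', 'utf-8');

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html1);
libxml_clear_errors();

// Read the script body back from the parsed document.
$node = $dom->getElementsByTagName('script')->item(0);
echo $node->textContent, "\n";

// Serialize the whole document again.
$html2 = $dom->saveHTML();
```

With the escape in place, the text content of the script node should keep the trailing <\/scr'+'ipt>') part instead of being cut off after alert(123);.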
I just hope that this is not real code, with scripts inserting scripts, and that I am not helping you write some kind of code injection hack.