<speak><voice name=\"en-US-JennyNeural\"><prosody rate=\"1\">aaaaaaaa<break time=\"5s\"/> bbbb. <br time=\"2s\"/>ccccccdddddddd </prosody></voice></speak>
I use this code to parse and get:
doc, err := goquery.NewDocumentFromReader(strings.NewReader(text))
if err != nil {
	return "", err
}
ssml, err := doc.Find("html body").Html()
if err != nil {
	return "", err
}
Result:
<speak><voice name="en-US-JennyNeural"><prosody rate="1">aaaaaaaa<break time="5s"> bbbb. <br time="2s"/>ccccccdddddddd </break></prosody></voice></speak>
I think the <break> tag isn't parsed correctly. I want <break/> to be parsed like <br/>.
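Assuming you're using github.com/PuerkitoBio/goquery, it uses golang.org/x/net/html under the hood for HTML parsing, which is an HTML5-compliant tokenizer and parser. <br> and <break> are parsed differently because in HTML <br> is a void element, a tag that must not have a closing tag, but <break> is not such a tag. For a non-void element, an HTML5 parser ignores the trailing slash in <break .../> and treats it as an ordinary opening tag, so the content that follows ends up nested inside it.

If you want goquery to handle your HTML properly, you must use an explicit closing tag for <break> instead of the self-closing tag.

E.g. instead of this:
<break time="5s"/>
You must use this:
<break time="5s"></break>
With this change your output will be (try it on the Go Playground):
<speak><voice name="en-US-JennyNeural"><prosody rate="1">aaaaaaaa<break time="5s"></break> bbbb. <br time="2s"/>ccccccdddddddd </prosody></voice></speak>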
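If you can't change the SSML at its source, one option is to rewrite self-closing <break .../> tags into explicit open/close pairs before handing the string to goquery. The following is only a minimal sketch using the standard library: the helper name expandSelfClosingBreaks and the regular expression are illustrative assumptions, not part of goquery, and the pattern does not handle attribute values that contain "<" or ">".

```go
package main

import (
	"fmt"
	"regexp"
)

// selfClosingBreak matches a self-closing <break .../> tag and captures its
// attributes (if any). Hypothetical pattern for illustration only; it assumes
// attribute values never contain '<' or '>'.
var selfClosingBreak = regexp.MustCompile(`<break(\s[^<>]*?)?\s*/>`)

// expandSelfClosingBreaks rewrites <break .../> as <break ...></break> so that
// an HTML5 parser does not treat the tag as an unclosed opening tag.
func expandSelfClosingBreaks(ssml string) string {
	return selfClosingBreak.ReplaceAllString(ssml, "<break${1}></break>")
}

func main() {
	text := `<speak><voice name="en-US-JennyNeural"><prosody rate="1">aaaaaaaa<break time="5s"/> bbbb. <br time="2s"/>ccccccdddddddd </prosody></voice></speak>`
	// The <br .../> tag is left untouched; only <break .../> is expanded.
	fmt.Println(expandSelfClosingBreaks(text))
}
```

The returned string can then be passed to goquery.NewDocumentFromReader(strings.NewReader(...)) exactly as in the question. If the input is always well-formed XML, parsing it with an XML parser instead of an HTML parser may be a more robust choice than a regular expression.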