How to convert HTML to plain text correctly in JavaScript?


I need to convert snippets of text that contain HTML tags into plain text using JavaScript / Node.js.

I currently use the string.js library for that, but the problem is that when it removes the tags (using the stripTags() function), the line breaks implied by those tags are lost as well.

E.g.

   <div>Some text</div><div>another text</div>

becomes

   Some textanother text

Do you know how I could solve this? Maybe with another library?

Thanks!


There are 2 answers

user3385530

Hi, this is a very simple solution to your problem: use a regular expression, which lets you control exactly which tags are removed.

In this case we remove all tags except br tags. If you want, you can then replace the br tags with something else, such as \n or \t.

I hope this helps.

Cheers!

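var html = "<div>Some text</div><div>another text</div><br />test<div>10</div>";
// Remove every tag except <br>. The negative lookahead (?!br\b) skips br tags;
// a character class such as [^>!br] would instead skip any tag containing "b" or "r".
var removeHtmlTags = html.replace(/<(?!br\b)[^>]*>/ig, "");
console.log(removeHtmlTags); // Some textanother text<br />test10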
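
To get the line breaks the question asks for, the same regex idea can be taken one step further: turn br tags and block-level tags into \n before stripping everything else. This is only a rough sketch. htmlToText is a made-up helper name, the list of block tags and entities is an assumption and deliberately incomplete, and a regex is not a real HTML parser, so it will misbehave on things like script blocks, comments, or attributes containing ">".

```javascript
// Sketch only: regex-based conversion for simple snippets, not a full HTML parser.
// htmlToText is a hypothetical helper name, not part of any library.
function htmlToText(html) {
  return html
    // Turn <br> into a line break.
    .replace(/<br\s*\/?>/gi, "\n")
    // Turn opening and closing block-level tags into line breaks
    // (assumed, incomplete list of block tags).
    .replace(/<\/?(div|p|li|h[1-6]|tr)\b[^>]*>/gi, "\n")
    // Drop every remaining tag.
    .replace(/<[^>]+>/g, "")
    // Decode a few common entities; &amp; goes last to avoid double decoding.
    .replace(/&nbsp;/g, " ")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&")
    // Collapse runs of newlines and trim the ends.
    .replace(/\n{2,}/g, "\n")
    .trim();
}

console.log(htmlToText("<div>Some text</div><div>another text</div>"));
```

With the input from the question, this prints "Some text" and "another text" on separate lines instead of "Some textanother text".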
Alex Hill

Try using Cheerio. It exposes a jQuery-like interface on the server side. Load the markup first:

var cheerio = require("cheerio");
var $ = cheerio.load(htmlstring);

Then traverse the DOM for whatever elements you want and call $(element).text() on each one.