Excluding URLs/Images in markdown from string Regular Expression operation

I'm building an application where users can highlight, and scroll to, words in an article by typing them into a search bar. Articles come in Markdown format, and I'm using markdown-it to render the article body.

It works well unless the word they search for is part of an image URL. The regular expression is applied to the URL as well, and the image breaks.

    applyHighlights() {
      let str = this.article.body
      let searchText = this.articleSearchAndLocate
      const regex = new RegExp(searchText, 'gi')
      let text = str
      if (this.articleSearchAndLocate == '') {
        return text
      } else {
        const newText = text.replace(
          regex,
          `<span id="searchResult" class="rounded-sm shadow-xl py-0.25 px-1 bg-accent font-semibold text-tint">$&</span>`
        )
        return newText
      }
    }

Is there a way to skip the replacement when the match is inside an image URL?

Answer by Wiktor Stribiżew (best answer)

You can use

    applyHighlights() {
      const text = this.article.body
      const searchText = this.articleSearchAndLocate
      // Nothing to highlight when the search box is empty
      if (searchText == '') {
        return text
      }
      // Group 1 captures a whole Markdown image, ![desc](value);
      // the other alternative is the search text with regex metacharacters escaped
      const regex = new RegExp('(!\\[[^\\][]*]\\([^()]*\\))|' + searchText.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'), 'gi')
      // If Group 1 (y) matched, return the image untouched; otherwise wrap the match (x)
      return text.replace(
        regex, function(x, y) { return y ? y :
          '<span id="searchResult" class="rounded-sm shadow-xl py-0.25 px-1 bg-accent font-semibold text-tint">' + x + '</span>'; })
    }

Here,

  • new RegExp('(!\\[[^\\][]*]\\([^()]*\\))|' + searchText.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'), 'gi') - creates a regex like (!\[[^\][]*]\([^()]*\))|hello (if the searchText is hello). It either matches a string like ![desc](value) and captures it into Group 1, or it matches hello.
  • .replace(regex, function(x, y) { return y ? y : '<span id="searchResult" class="rounded-sm shadow-xl py-0.25 px-1 bg-accent font-semibold text-tint">' + x + '</span>'; }) means that if Group 1 (y) was matched, the return value is y itself, as is (no replacement is performed); otherwise, x (the whole match, i.e. the searchText) is wrapped with a span tag.
  • .replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&') is necessary to support a searchText that contains special regex metacharacters; see "Is there a RegExp.escape function in JavaScript?"
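
To try the idea outside the component, here is a minimal standalone sketch of the same technique. The function name highlightOutsideImages, its parameters, and the sample string are hypothetical, and a plain <mark> tag stands in for the styled span from the question.

```javascript
// Hypothetical helper, not part of the original component.
function highlightOutsideImages(markdown, searchText) {
  // An empty search would otherwise match at every position.
  if (searchText === '') return markdown;

  // Escape regex metacharacters so the search text is matched literally.
  const escaped = searchText.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

  // Alternative 1 (Group 1): a whole Markdown image, ![alt](url).
  // Alternative 2: the escaped search text.
  const regex = new RegExp('(!\\[[^\\][]*]\\([^()]*\\))|' + escaped, 'gi');

  // When Group 1 matched, return the image unchanged; otherwise wrap the match.
  return markdown.replace(regex, (match, image) =>
    image ? image : '<mark>' + match + '</mark>'
  );
}

// Assumed sample input, for illustration only.
const sample = 'A cat photo: ![cat](https://example.com/cat.png) and another cat.';
console.log(highlightOutsideImages(sample, 'cat'));
// A <mark>cat</mark> photo: ![cat](https://example.com/cat.png) and another <mark>cat</mark>.
```

Because the image alternative comes first and consumes the whole ![alt](url) token, any occurrence of the search text inside it is never reached by the second alternative. Note that this pattern only protects images; ordinary links such as [text](url) would still be highlighted inside their URLs.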