How can I select the entire import statements using regular expressions?


I'm writing a static code analysis tool in Python.

Here's a sample of the code that needs to be analyzed:

import {
    component$,
    useClientEffect$,
} from '@builder.io/qwik'
 import Swiper from 'swiper'
import {
    Navigation,
    Pagination
} from 'swiper'
import { Image } from 'Base'
import Button from '../Shared/Button'
import Heading from '../Shared/Heading'
const Portfolio = component$(
    (
        {
            items,
            title,
            linkText,
            link
        }
    ) => {

One of the checks is to ensure that there is an empty line after the last import. As you can see in the example above, const Portfolio, the component declaration, directly follows the last import. I need to make sure there is an empty line before it.

I tried (?<=import).*(?=from), but it does not work. Please note that a developer might place an empty line between two imports, or put leading whitespace before an import. In other words, the code may be completely unformatted.
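To make the failure concrete, here is a minimal sketch of what that attempt does in Python's re module. The shortened SAMPLE and the helper name attempt_matches are made up for illustration. Because . does not match a newline by default, the multi-line import is skipped entirely, and switching on re.DOTALL overshoots by swallowing everything up to the last from.

```python
import re

# Shortened, hypothetical sample: one multi-line and one single-line import.
SAMPLE = (
    "import {\n"
    "    component$,\n"
    "} from '@builder.io/qwik'\n"
    "import Swiper from 'swiper'\n"
)

# The pattern tried in the question.
ATTEMPT = r"(?<=import).*(?=from)"


def attempt_matches(source, flags=0):
    """Return every piece of text the attempted pattern matches in source."""
    return re.findall(ATTEMPT, source, flags)


# Without flags, "." stops at the newline, so only the single-line import matches.
print(attempt_matches(SAMPLE))  # [' Swiper ']

# With DOTALL, the greedy ".*" runs across lines up to the last "from",
# merging both imports into one match.
print(len(attempt_matches(SAMPLE, re.DOTALL)))  # 1
```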

What regular expression can I use to ensure this requirement?

1 answer

Answer by Albina:

This is the solution I came up with so far:

"(^[\s\n]{0,}import.*from[^a-zA-Z0-9]+[ ./a-zA-Z0-9']+$)|(^[\s]{0,}import.*(\n +.*|\s)*.*from[^a-zA-Z0-9]+[ ./a-zA-Z'0-9]+$)"mg

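There are 2 groups. The first targets single-line imports, the second targets imports that span several lines:

  • (^[\s\n]{0,}import.*from[^a-zA-Z0-9]+[ ./a-zA-Z0-9']+$)
    • ^[\s\n]{0,}import - the line starts with the word import, optionally preceded by spaces or new lines (\s already includes \n, so [\s\n]{0,} is equivalent to \s*).
    • import.*from - anything may appear between the words import and from, as long as it stays on one line (. does not match a new line).
    • from[^a-zA-Z0-9]+[ ./a-zA-Z0-9']+$ - the word from is followed by one or more non-letter, non-digit symbols ([^a-zA-Z0-9]+, for example a space and the opening quote), and then by letters/digits/single quotes/dots/slashes/spaces ([ ./a-zA-Z0-9']+) up to the end of the line.
  • (^[\s]{0,}import.*(\n +.*|\s)*.*from[^a-zA-Z0-9]+[ ./a-zA-Z'0-9]+$)
    • ^[\s]{0,}import - the line starts with the word import, optionally preceded by whitespace.
    • import.*(\n +.*|\s)*.*from - between the words import and from the text may span multiple lines: (\n +.*|\s)* consumes indented continuation lines and any other whitespace. The multi-line matching comes from the explicit \n in the pattern; the m (multiline) flag is what lets ^ and $ match at the start and end of every line instead of only at the start and end of the whole input.
    • from[^a-zA-Z0-9]+[ ./a-zA-Z0-9']+$ - the same meaning as in the first group.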
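Since the tool is written in Python, the pattern above could be wired in roughly like this. This is only a sketch: the function names find_imports and has_blank_line_after_last_import are made up for illustration, the m flag becomes re.MULTILINE, and the g flag is covered by re.finditer. It assumes \n line endings, and it treats a file with no imports as passing the check.

```python
import re

# The pattern from this answer, unchanged; the "m" flag maps to re.MULTILINE.
IMPORT_RE = re.compile(
    r"(^[\s\n]{0,}import.*from[^a-zA-Z0-9]+[ ./a-zA-Z0-9']+$)"
    r"|(^[\s]{0,}import.*(\n +.*|\s)*.*from[^a-zA-Z0-9]+[ ./a-zA-Z'0-9]+$)",
    re.MULTILINE,
)


def find_imports(source):
    """Return the text of every import statement the pattern finds."""
    return [m.group(0) for m in IMPORT_RE.finditer(source)]


def has_blank_line_after_last_import(source):
    """Check that an empty line separates the last import from the code."""
    last = None
    for last in IMPORT_RE.finditer(source):
        pass  # keep only the final match
    if last is None:
        return True  # assumption: no imports means nothing to check
    rest = source[last.end():]
    if rest.strip() == "":
        return True  # the imports are the last thing in the file
    # The rest must start with a line break followed by a whitespace-only line.
    return re.match(r"[ \t]*\n[ \t]*\n", rest) is not None


BAD = (
    "import {\n"
    "    component$,\n"
    "    useClientEffect$,\n"
    "} from '@builder.io/qwik'\n"
    " import Swiper from 'swiper'\n"
    "import Button from '../Shared/Button'\n"
    "const Portfolio = component$(\n"
)
GOOD = BAD.replace("Button'\nconst", "Button'\n\nconst")

print(len(find_imports(BAD)))                 # 3
print(has_blank_line_after_last_import(BAD))  # False
print(has_blank_line_after_last_import(GOOD)) # True
```

One limitation to keep in mind: the character class after from does not include -, so a module path such as 'swiper-react' would not be matched until that class is extended.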

regex101.com