Python regex.findall not finding all matches of shorter length

Question

135 views Asked by Artie Vandelay At 21 July 2023 at 20:50

How can I find all matches, including the shorter ones that don't consume every character the * and + quantifiers could take?

import regex as re
matches = re.findall(r"^\d+", "123")  # raw string, so \d is not treated as a string escape
print(matches)
# actual output: ['123']
# desired output: ['1', '12', '123']

I need the matches to be anchored to the start of the string (hence the ^), but the + doesn't even seem to be considering shorter-length matches. I tried adding overlapped=True to the findall call, but that does not change the output.

Making the regex non-greedy (^\d+?) makes the output ['1'], overlapped=True or not. Why does it not want to keep searching further?

I could always make shorter substrings myself and check those with the regex, but that seems rather inefficient, and surely there must be a way for the regex to do this by itself.

s = "123"
matches = []
for length in range(len(s)+1):
    matches.extend(re.findall(r"^\d+", s[:length]))
print(matches)
# output: ['1', '12', '123']
# but clunky :(

Edit: the ^\d+ regex is just an example, but I need it to work for any possible regex. I should have stated this up front, my apologies.

There are 4 answers

wim On 21 July 2023 at 21:05

How about a positive lookbehind assertion:

>>> import regex as re
>>> re.findall(r'(?<=(^\d+))', '123')
['1', '12', '123']
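
If the third-party regex module is not available, the same idea (collect every end position at which the anchored pattern matches) can be approximated with the standard library. This is only a sketch: prefix_matches is a hypothetical helper name, and it assumes the pattern is meant to match from the start of the string. It relies on the pos and endpos arguments of a compiled pattern's fullmatch method. One caveat, as far as I understand endpos: the string is treated as if it ended there, so a lookahead inside the pattern cannot see past the current end.

```python
import re

def prefix_matches(pattern, text):
    """Return every prefix of `text` that `pattern` matches in full.

    Hypothetical helper: tries each possible end position in turn.
    """
    compiled = re.compile(pattern)
    found = []
    for end in range(len(text) + 1):
        # fullmatch(text, 0, end) must match exactly text[0:end]
        m = compiled.fullmatch(text, 0, end)
        if m is not None:
            found.append(m.group(0))
    return found

print(prefix_matches(r"^\d+", "123"))  # ['1', '12', '123']
```

This calls the regex engine once per end position, so it is no faster than slicing by hand, but it does not depend on the shape of the pattern. A pattern that can match the empty string (for example \d*) would also yield '' as the first result.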

Andrej Kesely On 21 July 2023 at 21:07

I'd use standard library re:

import re

matches = re.findall(r"^\d+", "123")
out = [m[:i] for m in matches for i in range(1, len(m)+1)]
print(out)

Prints:

['1', '12', '123']
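
Note that slicing works here because every non-empty prefix of a run of digits is itself a run of digits. For other patterns that need not hold. A small sketch with a made-up pattern (^\d+\.\d+ is my example, not from the question) shows the difference, and re-checks each slice with fullmatch:

```python
import re

# Hypothetical pattern for illustration: digits, a dot, then more digits.
pattern = re.compile(r"^\d+\.\d+")

longest = pattern.match("1.25").group(0)

# Naive slicing: every non-empty prefix of the longest match.
naive = [longest[:i] for i in range(1, len(longest) + 1)]

# Keep only the prefixes that the pattern really matches in full.
checked = [p for p in naive if pattern.fullmatch(p)]

print(naive)    # ['1', '1.', '1.2', '1.25']
print(checked)  # ['1.2', '1.25']
```

Even the filtered version only looks at prefixes of the one match the engine returned, so it should be treated as a check on the slicing idea rather than a general solution.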

Niveditha S On 21 July 2023 at 21:08

import re

m = re.findall(r'\d', '123')
op = ["".join(m[:i]) for i in range(1, len(m) + 1)]
print(op)

This is a bit better, as re.findall() is called only once.

The fourth bird On 22 July 2023 at 09:33 BEST ANSWER

You could use overlapped=True with the PyPI regex module together with reverse searching, (?r).

The matches then come out longest first, so reverse the list returned by re.findall:

import regex as re

res = re.findall(r"(?r)^\d+", "123", overlapped=True)
res.reverse()
print(res)

Output

['1', '12', '123']
