Python lark dedent to non-0 column

Guys, this should be an easy one, but I can't figure it out.

I have some text that I need to parse, and it dedents to a non-zero column, like this:

text = """
firstline
    indentline
   partialdedent
"""

When I try to run a grammar like this:

grammar = r"""
start          : [_NL] textblock

textblock: first _INDENT ind_line _DEDENT _INDENT ind_line _DEDENT*
first   : STRING _NL
ind_line: STRING _NL

STRING         : LETTER+

%import common.LETTER
%declare _INDENT _DEDENT
%ignore " "

_NL: /(\r?\n[\t ]*)+/
"""

I get the following error: DedentError: Unexpected dedent to column 2. Expected dedent to 0

Digging into it more, it seems the parser remembers each indentation level as you indent step by step, so you later have to dedent once per level. However, every dedent has to land exactly on one of the remembered levels, so dedenting to a column in between does not seem to be possible. It also isn't possible to dedent to 0 and then indent to some other value within the same line break.
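The behaviour described above can be reproduced without lark. The following is a simplified model of a stack-based indenter, written only to illustrate why a partial dedent fails; it is not lark's actual code, and the names `indent_tokens` and `levels` are made up for this sketch.

```python
class DedentError(Exception):
    """Stand-in for lark's DedentError, so the sketch has no dependencies."""


def indent_tokens(columns):
    """Turn the indentation column of each line into INDENT/DEDENT events.

    Assumed model: every indent pushes a level onto a stack, and every
    dedent must land exactly on a level that is still on that stack.
    """
    levels = [0]  # stack of currently open indentation levels
    events = []
    for col in columns:
        if col > levels[-1]:
            # Deeper than the current level: open a new one.
            levels.append(col)
            events.append("INDENT")
        else:
            # Close levels until we are no longer deeper than `col`.
            while col < levels[-1]:
                levels.pop()
                events.append("DEDENT")
            # Landing between two remembered levels is an error.
            if col != levels[-1]:
                raise DedentError(
                    f"Unexpected dedent to column {col}. "
                    f"Expected dedent to {levels[-1]}"
                )
    return events


# Indenting twice and dedenting back level by level is fine.
print(indent_tokens([0, 4, 8, 4, 0]))

# Columns 0, 4, 3: the stack only holds 0 and 4, so column 3 is rejected.
try:
    indent_tokens([0, 4, 3])
except DedentError as exc:
    print(exc)
```

In this model the column 3 line pops level 4, finds that the next remembered level is 0, and gives up, which matches the shape of the error message above.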

Is there something I'm missing? How would this best be solved?

The use case is parsing log output where the lines are numbered and the numbers are padded with indentation. When the number reaches two digits, the extra digit is added on the left, so the line dedents, but not to column 0.

There is 1 answer

Answered by thevoiddancer

To potentially answer my own question: I decided to give up on lark's indentation feature and handle indentation manually. My modified grammar is:

grammar = r"""
start          : [_NL] textblock

textblock: first ind_line ind_line
first   : STRING _NL
ind_line: " "* STRING _NL

STRING         : LETTER+

%import common.LETTER

_NL: /(\r?\n[\t ]*)+/
"""

It's not pretty and it's not elegant, but at least it parses correctly, so I have that going for me, which is nice.
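The same "do it manually" idea can also be applied before the text ever reaches the parser. Below is a rough sketch, assuming it is enough to know each line's starting column rather than a nested block structure. `split_indent` is a hypothetical helper written for this example, not part of lark, and it counts a tab as a single column.

```python
def split_indent(text):
    """Return (column, content) pairs for every non-blank line of `text`."""
    rows = []
    for line in text.splitlines():
        content = line.lstrip(" \t")
        if not content:
            continue  # skip blank lines
        # The column is however many leading characters were stripped.
        rows.append((len(line) - len(content), content.rstrip()))
    return rows


text = """
firstline
    indentline
   partialdedent
"""

rows = split_indent(text)
for column, content in rows:
    print(column, content)
```

Each content string can then be handed to a grammar that knows nothing about indentation, while the column is kept alongside it as plain data, so a dedent to a non-zero column is no longer a special case.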