C Lexical Analyzer Bug: Pointer Value Reset


Description of the Problem

I am working on implementing a lexical analyzer in C, and I have encountered an issue with a pointer variable (pos) that seems to be behaving unexpectedly. The problem occurs within a function called reconhece_id, which is responsible for recognizing and extracting identifiers from a character buffer.

Here's a simplified version of the relevant code (the function and some variable names are in Portuguese, but I believe the code is understandable from reading it):

TInfoAtomo reconhece_id(char* buffer, int *pos)
{
    int init_id = *pos;
    TInfoAtomo infoAtomo;
    infoAtomo.atomo = ERRO;

    if(islower(buffer[*pos]))
    {
        *pos++;
        goto q1;
    }
    return infoAtomo;

q1:
    if(islower(buffer[*pos])||isdigit(buffer[*pos]))
    {
        *pos++;
        goto q1;
    }
        
    if(isupper(buffer[*pos]))
        return infoAtomo;

    strncpy(infoAtomo.atributo_ID, buffer + init_id, (*pos) - init_id);
    infoAtomo.atributo_ID[(*pos) - init_id] = '\x0';
    infoAtomo.atomo  = IDENTIFICADOR;
    return infoAtomo;
}

The problem I'm facing is that the first if block appears to increment the value read through pos successfully, but when the second if block increments it again, the value seems to reset to its original value (zero). This breaks my lexical analyzer.
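One thing that may be relevant here is C operator precedence: postfix ++ binds tighter than unary *, so *pos++ is parsed as *(pos++). That advances the pointer (only the function's local copy of it) and leaves the int it pointed to unchanged. The sketch below contrasts the two forms; the helper names bump_as_written and bump_value are hypothetical and not part of the lexer.

```c
/* Same expression shape as in the question: parsed as *(pos++).
   Returns the value that was read; the caller's int is not modified. */
int bump_as_written(int *pos)
{
    int seen = *pos++;  /* read *pos, then advance the local pointer copy */
    (void)pos;          /* the advanced pointer is simply dropped here */
    return seen;
}

/* Parenthesized form: ++ applies to the int that pos points to. */
void bump_value(int *pos)
{
    (*pos)++;
}
```

With int v = 0, calling bump_as_written(&v) leaves v at 0, while bump_value(&v) changes it to 1.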

Debugging With GDB

I used GDB to step through the reconhece_id function and watch the value read through pos. Here is the session:

reconhece_id (buffer=0x406490 "abc 10.0  f 20.0\n30.0  \n40.0", 
    pos=0x7fffffffdb78) at src/TInfoAtomo.c:98
98      int init_id = *pos;
(gdb) p pos
$1 = (int *) 0x7fffffffdb78
(gdb) p *pos
$2 = 0
(gdb) n
100     infoAtomo.atomo = ERRO;
(gdb) p init_id
$3 = 0
(gdb) n
102     if(islower(buffer[*pos]))
(gdb) 
104         *pos++;
(gdb) 
105         goto q1;
(gdb) p *pos
$4 = 1
(gdb) n
110     if(islower(buffer[*pos])||isdigit(buffer[*pos]))
(gdb) p *pos
$5 = 1
(gdb) p buffer[*pos]
$6 = 98 'b'
(gdb) n
112         *pos++;
(gdb) 
113         goto q1;
(gdb) p *pos
$7 = 0

Here's a breakdown of the GDB output:

  1. The debugging session starts with *pos equal to zero.

  2. At line 104, *pos++ runs, and *pos then prints as 1, as expected.

  3. The program reaches line 110, where buffer[*pos] is checked, and it evaluates to the character 'b' in the buffer, as expected.

  4. After another *pos++ at line 112, *pos unexpectedly prints as zero.

This unexpected behavior is what breaks the function. I need help understanding why the value read through pos goes back to zero and how I can make sure it keeps its value throughout the function.
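If *pos++ is being parsed as *(pos++), then GDB's p pos would show a different address after each step, and p *pos would be reading whatever int happens to sit at that address, which could account for seeing 1 and then 0. Below is a minimal sketch of the recognizer using (*pos)++ instead. The TAtomo enum, the TInfoAtomo layout, and the 16-byte atributo_ID capacity are assumptions, since the question does not show them. The while loop stands in for goto q1, and the unsigned char casts and the length check are additions for this sketch.

```c
#include <ctype.h>
#include <string.h>

/* Assumed definitions: the question does not show them. */
typedef enum { ERRO, IDENTIFICADOR } TAtomo;

typedef struct {
    TAtomo atomo;
    char atributo_ID[16];   /* assumed capacity */
} TInfoAtomo;

TInfoAtomo reconhece_id(const char *buffer, int *pos)
{
    int init_id = *pos;
    TInfoAtomo infoAtomo;
    infoAtomo.atomo = ERRO;
    infoAtomo.atributo_ID[0] = '\0';

    /* An identifier must start with a lowercase letter. */
    if (!islower((unsigned char)buffer[*pos]))
        return infoAtomo;
    (*pos)++;   /* increments the caller's int, not the pointer */

    /* Consume the following lowercase letters and digits. */
    while (islower((unsigned char)buffer[*pos]) ||
           isdigit((unsigned char)buffer[*pos]))
        (*pos)++;

    /* As in the original, an uppercase letter right after is an error. */
    if (isupper((unsigned char)buffer[*pos]))
        return infoAtomo;

    size_t len = (size_t)(*pos - init_id);
    if (len >= sizeof infoAtomo.atributo_ID)
        return infoAtomo;   /* too long for the assumed buffer */

    memcpy(infoAtomo.atributo_ID, buffer + init_id, len);
    infoAtomo.atributo_ID[len] = '\0';
    infoAtomo.atomo = IDENTIFICADOR;
    return infoAtomo;
}
```

Called with the buffer from the GDB session and an int initialized to 0, this sketch should return IDENTIFICADOR with atributo_ID set to "abc" and leave the int at 3.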

Any insights or guidance on resolving this issue would be greatly appreciated.
