Can't understand this backtracking question from book

82 views Asked by At

Currently I am studying Data Structure and Algorithmic thinking with python by Narasimha Karumanchi. But I can't understand one question which is as follows:

Q. Generate all the binary strings with n-bits. Assume A[0..n-1] is an array of size n.

Sol:

def appendAtFront(x,L):
    return [x+element for element in L]
def bitStrings(n):
    if n == 0: return []
    if n == 1: return ["0", "1"]
    else:
        return (appendAtFront("0", bitStrings(n-1) + appendAtFront("1", bitStrings(n-1))))

print(bitStrings(4))

OUTPUT:

['0000', '0001', '00010', '00011', '00100', '00101', '001010', '001011', '01000', '01001', '010010', '010011', '010100', '010101', '0101010', '0101011']

Can any one explain me what this question is asking?

2

There are 2 answers

4
Ada On

The question is asking you to generate all possible binary strings of length n. In binary, each digit can be either 0 or 1, so for a binary string of length n, you can have 2^n possible combinations.

e.g.:

n = 1 --> [0, 1]
n = 2 --> [00, 01, 10, 11]
n = 3 --> [000, 001, 010, 011, 100, 101, 110, 111]
0
trincot On

You have misquoted the code from the book "Data Structure and Algorithmic thinking with python" by Narasimha Karumanchi. The output you have included clearly shows the result is wrong: the strings have differing lengths, and none of the strings have three consecutive "1" characters.

Here is a screenshot of pages 45 and 46:

enter image description here

As you can see, you have placed the parentheses differently in the return statement. Other insignificant difference are in the name of the first function, and the print statement, which uses Python 2 syntax. The code in the book is like this:

def appendAtBeginningFront(x, L):
    return [x + element for element in L]
    
def bitStrings(n):
    if n == 0: return []
    if n == 1: return ["0", "1"]
    else:
        return (appendAtBeginningFront("0", bitStrings(n-1)) + appendAtBeginningFront("1", bitStrings(n-1)))

print bitStrings(4)

It produces the correct output.

The issues in the book...

The text below the code says:

Let () be the running time of Binary(n). Assume function printf takes time O(1).

enter image description here

Using Subtraction and Conquer Master theorem we get: () = O(2). This means the algorithm for generating bit-strings is optimal.

This is problematic:

  1. The presented Python code has no Binary function, nor a printf function. This clearly refers to some other C code that is not presented.

  2. The time complexity of the presented code is not O(2), but O(²2).

The explanation for these errors is that this book was a later variant of the book "Data Structures and Algorithms Made Easy" by the same author, but that book used C code which got replaced with Python code in the book you are looking at.

The book "Data Structures and Algorithms Made Easy"

In the earlier book, we find at the same spot the following:

2.12 Backtracking: Problems & Solutions

Problem 3   Generate all the strings of bits. Assume [0..−1] is an array of size .

Solution:

void Binary(int n) {
    if (n < 1)
        printf("%s", A);      // Assume array A is a global variable
    else {
        A[n-1] = 0;
        Binary(n - 1);
        A[n-1] = 1;
        Binary(n - 1);
    }
}

Let () be the running time of Binary(n). Assume function printf takes time O(1).

enter image description here

Using Subtraction and Conquer Master theorem we get: () = O(2). This means the algorithm for generating bit-strings is optimal.

Now the text below the code makes sense, but this C code is drastically different from the Python code in the book you refer to:

  • It does not produce a collection (array, list) of strings, but prints strings.
  • It doesn't produce new strings for each combination, but mutates the same string. (Contrary to Python strings, C-strings are mutable)

By consequence, the time complexity is different for these different implementations.

So the author (or their editor) was sloppy in replacing the C code with Python code, without apparently realising the text below that code snippet is no longer true.

What is the true complexity?

Honestly, assuming O(1) complexity for the C printf function is not fair: printing characters has a time complexity of O(). Although the text in the C version is consistent, if we decide to print the results, then we have to agree there are 2 characters in the output, and so the time complexity cannot be just O(2).

On the other hand, the Python code is inefficient, as it performs unnecessary extra work by copying a string each time a character needs to be prefixed. This means for one string in the output, the code has already generated strings of length 1, of length 2, ... up to the string's final length. That represents a time complexity of 1+2+3+...+ = O(²) for one string, and thus the total time complexity becomes O(²2). This extra factor of can be avoided.

If we wanted to translate the C implementation to Python, we'd have to represent strings as lists of characters, so that we can mutate these "strings". Here is how that would look if we stay as close as possible to the C code:

n = 4
A = [0] * n

def binary(n: int):
    if n < 1:
        print(A)
    else:
        A[n-1] = 0
        binary(n - 1)
        A[n-1] = 1
        binary(n - 1)

binary(n)

This is not very elegant Python code, as we should not mix printing with logic, and should not avoid global variables. So to improve on that, and produce strings, we could write:

def binary(n: int):
    lst = ["0"] * n

    def recur(n: int):
        if n < 0:
            yield "".join(lst)
        else:
            lst[n] = "0"
            yield from recur(n - 1)
            lst[n] = "1"
            yield from recur(n - 1)

    return list(recur(n - 1))

print(binary(4))

This has a time complexity of O(2), just like the C code if we don't ignore the printing cost in that code.