Do iterators have to stop forever after raising StopIteration?


I have a Receiver object that will sometimes accumulate a queue of packets that will be consumed when processed. It seems reasonable to make this receiver have an iterator protocol, so next( receiver ) will manually retrieve the next packet, and for packet in receiver will iterate through the currently available packets. Intuitively, it would be fine to do such an iteration once, going through all the available packets until the receiver stops the for loop by raising StopIteration (which is the standard way for iterators to tell for loops it's time to stop), and then later use such a for loop again to go through whatever new packets have arrived in the interim.
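A minimal sketch of the design described above, assuming a `collections.deque` stands in for the C library's packet queue. The `deliver` method and the packet values are hypothetical, made up only for illustration:

```python
from collections import deque


class Receiver:
    """Stop-and-go iterator: stops when the queue is empty, resumes once it refills."""

    def __init__(self):
        # Hypothetical stand-in for the queue kept by the wrapped C library.
        self._queue = deque()

    def deliver(self, packet):
        # Hypothetical hook that simulates a packet arriving.
        self._queue.append(packet)

    def __iter__(self):
        return self

    def __next__(self):
        if not self._queue:
            raise StopIteration  # nothing available right now
        return self._queue.popleft()  # consume the oldest packet


receiver = Receiver()
receiver.deliver("a")
receiver.deliver("b")
first_batch = list(receiver)   # drains the queue until StopIteration
receiver.deliver("c")
second_batch = list(receiver)  # the same object yields again after stopping
print(first_batch, second_batch)
```

Here the second `list(receiver)` call is exactly the "use the for loop again later" case the question is about.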

However, Python docs say:

Once an iterator’s __next__() method raises StopIteration, it must continue to do so on subsequent calls. Implementations that do not obey this property are deemed broken.

Even though this code is supposedly "deemed broken" it works just fine, as far as I can tell. So I'm wondering how bad is it for me to have code that seemingly works fine, and in the way one would intuitively expect an iterator to be able to work, but is somehow "deemed broken"? Is there something that's actually broken about returning more items after you've raised StopIteration? Is there some reason I should change this?

(I recognize that I could make the receiver be a mere iterable (whose __iter__ method would produce some other iterator) rather than an iterator itself (with its own __next__ method), but (a) this wouldn't support the familiar intuitive use of next( receiver ) to pop the next packet off the queue, and (b) it seems wasteful and inefficient to repeatedly spawn new iterator objects when I already have a perfectly fine iterator-like object whose only fault is that it is apparently "deemed broken", and (c) it would be misleading to present the receiver as a sort of iterable container since the receiver consumes the packets as it retrieves them (behavior built into the C-library that I'm making a Python wrapper for, and I don't think it makes sense for me to start caching them in Python too), so if somebody tried to make multiple iterators to traverse the receiver's queue at their own pace, the iterators would steal items from each other and yield much more confusing results than anything that I can see arising from my presenting this as a single stop-and-go iterator rather than as an iterable container.)
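For comparison, the iterable alternative mentioned in the parenthetical might look roughly like the sketch below. It assumes a hypothetical `pop` method that returns `None` when no packet is waiting (so `None` can never be a real packet) and relies on the built-in two-argument form `iter(callable, sentinel)`:

```python
from collections import deque


class Receiver:
    """Iterable, not an iterator: each for loop gets a fresh one-shot iterator."""

    def __init__(self):
        self._queue = deque()  # hypothetical stand-in for the C library's queue

    def deliver(self, packet):
        self._queue.append(packet)  # hypothetical hook simulating an arrival

    def pop(self):
        # Assumption: None is never a valid packet, so it can mark "queue empty".
        return self._queue.popleft() if self._queue else None

    def __iter__(self):
        # Calls self.pop() repeatedly and stops for good once it returns None.
        return iter(self.pop, None)


receiver = Receiver()
receiver.deliver("a")
drain = iter(receiver)
got = list(drain)          # consumes "a", then the iterator stops
receiver.deliver("b")
after_stop = list(drain)   # the old iterator stays stopped
fresh = list(receiver)     # a new iterator picks up the new packet
print(got, after_stop, fresh)
```

This obeys the documented rule, but objection (c) still applies: two iterators obtained from the same receiver would consume packets from the same underlying queue.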


There are 2 answers

JustinFisher On

Another (now-deleted) answer pointed out that Python's built-in file objects produce an iterator that can be, and often is, restarted after stopping. That is some evidence that stop-and-go iterators can be perfectly functional, even though this is not how the docs say iterators "should" work.

Here's another simple example illustrating that an iterator can come back to life after raising StopIteration. But this doesn't explain why the Python docs discourage doing things this way, nor what the hidden dangers (if any) of doing so might be.

class StopAfterEachWord:
    """Iterator that raises StopIteration at each space, then resumes with the next word."""

    def __init__(self, phrase="stop and go"):
        self.phrase = phrase
        self.i = -1  # index of the most recently visited character

    def __iter__(self):
        return self

    def __next__(self):
        self.i += 1
        # Stop at the end of the phrase, and also (temporarily) at each space.
        if self.i >= len(self.phrase) or self.phrase[self.i] == ' ':
            raise StopIteration
        return self.phrase[self.i]

it = StopAfterEachWord("stop and go")
for letter in it:
    print(letter)  # s, t, o, p
print("The iterator has now stopped.")
for letter in it:
    print(letter)  # a, n, d
print("The iterator stopped again.")
for letter in it:
    print(letter)  # g, o


This example uses a single iterator called it. The latter two for loops illustrate that this iterator can continue working even after it has raised StopIteration to halt the earlier for loops.

JustinFisher On

I'll add another not-fully-satisfactory answer, in case it helps anyone who's interested in exploring this. The requirement that an iterator never change its mind after issuing StopIteration dates back to the origin of iterators in Python in 2001, in PEP 234, which said:

Once a particular iterator object has raised StopIteration, will it also raise StopIteration on all subsequent next() calls? Some say that it would be useful to require this, others say that it is useful to leave this open to individual iterators. Note that this may require an additional state bit for some iterator implementations (e.g. function-wrapping iterators).

Resolution: once StopIteration is raised, calling it.next() continues to raise StopIteration.

Note: this was in fact not implemented in Python 2.2; there are many cases where an iterator's next() method can raise StopIteration on one call but not on the next. This has been remedied in Python 2.3.

The closest this comes to explaining the prohibition is saying "some say that it would be useful" (but also some don't). It also notes that "this may require an additional state bit for some iterator implementations (e.g. function-wrapping iterators)". But this seems to be more of a consideration against the prohibition than for it, since obeying the prohibition may require iterator implementations to add an extra state bit to keep track of the fact that they're now officially retired. I guess one of the drawbacks of having a "benevolent dictator for life" is that his pronouncements often aren't all that well explained!

Also, it says that this prohibition wasn't implemented back in Python 2.2, but that this was "remedied" in Python 2.3. Apparently it has been "unremedied" in the two decades (!!) since then! Or maybe what they "remedied" was just some particular iterators that hadn't obeyed this prohibition, not the fact that Python failed to enforce it?

I suspect that the goal here was just to tell people that they can generally expect iterators to keep saying that they've stopped once they've stopped, rather than risking causing errors, going into an infinite loop, or producing nonsensical return values if you accidentally try to next() them again. But it seems like the better pronouncement would have been to prohibit these sorts of "bad behavior", and not to prohibit well-motivated designs where an iterator intentionally produces more "good" values after having temporarily stopped. Unfortunately, this suspicion probably isn't enough to assure anyone that there isn't some hidden danger to violating this prohibition.
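One place where the difference becomes observable, offered as an illustration and not as the documented reason for the rule: a generator, once finished, keeps raising StopIteration, so wrapping a stop-and-go iterator in a generator expression silently turns it into a one-shot iterator. The `StopAndGo` class below is hypothetical, and using `None` as a pause marker is just an assumption for the demo:

```python
class StopAndGo:
    """Hypothetical iterator that stops at each None marker and then resumes."""

    def __init__(self, items):
        self._items = iter(items)

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self._items)  # propagates StopIteration when items run out
        if item is None:
            raise StopIteration  # temporary stop at a pause marker
        return item


# Used directly, the iterator resumes after its first stop.
direct = StopAndGo([1, 2, None, 3, 4])
first_direct = list(direct)
second_direct = list(direct)

# Wrapped in a generator expression, the first stop ends the generator for good.
wrapped = (x * 10 for x in StopAndGo([1, 2, None, 3, 4]))
first_wrapped = list(wrapped)
second_wrapped = list(wrapped)

print(first_direct, second_direct)    # [1, 2] [3, 4]
print(first_wrapped, second_wrapped)  # [10, 20] []
```

So code that receives such an iterator through a generator-based wrapper never sees the later items, even though the underlying object would happily have produced them.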