I'm trying to make a Markov text generator but I keep getting a KeyError that I don't understand.
In this function, I keep getting a KeyError on the line w1, w2 = w2, random.choice(self.wordlog[w1]).
self.gensize is the number of words I want to generate, 
self.size is the total number of words in my text, 
self.wordlog is the dictionary - i.e. {'secondly': ['because', 'because'], 'pardon': ['cried', 'cried', 'said', 'said', 'your', 'she'], 'limited': ['to', 'warranty', 'right', 'right', 'to'], etc...}
def generate(self):
    startseed = random.randint(0, self.size - 1)
    w1, w2 = self.words[startseed], self.words[startseed+1]
    #at this point, w1 and w2 are a random word and a following word-i.e. alice ran
    wordlist = [] 
    for i in range(self.gensize):
        wordlist.append(w1)
        w1, w2 = w2, random.choice(self.wordlog[w1])
    #i.e. self.wordlog[alice] should return a list of all the values the word alice precedes
    wordlist.append(w2)
    print wordlist
When I run the function (print markov("alice.txt", 5).generate()), I just keep getting a KeyError - a different word each time (which is to be expected, as the starting seed and the random.choice will lead to this).
Anyone see what's wrong with this and how to fix this?
EDIT:
Here's the rest of the code, so you can see where self.words and everything else is coming from:
import random
import string

class markov(object):
    def __init__(self, filename, gensize):
        self.wordlog = {}
        self.filename = filename
        self.words = self.file_to_words()
        self.size = len(self.words)
        self.gensize = gensize
    def file_to_words(self):
        with open(self.filename, "r") as file_opened:
            text = file_opened.read().translate(None, string.punctuation)
            mixedcasewords = text.split()
            words = [x.lower() for x in mixedcasewords]
            return words
    def doubles(self):
        for i in range((self.size)-1):
            yield (self.words[i], self.words[i+1])       
    def catalog(self):
        for w1, w2 in self.doubles():
            self.wordlog.setdefault(w1, []).append(w2)
        print self.wordlog
				
                        
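I think that's because you're using random.choice with a dict instead of a sequence such as a list or tuple. It's difficult to say, but you should check self.wordlog just to make sure. Note that in the code you posted nothing calls catalog() before generate(), so wordlog may still be empty at that point.
[EDIT] Maybe it's just that, while trying to fulfill the given gensize, the loop reaches a key that doesn't exist.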
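A quick way to check is to list the words that occur in the text but have no entry in the dictionary. This is only a sketch: missing_successors is a hypothetical helper, not part of the posted class, and it takes the word list and the dictionary as plain arguments.

```python
def missing_successors(words, wordlog):
    """Return the words that appear in the text but are not keys of wordlog."""
    return sorted(set(words) - set(wordlog))


words = ["alice", "ran", "away"]

# Dictionary built from consecutive pairs: the last word has no successor.
wordlog = {"alice": ["ran"], "ran": ["away"]}
print(missing_successors(words, wordlog))  # ['away']

# Dictionary that was never filled: every word is missing.
print(missing_successors(words, {}))  # ['alice', 'away', 'ran']
```

If every word shows up as missing, the dictionary was never filled. If only the final word of the text is missing, the chain simply has a dead end.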
The line for i in range(self.gensize): starts a for loop with five iterations. For each iteration, you should be sure that the randomly picked key w1 is actually a key of the wordlog. To ensure this isn't a problem, you can do two things:
Approach 1
Check w1 in wordlog, or else: break. This approach may give a result shorter than the requested gensize.
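A minimal sketch of this check, written as a standalone function rather than a method so it can be run on its own. The name generate_until_dead_end and its parameters are my own assumptions, not part of the posted class.

```python
import random


def generate_until_dead_end(wordlog, start, gensize):
    """Walk the chain for at most gensize words, stopping at a dead end."""
    wordlist = []
    w1 = start
    for _ in range(gensize):
        wordlist.append(w1)
        if w1 not in wordlog:
            break  # w1 has no recorded successor, so stop early
        w1 = random.choice(wordlog[w1])
    return wordlist


wordlog = {"alice": ["ran"], "ran": ["away"]}
# "away" is not a key, so only three of the five requested words come back.
print(generate_until_dead_end(wordlog, "alice", 5))  # ['alice', 'ran', 'away']
```

The membership test runs before the lookup, so the KeyError can no longer be raised, at the cost of possibly shorter output.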
Approach 2
Make sure it works for ANY given gensize.
You can do this easily by linking the wordlog keys and values in loops, so that every word appearing in a value list is itself a key,
like in
{'a':['b','a'],'b':['b','a']}
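One way to get such a closed dictionary, as a sketch: treat the text as circular, so the last word is followed by the first. The wrap-around is my assumption about how to link keys and values in loops, and build_closed_wordlog and generate_fixed are hypothetical names. It assumes a non-empty word list.

```python
import random


def build_closed_wordlog(words):
    """Map each word to its followers, wrapping the last word to the first."""
    wordlog = {}
    for i, w1 in enumerate(words):
        w2 = words[(i + 1) % len(words)]  # index wraps around at the end
        wordlog.setdefault(w1, []).append(w2)
    return wordlog


def generate_fixed(wordlog, start, gensize):
    """Return exactly gensize words; every lookup succeeds in a closed wordlog."""
    wordlist = []
    w1 = start
    for _ in range(gensize):
        wordlist.append(w1)
        w1 = random.choice(wordlog[w1])
    return wordlist


words = "the cat saw the dog".split()
wordlog = build_closed_wordlog(words)
# "dog" is the last word of the text, yet it still has a successor ("the").
print(len(generate_fixed(wordlog, "dog", 10)))  # 10
```

Because every follower is also a key, the loop always runs for the full gensize.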