I am running into a TypeError when checking a string's presence in a list.
I checked similar issues, such as the one in this thread and this one, but I couldn't find a solution there.
I expect the code to calculate (one iteration of) the PageRank of all documents listed in "pagerank.txt" (contents shown below), but it runs into an error.
My full code:
def calc_pageranks(pagerank_file = "pagerank.txt", damping=0.9):
    pagerankScores = {}
    pagerankData = {} #dict of what files a file (the dict key) points to
    with open(pagerank_file, "r") as f:
        for row in f:
            row = row.split()
            pagerankScores.update({row[0]:1}) #set starting pagerank for documents
            if len(row) == 1:
                pagerankData.update({row[0]:None})
            else:
                pagerankData.update({row[0]:row[1:]}) #pagerank graph
    for docName in pagerankData:
        pagerank = pagerankScores[docName]
        temp = 0
        for val in pagerankData.values():
            if val == None:
                pass
            else:
                if docName in val:
                    temp += pagerank / len(val)
        pagerank = (1 - damping) + damping * temp
        pagerankData.update({docName:pagerank})
    return pagerankScores, pagerankData
docName and val look like this:
doc1.txt <class 'str'> ['doc2.txt', 'doc8.txt'] <class 'list'>
Full error message:
/Library/Frameworks/Python.framework/Versions/3.9/bin/python3 /Users/MacSuperior/Desktop/Coding/IR_system/search_engine.py
Traceback (most recent call last):
  File "/Users/MacSuperior/Desktop/Coding/IR_system/search_engine.py", line 99, in <module>
    print(calc_pageranks())
  File "/Users/MacSuperior/Desktop/Coding/IR_system/search_engine.py", line 92, in calc_pageranks
    if docName in val:
TypeError: argument of type 'float' is not iterable
pagerank.txt:
doc1.txt doc2.txt doc8.txt
doc2.txt doc1.txt doc2.txt doc9.txt
doc3.txt doc4.txt
doc4.txt doc1.txt doc10.txt
doc5.txt doc6.txt
doc6.txt
doc7.txt doc1.txt
doc8.txt doc9.txt doc10.txt
doc9.txt doc10.txt
doc10.txt doc9.txt
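The traceback says a float has ended up among the values of pagerankData. The last line of the outer loop writes the new score back with pagerankData.update({docName:pagerank}), so from the second document onward the inner loop meets a float where it expects a list of links (or None). The write was presumably meant for pagerankScores. A minimal sketch that reproduces the failure in isolation follows; the two-document graph and the helper name membership_error are hypothetical stand-ins, not part of the original code:

```python
# Hypothetical two-document graph, standing in for pagerank.txt.
links = {"doc1.txt": ["doc2.txt"], "doc2.txt": ["doc1.txt"]}

# Same mistake as in the question: a score overwrites a link list.
links.update({"doc1.txt": 0.55})


def membership_error(doc_name, graph):
    """Return the TypeError message raised by the membership test, or None."""
    try:
        for targets in graph.values():
            if targets is not None and doc_name in targets:
                pass  # a real implementation would accumulate rank here
    except TypeError as exc:
        return str(exc)
    return None


# Prints the TypeError message caused by testing "in" against a float.
print(membership_error("doc2.txt", links))
```

Changing that one update call to target pagerankScores should make the original function run, since pagerankData then only ever holds lists and None.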
Here are a couple of ideas for refactoring that help eliminate typos.
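One way to apply those ideas, as a rough sketch rather than a drop-in replacement: split parsing from scoring, store an empty list instead of None for documents without links, and build a fresh dict of scores instead of updating one in place, so there is only one dict a score can be written to. The function names read_link_graph and pagerank_step and the three-document sample file are assumptions made for illustration. Note that this sketch divides the score of the linking document by its number of outgoing links, which is the usual PageRank formulation; the code in the question uses the score of the target document instead, which gives the same result for a single iteration where every score starts at 1.

```python
import os
import tempfile


def read_link_graph(path):
    """Map each document to the list of documents it links to (possibly empty)."""
    graph = {}
    with open(path, "r") as f:
        for line in f:
            parts = line.split()
            if parts:  # skip blank lines
                graph[parts[0]] = parts[1:]  # empty list instead of None
    return graph


def pagerank_step(graph, scores, damping=0.9):
    """Return a new score dict after one iteration; the inputs are not modified."""
    new_scores = {}
    for doc in graph:
        # Sum the contributions of every document that links to `doc`.
        incoming = sum(
            scores[source] / len(targets)
            for source, targets in graph.items()
            if doc in targets  # never true for an empty list, so no division by zero
        )
        new_scores[doc] = (1 - damping) + damping * incoming
    return new_scores


# Usage with a small hypothetical file in the same format as pagerank.txt.
sample = "a.txt b.txt c.txt\nb.txt a.txt\nc.txt\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
try:
    graph = read_link_graph(tmp.name)
finally:
    os.remove(tmp.name)

scores = dict.fromkeys(graph, 1.0)  # every document starts with a score of 1
scores = pagerank_step(graph, scores)
print(scores)
```

Because pagerank_step returns a new dict, further iterations are just repeated calls with the previous result, and the link graph is never touched after it has been read.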