I am trying to become familiar with LlamaIndex and its retrieval evaluation API. I wrote a quick program that fetches a bunch of paragraphs from a database and converts them into TextNode objects (a simplified sketch of how I build the nodes is included after the call below). I then passed the nodes to this function:
from llama_index.evaluation import generate_question_context_pairs

eval_questions = generate_question_context_pairs(
    summary_nodes, llm=llm, num_questions_per_chunk=1
)
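
For context, summary_nodes is built roughly like this; the paragraph strings and node IDs below are placeholders, since in the real program they come from the database query:

from llama_index.schema import TextNode

# Placeholder data; the real program reads these paragraphs from the db.
paragraphs = [
    "First paragraph of text ...",
    "Second paragraph of text ...",
    # ... 25 paragraphs in total
]

# One TextNode per paragraph, with a simple synthetic id.
summary_nodes = [
    TextNode(text=p, id_=f"node_{i}") for i, p in enumerate(paragraphs)
]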
My assumption was that each node would be treated as a single chunk, so the call would generate exactly one question per node. However, the output contained 150 questions for 25 TextNodes: some nodes had many questions while others had only one.
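
For reference, this is roughly how I counted the questions per node (assuming I am reading the return type correctly: an EmbeddingQAFinetuneDataset whose relevant_docs dict maps each generated question id to the node id(s) it came from):

from collections import Counter

# Count how many generated questions point back to each source node.
questions_per_node = Counter(
    node_id
    for node_ids in eval_questions.relevant_docs.values()
    for node_id in node_ids
)

print(len(eval_questions.queries))  # 150 questions in total
print(questions_per_node)           # per-node counts range from 1 up to many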
I am a little confused here. Is a node not the same thing as a chunk, and if not, how can I restrict LlamaIndex to generating only one question per provided node?