
Questions on connecting adjacent chunks and on subsequent invocations of indexing #15

@Kekuaka

Hello! Thank you for sharing your work. I've read the paper and tested the code from this repo, and I like it.
Could you please clarify a few questions?

  1. The paper does not state that adjacent chunks (passages) are connected by graph edges, although the code does this in the add_adjacent_passage_edges() method. Because of this, the prepared dataset has to contain numbered chunks (passages). Is this a crucial feature of the indexing and graph construction method, or is it optional? When adjacent chunks come from different documents, should they still be connected by a graph edge?

  2. The number of graph edges doubles when the index() method is invoked again with an empty passage list. The simplest way to reproduce this is with the run.py script. I don't think this is the expected behavior.
    ...
    rag_model = LinearRAG(global_config=config)
    questions,passages = load_dataset(args.dataset_name)
    rag_model.index(passages)
    rag_model.index([])

2025-12-30 14:22:25,591 - INFO - Using retrieval method: BFS Iteration
Indexed 11 passages
Graph vertices: 11
Graph edges: 10
Indexed 0 passages
Graph vertices: 11
Graph edges: 20
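
For illustration, here is a minimal, self-contained sketch of the behavior both questions point at. It is not the repository's implementation: the `TinyIndex` class, the `add_adjacent_edges` helper, and the `(doc_id, chunk_index)` passage layout are all hypothetical. It assumes adjacency edges are undirected and stored in a set, so re-adding an existing edge is a no-op, and it assumes edges are only wanted between consecutive chunks of the same document.

```python
def add_adjacent_edges(edges, passages):
    """Link consecutive chunks of the same document (hypothetical helper).

    edges    -- set of frozenset pairs; adding an existing pair changes nothing
    passages -- list of (doc_id, chunk_index) tuples (assumed layout)
    """
    ordered = sorted(passages)
    for (doc_a, idx_a), (doc_b, idx_b) in zip(ordered, ordered[1:]):
        # Skip pairs that cross a document boundary or are not consecutive.
        if doc_a == doc_b and idx_b == idx_a + 1:
            edges.add(frozenset({(doc_a, idx_a), (doc_b, idx_b)}))
    return edges


class TinyIndex:
    """Toy stand-in for an index whose index() call is safe to repeat."""

    def __init__(self):
        self.passages = []
        self.edges = set()

    def index(self, new_passages):
        # Only register passages that have not been seen before.
        for passage in new_passages:
            if passage not in self.passages:
                self.passages.append(passage)
        add_adjacent_edges(self.edges, self.passages)
        return len(self.edges)


idx = TinyIndex()
first = idx.index([("doc_a", 0), ("doc_a", 1), ("doc_a", 2), ("doc_b", 0)])
second = idx.index([])  # repeated call with an empty passage list
print(first, second)
```

With this design, the edge count after the second call equals the count after the first, and no edge links `doc_a` to `doc_b`. Whether cross-document edges should exist in the actual graph is the open question in item 1.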
