
[feature] Improve performance of Child and Sibling Node Retrieval #162

@adamretter

Persistent DocumentImpl

The org.exist.dom.persistent.DocumentImpl class has a method getChildNodes(), and it seems that the performance of this method could be improved.

  1. Instead of making multiple calls into the database storage by calling broker.objectWith for each child, we should take an Iterator over the storage (e.g. newXMLStreamReader or getNodeIterator) and read all children in a single pass.
  2. We could also likely avoid creating lots of new nodes on each call to getChildNodes(), i.e. we could have a cache of some sort, or keep the results until they need to be invalidated (when the document changes).
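
The two points above can be sketched together as follows. This is only a rough illustration and not eXist-db code: ChildNodeCache, storageScan and invalidate() are hypothetical names, and the Supplier stands in for whatever single-pass iterator the storage layer would provide.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch: none of these names exist in eXist-db. */
final class ChildNodeCache<N> {
    // Assumption: opens one sequential scan over the stored children.
    private final Supplier<Iterator<N>> storageScan;
    // null means "not loaded yet" or "invalidated".
    private List<N> cached;
    // Counts storage scans, for illustration only.
    private int scans;

    ChildNodeCache(Supplier<Iterator<N>> storageScan) {
        this.storageScan = storageScan;
    }

    /** Returns the children, scanning storage only when no valid cache exists. */
    synchronized List<N> getChildNodes() {
        if (cached == null) {
            List<N> loaded = new ArrayList<>();
            Iterator<N> it = storageScan.get();
            while (it.hasNext()) {
                loaded.add(it.next());
            }
            cached = Collections.unmodifiableList(loaded);
            scans++;
        }
        return cached;
    }

    /** To be called whenever the document changes. */
    synchronized void invalidate() {
        cached = null;
    }

    synchronized int scanCount() {
        return scans;
    }
}
```

Repeated calls to getChildNodes() then share one read-only list, and only a call to invalidate() triggers a new scan. Where exactly invalidation would be hooked into document updates is left open here.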

In addition, DocumentImpl#getFollowingSibling(Node) and DocumentImpl#getPrecedingSibling(Node) require retrieving all child nodes first (i.e. they call getChildNodes()) which seems very wasteful:

  1. we should instead be able to seek to the current node and then iterate forward/backward from there
  2. if we have cached the child nodes, we should be able to reuse that information here to do the lookup against the cache.
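
As a rough sketch of the "seek, then step" idea, assuming the children can be held in (or read from) a structure ordered by document position: SiblingSeekSketch and its integer positions are hypothetical stand-ins, not how eXist-db identifies nodes. The point is only that a sibling lookup becomes one ordered seek rather than a full getChildNodes() call.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sketch: models child storage as a map ordered by position. */
final class SiblingSeekSketch {
    // Assumption: each child has a comparable position in document order.
    private final NavigableMap<Integer, String> childrenByPosition = new TreeMap<>();

    void put(int position, String node) {
        childrenByPosition.put(position, node);
    }

    /** Seeks past the given position; returns null if there is no later sibling. */
    String followingSibling(int position) {
        Map.Entry<Integer, String> next = childrenByPosition.higherEntry(position);
        return next == null ? null : next.getValue();
    }

    /** Seeks before the given position; returns null if there is no earlier sibling. */
    String precedingSibling(int position) {
        Map.Entry<Integer, String> prev = childrenByPosition.lowerEntry(position);
        return prev == null ? null : prev.getValue();
    }
}
```

The same lookups could run against a cached child list instead of the storage, which is what the second point suggests.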

Location Step preceding/following

The class org.exist.xquery.LocationStep has a method getPrecedingOrFollowing. When looking for a preceding or following step, it always scans from the start of the document, however:

  1. For preceding axis, it could simply start from the current node and scan backwards until it finds the node (or nodes) that match the test.

  2. For the following axis, it could simply start from the current node and scan forwards until it finds the node (or nodes) that match the test.

These changes would reduce the number of nodes that need to be scanned.
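
A minimal sketch of scanning from the current node, assuming the document is available as a flat sequence in document order with a depth per node. AxisScanSketch and this array layout are hypothetical and do not reflect LocationStep internals. The sketch also applies the XPath rule that the following axis excludes descendants and the preceding axis excludes ancestors.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch: a document as parallel arrays in document order. */
final class AxisScanSketch {
    private final String[] names; // node names in document order
    private final int[] depths;   // depth of each node; the root has depth 0

    AxisScanSketch(String[] names, int[] depths) {
        this.names = names;
        this.depths = depths;
    }

    /** Indexes of matching nodes on the following axis, in document order. */
    List<Integer> following(int context, Predicate<String> test) {
        int i = context + 1;
        // Descendants are the contiguous run of deeper nodes right after the context node.
        while (i < names.length && depths[i] > depths[context]) {
            i++;
        }
        List<Integer> hits = new ArrayList<>();
        for (; i < names.length; i++) {
            if (test.test(names[i])) {
                hits.add(i);
            }
        }
        return hits;
    }

    /** Indexes of matching nodes on the preceding axis, in reverse document order. */
    List<Integer> preceding(int context, Predicate<String> test) {
        List<Integer> hits = new ArrayList<>();
        int ancestorDepth = depths[context];
        for (int i = context - 1; i >= 0; i--) {
            if (depths[i] < ancestorDepth) {
                // A shallower node seen while walking backwards is an ancestor: skip it.
                ancestorDepth = depths[i];
            } else if (test.test(names[i])) {
                hits.add(i);
            }
        }
        return hits;
    }
}
```

Neither method touches nodes on the far side of the context node, which is where the saving over a scan from the start of the document would come from.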
