VSI archive (/vsizip/, /vsiar/) performance improvements with large number of files (OSGeo#12939)
Traversal of large archives (many entries, especially in a large
hierarchy) can be slow.
This change massively improves performance of random lookups by path
within an archive, as well as traversal using VSIReadDirRecursive. The
main change is to avoid full traversal of the VSIArchiveContent::entries
array for each path lookup.
I achieved this by building an index mapping each directory entry
to the indices of its children in the entries list, and using that index
to speed up lookups by path.
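
A minimal sketch of the directory-index idea, not the actual GDAL code.
The Entry struct and the ParentDir, BuildDirIndex and FindEntry helpers
are hypothetical names chosen for illustration; the real
VSIArchiveContent layout may differ.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an archive entry (not GDAL's real struct).
struct Entry {
    std::string path;  // full path inside the archive, e.g. "a/b/file.txt"
};

// Maps a directory path to the indices of its direct children in `entries`.
using DirIndex = std::map<std::string, std::vector<std::size_t>>;

// Returns the parent directory of a path ("" for top-level entries).
static std::string ParentDir(const std::string &path) {
    const auto pos = path.rfind('/');
    return pos == std::string::npos ? std::string() : path.substr(0, pos);
}

// Builds the index in a single pass over the entries list.
static DirIndex BuildDirIndex(const std::vector<Entry> &entries) {
    DirIndex index;
    for (std::size_t i = 0; i < entries.size(); ++i)
        index[ParentDir(entries[i].path)].push_back(i);
    return index;
}

// Looks up an entry by full path. Only the children of the parent
// directory are compared, instead of every entry in the archive.
static const Entry *FindEntry(const std::vector<Entry> &entries,
                              const DirIndex &index,
                              const std::string &path) {
    const auto it = index.find(ParentDir(path));
    if (it == index.end())
        return nullptr;
    for (const std::size_t i : it->second) {
        if (entries[i].path == path)
            return &entries[i];
    }
    return nullptr;
}
```

With such an index, the cost of a lookup depends on the size of one
directory rather than on the total number of entries in the archive.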
VSIReadDirRecursive also becomes much faster. It works by calling
ReadDirEx for each subdirectory, which previously meant scanning all
entries on every call. It now uses the directory index to jump directly
to the directory's children in the entries list, without visiting any
other entries.
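
The traversal described above can be sketched as follows. This is an
illustration under assumed names (Entry, DirIndex, ListRecursive), not
the real VSIReadDirRecursive implementation, and it assumes the index
has already been built.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an archive entry (not GDAL's real struct).
struct Entry {
    std::string path;  // full path inside the archive
    bool isDir;        // true if the entry is a directory
};

// Maps a directory path to the indices of its direct children in `entries`.
using DirIndex = std::map<std::string, std::vector<std::size_t>>;

// Appends every entry below `dir` to `out`, depth first. Each directory
// is resolved through the index, so only its own children are visited.
static void ListRecursive(const std::vector<Entry> &entries,
                          const DirIndex &index,
                          const std::string &dir,
                          std::vector<std::string> &out) {
    const auto it = index.find(dir);
    if (it == index.end())
        return;  // unknown or empty directory
    for (const std::size_t i : it->second) {
        out.push_back(entries[i].path);
        if (entries[i].isDir)
            ListRecursive(entries, index, entries[i].path, out);
    }
}
```

Each entry is visited once during the whole traversal, rather than once
per subdirectory.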
On my laptop, VSIReadDirRecursive on a zip file containing 600
directories, each containing 600 files:
* previous master gdal: 728 seconds
* after this change: 4.2 seconds