Still too slow when there is not enough RAM #145

@irevoire

Description

We made a huge rework of the arroy architecture to allow us to index large datasets with little RAM, but it's still not enough.

Context

A bit of context on why we did all that and the way we did it, then we'll see the issue I'm still encountering.

Before

What we were doing was basically:

  1. Get the pointer to all the vectors in a memory-mapped region that won't change during the whole indexing process
  2. Every time we update or create a node, we instead write it to a temporary file that will be written to the database at the end of the indexing process
  3. Insert the new elements in the current tree; each thread can work on its own tree in parallel
    1. If a descendant becomes too big in the process, we split it and create a new sub-tree from there
    2. We have to scan the whole database to remove the deleted elements
  4. If we need more trees, we can build them in parallel at the tree level
  5. We retrieve all the temporary files and write them to the database
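Steps 2 and 5 can be sketched roughly like this. This is a minimal Rust sketch with a hypothetical `(id, len, bytes)` framing; arroy's real on-disk node format and temp-file handling differ:

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Buffer node writes in a temporary file during indexing, then read
/// everything back in one pass at the end to "commit" it to the
/// database. The framing here (id, payload length, payload) is purely
/// illustrative.
fn write_and_commit(nodes: &[(u32, &[u8])], path: &Path) -> io::Result<Vec<u8>> {
    let mut tmp = BufWriter::new(File::create(path)?);

    // During indexing: every created/updated node goes to the temp file.
    for (node_id, payload) in nodes {
        tmp.write_all(&node_id.to_le_bytes())?;
        tmp.write_all(&(payload.len() as u32).to_le_bytes())?;
        tmp.write_all(payload)?;
    }
    tmp.flush()?;
    drop(tmp);

    // At the end: read the temp file back; a real implementation would
    // write these bytes into the database here.
    let committed = std::fs::read(path)?;
    std::fs::remove_file(path)?;
    Ok(committed)
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("arroy_sketch_nodes.bin");
    let committed = write_and_commit(&[(0, b"node-a"), (1, b"node-b")], &path)?;
    println!("committed {} bytes", committed.len());
    Ok(())
}
```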

When the whole database doesn't fit in RAM, that process causes a few issues:

  • During 3, to know whether an item must go to the left or right of a split node, we have to compare it against the normal of that node, which means we have to load all the items in RAM and compare them against the split node. If the items don't fit in RAM, that's an issue: for each split node we're going to read from disk, and it's made worse because it's done in parallel on all cores.
  • During 3.1, the issue is exacerbated because even if all the "new items to insert" fit in RAM, we're now pulling new random vectors from the database and using them to create split nodes, which generates a ton of random reads as well
  • Finally, during 4, even if it happens less often than the rest, we're reading the whole database at once
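The side selection in the first bullet boils down to a dot product against the split node's normal, which is why every item vector must be materialized for every split node it passes through. A minimal sketch, with hypothetical names rather than arroy's API:

```rust
// Dot product between two vectors of the same dimension.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Returns true if `item` goes to the left of the split plane.
/// Every call must read the full item vector, so items that don't fit
/// in RAM cost a disk read per (item, split node) pair.
fn goes_left(item: &[f32], normal: &[f32]) -> bool {
    dot(item, normal) <= 0.0
}

fn main() {
    let normal = [1.0, 0.0];
    assert!(goes_left(&[-0.5, 2.0], &normal));
    assert!(!goes_left(&[0.5, 2.0], &normal));
    println!("side selection works");
}
```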

Now

Our new strategy to avoid all these issues works like this:

  1. Get the pointer to all the vectors in a memory-mapped region that won't change during the whole indexing process
  2. Same as before, when we have to write something, we write it to a temporary file that will be written to the database at the end of the indexing
  3. First we want to insert the new elements in the current trees:
    1. Compute ahead of time how many vectors fit in RAM and retrieve their pointers
    2. Insert them in the tree. We have to compare them against the normal as before, but now, since everything fits in RAM, they should only be read once. Multiple threads can do this work in parallel for all the trees.
    3. When a leaf becomes too large and must be split into a subtree, we store all the IDs it contains in RAM and don't split it now.
    4. Come back to 1 until there are no vectors left
  4. We now have a list of leaves that are too large and must be exploded into a subtree
  5. For each missing tree, we add a new leaf to this list that contains the IDs of the whole database
  6. At this point, we won't ever read a tree node again from the database
  7. We explode the leaves to create all the subtrees required
    1. Each thread pops one large leaf and performs step 3 on it.
    2. Then it iterates over the list of leaves generated; the ones that are still too large are pushed back onto the list of large leaves and will be processed by this thread or another
  8. We retrieve all the temporary files and write them to the database
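Steps 4 to 7 above can be sketched as a work list of oversized leaves that gets drained until every leaf fits. The names and the split threshold are illustrative, not arroy's; splitting here is a plain halving, where arroy compares items against a split node's normal, and each popped leaf would be handled by a worker thread rather than a single loop:

```rust
use std::collections::VecDeque;

// Illustrative split threshold; arroy's real leaf capacity differs.
const MAX_LEAF: usize = 4;

/// Stand-in for splitting one large leaf into two children.
fn split_leaf(items: Vec<u32>) -> (Vec<u32>, Vec<u32>) {
    let mid = items.len() / 2;
    let right = items[mid..].to_vec();
    let left = items[..mid].to_vec();
    (left, right)
}

/// Drain the work list of large leaves: children that are still too
/// large go back on the list (step 7.2); small ones are finished.
/// Shown single-threaded; each pop could happen on any worker thread.
fn explode_all(mut work: VecDeque<Vec<u32>>) -> Vec<Vec<u32>> {
    let mut finished = Vec::new();
    while let Some(leaf) = work.pop_front() {
        if leaf.len() <= MAX_LEAF {
            finished.push(leaf);
        } else {
            let (l, r) = split_leaf(leaf);
            work.push_back(l);
            work.push_back(r);
        }
    }
    finished
}

fn main() {
    // Step 5: a missing tree starts as one leaf holding the whole database.
    let whole_db: Vec<u32> = (0..20).collect();
    let leaves = explode_all(VecDeque::from([whole_db]));
    assert!(leaves.iter().all(|l| l.len() <= MAX_LEAF));
    println!("{} leaves", leaves.len());
}
```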

Most of the work was finished in #130, where I parallelized the code, and a few bug fixes were done afterwards.

The issue

When the size of the database exceeds the RAM, we're once again doing a ton of random reads and nothing else.
Here's what it looks like:

[Image: system monitoring during indexing, showing RAM, swap, and disk IO]

  • The RAM increases by a few megabytes, and then we start swapping 2 GiB of RAM and spending 99% of our time on IO wait
  • During this time, we're barely writing anything to the disk, just reading and refreshing our pages over and over again

It's kinda expected that we have to do a lot of random reads on the disk. But here it's at a level where absolutely nothing makes any progress.
