This repository was archived by the owner on Oct 10, 2025. It is now read-only.

Bug: DROP_VECTOR_INDEX leaves corrupted metadata preventing property updates #6040

@br41nlet

Bug Report: DROP_VECTOR_INDEX leaves corrupted metadata preventing property updates

Environment

  • Kuzu Version: 0.11.2
  • Storage Version: 39
  • API: REST API (kuzudb/api-server:latest)
  • Vector Extension: Loaded via LOAD EXTENSION VECTOR

Summary

After a vector index is created on a property and then dropped, that property becomes permanently un-updatable via SET or MERGE operations. The error persists even though:

  • The vector index has been dropped (confirmed via SHOW_INDEXES())
  • The database has been restarted
  • No indexes remain in the catalog

Expected Behavior

After DROP_VECTOR_INDEX successfully removes a vector index, the property should be updatable again via SET or MERGE operations.

Actual Behavior

The property remains permanently un-updatable with error:

Runtime exception: Cannot set property vec in table embeddings because it is used in one or more indexes. Try delete and then insert.

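Note: No table named "embeddings" exists in the catalog (in the reproduction below, embeddings is a property of the Person table). The names in the message do not reflect the actual schema because the message text is hardcoded in the HNSW index; see Root Cause Analysis.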

Impact

This bug makes vector indexes incompatible with mutable data:

  • Cannot update entity properties that have embeddings
  • Breaks MERGE-based upsert workflows
  • Makes incremental data updates impractical

Root Cause Analysis

After deeper investigation into the source code, I suspect the bug is in NodeTable::dropIndex():

File: src/storage/store/node_table.cpp

void NodeTable::addIndex(std::unique_ptr<Index> index) {
    // ...
    indexes.push_back(IndexHolder{std::move(index)});
    hasChanges = true;  // ← Marks table for checkpoint
}

void NodeTable::dropIndex(const std::string& name) {
    KU_ASSERT(getIndex(name) != nullptr);
    for (auto it = indexes.begin(); it != indexes.end(); ++it) {
        if (StringUtils::caseInsensitiveEquals(it->getName(), name)) {
            KU_ASSERT(it->isLoaded());
            indexes.erase(it);
            return;  // ← BUG: Missing hasChanges = true!
        }
    }
}

What happens:

  1. DROP_VECTOR_INDEX calls Catalog::dropIndex() → removes from catalog ✓
  2. DROP_VECTOR_INDEX calls NodeTable::dropIndex() → removes from in-memory indexes vector ✓
  3. BUT NodeTable::dropIndex() doesn't set hasChanges = true
  4. On checkpoint, the table is presumably treated as unchanged and is not re-serialized, so the on-disk copy still contains the OLD indexes vector
  5. On restart/reload, NodeTable::deserialize() loads the persisted indexes vector which still contains the dropped index

File: src/storage/store/node_table.cpp

void NodeTable::serialize(Serializer& serializer) const {
    nodeGroups->serialize(serializer);
    serializer.write<uint64_t>(indexes.size());  // ← Serializes indexes vector
    for (auto i = 0u; i < indexes.size(); ++i) {
        indexes[i].serialize(serializer);
    }
}

void NodeTable::deserialize(...) {
    // ...
    indexes.clear();
    indexes.reserve(indexInfos.size());
    for (auto i = 0u; i < indexInfos.size(); ++i) {
        indexes.push_back(IndexHolder(...));  // ← Reloads old indexes from disk
    }
}
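
The suspected sequence can be illustrated with a small toy model. This is only a sketch: ToyTable, Disk, and indexesAfterRestart are hypothetical names invented for illustration, not Kuzu classes. It also assumes that a checkpoint skips any table whose hasChanges flag is false, which is what the analysis above suspects but has not confirmed.

```cpp
// Toy model only: none of these types are Kuzu's real classes.
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the on-disk image of one table's index list.
struct Disk {
    std::vector<std::string> persistedIndexes;
};

struct ToyTable {
    std::vector<std::string> indexes; // in-memory index names
    bool hasChanges = false;          // dirty flag consulted by checkpoint()

    void addIndex(const std::string& name) {
        indexes.push_back(name);
        hasChanges = true;
    }

    // markDirty=false mimics the reported behaviour, true the proposed fix.
    void dropIndex(const std::string& name, bool markDirty) {
        auto it = std::find(indexes.begin(), indexes.end(), name);
        assert(it != indexes.end());
        indexes.erase(it);
        if (markDirty) {
            hasChanges = true;
        }
    }

    // Assumption: a table without changes is skipped at checkpoint time.
    void checkpoint(Disk& disk) {
        if (!hasChanges) {
            return;
        }
        disk.persistedIndexes = indexes;
        hasChanges = false;
    }

    // Models a restart: in-memory state is rebuilt from the disk image.
    static ToyTable reload(const Disk& disk) {
        ToyTable table;
        table.indexes = disk.persistedIndexes;
        return table;
    }
};

// Runs create -> checkpoint -> drop -> checkpoint -> restart and returns
// how many indexes the reloaded table still holds.
std::size_t indexesAfterRestart(bool markDirty) {
    Disk disk;
    ToyTable table;
    table.addIndex("person_embeddings_idx");
    table.checkpoint(disk);
    table.dropIndex("person_embeddings_idx", markDirty);
    table.checkpoint(disk);
    return ToyTable::reload(disk).indexes.size();
}
```

In this model, calling indexesAfterRestart(false) leaves the dropped index in the reloaded table, while indexesAfterRestart(true) does not. Whether the real checkpoint path behaves this way would need to be confirmed against the Kuzu source.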

Why updates fail:

When updating a property, NodeTable::initUpdateState() iterates through the indexes vector and calls each index's initUpdateState():

void NodeTable::initUpdateState(...) {
    for (auto i = 0u; i < indexes.size(); i++) {
        auto& indexHolder = indexes[i];
        auto index = indexHolder.getIndex();
        if (index->isBuiltOnColumn(nodeUpdateState.columnID)) {
            // Calls HNSWIndex::initUpdateState() which throws error
            nodeUpdateState.indexUpdateState[i] = index->initUpdateState(...);
        }
    }
}

The HNSW index's initUpdateState() is hardcoded to throw:

File: extension/vector/src/include/index/hnsw_index.h

std::unique_ptr<UpdateState> initUpdateState(...) override {
    throw common::RuntimeException{
        "Cannot set property vec in table embeddings because it is "
        "used in one or more indexes. Try delete and then insert."
    };
}
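
A simplified sketch of this update path shows why a single stale entry in the indexes vector blocks updates on the indexed column only, and why the table and property names in the message need not match the schema. ToyIndex, ToyVectorIndex, ToyTable, trySet, and tableWithStaleIndex are hypothetical stand-ins, and the column numbers are arbitrary; only the message text is taken from the report.

```cpp
// Toy model only: simplified stand-ins, not Kuzu's real classes.
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

struct ToyIndex {
    std::size_t columnID; // column the index was built on
    explicit ToyIndex(std::size_t columnID) : columnID(columnID) {}
    virtual ~ToyIndex() = default;
    bool isBuiltOnColumn(std::size_t id) const { return id == columnID; }
    virtual void initUpdateState() const = 0;
};

// Mimics an index type that rejects in-place updates with a fixed message.
struct ToyVectorIndex : ToyIndex {
    using ToyIndex::ToyIndex;
    void initUpdateState() const override {
        throw std::runtime_error(
            "Cannot set property vec in table embeddings because it is "
            "used in one or more indexes. Try delete and then insert.");
    }
};

struct ToyTable {
    std::vector<std::unique_ptr<ToyIndex>> indexes;

    // Returns an empty string on success, or the error text if any index
    // covering the column refuses the update.
    std::string trySet(std::size_t columnID) const {
        try {
            for (const auto& index : indexes) {
                if (index->isBuiltOnColumn(columnID)) {
                    index->initUpdateState();
                }
            }
        } catch (const std::runtime_error& e) {
            return e.what();
        }
        return "";
    }
};

// Builds a table whose column 2 is still covered by a stale vector index.
ToyTable tableWithStaleIndex() {
    ToyTable table;
    table.indexes.push_back(std::make_unique<ToyVectorIndex>(2));
    return table;
}
```

With the table returned by tableWithStaleIndex(), trySet(1) succeeds while trySet(2) returns the fixed error text; once the indexes vector is emptied, trySet(2) succeeds as well. This matches the observations that other properties on the same table remain updatable and that tables that never had an index are unaffected.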

The Fix

Add hasChanges = true in NodeTable::dropIndex():

void NodeTable::dropIndex(const std::string& name) {
    KU_ASSERT(getIndex(name) != nullptr);
    for (auto it = indexes.begin(); it != indexes.end(); ++it) {
        if (StringUtils::caseInsensitiveEquals(it->getName(), name)) {
            KU_ASSERT(it->isLoaded());
            indexes.erase(it);
            hasChanges = true;  // ← Add this line
            return;
        }
    }
}

This ensures the NodeTable gets checkpointed with the updated indexes vector, and on restart/reload, the dropped index won't be in the collection.

Are there known steps to reproduce?

Minimal Reproducible Example

-- Step 1: Create a node table with vector property
CREATE NODE TABLE Person(id STRING, name STRING, embeddings FLOAT[128], PRIMARY KEY(id));

-- Step 2: Insert a test node
CREATE (p:Person {id: "person-1", name: "Alice", embeddings: [0.1, 0.2, 0.3, ..., 0.128]});

-- Step 3: Create vector index on embeddings property
CALL CREATE_VECTOR_INDEX(
    'Person',
    'person_embeddings_idx',
    'embeddings',
    mu := 30,
    ml := 60,
    pu := 0.05,
    metric := 'cosine',
    efc := 200,
    cache_embeddings := true
);

-- Step 4: Verify index was created
CALL SHOW_INDEXES() RETURN *;
-- Output: Shows person_embeddings_idx on Person.embeddings

-- Step 5: Drop the vector index
CALL DROP_VECTOR_INDEX('Person', 'person_embeddings_idx');
-- Output: "Table _X_person_embeddings_idx_LOWER has been dropped."

-- Step 6: Verify index was removed
CALL SHOW_INDEXES() RETURN *;
-- Output: Empty result (no indexes remain)

-- Step 7: Try to update the embeddings property
MATCH (p:Person {id: "person-1"})
SET p.embeddings = [0.2, 0.3, 0.4, ..., 0.129]
RETURN p.id;

-- ERROR: Runtime exception: Cannot set property vec in table embeddings because it is used in one or more indexes. Try delete and then insert.

Additional Observations

The problem is table-specific

-- Create another table that never had a vector index
CREATE NODE TABLE Organization(id STRING, name STRING, embeddings FLOAT[128], PRIMARY KEY(id));
CREATE (o:Organization {id: "org-1", name: "Company", embeddings: [0.1, 0.2, ...]});

-- This works fine (no index was ever created)
MATCH (o:Organization {id: "org-1"})
SET o.embeddings = [0.2, 0.3, ...]
RETURN o.id;
-- Success!

-- But Person table (which had an index) remains broken
MATCH (p:Person {id: "person-1"})
SET p.embeddings = [0.2, 0.3, ...]
RETURN p.id;
-- Still fails with same error

Other properties are unaffected

-- Non-indexed properties on the same table work fine
MATCH (p:Person {id: "person-1"})
SET p.name = "Updated Name"
RETURN p.name;
-- Success!

The issue persists across database restarts

After restarting the Kuzu server, the same error occurs, indicating the corrupted metadata is persisted to disk.
