Closed
21 changes: 18 additions & 3 deletions bertopic/_bertopic.py
@@ -3783,19 +3783,34 @@ def _reduce_dimensionality(
         if partial_fit:
             if hasattr(self.umap_model, "partial_fit"):
                 self.umap_model = self.umap_model.partial_fit(embeddings)
+                umap_embeddings = self.umap_model.transform(embeddings)
             elif self.topic_representations_ is None:
                 self.umap_model.fit(embeddings)
+                umap_embeddings = self.umap_model.transform(embeddings)
+            else:
+                if hasattr(self.umap_model, "fit_transform"):
+                    umap_embeddings = self.umap_model.fit_transform(embeddings)
+                else:
+                    self.umap_model.fit(embeddings)
+                    umap_embeddings = self.umap_model.transform(embeddings)

         # Regular fit
         else:
             try:
                 # cuml umap needs y to be an numpy array
                 y = np.array(y) if y is not None else None
-                self.umap_model.fit(embeddings, y=y)
+                if hasattr(self.umap_model, "fit_transform"):
+                    umap_embeddings = self.umap_model.fit_transform(embeddings, y=y)
+                else:
+                    self.umap_model.fit(embeddings)
Owner

The y=y is missing from this .fit call. I can see it's there in the .fit_transform though.

Author

Fixed!

Owner

I saw you moved the y=y to the transform step, but shouldn't it be passed to .fit instead?

Author

@jemather, Jun 3, 2025

Yes it should. Sorry about that, I'll fix it.

Owner

In your latest commit, you moved the y parameter to the wrong line. It should be passed to .fit, not .transform.

+                    umap_embeddings = self.umap_model.transform(embeddings, y=y)
             except TypeError:
-                self.umap_model.fit(embeddings)
+                if hasattr(self.umap_model, "fit_transform"):
+                    umap_embeddings = self.umap_model.fit_transform(embeddings)
+                else:
+                    self.umap_model.fit(embeddings)
+                    umap_embeddings = self.umap_model.transform(embeddings)

-        umap_embeddings = self.umap_model.transform(embeddings)
         logger.info("Dimensionality - Completed \u2713")
         return np.nan_to_num(umap_embeddings)
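
The review thread's point, that y belongs in .fit rather than .transform, can be illustrated with a small standalone sketch. This is not the code that was merged or proposed; `reduce_regular_fit`, `SupervisedToy`, and `UnsupervisedToy` are hypothetical names invented for illustration. The toy models work on plain lists and stand in for a UMAP-like reducer, and the NumPy conversion of y and the final `np.nan_to_num` call are left out to keep the example dependency-free.

```python
def reduce_regular_fit(model, embeddings, y=None):
    """Hypothetical standalone sketch of the regular-fit branch."""
    try:
        if hasattr(model, "fit_transform"):
            return model.fit_transform(embeddings, y=y)
        # Supervision is consumed while fitting ...
        model.fit(embeddings, y=y)
    except TypeError:
        # Assumed fallback: the model's signature does not accept `y`.
        if hasattr(model, "fit_transform"):
            return model.fit_transform(embeddings)
        model.fit(embeddings)
    # ... while transform only projects data through the fitted model.
    return model.transform(embeddings)


class SupervisedToy:
    """Toy reducer whose fit accepts labels; keeps the first two columns."""

    def fit(self, X, y=None):
        self.labels_seen = y
        return self

    def transform(self, X):
        return [row[:2] for row in X]


class UnsupervisedToy:
    """Toy reducer whose fit rejects a `y` keyword (raises TypeError)."""

    def fit(self, X):
        self.fitted = True
        return self

    def transform(self, X):
        return [row[:2] for row in X]


X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

supervised = SupervisedToy()
reduced = reduce_regular_fit(supervised, X, y=[0, 1])
print(reduced)                  # [[1.0, 2.0], [4.0, 5.0]]
print(supervised.labels_seen)   # [0, 1]

unsupervised = UnsupervisedToy()
print(reduce_regular_fit(unsupervised, X, y=[0, 1]))  # [[1.0, 2.0], [4.0, 5.0]]
```

One design caveat of this pattern: catching TypeError around the whole call also swallows a TypeError raised inside the model for unrelated reasons, in which case the fallback silently refits without labels.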
