[QST] How to get candidate features after using pre-trained embedding tables?

# ❓ Questions & Help

## Details

I am following the [tutorial here](https://github.com/NVIDIA-Merlin/models/blob/main/examples/usecases/entertainment-with-pretrained-embeddings.ipynb) to include pre-computed embeddings when training a Two Tower retrieval model. Specifically, I am using the method below so that the embedding table is not included as part of the model:

```
loader = mm.Loader(
    train,
    batch_size=1024,
    transforms=[
        EmbeddingOperator(
            pretrained_movie_embs,
            lookup_key="movieId",
            embedding_name="pretrained_movie_embeddings",
        ),
    ],
)
```

I am trying to combine this approach with the top-k evaluation step from the [Retrieval Model tutorial here](https://github.com/NVIDIA-Merlin/models/blob/main/examples/05-Retrieval-Model.ipynb):


```
# Top-K evaluation
candidate_features = unique_rows_by_features(train, Tags.ITEM, Tags.ITEM_ID)
candidate_features.head()

topk = 20
topk_model = model.to_top_k_encoder(candidate_features, k=topk, batch_size=128)

# we can set the `metrics` param in `compile()` if we want
topk_model.compile(run_eagerly=False)
```



The problem is that `loader.output_schema` differs from `loader.dataset.schema`. The utility function `unique_rows_by_features` requires a dataset as its first argument, but passing `loader.dataset` doesn't work because that dataset doesn't contain the embedding vectors yet; they are only added by the `EmbeddingOperator` transform inside the loader.

My question is: when pre-trained embeddings are included via the loader as described above, how should one obtain the `candidate_features` required by the candidate tower?
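To make the goal concrete, here is a minimal, library-agnostic sketch of the result I am after. It does not use any Merlin API: the toy `interactions` rows, the tiny embedding table, and the helper functions `unique_item_rows` and `attach_embeddings` are all hypothetical stand-ins. It deduplicates the item columns by `movieId` (roughly what I understand `unique_rows_by_features` to do) and then attaches the pre-trained vector by lookup, so that the candidate rows carry the same columns as `loader.output_schema`.

```python
# Hypothetical toy data standing in for the raw rows in `loader.dataset`.
interactions = [
    {"userId": 1, "movieId": 2, "genre": 5},
    {"userId": 2, "movieId": 0, "genre": 3},
    {"userId": 3, "movieId": 2, "genre": 5},
]

# Hypothetical pre-trained table: row i holds the embedding of movieId i.
pretrained_movie_embs = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
]

# Assumed item-side columns of the raw dataset (no embedding column yet).
ITEM_COLUMNS = ["movieId", "genre"]


def unique_item_rows(rows, item_columns, id_column):
    """Keep only the item columns, one row per item id (first occurrence wins)."""
    seen = {}
    for row in rows:
        item_id = row[id_column]
        if item_id not in seen:
            seen[item_id] = {col: row[col] for col in item_columns}
    # Return the rows ordered by item id for a stable result.
    return [seen[item_id] for item_id in sorted(seen)]


def attach_embeddings(item_rows, table, lookup_key, embedding_name):
    """Add the pre-trained vector for each item, looked up by its id."""
    return [{**row, embedding_name: table[row[lookup_key]]} for row in item_rows]


candidate_features = attach_embeddings(
    unique_item_rows(interactions, ITEM_COLUMNS, "movieId"),
    pretrained_movie_embs,
    lookup_key="movieId",
    embedding_name="pretrained_movie_embeddings",
)

for row in candidate_features:
    print(row)
```

What I don't know is how to produce the equivalent of this with the Merlin `loader` and `to_top_k_encoder`, without doing the lookup by hand.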

Thank you in advance for taking the time to answer!

[QST] How to get candidate features after using pre-trained embedding tables? #1239

❓ Questions & Help

Details

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[QST] How to get candidate features after using pre-trained embedding tables? #1239

Description

❓ Questions & Help

Details

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions