Neural Click Models by arabel1a · Pull Request #8 · sb-ai-lab/Sim4Rec

arabel1a · 2024-11-24T22:13:19Z

No description provided.

pyproject.toml

sim4rec/response/nn_response.py

monkey0head · 2024-12-02T08:22:15Z

sim4rec/response/nn_response.py

+            print("Warning: the historical data is empty")
+            hist_data = spark.createDataFrame([], schema=SIM_LOG_SCHEMA)
+        # filter users whom we don't need
+        hist_data = hist_data.join(new_recs, on="user_idx", how="inner").select(


What is going on here? Why do you join all of the new_recs columns? You need to take at least new_recs.select("user_idx").distinct(), and then there is no need to call select(hist_data["*"]) afterwards.

Wow, this is really a huge bug. I think it is an artifact of one of the intermediate versions, where I tried to work with tables whose user_idx is unique and each row represents the whole iteration. I'll fix it.

It seems the fix was wrong; see the new suggestion in #8 (comment).
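To make the problem in this thread concrete, here is a minimal pure-Python sketch (not Spark, and not code from the PR) of what an inner join does when the right-hand side has several rows per user_idx, as new_recs does when a user receives several recommended items. The helper `inner_join`, the column `rec_item`, and the toy rows are hypothetical and exist only for illustration.

```python
def inner_join(left, right, key):
    """Toy inner join: one output row per matching (left, right) pair."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if l[key] == r[key]
    ]

# Toy history: user 1 has two past interactions, user 2 has one.
hist_data = [
    {"user_idx": 1, "item_idx": 10},
    {"user_idx": 1, "item_idx": 11},
    {"user_idx": 2, "item_idx": 12},
]
# Toy recommendations: three recommended items, all for user 1.
new_recs = [
    {"user_idx": 1, "rec_item": 20},
    {"user_idx": 1, "rec_item": 21},
    {"user_idx": 1, "rec_item": 22},
]

joined = inner_join(hist_data, new_recs, "user_idx")

# Keeping only the history columns afterwards does not undo the duplication:
# every history row of user 1 now appears once per recommendation row.
history_rows = [
    {"user_idx": r["user_idx"], "item_idx": r["item_idx"]} for r in joined
]
```

In this toy case the two history rows of user 1 become six rows, which is why the reviewer asks for the join key to be made distinct first.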

monkey0head · 2024-12-02T08:23:05Z

sim4rec/response/nn_response.py

+            print("Warning: the simulator log is empty")
+            simlog = spark.createDataFrame([], schema=SIM_LOG_SCHEMA)
+        # filter users whom we don't need
+        simlog = simlog.join(new_recs, on="user_idx", how="inner").select(simlog["*"])


same as above

#8 (comment)

monkey0head · 2024-12-02T08:28:35Z

sim4rec/response/nn_response.py

+            )
+        )
+
+        # not very optimal way, it makes one worker to


We need to discuss this. Your batch id should not influence the partitioning: one partition != one batch, and users are grouped into batches within one partition. I do not know how to implement it for now.

BatchID won't influence the partition, because each batch must consist of the whole interaction history of a specific group of users. I

OK, I see. Just remove the comment.
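The invariant stated in this thread (each batch holds the whole interaction history of a specific group of users) can be sketched in plain Python. This is an illustration under assumptions, not the PR's implementation: the function `assign_batches`, the parameter `users_per_batch`, and the toy log are hypothetical, and the real code operates on Spark dataframes.

```python
from collections import defaultdict

def assign_batches(rows, users_per_batch):
    """Group rows into batches so that all rows of a user share one batch."""
    # Give each distinct user a batch id based on its rank among the users.
    users = sorted({r["user_idx"] for r in rows})
    batch_of_user = {u: i // users_per_batch for i, u in enumerate(users)}

    batches = defaultdict(list)
    for r in rows:
        batches[batch_of_user[r["user_idx"]]].append(r)
    return dict(batches)

# Toy log: user 7 has two interactions that must stay together.
log = [
    {"user_idx": 7, "item_idx": 1},
    {"user_idx": 3, "item_idx": 2},
    {"user_idx": 7, "item_idx": 3},
    {"user_idx": 5, "item_idx": 4},
]
batches = assign_batches(log, users_per_batch=2)
```

Because the batch id is derived from the user alone, no user's history is ever split across two batches, regardless of how the rows are ordered in the input.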

monkey0head · 2024-12-02T12:29:47Z

sim4rec/response/nn_response.py

+        self.backbone_response_model = None
+
+    def _fit(self, train_data):
+        """


Please describe the dataframe format here and for transform: what must be included to properly convert the dataframe to RecommendationData? Please add the corresponding docstrings.

It is exactly the same as the simulator log format. Could you point me to where I can find its description?

monkey0head · 2024-12-02T16:22:59Z

Thank you for your contribution! Please have a look at the comments and add time measurements to the notebook to show the speed of the main stages of the simulation pipeline.

Veronika-Ivanova · 2024-12-18T10:32:46Z

sim4rec/response/nn_response.py

+        """
+        Predict responses for given dataframe with recommendations.
+
+        :param dataframe: new recommendations.


The param name is not correct; it should be new_recs.

monkey0head · 2024-12-20T08:20:44Z

sim4rec/response/nn_response.py

+            print("Warning: the historical data is empty")
+            hist_data = spark.createDataFrame([], schema=SIM_LOG_SCHEMA)
+        # filter users whom we don't need
+        hist_data = hist_data.join(new_recs, on="user_idx", how="semi")


Use this if you really want to keep in hist_data only the history of the distinct users from new_recs.

Suggested change

-        hist_data = hist_data.join(new_recs, on="user_idx", how="semi")
+        hist_data = hist_data.join(sf.broadcast(new_recs.select("user_idx").distinct()), on="user_idx", how="inner")
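As a sanity check on the two variants in this suggestion, the pure-Python sketch below (again not Spark; the helpers `semi_join` and `inner_join_on_distinct_keys` and the toy rows are hypothetical) models a left-semi join and an inner join against deduplicated keys. Under this toy model both keep exactly the same history rows; in Spark, `sf.broadcast` is only a hint that the small key table can be shipped to every executor and does not change the result.

```python
def semi_join(left, right, key):
    """Toy left-semi join: keep left rows whose key occurs in right."""
    right_keys = {r[key] for r in right}
    return [l for l in left if l[key] in right_keys]

def inner_join_on_distinct_keys(left, right, key):
    """Toy inner join against right.select(key).distinct()."""
    distinct = [{key: k} for k in {r[key] for r in right}]
    # Each left row matches at most one distinct key, so nothing is duplicated.
    return [{**l, **d} for l in left for d in distinct if l[key] == d[key]]

hist_data = [
    {"user_idx": 1, "item_idx": 10},
    {"user_idx": 1, "item_idx": 11},
    {"user_idx": 2, "item_idx": 12},
]
# Several recommendation rows for the same user.
new_recs = [
    {"user_idx": 1, "rec_item": 20},
    {"user_idx": 1, "rec_item": 21},
]

kept_by_semi = semi_join(hist_data, new_recs, "user_idx")
kept_by_inner = inner_join_on_distinct_keys(hist_data, new_recs, "user_idx")
```

Both results contain only the two history rows of user 1, each exactly once, so the choice between the variants is about the physical join strategy rather than about which rows survive.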

monkey0head · 2024-12-20T08:21:45Z

sim4rec/response/nn_response.py

+            print("Warning: the simulator log is empty")
+            simlog = spark.createDataFrame([], schema=SIM_LOG_SCHEMA)
+        # filter users whom we don't need
+        simlog = simlog.join(new_recs, on="user_idx", how="semi")


same as for hist data...

Suggested change

-        simlog = simlog.join(new_recs, on="user_idx", how="semi")
+        simlog = simlog.join(sf.broadcast(new_recs.select("user_idx").distinct()), on="user_idx", how="inner")

arabel1a added 9 commits October 1, 2024 09:45

update dependencies

21ac50a

fix recent jupyter issue

1295e6d

dockerfile for cuda 10.2

5e78916

save intermediate progress

4fd2589

working version

4a0e26c

rename

b787cc4

merge main

aa0c608

notebook move to notebooks, changed model to SlatewiseTransformer

9954b7d

delete dockerfile

e23704c

monkey0head requested changes Dec 2, 2024

pyproject.toml

monkey0head reviewed Dec 2, 2024

sim4rec/response/nn_response.py (outdated)

monkey0head reviewed Dec 2, 2024

after-review fix

a7b1325

Veronika-Ivanova reviewed Dec 18, 2024

monkey0head reviewed Dec 20, 2024

arabel1a and others added 5 commits December 24, 2024 13:35

clean up comments

e54cb85

update docstring for NNTransformer

74bbc31

cleanup

8412683

black

355f1f0

Update embeddings.py

ef41562

3 participants
