|
17 | 17 | "source": [ |
18 | 18 | "Recommendation systems are a common application of machine learning and serve many industries from e-commerce to music streaming platforms.\n", |
19 | 19 | "\n", |
20 | | - "There are many different architechtures that can be followed to build a recommendation system.\n", |
| 20 | + "There are many different architectures that can be followed to build a recommendation system. In a previous example notebook we demonstrated how to do [content filtering with RedisVL](content_filtering.ipynb). We encourage you to start there before diving into this notebook.\n", |
21 | 21 | "\n", |
22 | 22 | "In this notebook we'll demonstrate how to build a [collaborative filtering](https://en.wikipedia.org/wiki/Collaborative_filtering)\n", |
23 | 23 | "recommendation system and use the large IMDB movies dataset as our example data.\n", |
|
268 | 268 | } |
269 | 269 | ], |
270 | 270 | "source": [ |
271 | | - "# surprise casts userId and movieId to inner ids, so we have to use their mapping to now which rows to use\n", |
| 271 | + "# surprise casts userId and movieId to inner ids, so we have to use their mapping to know which rows to use\n", |
272 | 272 | "inner_uid = train_set.to_inner_uid(347) # userId\n", |
273 | 273 | "inner_iid = train_set.to_inner_iid(5515) # movieId\n", |
274 | 274 | "\n", |
|
582 | 582 | "movies_df['overview'] = movies_df['overview'].fillna('')\n", |
583 | 583 | "movies_df['popularity'] = movies_df['popularity'].fillna(0)\n", |
584 | 584 | "movies_df['release_date'] = movies_df['release_date'].fillna('1900-01-01').apply(lambda x: datetime.datetime.strptime(x, \"%Y-%m-%d\").timestamp())\n", |
585 | | - "movies_df['revenue'] = movies_df['revenue'].fillna(0) # fill with average?\n", |
586 | | - "movies_df['runtime'] = movies_df['runtime'].fillna(0) # fill with average?\n", |
| 585 | + "movies_df['revenue'] = movies_df['revenue'].fillna(0)\n", |
| 586 | + "movies_df['runtime'] = movies_df['runtime'].fillna(0)\n", |
587 | 587 | "movies_df['status'] = movies_df['status'].fillna('unknown')\n", |
588 | 588 | "movies_df['tagline'] = movies_df['tagline'].fillna('')\n", |
589 | 589 | "movies_df['title'] = movies_df['title'].fillna('')\n", |
|
1196 | 1196 | "## Adding All the Bells & Whistles\n", |
1197 | 1197 | "Vector search handles the bulk of our collaborative filtering recommendation system and is a great approach to generating personalized recommendations that are unique to each user.\n", |
1198 | 1198 | "\n", |
1199 | | - "To up our RecSys game even further we can leverage RedisVl filter logic to give more control to what users are shown. Why have only one feed of recommended movies when you can have several, each with its own theme and personalized to each user." |
| 1199 | + "To up our RecSys game even further we can leverage RedisVL Filter logic to give more control to what users are shown. Why have only one feed of recommended movies when you can have several, each with its own theme and personalized to each user." |
1200 | 1200 | ] |
1201 | 1201 | }, |
1202 | 1202 | { |
|
1428 | 1428 | "metadata": {}, |
1429 | 1429 | "source": [ |
1430 | 1430 | "## Keeping Things Fresh\n", |
1431 | | - "You've probably noticed that a few movies get repeated in these lists. That's not surprising as all our results are personalized and things like `popularity` and `user_rating` and `revenue` are likely highly correlated. And it's more that likely that at least some of the recommendations we're expecting to be highly rated by a given user are ones they've already watched and rated highly.\n", |
| 1431 | + "You've probably noticed that a few movies get repeated in these lists. That's not surprising as all our results are personalized and things like `popularity` and `user_rating` and `revenue` are likely highly correlated. And it's more than likely that at least some of the recommendations we're expecting to be highly rated by a given user are ones they've already watched and rated highly.\n", |
1432 | 1432 | "\n", |
1433 | 1433 | "Luckily Redis offers an easy answer to keeping recommendations new and interesting, and that answer is Bloom Filters." |
1434 | 1434 | ] |
|
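The "Keeping Things Fresh" cell above names Bloom filters as the way to avoid re-recommending movies a user has already seen. Redis exposes these through commands such as `BF.ADD` and `BF.EXISTS`; the sketch below does not call Redis at all and is only a minimal pure-Python illustration of the idea. The class `BloomFilter`, the helper `drop_seen`, the bit and hash counts, and the example movie ids are all hypothetical and not taken from the notebook.

```python
import hashlib


class BloomFilter:
    """Tiny in-memory Bloom filter, for illustration only."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def drop_seen(recommended_ids, seen):
    """Keep only the recommendations the filter reports as not seen."""
    return [movie_id for movie_id in recommended_ids if movie_id not in seen]


seen_by_user = BloomFilter()
for movie_id in (5515, 318, 296):  # hypothetical ids this user already rated
    seen_by_user.add(movie_id)

fresh = drop_seen([5515, 1001, 318, 2002], seen_by_user)
```

A Bloom filter never reports a watched movie as unseen, but it can occasionally flag an unwatched movie as seen. For a recommendation feed that trade-off is usually acceptable: the worst case is that a new title is skipped, never that an old one is shown again.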