Big Data course project: Spark and Elasticsearch recommendation system.

This is my project for the Big Data course of the Master's Degree at the University of Verona.

Workflow

The notebook follows the workflow shown in spark_es_workflow.png.

Requisites

Spark installation (I used v3.4.0)
Elasticsearch installation (I used v8.8.0)
Conda virtual environment with the necessary packages installed (conda create --name <env> --file requirements.txt)

Usage

First, start your Elasticsearch instance. Then run all the cells of the spark_es_recommendation_movies.ipynb notebook up to the last one, where you can choose a movie ID and get 10 recommended movies for it (the example shows the recommendations for La vita è bella).
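To give an idea of what "10 recommended movies for a movie ID" can mean in practice, here is a minimal sketch, not the notebook's code. It assumes each movie is represented by a numeric vector and ranks the other movies by cosine similarity; the names recommend, cosine and toy_vectors are hypothetical, and the toy data is made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(movie_id, vectors, k=10):
    """Return the k movies most similar to movie_id as (id, score) pairs, best first."""
    query = vectors[movie_id]
    scored = [
        (other_id, cosine(query, vec))
        for other_id, vec in vectors.items()
        if other_id != movie_id  # never recommend the query movie itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy vectors: movie id -> 2-dimensional vector.
toy_vectors = {
    1: [1.0, 0.0],
    2: [0.9, 0.1],
    3: [0.0, 1.0],
    4: [-1.0, 0.0],
}

top = recommend(1, toy_vectors, k=2)
print(top)
```

In the notebook, the ranking is obtained through Spark and Elasticsearch rather than an in-memory loop like this one; the sketch only illustrates the idea of a top-k similarity lookup.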

If you want to save statistics about CPU and RAM usage, set save_stats = True in the second notebook cell. To plot these statistics, run the single cell in the plots.ipynb notebook.
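As an illustration of how a flag like save_stats can gate the collection of statistics, here is a small sketch under stated assumptions. Only the flag name comes from the notebook; run_with_stats, sample_stats and the CSV columns are hypothetical, and the measurements (CPU time of the current process and Python memory traced by tracemalloc) are stand-ins chosen because they need only the standard library. How the notebook actually collects its statistics is not shown here.

```python
import csv
import io
import time
import tracemalloc

save_stats = True  # same flag name as in the notebook; everything else is hypothetical


def sample_stats():
    """Return (timestamp, CPU seconds of this process, traced Python memory in bytes)."""
    current, _peak = tracemalloc.get_traced_memory()
    return time.time(), time.process_time(), current


def run_with_stats(work, out, save_stats):
    """Run work(); only if save_stats is True, write one sample before and one after it."""
    writer = csv.writer(out)
    if save_stats:
        tracemalloc.start()
        writer.writerow(["timestamp", "cpu_seconds", "python_memory_bytes"])
        writer.writerow(sample_stats())
    result = work()
    if save_stats:
        writer.writerow(sample_stats())
        tracemalloc.stop()
    return result


# Write to an in-memory buffer here; a real run would pass an open file instead.
buffer = io.StringIO()
total = run_with_stats(lambda: sum(i * i for i in range(100_000)), buffer, save_stats)
print(buffer.getvalue())
```

With save_stats set to False the function still runs the work but writes nothing, so the monitoring overhead is only paid when the statistics are wanted.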

If you want to know more about this project, check out my project report (big_data_report.pdf). For this project I took inspiration from this Medium article.


matteomarjanovic/spark-recommender
