Problem Statement

As part of the assignment, you are required to explore the IMDB dataset from Kaggle or an open-source repository of your choice. You will apply your knowledge of deep learning and PyTorch to develop a solution for a specific problem in the dataset. Use Azure ML Studio with Designer Pipelines to work with prebuilt components and to visualize movie categories for a recommendation system.


Steps for the Assignment

  1. Dataset Collection:

    • Choose a dataset from Kaggle or any other open-source repository. Ensure that the dataset is not overly complex and can be processed on a standard local machine.
    • A sample dataset from Kaggle is here
  2. Preprocessing:

    • Analyze the dataset and preprocess the data to make it suitable for training (e.g., normalization, encoding, splitting).
    • Use text-cleaning techniques such as stemming, stop-word removal, and lemmatization.
  3. Feature Extraction:

    • Use the Bag of Words or TF-IDF vectorization technique.
  4. Model Building:

    • Build a PyTorch-based deep learning model that solves the selected problem. You can use a simple neural network or experiment with architectures like CNNs or RNNs depending on your dataset.
  5. Training and Evaluation:

    • Train your model, evaluate its performance, and report metrics such as accuracy and loss, or others relevant to your problem.
  6. Visualization:

    • Include visualizations such as loss curves, accuracy trends, or sample predictions.
  7. Documentation:

    • Document your findings, observations, and challenges in a concise report.
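
As a rough illustration of the preprocessing and feature extraction steps, the sketch below removes stop words and builds TF-IDF vectors using only the Python standard library. The stop-word list, the two sample reviews, and the helper names `tokenize` and `tfidf` are assumptions made for this example; stemming and lemmatization are left out and would normally come from an NLP library.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "was", "and", "of", "this", "it"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def tfidf(docs):
    """Return (vocabulary, rows); each row is the TF-IDF vector of one document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed inverse document frequency: log((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        rows.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, rows

# Hypothetical sample reviews, not taken from the IMDB dataset.
reviews = [
    "This movie was great and the acting was great",
    "The plot was dull and the acting was poor",
]
vocab, rows = tfidf(reviews)
print(vocab)
```

Terms that appear in every review (here "acting") receive the lowest IDF weight, while terms repeated within a single review (here "great") receive a higher score in that review's vector.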

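The model building, training, and evaluation steps could look roughly like the following PyTorch sketch. It is only a minimal example under stated assumptions: the random feature matrix stands in for real TF-IDF vectors, the labels are synthetic, and the layer sizes, learning rate, and epoch count are arbitrary choices rather than recommended settings.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic stand-in for TF-IDF features: 64 "reviews", 20-term vocabulary.
X = torch.rand(64, 20)
# Hypothetical labels: positive when the first feature exceeds the second.
y = (X[:, 0] > X[:, 1]).float().unsqueeze(1)

# Small feed-forward classifier that outputs one logit per review.
model = nn.Sequential(
    nn.Linear(20, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

losses = []  # per-epoch loss, usable later for a loss curve
for epoch in range(50):
    optimizer.zero_grad()
    logits = model(X)
    loss = loss_fn(logits, y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Accuracy on the training data only; a real run needs a held-out split.
with torch.no_grad():
    preds = (torch.sigmoid(model(X)) > 0.5).float()
    accuracy = (preds == y).float().mean().item()

print(f"final loss {losses[-1]:.4f}, training accuracy {accuracy:.2f}")
```

The `losses` list can be plotted against the epoch index to produce the loss curve mentioned in the visualization step.
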
Add that document to your Git repository and share the repository URL with your instructor. Submit the URL in this Microsoft Form - Here