box-key/Subjective-Class-Issue

Description

Code for our article, "A Pitfall of Learning from User-generated Dataset", on a type of class noise specific to user-generated datasets (e.g. customer reviews) called Subjective Class Issue. We used datasets generously provided by Donorschoose.org, Yelp Review, and Amazon Fine Food. By following the usage steps below, you can replicate the results shown in our paper.

Requirements

  • python3
  • jupyter notebook

Usage

If you would like to run our notebooks, please follow the steps below:

  1. Download "Project Essays" provided by Donorschoose.org.
  2. Clone this repo with git clone.
  3. cd into the repo.
  4. Create and activate a Python virtual environment.
  5. Run pip install -r requirements.txt.
  6. Inside the Python virtual environment, start Jupyter Notebook.
  7. Open the ipynb files; each file name indicates the task that notebook performs. (*Make sure you train doc2vec before you run the t-SNE plots.)
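
Steps 2 through 6 can also be scripted. The following is a minimal sketch, not part of the repository: the clone URL is inferred from the repository name, the `.venv` directory name is a hypothetical choice, and it assumes that `requirements.txt` installs the `notebook` package. Step 1 (downloading the dataset) and step 7 (opening the notebooks) remain manual.

```python
import os
import subprocess
import sys
from pathlib import Path
from typing import List

# Assumption: URL inferred from the repository name; adjust if it differs.
REPO_URL = "https://github.com/box-key/Subjective-Class-Issue.git"
REPO_DIR = Path("Subjective-Class-Issue")
VENV_DIR = REPO_DIR / ".venv"  # hypothetical location for the virtual environment


def venv_python(venv_dir: Path) -> Path:
    """Return the interpreter path inside a venv (the layout differs on Windows)."""
    if os.name == "nt":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def setup_commands() -> List[List[str]]:
    """Build the commands for steps 2 to 6 without running them."""
    py = str(venv_python(VENV_DIR))
    return [
        # Step 2: clone the repo (step 3, cd, is replaced by explicit paths).
        ["git", "clone", REPO_URL, str(REPO_DIR)],
        # Step 4: create the virtual environment with the standard venv module.
        [sys.executable, "-m", "venv", str(VENV_DIR)],
        # Step 5: install dependencies with the venv's own interpreter.
        [py, "-m", "pip", "install", "-r", str(REPO_DIR / "requirements.txt")],
        # Step 6: start Jupyter Notebook rooted at the repo directory
        # (assumes the notebook package is installed by requirements.txt).
        [py, "-m", "notebook", str(REPO_DIR)],
    ]


def run_setup() -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in setup_commands():
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_setup()` executes the commands in order; calling `setup_commands()` alone only lists them, which is useful for checking the paths before anything is cloned or installed. Using the interpreter inside the virtual environment directly avoids having to activate the environment from within a script.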
