twitter_scraping_labelling

Pulls tweets based on location keywords and creates a labelling application to label them as tourism-relevant or not!

Current status:

  • Pull tweets using the API, based on timeframe and keywords
  • Prefilter tweets: keep only Japanese-language tweets and English tweets that explicitly mention tourist activities, then remove emojis and mentions
  • Load into a database (SQLite or PostgreSQL)
  • Stratified time-based sampling for labelling
  • Working on: the labelling process, followed by a relevance model (in a different repo)
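
The prefiltering and sampling steps listed above could look roughly like the sketch below. It is not the repository's actual code. The tweet dictionary layout (`text`, `lang`, `created_at`), the `TOURISM_KEYWORDS` list, the emoji ranges, and the choice of calendar month as the sampling stratum are all assumptions made for illustration.

```python
import random
import re
from collections import defaultdict
from datetime import datetime

# Hypothetical keyword list; the project's real list is not shown in this README.
TOURISM_KEYWORDS = ("sightseeing", "tour", "travel", "visit", "trip")

# Twitter handles are ASCII letters, digits and underscores.
MENTION_RE = re.compile(r"@[A-Za-z0-9_]+")
# Rough emoji coverage (assumption): common pictograph blocks, flags,
# the variation selector and the zero-width joiner. Not exhaustive.
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF\U00002600-\U000027BF\U0001F1E6-\U0001F1FF\uFE0F\u200D]"
)


def clean_text(text):
    """Remove mentions and emojis, then collapse leftover whitespace."""
    text = MENTION_RE.sub("", text)
    text = EMOJI_RE.sub("", text)
    return " ".join(text.split())


def keep_tweet(tweet):
    """Keep Japanese tweets, and English tweets containing a tourism keyword."""
    if tweet["lang"] == "ja":
        return True
    if tweet["lang"] == "en":
        lowered = tweet["text"].lower()
        # Crude substring match; "tour" also matches "tourist" and "tourism".
        return any(keyword in lowered for keyword in TOURISM_KEYWORDS)
    return False


def prefilter(tweets):
    """Return cleaned copies of the tweets that pass the language/keyword filter."""
    return [{**t, "text": clean_text(t["text"])} for t in tweets if keep_tweet(t)]


def stratified_sample(tweets, per_stratum, seed=0):
    """Draw up to `per_stratum` tweets from each calendar month (assumed stratum)."""
    rng = random.Random(seed)  # fixed seed so the labelling sample is reproducible
    strata = defaultdict(list)
    for t in tweets:
        strata[t["created_at"].strftime("%Y-%m")].append(t)
    sample = []
    for month in sorted(strata):
        bucket = strata[month]
        sample.extend(rng.sample(bucket, min(per_stratum, len(bucket))))
    return sample
```

Sampling a fixed number of tweets per time bucket keeps quiet periods represented in the labelling set instead of letting high-volume periods dominate. A different bucket size (week, day) would only require changing the `strftime` format.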