HEI scraper

Hi guys, it's movie time!

This repo contains the scraper that collects data from the HEI Network website and other sources in order to build a complete On Cinema timeline.

On Cinema At The Cinema item count: 407

Progress

  • On Cinema (show)
  • Decker
  • Automate creating one large combined JSON file in the root of the repo
  • On Cinema (podcast)
  • HEI Network News v1
  • HEI Network News v2: Using the 'previous' button to go further back
  • Updated data structure (see below)
  • Non-scraped content, in the correct structure (for example the screening of Port of Call on YouTube)
  • Automate checking
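
As an illustration of the "one large combined JSON file" step, here is a minimal sketch. It assumes each scraper writes a JSON array of items to its own file in a single directory; the function name `combine_json_files` and the directory layout are hypothetical and not taken from this repo.

```python
import json
from pathlib import Path


def combine_json_files(data_dir: Path, output_path: Path) -> list[dict]:
    """Merge every *.json file in data_dir into one list sorted by date."""
    combined: list[dict] = []
    for path in sorted(data_dir.glob("*.json")):
        if path.resolve() == output_path.resolve():
            continue  # skip a previously generated combined file
        with path.open(encoding="utf-8") as handle:
            # Assumption: each file holds a JSON array of item objects.
            combined.extend(json.load(handle))
    # ISO 8601 timestamps sort chronologically as plain strings.
    combined.sort(key=lambda item: item.get("date_published") or "")
    output_path.write_text(
        json.dumps(combined, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return combined
```

Calling `combine_json_files(Path("data"), Path("combined.json"))` would write the merged list to the repo root; both paths are placeholders.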

Data structure

v1

  {
    "show": "on_cinema",
    "media_type": "episode",
    "collection": "Season 15",
    "title": "'Valiant One' & 'Dog Man'",
    "description": null,
    "date_published": "2025-01-29T00:00:00",
    "url": "https://heinetwork.tv/episode/valiant-one-dog-man/",
    "poster_url": "https://www.heinetwork.tv/wp-content/uploads/2025/01/on_cinema_s15_ep06.png"
  }

v2

  {
    "franchise": "on_cinema", // replacing 'show'
    "media_type": "episode", // Other options: Article, trailer, movie (for Mister America)
    "season_name": "Season 15",
    "season_number": 15, // useful for sorting, in particular the Decker seasons
    "title": "'Valiant One' & 'Dog Man'",
    "date_published": "2025-01-29T00:00:00",
    "published_by": null, // in the case of articles, we'll add the name
    "url": "https://heinetwork.tv/episode/valiant-one-dog-man/",
    "poster_url": "https://www.heinetwork.tv/wp-content/uploads/2025/01/on_cinema_s15_ep06.png",
    "is_bonus": "false", // to easily find if something is a 'bonus' bit of content
    "is_meta": "false" // to easily find if something is meta content, for example the wrap parties
  }
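
To make the difference between the two structures concrete, here is a rough sketch of converting a v1 item to v2. The function name `v1_to_v2` is hypothetical, and so are the rules it applies: it assumes the season number is the first integer in the v1 `collection` string, fills `published_by` with `null`, and defaults both flags to the string `"false"` as shown in the example above. The v1 `description` field has no counterpart in v2, so it is dropped.

```python
import re


def v1_to_v2(item: dict) -> dict:
    """Convert one v1 item to the v2 layout (illustrative only)."""
    collection = item.get("collection")
    # Assumption: the season number is the first integer in e.g. "Season 15".
    match = re.search(r"\d+", collection or "")
    return {
        "franchise": item["show"],  # replaces 'show'
        "media_type": item["media_type"],
        "season_name": collection,  # replaces 'collection'
        "season_number": int(match.group()) if match else None,
        "title": item["title"],
        "date_published": item["date_published"],
        "published_by": None,  # only known for articles
        "url": item["url"],
        "poster_url": item["poster_url"],
        # Kept as strings to match the v2 example; set by hand for bonus/meta items.
        "is_bonus": "false",
        "is_meta": "false",
    }
```

Items without a number in `collection` get `season_number` set to `None`, so they would need manual handling before sorting.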

Acknowledgements

This work wouldn't be possible without the amazing On Cinema Timeline website, which served as both a resource and an inspiration.

Related repos

HEI api