Fix images parsing and add artist album to ChartEntry#80
Fix images parsing and add artist album to ChartEntry#80soubenz wants to merge 7 commits intoguoguo12:masterfrom
Conversation
|
Thanks! Can you rebase your branch? I fixed the tests in master. I'll leave some inline comments. |
billboard.py
Outdated
| weeks, | ||
| rank, | ||
| isNew, | ||
| artistImage=None): |
There was a problem hiding this comment.
For consistency, can you explicitly set artistImage to None in _parseOldStylePage instead of having a default value?
| @@ -65,8 +66,17 @@ class ChartEntry: | |||
| rank: The track's position on the chart, as an int. | |||
| isNew: Whether the track is new to the chart, as a boolean. | |||
There was a problem hiding this comment.
Can you add the new field to the doc comment?
billboard.py
Outdated
| dateElement = soup.select_one("button.date-selector__button.button--link") | ||
| artists = [] | ||
| # extract artists info from page | ||
| pageContent = json.loads(soup.select_one("div#charts")['data-charts']) |
There was a problem hiding this comment.
Does this raise if this fails to load? Can we catch the exception instead?
billboard.py
Outdated
| # TODO: Parse the image | ||
| image = None | ||
| # get image data from artists list | ||
| artistData = list(filter(lambda x: x['name'] == artist, |
There was a problem hiding this comment.
Sounds like artists should be a dict.
There was a problem hiding this comment.
Although, what happens if an artist appears multiple times on a chart with different albums? Does titleImage work correctly?
There was a problem hiding this comment.
Yes, you're right. I'll fix that
billboard.py
Outdated
| # searching for image in 3 different sizes | ||
| if artistImages: | ||
| _artistImage = artistImages['sizes'].get('original') | ||
| if not _artistImage: |
There was a problem hiding this comment.
I think
x = z1
if not x:
x = z2
if not x:
x = z3
is usually written x = z1 or z2 or z3.
There was a problem hiding this comment.
although it wouldn't be good for readability , don't you think ?
also, it is possible to add more later. Since sometimes the image is available only in small sizes
There was a problem hiding this comment.
What about this?
_artistImage = None
for variant in ['original', 'ye-landing-lg-2x', 'ye-landing-med-2x']:
_artistImage = _artistImage or artistImages['sizes'].get(variant)
billboard.py
Outdated
|
|
||
| def _parseNewStylePage(self, soup): | ||
| dateElement = soup.select_one("button.date-selector__button.button--link") | ||
| artists = [] |
There was a problem hiding this comment.
Can you move this whole block into a helper function?
billboard.py
Outdated
| 'ye-landing-med-2x') | ||
| artistImage = _artistImage['Name'] if _artistImage else None | ||
| if titleImages: | ||
| _titleImage = titleImages['sizes'].get('ye-landing-lg-2x') |
There was a problem hiding this comment.
Why do you prefer ye-landing-lg-2x here but original for the artist image?
There was a problem hiding this comment.
for some cases, ye-landing-lg-2x was available when original wasn't, but the approach you suggested above will fix that.
billboard.py
Outdated
| # searching for image in 3 different sizes | ||
| if artistImages: | ||
| _artistImage = artistImages['sizes'].get('original') | ||
| if not _artistImage: |
There was a problem hiding this comment.
What about this?
_artistImage = None
for variant in ['original', 'ye-landing-lg-2x', 'ye-landing-med-2x']:
_artistImage = _artistImage or artistImages['sizes'].get(variant)|
@guoguo12 I did some modifications added more sizes for the images |
Billboard website use Javascript to lazy load images. This caused the issue with parsing the images. I managed to trace the data to an object in the page and create the links.
I also added artist image next to the title/album image.