Open
dixudx
requested changes
Mar 19, 2018
Owner
dixudx
left a comment
Remove the redundant binary file.
tumblr-photo-video-ripper.py
Outdated
try:
    medium_url = self._handle_medium_url(medium_type, post)
    medium_url_bak = medium_url
    medium_url =re.sub(u'[^/]*media.tumblr.com', u'data.tumblr.com', medium_url)
tumblr-photo-video-ripper.py
Outdated
medium_url = self._handle_medium_url(medium_type, post)
medium_url_bak = medium_url
medium_url =re.sub(u'[^/]*media.tumblr.com', u'data.tumblr.com', medium_url)
if (b'_100.' in medium_url):
Owner
I don't like this exhaustive approach. Hard-coding every size suffix is not a good choice.
Why not split the string and replace the size with raw instead?
Owner
Also, this is not the right place for the change. Raw applies only to photos/images.
The _download(**) method is the right place.
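The split-and-replace idea suggested above might look like the following minimal sketch. The helper name to_raw_url and the example URL are hypothetical, and the sketch assumes photo file names end in _<size>.<ext>, which this thread implies but does not confirm. It is written for Python 3 (urllib.parse) and leaves the host untouched.

```python
from urllib.parse import urlsplit, urlunsplit


def to_raw_url(photo_url):
    """Return photo_url with its trailing numeric size replaced by 'raw'.

    Hypothetical helper: assumes the file name looks like
    'tumblr_<id>_<size>.<ext>'. Returns None when it does not.
    """
    parts = urlsplit(photo_url)
    directory, _, filename = parts.path.rpartition("/")
    stem, dot, ext = filename.rpartition(".")
    base, underscore, size = stem.rpartition("_")
    if not (dot and underscore and size.isdigit()):
        # Unexpected shape: let the caller keep the original URL.
        return None
    new_path = "{}/{}_raw.{}".format(directory, base, ext)
    return urlunsplit(parts._replace(path=new_path))


# Example with a made-up URL:
print(to_raw_url("https://68.media.tumblr.com/abc/tumblr_xyz1_1280.jpg"))
```

Because only the last underscore-separated field of the file name is rewritten, no list of known sizes such as _100 needs to be maintained.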
added 2 commits
March 19, 2018 14:33
dixudx
requested changes
Mar 19, 2018
medium_url = self._handle_medium_url(medium_type, post)
if medium_url is not None:
    self._download(medium_type, medium_url, target_folder)
    #print("medium url is %s", medium_url)
    self._download(medium_type, medium_url, target_folder, resp_raw)
elif medium_type == "photo":
    medium_url_bak = medium_url
    medium_url_dot = medium_url.split('.')
Owner
The URL parsing here seems complex and error-prone.
The approach below seems like a better way. WDYT?
def download(self, medium_type, post, target_folder):
    try:
        medium_url = self._handle_medium_url(medium_type, post)
        if medium_url is not None:
            if medium_type == "photo":
                try:
                    # try to download the raw image first
                    medium_url_raw = medium_url.replace("68.media.tumblr.com", "data.tumblr.com")
                    raw_matched = self.hd_photo_regex.match(medium_url_raw)
                    if raw_matched is not None:
                        # swap the numeric size in "tumblr_<id>_<size>" for "raw"
                        replace_raw = raw_matched.groups()[0]
                        replace_raw = replace_raw.replace(raw_matched.groups()[1], "raw")
                        medium_url_raw = medium_url_raw.replace(raw_matched.groups()[0], replace_raw)
                        self._download(medium_type, medium_url_raw, target_folder)
                        return
                except Exception:
                    # raw download failed; fall back to the original URL
                    pass
            self._download(medium_type, medium_url, target_folder)
    except TypeError:
        pass

# can register different regex match rules
def _register_regex_match_rules(self):
    # will iterate all the rules
    # the first matched result will be returned
    self.regex_rules = [video_hd_match(), video_default_match()]
    self.hd_photo_regex = re.compile(r".*(tumblr_\w+_(\d+))", re.IGNORECASE)
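To see what the regex in the suggestion above captures, here is a small standalone sketch that can be run outside the class. It is an illustration under assumptions, not the project's code: the function name candidate_urls and the sample URLs are made up, and it rewrites the size by the match span of group 2 rather than with str.replace, so a digit run that also appears elsewhere in the file name is left alone.

```python
import re

# Same pattern as in the suggestion: group 1 is "tumblr_<id>_<size>",
# group 2 is the numeric size.
HD_PHOTO_REGEX = re.compile(r".*(tumblr_\w+_(\d+))", re.IGNORECASE)


def candidate_urls(medium_url):
    """Return URLs to try in order: raw first (if derivable), then original."""
    candidates = []
    matched = HD_PHOTO_REGEX.match(medium_url)
    if matched is not None:
        # Replace only the captured size, located by its span in the URL.
        start, end = matched.span(2)
        raw_url = medium_url[:start] + "raw" + medium_url[end:]
        raw_url = raw_url.replace("68.media.tumblr.com", "data.tumblr.com")
        candidates.append(raw_url)
    candidates.append(medium_url)
    return candidates


for url in candidate_urls("https://68.media.tumblr.com/abc/tumblr_xyz1_1280.jpg"):
    print(url)
```

A caller could walk the returned list and stop at the first URL that downloads successfully, which keeps the raw-first, original-as-fallback order of the suggested download method.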
Author
|
|
Owner
@liyiecho So just use |
Fixes: #65