Commit c94f09d

Change how to scrape a page of a video
Facebook serves a 'not found' page when you visit a video URL that contains an invalid video ID. To scrape reliably, we need to distinguish this 'not found' page from actual video pages. The previous check also rejected some actual videos (because Facebook changed its HTML format), so this change tries to filter out only the 'not found' pages.
1 parent 78e6453 commit c94f09d

1 file changed (+4, -3 lines changed)

lib/funky/html/page.rb

@@ -4,10 +4,11 @@ module HTML
     class Page
       def get(video_id:)
         body = response_for(video_id).body
-        if body.include? '<meta name="description"'
-          body
-        else
+
+        if body.include? '<title id="pageTitle">Facebook</title>'
           raise ContentNotFound, 'Please double check the ID and try again.'
+        else
+          body
         end
       end
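
The new check can be tried in isolation with a minimal sketch like the one below. It assumes the 'not found' page is recognizable by the bare `<title id="pageTitle">Facebook</title>` tag, as the diff does. The `ensure_video_page` helper, the local `ContentNotFound` definition, and both sample HTML strings are hypothetical stand-ins, not code from the library or real Facebook responses.

```ruby
# Hypothetical stand-in for the error class raised in the diff.
class ContentNotFound < StandardError; end

# Marker string copied from the diff: the title of the generic 'not found' page.
NOT_FOUND_TITLE = '<title id="pageTitle">Facebook</title>'

# Hypothetical helper mirroring the new branch order in Page#get:
# raise for the 'not found' page, otherwise return the body unchanged.
def ensure_video_page(body)
  if body.include?(NOT_FOUND_TITLE)
    raise ContentNotFound, 'Please double check the ID and try again.'
  else
    body
  end
end

# Made-up sample bodies for illustration only.
not_found = '<html><head><title id="pageTitle">Facebook</title></head></html>'
video = '<html><head><title id="pageTitle">Some video | Facebook</title></head></html>'

ensure_video_page(video) # returns the body as-is

begin
  ensure_video_page(not_found)
rescue ContentNotFound => e
  warn e.message
end
```

Matching on the exact title string keeps the check narrow: a page whose title contains anything beyond the bare "Facebook" is treated as real content. The check would need revisiting if Facebook changes that markup again.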
