Commit a624e3e

Support new page nav attribute changes made by SA (#1221)
* Support new page nav attribute changes made by SA

  The original scrapePageDropdown method is still in place as a fallback, but the new data attributes will be used first.

  See: https://forums.somethingawful.com/showthread.php?threadid=3944917&pagenumber=3#post548658773

  - The tests using the updated showthread2.html fixture (testIgnoredPost and testWeirdSizeTags) are using the new data attributes:
    [PageNav] Using data attributes: page 17/18
  - The tests with older fixtures are falling back to the select menu method:
    [PageNav] Using select menu fallback: page X/Y

  Changes Made:

  1. Updated scrapePageDropdown function (AwfulCore/Sources/AwfulCore/Scraping/Helpers.swift:144-169):
     - First attempts to read the new data-current-page and data-total-pages attributes
     - Falls back to the old select menu counting method for backward compatibility
     - Added logging to show which method is being used: [PageNav] Using data attributes or [PageNav] Using select menu fallback
  2. Added new PageNavigationData struct and scrapePageNavigationData function (Helpers.swift:111-142):
     - Captures all four new data attributes: current-page, total-pages, base-url, and per-page
     - Available for future use when you need the base URL or per-page values
  3. Updated test fixture (showthread2.html):
     - Replaced this file with a modern sample (taken from the same thread and roughly the same page)
     - Tests confirm the new attributes are being read correctly

  Test Results:

  - All 60 tests pass successfully
  - Logging shows the new data attributes are being used when present
  - Fallback to select menu works for older fixtures without the attributes

  The implementation ensures the app won't break when the site changes the inner HTML of the pages div, as warned by the admin.

* Removed print statements, removed the scrapePageDropdown() fallback, and now solely use the new scrapePageNavigationData()
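The first bullet of the message describes an attribute-first strategy with a select-menu fallback, which the second bullet then removes. For readers who want to see what that intermediate approach might have looked like, here is a minimal sketch. It is not the code from the commit: `Element`, `PageNavSource`, and `pageNav(from:)` are hypothetical names, and `Element` is a simplified stand-in for the HTML node type the real scraper uses.

```swift
// Hypothetical stand-in for a parsed HTML element (not the real HTMLNode type).
struct Element {
    var name: String
    var attributes: [String: String] = [:]
    var children: [Element] = []
}

// Records which strategy produced the result, in place of the log lines
// mentioned in the commit message.
enum PageNavSource { case dataAttributes, selectFallback }

/// Reads the page position from a `div.pages` element, preferring the
/// data attributes and falling back to the <select> menu.
func pageNav(from pagesDiv: Element) -> (page: Int, count: Int, source: PageNavSource) {
    // Preferred path: both attributes must be present and numeric.
    if let page = pagesDiv.attributes["data-current-page"].flatMap({ Int($0) }),
       let count = pagesDiv.attributes["data-total-pages"].flatMap({ Int($0) }) {
        return (page, count, .dataAttributes)
    }

    // Fallback path: count the children of the <select> and read the selected
    // option's value, defaulting to page 1 of 1 as the removed function did.
    let select = pagesDiv.children.first(where: { $0.name == "select" })
    let count = select.map { $0.children.count } ?? 1
    let page = select?.children
        .first(where: { $0.attributes["selected"] != nil })
        .flatMap { $0.attributes["value"] }
        .flatMap { Int($0) } ?? 1
    return (page, count, .selectFallback)
}
```

Returning the source alongside the numbers is only one way to make the chosen path observable; the original change used log lines for the same purpose before they were removed.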
1 parent 74f2539 commit a624e3e

4 files changed: +2076 -2012 lines changed


AwfulCore/Sources/AwfulCore/Scraping/Helpers.swift

Lines changed: 25 additions & 10 deletions
@@ -108,20 +108,35 @@ func scrapeCustomTitle(_ html: HTMLNode) -> RawHTML? {
 }
 
 
-func scrapePageDropdown(_ node: HTMLNode) -> (pageNumber: Int?, pageCount: Int?) {
+struct PageNavigationData {
+    let currentPage: Int
+    let totalPages: Int
+    let baseURL: String?
+    let perPage: Int?
+}
+
+func scrapePageNavigationData(_ node: HTMLNode) -> PageNavigationData? {
     guard let pageDiv = node.firstNode(matchingSelector: "div.pages") else {
-        return (nil, nil)
+        return nil
     }
-    let pageSelect = pageDiv.firstNode(matchingSelector: "select")
-    let pageCount = pageSelect?.childElementNodes.count ?? 1
 
-    let pageNumber = pageSelect
-        .flatMap { $0.firstNode(matchingSelector: "option[selected]") }
-        .flatMap { $0["value"] }
-        .flatMap { Int($0) }
-        ?? 1
+    if let currentPageStr = pageDiv["data-current-page"],
+       let totalPagesStr = pageDiv["data-total-pages"],
+       let currentPage = Int(currentPageStr),
+       let totalPages = Int(totalPagesStr) {
+
+        let baseURL = pageDiv["data-base-url"]
+        let perPage = pageDiv["data-per-page"].flatMap { Int($0) }
+
+        return PageNavigationData(
+            currentPage: currentPage,
+            totalPages: totalPages,
+            baseURL: baseURL,
+            perPage: perPage
+        )
+    }
 
-    return (pageNumber: pageNumber, pageCount: pageCount)
+    return nil
 }
 
 enum ForumGroupID: String {
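
To make the parsing rules in the hunk above easier to check in isolation, the following sketch applies the same logic to a plain attribute dictionary instead of an HTML node. The helper `pageNavigationData(fromAttributes:)` and the sample attribute values are hypothetical and exist only for illustration. The struct mirrors the one added in the diff, with `Equatable` added here for convenience.

```swift
// Mirrors the struct added in the diff; Equatable is an addition for this sketch.
struct PageNavigationData: Equatable {
    let currentPage: Int
    let totalPages: Int
    let baseURL: String?
    let perPage: Int?
}

/// Hypothetical helper: same rules as the diff, but reading from a dictionary.
/// The two page attributes are required and must be numeric; the other two
/// are optional and never cause the whole result to be nil.
func pageNavigationData(fromAttributes attrs: [String: String]) -> PageNavigationData? {
    guard let currentPage = attrs["data-current-page"].flatMap({ Int($0) }),
          let totalPages = attrs["data-total-pages"].flatMap({ Int($0) })
    else { return nil }

    return PageNavigationData(
        currentPage: currentPage,
        totalPages: totalPages,
        baseURL: attrs["data-base-url"],
        perPage: attrs["data-per-page"].flatMap { Int($0) }
    )
}

// Sample values are made up; only the attribute names come from the commit.
let complete = pageNavigationData(fromAttributes: [
    "data-current-page": "17",
    "data-total-pages": "18",
    "data-base-url": "showthread.php?threadid=1",
    "data-per-page": "40",
])

// A non-numeric optional attribute is dropped, not treated as a failure.
let badPerPage = pageNavigationData(fromAttributes: [
    "data-current-page": "17",
    "data-total-pages": "18",
    "data-per-page": "many",
])

// A non-numeric required attribute yields nil for the whole result.
let badCurrentPage = pageNavigationData(fromAttributes: [
    "data-current-page": "seventeen",
    "data-total-pages": "18",
])
```

One detail worth keeping in mind: Swift's `Int(_:)` initializer for strings rejects surrounding whitespace, so an attribute value such as `" 17"` would also produce nil under these rules.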

AwfulCore/Sources/AwfulCore/Scraping/PostsPageScrapeResult.swift

Lines changed: 3 additions & 1 deletion
@@ -34,7 +34,9 @@ public struct PostsPageScrapeResult: ScrapeResult {
 
         isSingleUserFilterEnabled = body.firstNode(matchingSelector: "table.post a.user_jump[title *= 'Remove']") != nil
 
-        (pageNumber: pageNumber, pageCount: pageCount) = scrapePageDropdown(body)
+        let pageNavData = scrapePageNavigationData(body)
+        pageNumber = pageNavData?.currentPage
+        pageCount = pageNavData?.totalPages
 
         let posts = try body
             .nodes(matchingSelector: "table.post")
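
Judging from the diff alone, the call-site change also shifts what "unknown" looks like: the removed function defaulted to page 1 of 1 when `div.pages` existed without a select menu, whereas the new optional chaining leaves both values nil whenever the data attributes are missing or non-numeric. The sketch below shows how a caller might account for that. `PagePosition` and `hasNextPage` are hypothetical names, not part of the commit, and the struct here is a trimmed copy of the one in the diff.

```swift
// Trimmed copy of the struct from the diff, keeping only the fields used here.
struct PageNavigationData {
    let currentPage: Int
    let totalPages: Int
}

// Hypothetical holder mirroring the two optional properties assigned at the call site.
struct PagePosition {
    var pageNumber: Int?
    var pageCount: Int?

    init(_ nav: PageNavigationData?) {
        // Same optional chaining as the diff: a nil result leaves both values nil.
        pageNumber = nav?.currentPage
        pageCount = nav?.totalPages
    }

    /// Hypothetical convenience: an unknown position is treated as having no next page.
    var hasNextPage: Bool {
        guard let number = pageNumber, let count = pageCount else { return false }
        return number < count
    }
}

let known = PagePosition(PageNavigationData(currentPage: 17, totalPages: 18))
let unknown = PagePosition(nil)
```

Whether an unknown position should behave like a single page or be surfaced as an error is a design decision for the caller; the sketch picks the former only to keep the example short.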

AwfulCore/Sources/AwfulCore/Scraping/ThreadListScrapeResult.swift

Lines changed: 3 additions & 1 deletion
@@ -80,7 +80,9 @@ public struct ThreadListScrapeResult: ScrapeResult {
 
         isBookmarkedThreadsPage = body.firstNode(matchingSelector: "form[name='bookmarks']") != nil
 
-        (pageNumber: pageNumber, pageCount: pageCount) = scrapePageDropdown(body)
+        let pageNavData = scrapePageNavigationData(body)
+        pageNumber = pageNavData?.currentPage
+        pageCount = pageNavData?.totalPages
     }
 }
 