Commit 7aafd6c

Ensure image captions and authors are extracted correctly
1 parent 86e7dd2, commit 7aafd6c

1 file changed, +2 -0 lines changed

src/fundus/publishers/de/der_freitag.py

Lines changed: 2 additions & 0 deletions
@@ -52,4 +52,6 @@ def images(self) -> List[Image]:
             upper_boundary_selector=CSSSelector("header.bc-article-intro"),
             lower_boundary_selector=CSSSelector("span.freitag-article-end"),
             image_selector=CSSSelector("figure img,div[role='figure'] img"),
+            caption_selector=XPath("./ancestor::figure//figcaption//span[@class='bo-image__caption__desc']"),
+            author_selector=XPath("./ancestor::figure//figcaption//span[@class='bo-image__caption__credit']"),
         )
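
The two added selectors are relative XPath expressions. They appear to be evaluated from each matched `img` node: they climb to an enclosing `figure`, then descend into its `figcaption` to pick the description or credit `span`. The sketch below illustrates that lookup with the Python standard library only. Because `xml.etree.ElementTree` does not support the `ancestor::` axis, the upward step is emulated with a parent map; an lxml-based evaluator can run the original expressions directly. The sample markup and the helper names (`SAMPLE`, `ancestor_figure`, `caption_and_credit`) are hypothetical and only modeled on the class names in the diff; the real page structure may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup modeled on the class names used in the diff.
SAMPLE = """
<article>
  <figure>
    <img src="a.jpg"/>
    <figcaption>
      <span class="bo-image__caption__desc">Example caption</span>
      <span class="bo-image__caption__credit">Foto: Example Agency</span>
    </figcaption>
  </figure>
  <div role="figure"><img src="b.jpg"/></div>
</article>
"""

# Downward part of the two selectors, in ElementTree's XPath subset.
CAPTION_PATH = ".//figcaption//span[@class='bo-image__caption__desc']"
CREDIT_PATH = ".//figcaption//span[@class='bo-image__caption__credit']"


def ancestor_figure(node, parents):
    """Walk up from node to the closest enclosing <figure>, or None."""
    current = parents.get(node)
    while current is not None and current.tag != "figure":
        current = parents.get(current)
    return current


def text_of(element):
    """Return the stripped text of an element, or None if it is missing."""
    if element is None or not element.text:
        return None
    return element.text.strip()


def caption_and_credit(img, parents):
    """Emulate './ancestor::figure//figcaption//span[...]' for one image."""
    figure = ancestor_figure(img, parents)
    if figure is None:
        return None, None
    return text_of(figure.find(CAPTION_PATH)), text_of(figure.find(CREDIT_PATH))


root = ET.fromstring(SAMPLE)
# ElementTree nodes have no parent pointer, so build a child -> parent map.
parents = {child: parent for parent in root.iter() for child in parent}
results = [caption_and_credit(img, parents) for img in root.iter("img")]
print(results)
```

In this sample the second image sits in a `div[role='figure']` with no `figure` ancestor, so both lookups come back empty for it. If the site wraps such images the same way, the new selectors would presumably find a caption and credit only for images inside a real `figure` element.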