In this function, pagination stops as soon as a page contains no new links. We assumed that an empty response meant all links had been fetched, but a page made up entirely of duplicates triggers the same stop condition. When that happens, the loop exits before reaching the later pages, so the remaining links get re-fetched and duplicated again and again on subsequent runs.
starwarden/starwarden/linkwarden_api.py
Lines 21 to 50 in 7ccb653
```python
while True:
    try:
        logger.debug(f"Fetching links from cursor {cursor} for collection {collection_id}")
        response = requests.get(
            url,
            params={"collectionId": collection_id, "cursor": cursor, "sort": 1},
            headers=headers,
            timeout=30,
        )
        response.raise_for_status()
        data = response.json()
        links = data.get("response", [])
        logger.debug(f"Fetched {len(links)} links from cursor {cursor}")
        new_links = [link["url"] for link in links if link["url"] not in seen_links]
        if not new_links:
            logger.info(f"No new links found from cursor {cursor}. Stopping pagination.")
            break
        seen_links.update(new_links)
        yield from new_links
        if not links:
            break
        cursor = links[-1].get("id")
    except requests.RequestException as e:
        logger.error(f"Error fetching links from cursor {cursor}: {str(e)}")
        if hasattr(e, "response") and e.response is not None:
            logger.error(f"Response status code: {e.response.status_code}")
            logger.error(f"Response content: {e.response.text}")
```
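One possible fix is to treat only an empty response as the end of pagination, and let a duplicate-only page simply advance the cursor. A minimal sketch of that idea (hedged: `fetch_page` is a hypothetical stand-in for the HTTP call above, not the project's actual API):

```python
def iter_unique_links(fetch_page, collection_id):
    """Yield each URL once, paginating until the server returns an empty page.

    `fetch_page(collection_id, cursor)` is assumed to return the "response"
    list from the API, i.e. dicts with at least "id" and "url" keys.
    """
    seen_links = set()
    cursor = 0
    while True:
        links = fetch_page(collection_id, cursor)
        if not links:
            break  # empty page: all links have been fetched
        for link in links:
            url = link["url"]
            if url not in seen_links:
                seen_links.add(url)
                yield url
        # Advance the cursor unconditionally, so a page consisting
        # entirely of duplicates no longer ends pagination early.
        cursor = links[-1]["id"]
```

The key difference from the code above is that the `if not new_links: break` early exit is gone; termination depends only on the server's response being empty.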