Commit 6e26dae

properly track all urls
Signed-off-by: John Seekins <[email protected]>
1 parent c20f803 commit 6e26dae

1 file changed, +7 -0 lines changed

scraper.py

Lines changed: 7 additions & 0 deletions
@@ -289,7 +289,14 @@ def scrape_facilities(self):
                 self.facilities_data["facilities"][full_address] = self._update_facility(
                     self.facilities_data["facilities"][full_address], facility
                 )
+                # update to the frequently nicer address from ice.gov
                 self.facilities_data["facilities"][full_address]["address"] = addr
+                # add scraped urls
+                for url in facility["source_urls"]:
+                    # no dupes
+                    if url in self.facilities_data["facilities"][full_address]["source_urls"]:
+                        continue
+                    self.facilities_data["facilities"][full_address]["source_urls"].append(url)
                 # this is likely to produce _some_ duplicates, but it's a reasonable starting place
             else:
                 self.facilities_data["facilities"][facility["name"]] = facility
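
The added loop merges the scraped URLs into the stored record while skipping any that are already present. A minimal standalone sketch of that merge is below. The helper name `merge_source_urls`, the example URLs, and the `.get(..., [])` fallback for a missing `source_urls` key are assumptions for illustration and are not part of this commit.

```python
def merge_source_urls(existing, scraped):
    """Return the existing URLs followed by any scraped URLs not already present.

    Hypothetical helper: order is preserved and duplicates are dropped,
    mirroring the membership check in the loop from the diff.
    """
    merged = list(existing)
    seen = set(merged)  # set lookup avoids rescanning the list for every URL
    for url in scraped:
        if url in seen:
            continue  # no dupes
        seen.add(url)
        merged.append(url)
    return merged


# Illustrative records; real entries live under self.facilities_data["facilities"].
stored = {"source_urls": ["https://example.com/a"]}
scraped = {"source_urls": ["https://example.com/a", "https://example.com/b"]}

# Assumption: fall back to an empty list if either record lacks "source_urls".
stored["source_urls"] = merge_source_urls(
    stored.get("source_urls", []), scraped.get("source_urls", [])
)
print(stored["source_urls"])  # ['https://example.com/a', 'https://example.com/b']
```

The result matches what the in-place loop produces for the same inputs. The set only matters if the URL lists grow large, and the fallback only matters if some stored facilities predate the `source_urls` field.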
