_ICE Detention Facilities Data Scraper and Enricher_, a Python script managed by the [Open Security Mapping Project](https://github.com/Open-Security-Mapping-Project).

In short, this will help identify the online profile of each ICE detention facility. Please see the
[project home page](https://github.com/Open-Security-Mapping-Project) for more about mapping these facilities and other
detailed info sources.

This script scrapes ICE detention facility data from ICE.gov and enriches it
with information from [Wikipedia](https://en.wikipedia.org), [Wikidata](https://wikidata.org), and
[OpenStreetMap](https://openstreetmap.org).

The main purpose right now is to identify whether the detention facilities have data on Wikipedia, Wikidata and
OpenStreetMap, which will help with documenting the facilities appropriately. As these entries get fixed up, you should
be able to see your CSV results change almost immediately.
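
To track how coverage changes between runs, a small helper like the one below could count how many rows in the results CSV have a value for each source. This is only a sketch: the column names `wikipedia_url`, `wikidata_url` and `osm_url` are hypothetical placeholders, not the script's actual headers, so check the header row of your own CSV first.

```python
import csv
import io

# Hypothetical column names (assumption); replace with the real CSV headers.
SOURCE_COLUMNS = ["wikipedia_url", "wikidata_url", "osm_url"]


def coverage(csv_file, columns=SOURCE_COLUMNS):
    """Return (total_rows, {column: rows_with_a_non_empty_value})."""
    counts = {col: 0 for col in columns}
    total = 0
    for row in csv.DictReader(csv_file):
        total += 1
        for col in columns:
            # Missing or blank cells count as "no data for this source".
            if (row.get(col) or "").strip():
                counts[col] += 1
    return total, counts


# Made-up sample rows standing in for a real results file.
sample = io.StringIO(
    "name,wikipedia_url,wikidata_url,osm_url\n"
    "Facility A,https://en.wikipedia.org/wiki/Example,,https://www.openstreetmap.org/way/1\n"
    "Facility B,,,\n"
)
total, counts = coverage(sample)
```

With a real file, pass `open("ice_detention_facilities_enriched.csv", newline="")` instead of the sample.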

You can also use `--load-existing` to leverage an existing scrape of the data from ICE.gov. This is stored in
`data_loader.py` and includes the official current addresses of facilities.
(Note that ICE has been renaming known "detention center" sites to "processing center", and so on.)

It also shows the ICE "field office" managing each detention facility.

In the OpenStreetMap (OSM) CSV results, if the URL includes a "way", the script has probably identified the correctly
tagged polygon. If you visit that URL you should see the courthouse or "prison grounds" way / area info. (This info can
always be improved, but at least it exists.)
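
The "way" check described above can be sketched as a small helper for filtering the results. It assumes the OSM URLs follow the usual `https://www.openstreetmap.org/<type>/<id>` pattern; `osm_element_type` is a hypothetical name and is not part of the script.

```python
from urllib.parse import urlparse

OSM_ELEMENT_TYPES = {"node", "way", "relation"}


def osm_element_type(url):
    """Return 'node', 'way' or 'relation' if the URL path names one, else None."""
    for part in urlparse(url).path.split("/"):
        if part in OSM_ELEMENT_TYPES:
            return part
    return None


def is_way(url):
    """True when the URL points at an OSM way, i.e. probably a tagged polygon."""
    return osm_element_type(url) == "way"
```

A result whose URL is a `node` or a plain search link is a good candidate for manual review in OSM.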

In the Wikipedia results, if the script can't find the page directly, the result will tend to be the first hit on the
list of suggested pages.
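
One way this kind of fallback could work is sketched below. It assumes the MediaWiki `opensearch` API, whose JSON response has the shape `[query, titles, descriptions, urls]`. Whether the script uses this endpoint is an assumption, and `opensearch_url` and `pick_page` are hypothetical names used only for illustration.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def opensearch_url(name, limit=5):
    """Build an opensearch query URL for a facility name."""
    params = {"action": "opensearch", "search": name, "limit": limit, "format": "json"}
    return API + "?" + urlencode(params)


def pick_page(name, response):
    """Pick a page URL from a decoded opensearch response.

    Prefers a title that matches the name exactly (ignoring case);
    otherwise falls back to the first suggestion.
    """
    titles, urls = response[1], response[3]
    for title, url in zip(titles, urls):
        if title.casefold() == name.casefold():
            return url, "direct"
    if urls:
        return urls[0], "first_suggestion"
    return None, "not_found"
```

Returning the match type alongside the URL makes it easy to flag "first suggestion" rows in the CSV, since those are the ones most likely to point at the wrong page.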

The script is MIT licensed; please feel free to fork it and/or submit patches.

The script should be compliant with these websites' rate limiting for queries.
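
For anyone extending the script with new queries, a minimal client-side limiter might look like the sketch below. It is not the script's own implementation: `MinIntervalLimiter` is a hypothetical name, and the one-second interval is an illustrative value, so check each site's usage policy for the real limit.

```python
import time


class MinIntervalLimiter:
    """Ensure at least `min_interval` seconds pass between consecutive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Block until the next request is allowed, then record the time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Illustrative value only: at most one request per second.
limiter = MinIntervalLimiter(1.0)
```

Call `limiter.wait()` immediately before each outgoing request to a given site; using one limiter per site keeps a slow site from throttling the others.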

At this point in development you probably want "enable all debugging" to see the results below.

## Usage:

Run the script; by default it will put a CSV file called `ice_detention_facilities_enriched.csv` in the same
directory.

```
python main.py --scrape           # Scrape fresh data from ICE website
python main.py --enrich           # Enrich existing data with external sources
python main.py --scrape --enrich  # Do both operations
```