Commit b1bf3a0

updating README.md and the enrichers and ice_scrapers README, also some typos on US Marshals Service
1 parent a03d96b commit b1bf3a0

4 files changed, +32 -17 lines changed

README.md

Lines changed: 3 additions & 2 deletions
@@ -110,12 +110,10 @@ Another command for installing mise in your session can also work (in bash):
 in hopes of finding similarly named pages but this is too aggressive, and it veers way off. (That is, it's looking for places
 that have simpler names, like the county name instead of `county + detention center`). Use the debug mode to see what
 it is doing.
-* ICE scraping is not robustly tested. The image URL extraction needs some work. (should be able to get the detention center image URLs.)
 * The user-agent for running ice.gov scrape web requests calls itself `'User-Agent': 'ICE-Facilities-Research/1.0 (Educational Research Purpose)'`.
 You can change this in `utils.py`.
 * It tells some pretty inaccurate percentages in the final summary - a lot of false positives, the Wikipedia debug percent
 seems wrong.
-* The remote query rate limiting is (I think) done in series but would go faster with parallel/async processing.
 * This is only targeted at English (EN) Wikipedia currently, but multi-lingual page checks would help a wider audience.

 ## Contributing & Code Standards
@@ -128,6 +126,9 @@ Pull requests and reviews are welcome on the main repo. For checking type safety
 uv run mypy .
 ```

+Please see the [ice_scrapers README.md](ice_scrapers/README.md) and [enrichers README.md](enrichers/README.md)
+for more details about the facilities scrapers and how to create new enrichers for new data sources.
+
 ## Credit

 Original version by Dan Feidt ([@HongPong](https://github.com/HongPong)), with assistance from various AI gizmos. (My

enrichers/README.md

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,8 @@ These functions let us collect data about facilities from additional sources.

 ## Enrichment class

-The base class we can build enrichment tools from. Largely ensures some consistent in functionality between enrichment tools.
+The base class we can build enrichment tools from. Largely ensures some consistent
+in functionality between enrichment tools.

 ### Available functions

ice_scrapers/README.md

Lines changed: 20 additions & 7 deletions
@@ -1,25 +1,38 @@
 # ICE Facility scrapers

-These files maintain the code to collect (and collate) ICE facility data from a number of sources.
+----
+
+These files maintain the code to collect (and collate) ICE facility data
+from a number of sources.

 ## utils.py

-Contains most of our collating functions and shared functions that scrapers may need.
+Contains most of our collating functions and shared functions that scrapers
+may need.

 ## __init__.py

-Contains some static objects and import declarations (so we can `from ice_scrapers import` successfully)...
+Contains some static objects and import declarations (so we can `from ice_scrapers import`
+successfully)...

 ## spreadsheet_load.py

-ICE is required by law to produce regular custody data. We can pull that data from here `https://www.ice.gov/detain/detention-management`. Because this spreadsheet is more "complete" than other sources we've found, we use it as our base scrape.
+ICE is required by law to produce regular custody data. We can pull that data from
+here `https://www.ice.gov/detain/detention-management`. Because this spreadsheet is
+more "complete" than other sources we've found, we use it as our base scrape.

 ## facilities_scraper.py

-Pulls information about ICE detention facilities from `https://www.ice.gov/detention-facilities`. This can add additional (or corrected) data about facilities locations, contact information, and provides facility images.
+Pulls information about ICE detention facilities from
+`https://www.ice.gov/detention-facilities`. This can add additional (or corrected)
+data about facilities locations, contact information, and provides facility images.

 ## field_offices.py

-Collects additional data about ICE/DHS field offices from `https://www.ice.gov/contact/field-offices`. Largely basic areas of responsibility and contact info for the field office.
+Collects additional data about ICE/DHS field offices from
+`https://www.ice.gov/contact/field-offices`. Largely basic areas of responsibility
+and contact info for the field office.

-> The field-offices page shows information about a number of different offices. As we are largely focused on detention, ERO (Eforcement and Removal Operations) centers are the most interesting.
+> The field-offices page shows information about a number of different offices. As we
+> are largely focused on detention, ERO (Enforcement and Removal Operations) centers
+> are the most interesting.

ice_scrapers/__init__.py

Lines changed: 7 additions & 7 deletions
@@ -44,11 +44,11 @@
 },
 "DIGSA": {
 "expanded_name": "Dedicated Intergovernmental Service Agreement",
-"description": "A publicly-owned facility operated by state/local government(s), or private contractors, in which ICE contracts to use all bed space via a Dedicated Intergovernmental Service Agreement; or facilities used by ICE pursuant to Inter-governmental Service Agreements, which house only ICE detainees – typically these are operated by private contractors pursuant to their agreements with local governments.",
+"description": "A publicly-owned facility operated by state/local government(s), or private contractors, in which ICE contracts to use all bed space via a Dedicated Intergovernmental Service Agreement; or facilities used by ICE pursuant to Intergovernmental Service Agreements, which house only ICE detainees – typically these are operated by private contractors pursuant to their agreements with local governments.",
 },
 "IGSA": {
 "expanded_name": "Intergovernmental Service Agreement",
-"description": "A publicly-owned facility operated by state/local government(s), or private contractors, in which ICE contracts for bed space via an Intergovernmental Service Agreement; or local jails used by ICE pursuant to Inter-governmental Service Agreements, which house both ICE and non-ICE detainees, typically county prisoners awaiting trial or serving short sentences, but sometimes also USMS prisoners.",
+"description": "A publicly-owned facility operated by state/local government(s), or private contractors, in which ICE contracts for bed space via an Intergovernmental Service Agreement; or local jails used by ICE pursuant to Intergovernmental Service Agreements, which house both ICE and non-ICE detainees, typically county prisoners awaiting trial or serving short sentences, but sometimes also USMS prisoners.",
 },
 "SPC": {
 "expanded_name": "Service Processing Center",
@@ -60,15 +60,15 @@
 },
 # two keys for the same thing as it isn't consistently defined
 "USMSIGA": {
-"expanded_name": "United States Marshal Service Intergovernmental Agreement",
-"description": "A USMS Intergovernmental Agreement in which ICE agrees to utilize an already established US Marshal Service contract.",
+"expanded_name": "United States Marshals Service Intergovernmental Agreement",
+"description": "A USMS Intergovernmental Agreement in which ICE agrees to utilize an already established US Marshals Service contract.",
 },
 "USMS IGA": {
-"expanded_name": "United States Marshal Service Intergovernmental Agreement",
-"description": "A USMS Intergovernmental Agreement in which ICE agrees to utilize an already established US Marshal Service contract.",
+"expanded_name": "United States Marshals Service Intergovernmental Agreement",
+"description": "A USMS Intergovernmental Agreement in which ICE agrees to utilize an already established US Marshals Service contract.",
 },
 "USMS CDF": {
-"expanded_name": "United States Marshal Service Contract Detention Facility",
+"expanded_name": "United States Marshals Service Contract Detention Facility",
 "description": "Name derived from listing at https://www.vera.org/ice-detention-trends",
 },
 "CDF": {
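
Note on the duplicated keys: the comment in this hunk says the same facility type is stored under two keys (`USMSIGA` and `USMS IGA`) because the source data is not consistent. One possible alternative, sketched here in Python, is to normalize the type code before the lookup so that only one entry has to be kept in sync. `FACILITY_TYPES`, `normalize_type_code`, and `lookup_facility_type` are hypothetical names chosen for illustration, not names taken from the repository, and the table holds only the one entry visible in the diff.

```python
from typing import Optional

# Hypothetical stand-in for the lookup table in ice_scrapers/__init__.py.
# Only the entry shown in the diff is reproduced; the real variable name
# and the remaining entries are not known from this commit.
FACILITY_TYPES = {
    "USMSIGA": {
        "expanded_name": "United States Marshals Service Intergovernmental Agreement",
        "description": (
            "A USMS Intergovernmental Agreement in which ICE agrees to utilize "
            "an already established US Marshals Service contract."
        ),
    },
}


def normalize_type_code(code: str) -> str:
    """Drop all whitespace and upper-case the code: 'usms iga' -> 'USMSIGA'."""
    return "".join(code.split()).upper()


def lookup_facility_type(code: str, table: dict = FACILITY_TYPES) -> Optional[dict]:
    """Return the entry whose normalized key matches the normalized code."""
    wanted = normalize_type_code(code)
    for key, entry in table.items():
        if normalize_type_code(key) == wanted:
            return entry
    return None
```

With a helper like this, `lookup_facility_type("USMS IGA")` and `lookup_facility_type("USMSIGA")` would resolve to the same entry, so a spelling fix such as the "Marshals" correction in this commit would only need to be made once.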
