EPA Transportation and Air Quality (TAQ)
- Agency: Environmental Protection Agency
- Agency Division: EPA Transportation and Air Quality (TAQ)
- Data Type: Transport and Air Quality
- Data Format: Various
I have mined the metadata for the EPA's Transportation and Air Quality (TAQ) documents and hosted direct download links to the documents in my repository. I need help mining the documents themselves, as I do not have the storage space to download them.
Downloading PDFs
Run the following command to download files in bulk, replacing each placeholder with the appropriate value:
awk 'FNR>=[Starting_Line_Number] && FNR<=[Ending_Line_Number]' [Links_Location] | tr -d '\r' | while read -r link; do curl --retry 10 -OL "$link"; done
- [Starting_Line_Number]: the line number of the first link to download
- [Ending_Line_Number]: the line number of the last link to download
- [Links_Location]: the path to the Links.txt file downloaded from the repository
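The same steps can be wrapped in a small script so that the range and file path are passed as arguments instead of edited into the command. This is a minimal sketch, not the author's tooling: the function names `links_in_range` and `download_range` are hypothetical, and it assumes Links.txt holds one URL per line, possibly with Windows line endings.

```shell
#!/usr/bin/env bash
# Sketch only: these function names are hypothetical, not part of the repository.

# Print the links on lines START..END of FILE ($1, $2, $3),
# with carriage returns stripped and blank lines skipped.
links_in_range() {
  awk -v s="$1" -v e="$2" 'FNR >= s && FNR <= e' "$3" | tr -d '\r' | awk 'NF'
}

# Download every link in the range into the current directory.
download_range() {
  links_in_range "$1" "$2" "$3" | while IFS= read -r link; do
    # -O keeps the remote file name, -L follows redirects,
    # --retry 10 retries transient failures up to ten times.
    curl --retry 10 -OL "$link"
  done
}
```

With the functions loaded, `download_range 1 500 Links.txt` would fetch the first 500 links. Quoting `"$link"` keeps URLs containing characters such as `?` or `&` from being altered by the shell.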
Download Information
| Property | Value |
|---|---|
| Number of links/documents | 25690 |
| Estimated total file size | 11 GB |
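Since the download is meant to be shared among helpers, one way to divide the 25690 links is into fixed-size line ranges that each person can claim. The sketch below is an illustration only: the function name `print_batches` and the batch size of 1000 are assumptions, not something the repository prescribes.

```shell
# Print "start end" line ranges that cover TOTAL links in batches of SIZE.
# The body runs in a subshell so its variables do not leak into the caller.
print_batches() (
  total="$1"
  size="$2"
  start=1
  while [ "$start" -le "$total" ]; do
    end=$((start + size - 1))
    # The last batch may be shorter than SIZE.
    if [ "$end" -gt "$total" ]; then
      end="$total"
    fi
    echo "$start $end"
    start=$((end + 1))
  done
)
```

Running `print_batches 25690 1000` lists 26 ranges, each usable as the starting and ending line numbers in the download command. If file sizes were roughly uniform, the estimated 11 GB total would work out to a little over 0.4 GB per 1000 links, though individual batches may differ considerably.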