Commit 30e252f

Strip trailing slash from user input URL. Added count to success message. Added color to output.
1 parent 94421b1 commit 30e252f

2 files changed: +35 additions, -15 deletions


README.md

Lines changed: 3 additions & 3 deletions
```diff
@@ -2,7 +2,7 @@
 Bash script to spider a site, follow links, and fetch urls -- with some filtering. A list of URLs will be generated and saved to a text file.
 
 [![GitHub release](https://img.shields.io/github/release/adamdehaven/fetchurls.svg?maxAge=3600)](https://github.com/adamdehaven/fetchurls/archive/master.zip)
-[![GitHub commits](https://img.shields.io/github/commits-since/adamdehaven/fetchurls/v1.1.0.svg?maxAge=3600)](https://github.com/adamdehaven/fetchurls/compare/v1.1.0...master)
+[![GitHub commits](https://img.shields.io/github/commits-since/adamdehaven/fetchurls/v1.1.1.svg?maxAge=3600)](https://github.com/adamdehaven/fetchurls/compare/v1.1.1...master)
 [![GitHub issues](https://img.shields.io/github/issues/adamdehaven/fetchurls.svg?maxAge=3600)](https://github.com/adamdehaven/fetchurls/issues)
 [![license](https://img.shields.io/github/license/adamdehaven/fetchurls.svg?maxAge=3600)](https://raw.githubusercontent.com/adamdehaven/fetchurls/master/LICENSE)
 
@@ -57,7 +57,7 @@ The script will crawl the site and compile a list of valid URLs into a text file
 
 ## Extra Info
 
-* To change the default file output location, edit line #21. **Default**: `~/Desktop`
+* To change the default file output location, edit line #7. **Default**: `~/Desktop`
 
 * Ensure that you enter the correct protocol and subdomain for the URL or the outputted file may be empty or incomplete. For example, entering the incorrect, HTTP, protocol for [https://adamdehaven.com](https://adamdehaven.com) generates an empty file. Entering the proper protocol, HTTPS, allows the script to successfully run.
 
@@ -84,4 +84,4 @@ The script will crawl the site and compile a list of valid URLs into a text file
 * /wp-json/
 * xmlrpc
 
-* To change or edit the regular expressions that filter out some pages, directories, and file types, you may edit lines #27 through #36. **Caution**: If you're not familiar with grep and regular expressions, you can easily break the script.
+* To change or edit the regular expressions that filter out some pages, directories, and file types, you may edit lines #35 through #44. **Caution**: If you're not familiar with grep and regular expressions, you can easily break the script.
```
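The commit's headline change strips a trailing slash from the user-supplied URL using bash's `${var%pattern}` suffix-removal expansion. A minimal sketch (the sample URLs are illustrative):

```shell
#!/usr/bin/env bash
# ${var%/} removes a single trailing "/" if present; if the value has no
# trailing slash it is returned unchanged, so the strip is safe to apply
# unconditionally.
url="https://adamdehaven.com/"
echo "${url%/}"    # → https://adamdehaven.com

bare="https://adamdehaven.com"
echo "${bare%/}"   # → https://adamdehaven.com (already bare, unchanged)
```

This normalization keeps wget and the later filename derivation from seeing two spellings of the same domain.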

fetchurls.sh

Lines changed: 32 additions & 12 deletions
```diff
@@ -3,23 +3,31 @@
 # Ensure you have wget installed and added to environment variable PATH
 # Example source: https://eternallybored.org/misc/wget/
 
+# ----------- SET DEFAULT SAVE LOCATION -----------
+savelocation=~/Desktop
+
+# ----------- SET COLORS -----------
+COLOR_RED=$'\e[31m'
+COLOR_CYAN=$'\e[36m'
+COLOR_YELLOW=$'\e[93m'
+COLOR_GREEN=$'\e[32m'
+COLOR_RESET=$'\e[0m'
+
 displaySpinner()
 {
   local pid=$!
   local delay=0.3
   local spinstr='|/-\'
   while [ "$(ps a | awk '{print $1}' | grep $pid)" ]; do
     local temp=${spinstr#?}
-    printf "# Please wait... [%c] " "$spinstr" # Count number of backspaces needed (A = 25)
+    printf "${COLOR_RESET}# ${COLOR_YELLOW}Please wait... [%c] " "$spinstr${COLOR_RESET}" # Count number of backspaces needed (A = 25)
     local spinstr=$temp${spinstr%"$temp"}
     sleep $delay
     printf "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b" # Number of backspaces from (A)
   done
   printf " \b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b" # Number of spaces, then backspaces from (A)
 } # // displaySpinner()
 
-savelocation=~/Desktop
-
 fetchSiteUrls() {
   cd $savelocation && wget --spider -r -nd --max-redirect=30 $DOMAIN 2>&1 \
     | grep '^--' \
```
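The new `COLOR_*` variables depend on bash's `$'…'` (ANSI-C) quoting, which expands `\e` into a literal escape byte at assignment time; plain double quotes would leave the backslash-e untouched and print garbage instead of color. A quick sketch of what the quoting produces:

```shell
#!/usr/bin/env bash
# $'\e[32m' is five bytes -- ESC, '[', '3', '2', 'm' -- an ANSI SGR
# "set green foreground" sequence; $'\e[0m' resets all attributes.
COLOR_GREEN=$'\e[32m'
COLOR_RESET=$'\e[0m'

# On a color terminal this prints "Finished!" in green.
printf '%s\n' "${COLOR_GREEN}Finished!${COLOR_RESET}"

# The length confirms \e collapsed to one ESC byte, not two characters:
echo "${#COLOR_GREEN}"   # → 5
```

Defining the codes once as variables, as the commit does, keeps the later `printf`/`echo` lines readable and makes the palette easy to change.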
```diff
@@ -44,23 +52,35 @@ echo "# Fetch a list of unique URLs for a domain."
 echo "# "
 echo "# Enter the full URL ( http://example.com )"
 
-read -e -p "# URL: " DOMAIN
-DOMAIN=$DOMAIN
+read -e -p "# URL: ${COLOR_CYAN}" DOMAIN
+DOMAIN="${DOMAIN%/}"
 displaydomain=$(echo ${DOMAIN} | grep -oP "^http(s)?://(www\.)?\K.*")
 filename=$(echo ${DOMAIN} | grep -oP "^http(s)?://(www\.)?\K.*" | tr "." "-")
 
-echo "# "
-read -e -p "# Save txt file as: " -i "$(unknown)" SAVEFILENAME
+echo "${COLOR_RESET}# "
+read -e -p "# Save txt file as: ${COLOR_CYAN}" -i "$(unknown)" SAVEFILENAME
 savefilename=$SAVEFILENAME
 
-echo "# "
-echo "# Fetching URLs for ${displaydomain} "
+echo "${COLOR_RESET}# "
+echo "# ${COLOR_YELLOW}Fetching URLs for ${displaydomain} ${COLOR_RESET}"
 
 # Start process
 fetchSiteUrls $savefilename & displaySpinner
 
-# Process is complete, output message
-echo "# Finished!"
+# Process is complete
+
+# Count number of results
+RESULT_COUNT="$(cat ${savelocation}/$savefilename.txt | sed '/^\s*$/d' | wc -l)"
+if [ "$RESULT_COUNT" = 1 ]; then
+  RESULT_MESSAGE="${RESULT_COUNT} Result"
+else
+  RESULT_MESSAGE="${RESULT_COUNT} Results"
+fi
+
+# Output message
+echo "${COLOR_RESET}# "
+echo "# ${COLOR_GREEN}Finished with ${RESULT_MESSAGE}!${COLOR_RESET}"
 echo "# "
-echo "# File Location: ${savelocation}/$savefilename.txt"
+echo "# ${COLOR_GREEN}File Location:${COLOR_RESET}"
+echo "# ${COLOR_GREEN}${savelocation}/$savefilename.txt${COLOR_RESET}"
 echo "# "
```
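The `displaydomain` and `filename` assignments above rely on GNU grep's PCRE mode (`-P`), where `\K` keeps the regex anchored at the protocol but discards everything matched so far, and on `tr` to make the result filename-friendly. A sketch of that extraction, assuming GNU grep built with PCRE support (the sample URL is illustrative):

```shell
#!/usr/bin/env bash
# \K means "forget what was matched up to here", so only the host and
# path survive into the -o output.
echo "https://www.example.com/page" | grep -oP "^http(s)?://(www\.)?\K.*"
# → example.com/page

# tr swaps dots for dashes to build a safe default filename.
echo "example.com" | tr "." "-"
# → example-com
```

Because the pattern anchors on `^http(s)?://`, an input typed without a protocol yields an empty match, which is consistent with the README's warning about entering the correct protocol.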

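The new success message counts non-blank lines in the generated file and pluralizes accordingly. A sketch against inline sample data rather than a real output file; note that `\s` in the committed sed expression is a GNU extension, so this sketch uses the portable `[[:space:]]` class:

```shell
#!/usr/bin/env bash
# Drop blank lines, count what remains, then pick singular or plural.
RESULT_COUNT="$(printf 'https://a\n\nhttps://b\n' | sed '/^[[:space:]]*$/d' | wc -l)"
if [ "$RESULT_COUNT" -eq 1 ]; then
  RESULT_MESSAGE="${RESULT_COUNT} Result"
else
  RESULT_MESSAGE="${RESULT_COUNT} Results"
fi
echo "$RESULT_MESSAGE"
```

Using `-eq` instead of the commit's string comparison `= 1` also tolerates the leading whitespace some `wc` implementations emit.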