Commit fa2a7db

adds script to create a word frequency file
1 parent 01faf54 commit fa2a7db

1 file changed: +22 -0 lines changed

scripts/word_frequency.sh

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+#!/bin/bash
+
+# Usage: ./word_frequency.sh _posts/
+
+# Check for directory argument
+if [ -z "$1" ]; then
+  echo "Usage: $0 <directory>"
+  exit 1
+fi
+
+INPUT_DIR="$1"
+
+find "$INPUT_DIR" -type f -name '*' -exec cat {} + | \
+  tr '[:upper:]' '[:lower:]' | \
+  tr -c '[:alnum:]' '[\n*]' | \
+  grep -E '.{3,}' | \
+  sort | \
+  uniq -c | \
+  awk '$1 > 1000' | \
+  sort -nr > word_freq.txt
+
+
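
The pipeline in this script can be tried on a small input without pointing it at a whole directory. The sketch below is not part of the commit: `word_freq` is a hypothetical helper that reuses the same stages, but reads from stdin and takes the count threshold as an argument (defaulting to the script's 1000) so that a short sample produces output.

```shell
#!/bin/bash

# Hypothetical helper (not in the commit): same stages as the script's
# pipeline, reading stdin instead of files found under a directory.
word_freq() {
  local min_count="${1:-1000}"   # assumption: threshold made configurable
  tr '[:upper:]' '[:lower:]' |   # fold everything to lower case
    tr -c '[:alnum:]' '[\n*]' |  # turn every non-alphanumeric byte into a newline
    grep -E '.{3,}' |            # keep only words of three or more characters
    sort |
    uniq -c |                    # count occurrences of each word
    awk -v min="$min_count" '$1 > min' |  # keep words seen more than min times
    sort -nr                     # most frequent first
}

# Tiny inline sample; with a threshold of 1 only repeated words survive.
printf 'The cat and the dog.\nThe CAT sat; a cat ran.\n' | word_freq 1
```

With this sample, only `the` and `cat` (three occurrences each) pass the filter; `a` is dropped by the length check and the remaining words appear only once.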
