Terminal transcript

alone.  (The ASCII tab character should also be included for good
measure in a production script.)

   At this point, we have data consisting of words separated by blank
space.  The words only contain alphanumeric characters (and the
underscore).  The next step is to break the data apart so that we have one
word per line.  This makes the counting operation much easier, as we
will see shortly.

     $ tr '[:upper:]' '[:lower:]' < whats.gnu | tr -cd '[:alnum:]_ \n' |
     > tr -s ' ' '\n' | ...

   This command turns blanks into newlines.  The ‘-s’ option squeezes
multiple newline characters in the output into just one, removing blank
lines.  (The ‘>’ is the shell’s “secondary prompt.” This is what the
shell prints when it notices you haven’t finished typing in all of a
command.)
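
   To see the squeezing in isolation, here is a minimal sketch that runs
the same ‘tr’ step on a made-up line with runs of several blanks between
words.  The sample text and the variable names are assumptions chosen
for illustration only.

```shell
# Made-up sample line with runs of several blanks between words.
sample='the  quick   brown fox'

# Translate each blank to a newline, then squeeze each run of
# newlines down to a single one, so no blank lines remain.
one_per_line=$(printf '%s\n' "$sample" | tr -s ' ' '\n')

printf '%s\n' "$one_per_line"
```

   Each of the four words should come out on its own line, with no empty
lines in between.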

   We now have data consisting of one word per line, no punctuation, all
one case.  We’re ready to count each word:

     $ tr '[:upper:]' '[:lower:]' < whats.gnu | tr -cd '[:alnum:]_ \n' |
     > tr -s ' ' '\n' | sort | uniq -c | ...

   At this point, the data might look something like this:

          60 a
           2 able
           6 about
           1 above
           2 accomplish
           1 acquire
           1 actually
           2 additional
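
   The ‘sort’ step matters because ‘uniq’ only collapses adjacent
duplicate lines.  The following sketch, using a made-up word list, counts
the same words with and without sorting first.  The exact padding in
front of each count varies between implementations.

```shell
# Made-up word list, one word per line, deliberately out of order.
words=$(printf '%s\n' b a b c a b)

# Without sorting, no two equal lines are adjacent, so every
# line is reported separately with a count of 1.
unsorted=$(printf '%s\n' "$words" | uniq -c)

# After sorting, equal words sit next to each other and are
# counted together.
counts=$(printf '%s\n' "$words" | sort | uniq -c)

printf '%s\n' "$counts"
```

   With sorting, the list collapses to three lines (‘a’ twice, ‘b’ three
times, ‘c’ once); without it, all six input lines survive.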

   The output is sorted by word, not by count!  What we want is the most
frequently used words first.  Fortunately, this is easy to accomplish,
with the help of two more ‘sort’ options:

‘-n’
     do a numeric sort, not a textual one

‘-r’
     reverse the order of the sort
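
   The difference between a textual and a numeric sort is easiest to see
on a few made-up count lines.  This is only a sketch: the data is
invented, and ‘LC_ALL=C’ is set on the textual sort purely to make its
byte-by-byte ordering predictable.

```shell
# Made-up "count word" lines, similar in shape to uniq -c output.
counts=$(printf '%s\n' '9 nine' '100 hundred' '10 ten')

# Textual sort compares character by character, so "10" and "100"
# both sort ahead of "9".
textual=$(printf '%s\n' "$counts" | LC_ALL=C sort)

# Numeric sort compares the leading numbers by value; -r reverses
# the result so the largest count comes first.
numeric_desc=$(printf '%s\n' "$counts" | sort -n -r)

printf 'textual:\n%s\n' "$textual"
printf 'numeric, reversed:\n%s\n' "$numeric_desc"
```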

   The final pipeline looks like this:

     $ tr '[:upper:]' '[:lower:]' < whats.gnu | tr -cd '[:alnum:]_ \n' |
     > tr -s ' ' '\n' | sort | uniq -c | sort -n -r
     ⊣    156 the
     ⊣     60 a
     ⊣     58 to
     ⊣     51 of
     ⊣     51 and
     ...

   Whew!  That’s a lot to digest.  Yet, the same principles apply.  With
six commands, on two lines (really one long one split for convenience),
we’ve created a program that does something interesting and useful, in
much less time than we could have written a C program to do the same
thing.
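
   For repeated use, the six commands can be wrapped in a shell function
that reads its text from standard input.  The name ‘word_freq’ and the
sample sentence below are assumptions made for this sketch; the pipeline
itself is the one shown above.

```shell
# word_freq: hypothetical wrapper around the pipeline above.
# Reads text on standard input, prints "count word" lines with
# the most frequent word first.
word_freq() {
    tr '[:upper:]' '[:lower:]' | tr -cd '[:alnum:]_ \n' |
    tr -s ' ' '\n' | sort | uniq -c | sort -n -r
}

# Made-up sample text with mixed case and punctuation.
freq=$(printf '%s\n' 'No, no, NO!  Yes; yes.  Maybe?' | word_freq)

printf '%s\n' "$freq"
```

   For this sample the result lists ‘no’ three times, ‘yes’ twice, and
‘maybe’ once, in that order.  A real file can be supplied with input
redirection, as in ‘word_freq < whats.gnu’.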

   A minor modification to the above pipeline can give us a simple
spelling checker!  To determine if you’ve spelled a word correctly, all
you have to do is look it up in a dictionary.  If it is not there, then
chances are that your spelling is incorrect.  So, we need a dictionary.
The conventional location for a dictionary is ‘/usr/share/dict/words’.
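
   As a rough sketch of what “looking it up” means, the following checks
words one at a time with ‘grep’ against a tiny made-up word list that
stands in for ‘/usr/share/dict/words’.  The function name ‘in_dict’, the
temporary file, and the sample words are all assumptions for
illustration.  This is a per-word lookup, not the sorted-list comparison
that the text develops next.

```shell
# Hypothetical stand-in for /usr/share/dict/words: a tiny word
# list, one word per line, written to a temporary file.
dict=$(mktemp)
printf '%s\n' free gnu software > "$dict"

# in_dict WORD: succeed if WORD matches a whole line of the
# dictionary exactly (-F fixed string, -x whole line, -q quiet).
# The name in_dict is illustrative, not a standard command.
in_dict() {
    grep -Fxq -- "$1" "$dict"
}

# Collect every sample word that is not found in the dictionary.
misspelled=''
for word in free sofware gnu; do
    in_dict "$word" || misspelled="$misspelled $word"
done
printf 'possibly misspelled:%s\n' "$misspelled"

rm -f "$dict"
```

   Only ‘sofware’ is missing from the sample list, so it is the only
word reported.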

   Now, how to compare our file with the dictionary?  As before, we
generate a sorted list of words, one per line:

-----Info: (coreutils)Putting the tools together, 317 lines --60%------------------
