\chapter{Java odds 'n' ends}

Before we continue our study of OOA\&D proper, let's look at a few
Java-specific idiosyncrasies which will be all up in our business soon enough.

\section{Garbage collection}

\index{garbage collection}
No, I didn't make that term up just to be funny. \textbf{Garbage collection}
is actually the official name for a feature which Java brought into the
programming mainstream, and which we now often take for granted.

\index{ball@\texttt{Ball}}
\index{play@\texttt{.play()}}
Consider the code for a \texttt{Ball} class given in
Figure~\ref{fig:ballCode}. When we run it, Java calls \texttt{main()}, which
calls \texttt{play()}. At the end of \texttt{play()}, right before it returns,
a memory diagram would look like Figure~\ref{fig:ballMemory}. Take a moment to
see if you agree with all the details.

\begin{figure}[ht]
\begin{Verbatim}[fontsize=\scriptsize,samepage=true,frame=single]
import java.util.ArrayList;

class Ball {
    private String color;
    private int airPressure;

    Ball(String color) {
        this.color = color;
        this.airPressure = 0;
    }

    void bounce() {
        System.out.println("Boing!!");
    }

    static Ball play(int numBalls) {
        ArrayList equipment = new ArrayList();
        Ball b;
        int i;
        for (i=0; i<numBalls; i++) {
            b = new Ball("red");
            b.bounce();
            equipment.add(b);
        }
        Ball basket = new Ball("orange");
        return basket;
    }

    public static void main(String args[]) {
        int x = 3;
        Ball myBall = play(2);
        System.out.println("My ball is " + myBall.color + ".");
    }
}
\end{Verbatim}
\caption{A class to illustrate the utility of garbage collection.}
\label{fig:ballCode}
\end{figure}

\index{stack frame}
At this moment, we have an active stack frame for the \texttt{play()} function
which contains five variables of various types. And we're getting ready to
return the reference variable \texttt{basket} back to \texttt{main()}, which
means that about a nanosecond from now \texttt{main()} will be assigning its
new \texttt{myBall} variable to point to that orange ball.

\begin{figure}[ht]
\centering
\includegraphics[width=1\textwidth]{ballMemory.pdf}    % 750x400
\caption{A snapshot of memory, taken just before the \texttt{play()} function
returns back to \texttt{main()}.}
\label{fig:ballMemory}
\end{figure}

\index{memory diagram}
Okay, now let's do it. We return to \texttt{main()}. As soon as we do, the
memory diagram looks like Figure~\ref{fig:ballMemory2}. Take a close look.
That second diagram is all correct, but something about it may strike you as a
bit weird; namely, there are three objects on the heap \textit{with nothing on
the stack referencing them}. The \texttt{myBall} variable dutifully points to
the orange ball that was returned, but the two red balls, and the
\texttt{ArrayList} that contained them, are now disembodied from everything
else. And in fact, they are effectively \textit{lost} to the program. There's
simply no way to reference them.

\begin{figure}[ht]
\centering
\includegraphics[width=1\textwidth]{ballMemory2.pdf}    % 750x400
\caption{The state of memory right \textit{after} the return to
\texttt{main()}. Notice there are now three unreachable objects on the heap.}
\label{fig:ballMemory2}
\end{figure}

If you're unsure that there's truly no way, ask yourself this question: ``what
line of code could we write to (say) change one of the \texttt{Ball} object's
color from red to blue?'' The answer is: there is no possible line of code we
could write to do that. To even get off the ground we'd have to start with a
\textit{name}, and there is no name we could possibly use to get at either of
those red \texttt{Ball} objects.

\index{C++}
\index{memory leak}
Now with C++, a language that preceded Java by about a decade, this would be a bad
situation called (I kid you not) a \textbf{memory leak}. The memory the
program used to store those now-unreachable \texttt{Ball}s is now inaccessible
to the program, and what's worse, \textit{C++ doesn't realize that's the
case.} So those old objects just sit there, growing stale, occupying system
memory that could be used to store other things instead. The program never
realizes this, and so never reclaims that space. So the actual amount of
memory the program has available to it has effectively shrunk.

\index{delete@\texttt{delete}}
In C++, the traditional remedy for this situation is for the programmer herself
to keep track of which objects no longer have any references to them, and
to explicitly \texttt{delete} those objects. This is a delicate task: fail to
\texttt{delete} what is in fact delete-able and you'll have a memory leak;
eagerly \texttt{delete} what actually does have other references to it and
your program may crash, or silently corrupt its data, when that memory is
reused by something else writing over the top of it. The whole situation is
fraught with peril.

\index{garbage collection}
Enter Java, in 1995. Java featured \textbf{automatic garbage collection} which
outsourced the whole responsibility for this from the programmer to a special
Java background task called the \textbf{garbage collector}. Whenever the
garbage collector runs, it intelligently sifts through the contents of memory,
looking for junk that can't be legally accessed anymore anyway. Whenever it
finds such junk, it \textit{automatically} tells the memory manager that that
memory is no longer in use, and can therefore be repurposed the next time the
program requests some memory with \texttt{new}.

Automatic garbage collection is lit. It means that pictures like
Figure~\ref{fig:ballMemory2} aren't scary at all. Sure, we have three objects
in the heap that can't ever be reached, but the garbage collector will soon
run, figure that out, and reacquire that memory so it can be used again. All
this without the programmer having to think a thing about it. Memory leaks are
in principle a thing of the past.

\index{pacemaker}
\index{C++}
As with most good things, there are downsides too. One downside to Java's
approach is that the garbage collector thread decides to run ``any time it
darn well pleases.'' As developers, we don't ever tell Java to get off its
butt and take out the trash (although there is a way to suggest this); rather,
we just wait for it to run when it periodically thinks it needs to. This isn't
normally a problem, but it can be in mission-critical real-time applications
with absolute performance deadlines. Think of the software running on a
pacemaker, which is embedded in a human heart. The code \textit{must} respond
in a certain amount of time in order to trigger the heart to take its next
beat! Now if, during our pacemaker program's operation, the garbage collector
suddenly decided to run in order to clean up lost memory, it might be a
time-consuming operation in its own right. And our program could literally
skip a beat (or beat later than it should) while waiting for it to finish. In
situations like these, C++'s style of manual control does give us more
fine-tuned flexibility over when exactly the reclamation of lost memory
occurs. For most applications, though, we don't need that flexibility and
Java's way of handling it is much appreciated.
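
The ``way to suggest this'' mentioned above is \texttt{System.gc()}. Here's a
minimal sketch of it in action. The class name \texttt{GcHint} and the method
\texttt{dropAndHint()} are made up for illustration, and the
\texttt{WeakReference} (a standard class from \texttt{java.lang.ref}) is used
only as a spy that lets us peek at whether an object has been reclaimed.
Whether the collector actually runs when asked is entirely up to Java, so this
program may print either message:

\begin{Verbatim}[fontsize=\scriptsize,samepage=true,frame=single]
import java.lang.ref.WeakReference;

class GcHint {

    // Creates an object, abandons it, and politely asks for a cleanup.
    static WeakReference<Object> dropAndHint() {
        Object ball = new Object();
        // A weak reference does not keep its target alive.
        WeakReference<Object> spy = new WeakReference<Object>(ball);
        ball = null;     // now no ordinary reference points to the object
        System.gc();     // a suggestion only; Java is free to ignore it
        return spy;
    }

    public static void main(String args[]) {
        WeakReference<Object> spy = dropAndHint();
        if (spy.get() == null) {
            System.out.println("The object has been collected.");
        } else {
            System.out.println("The object hasn't been collected (yet).");
        }
    }
}
\end{Verbatim}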

\section{The \texttt{import} statement}
\label{sec:import}

\index{import statement@\texttt{import} statement}
\index{include@\texttt{\#include}}
You've probably used \texttt{import} in almost every Java program you've ever
written. Yet I've found most developers don't really understand what it does.
Java itself is partially to blame here: the word ``import'' was a poor choice
for this, since nothing is ``imported'' at all.\footnote{Also, the fact that
it begins with a lower-case \texttt{i} just like C++'s ``\texttt{\#include}''
statement reinforces this misconception -- C++'s \texttt{\#include} actually
\textit{does} include/import content.}

Here's a statement that surprises a lot of Java programmers: you can write
\textit{any} Java program -- even one that uses stuff in the Java API --
\textit{without} an \texttt{import} statement. The left side of
Figure~\ref{fig:dontNeedImport} is such a program.

\begin{figure}[ht]
\begin{Verbatim}[fontsize=\tiny,samepage=true,frame=single]
                                            import java.util.ArrayList;

class YouDontNeedImport {                   class YouDontNeedImport {
  public static void main(String args[])      public static void main(String args[])
  {                                           {
    java.util.ArrayList celebs =                ArrayList celebs =
      new java.util.ArrayList();                  new ArrayList();
    celebs.add("Kim Kardashian");               celebs.add("Kim Kardashian");
    celebs.add("Justin Bieber");                celebs.add("Justin Bieber");
    celebs.add("Taylor Swift");                 celebs.add("Taylor Swift");
    p(celebs);                                  p(celebs);
  }                                           }

  static void p(java.util.ArrayList l) {      static void p(ArrayList l) {
    for (int i=0; i<l.size(); i++) {            for (int i=0; i<l.size(); i++) {
      System.out.println("People love "           System.out.println("People love "
        + l.get(i));                                + l.get(i));
    }                                           }
  }                                           }
}                                           }
\end{Verbatim}
\caption{The same Java program, using explicit inline package names (left),
and the \texttt{import} statement (right).}
\label{fig:dontNeedImport}
\end{figure}

No \texttt{import} required. Instead, on the left-hand side of the figure,
every time we want to refer to the \texttt{ArrayList} class, we specify it
as \texttt{java.util.ArrayList}. That does require us to type out the full
name three times, but it turns out to be all Java needs to understand
perfectly which \texttt{ArrayList} class we want to use.

Now since programmers (myself included) are lazy, and want to avoid typing
when possible, the Java gods invented the \texttt{import} statement. The
\textit{only} thing it does is tell Java ``I don't really feel like typing out
\texttt{java.util.ArrayList} every time. That's a pain. So Java, please know
that when I type \texttt{ArrayList} in this file, I really mean
\texttt{java.util.ArrayList}.''

The right side of Figure~\ref{fig:dontNeedImport} is exactly the same program,
but now using the \texttt{import} statement to avoid a little typing.

Whether the savings are worth it in any particular case is up to you. My point
here is just to demonstrate that the statement ``\texttt{import
java.util.ArrayList}'' does \textit{not} do anything remotely like ``go and
find the \texttt{ArrayList} code in the \texttt{java.util} package, and bring
it in here so I can use it.'' Nor is it true that ``unless you import that
class, you can't use it.'' Both are common Java myths.

\subsubsection{Don't \texttt{import *}}

By the way, you may have seen the use of ``\texttt{*}'' syntax, like this:

\begin{Verbatim}[fontsize=\small,samepage=true]
import java.util.*;
import com.google.search.engines.*;
import edu.umw.stephen.coolclasses.*;
\end{Verbatim}

\index{scanner@\texttt{Scanner}}
This isn't a good idea. The reason is that it's ambiguous. Suppose in my code
I refer to a class called \texttt{Scanner}. Which \texttt{Scanner} do I mean?
Should Java assume I wanted to avoid typing ``\path{java.util.Scanner}'' or
``\path{com.google.search.engines.Scanner}'' or
``\path{edu.umw.stephen.coolclasses.Scanner}''? It has no way of knowing. So
if more than one of those packages defines a class with the same name (which
is very possible), Java will refuse to compile any code that uses the bare
name. Worse, code that compiles today can break tomorrow, when a new version
of one of those packages adds a class whose name collides.

For this reason, using ``\texttt{*}'' in \texttt{import} statements is only
acceptable when writing quick and dirty one-off code, not for anything that
will stick around longer than your current coding session.
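
To see a collision for real, here's a sketch using two packages from the
standard library that both define a class called \texttt{Date}:
\texttt{java.util} and \texttt{java.sql}. (The class name \texttt{WhichDate}
and its \texttt{describe()} method are invented for this example.) Naming the
classes explicitly leaves no doubt about which is which:

\begin{Verbatim}[fontsize=\scriptsize,samepage=true,frame=single]
import java.util.Date;   // in this file, "Date" means java.util.Date

class WhichDate {

    static String describe() {
        Date utilDate = new Date(0L);
        // The other Date gets its full name spelled out.
        java.sql.Date sqlDate = new java.sql.Date(0L);
        return utilDate.getClass().getName() + " and " +
            sqlDate.getClass().getName();
    }

    public static void main(String args[]) {
        System.out.println(describe());
    }
}
\end{Verbatim}

Had the file instead started with \texttt{import java.util.*;} and
\texttt{import java.sql.*;}, the bare name \texttt{Date} would be rejected by
the compiler as ambiguous.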


\section{Java ``generics''}
\label{sec:generics}

Speaking of misleading terms, let me give you another one.

There are two ways to use a container class from \texttt{java.util} (like
\texttt{ArrayList}, \texttt{HashSet}, or \texttt{PriorityQueue}). One way is to
just create one without any syntactic fuss:

\begin{Verbatim}[fontsize=\small,samepage=true,frame=single]
import java.util.ArrayList;
...
ArrayList stuff = new ArrayList();
\end{Verbatim}

and then put stuff in it:

\begin{Verbatim}[fontsize=\small,samepage=true,frame=single]
stuff.add("Laundry");
stuff.add("Lunchbox");
stuff.add(new Car("Mazda", "MX-5"));
\end{Verbatim}

\index{collection!heterogeneous}
\index{heterogeneous collection}
This is called a \textbf{heterogeneous} collection, because the things it
contains are of different types (\texttt{String}s and \texttt{Car}s, in this
case).

\index{cast}
\index{typecast}
\index{downcast}
Java allows this perfectly well, for reasons we'll understand in detail later.
Here's the rub, though: when we get something \textit{out} of the collection,
we have to \textbf{cast} (or \textbf{typecast}, or \textbf{downcast}) it to
the correct type before doing anything specific with it. This code, for
example, does \textit{not} compile:

\begin{Verbatim}[fontsize=\small,samepage=true]
// Doesn't work:
System.out.println("First item has " +
    stuff.get(0).length() + " letters.");
\end{Verbatim}

The reason is that we're calling \texttt{.length()} on whatever object happens
to be at position \texttt{0} of the list, and Java can't know for sure whether
that object will end up being a \texttt{String}, a \texttt{Car}, or something
else. In particular, it can't know that that object will even \textit{have} a
\texttt{.length()} method. So Java makes us do this:

\begin{Verbatim}[fontsize=\small,samepage=true]
// Works, due to explicit cast:
System.out.println("First item has " +
    ((String) stuff.get(0)).length() + " letters.");
\end{Verbatim}

This says, ``Java, I give you my word. I promise the thing at position
\texttt{0} \textit{will} be a \texttt{String}, and I'll stake my reputation on
it. Please force it to be treated as a \texttt{String}, and if it turns out
I'm lying, you have permission to embarrass me at runtime with a program
crash.''
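
Here's a small sketch of both outcomes of that promise. The \texttt{CastDemo}
class, its \texttt{lengthOf()} method, and the bare-bones nested \texttt{Car}
class are stand-ins invented for illustration. When the promise is broken,
Java throws a \texttt{ClassCastException}, which this sketch catches so that
both cases can be shown in one run:

\begin{Verbatim}[fontsize=\scriptsize,samepage=true,frame=single]
import java.util.ArrayList;

class CastDemo {

    static class Car { }    // hypothetical stand-in with no length() method

    // Returns the length of the item at position i, or -1 if the
    // item there turns out not to be a String.
    static int lengthOf(ArrayList stuff, int i) {
        try {
            return ((String) stuff.get(i)).length();
        } catch (ClassCastException e) {
            return -1;     // the promise was broken
        }
    }

    public static void main(String args[]) {
        ArrayList stuff = new ArrayList();
        stuff.add("Laundry");
        stuff.add(new Car());
        System.out.println(lengthOf(stuff, 0));    // a String: prints 7
        System.out.println(lengthOf(stuff, 1));    // a Car: prints -1
    }
}
\end{Verbatim}

Without the \texttt{try}/\texttt{catch}, the second call would end the program
with that exception.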

\index{generics}
Having to do that is a rather high price to pay for the flexibility of being
able to store any old thing in \texttt{stuff}. That's why back in 2004, Java
introduced the idea of \textbf{generics}, which allow you to restrict a
collection to having only items of a single type. Here's how you do it:

\begin{Verbatim}[fontsize=\small,samepage=true]
import java.util.ArrayList;
...
ArrayList<String> stuff = new ArrayList<String>();
stuff.add("Laundry");
stuff.add("Lunchbox");
stuff.add(new Car("Mazda", "MX-5"));    // Compile error
\end{Verbatim}

Our \texttt{stuff} is no longer a plain old \texttt{ArrayList}, but
specifically an \texttt{ArrayList<String>} (pronounced ``array list of
strings''). This tells Java, ``please don't let me put anything into this
\texttt{ArrayList} \textit{except} \texttt{String}s. I'm asking for this
restriction because it's for my own good.''

When we do this, trying to insert a \texttt{Car} (or anything else) will bomb
at compile time, as it should. And now, we don't have to typecast anything we
get out of it: since Java knows it would only have put \texttt{String}s in, it
knows that it will only get \texttt{String}s out. Therefore,
``\texttt{stuff.get(0).length()}'' works just fine, no cast necessary.
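
As a quick sketch of the payoff, here's a loop that calls \texttt{.length()}
on every element with no casting anywhere. (The class name
\texttt{NoCastNeeded} and the method \texttt{totalLetters()} are my own
inventions for this example.)

\begin{Verbatim}[fontsize=\scriptsize,samepage=true,frame=single]
import java.util.ArrayList;

class NoCastNeeded {

    // Adds up the lengths of all the Strings in the list.
    static int totalLetters(ArrayList<String> stuff) {
        int total = 0;
        for (int i=0; i<stuff.size(); i++) {
            total += stuff.get(i).length();   // get() returns a String
        }
        return total;
    }

    public static void main(String args[]) {
        ArrayList<String> stuff = new ArrayList<String>();
        stuff.add("Laundry");     // 7 letters
        stuff.add("Lunchbox");    // 8 letters
        System.out.println(totalLetters(stuff) + " letters in all.");
    }
}
\end{Verbatim}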

This is a good feature, and you should pretty much always use it. My only
complaint is that I think the word ``generic'' is exactly the wrong word for
it. They almost couldn't have chosen a worse term. Saying you have ``a generic
\texttt{ArrayList}'' sounds like it ought to mean the \textit{first} style
(above), where the type wasn't specifically mentioned and therefore any object
was fair game to put in it. But in actual fact, ``a generic
\texttt{ArrayList}'' means ``a specific, decidedly \textit{non}-generic
\texttt{ArrayList} that is declared to only hold values of a certain type.''
Go figure.

\section{``Wrapper'' classes}

\index{pass-by-value}
\index{pass-by-reference}
This is a good time to explain the usage of Java's \textbf{wrapper} classes.
Recall that there are two kinds of Java variables: those that store primitive
types (like \texttt{int}) and those that refer to objects (like \texttt{Car}).
The biggest practical difference between these two, as we've seen, is where
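they're stored: primitive types go on the stack (and hence are pass-by-value)
whereas the objects pointed to by reference variables go on the heap (and
hence behave as pass-by-reference: strictly speaking the reference itself is
copied, but the copy points to the very same object).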

\index{primitive type}
Another difference relates to the container classes like \texttt{ArrayList}
that we've just been discussing. All those \texttt{java.util} goodies, it
turns out, can store any type of object...but it must be exactly that: an
\textit{object}. In particular, they can't store primitive types.

So this means we can't do this:

\begin{Verbatim}[fontsize=\footnotesize,samepage=true]
  import java.util.ArrayList;
  ...
  ArrayList<int> uniformNumbers = new ArrayList<int>();   // NOPE
\end{Verbatim}

because there's no such thing as ``an \texttt{ArrayList} of \texttt{int}s.''
That sucks because \texttt{int}s, \texttt{double}s, and \texttt{boolean}s are
things we'd like to make \texttt{ArrayList}s of all the time.

\index{wrapper class}
Fortunately, there's an easy way around this. Each primitive type has a
``wrapper class'' in Java: it is the mold for objects so simple that they don't
do anything except hold a piece of data: an \texttt{int} (or \texttt{char}, or
\texttt{double}, \textit{etc.}). We could do this, for instance:

\begin{Verbatim}[fontsize=\small,samepage=true]
    Integer michaelJordan = Integer.valueOf(23);
    Integer derekJeter = Integer.valueOf(2);
    Boolean gameOfThronesRocks = Boolean.valueOf(true);
\end{Verbatim}

and produce the memory diagram in Figure~\ref{fig:wrapper}. Each object is
nothing more than a shell that ``wraps'' a primitive piece of data.

\begin{figure}[ht]
\centering
\includegraphics[width=0.8\textwidth]{wrapper.pdf}  % 650x250
\caption{Wrapper objects live on the heap.}
\label{fig:wrapper}
\end{figure}

So far this isn't very exciting. But one reason we need it is so we can store
primitive types in container objects like \texttt{ArrayList}s. Here's all we
need to do to create our list of uniform numbers:

\begin{Verbatim}[fontsize=\footnotesize,samepage=true]
  import java.util.ArrayList;
  ...
  ArrayList<Integer> uniformNumbers = new ArrayList<Integer>();
\end{Verbatim}

and it works, since the elements of the \texttt{ArrayList} are declared to be
objects, as required. (Really teensy-tiny objects that don't really do
anything, but objects nonetheless.)

It's nice that after this, Java lets us work with primitive variables rather
than the wrapper classes:

\begin{Verbatim}[fontsize=\small,samepage=true]
    uniformNumbers.add(23);
    uniformNumbers.add(2);
    uniformNumbers.add(20);
    int mikeSchmidt = uniformNumbers.get(2);
\end{Verbatim}

so we really only notice the wrapper part at instantiation time. This isn't
the only time we need to use wrappers, but it's the only one we need in the
immediate future.
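
The automatic conversions at work here are called \textbf{autoboxing}
(primitive to wrapper) and \textbf{unboxing} (wrapper to primitive). Here's a
sketch that exercises both; the class name \texttt{UniformTotals} and the
method \texttt{sum()} are hypothetical names chosen for this example:

\begin{Verbatim}[fontsize=\scriptsize,samepage=true,frame=single]
import java.util.ArrayList;

class UniformTotals {

    // Adds up every number in the list.
    static int sum(ArrayList<Integer> numbers) {
        int total = 0;
        for (int i=0; i<numbers.size(); i++) {
            total += numbers.get(i);    // Integer unboxed to int
        }
        return total;
    }

    public static void main(String args[]) {
        ArrayList<Integer> uniformNumbers = new ArrayList<Integer>();
        uniformNumbers.add(23);     // int autoboxed to Integer
        uniformNumbers.add(2);
        uniformNumbers.add(20);
        int lastAdded = uniformNumbers.get(2);   // position 2 holds 20
        System.out.println(lastAdded + " was added last; total is " +
            sum(uniformNumbers));
    }
}
\end{Verbatim}

Note that the argument to \texttt{.get()} is a position in the list, not one
of the numbers stored in it.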

\section{The \texttt{Hashtable} data structure}
\label{sec:hashtable}

\index{hashtable@\texttt{Hashtable}}
\index{dictionary}
\index{Python}
\index{key-value pair}
One very, very common container type that we'll use is
\path{java.util.Hashtable}. (If you've used a \textbf{dictionary} in Python,
this is essentially the same thing.) A \texttt{Hashtable} holds a collection of
\textbf{key-value pairs}. Each key-value pair represents a \textit{named}
piece of data -- the key is the name, and the value is the data. So unlike an
\texttt{ArrayList}, where the data elements are numbered, and thus accessed by
a numerical index, a \texttt{Hashtable} uses the keys to specify which piece
of data is required.

\index{alterEgos@\texttt{alterEgos}}
An example will make this all clear. Suppose we want to keep track of
superheroes and the names of their secret identities, so that if the
government decides to legislate against super powers, we can hunt down all the
potential perpetrators. We'll do so in a \texttt{Hashtable} called
\texttt{alterEgos}:

\index{Batman}
\begin{Verbatim}[fontsize=\scriptsize,samepage=true,frame=single]
  import java.util.Hashtable;
  ...
  Hashtable<String,String> alterEgos = new Hashtable<String,String>();
  alterEgos.put("Superman","Clark Kent");
  alterEgos.put("Batman","Bruce Wayne");
  alterEgos.put("Elastigirl","Helen Incredible");
\end{Verbatim}

\index{instantiation}
This instantiation syntax may make you bug-eyed. You'll see that we include
not one but \textit{two} types inside the ``$<...>$'' markers; in this
case, both are \texttt{String}s. These two specify \textit{the type of the
keys, and the type of the values}. Since our keys are text (superhero names)
and our values are also text (the names of the alter egos) it makes sense to
make this a \texttt{String}-to-\texttt{String} hash table.

\index{put@\texttt{.put()}}
The \texttt{.put()} method is used to add a new key-value pair to the table.
It is also used to \textit{change} the value that goes with a particular key.
That works because in a hash table, \textit{every key goes with just one
value}. If we ran a line of code like this:

\begin{Verbatim}[fontsize=\footnotesize,samepage=true,frame=single]
  alterEgos.put("Batman","Rich Dude");
\end{Verbatim}

then the \texttt{String} ``\texttt{Bruce Wayne}'' would be removed from the
table and replaced by ``\texttt{Rich Dude}''. (If nothing else in the program
refers to the old value, it becomes garbage for the collector to reclaim.)

To retrieve the value that goes with a particular key, we use \texttt{.get()}:

\begin{Verbatim}[fontsize=\footnotesize,samepage=true,frame=single]
  String elastigirlTrueIdentity = alterEgos.get("Elastigirl");
  System.out.println("Pssst...Superman is really: " +
      alterEgos.get("Superman"));
\end{Verbatim}

This looks much like the way we obtain items from an \texttt{ArrayList},
except that with \texttt{ArrayList.get()}, we pass an integer, and for\\
\texttt{Hashtable.get()}, we pass a key (whatever type that may be).
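
Two details of the standard \texttt{Hashtable} behavior are worth a quick
sketch here: \texttt{.put()} hands back whatever value the key previously had
(or \texttt{null} if the key is new), and \texttt{.get()} returns
\texttt{null} for a key that isn't in the table. The class name
\texttt{AlterEgoDemo} and the \texttt{reveal()} helper are hypothetical:

\begin{Verbatim}[fontsize=\scriptsize,samepage=true,frame=single]
import java.util.Hashtable;

class AlterEgoDemo {

    // Looks up a hero, with a fallback for heroes we don't know about.
    static String reveal(Hashtable<String,String> alterEgos, String hero) {
        String identity = alterEgos.get(hero);
        if (identity == null) {
            return hero + " is a mystery";
        }
        return hero + " is really " + identity;
    }

    public static void main(String args[]) {
        Hashtable<String,String> alterEgos = new Hashtable<String,String>();
        alterEgos.put("Batman","Bruce Wayne");
        // Replacing a value: put() returns the one being replaced.
        String previous = alterEgos.put("Batman","Rich Dude");
        System.out.println("Replaced: " + previous);
        System.out.println(reveal(alterEgos, "Batman"));
        System.out.println(reveal(alterEgos, "Aquaman"));
    }
}
\end{Verbatim}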

A very common use of \texttt{Hashtable}s is to store objects based on some
kind of name or identifier. For example:

\index{customer@\texttt{Customer}}
\begin{Verbatim}[fontsize=\scriptsize,samepage=true,frame=single]
import java.util.Hashtable;
...
Hashtable<Integer,Customer> customers = new Hashtable<Integer,Customer>();
...
customers.put(7533, new Customer("Joe Blow", "New Plymouth, ID"));
customers.put(6717, new Customer("Jill Hill", "New York, NY"));
...
Customer custNum6717 = customers.get(6717);
\end{Verbatim}

Here we're storing \texttt{Customer} objects by their customer ID numbers, for
easy retrieval by ID later.

In this book, I'll draw \texttt{Hashtable} objects as shown in
Figure~\ref{fig:hashtable}. Its \texttt{contents} table is full of references:
each box in the left column points to a key (in this case, an
\texttt{Integer}) and the box to its immediate right points to the
corresponding value (a \texttt{Customer}). Notice that there is no
well-defined order to the entries in the table -- in this case, even though
\texttt{Joe Blow} was the first one inserted, he occupies the second row of
the table. This is to emphasize that when you add a key-value pair to a
\texttt{Hashtable}, it doesn't remember anything about when you added it, only
that you \textit{did} add it.

\begin{figure}[ht]
\centering
\includegraphics[width=1\textwidth]{hashtable.pdf}
\caption{A \texttt{Hashtable} of \texttt{Customer} objects, stored by their
customer IDs.}
\label{fig:hashtable}
\end{figure}

\subsection{Iterating through \texttt{Hashtable}s}

\index{iteration!through a collection}
One last thing about \texttt{Hashtable}s: you can iterate through them as you
can iterate through any other collection (like an \texttt{ArrayList}). But
it's kind of tricky. Remember that there isn't any inherent ``order'' to the
key-value pairs, so you can't say ``give me the first one, then the second,
then ..., all the way to the end.''

\index{enumeration@\texttt{Enumeration}}
\index{design pattern!Iterator}
\index{Iterator pattern}
The way you achieve this is by using an \textbf{\texttt{Enumeration}} object,
also from the \texttt{java.util} package. This is actually an example of the
\textbf{Iterator} design pattern, which we'll see in a later chapter. For now,
mostly just memorize the approach.

\index{keys@\texttt{.keys()}}
\index{nextElement@\texttt{.nextElement()}}
You call \texttt{.keys()} on the \texttt{Hashtable} which returns an
\texttt{Enumeration} of the ``key halves'' of the key/value pairs. Think of an
\texttt{Enumeration} as just ``a way to iterate through a group of things, one
by one.'' In this case, the ``things'' are the \texttt{Hashtable}'s keys. Every
time you call \texttt{.nextElement()} on the \texttt{Enumeration} object, it
gives you the next key and advances the ``cursor'' to point further on down the
line. When you call \texttt{.hasMoreElements()} on the \texttt{Enumeration},
it tells you whether or not you can continue further. These two methods are
just what you need to write a \texttt{while} loop to cycle through all the
keys; and each time you get a key, you can use the original \texttt{Hashtable}
to retrieve the corresponding value. To wit:

\begin{Verbatim}[fontsize=\scriptsize,samepage=true,frame=single]
  import java.util.Hashtable;
  import java.util.Enumeration;
  ...
  Hashtable<String,String> alterEgos = new Hashtable<String,String>();
  ...
  Enumeration<String> superheroNames = alterEgos.keys();
  while (superheroNames.hasMoreElements()) {
      String superheroName = superheroNames.nextElement();
      System.out.println(superheroName + " is really " +
          alterEgos.get(superheroName));
  }
\end{Verbatim}

So the \textit{key} is retrieved directly from the \texttt{.nextElement()}
method, but to get the \textit{value} we have to go back to the
\texttt{Hashtable} itself and call \texttt{.get()} with the key.

Also, remember that the order in which you're given the key-value pairs here
is \textit{not} necessarily the order in which you inserted them. That order
is irretrievable after the fact -- if you need to keep track of it, you'll
need a separate data structure (perhaps an \texttt{ArrayList} of the keys, in
order of insertion) to remember it.
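
Here's one way that ``separate data structure'' idea might look. This is only
a sketch: the class \texttt{OrderedAlterEgos} and its methods are names I've
invented, and it simply pairs a \texttt{Hashtable} with an \texttt{ArrayList}
of keys.

\begin{Verbatim}[fontsize=\scriptsize,samepage=true,frame=single]
import java.util.ArrayList;
import java.util.Hashtable;

class OrderedAlterEgos {
    private Hashtable<String,String> table = new Hashtable<String,String>();
    private ArrayList<String> keysInOrder = new ArrayList<String>();

    void put(String hero, String identity) {
        if (!table.containsKey(hero)) {
            keysInOrder.add(hero);     // remember first-insertion order
        }
        table.put(hero, identity);
    }

    String get(String hero) {
        return table.get(hero);
    }

    // Returns the heroes in the order they were first added.
    ArrayList<String> heroesInOrder() {
        return keysInOrder;
    }

    public static void main(String args[]) {
        OrderedAlterEgos egos = new OrderedAlterEgos();
        egos.put("Superman","Clark Kent");
        egos.put("Batman","Bruce Wayne");
        egos.put("Elastigirl","Helen Incredible");
        ArrayList<String> heroes = egos.heroesInOrder();
        for (int i=0; i<heroes.size(); i++) {
            System.out.println(heroes.get(i) + " is really " +
                egos.get(heroes.get(i)));
        }
    }
}
\end{Verbatim}

If you'd rather not do this bookkeeping yourself,
\texttt{java.util.LinkedHashMap} is a standard container that remembers
insertion order on its own.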

\section{Command-line arguments}

Every Java program you've ever written has this line in it:

\begin{Verbatim}[fontsize=\footnotesize,samepage=true,frame=single]
    public static void main(String args[]) {
\end{Verbatim}

\index{args@\texttt{args}}
Have you ever wondered what that ``\texttt{args}'' thing is for?

\index{command-line argument}
Like most programming languages, Java lets you access the \textbf{command-line
arguments} that the user typed after the program name when she ran the
program. Command-line arguments are nothing new for a budding Linux user. For
instance, this sort of command is second-nature to you by now:

\index{cp@\texttt{cp}}
\medskip
\quad \texttt{\$ cp \ Program1.java \ \freakingtilde/backup }
\medskip

It consists of a command name (``\texttt{cp}'') plus two command-line
arguments, which are: ``\texttt{Program1.java}'', and
``\texttt{\freakingtilde/backup}.''

Now it turns out that \textit{every} Linux program is capable of taking
command-line arguments, including Java programs. Suppose I type this:

\medskip
\quad \texttt{\$ java \ Simulator \ UMW \ Marymount}
\medskip

\index{main method@\texttt{main()} method}
As you know, this is a command to run a Java program whose \texttt{main()}
method is in \texttt{Simulator.java}. Apparently, the user is trying to
simulate a basketball game between two opponents, and is using command-line
arguments to specify which opponents.

The only question that remains is: how does the Java code get access to those
strings that were typed on the command line when it was run? The answer is
\texttt{args}. When \texttt{main()} is called, \textit{Java passes the
command-line arguments to it as an array of Strings.} It's that simple. Inside
\texttt{main()}, you can treat the variable ``\texttt{args}'' in the same way
as any other array. For instance, if your code says:

\begin{Verbatim}[fontsize=\footnotesize,samepage=true,frame=single]
class Simulator {
    public static void main(String args[]) {
        System.out.println("There were " + args.length +
            " command-line args.");
    }
}
\end{Verbatim}

then the output, when run with the command above, will be:

\medskip
\quad \texttt{There were 2 command-line args.}
\medskip

And if your code says:

\begin{Verbatim}[fontsize=\footnotesize,samepage=true,frame=single]
class Simulator {
    public static void main(String args[]) {
        System.out.println("First arg was: " + args[0] + ".");
        System.out.println("Second arg was: " + args[1] + ".");
    }
}
\end{Verbatim}

then the output, when run with the command above, will be:

\begin{Verbatim}[fontsize=\small,samepage=true,frame=none]
    First arg was: UMW.
    Second arg was: Marymount.
\end{Verbatim}

This is powerful because it allows users to run your program with different
inputs and options \textit{without} having to recompile it (or even have
access to a compiler at all).
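
One thing the examples above take for granted is that the user actually typed
two arguments. If she typed fewer, asking for \texttt{args[1]} would throw an
\texttt{ArrayIndexOutOfBoundsException}. Here's a minimal sketch of one way to
guard against that by checking \texttt{args.length} first. (The
\texttt{Matchup} class and its \texttt{describe()} helper are made up for this
illustration.)

\begin{Verbatim}[fontsize=\footnotesize,samepage=true,frame=single]
class Matchup {
    // Hypothetical helper: describes the game, or returns a usage
    // message if the user supplied fewer than two arguments.
    static String describe(String[] args) {
        if (args.length < 2) {
            return "Usage: java Matchup <team1> <team2>";
        }
        return args[0] + " vs. " + args[1];
    }

    // "String[] args" means the same thing as "String args[]".
    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
\end{Verbatim}

Running \texttt{java Matchup UMW Marymount} prints \texttt{UMW vs. Marymount},
while running \texttt{java Matchup} with no arguments prints the usage message
instead of crashing.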

\section{Sameness vs.~identicality}

\index{sameness}
\index{identicality}
Our last odd/end has to do with how Java determines whether two objects are
``equal'' to each other. It turns out that there are two different definitions
of equality, which must be kept firmly separate in your mind:
\textbf{sameness} and \textbf{identicality}.

Consider the memory diagram in Figure~\ref{fig:sameness}. Here we have four
named variables: \texttt{p1}, \texttt{p2}, \texttt{p3}, and \texttt{p4}. Now
riddle me this, Batman: at the moment this snapshot was taken, which of these
variables do you consider to be \textit{equal} to each other?

\begin{figure}[ht]
\centering
\includegraphics[width=1\textwidth]{sameness.pdf}
\caption{What does ``equal'' mean?}
\label{fig:sameness}
\end{figure}

\index{equality}
I think we can agree that \texttt{p2} does not ``equal'' any of the others.
That leaves the other three. Are \texttt{p1} and \texttt{p3} ``equal?'' Are
\texttt{p3} and \texttt{p4} ``equal?''

It all depends on what your definition of ``equal'' is, of course. And here's
how Java does it: it says that \texttt{p1} and \texttt{p3} are \textbf{the
same}, while \texttt{p3} and \texttt{p4} are \textbf{identical}. ``The same''
means that the two reference variables refer to \textit{the very same} object
in memory. They are literally pointing to the same memory address, and so they
are pointing to the same ``copy'' of the object. This is easy to see by
imagining changes to that object -- if the left-most Josh Gibson changed
uniform numbers when he was traded from the Homestead Grays to the Pittsburgh
Crawfords, both \texttt{p1} and \texttt{p3} would automatically see that
change. They point to the same object, so they refer to the same uniform
number variable: if it changes, they both change.

But \texttt{p3} and \texttt{p4} are not the same -- they're merely identical.
They refer to two \textit{different} objects which just happen to have the
same internal state. Changing one would not affect the other. At the moment,
they're duplicate copies, but copies nevertheless.

(By the way, realize that sameness \textit{implies} identicality. The variables
\texttt{p1} and \texttt{p3} refer to the same object, and therefore they are
\textit{also} identical.)
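
To see the difference in action, here's a small sketch. It assumes a
stripped-down \texttt{Ballplayer} class with a constructor, a getter, and a
setter, none of which appear in the text above, and the uniform numbers are
invented. The variable names match the memory diagram.

\begin{Verbatim}[fontsize=\footnotesize,samepage=true,frame=single]
class Ballplayer {
    private String name;
    private int uni;

    // Hypothetical constructor and accessors, for illustration only.
    Ballplayer(String name, int uni) {
        this.name = name;
        this.uni = uni;
    }
    int getUni() { return uni; }
    void setUni(int uni) { this.uni = uni; }
}

class AliasDemo {
    public static void main(String[] args) {
        Ballplayer p1 = new Ballplayer("Josh Gibson", 20);
        Ballplayer p3 = p1;    // the same object as p1
        Ballplayer p4 = new Ballplayer("Josh Gibson", 20);  // a copy

        p1.setUni(7);          // change the object through p1
        System.out.println(p3.getUni());   // prints 7
        System.out.println(p4.getUni());   // prints 20
    }
}
\end{Verbatim}

The change made through \texttt{p1} shows up through \texttt{p3}, because
there is only one object involved. The copy that \texttt{p4} refers to keeps
its old number.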

\subsection{Testing for sameness and identicality}

Sometimes a programmer cares about one of these conditions, and sometimes the
other. Java has two syntaxes to distinguish between the two tests.

\index{== (double equals)}
\index{equals@\texttt{.equals()}}
\textbf{To test for sameness, use ``\texttt{==}''. To test for identicality, use
``\texttt{.equals()}''.}

\begin{figure}
\begin{Verbatim}[fontsize=\scriptsize,samepage=true,frame=single]
    if (p1 == p3) {
        System.out.println("This WILL print.");
    }
    if (p3 == p4) {
        System.out.println("This will NOT print.");
    }
    if (p1.equals(p3)) {
        System.out.println("This WILL print.");
    }
    if (p3.equals(p4)) {
        System.out.println("This WILL probably print (but see below).");
    }
\end{Verbatim}
\caption{Testing for sameness vs.~identicality.}
\label{fig:samenessCode}
\end{figure}

Figure~\ref{fig:samenessCode} gives an example. In the first two \texttt{if}
statements of that code, we are testing for sameness. So even though the
objects referred to by \texttt{p3} and \texttt{p4} are spittin' images of each
other, identical in every conceivable way, they are nevertheless \textit{not}
``\texttt{==}'' to each other. But they might well be \texttt{.equals()} to
each other\ldots if we take the special step described next.

\subsection{Overriding \texttt{.equals()}}

\index{equals@\texttt{.equals()}}
\index{ballplayer@\texttt{Ballplayer}}
It turns out that the last print statement in the preceding code
will actually \textit{not} get printed \textit{unless we inform Java about
what ``identical'' actually means for this class.} By default, Java will fall
back to just using the sameness test when \texttt{.equals()} is used. The way
to tell it how to test for identicality is to \textit{override} the
\texttt{.equals()} method for the \texttt{Ballplayer} class. Here's how:

\begin{Verbatim}[fontsize=\footnotesize,samepage=true,frame=single]
class Ballplayer {
    private String name;
    private int uni;
    private String pos;

    ...
    public boolean equals(Ballplayer b) {
        if (this.name.equals(b.name) &&
            this.uni == b.uni &&
            this.pos.equals(b.pos)) {
            return true;
        } else {
            return false;
        }
    }
    ...
}
\end{Verbatim}

We've told Java that for \texttt{Ballplayer}s, ``identical'' means objects that
have the same \texttt{name}, \texttt{uni}form number, and \texttt{pos}ition.
We could have done anything else we wanted; for instance:

\begin{Verbatim}[fontsize=\footnotesize,samepage=true,frame=single]
class Ballplayer {
    private String name;
    private int uni;
    private String pos;

    ...
    public boolean equals(Ballplayer b) {
        if (this.name.charAt(0) == b.name.charAt(0)) {
            return true;
        } else {
            return false;
        }
    }
    ...
}
\end{Verbatim}

\index{Gibson, Josh}
\index{DiMaggio, Joe}
Now any two \texttt{Ballplayer}s whose names start with the same letter would
be considered ``identical'' -- \texttt{Josh Gibson} would be considered
identical to \texttt{Joe DiMaggio}. This is a weird idea, and not normally
encouraged. But sometimes a situation does come up where it makes sense to say
that one object \texttt{.equals()} another even if only \textit{some} of their
information matches.
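
A caveat about the \texttt{equals()} methods above, offered as a sketch rather
than the final word: because their parameter is declared as a
\texttt{Ballplayer}, they technically sit \textit{alongside} the
\texttt{equals(Object)} method that every class inherits, instead of replacing
it. That's fine when you call \texttt{p3.equals(p4)} yourself, but library
code such as \texttt{ArrayList.contains()} calls the \texttt{Object} version.
One common way to cover both cases is to declare the parameter as
\texttt{Object} and cast it, and to define a matching \texttt{hashCode()}
while you're at it. The constructor, the sample values, and the
\texttt{EqualsDemo} class below are invented for illustration.

\begin{Verbatim}[fontsize=\footnotesize,samepage=true,frame=single]
import java.util.ArrayList;
import java.util.Objects;

class Ballplayer {
    private String name;
    private int uni;
    private String pos;

    // Hypothetical constructor, for illustration only.
    Ballplayer(String name, int uni, String pos) {
        this.name = name;
        this.uni = uni;
        this.pos = pos;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Ballplayer)) {
            return false;   // null, or some other kind of object
        }
        Ballplayer b = (Ballplayer) o;
        return this.name.equals(b.name) &&
            this.uni == b.uni &&
            this.pos.equals(b.pos);
    }

    // Objects that are .equals() must report the same hash code.
    @Override
    public int hashCode() {
        return Objects.hash(name, uni, pos);
    }
}

class EqualsDemo {
    public static void main(String[] args) {
        ArrayList<Ballplayer> roster = new ArrayList<>();
        roster.add(new Ballplayer("Josh Gibson", 20, "C"));
        Ballplayer copy = new Ballplayer("Josh Gibson", 20, "C");
        System.out.println(roster.contains(copy));   // prints true
    }
}
\end{Verbatim}

The \texttt{@Override} annotation asks the compiler to complain if the method
doesn't actually replace an inherited one, which catches a mistyped signature
early.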

\subsection{\textit{Beware} comparing \texttt{String}s! (Use \texttt{.equals()})}

\index{equals@\texttt{.equals()}}
Notice how in the very definition of \texttt{Ballplayer.equals()} we called
the \texttt{.equals()} method for \texttt{String}s. You may have expected it
to say:

\begin{Verbatim}[fontsize=\scriptsize,samepage=true]
    public boolean equals(Ballplayer b) {
        // WRONG
        if (this.name == b.name &&
            this.uni == b.uni &&
            this.pos == b.pos) {
            return true;
        } else {
            return false;
        }
    }
\end{Verbatim}

instead. After all, that's less typing, right?

But my choice here was deliberate, and when comparing \texttt{String}s in
particular \textit{you must always use \texttt{.equals()}} like this. The
reason is esoteric, and has to do with how Java tries to conserve memory:
string literals with identical contents share a single \texttt{String} object,
while \texttt{String}s built as the program runs (read from the keyboard or a
file, say) normally get objects of their own. The upshot of this is that
``\texttt{==}'' \textit{sometimes} works the way you expect, and other times
does not. One minute you'll find that two variables holding
\texttt{"Satchel"} are ``\texttt{==}'' to each other, and then a moment later
you'll discover that two variables holding \texttt{"Buck"} are not. It'll seem
random, though it really depends on where each \texttt{String} came from. But
if you always use \texttt{.equals()} to compare \texttt{String}s, it'll always
return true when their contents are exactly the same, and you won't have any
painful late-night debugging sessions (for that reason, at least).
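
Here's a small sketch that shows both behaviors side by side. The class name
\texttt{StringCompareDemo} is made up, and \texttt{new String(...)} is used
only as an easy way to force Java to create a separate object with the same
contents.

\begin{Verbatim}[fontsize=\footnotesize,samepage=true,frame=single]
class StringCompareDemo {
    public static void main(String[] args) {
        String a = "Buck";
        String b = "Buck";              // literal: shares a's object
        String c = new String("Buck");  // forces a brand-new object

        System.out.println(a == b);        // prints true
        System.out.println(a == c);        // prints false
        System.out.println(a.equals(c));   // prints true
    }
}
\end{Verbatim}

Only the \texttt{.equals()} comparison gives the answer you'd expect no
matter where the two \texttt{String}s came from.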