src/main/java/org/apache/commons/codec/language/Soundex.java
Lines changed: 42 additions & 65 deletions
@@ -21,21 +21,20 @@
 import org.apache.commons.codec.StringEncoder;

 /**
- * Encodes a string into a Soundex value. Soundex is an encoding used to relate similar names, but can also be used as a
- * general purpose scheme to find word with similar phonemes.
+ * Encodes a string into a Soundex value. Soundex is an encoding used to relate similar names, but can also be used as a general purpose scheme to find word
+ * with similar phonemes.
  *
- * <p>This class is thread-safe.
- * Although not strictly immutable, the mutable fields are not actually used.</p>
+ * <p>
+ * This class is thread-safe. Although not strictly immutable, the mutable fields are not actually used.
+ * </p>
  */
 public class Soundex implements StringEncoder {

     /**
-     * The marker character used to indicate a silent (ignored) character.
-     * These are ignored except when they appear as the first character.
+     * The marker character used to indicate a silent (ignored) character. These are ignored except when they appear as the first character.
      * <p>
-     * Note: The {@link #US_ENGLISH_MAPPING_STRING} does not use this mechanism
-     * because changing it might break existing code. Mappings that don't contain
-     * a silent marker code are treated as though H and W are silent.
+     * Note: The {@link #US_ENGLISH_MAPPING_STRING} does not use this mechanism because changing it might break existing code. Mappings that don't contain a
+     * silent marker code are treated as though H and W are silent.
      * </p>
      * <p>
      * To override this, use the {@link #Soundex(String, boolean)} constructor.
@@ -46,69 +45,57 @@ public class Soundex implements StringEncoder {
     public static final char SILENT_MARKER = '-';

     /**
-     * This is a default mapping of the 26 letters used in US English. A value of {@code 0} for a letter position
-     * means do not encode, but treat as a separator when it occurs between consonants with the same code.
+     * This is a default mapping of the 26 letters used in US English. A value of {@code 0} for a letter position means do not encode, but treat as a separator
+     * when it occurs between consonants with the same code.
      * <p>
-     * (This constant is provided as both an implementation convenience and to allow Javadoc to pick
-     * up the value for the constant values page.)
+     * (This constant is provided as both an implementation convenience and to allow Javadoc to pick up the value for the constant values page.)
      * </p>
      * <p>
-     * <strong>Note that letters H and W are treated specially.</strong>
-     * They are ignored (after the first letter) and don't act as separators
-     * between consonants with the same code.
+     * <strong>Note that letters H and W are treated specially.</strong> They are ignored (after the first letter) and don't act as separators between
-     * An instance of Soundex using the US_ENGLISH_MAPPING mapping.
-     * This treats H and W as silent letters.
-     * Apart from when they appear as the first letter, they are ignored.
-     * They don't act as separators between duplicate codes.
+     * An instance of Soundex using the US_ENGLISH_MAPPING mapping. This treats H and W as silent letters. Apart from when they appear as the first letter, they
+     * are ignored. They don't act as separators between duplicate codes.
-     * An instance of Soundex using the Simplified Soundex mapping, as described here:
-     * http://west-penwith.org.uk/misc/soundex.htm
+     * An instance of Soundex using the Simplified Soundex mapping, as described here: http://west-penwith.org.uk/misc/soundex.htm
      * <p>
-     * This treats H and W the same as vowels (AEIOUY).
-     * Such letters aren't encoded (after the first), but they do
-     * act as separators when dropping duplicate codes.
-     * The mapping is otherwise the same as for {@link #US_ENGLISH}
+     * This treats H and W the same as vowels (AEIOUY). Such letters aren't encoded (after the first), but they do act as separators when dropping duplicate
+     * codes. The mapping is otherwise the same as for {@link #US_ENGLISH}.
      * An instance of Soundex using the mapping as per the Genealogy site: http://www.genealogy.com/articles/research/00000060.html
      * <p>
-     * This treats vowels (AEIOUY), H and W as silent letters.
-     * Such letters are ignored (after the first) and do not
-     * act as separators when dropping duplicate codes.
+     * This treats vowels (AEIOUY), H and W as silent letters. Such letters are ignored (after the first) and do not act as separators when dropping duplicate
+     * codes.
      * </p>
      * <p>
-     * The codes for consonants are otherwise the same as for
-     * {@link #US_ENGLISH_MAPPING_STRING} and {@link #US_ENGLISH_SIMPLIFIED}
+     * The codes for consonants are otherwise the same as for {@link #US_ENGLISH_MAPPING_STRING} and {@link #US_ENGLISH_SIMPLIFIED}.
      * The maximum length of a Soundex code - Soundex codes are only four characters by definition.
@@ -119,17 +106,16 @@ public class Soundex implements StringEncoder {
     private int maxLength = 4;

     /**
-     * Every letter of the alphabet is "mapped" to a numerical value. This char array holds the values to which each
-     * letter is mapped. This implementation contains a default map for US_ENGLISH
+     * Every letter of the alphabet is "mapped" to a numerical value. This char array holds the values to which each letter is mapped. This implementation
+     * contains a default map for US_ENGLISH.
      */
     private final char[] soundexMapping;

     /**
      * Should H and W be treated specially?
      * <p>
-     * In versions of the code prior to 1.11,
-     * the code always treated H and W as silent (ignored) letters.
-     * If this field is false, H and W are no longer special-cased.
+     * In versions of the code prior to 1.11, the code always treated H and W as silent (ignored) letters. If this field is false, H and W are no longer
+     * special-cased.
      * </p>
      */
     private final boolean specialCaseHW;
@@ -146,33 +132,30 @@ public Soundex() {
     }

     /**
-     * Creates a Soundex instance using the given mapping. This constructor can be used to provide an internationalized
-     * mapping for a non-Western character set.
+     * Creates a Soundex instance using the given mapping. This constructor can be used to provide an internationalized mapping for a non-Western character set.
      * <p>
-     * Every letter of the alphabet is "mapped" to a numerical value. This char array holds the values to which each
-     * letter is mapped. This implementation contains a default map for US_ENGLISH
+     * Every letter of the alphabet is "mapped" to a numerical value. This char array holds the values to which each letter is mapped. This implementation
+     * contains a default map for US_ENGLISH
      * </p>
      * <p>
      * If the mapping contains an instance of {@link #SILENT_MARKER} then H and W are not given special treatment.
      * </p>
      *
-     * @param mapping
-     * Mapping array to use when finding the corresponding code for a given character.
+     * @param mapping Mapping array to use when finding the corresponding code for a given character.
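
The Javadoc touched by this diff describes three treatments of H and W (and, in the Genealogy variant, of vowels): ignored entirely, kept as separators between duplicate codes, or marked silent through the mapping itself. The standalone sketch below illustrates that distinction. It is not the code from Soundex.java: the class name `SoundexSketch`, the `encode` helper, and both mapping strings are assumptions made for illustration, since the diff shows neither the constants' values nor the encoding method.

```java
import java.util.Locale;

// Hypothetical illustration only; not the Commons Codec implementation.
final class SoundexSketch {
    // Marker for letters that are skipped and do not separate duplicate codes.
    static final char SILENT_MARKER = '-';
    // Assumed mapping for A..Z; '0' means "not encoded, but acts as a separator".
    static final String US_ENGLISH = "01230120022455012623010202";
    // Assumed mapping in which vowels, H and W carry the silent marker.
    static final String VOWELS_SILENT = "-123-12--22455-12623-1-2-2";

    static String encode(String name, String mapping, boolean specialCaseHW) {
        // Keep only the letters A..Z so that mapping lookups stay in range.
        StringBuilder letters = new StringBuilder();
        for (char c : name.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c >= 'A' && c <= 'Z') {
                letters.append(c);
            }
        }
        if (letters.length() == 0) {
            return "";
        }
        char[] out = {'0', '0', '0', '0'}; // codes are four characters, zero padded
        int count = 0;
        char first = letters.charAt(0);
        out[count++] = first;              // the first letter is always kept
        char last = mapping.charAt(first - 'A');
        for (int i = 1; i < letters.length() && count < out.length; i++) {
            char ch = letters.charAt(i);
            if (specialCaseHW && (ch == 'H' || ch == 'W')) {
                continue;                  // ignored, does not reset 'last'
            }
            char digit = mapping.charAt(ch - 'A');
            if (digit == SILENT_MARKER) {
                continue;                  // silent, does not reset 'last'
            }
            if (digit != '0' && digit != last) {
                out[count++] = digit;      // store new, non-repeated codes only
            }
            last = digit;                  // a '0' here separates equal codes
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(encode("Ashcraft", US_ENGLISH, true));     // A261
        System.out.println(encode("Ashcraft", US_ENGLISH, false));    // A226
        System.out.println(encode("Tymczak", US_ENGLISH, true));      // T522
        System.out.println(encode("Tymczak", VOWELS_SILENT, false));  // T520
    }
}
```

In "Ashcraft" the H sits between S and C, which share code 2. When H is ignored the pair collapses to a single 2, and when H acts as a separator both are kept. In "Tymczak" the A between Z and K separates the two 2 codes only when vowels are not marked silent.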