-
Notifications
You must be signed in to change notification settings - Fork 424
CLDR-15830 Refactor getName: new class NameGetter #4226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
-Each CLDRFile has a NameGetter; all getName methods moved to NameGetter
-All three CLDRFile constructors need to initialize NameGetter
-Further renaming for clarity -Make methods private where possible -Move some calls to nameGetter() out of loops, re-use -Remove some dead code -Comments
macchiati
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just had a few comments.
| */ | ||
| public NameGetter(CLDRFile cldrFile) { | ||
| if (cldrFile == null) { | ||
| throw new RuntimeException("NameGetter must have non-null CLDRFile"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IllegalArgumentException is more specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in latest commit
| this.cldrFile = cldrFile; | ||
| } | ||
|
|
||
| private static final String GETNAME_LOCALE_SEPARATOR = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might put PATH in the name, also for similar ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in latest commit
| * @return the name | ||
| */ | ||
| public String getNameFromTypestrCode(String type, String code) { | ||
| return getNameFromTypenumCode(CLDRFile.typeNameToCode(type), code); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once you get this merged, a second refactoring is to inline this and similar ones. 'Typenum' is hard to read; TypeEnum is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll address that in a subsequent PR. By "inline", do you mean, for example, to change getNameFromTypestrCode("territory", territory) to getNameFromTypenumCode(CLDRFile.TERRITORY_NAME, territory)? And then subsequently (or in the same PR) change CLDRFile.TERRITORY_NAME, etc., to an enum?
| * @param code a code such as "JP" | ||
| * @return the name | ||
| */ | ||
| public String getNameFromTypenumCode(int type, String code) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also get rid of the CLDR fake enums like CLDRFile.TERRITORY_NAME in favor of enums. This is probably easier in a second PR, later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right; see above comment
| * @return the name of the given bcp47 identifier | ||
| */ | ||
| public synchronized String getNameFromBCP47(String localeOrTZID) { | ||
| return getNameFromBCP47Bool(localeOrTZID, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as we are writing new code, best to avoid having a boolean in any API, and rather use a meaningful enum. Again, could be done in later PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I'll include that in a subsequent PR
-Make CLDRFile.nameGetter final -Remove 2 unused constants in CLDRFile -Change RuntimeException to IllegalArgumentException in NameGetter constructor -Rename 3 constants to end in _PATH -In TestUtilities.printCountries call nameGetter() once instead of repeatedly
|
I've made this ready for review. So far this is fairly trivial refactoring but affects many files, so better not to keep changing on the same PR |
srl295
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's OK but the naming is awkward. Many of the functions could have the same name, just with different overloaded parameters. A builder (fluent) API model might work better, to mutate a set of options rather than have a confusing and similar set of functions to choose from. BCP47 does not have @ signs in it, so a different name should be used.
I'm approving this because it's an important refactor.
| import java.util.regex.Matcher; | ||
| import java.util.regex.Pattern; | ||
|
|
||
| public class NameGetter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming is a little awkward, isn't it a display name formatter?
Perhaps the naming should parallel ICU4J at least somewhat. https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/LocaleDisplayNames.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't it a display name formatter?
Good question! Is that what it is?
BCP47 does not have @ signs in it, so a different name should be used.
That sounds important. Some of the methods already had comments saying parameters were bcp47 identifiers, and I carried that over to the related methods (callers/callees). If that's inaccurate, what would be a more accurate description? Part of the problem with this whole module is that there are strings and integers coming in without a clear explanation of what strings and integers are actually expected/allowed, and that's why Mark and I have been discussing an Enum instead of strings and integers, wherever that's possible...
| * @param code a code such as "JP" | ||
| * @return the name | ||
| */ | ||
| public String getNameFromTypestrCode(String type, String code) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe forTypeAndCode() ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, eventually we might not need/want to keep "getName" in the method names
|
|
||
| /** | ||
| * Returns the name of the given bcp47 identifier. Note that extensions must be specified using | ||
| * the old "\@key=type" syntax. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then it's not BCP47!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is just moved here from CLDRFile; I haven't really written any new code other than the glue between CLDRFile and NameGetter. Is the given identifier really "a bcp47 identifier or an extension thereof"? Pardon my ignorance!
| return getNameFromBCP47BoolAlt(localeOrTZID, onlyConstructCompound, null); | ||
| } | ||
|
|
||
| public String getNameFromParserBool(LanguageTagParser lparser, boolean onlyConstructCompound) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could be an overload, forLocale() taking a LanguageTagParser as a et
| null); | ||
| } | ||
|
|
||
| public synchronized String getNameFromBCP47Etc( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Etc is a little awkward, a Builder model might be better.
| * @param paths if non-null, fillin with contributory paths | ||
| * @return the name of the given bcp47 identifier | ||
| */ | ||
| private synchronized String getNameFromManyThings( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, a builder model that sets up options might be better than a kitchen sink set of parameters.
new NameFormatter(locale).onlyConstructCompound(true).withLocalePatter(pattern)…
Steven, one reason for this refactor is that it was difficult to see how any particular version of getName() was actually used. With current main, IntelliJ reports getName "has 347 callers". Probably there's a way to find callers without including overloaded versions, but anyway 14 different methods in the same class with the same name is just too much for my little brain to work with. Refactoring is often easier if you can use global search-and-replace; methods with identical names get in the way of that. Another reason for the refactor is that these methods were tangled up in the middle of CLDRFile which is just too large for maintainable code. I agree that the names are somewhat awkward or arbitrary, but for me at least they give me a fighting chance to make sense what's calling what. After the code is reasonably clean, a nice finishing touch might be to restore some overloading, within reason. I complete agree with you that some of these methods have a "kitchen sink of parameters" -- best practice is that no method should have more than about 4 parameters. A possible solution is to make them members of a class, and I suspect that's what you mean by a Builder model. I can still hardly understand half of what's going on in this new class, so I can only revise so much at a time :-) |
That's a good point for that analysis. Since it's a library, I think it's harder to get this kind of analysis otherwise - SurveyTool might call it deep inside something for example (cross library), so "function is unused" warnings don't help.
💯
sounds good…
At leisure take a look at https://www.baeldung.com/java-fluent-interface-vs-builder-pattern I think it's the Builder model (not fluent, though they have similarities). Yes, you'd have basically a Settings-type object that gets modified by calling functions on it (which you could chain) and then it ends up with a toString() call when you're ready to output it as text.
I'm with you! But it's a good refactor—— and Bottom line is that's why I hit Approve. |
-Each CLDRFile has a NameGetter; all getName methods moved to NameGetter
-Give GetName methods distinct names for clarify and to avoid excessive overloading
-Make methods private where possible
-Move some calls to nameGetter() out of loops, re-use
-Remove some dead code
-Comments
CLDR-15830
ALLOW_MANY_COMMITS=true