Description
Hi, I just tried using this library to check against the ISO-3166 codes supported by the libphonenumber library, which uses the C bindings of Google's libphonenumber.
It appears there are two regions in that list that are missing here: AC and TA
(both appear to be subdivisions of SH, but they also have their own codes).
Since the temporarily assigned XK has already been added to this list, should these two be included as well?
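For reference, the check I ran boils down to a set difference. Here is a minimal sketch, assuming both code lists are already available as plain Python collections. The sample values are illustrative stand-ins rather than the real lists, and `missing_regions` is a hypothetical helper name:

```python
def missing_regions(supported, known):
    """Return region codes in `supported` that are absent from `known`, sorted."""
    known_upper = {code.upper() for code in known}
    return sorted({code.upper() for code in supported} - known_upper)


# Illustrative samples only: a few codes standing in for the full lists.
phone_regions = ["SH", "AC", "TA", "XK", "US"]  # e.g. regions reported by libphonenumber
csv_alpha2 = ["SH", "XK", "US"]                 # e.g. the alpha-2 column of countries.csv

print(missing_regions(phone_regions, csv_alpha2))
```

Comparing case-insensitively is a choice made for this sketch; drop the `upper()` calls if both sources are known to use uppercase codes.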
--
on a side note,
The countries.csv file in the repository has not been updated in the last 8 years, and the list of regions supported by libphonenumber has not changed in the last 4 years either. Still, could you give some details on what conditions have to be met for a change to go into the CSV, and on how that CSV is generated?
You probably know about it already, but there is a repository called datasets/country-codes that contains a detailed list of countries at https://github.com/datasets/country-codes/blob/main/data/country-codes.csv
(It is also missing the codes AC and TA, as well as XK, which the current CSV includes but which the datasets project has [declined](https://github.com/datasets/country-codes/issues/66) to include for now.)
I tried regenerating the CSV from that source with some Python code:
import pandas as pd
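# keep_default_na=False with na_values=[""] stops pandas from reading Namibia's "NA" as a missing value.
cs = pd.read_csv('https://github.com/byteverse/country/raw/refs/heads/main/countries.csv', dtype=str, keep_default_na=False, na_values=[""])
csn = pd.read_csv('https://github.com/datasets/country-codes/raw/refs/heads/main/data/country-codes.csv', dtype=str, keep_default_na=False, na_values=[""]).rename(columns={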
"official_name_en": "name",
"ISO3166-1-Alpha-2":"alpha-2",
"ISO3166-1-Alpha-3":"alpha-3",
"ISO3166-1-numeric":"country-code",
"Region Name": "region",
"Sub-region Name": "sub-region",
"Region Code": "region-code",
"Sub-region Code": "sub-region-code",
})
def zeroPadDigits(s):
    # Left-pad numeric codes to three digits; missing values (NaN) pass through unchanged.
    if isinstance(s, str):
        return f"{int(s):03d}"
    else:
        return s
csn["iso_3166-2"] = csn["alpha-2"].apply(lambda s: f"ISO 3166-2:{s}")
csn["country-code"] = csn["country-code"].apply(zeroPadDigits)
csn["region-code"] = csn["region-code"].apply(zeroPadDigits)
csn["sub-region-code"] = csn["sub-region-code"].apply(zeroPadDigits)
csn = csn[["name", "alpha-2", "alpha-3","country-code","iso_3166-2","region","sub-region","region-code","sub-region-code"]]
csn.to_csv('~/Downloads/countries-new.csv', index=False)
columns_to_compare = pd.Index(['alpha-2', 'alpha-3', 'country-code'])
pd.concat([cs[columns_to_compare], csn[columns_to_compare]]).drop_duplicates(keep=False)
(The last two lines also compare the two versions on the three relevant code columns.)
The downside is that I see a lot of changes in the sub-region and sub-region-code columns. In any case, I'm going to drop the generated CSV here in case it's of any use.
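To get a feel for where the regenerated file diverges, one option is to count differences per column, matching rows on alpha-2. This is a rough sketch using only the standard library. The inline CSV snippets are made-up stand-ins for the two files, and `count_changes` is a hypothetical helper name:

```python
import csv
import io
from collections import Counter


def count_changes(old_csv, new_csv, key="alpha-2"):
    """Count, per column, how many rows sharing `key` differ between two CSV texts."""
    old_rows = {row[key]: row for row in csv.DictReader(io.StringIO(old_csv))}
    changes = Counter()
    for new_row in csv.DictReader(io.StringIO(new_csv)):
        old_row = old_rows.get(new_row[key])
        if old_row is None:
            continue  # row exists only in the new file, so it is not counted as a change
        for column, new_value in new_row.items():
            if column in old_row and old_row[column] != new_value:
                changes[column] += 1
    return changes


# Made-up sample rows, not taken from either real file.
old = "alpha-2,sub-region,sub-region-code\nAA,Old Name,001\nBB,Same,002\n"
new = "alpha-2,sub-region,sub-region-code\nAA,New Name,003\nBB,Same,002\n"

print(count_changes(old, new))
```

Rows that exist in only one of the files are skipped here on purpose, since the concat/drop_duplicates comparison above already surfaces those.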