Keeping the country list up-to-date

Hi, I just tried using this library to check against the [ISO-3166 codes supported](https://hackage.haskell.org/package/libphonenumber-0.1.3.0/docs/Data-PhoneNumber-Util.html#v:supportedRegions) by the libphonenumber library, which uses the C bindings of [Google's libphonenumber](https://github.com/google/libphonenumber).

It appears that two regions are in that list but not in this one. Both are [sub-divisions of `SH`](https://en.wikipedia.org/wiki/ISO_3166-2:SH#Current_codes) that also have their own codes:

* [`AC`](https://www.iso.org/obp/ui#iso:code:3166:AC) - Ascension Island
* [`TA`](https://www.iso.org/obp/ui#iso:code:3166:TA) - Tristan da Cunha

Since the temporarily assigned `XK` is already in this list, should we also include the two codes above?

--
On a side note:

The `countries.csv` file in the repository has not been updated in the last 8 years, and the list of supported regions for the `libphonenumber` library hasn't changed in the last 4 years either. Still, is it possible to get some details on what conditions a change to the CSV has to meet, and on how that CSV is generated?

You probably know about it already, but there is a repository called `datasets/country-codes` which contains a detailed list of countries at https://github.com/datasets/country-codes/blob/main/data/country-codes.csv
(It is also missing the `AC` and `TA` codes, as well as `XK`, which the current CSV includes but the datasets project [declined](https://github.com/datasets/country-codes/issues/66) to include for now.)

I tried to regenerate the CSV from that source with some Python code:
``` python
import pandas as pd

# "NA" is Namibia's alpha-2 code; pandas treats it as missing by default,
# so only empty fields are read as NaN here.
read_opts = dict(dtype=str, keep_default_na=False, na_values=[""])

# Current CSV from this repository
cs = pd.read_csv('https://github.com/byteverse/country/raw/refs/heads/main/countries.csv', **read_opts)
# datasets/country-codes CSV, renamed to this repository's column names
csn = pd.read_csv('https://github.com/datasets/country-codes/raw/refs/heads/main/data/country-codes.csv', **read_opts).rename(columns={
    "official_name_en": "name",
    "ISO3166-1-Alpha-2": "alpha-2",
    "ISO3166-1-Alpha-3": "alpha-3",
    "ISO3166-1-numeric": "country-code",
    "Region Name": "region",
    "Sub-region Name": "sub-region",
    "Region Code": "region-code",
    "Sub-region Code": "sub-region-code",
})

def zero_pad_digits(s):
    """Zero-pad a numeric code to three digits; leave missing values (NaN) as-is."""
    return f"{int(s):03d}" if isinstance(s, str) else s

csn["iso_3166-2"] = csn["alpha-2"].apply(lambda s: f"ISO 3166-2:{s}" if isinstance(s, str) else s)
for col in ["country-code", "region-code", "sub-region-code"]:
    csn[col] = csn[col].apply(zero_pad_digits)
csn = csn[["name", "alpha-2", "alpha-3", "country-code", "iso_3166-2", "region", "sub-region", "region-code", "sub-region-code"]]
csn.to_csv('~/Downloads/countries-new.csv', index=False)

# Rows whose three codes appear in only one of the two files
columns_to_compare = ['alpha-2', 'alpha-3', 'country-code']
print(pd.concat([cs[columns_to_compare], csn[columns_to_compare]]).drop_duplicates(keep=False))
```
(The last line also compares the two versions on the three relevant code columns.)
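
The `drop_duplicates(keep=False)` comparison lists the rows that differ, but not which file each row came from. A possible alternative, sketched here on a few toy rows rather than the real CSVs, is an outer merge with `indicator=True`. The helper name `diff_by_source` and the `source` column are made up for illustration.

```python
import pandas as pd

def diff_by_source(old, new, keys):
    """Return rows whose key columns appear in only one frame, labelled by origin."""
    keys = list(keys)
    merged = old[keys].merge(new[keys], on=keys, how="outer", indicator=True)
    diff = merged[merged["_merge"] != "both"].copy()
    # Translate pandas' left_only/right_only labels into something readable.
    labels = {"left_only": "old only", "right_only": "new only"}
    diff["source"] = diff["_merge"].astype(str).map(labels)
    return diff.drop(columns="_merge").reset_index(drop=True)

# Toy stand-ins for the two CSVs (illustrative rows only, not the real files).
keys = ["alpha-2", "alpha-3", "country-code"]
old = pd.DataFrame([["SH", "SHN", "654"], ["NA", "NAM", "516"], ["DE", "DEU", "276"]], columns=keys)
new = pd.DataFrame([["SH", "SHN", "654"], ["NA", "NAM", "516"], ["FR", "FRA", "250"]], columns=keys)

print(diff_by_source(old, new, keys))
```

With the real data, `old` and `new` would be the `cs` and `csn` frames from the script.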

The downside is that I see a lot of changes in the **sub-region** and **sub-region-code** columns. In any case, I'm attaching the generated CSV in case it's useful.
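
To put a number on those changes, one option is to align both tables on `alpha-2` and count the differing cells per column. This is only a sketch on toy rows: the helper name `count_changes` is hypothetical, the sub-region values below are illustrative rather than taken from either file, and it assumes `alpha-2` is unique in both tables.

```python
import pandas as pd

def count_changes(old, new, key="alpha-2"):
    """Count, per shared column, how many shared rows differ between old and new."""
    a = old.set_index(key)
    b = new.set_index(key)
    rows = a.index.intersection(b.index)
    cols = a.columns.intersection(b.columns)
    # Treat missing values as empty strings so NaN vs NaN is not counted as a change.
    a = a.loc[rows, cols].fillna("")
    b = b.loc[rows, cols].fillna("")
    return (a != b).sum()

# Toy rows only; the sub-region values are made up for illustration.
old = pd.DataFrame({
    "alpha-2": ["DE", "SH"],
    "name": ["Germany", "Saint Helena"],
    "sub-region": ["Western Europe", "Western Africa"],
})
new = pd.DataFrame({
    "alpha-2": ["DE", "SH"],
    "name": ["Germany", "Saint Helena"],
    "sub-region": ["Western Europe", "Sub-Saharan Africa"],
})

print(count_changes(old, new))
```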

[countries-new.csv](https://github.com/user-attachments/files/19109386/countries-new.csv)
