Commit 1f419ab

pwhois display bugfix, and massively improved normalization: removal of empty-like values (-, n/a, etc.), mapping of abbreviations for countries, US/CA/AU states and airport codes to full locality names.
1 parent a417f60 commit 1f419ab

74 files changed, with 8506 additions and 68 deletions.

README.md

Lines changed: 17 additions & 0 deletions
@@ -51,6 +51,23 @@ The manual (including install instructions) can be found in the doc/ directory.

 If any of those apply, don't hesitate to file an issue! The goal is 100% coverage, and we need your feedback to reach that goal.

+## License
+
+This library may be used under the WTFPL - or, if you take issue with that, consider it to be under the CC0.
+
+## Data sources
+
+This library uses a number of third-party datasets for normalization:
+
+* `airports.dat`: [OpenFlights Airports Database](http://openflights.org/data.html) ([Open Database License 1.0](http://opendatacommons.org/licenses/odbl/1.0/), [Database Contents License 1.0](http://opendatacommons.org/licenses/dbcl/1.0/))
+* `countries.dat`: [Country List](https://github.com/umpirsky/country-list) (MIT license)
+* `countries3.dat`: [ISO countries list](https://gist.github.com/eparreno/205900) (license unspecified)
+* `states_au.dat`: Part of `pythonwhois` (WTFPL/CC0)
+* `states_us.dat`: [State Table](http://statetable.com/) (license unspecified, free reuse encouraged)
+* `states_ca.dat`: [State Table](http://statetable.com/) (license unspecified, free reuse encouraged)
+
+Be aware that the OpenFlights database in particular has potential licensing consequences; if you do not wish to be bound by these potential consequences, you may simply delete the `airports.dat` file from your distribution. `pythonwhois` will assume there is no database available, and will not perform airport code conversion (but still function correctly otherwise). This also applies to other included datasets.
+
 ## Contributing

 Feel free to fork and submit pull requests (to the `develop` branch)! If you change any parsing or normalization logic, ensure to run the full test suite before opening a pull request. Instructions for that are below.
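
The README text above says a dataset file can be deleted and the library will carry on without that conversion. The following is a minimal sketch of that pattern, not the code `pythonwhois` actually uses: the function names `load_optional_table` and `expand`, and the assumption of a two-column CSV layout, are hypothetical and chosen only for illustration.

```python
import csv
import os


def load_optional_table(path):
    """Load a two-column CSV into a dict; return {} when the file is missing."""
    if not os.path.isfile(path):
        # The dataset was removed from the distribution: act as if no
        # database is available instead of raising an error.
        return {}
    with open(path, newline="", encoding="utf-8") as handle:
        return {row[0]: row[1] for row in csv.reader(handle) if len(row) >= 2}


def expand(code, table):
    """Return the full name for a code, or the code unchanged if unknown."""
    return table.get(code, code)


# Hypothetical path that does not exist, standing in for a deleted dataset.
airports = load_optional_table("no-such-dir/airports.dat")
print(expand("AMS", airports))  # the code passes through unchanged
```

With an empty table every lookup falls back to the input value, so callers do not need a separate code path for the "dataset deleted" case.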

pwhois

Lines changed: 5 additions & 2 deletions
@@ -1,6 +1,6 @@
 #!/usr/bin/env python2

-import argparse, pythonwhois, json, datetime
+import argparse, pythonwhois, json, datetime, sys
 try:
     from collections import OrderedDict
 except ImportError as e:
@@ -100,7 +100,10 @@ else:
     for key, value in data_map.items():
         if key in contact_data and contact_data[key] is not None:
             label = " " + value + (" " * (widest_label - len(value))) + " :"
-            actual_data = contact_data[key]
+            if sys.version_info < (3, 0):
+                actual_data = unicode(contact_data[key])
+            else:
+                actual_data = str(contact_data[key])
             if "\n" in actual_data: # Indent multi-line values properly
                 lines = actual_data.split("\n")
                 actual_data = "\n".join([lines[0]] + [(" " * (widest_label + 7)) + line for line in lines[1:]])
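
The `pwhois` hunk above coerces each contact value to text before the `"\n" in actual_data` check. The commit does not state the reason, but that check raises a `TypeError` on non-string values such as integers, so the coercion plausibly guards against that. The sketch below isolates the same idea; the helper names `to_text` and `indent_continuation` are hypothetical and do not appear in `pwhois`.

```python
import sys

# Pick the text type once instead of branching at every call site.
if sys.version_info < (3, 0):
    text_type = unicode  # noqa: F821 (only defined on Python 2)
else:
    text_type = str


def to_text(value):
    """Coerce any value (int, datetime, ...) to the interpreter's text type."""
    return text_type(value)


def indent_continuation(text, width):
    """Indent every line after the first by `width` spaces."""
    lines = text.split("\n")
    return "\n".join([lines[0]] + [(" " * width) + line for line in lines[1:]])


print(indent_continuation(to_text("Line one\nLine two"), 4))
```

Resolving `text_type` at import time keeps the version check out of the display loop; the behaviour is otherwise the same as the inline `if`/`else` in the diff.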

pythonwhois/airports.dat

Lines changed: 7733 additions & 0 deletions
Large diffs are not rendered by default.

pythonwhois/countries.dat

Lines changed: 265 additions & 0 deletions
@@ -0,0 +1,265 @@
+iso,name
+AF,Afghanistan
+AL,Albania
+DZ,Algeria
+AS,"American Samoa"
+AD,Andorra
+AO,Angola
+AI,Anguilla
+AQ,Antarctica
+AG,"Antigua and Barbuda"
+AR,Argentina
+AM,Armenia
+AW,Aruba
+AU,Australia
+AT,Austria
+AZ,Azerbaijan
+BS,Bahamas
+BH,Bahrain
+BD,Bangladesh
+BB,Barbados
+BY,Belarus
+BE,Belgium
+BZ,Belize
+BJ,Benin
+BM,Bermuda
+BT,Bhutan
+BO,Bolivia
+BA,"Bosnia and Herzegovina"
+BW,Botswana
+BV,"Bouvet Island"
+BR,Brazil
+BQ,"British Antarctic Territory"
+IO,"British Indian Ocean Territory"
+VG,"British Virgin Islands"
+BN,Brunei
+BG,Bulgaria
+BF,"Burkina Faso"
+BI,Burundi
+KH,Cambodia
+CM,Cameroon
+CA,Canada
+CT,"Canton and Enderbury Islands"
+CV,"Cape Verde"
+KY,"Cayman Islands"
+CF,"Central African Republic"
+TD,Chad
+CL,Chile
+CN,China
+CX,"Christmas Island"
+CC,"Cocos [Keeling] Islands"
+CO,Colombia
+KM,Comoros
+CG,"Congo - Brazzaville"
+CD,"Congo - Kinshasa"
+CK,"Cook Islands"
+CR,"Costa Rica"
+HR,Croatia
+CU,Cuba
+CY,Cyprus
+CZ,"Czech Republic"
+CI,"Côte d’Ivoire"
+DK,Denmark
+DJ,Djibouti
+DM,Dominica
+DO,"Dominican Republic"
+NQ,"Dronning Maud Land"
+DD,"East Germany"
+EC,Ecuador
+EG,Egypt
+SV,"El Salvador"
+GQ,"Equatorial Guinea"
+ER,Eritrea
+EE,Estonia
+ET,Ethiopia
+FK,"Falkland Islands"
+FO,"Faroe Islands"
+FJ,Fiji
+FI,Finland
+FR,France
+GF,"French Guiana"
+PF,"French Polynesia"
+TF,"French Southern Territories"
+FQ,"French Southern and Antarctic Territories"
+GA,Gabon
+GM,Gambia
+GE,Georgia
+DE,Germany
+GH,Ghana
+GI,Gibraltar
+GR,Greece
+GL,Greenland
+GD,Grenada
+GP,Guadeloupe
+GU,Guam
+GT,Guatemala
+GG,Guernsey
+GN,Guinea
+GW,Guinea-Bissau
+GY,Guyana
+HT,Haiti
+HM,"Heard Island and McDonald Islands"
+HN,Honduras
+HK,"Hong Kong"
+HU,Hungary
+IS,Iceland
+IN,India
+ID,Indonesia
+IR,Iran
+IQ,Iraq
+IE,Ireland
+IM,"Isle of Man"
+IL,Israel
+IT,Italy
+JM,Jamaica
+JP,Japan
+JE,Jersey
+JT,"Johnston Island"
+JO,Jordan
+KZ,Kazakhstan
+KE,Kenya
+KI,Kiribati
+KW,Kuwait
+KG,Kyrgyzstan
+LA,Laos
+LV,Latvia
+LB,Lebanon
+LS,Lesotho
+LR,Liberia
+LY,Libya
+LI,Liechtenstein
+LT,Lithuania
+LU,Luxembourg
+MO,"Macau SAR China"
+MK,Macedonia
+MG,Madagascar
+MW,Malawi
+MY,Malaysia
+MV,Maldives
+ML,Mali
+MT,Malta
+MH,"Marshall Islands"
+MQ,Martinique
+MR,Mauritania
+MU,Mauritius
+YT,Mayotte
+FX,"Metropolitan France"
+MX,Mexico
+FM,Micronesia
+MI,"Midway Islands"
+MD,Moldova
+MC,Monaco
+MN,Mongolia
+ME,Montenegro
+MS,Montserrat
+MA,Morocco
+MZ,Mozambique
+MM,"Myanmar [Burma]"
+NA,Namibia
+NR,Nauru
+NP,Nepal
+NL,Netherlands
+AN,"Netherlands Antilles"
+NT,"Neutral Zone"
+NC,"New Caledonia"
+NZ,"New Zealand"
+NI,Nicaragua
+NE,Niger
+NG,Nigeria
+NU,Niue
+NF,"Norfolk Island"
+KP,"North Korea"
+VD,"North Vietnam"
+MP,"Northern Mariana Islands"
+NO,Norway
+OM,Oman
+PC,"Pacific Islands Trust Territory"
+PK,Pakistan
+PW,Palau
+PS,"Palestinian Territories"
+PA,Panama
+PZ,"Panama Canal Zone"
+PG,"Papua New Guinea"
+PY,Paraguay
+YD,"People's Democratic Republic of Yemen"
+PE,Peru
+PH,Philippines
+PN,"Pitcairn Islands"
+PL,Poland
+PT,Portugal
+PR,"Puerto Rico"
+QA,Qatar
+RO,Romania
+RU,Russia
+RW,Rwanda
+RE,Réunion
+BL,"Saint Barthélemy"
+SH,"Saint Helena"
+KN,"Saint Kitts and Nevis"
+LC,"Saint Lucia"
+MF,"Saint Martin"
+PM,"Saint Pierre and Miquelon"
+VC,"Saint Vincent and the Grenadines"
+WS,Samoa
+SM,"San Marino"
+SA,"Saudi Arabia"
+SN,Senegal
+RS,Serbia
+CS,"Serbia and Montenegro"
+SC,Seychelles
+SL,"Sierra Leone"
+SG,Singapore
+SK,Slovakia
+SI,Slovenia
+SB,"Solomon Islands"
+SO,Somalia
+ZA,"South Africa"
+GS,"South Georgia and the South Sandwich Islands"
+KR,"South Korea"
+ES,Spain
+LK,"Sri Lanka"
+SD,Sudan
+SR,Suriname
+SJ,"Svalbard and Jan Mayen"
+SZ,Swaziland
+SE,Sweden
+CH,Switzerland
+SY,Syria
+ST,"São Tomé and Príncipe"
+TW,Taiwan
+TJ,Tajikistan
+TZ,Tanzania
+TH,Thailand
+TL,Timor-Leste
+TG,Togo
+TK,Tokelau
+TO,Tonga
+TT,"Trinidad and Tobago"
+TN,Tunisia
+TR,Turkey
+TM,Turkmenistan
+TC,"Turks and Caicos Islands"
+TV,Tuvalu
+UM,"U.S. Minor Outlying Islands"
+PU,"U.S. Miscellaneous Pacific Islands"
+VI,"U.S. Virgin Islands"
+UG,Uganda
+UA,Ukraine
+SU,"Union of Soviet Socialist Republics"
+AE,"United Arab Emirates"
+GB,"United Kingdom"
+US,"United States"
+ZZ,"Unknown or Invalid Region"
+UY,Uruguay
+UZ,Uzbekistan
+VU,Vanuatu
+VA,"Vatican City"
+VE,Venezuela
+VN,Vietnam
+WK,"Wake Island"
+WF,"Wallis and Futuna"
+EH,"Western Sahara"
+YE,Yemen
+ZM,Zambia
+ZW,Zimbabwe
+AX,"Åland Islands"
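
The commit message describes two normalization steps: dropping empty-like values (`-`, `n/a`, etc.) and mapping country abbreviations to full names. The following sketch shows how a file shaped like `countries.dat` (a header row `iso,name`, with quoted names where needed) could drive both steps. It is an illustration under assumptions, not the library's implementation: `load_countries`, `normalize_country`, and the exact contents of `EMPTY_LIKE` are hypothetical.

```python
import csv
import io

# A few rows copied from countries.dat, used here instead of the real file.
SAMPLE = '''iso,name
AG,"Antigua and Barbuda"
NL,Netherlands
US,"United States"
'''

# Assumed set of placeholders; the commit message names only "-" and "n/a".
EMPTY_LIKE = {"", "-", "n/a"}


def load_countries(handle):
    """Build an ISO-code-to-name dict from a CSV with an iso,name header."""
    return {row["iso"]: row["name"] for row in csv.DictReader(handle)}


def normalize_country(value, countries):
    """Return None for empty-like input, a full name for a known code,
    and the stripped input unchanged otherwise."""
    stripped = value.strip()
    if stripped.lower() in EMPTY_LIKE:
        return None
    return countries.get(stripped.upper(), stripped)


countries = load_countries(io.StringIO(SAMPLE))
print(normalize_country("nl", countries))
print(normalize_country("N/A", countries))
```

The `csv` module handles the quoted names, so entries such as `"Antigua and Barbuda"` come back without the surrounding quotes. Unknown values are returned as given rather than discarded, which keeps the lookup safe for fields that already contain a full country name.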
