This repository was archived by the owner on Feb 3, 2024. It is now read-only.

Commit 7ce4f4d

add new test domains for test.sh; add DONE reporting what was done in this update; fix email regex; add skip all text after >>>
1 parent 04413f8 commit 7ce4f4d

29 files changed, with 632 additions and 19 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -70,4 +70,5 @@ reformat-code.sh
 t1.py
 typescript
 test.out
+diff.out
 tmp/

DONE

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+DONE
+2022-11-17:mboot;
+
+- tld ac; add registrant_country
+
+- email regex use \. in: "emails": r"[\w\.-]+@[\w\.-]+\.[\w]{2,4}"
+  this now results in valid data on google.com and meta.com and all derived from .com tld
+
+- add comment in tld_regexpr.py that emails, status, name_servers are multi items (lists)
+  and all the rest are single results.
+
+- add comments to beginning of tld_regexpr.py explaining that all matches are actually case insensitive (findall)
+  and that many whois responses have trailing whitespace and may end in \r\n
+  this helps with constructing regexes for future use
+
+- add skipFromHere in _2_parse.py: lines starting with ^>>> signify the end of a normal whois response
+  after this line there is only human or legal information so we can simply skip that text
+  (a similar construct with ^--\s will be done later)
+
+- add comment that unfortunately we cannot currently use rtrim on input from whois response and many regexes expect
+  either \r or trailing whitespace, this can be done later and would make many regexes simpler in end detection
+
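
The email regex entry above can be illustrated with a small sketch. Only the escaped pattern is quoted from the DONE file; the unescaped "before" variant, the helper name find_emails and the sample addresses are assumptions made for contrast, not taken from the repository. The sample keeps the trailing \r\n that the DONE file says many whois responses contain.

```python
import re

# Pattern quoted in the DONE entry: the dot before the final label is escaped.
EMAIL_ESCAPED = r"[\w\.-]+@[\w\.-]+\.[\w]{2,4}"

# Hypothetical earlier pattern (assumption, for contrast only): an unescaped
# dot matches any character, so the domain part need not contain a real dot.
EMAIL_UNESCAPED = r"[\w\.-]+@[\w\.-]+.[\w]{2,4}"

# Placeholder whois fragment with "\r\n" line endings.
SAMPLE = (
    "Registrar Abuse Contact Email: abuse@example.com\r\n"
    "Tech Email: user@localhost\r\n"
)


def find_emails(pattern: str, text: str) -> list:
    """Return every match, case-insensitively, as the DONE file describes."""
    return re.findall(pattern, text, flags=re.IGNORECASE)


if __name__ == "__main__":
    print(find_emails(EMAIL_ESCAPED, SAMPLE))
    print(find_emails(EMAIL_UNESCAPED, SAMPLE))
```

With this sample the escaped pattern returns only abuse@example.com, while the unescaped variant also accepts user@localhost, which has no dot in its domain part.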

bin/find_input_no_output.sh

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+#! /bin/bash
+
+get_status_output()
+{
+    ls ./tmp/ |
+    while read item
+    do
+        d="tmp/$item"
+        [ -s "$d/input" ] && {
+            [ ! -f "$d/output" ] && {
+                echo "# NO_OUTPUT for $item"
+            }
+        }
+    done
+}
+
+main()
+{
+    get_status_output
+}
+
+main
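
For readers more at home in Python, the check this script performs can be sketched as follows. This is a rough equivalent written for illustration, not part of the commit; the function name find_inputs_without_output is hypothetical, and the directory layout (one sub-directory per domain holding "input" and "output") is inferred from the script.

```python
from pathlib import Path


def find_inputs_without_output(base) -> list:
    """List entries under `base` that have a non-empty input but no output file."""
    base_dir = Path(base)
    if not base_dir.is_dir():
        return []
    missing = []
    for entry in sorted(base_dir.iterdir()):
        input_file = entry / "input"
        output_file = entry / "output"
        # Roughly mirrors `[ -s input ] && [ ! -f output ]` from the shell script.
        has_input = input_file.is_file() and input_file.stat().st_size > 0
        if has_input and not output_file.is_file():
            missing.append(entry.name)
    return missing
```

Calling find_inputs_without_output("tmp") and printing "# NO_OUTPUT for <name>" for each returned name would reproduce the script's report, assuming the same working directory.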

test.sh

Lines changed: 2 additions & 1 deletion
@@ -22,7 +22,8 @@ testOneDomain()
     echo "testing: $domain"
     ./test2.py -d "$domain" >"$TestDataDir/$domain/test.out"
 
-    diff "$TestDataDir/$domain/output" "$TestDataDir/$domain/test.out" | tee "$TestDataDir/$domain/out"
+    diff "$TestDataDir/$domain/output" "$TestDataDir/$domain/test.out" |
+        tee "$TestDataDir/$domain/diff.out"
 }
 
 main()

testdata/example.com/output

Lines changed: 1 addition & 0 deletions
@@ -12,3 +12,4 @@ statuses list ['']
 dnssec bool False
 name_servers list []
 registrant str ''
+emails list ['']

testdata/example.net/output

Lines changed: 1 addition & 0 deletions
@@ -12,3 +12,4 @@ statuses list ['']
 dnssec bool False
 name_servers list []
 registrant str ''
+emails list ['']

testdata/example.org/input

Lines changed: 2 additions & 2 deletions
@@ -58,8 +58,8 @@ Name Server: a.iana-servers.net
 Name Server: b.iana-servers.net
 DNSSEC: signedDelegation
 URL of the ICANN Whois Inaccuracy Complaint Form: https://www.icann.org/wicf/
->>> Last update of WHOIS database: 2022-11-07T21:05:05Z <<<
+>>> Last update of WHOIS database: 2022-11-17T10:47:37Z <<<
 
 For more information on Whois status codes, please visit https://icann.org/epp
 
-Terms of Use: Access to Public Interest Registry WHOIS information is provided to assist persons in determining the contents of a domain name registration record in the Public Interest Registry registry database. The data in this record is provided by Public Interest Registry for informational purposes only, and Public Interest Registry does not guarantee its accuracy. This service is intended only for query-based access. You agree that you will use this data only for lawful purposes and that, under no circumstances will you use this data to (a) allow, enable, or otherwise support the transmission by e-mail, telephone, or facsimile of mass unsolicited, commercial advertising or solicitations to entities other than the data recipient's own existing customers; or (b) enable high volume, automated, electronic processes that send queries or data to the systems of Registry Operator, a Registrar, or Donuts except as reasonably necessary to register domain names or modify existing registrations. All rights reserved. Public Interest Registry reserves the right to modify these terms at any time. By submitting this query, you agree to abide by this policy. The Registrar of Record identified in this output may have an RDDS service that can be queried for additional information on how to contact the Registrant, Admin, or Tech contact of the queried domain name.
+Terms of Use: Access to Public Interest Registry WHOIS information is provided to assist persons in determining the contents of a domain name registration record in the Public Interest Registry registry database. The data in this record is provided by Public Interest Registry for informational purposes only, and Public Interest Registry does not guarantee its accuracy. This service is intended only for query-based access. You agree that you will use this data only for lawful purposes and that, under no circumstances will you use this data to (a) allow, enable, or otherwise support the transmission by e-mail, telephone, or facsimile of mass unsolicited, commercial advertising or solicitations to entities other than the data recipient's own existing customers; or (b) enable high volume, automated, electronic processes that send queries or data to the systems of Registry Operator, a Registrar, or Identity Digital except as reasonably necessary to register domain names or modify existing registrations. All rights reserved. Public Interest Registry reserves the right to modify these terms at any time. By submitting this query, you agree to abide by this policy. The Registrar of Record identified in this output may have an RDDS service that can be queried for additional information on how to contact the Registrant, Admin, or Tech contact of the queried domain name.
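
This input shows why the DONE file describes skipping everything after a line that starts with >>>: the Terms of Use text that follows carries no parseable fields. A minimal sketch of that idea is given below. The function name strip_after_marker is hypothetical and is not the skipFromHere code in _2_parse.py; dropping the marker line itself is an assumption. Line endings are kept intact because the DONE file notes that many regexes still expect \r or trailing whitespace.

```python
def strip_after_marker(whois_text: str, marker: str = ">>>") -> str:
    """Return the response up to the first line that starts with `marker`.

    Assumption: the marker line itself is dropped along with what follows.
    """
    kept = []
    # keepends=True preserves "\r\n" so downstream regexes see the same input.
    for line in whois_text.splitlines(keepends=True):
        if line.startswith(marker):
            break
        kept.append(line)
    return "".join(kept)


SAMPLE = (
    "Name Server: b.iana-servers.net\r\n"
    "DNSSEC: signedDelegation\r\n"
    ">>> Last update of WHOIS database: 2022-11-17T10:47:37Z <<<\r\n"
    "\r\n"
    "Terms of Use: Access to Public Interest Registry WHOIS information ...\r\n"
)

if __name__ == "__main__":
    print(repr(strip_after_marker(SAMPLE)))
```

The same loop could later be extended to the "-- " marker that the DONE file mentions as a follow-up.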

testdata/example.org/output

Lines changed: 1 addition & 0 deletions
@@ -12,3 +12,4 @@ statuses list ['serverDeleteProhibited https://icann.org/
 dnssec bool True
 name_servers list ['a.iana-servers.net', 'b.iana-servers.net']
 registrant str 'ICANN'
+emails list ['']

testdata/google.com/input

Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
+[Querying whois.verisign-grs.com]
+[Redirected to whois.markmonitor.com]
+[Querying whois.markmonitor.com]
+[whois.markmonitor.com]
+Domain Name: google.com
+Registry Domain ID: 2138514_DOMAIN_COM-VRSN
+Registrar WHOIS Server: whois.markmonitor.com
+Registrar URL: http://www.markmonitor.com
+Updated Date: 2019-09-09T15:39:04+0000
+Creation Date: 1997-09-15T07:00:00+0000
+Registrar Registration Expiration Date: 2028-09-13T07:00:00+0000
+Registrar: MarkMonitor, Inc.
+Registrar IANA ID: 292
+Registrar Abuse Contact Email: [email protected]
+Registrar Abuse Contact Phone: +1.2086851750
+Domain Status: clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)
+Domain Status: clientTransferProhibited (https://www.icann.org/epp#clientTransferProhibited)
+Domain Status: clientDeleteProhibited (https://www.icann.org/epp#clientDeleteProhibited)
+Domain Status: serverUpdateProhibited (https://www.icann.org/epp#serverUpdateProhibited)
+Domain Status: serverTransferProhibited (https://www.icann.org/epp#serverTransferProhibited)
+Domain Status: serverDeleteProhibited (https://www.icann.org/epp#serverDeleteProhibited)
+Registrant Organization: Google LLC
+Registrant State/Province: CA
+Registrant Country: US
+Registrant Email: Select Request Email Form at https://domains.markmonitor.com/whois/google.com
+Admin Organization: Google LLC
+Admin State/Province: CA
+Admin Country: US
+Admin Email: Select Request Email Form at https://domains.markmonitor.com/whois/google.com
+Tech Organization: Google LLC
+Tech State/Province: CA
+Tech Country: US
+Tech Email: Select Request Email Form at https://domains.markmonitor.com/whois/google.com
+Name Server: ns2.google.com
+Name Server: ns3.google.com
+Name Server: ns1.google.com
+Name Server: ns4.google.com
+DNSSEC: unsigned
+URL of the ICANN WHOIS Data Problem Reporting System: http://wdprs.internic.net/
+>>> Last update of WHOIS database: 2022-11-17T10:39:21+0000 <<<
+
+For more information on WHOIS status codes, please visit:
+https://www.icann.org/resources/pages/epp-status-codes
+
+If you wish to contact this domain’s Registrant, Administrative, or Technical
+contact, and such email address is not visible above, you may do so via our web
+form, pursuant to ICANN’s Temporary Specification. To verify that you are not a
+robot, please enter your email address to receive a link to a page that
+facilitates email communication with the relevant contact(s).
+
+Web-based WHOIS:
+https://domains.markmonitor.com/whois
+
+If you have a legitimate interest in viewing the non-public WHOIS details, send
+your request and the reasons for your request to [email protected]
+and specify the domain name in the subject line. We will review that request and
+may ask for supporting documentation and explanation.
+
+The data in MarkMonitor’s WHOIS database is provided for information purposes,
+and to assist persons in obtaining information about or related to a domain
+name’s registration record. While MarkMonitor believes the data to be accurate,
+the data is provided "as is" with no guarantee or warranties regarding its
+accuracy.
+
+By submitting a WHOIS query, you agree that you will use this data only for
+lawful purposes and that, under no circumstances will you use this data to:
+(1) allow, enable, or otherwise support the transmission by email, telephone,
+or facsimile of mass, unsolicited, commercial advertising, or spam; or
+(2) enable high volume, automated, or electronic processes that send queries,
+data, or email to MarkMonitor (or its systems) or the domain name contacts (or
+its systems).
+
+MarkMonitor reserves the right to modify these terms at any time.
+
+By submitting this query, you agree to abide by this policy.
+
+MarkMonitor Domain Management(TM)
+Protecting companies and consumers in a digital world.
+
+Visit MarkMonitor at https://www.markmonitor.com
+Contact us at +1.8007459229
+In Europe, at +44.02032062220
+--

testdata/google.com/output

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+
+test domain: <<<<<<<<<< google.com >>>>>>>>>>>>>>>>>>>>
+name str 'google.com'
+tld str 'com'
+registrar str 'MarkMonitor, Inc.'
+registrant_country str 'US'
+creation_date datetime.datetime 1997-09-15 09:00:00
+expiration_date NoneType None
+last_updated datetime.datetime 2019-09-09 17:39:04
+status str 'clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)'
+statuses list ['clientDeleteProhibited (https://www.icann.org/epp#clientDeleteProhibited)', 'clientTransferProhibited (https://www.icann.org/epp#clientTransferProhibited)', 'clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)', 'serverDeleteProhibited (https://www.icann.org/epp#serverDeleteProhibited)', 'serverTransferProhibited (https://www.icann.org/epp#serverTransferProhibited)', 'serverUpdateProhibited (https://www.icann.org/epp#serverUpdateProhibited)']
+dnssec bool False
+name_servers list ['ns1.google.com', 'ns2.google.com', 'ns3.google.com', 'ns4.google.com']
+registrant str 'Google LLC'
+emails list ['[email protected]']
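
The expected output above mixes single values (registrar, registrant) with lists (statuses, name_servers, emails), which is the distinction the DONE file says is now documented in tld_regexpr.py. A rough sketch of how such a split could work is shown below. The patterns, the MULTI_ITEM_KEYS set and the extract function are hypothetical and written only for this illustration, and sorting and de-duplicating the lists is an assumption that merely agrees with the ordering seen in the expected output.

```python
import re

# Hypothetical patterns for this sketch; not copied from tld_regexpr.py.
PATTERNS = {
    "registrar": r"Registrar:[ \t]*(.+)",
    "name_servers": r"Name Server:[ \t]*(.+)",
}

# Keys treated as multi-item (lists); everything else yields a single value.
MULTI_ITEM_KEYS = {"name_servers"}


def extract(whois_text: str) -> dict:
    """Apply each pattern with findall, case-insensitively."""
    result = {}
    for key, pattern in PATTERNS.items():
        # "." also matches "\r", so strip the trailing "\r" and whitespace.
        matches = [
            m.strip()
            for m in re.findall(pattern, whois_text, flags=re.IGNORECASE)
        ]
        if key in MULTI_ITEM_KEYS:
            result[key] = sorted(set(matches))
        else:
            result[key] = matches[0] if matches else None
    return result


SAMPLE = (
    "Registrar: MarkMonitor, Inc.\r\n"
    "Name Server: ns2.google.com\r\n"
    "Name Server: ns1.google.com \r\n"
)

if __name__ == "__main__":
    print(extract(SAMPLE))
```

Stripping each match after the fact is one way to live with trailing whitespace and \r without trimming the raw response, which the DONE file says is not yet possible.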
