Commit 76f6c3c

Add the PSL class documentation
1 parent cdf84fb commit 76f6c3c

1 file changed: +59 -1 lines changed
README.md

Lines changed: 59 additions & 1 deletion
@@ -16,7 +16,7 @@ If the binary wheel is not available for your platform, then you will need a C++

First, you need to import classes:
```python
-from upa_url import URL, URLSearchParams
+from upa_url import PSL, URL, URLSearchParams
```

### URL class
@@ -128,6 +128,64 @@ There are functions to manipulate search parameters:
```python
print(params)  # a=3&b=2&c=1
```

### PSL class

The `PSL` class returns the [public suffix](https://url.spec.whatwg.org/#host-public-suffix) and [registrable domain](https://url.spec.whatwg.org/#host-registrable-domain) of a given host.

First, create a `PSL` object and load the [Public Suffix List](https://publicsuffix.org/). The list can be downloaded from https://publicsuffix.org/list/public_suffix_list.dat. Load the downloaded file using one of the following methods:
1. Use the `load` function and check the result for `None`:
   ```python
   psl = PSL.load('public_suffix_list.dat')
   if psl is not None:
       print(psl.public_suffix('upa-url.github.io'))  # github.io
   ```
2. Use the `PSL` constructor and handle the exception raised if loading fails:
   ```python
   try:
       psl = PSL('public_suffix_list.dat')
       # Use psl
   except Exception:
       print('PSL loading error')
   ```
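
Both methods above expect the list file to already be on disk. A minimal sketch of fetching it with the Python standard library follows; `download_psl` is a hypothetical helper written for illustration and is not part of `upa_url`.

```python
import shutil
import urllib.request


def download_psl(url, dest_path):
    """Stream the resource at `url` into `dest_path` and return the path.

    Hypothetical helper (not part of upa_url); it only copies bytes and
    does not validate that the content is a Public Suffix List.
    """
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        # Copy in chunks so the whole list is never held in memory at once.
        shutil.copyfileobj(response, out)
    return dest_path
```

For example, `download_psl('https://publicsuffix.org/list/public_suffix_list.dat', 'public_suffix_list.dat')` would save the list under the file name used in the loading examples above.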

The Public Suffix List can also be loaded from memory using the push interface:
1. Line by line:
   ```python
   psl = PSL()
   with open('public_suffix_list.dat', 'r', encoding='utf-8') as f:
       for line in f:
           psl.push_line(line.rstrip())
   if psl.finalize():
       pass  # Use psl
   ```
2. Using a memory buffer, for example, to load a list from the web:
   ```python
   import urllib.request

   url = 'https://upa-url.github.io/demo/public_suffix_list.dat'
   psl = PSL()
   with urllib.request.urlopen(url) as response:
       # Feed the list to the PSL object in 4 KiB chunks
       while chunk := response.read(4096):
           psl.push(chunk)
   if psl.finalize():
       pass  # Use psl
   ```

The following examples show how to get a [public suffix](https://url.spec.whatwg.org/#host-public-suffix) and a [registrable domain](https://url.spec.whatwg.org/#host-registrable-domain):
```python
# Get from the host string
print(psl.public_suffix('abc.ålgård.no'))  # xn--lgrd-poac.no
print(psl.registrable_domain('abc.ålgård.no'))  # abc.xn--lgrd-poac.no

# Get from the host string and do not convert the output to ASCII
print(psl.public_suffix('abc.ålgård.no', ascii=False))  # ålgård.no
print(psl.registrable_domain('abc.ålgård.no', ascii=False))  # abc.ålgård.no

# Get from the URL
url = URL('https://upa-url.github.io/docs/')
print(psl.public_suffix(url))  # github.io
print(psl.registrable_domain(url))  # upa-url.github.io
```
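
To make the two terms concrete, here is a simplified pure-Python sketch of how Public Suffix List rules select a suffix. It is an illustration only and makes no claim about how `upa_url` implements `PSL`: `TOY_RULES` and `split_host` are hypothetical names, the rule set is a tiny hand-picked subset of the real list, and the standard library `idna` codec stands in for proper host parsing.

```python
# Toy rule set in Public Suffix List syntax (assumption: a hand-picked
# subset for illustration; the real list is far larger).
TOY_RULES = ["com", "io", "github.io", "no", "xn--lgrd-poac.no", "*.ck", "!www.ck"]


def split_host(host, rules=TOY_RULES):
    """Return (public_suffix, registrable_domain) for a host name.

    registrable_domain is None when the host itself is a public suffix.
    """
    # Convert internationalized labels to their ASCII (Punycode) form.
    labels = host.encode("idna").decode("ascii").lower().split(".")

    suffix_len = 1  # implicit default rule "*": the last label is a suffix
    for rule in rules:
        is_exception = rule.startswith("!")
        rule_labels = rule.lstrip("!").split(".")
        if len(rule_labels) > len(labels):
            continue
        tail = labels[-len(rule_labels):]
        # A rule label matches when it equals the host label or is "*".
        if all(r in ("*", h) for r, h in zip(rule_labels, tail)):
            if is_exception:
                # An exception rule wins outright; the suffix is the rule
                # with its leftmost label removed.
                suffix_len = len(rule_labels) - 1
                break
            # Otherwise the matching rule with the most labels wins.
            suffix_len = max(suffix_len, len(rule_labels))

    public_suffix = ".".join(labels[-suffix_len:])
    if len(labels) > suffix_len:
        # Registrable domain: the public suffix plus one more label.
        registrable = ".".join(labels[-(suffix_len + 1):])
    else:
        registrable = None
    return public_suffix, registrable
```

With this toy rule set, `split_host('upa-url.github.io')` returns `('github.io', 'upa-url.github.io')`, and `split_host('github.io')` returns `('github.io', None)` because the host is itself a public suffix.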

## License

This package is licensed under the [BSD 2-Clause License](https://opensource.org/license/bsd-2-clause/) (see `LICENSE` file).
