Commit 2c9fa53

Reworked Language Normalization logic and some fixes
1 parent cc083f4 commit 2c9fa53

4 files changed: +192 -150 lines changed

Pipfile

Lines changed: 1 addition & 1 deletion
@@ -10,6 +10,7 @@ GitPython = "*"
 google-api-python-client = "*"
 h11 = ">=0.16.0" # Ensure dependency is secure
 internetarchive = ">=5.5.1"
+iso639-lang = "*"
 jupyterlab = ">=3.6.7"
 matplotlib = "*"
 numpy = "*"
@@ -19,7 +20,6 @@ pillow = ">=11.3.0" # Ensure dependency is secure
 Pyarrow = "*"
 Pygments = "*"
 python-dotenv = "*"
-python-iso639 = "*"
 requests = ">=2.31.0"
 seaborn = "*"
 urllib3 = ">=2.5.0"

Pipfile.lock

Lines changed: 70 additions & 58 deletions
Generated file; diff not rendered by default.

data/license_url_to_identifier_mapping.csv

Lines changed: 2 additions & 1 deletion
@@ -5,6 +5,7 @@
 "https://creativecommons.org/licenses/by-nc-sa/4.0","CC BY-NC-SA 4.0"
 "https://creativecommons.org/licenses/by-nd/4.0","CC BY-ND 4.0"
 "https://creativecommons.org/licenses/by-sa/4.0","CC BY-SA 4.0"
+"https://creativecommons.org/publicdomain/certification/1.0/us","CC PUBLIC DOMAIN 1.0 US"
 "https://creativecommons.org/publicdomain/mark/1.0","PDM 1.0"
 "https://creativecommons.org/publicdomain/zero/1.0","CC0 1.0"
 "https://creativecommons.org/licenses/by/3.0","CC BY 3.0"
@@ -648,6 +649,6 @@
 "https://creativecommons.org/licenses/nd-nc/1.0/nl","CC ND-NC 1.0 NL"
 "https://creativecommons.org/licenses/sa/1.0/nl","CC SA 1.0 NL"
 "https://creativecommons.org/licenses/nc-sampling+/1.0/tw","CC NC-SAMPLING+ 1.0 TW"
-"https://creativecommons.org/licenses/sampling/1.0/tw","CC SAMPLING 1.0 TW"
+"https://creativecommons.org/licenses/sampling/1.0/tw","CC CERTIFICATION 1.0 US
 "https://creativecommons.org/licenses/sampling+/1.0/tw","CC SAMPLING+ 1.0 TW"
 "https://creativecommons.org/licenses/publicdomain","CC PUBLICDOMAIN"
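
Review note: the added row for `sampling/1.0/tw` in the hunk above has no closing quote on its second field, which a line-level check could catch before commit. The following is a minimal sketch of such a check, not part of this commit. It assumes every row in the mapping file consists of exactly two double-quoted fields with no embedded quotes, as the visible rows do; the names `ROW_PATTERN` and `find_malformed_rows` are hypothetical.

```python
import re

# Assumption: each row is exactly two double-quoted fields separated by a
# comma, with no embedded quotes (this matches the rows visible in the diff).
ROW_PATTERN = re.compile(r'^"([^"]*)","([^"]*)"$')


def find_malformed_rows(lines):
    """Return the 1-based numbers of lines that are not two quoted fields."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if not ROW_PATTERN.match(line.rstrip("\n"))
    ]


# Three rows copied from the second hunk, in their post-commit state.
sample = [
    '"https://creativecommons.org/licenses/nc-sampling+/1.0/tw","CC NC-SAMPLING+ 1.0 TW"',
    '"https://creativecommons.org/licenses/sampling/1.0/tw","CC CERTIFICATION 1.0 US',
    '"https://creativecommons.org/licenses/sampling+/1.0/tw","CC SAMPLING+ 1.0 TW"',
]

print(find_malformed_rows(sample))  # prints [2]
```

The check is deliberately line-based rather than built on a CSV parser, because a parser may treat an unterminated quote as the start of a multi-line field and report the problem on a later row. It only validates quoting; it does not verify that an identifier such as "CC CERTIFICATION 1.0 US" is the right label for a given URL.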
