Commit 6dfa082
fix: Workflow Bot -- Build Dictionaries (#5129)
Co-authored-by: street-side-software-automation[bot] <74785433+street-side-software-automation[bot]@users.noreply.github.com>
1 parent 78890f4 commit 6dfa082

22 files changed: 727,913 additions, 1 deletion

dictionaries/aoo-mozilla-en-dict/dicts/.sync-github-files.json

Lines changed: 22 additions & 1 deletion
@@ -23,5 +23,26 @@
  "en_ZA (Dwayne Bailey) (2012-07-10)/en_ZA.dic": "422c6dfde437276e0590f616f15ab08f6e4b65bb",
  "en_ZA (Marco Pinto) (-ise) (2025+)/README_en_ZA.txt": "5a79307244f11b35fba9c06edc4dafb0eec35a2d",
  "en_ZA (Marco Pinto) (-ise) (2025+)/en_ZA.aff": "b840ce2b48235a6c98571342489e3d8a1fe7c20b",
- "en_ZA (Marco Pinto) (-ise) (2025+)/en_ZA.dic": "b7c6d75172f42af8ee999ff3cea6c926d7955e37"
+ "en_ZA (Marco Pinto) (-ise) (2025+)/en_ZA.dic": "b7c6d75172f42af8ee999ff3cea6c926d7955e37",
+ "en_AU (Marco Pinto) (-ise) (alt)/README_en_GB_ZA_US_CA_AU.txt": "2be41f0887acd3979cd4f7e130bac46e168718f0",
+ "en_AU (Marco Pinto) (-ise) (alt)/en_AU.aff": "3af8a1c5f63e122156fc35a5c8ce0c387abd73c3",
+ "en_AU (Marco Pinto) (-ise) (alt)/en_AU.dic": "30537b4990e848028925dc1970b40a67c98f7641",
+ "en_CA (Marco Pinto) (-ise) (alt)/README_en_GB_ZA_US_CA_AU.txt": "2be41f0887acd3979cd4f7e130bac46e168718f0",
+ "en_CA (Marco Pinto) (-ise) (alt)/en_CA.aff": "f53abd871d19f1374047c3659a39f572d8b7b716",
+ "en_CA (Marco Pinto) (-ise) (alt)/en_CA.dic": "e81fe9f58502ce6e6b653c23d1773c313afac171",
+ "en_GB (Marco Pinto) (-ise -ize)/README_en_GB_ZA_US_CA_AU.txt": "2be41f0887acd3979cd4f7e130bac46e168718f0",
+ "en_GB (Marco Pinto) (-ise -ize)/en_GB.aff": "fcd99e7e43944bb93521335c747a6ffd0b00a76b",
+ "en_GB (Marco Pinto) (-ise -ize)/en_GB.dic": "6903bdbb6f2a3aab7963e54ad807ba6ebc3aa923",
+ "en_GB (Marco Pinto) (-ise)/README_en_GB_ZA_US_CA_AU.txt": "2be41f0887acd3979cd4f7e130bac46e168718f0",
+ "en_GB (Marco Pinto) (-ise)/en_GB.aff": "a3decb2fb8e92c90edbc95599d66a3e197a6f54b",
+ "en_GB (Marco Pinto) (-ise)/en_GB.dic": "43535a534ad378d9138a0e56c7e35e70534bf1c8",
+ "en_GB (Marco Pinto) (-ize) (Oxford)/README_en_GB_ZA_US_CA_AU.txt": "2be41f0887acd3979cd4f7e130bac46e168718f0",
+ "en_GB (Marco Pinto) (-ize) (Oxford)/en_GB-oxendict.aff": "67fe07ccc75989f7a6e5b743a964e1d94b6046fc",
+ "en_GB (Marco Pinto) (-ize) (Oxford)/en_GB-oxendict.dic": "58ee4d26d8e38849d12e245314eeb589925404e2",
+ "en_US (Marco Pinto) (-ize) (alt)/README_en_GB_ZA_US_CA_AU.txt": "2be41f0887acd3979cd4f7e130bac46e168718f0",
+ "en_US (Marco Pinto) (-ize) (alt)/en_US.aff": "4d1549fcdfd6761fc7b6ccccb61e1624d6f919a9",
+ "en_US (Marco Pinto) (-ize) (alt)/en_US.dic": "dda806daa4751875b85f5dcaf393239bc53c711a",
+ "en_ZA (Marco Pinto) (-ise)/README_en_GB_ZA_US_CA_AU.txt": "2be41f0887acd3979cd4f7e130bac46e168718f0",
+ "en_ZA (Marco Pinto) (-ise)/en_ZA.aff": "58ad5b82e54abdacaa9fb9069923e5005a98fcf2",
+ "en_ZA (Marco Pinto) (-ise)/en_ZA.dic": "f38562ee620c07b1f1cf405be93d8fbe369afd66"
  }
Lines changed: 362 additions & 0 deletions
@@ -0,0 +1,362 @@
README – English Hunspell Dictionaries
Maintained by Marco A.G.Pinto


FREE SOFTWARE
=============
These spellcheckers are free software: free to use, share, and modify.

They are intended for a broad audience, including students, professionals,
writers, and non-native speakers.

Feedback is welcome to help keep these dictionaries accurate and reliable.


DICTIONARY MAINTENANCE
======================
Marco maintains five main English variants:
• en_GB (British, “ise”), since 25.Aug.2013
• en_ZA (South African, “ise”), since 01.Jan.2025
• en_US (American, “ize”), since 01.Jan.2026 (alternative version)
• en_CA (Canadian, “ise”), since 01.Jan.2026 (alternative version)
• en_AU (Australian, “ise”), since 01.Jan.2026 (alternative version)

Notes:
• Both “-ise” and “-ize” forms are provided; some variants may still have
  missing or misplaced forms.
• en_ZA and en_GB are maintained in parallel, with the GB dictionary having
  been merged into the ZA dictionary while retaining regional terms.
• en_US, en_CA, and en_AU are converted from en_GB using Proofing Tool GUI,
  then refined for regional vocabulary.

en_ZA uses the same base list as en_GB, originally from the Aspell English
wordlists (LGPL).

Marco began maintaining en_GB and en_ZA after previous maintainers became
unavailable, and later began work on alternative en_US, en_CA, and en_AU
dictionaries alongside existing upstream versions.

The en_GB dictionary began as a subset of Kevin Atkinson's wordlist for
Pspell/Aspell (LGPL) and has been expanded by David Bartlett, Brian Kelk,
Andrew Brown, and Marco A.G.Pinto, including:
• Removal of Americanisms in en_GB
• Addition of missing words
• Correction of errors
• Addition of compound hyphenated forms
• Thousands of proper/place names
• Thousands of possessives and plurals
• Removal of duplicates
• Support for ordinals and affixes (e.g., 1st, 111th, 1990s)
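
The ordinal examples in the last item (1st, 111th) follow the usual English suffix rule, in which 11, 12, and 13 are exceptions at every hundred. The README does not say how the affix rules encode this, so the snippet below is only a sketch of the rule itself in Python, not of the dictionary's implementation; the function name `ordinal` is hypothetical.

```python
def ordinal(n: int) -> str:
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if n < 0:
        raise ValueError("ordinals are only defined here for n >= 0")
    # 11, 12 and 13 take 'th' regardless of the final digit (111th, 212th).
    if 11 <= n % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"
```

A spellchecker that accepts "111th" but rejects "111st" has to capture the same exception in some form.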


CONTRIBUTORS
============
Many people have provided valuable input over the years, too numerous to list
individually. Special thanks to:

• Cyberknight – Submitted numerous scientific terms in the early days and
  helped create a legacy British Dictionary for Mozilla when modern
  WebExtensions were not yet widely supported.
  Although no longer in contact, his contributions remain a valued part
  of the project's history.

• Babelfish (Peter C.) – Regularly contributed words and offered practical
  suggestions that significantly refined and updated the dictionary.

• Peter C. (not Babelfish) – Advocated consistently for an “-ise” British
  Dictionary, resulting in the inclusion of many such entries and ensuring
  alignment with modern linguistic standards.

These wordlists aim to represent modern English for the Commonwealth and
North America.

The .AFF file was originally created by David Bartlett and Andrew Brown based
on MySpell rules (LGPL), focusing on accurate morphology, not file compression.
Marco has refined the rules since 2013.


MARCO A.G.PINTO
===============
Marco forked the British dictionary in 2013 after years without updates,
choosing Mozilla's unobfuscated version.

Spelling is verified using:
• Oxford Dictionaries
• Collins Dictionary
• Cambridge Dictionary
• Merriam-Webster Dictionary
• Wiktionary (used with caution ⚠)
• Wikipedia (used with caution ⚠)
• Physical dictionaries

Main challenges:
• Proper names
• Possessives
• Plurals


HUNSPELL ENGINE
===============
Hunspell, an open-source tool used for spellchecking and morphological
analysis, was developed by Németh László.

It handles complex word forms, affixes, and spelling rules efficiently.

Hunspell has become the standard engine for many major software projects, and
Németh László's work continues to support and enhance modern language tools.


TABOO/OFFENSIVE WORDS
=====================
A NOSUGGEST flag is applied to certain taboo or offensive words to prevent them
from being suggested in spellchecking results.

While every effort has been made to mark the most offensive terms, it is not
guaranteed that all possible taboo words are flagged.

Some words have intentionally been left unflagged because, although considered
taboo in some dictionaries, they are not generally regarded as swear words in
modern usage.
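
In Hunspell, the `.aff` file names the flag with a `NOSUGGEST <flag>` line, and `.dic` entries carry it after a slash (`word/FLAGS`). The sketch below lists the entries that carry that flag, which is one way to review what has been marked. It assumes single-character flags (Hunspell's default when no `FLAG` directive is set) and ignores escaped slashes inside words. The function names and the sample data in the usage note are hypothetical, not taken from these dictionaries.

```python
from typing import List, Optional


def read_nosuggest_flag(aff_text: str) -> Optional[str]:
    """Return the flag declared by a 'NOSUGGEST <flag>' line, if any."""
    for line in aff_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "NOSUGGEST":
            return parts[1]
    return None


def nosuggest_words(aff_text: str, dic_text: str) -> List[str]:
    """List .dic words that carry the NOSUGGEST flag.

    Assumption: single-character flags; 'FLAG long' or 'FLAG num'
    dictionaries would need a different flag splitter.
    """
    flag = read_nosuggest_flag(aff_text)
    if flag is None:
        return []
    words = []
    # The first line of a .dic file is the approximate entry count.
    for line in dic_text.splitlines()[1:]:
        word, sep, rest = line.partition("/")
        if not sep:
            continue  # entry has no flags at all
        fields = rest.split()
        flags = fields[0] if fields else ""  # drop morphological fields
        if flag in flags:
            words.append(word.strip())
    return words
```

For example, with `aff = "SET UTF-8\nNOSUGGEST !\n"` and `dic = "3\nhello\nworld/S\ndarn/!S\n"`, `nosuggest_words(aff, dic)` returns `["darn"]`.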


CONTACT
=======
Marco A.G.Pinto
• E-mail: [email protected]
• Website: https://proofingtoolgui.org
• FAQ: https://proofingtoolgui.org/faq.html
• Changelog: https://proofingtoolgui.org/en_GB_CHANGES.txt
• GitHub: https://github.com/marcoagpinto/aoo-mozilla-en-dict


CHANGELOG (2025+)
=================
2026-01-01 (Marco A.G.Pinto)
- Marco began maintaining alternative versions of the US, CA, and AU dictionaries.
- Merged the GB dictionary into the ZA dictionary.
- Aligned all five dictionaries to the GB versioning format for consistency.
- Removed numerous Americanisms from the GB dictionary.
- The hyph US and GB files now use UTF-8-BOM Unix (LF).
- The thesaurus now uses UTF-8-BOM Unix (LF).
- Improved images for extensions.

2025-03-02 to 2025-12-31 (Marco A.G.Pinto)
- Better -ise/-ize handling.
- Fixed, improved, and added flags for U.S. compatibility on 1-JAN-2026.

2025-03-01 (Marco A.G.Pinto)
- New extension icons for Mozilla (GB) in 96×96 and for LibreOffice/OpenOffice
  in 128×128 pixels.
- The default GB and ZA spelling is now -ise.
- Unified the GB and ZA .AFF files.
- Merged hundreds of thousands of proper names from GB to ZA.
- Fixed/improved: flags “O” and “W”.

2025-01-01 (Marco A.G.Pinto)
- Official fork of ZA Dictionary.
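
The 2026-01-01 entry states that the hyphenation and thesaurus files now use UTF-8 with a byte-order mark and Unix (LF) line endings. A small check along the following lines could confirm that for any given file. It is a sketch only: the function name `is_utf8_bom_lf` is hypothetical, and treating any carriage return as a violation is an assumption about what "Unix (LF)" should exclude.

```python
import codecs


def is_utf8_bom_lf(data: bytes) -> bool:
    """True if data starts with a UTF-8 BOM, decodes as UTF-8, and has no CR."""
    if not data.startswith(codecs.BOM_UTF8):
        return False
    body = data[len(codecs.BOM_UTF8):]
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # Assumption: both CRLF and bare CR line endings count as non-Unix.
    return b"\r" not in body
```

Reading a file with `Path(name).read_bytes()` and passing the result to `is_utf8_bom_lf` gives a yes/no answer per file.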


LGPL3 LICENSE
=============

GNU LESSER GENERAL PUBLIC LICENSE
Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.


This version of the GNU Lesser General Public License incorporates
the terms and conditions of version 3 of the GNU General Public
License, supplemented by the additional permissions listed below.

0. Additional Definitions.

As used herein, "this License" refers to version 3 of the GNU Lesser
General Public License, and the "GNU GPL" refers to version 3 of the GNU
General Public License.

"The Library" refers to a covered work governed by this License,
other than an Application or a Combined Work as defined below.

An "Application" is any work that makes use of an interface provided
by the Library, but which is not otherwise based on the Library.
Defining a subclass of a class defined by the Library is deemed a mode
of using an interface provided by the Library.

A "Combined Work" is a work produced by combining or linking an
Application with the Library. The particular version of the Library
with which the Combined Work was made is also called the "Linked
Version".

The "Minimal Corresponding Source" for a Combined Work means the
Corresponding Source for the Combined Work, excluding any source code
for portions of the Combined Work that, considered in isolation, are
based on the Application, and not on the Linked Version.

The "Corresponding Application Code" for a Combined Work means the
object code and/or source code for the Application, including any data
and utility programs needed for reproducing the Combined Work from the
Application, but excluding the System Libraries of the Combined Work.

1. Exception to Section 3 of the GNU GPL.

You may convey a covered work under sections 3 and 4 of this License
without being bound by section 3 of the GNU GPL.

2. Conveying Modified Versions.

If you modify a copy of the Library, and, in your modifications, a
facility refers to a function or data to be supplied by an Application
that uses the facility (other than as an argument passed when the
facility is invoked), then you may convey a copy of the modified
version:

   a) under this License, provided that you make a good faith effort to
   ensure that, in the event an Application does not supply the
   function or data, the facility still operates, and performs
   whatever part of its purpose remains meaningful, or

   b) under the GNU GPL, with none of the additional permissions of
   this License applicable to that copy.

3. Object Code Incorporating Material from Library Header Files.

The object code form of an Application may incorporate material from
a header file that is part of the Library. You may convey such object
code under terms of your choice, provided that, if the incorporated
material is not limited to numerical parameters, data structure
layouts and accessors, or small macros, inline functions and templates
(ten or fewer lines in length), you do both of the following:

   a) Give prominent notice with each copy of the object code that the
   Library is used in it and that the Library and its use are
   covered by this License.

   b) Accompany the object code with a copy of the GNU GPL and this license
   document.

4. Combined Works.

You may convey a Combined Work under terms of your choice that,
taken together, effectively do not restrict modification of the
portions of the Library contained in the Combined Work and reverse
engineering for debugging such modifications, if you also do each of
the following:

   a) Give prominent notice with each copy of the Combined Work that
   the Library is used in it and that the Library and its use are
   covered by this License.

   b) Accompany the Combined Work with a copy of the GNU GPL and this license
   document.

   c) For a Combined Work that displays copyright notices during
   execution, include the copyright notice for the Library among
   these notices, as well as a reference directing the user to the
   copies of the GNU GPL and this license document.

   d) Do one of the following:

       0) Convey the Minimal Corresponding Source under the terms of this
       License, and the Corresponding Application Code in a form
       suitable for, and under terms that permit, the user to
       recombine or relink the Application with a modified version of
       the Linked Version to produce a modified Combined Work, in the
       manner specified by section 6 of the GNU GPL for conveying
       Corresponding Source.

       1) Use a suitable shared library mechanism for linking with the
       Library. A suitable mechanism is one that (a) uses at run time
       a copy of the Library already present on the user's computer
       system, and (b) will operate properly with a modified version
       of the Library that is interface-compatible with the Linked
       Version.

   e) Provide Installation Information, but only if you would otherwise
   be required to provide such information under section 6 of the
   GNU GPL, and only to the extent that such information is
   necessary to install and execute a modified version of the
   Combined Work produced by recombining or relinking the
   Application with a modified version of the Linked Version. (If
   you use option 4d0, the Installation Information must accompany
   the Minimal Corresponding Source and Corresponding Application
   Code. If you use option 4d1, you must provide the Installation
   Information in the manner specified by section 6 of the GNU GPL
   for conveying Corresponding Source.)

5. Combined Libraries.

You may place library facilities that are a work based on the
Library side by side in a single library together with other library
facilities that are not Applications and are not covered by this
License, and convey such a combined library under terms of your
choice, if you do both of the following:

   a) Accompany the combined library with a copy of the same work based
   on the Library, uncombined with any other library facilities,
   conveyed under the terms of this License.

   b) Give prominent notice with the combined library that part of it
   is a work based on the Library, and explaining where to find the
   accompanying uncombined form of the same work.

6. Revised Versions of the GNU Lesser General Public License.

The Free Software Foundation may publish revised and/or new versions
of the GNU Lesser General Public License from time to time. Such new
versions will be similar in spirit to the present version, but may
differ in detail to address new problems or concerns.

Each version is given a distinguishing version number. If the
Library as you received it specifies that a certain numbered version
of the GNU Lesser General Public License "or any later version"
applies to it, you have the option of following the terms and
conditions either of that published version or of any later version
published by the Free Software Foundation. If the Library as you
received it does not specify a version number of the GNU Lesser
General Public License, you may choose any version of the GNU Lesser
General Public License ever published by the Free Software Foundation.

If the Library as you received it specifies that a proxy can decide
whether future versions of the GNU Lesser General Public License shall
apply, that proxy's public statement of acceptance of any version is
permanent authorization for you to choose that version for the
Library.


WORDNET LICENSE (THESAURUS FOR OPENOFFICE/LIBREOFFICE)
======================================================
WordNet Release 2.1

This software and database is being provided to you, the LICENSEE, by
Princeton University under the following license. By obtaining, using
and/or copying this software and database, you agree that you have
read, understood, and will comply with these terms and conditions.:

Permission to use, copy, modify and distribute this software and
database and its documentation for any purpose and without fee or
royalty is hereby granted, provided that you agree to comply with
the following copyright notice and statements, including the disclaimer,
and that the same appear on ALL copies of the software, database and
documentation, including modifications that you make for internal
use or for distribution.

WordNet 2.1 Copyright 2005 by Princeton University. All rights reserved.

THIS SOFTWARE AND DATABASE IS PROVIDED "AS IS" AND PRINCETON
UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PRINCETON
UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES OF MERCHANT-
ABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE
OF THE LICENSED SOFTWARE, DATABASE OR DOCUMENTATION WILL NOT
INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR
OTHER RIGHTS.

The name of Princeton University or Princeton may not be used in
advertising or publicity pertaining to distribution of the software
and/or database. Title to copyright in this software, database and
any associated documentation shall at all times remain with
Princeton University and LICENSEE agrees to preserve same.