Commit cd1e33f

Prepare release 3.1 (follow-up) (#90)
* Fix Typos
* Improve docs
* Remove allBlockRanges'
* prototypeIfConfusable → confusablePrototype
* Hold back prototype
* Fix Readme and changelogs
* Rework Unfold
* Add tests for idempotency
* Add experimental `unicode-data-text` package for case operation on `Text`.

Co-authored-by: Harendra Kumar <[email protected]>
1 parent 747df48 commit cd1e33f

30 files changed: +877 -131 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ This repository provides packages to use the
 - [`unicode-data`](#unicode-data) for general character properties.
 - [`unicode-data-names`](#unicode-data-names) for characters names and aliases.
 - [`unicode-data-scripts`](#unicode-data-scripts) for characters scripts.
+- [`unicode-data-security`](#unicode-data-security) for security mechanisms.
 
 The Haskell data structures are generated programmatically from the UCD files.
 The latest Unicode version supported by these libraries is

cabal.project

Lines changed: 3 additions & 1 deletion
@@ -1 +1,3 @@
-packages: */*.cabal
+packages:
+  */*.cabal,
+  experimental/*/*.cabal
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+# Changelog
+
+## 0.1.0
+
+- Initial release
Lines changed: 249 additions & 0 deletions
@@ -0,0 +1,249 @@
+
+Apache License
+Version 2.0, January 2004
+http://www.apache.org/licenses/
+
+TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+1. Definitions.
+
+"License" shall mean the terms and conditions for use, reproduction,
+and distribution as defined by Sections 1 through 9 of this document.
+
+"Licensor" shall mean the copyright owner or entity authorized by
+the copyright owner that is granting the License.
+
+"Legal Entity" shall mean the union of the acting entity and all
+other entities that control, are controlled by, or are under common
+control with that entity. For the purposes of this definition,
+"control" means (i) the power, direct or indirect, to cause the
+direction or management of such entity, whether by contract or
+otherwise, or (ii) ownership of fifty percent (50%) or more of the
+outstanding shares, or (iii) beneficial ownership of such entity.
+
+"You" (or "Your") shall mean an individual or Legal Entity
+exercising permissions granted by this License.
+
+"Source" form shall mean the preferred form for making modifications,
+including but not limited to software source code, documentation
+source, and configuration files.
+
+"Object" form shall mean any form resulting from mechanical
+transformation or translation of a Source form, including but
+not limited to compiled object code, generated documentation,
+and conversions to other media types.
+
+"Work" shall mean the work of authorship, whether in Source or
+Object form, made available under the License, as indicated by a
+copyright notice that is included in or attached to the work.
+
+"Derivative Works" shall mean any work, whether in Source or Object
+form, that is based on (or derived from) the Work and for which the
+editorial revisions, annotations, elaborations, or other modifications
+represent, as a whole, an original work of authorship. For the purposes
+of this License, Derivative Works shall not include works that remain
+separable from, or merely link (or bind by name) to the interfaces of,
+the Work and Derivative Works thereof.
+
+"Contribution" shall mean any work of authorship, including
+the original version of the Work and any modifications or additions
+to that Work or Derivative Works thereof, that is intentionally
+submitted to Licensor for inclusion in the Work by the copyright owner
+or by an individual or Legal Entity authorized to submit on behalf of
+the copyright owner. For the purposes of this definition, "submitted"
+means any form of electronic, verbal, or written communication sent
+to the Licensor or its representatives, including but not limited to
+communication on electronic mailing lists, source code control systems,
+and issue tracking systems that are managed by, or on behalf of, the
+Licensor for the purpose of discussing and improving the Work, but
+excluding communication that is conspicuously marked or otherwise
+designated in writing by the copyright owner as "Not a Contribution."
+
+"Contributor" shall mean Licensor and any individual or Legal Entity
+on behalf of whom a Contribution has been received by Licensor and
+subsequently incorporated within the Work.
+
+2. Grant of Copyright License. Subject to the terms and conditions of
+this License, each Contributor hereby grants to You a perpetual,
+worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+copyright license to reproduce, prepare Derivative Works of,
+publicly display, publicly perform, sublicense, and distribute the
+Work and such Derivative Works in Source or Object form.
+
+3. Grant of Patent License. Subject to the terms and conditions of
+this License, each Contributor hereby grants to You a perpetual,
+worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+(except as stated in this section) patent license to make, have made,
+use, offer to sell, sell, import, and otherwise transfer the Work,
+where such license applies only to those patent claims licensable
+by such Contributor that are necessarily infringed by their
+Contribution(s) alone or by combination of their Contribution(s)
+with the Work to which such Contribution(s) was submitted. If You
+institute patent litigation against any entity (including a
+cross-claim or counterclaim in a lawsuit) alleging that the Work
+or a Contribution incorporated within the Work constitutes direct
+or contributory patent infringement, then any patent licenses
+granted to You under this License for that Work shall terminate
+as of the date such litigation is filed.
+
+4. Redistribution. You may reproduce and distribute copies of the
+Work or Derivative Works thereof in any medium, with or without
+modifications, and in Source or Object form, provided that You
+meet the following conditions:
+
+(a) You must give any other recipients of the Work or
+Derivative Works a copy of this License; and
+
+(b) You must cause any modified files to carry prominent notices
+stating that You changed the files; and
+
+(c) You must retain, in the Source form of any Derivative Works
+that You distribute, all copyright, patent, trademark, and
+attribution notices from the Source form of the Work,
+excluding those notices that do not pertain to any part of
+the Derivative Works; and
+
+(d) If the Work includes a "NOTICE" text file as part of its
+distribution, then any Derivative Works that You distribute must
+include a readable copy of the attribution notices contained
+within such NOTICE file, excluding those notices that do not
+pertain to any part of the Derivative Works, in at least one
+of the following places: within a NOTICE text file distributed
+as part of the Derivative Works; within the Source form or
+documentation, if provided along with the Derivative Works; or,
+within a display generated by the Derivative Works, if and
+wherever such third-party notices normally appear. The contents
+of the NOTICE file are for informational purposes only and
+do not modify the License. You may add Your own attribution
+notices within Derivative Works that You distribute, alongside
+or as an addendum to the NOTICE text from the Work, provided
+that such additional attribution notices cannot be construed
+as modifying the License.
+
+You may add Your own copyright statement to Your modifications and
+may provide additional or different license terms and conditions
+for use, reproduction, or distribution of Your modifications, or
+for any such Derivative Works as a whole, provided Your use,
+reproduction, and distribution of the Work otherwise complies with
+the conditions stated in this License.
+
+5. Submission of Contributions. Unless You explicitly state otherwise,
+any Contribution intentionally submitted for inclusion in the Work
+by You to the Licensor shall be under the terms and conditions of
+this License, without any additional terms or conditions.
+Notwithstanding the above, nothing herein shall supersede or modify
+the terms of any separate license agreement you may have executed
+with Licensor regarding such Contributions.
+
+6. Trademarks. This License does not grant permission to use the trade
+names, trademarks, service marks, or product names of the Licensor,
+except as required for reasonable and customary use in describing the
+origin of the Work and reproducing the content of the NOTICE file.
+
+7. Disclaimer of Warranty. Unless required by applicable law or
+agreed to in writing, Licensor provides the Work (and each
+Contributor provides its Contributions) on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+implied, including, without limitation, any warranties or conditions
+of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+PARTICULAR PURPOSE. You are solely responsible for determining the
+appropriateness of using or redistributing the Work and assume any
+risks associated with Your exercise of permissions under this License.
+
+8. Limitation of Liability. In no event and under no legal theory,
+whether in tort (including negligence), contract, or otherwise,
+unless required by applicable law (such as deliberate and grossly
+negligent acts) or agreed to in writing, shall any Contributor be
+liable to You for damages, including any direct, indirect, special,
+incidental, or consequential damages of any character arising as a
+result of this License or out of the use or inability to use the
+Work (including but not limited to damages for loss of goodwill,
+work stoppage, computer failure or malfunction, or any and all
+other commercial damages or losses), even if such Contributor
+has been advised of the possibility of such damages.
+
+9. Accepting Warranty or Additional Liability. While redistributing
+the Work or Derivative Works thereof, You may choose to offer,
+and charge a fee for, acceptance of support, warranty, indemnity,
+or other liability obligations and/or rights consistent with this
+License. However, in accepting such obligations, You may act only
+on Your own behalf and on Your sole responsibility, not on behalf
+of any other Contributor, and only if You agree to indemnify,
+defend, and hold each Contributor harmless for any liability
+incurred by, or claims asserted against, such Contributor by reason
+of your accepting any such warranty or additional liability.
+
+END OF TERMS AND CONDITIONS
+
+-------------------------------------------------------------------------------
+This distribution includes portions of code from the "unicode-transforms"
+package (https://github.com/composewell/unicode-transforms/) which is
+available under BSD-3-Clause license as described below.
+-------------------------------------------------------------------------------
+
+Copyright (c) 2016, Harendra Kumar
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+1. Redistributions of source code must retain the above copyright notice, this
+list of conditions and the following disclaimer.
+
+2. Redistributions in binary form must reproduce the above copyright notice,
+this list of conditions and the following disclaimer in the documentation
+and/or other materials provided with the distribution.
+
+3. Neither the name of the copyright holder nor the names of its contributors
+may be used to endorse or promote products derived from this software without
+specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+-------------------------------------------------------------------------------
+This distribution includes portions of code from the "unicode-transforms"
+package (https://github.com/composewell/unicode-transforms/)
+which included portions of code from the "prose"
+(https://github.com/llelf/prose) package available under BSD-3-Clause
+license as described below.
+-------------------------------------------------------------------------------
+
+Copyright (c) 2014–2015, Antonio Nikishaev
+
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above
+copyright notice, this list of conditions and the following
+disclaimer in the documentation and/or other materials provided
+with the distribution.
+
+* Neither the name of Antonio Nikishaev nor the names of other
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# README
+
+`unicode-data-text` provides Unicode features from
+[`unicode-data`](https://hackage.haskell.org/package/unicode-data) package
+for the [`text`](https://hackage.haskell.org/package/text) package.
+
+__This package is for internal use and is not meant to be published.__
+
+## Licensing
+
+`unicode-data-text` is an [open source](https://github.com/composewell/unicode-data)
+project available under a liberal [Apache-2.0 license](LICENSE).
+
+## Contributing
+
+As an open project we welcome contributions.
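
The commit message mentions tests for idempotency of the case operations, but those tests are not part of this excerpt. As a rough sketch of what such a property looks like, the block below uses only `base`, with `Data.Char` standing in for the package's `Text` functions. The names `isIdempotentOn` and `samples` are hypothetical and introduced here for illustration only.

```haskell
import Data.Char (toLower, toUpper)

-- Hypothetical helper (not from the commit): True when applying f twice
-- gives the same result as applying it once, for every sample input.
isIdempotentOn :: Eq a => (a -> a) -> [a] -> Bool
isIdempotentOn f = all (\x -> f (f x) == f x)

-- Illustrative ASCII-only samples; a real test would cover far more of Unicode.
samples :: [String]
samples = ["Hello, World", "unicode-data-text", "ABC xyz 123", ""]

-- Stand-ins for the Text-based case conversions, built from Data.Char.
lowerIsIdempotent, upperIsIdempotent :: Bool
lowerIsIdempotent = isIdempotentOn (map toLower) samples
upperIsIdempotent = isIdempotentOn (map toUpper) samples
```

Loading this in GHCi and evaluating `lowerIsIdempotent` or `upperIsIdempotent` checks the property on the sample strings. A function that is not idempotent, such as `(++ "x")`, fails the same check.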
Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+{-# LANGUAGE CPP #-}
+
+import Control.DeepSeq (NFData, force)
+import Control.Exception (evaluate)
+import Test.Tasty.Bench
+    (Benchmark, bgroup, bench, bcompare, defaultMain, env, nf)
+
+import qualified Data.Text as T
+import qualified Unicode.Char.General as G
+import qualified Unicode.Text.Case as C
+
+main :: IO ()
+main = defaultMain
+    [ bgroup "Unicode.Char.Case"
+        [ bgroup "toLower"
+            [ benchCaseConv "text" T.toLower
+            , bcompare' "toLower" "text"
+                (benchCaseConv "unicode-data (fusion)" C.toLowerFusion)
+#if MIN_VERSION_text(2,0,0)
+            , bcompare' "toLower" "text"
+                (benchCaseConv "unicode-data (no fusion)" C.toLower)
+#endif
+            ]
+        , bgroup "toUpper"
+            [ benchCaseConv "text" T.toUpper
+            , bcompare' "toUpper" "text"
+                (benchCaseConv "unicode-data (fusion)" C.toUpperFusion)
+#if MIN_VERSION_text(2,0,0)
+            , bcompare' "toUpper" "text"
+                (benchCaseConv "unicode-data (no fusion)" C.toUpper)
+#endif
+            ]
+        , bgroup "toTitleText"
+            [ benchCaseConv "text" T.toTitle
+            , bcompare' "toTitleText" "text"
+                (benchCaseConv "unicode-data (fusion)" C.toTitleFusion)
+            ]
+        , bgroup "toCaseFold"
+            [ benchCaseConv "text" T.toCaseFold
+            , bcompare' "toCaseFold" "text"
+                (benchCaseConv "unicode-data (fusion)" C.toCaseFoldFusion)
+#if MIN_VERSION_text(2,0,0)
+            , bcompare' "toCaseFold" "text"
+                (benchCaseConv "unicode-data (no fusion)" C.toCaseFold)
+#endif
+            ]
+        ]
+    ]
+    where
+
+    -- [NOTE] Works if groupTitle uniquely identifies the benchmark group.
+    bcompare' :: String -> String -> Benchmark -> Benchmark
+    bcompare' groupTitle ref = bcompare
+        (mconcat ["$NF == \"", ref, "\" && $(NF-1) == \"", groupTitle, "\""])
+
+    benchCaseConv
+        :: String
+        -> (T.Text -> T.Text)
+        -> Benchmark
+    benchCaseConv t f = benchNF t f (T.pack (filter isValid [minBound..maxBound]))
+        -- where isValid c = G.generalCategory c < G.Surrogate
+        where isValid = G.isAlphabetic
+
+    benchNF
+        :: forall a b. (NFData a, NFData b)
+        => String
+        -> (a -> b)
+        -> a
+        -> Benchmark
+    benchNF t f a =
+        -- Avoid side-effects with garbage collection (see tasty-bench doc)
+        env
+            (evaluate (force a)) -- initialize
+            (bench t . nf f) -- benchmark
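
The `bcompare'` helper in the benchmark above assembles its comparison pattern by string concatenation, which makes the final pattern hard to read at a glance. As a minimal sketch using only `base`, the function below repeats that concatenation so the resulting string can be inspected. The name `comparePattern` is hypothetical and not part of the commit.

```haskell
-- Hypothetical helper (not part of the commit): builds the same string
-- that bcompare' passes to tasty-bench's bcompare.
-- In the awk-style pattern, $NF is the last path component (the benchmark
-- name) and $(NF-1) is the enclosing group name.
comparePattern :: String -> String -> String
comparePattern groupTitle ref =
    mconcat ["$NF == \"", ref, "\" && $(NF-1) == \"", groupTitle, "\""]
```

For example, `putStrLn (comparePattern "toLower" "text")` prints `$NF == "text" && $(NF-1) == "toLower"`. The pattern matches on the group name alone rather than the full benchmark path, which is why the note in the source requires `groupTitle` to identify the group uniquely.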
