README.rst: 21 additions & 0 deletions
@@ -13,6 +13,27 @@ Uses
This is used as the scanner inside `Mathics <https://mathics.org>`_ but it can also be used for tokenizing and formatting WL code. In fact, we intend to write such a formatter.

Implementation
==============

mathics_scanner.characters
--------------------------

This module consists mostly of translation tables between WL and Unicode/ASCII.
Because these tables are large, we store them in a file and read them from
disk at runtime (when the module is imported). Our tests showed that storing
the tables as JSON and reading them with
`ujson <https://github.com/ultrajson/ultrajson>`_ is the most efficient way to
access them. However, this is merely an implementation detail, and consumers
of this library should not rely on it.
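
As a rough illustration of the loading step described above, the sketch below
reads a JSON table from disk once and falls back to the standard ``json``
module when ``ujson`` is not installed. The ``load_tables`` helper, the
``wl-to-unicode`` key, and the sample entry are hypothetical stand-ins; the
actual layout of ``data/characters.json`` is not specified here.

```python
import json
import os
import tempfile

try:
    # ujson is a third-party package; it offers the same load() call used below.
    import ujson as json_reader
except ImportError:
    # Fall back to the standard library so the sketch runs anywhere.
    json_reader = json


def load_tables(path):
    """Read precompiled translation tables from a JSON file (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as f:
        return json_reader.load(f)


# Hypothetical stand-in for data/characters.json: one WL named character
# mapped to its Unicode equivalent.
sample = {"wl-to-unicode": {"\\[Alpha]": "\u03b1"}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "characters.json")
    # Write the sample table to disk, as a build step might.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    # Load it once, as a module might at import time.
    tables = load_tables(path)

alpha = tables["wl-to-unicode"]["\\[Alpha]"]
```

Both readers expose the same ``load`` function, so swapping one for the other
does not change the calling code, which fits the idea that the storage format
is an implementation detail.
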

For maintainability and efficiency, we store this data in a human-readable
YAML file (``data/named-characters.yml``) and compile it into the JSON tables
used internally by the library (``data/characters.json``) for faster access at
runtime. The conversion is performed by the script
``admin-tools/compile-translation-tables.py`` at each commit to the