Reconstruct Arabic sentences to be used in applications that don't support Arabic
+ Reconstruct Arabic sentences to be used in applications that don't support
+ Arabic script.

- Based on [Better Arabic Reshaper](https://github.com/agawish/Better-Arabic-Reshaper/), ported to Python, tweaked a little bit.
+ Works with Python 2.x and 3.x

- Arabic is a very special script language with two essential features:
+ ## Description
+
+ Arabic script is very special with two essential features:

1. It is written from right to left.
2. The characters change shape according to their surrounding characters.
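
The second feature is visible in Unicode itself, which encodes the contextual shapes of each letter as separate presentation-form code points. The following is a minimal sketch using only the standard library's `unicodedata` module; the letter BEH is picked arbitrarily as an example, and the helper names `describe_forms` and `base_letter` are made up for illustration. It shows the idea of "one letter, several shapes" and is not a description of how this library is implemented.

```python
import unicodedata

# Base letter BEH and its four contextual presentation forms
# (code points from the Arabic Presentation Forms-B block).
BEH = "\u0628"
BEH_FORMS = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}


def describe_forms(forms):
    """Map each form label to the official Unicode name of its code point."""
    return {label: unicodedata.name(char) for label, char in forms.items()}


def base_letter(form_char):
    """Fold a presentation form back to its base letter via NFKC normalization."""
    return unicodedata.normalize("NFKC", form_char)


for label, name in describe_forms(BEH_FORMS).items():
    print("{:>8}: {}".format(label, name))
```

A renderer without Arabic support typically draws only the isolated shape for every letter, which is the first problem described in this section.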

- So when you try to print Arabic text in an application – or a library – that doesn’t support Arabic you’re pretty likely to end up with something that looks like this:
+ So when you try to print text written in Arabic script in an application
+ – or a library – that doesn’t support Arabic you’re pretty likely to end up
+ with something that looks like this:

- We have two problems here, first, the characters are in the isolated form, which means that every character is rendered regardless of its surroundings, and second is that the text is written from left to right.
+ We have two problems here, first, the characters are in the isolated form,
+ which means that every character is rendered regardless of its surroundings,
+ and second is that the text is written from left to right.

- To solve the latter issue all we have to do is to use the [Unicode bidirectional algorithm](http://unicode.org/reports/tr9/), which is implemented purely in Python in [python-bidi](https://github.com/MeirKriheli/python-bidi). If you use it you’ll end up with something that looks like this:
+ To solve the latter issue all we have to do is to use the
+ [Unicode bidirectional algorithm](http://unicode.org/reports/tr9/), which is
+ implemented purely in Python in
+ [python-bidi](https://github.com/MeirKriheli/python-bidi).
+ If you use it you’ll end up with something that looks like this:

- The only issue left to solve is to reshape those characters and replace them with their correct shapes according to their surroundings. Using this library helps with the reshaping so we can get the proper result like this:
+ The only issue left to solve is to reshape those characters and replace them
+ with their correct shapes according to their surroundings. Using this library
+ helps with the reshaping so we can get the proper result like this:

@@ -30,26 +43,128 @@ The only issue left to solve is to reshape those characters and replace them wit
The `pass_arabic_text_to_render` function here is imaginary; it only shows that `bidi_text` is the variable you would use in your code afterwards, for example to print it in a PDF or to write it on an image.
+ ### Via ArabicReshaper instance `configuration_file`

- For more info visit my blog [post here](http://mpcabd.xyz/python-arabic-text-reshaper/)
+ You can separate the configuration from your code by copying the file
+ [default-config.ini](default-config.ini) and changing its settings,
+ then saving it somewhere in your project. You can then tell the reshaper
+ to use your new config file by passing the path to your config file to its
+ constructor's `configuration_file` parameter like this:
+ Where in your `config.ini` you can have something like this:
+
+ ```
+ [ArabicReshaper]
+ delete_harakat = no
+ support_ligatures = yes
+ RIAL SIGN = yes
+ ```
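
To make the meaning of the `yes`/`no` values above concrete, here is a small sketch that parses the same sample with Python's standard `configparser` module. How the reshaper itself reads the file is not shown in this diff, so treat the parsing semantics here as an assumption; `load_settings` is a hypothetical helper, not part of the library.

```python
import configparser

# The sample configuration from above, embedded as a string for the demo.
SAMPLE_CONFIG = """\
[ArabicReshaper]
delete_harakat = no
support_ligatures = yes
RIAL SIGN = yes
"""


def load_settings(config_text):
    """Parse INI text and return the [ArabicReshaper] options as booleans."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["ArabicReshaper"]
    # configparser lower-cases option names by default, so "RIAL SIGN"
    # comes back as "rial sign".
    return {name: section.getboolean(name) for name in section}


settings = load_settings(SAMPLE_CONFIG)
print(settings)
```

Option names may contain spaces, which is why a ligature name such as `RIAL SIGN` can be used as a key.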
+
+ ### Via `PYTHON_ARABIC_RESHAPER_CONFIGURATION_FILE` environment variable
+
+ Instead of having to rewrite your old code to configure it like above, you can
+ define an environment variable with the name
+ `PYTHON_ARABIC_RESHAPER_CONFIGURATION_FILE` and in its value put the full path
+ to the configuration file. This way the reshape function will pick it up
+ automatically, and you won't have to change your old code.
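
The lookup described above could be sketched roughly as follows. Only the variable name comes from this README; the helper `resolve_config_path` and the precedence it applies (an explicit path first, then the environment variable, then nothing) are assumptions made for illustration and may differ from what the library actually does.

```python
import os

ENV_VAR = "PYTHON_ARABIC_RESHAPER_CONFIGURATION_FILE"


def resolve_config_path(explicit_path=None, environ=None):
    """Hypothetical helper: pick a config path.

    An explicitly passed path wins; otherwise the environment variable is
    consulted; if neither is set, None is returned so that the caller can
    fall back to built-in defaults.
    """
    environ = os.environ if environ is None else environ
    if explicit_path:
        return explicit_path
    return environ.get(ENV_VAR) or None


# Simulate the environment instead of mutating the real one.
fake_env = {ENV_VAR: "/path/to/your/config.ini"}
print(resolve_config_path(environ=fake_env))
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.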
## Known Issue

- [Harakat or Tashkeel](http://en.wikipedia.org/wiki/Arabic_diacritics#Tashkil_.28marks_used_as_phonetic_guides.29) are not supported, and I think that they can't be supported as their unicode characters are non-spacing marks (i.e. they don't take space, they are rendered in the same space of the character before them), which means that when used in a reshaper, they will be rendered on the next character as the text is reversed.
+ When using a library/app that doesn't support right-to-left text rendering,
+ [Harakat or Tashkeel](http://en.wikipedia.org/wiki/Arabic_diacritics#Tashkil_.28marks_used_as_phonetic_guides.29)
+ cannot be supported, because their Unicode characters are non-spacing marks
+ (i.e. they don't take space, they are rendered in the same space of the
+ character before them), which means that when you keep them and pass the
+ reshaped text to `bidi.algorithm.get_display`, they will end up being rendered
+ on the next character, not the character they should be on, as the text is
+ reversed.
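
The effect can be reproduced with the standard library alone. In this sketch a plain string reversal stands in for the visual reordering (the real bidirectional algorithm is more involved, so this is a simplification), and `is_nonspacing_mark` and `carrier_of` are hypothetical helpers written for the demonstration.

```python
import unicodedata

BEH = "\u0628"    # ARABIC LETTER BEH
FATHA = "\u064E"  # ARABIC FATHA, one of the harakat
TEH = "\u062A"    # ARABIC LETTER TEH

# In logical order the FATHA follows BEH, so it belongs to BEH.
word = BEH + FATHA + TEH


def is_nonspacing_mark(char):
    """True if the character has Unicode general category Mn."""
    return unicodedata.category(char) == "Mn"


def carrier_of(text, mark):
    """Return the character right before `mark`, i.e. the one it is drawn on."""
    return text[text.index(mark) - 1]


# Naive left-to-right "display" order: TEH + FATHA + BEH.
naive_reversed = word[::-1]

print(is_nonspacing_mark(FATHA))
print(unicodedata.name(carrier_of(word, FATHA)))
print(unicodedata.name(carrier_of(naive_reversed, FATHA)))
```

After the reversal the mark sits behind TEH instead of BEH, so a renderer that draws each mark on the preceding character attaches it to the wrong letter.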
## License

- This work is licensed under [GNU General Public License v3](http://www.gnu.org/licenses/gpl.txt).
+ This work is licensed under
+ [GNU General Public License v3](http://www.gnu.org/licenses/gpl.txt).