Hi Chris,
I temporarily commented out line 93 of the tables/hu-hu-g1_braille_input.cti file and looked at the output of the louis check tests/braille-specs/hu-hu-g1_dictionary_special_consonants.yaml command.
As usual, there are many back-translation failures related to the capsletter and begcapsword indicators: louis-rs back-translates the capsletter indicator of the hu-hu-g1.ctb table as a $ character, and the begcapsword indicator as two $ characters.
Look at the following examples:
- A forward translation failure that works correctly in Liblouis:
Original word:
9-sz-7760
Right Liblouis unicode Braille:
⠼⠊⠤⠱⠤⠼⠛⠛⠋⠚
Wrong louis-rs produced braille:
⠼⠪⠤⠱⠤⠼⠻⠻⠫⠬
So in the forward direction louis-rs writes every digit with its eight-dot variant instead of the six-dot combination defined by the litdigit opcode, both before and after the first hyphen.
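To compare the two outputs cell by cell, each Unicode braille character can be decoded into its dot numbers. The following is a minimal Python sketch and is not part of Liblouis or louis-rs; the helper names cell_dots and compare are made up for illustration. It relies only on the Unicode braille block layout, where bit n of the offset from U+2800 stands for dot n + 1.

```python
def cell_dots(cell: str) -> str:
    """Return the dot numbers of one Unicode braille cell, e.g. '⠊' -> '24'."""
    offset = ord(cell) - 0x2800
    if not 0 <= offset <= 0xFF:
        raise ValueError(f"not a braille pattern: {cell!r}")
    # Bit n (0-based) of the offset corresponds to dot n + 1.
    dots = "".join(str(d) for d in range(1, 9) if offset & (1 << (d - 1)))
    return dots or "0"  # "0" stands for the empty cell


def compare(expected: str, actual: str) -> list[tuple[int, str, str]]:
    """List (index, expected dots, actual dots) for every cell that differs."""
    return [
        (i, cell_dots(e), cell_dots(a))
        for i, (e, a) in enumerate(zip(expected, actual))
        if e != a
    ]


expected = "⠼⠊⠤⠱⠤⠼⠛⠛⠋⠚"  # Liblouis output quoted in this report
actual = "⠼⠪⠤⠱⠤⠼⠻⠻⠫⠬"  # louis-rs output quoted in this report
for index, want, got in compare(expected, actual):
    print(f"cell {index}: expected dots {want}, got dots {got}")
```

Only the digit cells are reported as different; the number sign, the hyphens and sz match in both outputs.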
Look at what happens in Liblouis if I run, for example, the lou_trace unicode.dis,hu-hu-g1.ctb command in the forward direction and paste the original input text:
root@hammera-pc:/mnt/@/vagrant# lou_trace unicode.dis,hu-hu-g1.ctb
9-sz-7760
⠼⠊⠤⠱⠤⠼⠛⠛⠋⠚
1. litdigit 9 24
2. hyphen - 36
3. always sz 156
4. hyphen - 36
5. litdigit 7 1245
6. litdigit 7 1245
7. litdigit 6 124
8. litdigit 0 245
Look at what happens if I back-translate the correct braille output with the original Liblouis, using the lou_trace -b unicode.dis,hu-hu-g1.ctb command:
root@hammera-pc:/mnt/@# lou_trace -b unicode.dis,hu-hu-g1.ctb
⠼⠊⠤⠱⠤⠼⠛⠛⠋⠚
9-sz-7760
1. numsign 3456
2. litdigit 9 24
3. hyphen - 36
4. endnum sz 156
5. hyphen - 36
6. numsign 3456
7. litdigit 7 1245
8. litdigit 7 1245
9. litdigit 6 124
10. litdigit 0 245
- Original word: 9-Sz-7760
Right unicode Braille in Liblouis:
⠼⠊⠤⠨⠱⠤⠼⠛⠛⠋⠚
Wrong louis-rs unicode Braille:
⠼⠪⠤⠨⠎⠣⠤⠼⠻⠻⠫⠬
lou_trace unicode.dis,hu-hu-g1.ctb result:
root@hammera-pc:/mnt/@# lou_trace unicode.dis,hu-hu-g1.ctb
9-Sz-7760
⠼⠊⠤⠨⠱⠤⠼⠛⠛⠋⠚
1. litdigit 9 24
2. hyphen - 36
3. always sz 156
4. hyphen - 36
5. litdigit 7 1245
6. litdigit 7 1245
7. litdigit 6 124
8. litdigit 0 245
lou_trace -b unicode.dis,hu-hu-g1.ctb command result:
root@hammera-pc:/mnt/@# lou_trace unicode.dis,hu-hu-g1.ctb -b
⠼⠊⠤⠨⠱⠤⠼⠛⠛⠋⠚
9-Sz-7760
1. numsign 3456
2. litdigit 9 24
3. hyphen - 36
4. capsletter 46
5. postpunc sz 156
6. hyphen - 36
7. numsign 3456
8. litdigit 7 1245
9. litdigit 7 1245
10. litdigit 6 124
11. litdigit 0 245
- Look at what happens, for example, with a test case that mixes begcapsword and numsign:
Input word: 9-SZ-7760
Right unicode braille in Liblouis:
⠼⠊⠤⠨⠨⠱⠤⠼⠛⠛⠋⠚
Wrong louis-rs unicode Braille:
⠼⠪⠤⠨⠨⠎⠣⠤⠼⠻⠻⠫⠬
Look at what happens in Liblouis when I run the lou_trace unicode.dis,hu-hu-g1.ctb command:
root@hammera-pc:/mnt/@# lou_trace unicode.dis,hu-hu-g1.ctb
9-SZ-7760
⠼⠊⠤⠨⠨⠱⠤⠼⠛⠛⠋⠚
1. litdigit 9 24
2. hyphen - 36
3. always sz 156
4. hyphen - 36
5. litdigit 7 1245
6. litdigit 7 1245
7. litdigit 6 124
8. litdigit 0 245
Look at what happens if I back-translate the correct Unicode braille output (I simply ran the lou_trace -b unicode.dis,hu-hu-g1.ctb command and pasted the correct braille input):
root@hammera-pc:/mnt/@# lou_trace unicode.dis,hu-hu-g1.ctb -b
⠼⠊⠤⠨⠨⠱⠤⠼⠛⠛⠋⠚
9-SZ-7760
1. numsign 3456
2. litdigit 9 24
3. hyphen - 36
4. begcapsword 46-46
5. postpunc sz 156
6. hyphen - 36
7. numsign 3456
8. litdigit 7 1245
9. litdigit 7 1245
10. litdigit 6 124
11. litdigit 0 245
- A failure with a simple word that contains no capsletter indicator:
Failure { input: "⠁⠃⠕⠗⠞⠥⠱⠱⠁⠃⠈⠸⠕⠣⠈⠎⠞", expected: "abortuszszabályozást", actual: "abortuszszabá\u{7f}ozást", direction: Backward }
The tables/hu-chardefs.cti file defines the \x007f character with the dots 456 combination:
letter \x007f 456
Look at what happens if I back-translate the correct Unicode braille in Liblouis:
root@hammera-pc:/usr/src/liblouis/tables# lou_trace unicode.dis,hu-hu-g1.ctb -b
⠁⠃⠕⠗⠞⠥⠱⠱⠁⠃⠈⠸⠕⠣⠈⠎⠞
abortuszszabályozást
1. lowercase a 1
2. lowercase b 12
3. lowercase o 135
4. lowercase r 1235
5. lowercase t 2345
6. lowercase u 136
7. always ssz 156-156
8. lowercase a 1
9. lowercase b 12
10. lowercase á 4
11. always ly 456
12. lowercase o 135
13. lowercase z 126
14. lowercase á 4
15. lowercase s 234
16. lowercase t 2345
17. correct "bortusszab" "bortuszszab"
So the louis-rs binary replaces ly with the \u{7f} character: it appears to use the letter \x007f 456 definition from hu-chardefs.cti rather than the always ly 456 rule, although both are defined.
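The conflict can be illustrated with a toy model in Python. This is only a sketch under assumptions, not how Liblouis or louis-rs actually select rules: the CHAR_DEFS and ALWAYS_RULES dictionaries, the back_translate function and the prefer_rules switch are invented for illustration, and the entries are copied from the trace and the letter line quoted above.

```python
def cell_dots(cell: str) -> str:
    """Return the dot numbers of one Unicode braille cell, e.g. '⠸' -> '456'."""
    offset = ord(cell) - 0x2800
    return "".join(str(d) for d in range(1, 9) if offset & (1 << (d - 1)))


# Character definitions seen in this report (dots -> character).
CHAR_DEFS = {"1": "a", "12": "b", "4": "á", "135": "o", "456": "\x7f"}
# The competing translation rule from the report: always ly 456.
ALWAYS_RULES = {"456": "ly"}


def back_translate(braille: str, prefer_rules: bool) -> str:
    """Toy back-translation: one cell at a time, no context handling."""
    out = []
    for cell in braille:
        dots = cell_dots(cell)
        if prefer_rules and dots in ALWAYS_RULES:
            out.append(ALWAYS_RULES[dots])  # the rule wins over the bare character
        else:
            out.append(CHAR_DEFS[dots])  # fall back to the character definition
    return "".join(out)


fragment = "⠁⠃⠈⠸⠕"  # the cells behind "abályo" in the failing word
print(repr(back_translate(fragment, prefer_rules=True)))   # 'abályo'
print(repr(back_translate(fragment, prefer_rules=False)))  # 'abá\x7fo'
```

With prefer_rules=True the toy model gives the Liblouis result for this fragment; with prefer_rules=False it gives the same \x7f that appears in the louis-rs failure.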
I attach the entire test log produced by the test harness. I do not currently have open the virtual machine that has Rust and the louis-rs project installed, but if needed, I will gladly run any required tests for these four examples.
test_dictionary.txt
Hopefully this issue helps you with the development,
Attila