-
Notifications
You must be signed in to change notification settings - Fork 8
Expand file tree
/
Copy pathREADME
More file actions
94 lines (62 loc) · 2.99 KB
/
README
File metadata and controls
94 lines (62 loc) · 2.99 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
# Catalan: apertium-cat
This is an Apertium monolingual language package for Catalan. What you can use this language package for:
* Morphological analysis of Catalan
* Morphological generation of Catalan
* Part-of-speech tagging of Catalan
## Requirements
You will need the following software installed:
* lttoolbox (>= 3.7.1)
* apertium (>= 3.8.3)
* vislcg3 (>= 1.3.9)
If this does not make any sense, we recommend you look at: https://apertium.org
## Compiling
Given the requirements being installed, you should be able to just run:
```console
$ ./autogen.sh
$ make
```
If you're doing development, you don't have to install the data, you can use it directly from this directory.
If you are installing this language package as a prerequisite for an Apertium translation pair, then do (typically as root / with sudo):
```console
# make install
```
You can give a `--prefix` to `./autogen.sh` to install as a non-root user, but make sure to use the same prefix when installing the translation pair and any other language packages.
If any of this doesn't make sense or doesn't work, see https://wiki.apertium.org/wiki/Install_language_data_by_compiling
## Testing
If you are in the source directory after running make, the following commands should work:
```console
$ echo "la casa" | apertium -d . cat-morph
^la/la<n><m><sp>/el<det><def><f><sg>/el<prn><pro><p3><f><sg>/prpers<prn><pro><p3><f><sg>$ ^casa/casa<n><f><sg>/casar<vblex><pri><p3><sg>/casar<vblex><imp><p2><sg>$^./.<sent>$
$ echo "la casa" | apertium -d . cat-tagger
^el<det><def><f><sg>$ ^casa<n><f><sg>$^.<sent>$
```
## Tagger model training
To train the tagger model, do one of the following:
Supervised training:
```console
$ make -f tagger.supervised.make
```
Unsupervised training
```console
$ make -f tagger.unsupervised.make
```
For details on the corpora used in training, check the [corpora information](tagger-data/README.md).
For more information, see https://wiki.apertium.org/wiki/Tagger_training
## Files and data
* [`apertium-cat.cat.metadix`](apertium-cat.cat.metadix) - Monolingual dictionary
* [`cat.prob`](cat.prob) - Tagger model
* [`apertium-cat.cat.rlx`](apertium-cat.cat.rlx) - Constraint Grammar disambiguation rules
* [`apertium-cat.post-cat.dix`](apertium-cat.post-cat.dix) - Post-generator
* [`modes.xml`](modes.xml) - Translation modes
* [`apertium-cat.cat.prefs.rlx`](apertium-cat.cat.prefs.rlx) - Generator preference rules
* [`cat.preferences.xml`](cat.preferences.xml) - Generator preference list
## For more information
* https://wiki.apertium.org/wiki/Installation
* https://wiki.apertium.org/wiki/apertium-cat
* https://wiki.apertium.org/wiki/Using_an_lttoolbox_dictionary
## Help and support
If you need help using this language pair or data, you can contact:
* Mailing list (Catalan): apertium-catala@lists.sourceforge.net
* Mailing list (general): apertium-stuff@lists.sourceforge.net
* IRC: `#apertium` on `irc.oftc.net` (irc://irc.oftc.net/#apertium)
See also the file [`AUTHORS`](AUTHORS), included in this distribution.