Commit f2ff4ac

Add docs
1 parent fd8864d commit f2ff4ac

1 file changed: +37 -0 lines changed

docs/source/api/disambig/bert.rst

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
camel_tools.disambig.bert
=========================

.. automodule:: camel_tools.disambig.bert

Classes
-------

.. autoclass:: camel_tools.disambig.bert.BERTUnfactoredDisambiguator
   :members:


Examples
--------

The example below loads the default pre-trained CAMeLBERT-based model and uses
it to disambiguate the words in a sentence.

.. code-block:: python

    from camel_tools.disambig.bert import BERTUnfactoredDisambiguator

    unfactored = BERTUnfactoredDisambiguator.pretrained()

    # The disambiguator expects a sentence that has already been tokenized on
    # whitespace and punctuation. camel_tools provides a simple tokenizer for
    # this; see camel_tools.tokenizers.word.simple_word_tokenize.
    sentence = ['سوف', 'نقرأ', 'الكتب']

    disambig = unfactored.disambiguate(sentence)

    # As an example, use the top disambiguation of each word to generate a
    # diacritized version of the sentence above.
    # Note that, in practice, you need to make sure that each word has a
    # non-empty list of analyses before indexing into it.
    diacritized = [d.analyses[0].analysis['diac'] for d in disambig]
    print(' '.join(diacritized))
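
The note about empty analysis lists can be handled with a small guard. The sketch below is an illustration only: ``FakeWord`` and ``FakeScored`` are hypothetical stand-ins that merely mimic the attribute access used in the example above (``d.analyses[0].analysis['diac']``, plus an assumed ``word`` attribute holding the original token), so that the snippet runs without downloading a model. The helper name ``top_diac`` and the placeholder strings are likewise made up for this sketch and are not part of camel_tools.

```python
from collections import namedtuple

# Hypothetical stand-ins that mimic the attribute access used in the example
# above. They are NOT the camel_tools classes; the `word` field is an
# assumption about where the original token is stored.
FakeScored = namedtuple('FakeScored', ['score', 'analysis'])
FakeWord = namedtuple('FakeWord', ['word', 'analyses'])


def top_diac(disambiguated):
    """Return the top 'diac' value per word, or the word itself as a fallback."""
    result = []
    for d in disambiguated:
        if d.analyses and 'diac' in d.analyses[0].analysis:
            # Use the first analysis in the list, as the example above does.
            result.append(d.analyses[0].analysis['diac'])
        else:
            # No usable analysis: fall back to the undiacritized token.
            result.append(d.word)
    return result


# Placeholder data: these strings are dummy values, not model output.
sample = [
    FakeWord('w1', [FakeScored(1.0, {'diac': 'W1_DIAC'})]),
    FakeWord('w2', []),
]
print(' '.join(top_diac(sample)))
```

Falling back to the original token keeps the output aligned one-to-one with the input sentence. Whether that is preferable to raising an error or skipping the word depends on the application.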
