Documentation on tuning the lemmatizer

**Is your feature request related to a problem? Please describe.**
I am using some sample code to tokenize Spanish text and extract its lemmas. It's not working quite as expected, so I'm looking for documentation on the tokenizer/lemmatizer, or for ways to tune it.

**Describe the solution you'd like**
I gave it the string "Quiero llamarte Susan" and expected the lemma of "llamarte" to be "llamar", but it came back as "llamarte".

I'd like to know where to go next to learn more about what is happening and what's expected.

I'm using this code to tokenize it:

```c#
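// Namespaces assumed: Catalyst (Pipeline, Document) and Mosaik.Core (Language).
using Catalyst;
using Mosaik.Core;

// Register the Spanish models before building the pipeline.
Catalyst.Models.Spanish.Register();
var nlp = await Pipeline.ForAsync(Language.Spanish);

var text = "Quiero llamarte Susan"; // sample input from above
var doc = new Document(text, Language.Spanish);
nlp.ProcessSingle(doc);

var tokenList = doc.ToTokenList();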
```

I'm not sure whether there's any tuning or tweaking I can do to get the desired result, whether this output is expected, or whether it's a limitation of the lemmatizer.
