0.1.0-beta11
Pre-release
General
- Refactor package structure to make it more suitable for expansion to other languages.
Note: There aren't any plans to support other languages yet.
Model
- Reduce duplication in training data
- Introduce PURPOSE label for tokens that describe the purpose of the ingredient, such as "for the dressing" and "for garnish".
- Replace quantities with "!num" when determining the features for tokens so that the model doesn't need to learn all possible values quantities can take. This results in a small reduction in model size.
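
As a rough illustration of the "!num" substitution, the sketch below replaces numeric tokens with a placeholder before features are built. It is not the package's actual feature code: the `NUMERIC` pattern and the `normalise_token` and `token_features` helpers are hypothetical names chosen for this example, and the real set of features is likely larger.

```python
import re

# Hypothetical pattern (an assumption, not the package's own): matches
# integers, decimals, simple fractions and ranges such as "1", "2.5", "1/2", "1-2".
NUMERIC = re.compile(r"^\d+([.,/-]\d+)*$")


def normalise_token(token: str) -> str:
    """Return "!num" for numeric tokens, otherwise the lower-cased token."""
    return "!num" if NUMERIC.match(token) else token.lower()


def token_features(tokens: list[str], index: int) -> dict[str, str]:
    """Build a small, illustrative feature dict for the token at `index`."""
    features = {"token": normalise_token(tokens[index])}
    if index > 0:
        features["prev_token"] = normalise_token(tokens[index - 1])
    if index < len(tokens) - 1:
        features["next_token"] = normalise_token(tokens[index + 1])
    return features
```

Because "2", "2.5" and "1/2" all map to the same feature value, the model only has to learn one weight for "a quantity appears here" rather than one per distinct number, which is where the reduction in model size would come from.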
Processing
- Various bug fixes to post-processing of tokens with labels NAME, COMMENT, PREP, PURPOSE, SIZE to correct punctuation and confidence calculations.
- Modification of the tokeniser to split full stops from the end of tokens. This helps the model avoid treating "token." and "token" as different cases to learn.
- Add fallback functionality to `parse_ingredient` for cases where none of the tokens are labelled as NAME. This selects as the name the token with the highest confidence of being labelled NAME, even if a different label has a higher confidence for that token. This can be disabled by setting `expect_name_in_output=False` in `parse_ingredient`.
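
The two behaviours above can be sketched in a few lines of plain Python. This is a simplified illustration under assumptions, not the package's implementation: `split_full_stop` and `select_name` are hypothetical helpers, and the per-token NAME confidences are assumed to be available as a plain list of floats. Only the `expect_name_in_output` parameter name is taken from the release notes.

```python
from typing import Optional


def split_full_stop(token: str) -> list[str]:
    """Split a single trailing full stop into its own token."""
    if len(token) > 1 and token.endswith(".") and not token.endswith(".."):
        return [token[:-1], "."]
    return [token]


def select_name(
    tokens: list[str],
    labels: list[str],
    name_confidence: list[float],
    expect_name_in_output: bool = True,
) -> Optional[str]:
    """Return the ingredient name, falling back to the most NAME-like token."""
    # Normal case: at least one token was labelled NAME.
    named = [tok for tok, label in zip(tokens, labels) if label == "NAME"]
    if named:
        return " ".join(named)
    # Fallback disabled, or nothing to choose from.
    if not expect_name_in_output or not tokens:
        return None
    # Fallback: pick the token with the highest confidence of being NAME,
    # even though another label scored higher for that token.
    best = max(range(len(tokens)), key=lambda i: name_confidence[i])
    return tokens[best]
```

For example, with tokens `["2", "cups", "flour"]` labelled `["QTY", "UNIT", "COMMENT"]` and NAME confidences `[0.01, 0.05, 0.4]`, the fallback picks "flour"; passing `expect_name_in_output=False` returns `None` instead.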