Release 0.1.0-beta4 · strangetom/ingredient-parser

Include new source of training data: cookstr.
- 10,000 additional ingredient sentences from the archive of 7918 recipes (~40,000 total ingredient sentences) found at https://archive.org/details/recipes-en-201706 are now used in the training of the model.
The parse_ingredient function now returns a ParsedIngredient dataclass instead of a dict.
- Remove dependency on typing_extensions as a result of this
A model card is now provided that describes how the model was trained, how it performs, how it is intended to be used, and its limitations.
- The model card is distributed with the package and there is a function show_model_card() that will open the model card in the default application for markdown files.
Improvements to the ingredient sentence preprocessing:
- Expand the list of units
- Tweak the tokenizer to handle more punctuation
- Fix various bugs with the cleaning steps
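
For callers, the move from a dict to a dataclass mainly changes how the result is read. The following is a minimal sketch of that difference using a stand-in dataclass, not the library's own ParsedIngredient. The class name ParsedIngredientSketch and its fields (name, quantity, unit, comment) are assumptions for illustration and may not match the real field set.

```python
from dataclasses import dataclass, asdict


@dataclass
class ParsedIngredientSketch:
    # Hypothetical fields, chosen for illustration only.
    name: str
    quantity: str
    unit: str
    comment: str


# Stand-in for a value that a parsing function might return.
result = ParsedIngredientSketch(
    name="flour", quantity="2", unit="cups", comment="sifted"
)

# Attribute access replaces key lookup such as result["name"].
print(result.name)

# Code that still expects a dict can convert explicitly.
as_dict = asdict(result)
print(as_dict["unit"])
```

Attribute access lets type checkers and editors catch misspelled field names, which a plain dict lookup would only reveal at runtime.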

As a result of these updates, model performance has improved to:

Sentence-level results:
    Total: 12030
    Correct: 10776
    Incorrect: 1254
    -> 89.58% correct

Word-level results:
    Total: 75146
    Correct: 72329
    Incorrect: 2817
    -> 96.25% correct
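
The percentages above follow directly from the reported counts. As a quick check, this short sketch recomputes them; the helper name percent_correct is hypothetical and is not part of the package.

```python
def percent_correct(correct: int, total: int) -> float:
    """Return the share of correct items as a percentage, rounded to 2 places."""
    return round(100 * correct / total, 2)


# Counts copied from the results reported above.
sentence_total, sentence_correct, sentence_incorrect = 12030, 10776, 1254
word_total, word_correct, word_incorrect = 75146, 72329, 2817

# Correct and incorrect counts should add up to the totals.
assert sentence_correct + sentence_incorrect == sentence_total
assert word_correct + word_incorrect == word_total

sentence_pct = percent_correct(sentence_correct, sentence_total)
word_pct = percent_correct(word_correct, word_total)

print(f"Sentence-level: {sentence_pct}% correct")  # 89.58
print(f"Word-level: {word_pct}% correct")  # 96.25
```

The sentence-level figure is lower because a sentence only counts as correct when every word in it is labelled correctly.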
