0.1.0-beta4

Pre-release

@strangetom released this 16 Aug 19:27
· 1243 commits to master since this release
  • Include new source of training data: cookstr.
  • The parse_ingredient function now returns a ParsedIngredient dataclass instead of a dict.
    • As a result, the dependency on typing_extensions is removed.
  • A model card is now provided that describes how the model was trained, how it performs, how it is intended to be used, and its limitations.
    • The model card is distributed with the package, and the show_model_card() function opens it in the default application for Markdown files.
  • Improvements to the ingredient sentence preprocessing:
    • Expand the list of units
    • Tweak the tokenizer to handle more punctuation
    • Fix various bugs with the cleaning steps
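
To illustrate what the change from a dict to a dataclass return value means for calling code, here is a minimal, self-contained sketch. It does not import the package: the class name ParsedIngredientSketch and its fields (sentence, quantity, unit, name) are hypothetical stand-ins chosen for illustration, and the real ParsedIngredient fields may differ.

```python
from dataclasses import dataclass, asdict


@dataclass
class ParsedIngredientSketch:
    """Hypothetical stand-in for a parsed result; not the library's class."""

    sentence: str
    quantity: str
    unit: str
    name: str


# Old style (illustrative): the result is a plain dict, read by key.
old_result = {
    "sentence": "2 cups flour",
    "quantity": "2",
    "unit": "cups",
    "name": "flour",
}
old_name = old_result["name"]

# New style (illustrative): the result is a dataclass, read by attribute.
new_result = ParsedIngredientSketch(
    sentence="2 cups flour",
    quantity="2",
    unit="cups",
    name="flour",
)
new_name = new_result.name

# Code that still needs a dict can convert with dataclasses.asdict().
as_dict = asdict(new_result)
```

Code that previously indexed the result by key would need to switch to attribute access, or convert the result with dataclasses.asdict() as shown.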

As a result of these updates, the model performance has improved to:

Sentence-level results:
    Total: 12030
    Correct: 10776
    Incorrect: 1254
    -> 89.58% correct

Word-level results:
    Total: 75146
    Correct: 72329
    Incorrect: 2817
    -> 96.25% correct
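
The percentages above follow directly from the reported counts. The short check below recomputes them; the helper name percent_correct is an assumption made for this sketch and is not part of the package.

```python
def percent_correct(correct: int, total: int) -> float:
    """Return the share of correct items as a percentage, rounded to 2 places."""
    return round(100 * correct / total, 2)


# Counts copied from the results reported above.
sentence_total, sentence_correct, sentence_incorrect = 12030, 10776, 1254
word_total, word_correct, word_incorrect = 75146, 72329, 2817

# Correct and incorrect counts should add up to the totals.
sentence_counts_consistent = sentence_correct + sentence_incorrect == sentence_total
word_counts_consistent = word_correct + word_incorrect == word_total

sentence_accuracy = percent_correct(sentence_correct, sentence_total)
word_accuracy = percent_correct(word_correct, word_total)

print(f"Sentence-level: {sentence_accuracy:.2f}% correct")
print(f"Word-level: {word_accuracy:.2f}% correct")
```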