0.1.0-beta5
Pre-release
- Support the extraction of multiple amounts from the input sentence.
- Change output dataclass to put confidence values with each field.
  - The name, comment and other fields are each output as an IngredientText object containing the text and confidence.
  - The amounts are output as IngredientAmount objects, each containing the quantity, unit, confidence and flags for whether the amount is approximate or for a singular item of the ingredient.
- Rewrite post-processing functionality to make it more maintainable and extensible in the future.
- Add a model card, which provides information about the data used to train and evaluate the model, the purpose of the model and its limitations.
- Increase l1 regularisation during model training.
  - This reduces model size by a factor of ~4.
  - This should improve performance on sentences not seen before by forcing the model to rely less on labelling specific words.
- Improve the model guide in the documentation.
- Add a simple webapp that can be used to view the output of the parser in a more human-readable way.
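
The link between stronger l1 regularisation and a smaller model, noted in the list above, can be illustrated with a toy example. This is only a sketch of the general idea: the weights and penalty values are made up, the `soft_threshold` helper is hypothetical, and it is not the training code or optimiser used for this release. An l1 penalty pushes small feature weights to exactly zero, and zero weights do not need to be stored.

```python
def soft_threshold(weights, penalty):
    """Shrink each weight towards zero by `penalty`.

    Weights whose magnitude does not exceed the penalty become exactly zero.
    This is the standard shrinkage step associated with an l1 penalty.
    """
    shrunk = []
    for w in weights:
        if w > penalty:
            shrunk.append(w - penalty)
        elif w < -penalty:
            shrunk.append(w + penalty)
        else:
            shrunk.append(0.0)
    return shrunk


def count_nonzero(weights):
    """Number of weights that would still need to be stored."""
    return sum(1 for w in weights if w != 0.0)


# Made-up feature weights, purely for illustration.
weights = [0.9, -0.05, 0.02, -1.4, 0.15, 0.3, -0.08, 0.01]

light = soft_threshold(weights, 0.05)  # weaker l1 penalty
heavy = soft_threshold(weights, 0.2)   # stronger l1 penalty

print(count_nonzero(light), count_nonzero(heavy))
```

With the stronger penalty fewer weights survive, so the model depends on fewer individual features, such as specific words.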
Example of the output at this release
>>> parse_ingredient("50ml/2fl oz/3½tbsp lavender honey (or other runny honey if unavailable)")
ParsedIngredient(
    name=IngredientText(
        text='lavender honey',
        confidence=0.998829
    ),
    amount=[
        IngredientAmount(
            quantity='50',
            unit='ml',
            confidence=0.999189,
            APPROXIMATE=False,
            SINGULAR=False
        ),
        IngredientAmount(
            quantity='2',
            unit='fl oz',
            confidence=0.980392,
            APPROXIMATE=False,
            SINGULAR=False
        ),
        IngredientAmount(
            quantity='3.5',
            unit='tbsps',
            confidence=0.990711,
            APPROXIMATE=False,
            SINGULAR=False
        )
    ],
    comment=IngredientText(
        text='(or other runny honey if unavailable)',
        confidence=0.973682
    ),
    other=None,
    sentence='50ml/2fl oz/3½tbsp lavender honey (or other runny honey if unavailable)'
)
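
As a rough sketch of how the per-field confidence values and amount flags shown above might be consumed, the example below filters amounts by confidence and formats them as text. The `IngredientAmount` class here is a stand-in that only mirrors the field names in the example output, so that the snippet runs without the library installed. The `describe_amounts` helper and its `min_confidence` parameter are hypothetical and not part of the package.

```python
from dataclasses import dataclass


@dataclass
class IngredientAmount:
    """Stand-in mirroring the fields in the example output (not the library's class)."""
    quantity: str
    unit: str
    confidence: float
    APPROXIMATE: bool = False
    SINGULAR: bool = False


def describe_amounts(amounts, min_confidence=0.0):
    """Format amounts as strings, skipping any below `min_confidence` (hypothetical helper)."""
    parts = []
    for amount in amounts:
        if amount.confidence < min_confidence:
            continue
        prefix = "about " if amount.APPROXIMATE else ""
        suffix = " each" if amount.SINGULAR else ""
        parts.append(f"{prefix}{amount.quantity} {amount.unit}{suffix}")
    return parts


# Values copied from the example output above.
amounts = [
    IngredientAmount("50", "ml", 0.999189),
    IngredientAmount("2", "fl oz", 0.980392),
    IngredientAmount("3.5", "tbsps", 0.990711),
]

print(describe_amounts(amounts))
# ['50 ml', '2 fl oz', '3.5 tbsps']
print(describe_amounts(amounts, min_confidence=0.99))
# ['50 ml', '3.5 tbsps']
```

Keeping the confidence on each field, rather than in a separate structure, means a filter like this only needs the amount object itself.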