Releases · strangetom/ingredient-parser

27 Oct 13:04

strangetom

2.4.0

62926f2

2.4.0 Latest


General

Warning

This release drops support for Python 3.10.

Drop support for Python 3.10.
Add support for Python 3.14.
Require pint >= 0.25.0

Processing

Improve the part of speech tagging accuracy by extending the built-in tagdict in NLTK's part of speech tagger with ingredient specific entries.
Add name_index field to FoundationFood objects. This field refers to the index of the matching name in the ParsedIngredient.name list.
- The list of names and foundation foods are also guaranteed to be in the same order (although be aware that a name may not have a matching foundation food).
Improve processing of names, particularly related to handling of punctuation at the beginning or end of the name.
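
As a rough illustration of how name_index could be used to line up each name with its foundation food (or with nothing, when no match exists), consider the sketch below. The two dataclasses are simplified stand-ins, not the library's real classes: only the text and name_index fields are modelled, the helper pair_names_with_foods is hypothetical, and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass
class IngredientText:
    """Stand-in with only the field used here (not the library class)."""
    text: str


@dataclass
class FoundationFood:
    """Stand-in with only the fields used here (not the library class)."""
    text: str
    name_index: int


def pair_names_with_foods(names, foundation_foods):
    """Return (name, food-or-None) pairs, one per name, in name order."""
    by_index = {food.name_index: food for food in foundation_foods}
    return [(name, by_index.get(i)) for i, name in enumerate(names)]


# Made-up sample data: the second name has no matching foundation food.
names = [IngredientText("butter"), IngredientText("mystery spice mix")]
foods = [FoundationFood("Butter, stick, salted", name_index=0)]

for name, food in pair_names_with_foods(names, foods):
    print(name.text, "->", food.text if food else "no match")
```

Looking foods up by name_index rather than by list position keeps the pairing correct even when some names have no foundation food.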


13 Sep 12:59

strangetom

2.3.0

862f80c

2.3.0

Note

This release only contains changes related to the development tools for this library. There are no changes to the functionality of the library.

Development tools

Replace the labeller and webapp tools with a new tool ("webtools") written in React. Many thanks to @mcioffi for this contribution. Key functionality:
- Parser, to display the parsed output of an input ingredient sentence.
- Labeller, to edit the labelled training data or add new training data.
- Trainer, to initiate training of models.
See the docs for more information.
When generating detailed results during model training (using --detailed), also generate a file detailing classification results for features.

Contributors

mcioffi


15 Aug 17:49

strangetom

2.2.0

9a7946f

2.2.0

Foundation foods:

Bias foundation food matching to prefer "raw" FDC ingredients, but only if the ingredient name does not include any verbs that indicate the ingredient is not raw (e.g. "cooked").
Normalise spelling of tokens in ingredient names to align with spelling used in FDC ingredient descriptions.
Fix a bug where foundation foods were never calculated if separate_names=False.

General

Add logging to library, under the ingredient-parser namespace.
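
A minimal sketch of how the library's log output might be surfaced using the standard logging module. The logger name is the namespace given above; the DEBUG level and the message format are assumptions chosen for illustration, since the release note does not say at which levels the library logs.

```python
import logging

# Send log records to stderr with a basic format (format is an assumption).
logging.basicConfig(format="%(name)s %(levelname)s: %(message)s")

# The namespace named in the release note above.
logger = logging.getLogger("ingredient-parser")

# DEBUG is an assumed level; pick whichever verbosity suits your application.
logger.setLevel(logging.DEBUG)
```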

Model

Improve parser model performance with new features related to sentence structure, such as whether a token is part of an example phrase, a multi-ingredient phrase, or after the split in a compound sentence. See the Feature Generation page of the docs for more details.

Processing

Improve post processing of names to avoid returning multiple names if the name is split by a non-name token. For example, in the sentence "8 fresh large basil leaves", the name should be returned as "fresh basil leaves" and not as two separate names: "fresh", "basil leaves".


18 May 17:56

strangetom

2.1.1

e56614d

2.1.1

Pin Pint version to 0.24.4, as future versions intend to drop support for Python 3.10.


21 Apr 09:22

strangetom

2.1.0

ec619bb

2.1.0

Warning

This version replaces the floret dependency with numpy.

NumPy was already a dependency of floret, so if you are upgrading from v2.0.0 there should be little impact.

This release overhauls the foundation foods functionality so that ingredient names are matched to entries in the FoodData Central (FDC) database.

This update does not change the API. It adds additional fields to FoundationFood objects for FDC ID, category and data type. The text field now returns the description for the matching FDC entry.
Beware that enabling this functionality causes the parse_ingredient function to be much slower than when disabled (default).

| | foundation_foods=False (default) | foundation_foods=True |
|---|---|---|
| Sentences per second | ~1500 | ~20 |
This functionality works entirely offline.
See the foundation foods page of the docs for specifics.
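
Throughput will vary by machine and input, so it may be worth measuring on your own data. The sketch below shows one rough way to do that. The helper sentences_per_second and the stand_in_parse function are hypothetical; the stand-in only splits on whitespace so that the example runs without the library installed.

```python
import time


def sentences_per_second(parse, sentences):
    """Time `parse` over `sentences` and return the observed throughput."""
    start = time.perf_counter()
    for sentence in sentences:
        parse(sentence)
    elapsed = time.perf_counter() - start
    return len(sentences) / elapsed if elapsed > 0 else float("inf")


def stand_in_parse(sentence):
    """Placeholder for the real parser call; just splits on whitespace."""
    return sentence.split()


sample = ["2 tbsp butter or olive oil", "2 cups chicken or beef stock"] * 500
rate = sentences_per_second(stand_in_parse, sample)
print(f"{rate:.0f} sentences per second")
```

To compare the two modes, the stand-in could be swapped for a callable that wraps parse_ingredient with foundation_foods=False and another with foundation_foods=True.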



21 Feb 15:51

strangetom

2.0.0

d0f05bc

2.0.0

Caution

This release contains some breaking changes

ParsedIngredient.name is now a list of IngredientText objects, or an empty list if no name is identified.
The quantity_fractions optional keyword argument has been removed. IngredientAmount.quantity and IngredientAmount.quantity_max return fractions.Fraction objects. Conversion to float can be achieved by e.g.:
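```
from fractions import Fraction
quantity = Fraction(4, 3)  # stand-in for IngredientAmount.quantity
# Round to 3 decimal places
round(float(quantity), 3)  # 1.333
```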
New dependency: floret.

Processing

Identify where multiple alternative ingredients are given for the stated amount. For example:

# Simple example
>>> parse_ingredient("2 tbsp butter or olive oil").name
[
  IngredientText(text='butter', confidence=0.983045, starting_index=2),
  IngredientText(text='olive oil', confidence=0.930385, starting_index=4)
]
# Complex example
>>> parse_ingredient("2 cups chicken or beef stock").name
[
  IngredientText(text='chicken stock', confidence=0.776891, starting_index=2),
  IngredientText(text='beef stock', confidence=0.94334, starting_index=4)
]

This is enabled by default, but can be disabled by setting separate_names=False in parse_ingredient. If disabled, the ParsedIngredient.name field will be a list containing a single IngredientText object.
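
Because name is now always a list, code that previously read a single name needs a small adjustment. The following is a sketch under stated assumptions: the IngredientText dataclass is a stand-in that mirrors only the fields visible in the output above, and the values are copied from the "chicken or beef stock" example.

```python
from dataclasses import dataclass


@dataclass
class IngredientText:
    """Stand-in mirroring the fields shown in the output above."""
    text: str
    confidence: float
    starting_index: int


# Values copied from the "chicken or beef stock" example above.
names = [
    IngredientText(text="chicken stock", confidence=0.776891, starting_index=2),
    IngredientText(text="beef stock", confidence=0.94334, starting_index=4),
]

# Join all alternatives into one display string.
display = " or ".join(n.text for n in names)

# Or fall back to a single name; an empty list means no name was identified.
first = names[0].text if names else None

print(display)
print(first)
```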

Set PREPARED_INGREDIENT flag on amounts in cases like

... to yield 2 cups ...

Add convert_to(...) function to IngredientAmount and CompositeIngredientAmount dataclasses to convert the amount to the given units. Conversion between mass and volume is also supported using a default density (density of water) that can be changed.

>>> p = parse_ingredient("1 8 ounce can chopped tomatoes")
>>> # Convert "8 ounce" to grams
>>> p.amount[1].convert_to("g")
IngredientAmount(quantity=Fraction(5669904625000001, 25000000000000),
                 quantity_max=Fraction(5669904625000001, 25000000000000),
                 unit=<Unit('gram')>,
                 text='226.80 gram',
                 confidence=0.999051,
                 starting_index=1,
                 APPROXIMATE=False,
                 SINGULAR=True,
                 RANGE=False,
                 MULTIPLIER=False,
                 PREPARED_INGREDIENT=False)

>>> # Cannot convert where the quantity or unit is a string
>>> p.amount[0].convert_to("g")
TypeError: Cannot convert where quantity or unit is a string.
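
To make the mass and volume conversion idea concrete, here is a small standalone sketch of the arithmetic. It is not the library's implementation: volume_to_mass_g is a hypothetical helper, and the oil density is a rough illustrative value that does not come from the release notes. The default density of 1 g/ml approximates water, matching the default described above.

```python
from fractions import Fraction

# Conversion factors (exact by definition of the US customary cup and
# the avoirdupois ounce).
ML_PER_US_CUP = Fraction("236.5882365")
G_PER_OUNCE = Fraction("28.349523125")


def volume_to_mass_g(volume_ml, density_g_per_ml=Fraction(1)):
    """Convert a volume in millilitres to grams using a density.

    The default of 1 g/ml approximates water.
    """
    return volume_ml * density_g_per_ml


two_cups_ml = 2 * ML_PER_US_CUP

# Mass of 2 cups assuming the density of water.
print(float(volume_to_mass_g(two_cups_ml)))

# Mass of 2 cups with a rough, illustrative density for a cooking oil.
print(float(volume_to_mass_g(two_cups_ml, Fraction("0.92"))))

# A pure mass-to-mass conversion: 8 ounces expressed in grams.
print(float(8 * G_PER_OUNCE))
```

Using Fraction for the factors keeps the arithmetic exact until the final conversion to float for display.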

Model

Include custom word embeddings as features used by the model. This adds a new dependency on the floret library.


06 Dec 07:29

strangetom

1.3.2

479d771

1.3.2

Processing

Fix bug that allowed fractions in the intermediate form (i.e. #1$2) to appear in the name, prep, comment, size, purpose fields of the ParsedIngredient output.
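
For anyone cleaning up output produced by earlier versions, a regular expression along the following lines might help. This is a sketch only: it assumes the intermediate form is always "#<numerator>$<denominator>", generalised from the single "#1$2" example above, and restore_fractions is a hypothetical helper rather than part of the library.

```python
import re

# Assumed shape of the intermediate form: "#<numerator>$<denominator>".
INTERMEDIATE_FRACTION = re.compile(r"#(\d+)\$(\d+)")


def restore_fractions(text):
    """Rewrite any intermediate-form fractions as "numerator/denominator"."""
    return INTERMEDIATE_FRACTION.sub(r"\1/\2", text)


print(restore_fractions("cut into #1$2 inch pieces"))
```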


29 Nov 14:51

strangetom

1.3.1

5eeb994

1.3.1

Warning

This version requires pint >=0.24.4

General

Support Python 3.13. Requires pint >= 0.24.4.


06 Nov 19:42

strangetom

1.3.0

bd0dbbe

1.3.0

Processing

Various minor improvements to feature generation.
Add PREPARED_INGREDIENT flag to IngredientAmount objects. This is used to indicate if the amount refers to the prepared ingredient (PREPARED_INGREDIENT=True) or the unprepared ingredient (PREPARED_INGREDIENT=False).
Add starting_index attribute to IngredientText objects, indicating the index of the token that starts the IngredientText.
Improve detection of composite amounts in sentences.

Add quantity_fractions keyword argument to parse_ingredient. When True, the quantity and quantity_max fields of IngredientAmount objects will be fractions.Fraction objects instead of floats. This allows fractions such as 1/3 to be represented exactly. The default is quantity_fractions=False, which returns quantities as floats, as before. For example:

>>> parse_ingredient("1 1/3 cups flour").amount[0]
IngredientAmount(
    quantity=1.333,
    quantity_max=1.333,
    unit=<Unit('cup')>, 
    text='1 1/3 cups', 
    ...
)
>>> parse_ingredient("1 1/3 cups flour", quantity_fractions=True).amount[0]
IngredientAmount(
    quantity=Fraction(4, 3),
    quantity_max=Fraction(4, 3),
    unit=<Unit('cup')>,
    text='1 1/3 cups',
    ...
)
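
The benefit of exact fractions shows up as soon as quantities are scaled. The short sketch below uses only the standard library and the two values from the output above; scaling by 3 is an arbitrary choice for illustration.

```python
from fractions import Fraction

as_float = 1.333            # rounded float, as in the default output above
as_fraction = Fraction(4, 3)

# Scaling the recipe by 3 should give exactly 4 cups.
print(as_float * 3 == 4)        # False: the rounding error is carried along
print(as_fraction * 3 == 4)     # True: 4/3 * 3 is exactly 4
```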

Model

Addition of new dataset: tastecooking. This is a relatively small dataset, but includes a number of unique abbreviations for units and sizes.


29 Sep 12:20

strangetom

1.2.0

9321b8c

1.2.0

General

New optional keyword argument to extract foundation foods from the ingredient name. Foundation foods are the fundamental item of food, excluding any qualifiers or descriptive adjectives, e.g. for the name organic cucumber, the foundation food is cucumber.

See https://ingredient-parser.readthedocs.io/en/latest/guide/foundation.html for additional details.
Some minor post processing fixes.

