Releases: strangetom/ingredient-parser

2.4.0

27 Oct 13:04

General

Warning

This release drops support for Python 3.10.

  • Drop support for Python 3.10.
  • Add support for Python 3.14.
  • Require pint >= 0.25.0

Processing

  • Improve the part of speech tagging accuracy by extending the built-in tagdict in NLTK's part of speech tagger with ingredient specific entries.
  • Add name_index field to FoundationFood objects. This field refers to the index of the matching name in the ParsedIngredient.name list.
    • The list of names and foundation foods are also guaranteed to be in the same order (although be aware that a name may not have a matching foundation food).
  • Improve processing of names, particularly related to handling of punctuation at the beginning or end of the name.
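As a rough illustration of how name_index could be used to line names up with foundation foods, here is a minimal sketch. The Name and Food dataclasses are hypothetical stand-ins that carry only the fields needed for the example, and the food descriptions are made-up placeholders; they are not the library's own classes or real FDC entries.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the library's objects (illustration only).
@dataclass
class Name:
    text: str


@dataclass
class Food:
    text: str
    name_index: int  # index of the matching entry in the list of names


def pair_names_with_foods(names, foods):
    """Return (name text, food text or None) for every name, in name order."""
    by_index = {food.name_index: food for food in foods}
    return [
        (name.text, by_index[i].text if i in by_index else None)
        for i, name in enumerate(names)
    ]


names = [Name("butter"), Name("mystery spread"), Name("olive oil")]
# Placeholder descriptions; the second name has no matching food.
foods = [Food("placeholder entry for butter", 0),
         Food("placeholder entry for olive oil", 2)]

for name, food in pair_names_with_foods(names, foods):
    print(f"{name!r} -> {food!r}")
```

Looking foods up by name_index rather than by list position handles the case where a name has no matching foundation food.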

2.3.0

13 Sep 12:59

Note

This release only contains changes related to the development tools for this library. There are no changes to the functionality of the library.

Development tools

  • Replace the labeller and webapp tools with a new tool ("webtools") written in React. Many thanks to @mcioffi for this contribution. Key functionality:

    • Parser, to display the parsed output of an input ingredient sentence.

    • Labeller, to edit the labelled training data or add new training data.

    • Trainer, to initiate training of models.

    See the docs for more information.

  • When generating detailed results during model training (using --detailed), also generate a file detailing classification results for features.

2.2.0

15 Aug 17:49

Foundation foods:

  • Bias foundation food matching to prefer "raw" FDC ingredients, but only if the ingredient name does not include any verbs that indicate the ingredient is not raw (e.g. "cooked").
  • Normalise spelling of tokens in ingredient names to align with spelling used in FDC ingredient descriptions.
  • Fix a bug where foundation foods were never calculated if separate_names=False.

General

  • Add logging to library, under the ingredient-parser namespace.
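A minimal sketch of how these log messages might be surfaced using only the standard logging module. It assumes the logger name matches the "ingredient-parser" namespace given above; the format string and the DEBUG level are illustrative choices.

```python
import logging

# Send log records to stderr with a simple format (illustrative choice).
logging.basicConfig(format="%(name)s %(levelname)s: %(message)s")

# "ingredient-parser" is the namespace named in the release note;
# the DEBUG level is an assumption made for this example.
logger = logging.getLogger("ingredient-parser")
logger.setLevel(logging.DEBUG)
```

Setting the level on the named logger rather than the root logger leaves the verbosity of other libraries unchanged.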

Model

  • Improve parser model performance with new features related to sentence structure, such as whether a token is part of an example phrase, part of a multi-ingredient phrase, or after the split in a compound sentence. See the Feature Generation page of the docs for more details.

Processing

  • Improve post processing of names to avoid returning multiple names if the name is split by a non-name token. For example, in the sentence "8 fresh large basil leaves", the name should be returned as "fresh basil leaves" and not as two separate names: "fresh", "basil leaves".

2.1.1

18 May 17:56

  • Pin Pint version to 0.24.4, as future versions intend to drop support for Python 3.10.

2.1.0

21 Apr 09:22

Warning

This version replaces the floret dependency with numpy.

Numpy was already a dependency of floret, so if you are upgrading from v2.0.0 there should be little impact.

This release overhauls the foundation foods functionality so that ingredient names are matched to entries in the FoodData Central (FDC) database.

  • This update does not change the API. It adds additional fields to FoundationFood objects for FDC ID, category and data type. The text field now returns the description for the matching FDC entry.

  • Beware that enabling this functionality causes the parse_ingredient function to be much slower than when disabled (default).

                            foundation_foods=False (default)    foundation_foods=True
    Sentences per second    ~1500                               ~20

  • This functionality works entirely offline.

  • See the foundation foods page of the docs for specifics.

2.0.0

21 Feb 15:51

Caution

This release contains some breaking changes

  1. ParsedIngredient.name is now a list of IngredientText objects, or an empty list if no name is identified.

  2. The quantity_fractions optional keyword argument has been removed. IngredientAmount.quantity and IngredientAmount.quantity_max return fractions.Fraction objects. Conversion to float can be achieved by e.g.:

    # Round to 3 decimal places
    round(float(quantity), 3)
  3. New dependency: floret.


Processing

  • Identify where multiple alternative ingredients are given for the stated amount. For example

    # Simple example
    >>> parse_ingredient("2 tbsp butter or olive oil").name
    [
      IngredientText(text='butter', confidence=0.983045, starting_index=2),
      IngredientText(text='olive oil', confidence=0.930385, starting_index=4)
    ]
    # Complex example
    >>> parse_ingredient("2 cups chicken or beef stock").name
    [
      IngredientText(text='chicken stock', confidence=0.776891, starting_index=2),
      IngredientText(text='beef stock', confidence=0.94334, starting_index=4)
    ]

    This is enabled by default, but can be disabled by setting separate_names=False in parse_ingredient. If disabled, the ParsedIngredient.name field will be a list containing a single IngredientText object.

  • Set PREPARED_INGREDIENT flag on amounts in cases like

    ... to yield 2 cups ...

  • Add convert_to(...) function to IngredientAmount and CompositeIngredientAmount dataclasses to convert the amount to the given units. Conversion between mass and volume is also supported using a default density (density of water) that can be changed.

    >>> p = parse_ingredient("1 8 ounce can chopped tomatoes")
    >>> # Convert "8 ounce" to grams
    >>> p.amount[1].convert_to("g")
    IngredientAmount(quantity=Fraction(5669904625000001, 25000000000000),
                     quantity_max=Fraction(5669904625000001, 25000000000000),
                     unit=<Unit('gram')>,
                     text='226.80 gram',
                     confidence=0.999051,
                     starting_index=1,
                     APPROXIMATE=False,
                     SINGULAR=True,
                     RANGE=False,
                     MULTIPLIER=False,
                     PREPARED_INGREDIENT=False)
    
    >>> # Cannot convert where the quantity or unit is a string
    >>> p.amount[0].convert_to("g")
    TypeError: Cannot convert where quantity or unit is a string.
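The mass-to-volume conversion mentioned above is, at its core, a multiplication by a density. The following is a minimal standalone sketch of that arithmetic only; volume_to_mass and WATER_DENSITY_G_PER_ML are hypothetical names introduced for illustration and are not part of the library, whose own implementation may differ.

```python
from fractions import Fraction

# Assumed default for this sketch: water at 1 g/ml.
WATER_DENSITY_G_PER_ML = Fraction(1)


def volume_to_mass(volume_ml, density_g_per_ml=WATER_DENSITY_G_PER_ML):
    """Hypothetical helper: mass in grams = volume in ml x density in g/ml."""
    return Fraction(volume_ml) * Fraction(density_g_per_ml)


water_grams = volume_to_mass(250)                    # 250 ml at the default density
other_grams = volume_to_mass(250, Fraction(92, 100)) # a liquid at 0.92 g/ml
print(water_grams, other_grams)
```

Using Fraction for the density keeps the result exact, in line with the Fraction quantities returned by this release.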

Model

  • Include custom word embeddings as features used by the model. This adds the floret library as a new dependency.

1.3.2

06 Dec 07:29

Processing

  • Fix bug that allowed fractions in the intermediate form (i.e. #1$2) to appear in the name, prep, comment, size, purpose fields of the ParsedIngredient output.
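For anyone who wants to check stored output from earlier versions for this problem, a small sketch follows. The regular expression is inferred from the single example form above (#1$2) and may not cover every intermediate form the library uses; has_intermediate_fraction is a hypothetical helper name.

```python
import re

# Pattern inferred from the "#1$2" example: '#', digits, '$', digits.
INTERMEDIATE_FRACTION = re.compile(r"#\d+\$\d+")


def has_intermediate_fraction(text):
    """Return True if text contains a token shaped like '#1$2'."""
    return INTERMEDIATE_FRACTION.search(text) is not None


print(has_intermediate_fraction("#1$2 cup sugar"))
print(has_intermediate_fraction("1/2 cup sugar"))
```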

1.3.1

29 Nov 14:51

Warning

This version requires pint >=0.24.4

General

  • Support Python 3.13. Requires pint >= 0.24.4.

1.3.0

06 Nov 19:42

Processing

  • Various minor improvements to feature generation.

  • Add PREPARED_INGREDIENT flag to IngredientAmount objects. This is used to indicate if the amount refers to the prepared ingredient (PREPARED_INGREDIENT=True) or the unprepared ingredient (PREPARED_INGREDIENT=False).

  • Add starting_index attribute to IngredientText objects, indicating the index of the token that starts the IngredientText.

  • Improve detection of composite amounts in sentences.

  • Add quantity_fractions keyword argument to parse_ingredient. When True, the quantity and quantity_max fields of IngredientAmount objects will be fractions.Fraction objects instead of floats. This allows fractions such as 1/3 to be represented exactly. The default is quantity_fractions=False, which returns quantities as floats, as in previous versions. For example:

    >>> parse_ingredient("1 1/3 cups flour").amount[0]
    IngredientAmount(
        quantity=1.333,
        quantity_max=1.333,
        unit=<Unit('cup')>, 
        text='1 1/3 cups', 
        ...
    )
    >>> parse_ingredient("1 1/3 cups flour", quantity_fractions=True).amount[0]
    IngredientAmount(
        quantity=Fraction(4, 3),
        quantity_max=Fraction(4, 3),
        unit=<Unit('cup')>,
        text='1 1/3 cups',
        ...
    )
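To see why the exact representation matters, the following standalone sketch uses only the standard fractions module and the two values from the output above. The scaling factor of 3 is an arbitrary choice for illustration.

```python
from fractions import Fraction

as_fraction = Fraction(4, 3)  # exact value of "1 1/3"
as_float = 1.333              # the rounded float shown in the first example

# Tripling the recipe: the Fraction gives exactly 4 cups, the rounded float does not.
fraction_is_exact = as_fraction * 3 == 4
float_is_exact = as_float * 3 == 4
print(fraction_is_exact, float_is_exact)
```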

Model

  • Addition of new dataset: tastecooking. This is a relatively small dataset, but includes a number of unique abbreviations for units and sizes.

1.2.0

29 Sep 12:20

General

  • New optional keyword argument to extract foundation foods from the ingredient name. Foundation foods are the fundamental item of food, excluding any qualifiers or descriptive adjectives, e.g. for the name organic cucumber, the foundation food is cucumber.

    See https://ingredient-parser.readthedocs.io/en/latest/guide/foundation.html for additional details.

  • Some minor post processing fixes.