0.1.0-beta9
Pre-release
General
Model
- Add additional model performance metrics.
- Add model hyper-parameter tuning functionality with `python train.py gridsearch` to iterate over specified training algorithms and hyper-parameters.
- Add `--detailed` argument to output detailed information about model performance on test data. (#9, @boxydog)
- Change model labels to label all punctuation as PUNC. This resolves some of the ambiguity in token labeling.
- Introduce SIZE label for tokens that modify the size of the ingredient. Note that this only applies to size modifiers of the ingredient. Size modifiers of the unit will remain part of the unit, e.g. large clove.
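The hyper-parameter tuning entry above describes iterating over specified training algorithms and hyper-parameters. As a rough illustration of that idea only, the sketch below expands a search space into one configuration per combination. The `SEARCH_SPACE` layout, the `expand_grid` helper, and the parameter values are assumptions made for this example; they are not the format `train.py gridsearch` actually reads.

```python
from itertools import product

# Hypothetical search space: algorithm name -> {hyper-parameter: candidate values}.
# Names and values are illustrative, not the project's actual configuration.
SEARCH_SPACE = {
    "lbfgs": {"c1": [0.1, 0.5], "c2": [0.5, 1.0]},
    "ap": {"max_iterations": [50, 100]},
}


def expand_grid(search_space):
    """Return one {'algorithm': ..., 'params': {...}} dict per combination."""
    configs = []
    for algorithm, grid in search_space.items():
        names = list(grid)
        # product() yields every combination of the candidate values.
        for values in product(*(grid[name] for name in names)):
            configs.append(
                {"algorithm": algorithm, "params": dict(zip(names, values))}
            )
    return configs


configs = expand_grid(SEARCH_SPACE)
for config in configs:
    print(config)
```

Each resulting configuration would then be trained and scored separately, so the number of training runs grows with the product of the candidate-value counts for each algorithm.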
Processing
- Integration of `pint` library for units
  - By default, units in the `IngredientAmount` object will be returned as `pint.Unit` objects (where possible). This enables easy conversion of amounts between different units. This can be disabled by setting `string_units=True` in `parse_ingredient` function calls.
  - For units that have US customary and Imperial versions with the same name (e.g. cup), setting `imperial_units=True` in `parse_ingredient` function calls will return the Imperial version. The default is US customary.
  - This only applies to units in `pint`'s unit registry (basically all common, standardised units). If the unit can't be found, the string is returned as before.
- Additions to `IngredientAmount` object:
  - New `quantity_max` field for handling the upper limit of ranges. If the quantity is not a range, this defaults to the same value as the `quantity` field.
  - Flags for RANGE and MULTIPLIER
    - RANGE is set to True for quantity ranges, e.g. `1-2`
    - MULTIPLIER is set to True for quantities like `1x`
  - Conversion of the `quantity` field to `float` where possible
- PreProcessor improvements
  - Be less aggressive about replacing written numbers (e.g. one) with the digit version. For example, in sentences like `1 tsp Chinese five-spice`, `five-spice` is now kept as written instead of being replaced by two tokens: `5 spice`.
  - Improve handling of ranges that duplicate the units, e.g. `1 pound to 2 pound` is now returned as `1-2 pound`
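The amount-related changes above (`quantity_max`, the RANGE and MULTIPLIER flags, `float` conversion, and the collapsing of duplicated units in ranges) can be pictured with a small standalone sketch. It does not use the library: the `Amount` dataclass, `collapse_duplicate_units`, and `make_amount` are hypothetical names that mimic the described behaviour under simplifying assumptions (plain decimal numbers, a single-word unit).

```python
import re
from dataclasses import dataclass
from typing import Union


@dataclass
class Amount:
    """Hypothetical stand-in for the fields described above, not the library's class."""
    quantity: Union[float, str]
    quantity_max: Union[float, str]
    unit: str
    RANGE: bool = False
    MULTIPLIER: bool = False


# Matches "<number> <unit> to <number> <same unit>".
DUPLICATE_UNIT_RANGE = re.compile(
    r"(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s+to\s+(\d+(?:\.\d+)?)\s*\2\b"
)


def collapse_duplicate_units(text):
    """Rewrite ranges that repeat the unit, keeping a single unit."""
    return DUPLICATE_UNIT_RANGE.sub(r"\1-\3 \2", text)


def to_float(text):
    """Convert to float where possible, otherwise keep the string."""
    try:
        return float(text)
    except ValueError:
        return text


def make_amount(quantity_text, unit):
    """Build an Amount from quantity text such as '3', '1-2' or '1x'."""
    multiplier = quantity_text.endswith("x")
    if multiplier:
        quantity_text = quantity_text[:-1]
    low, sep, high = quantity_text.partition("-")
    is_range = bool(sep)
    if not is_range:
        high = low  # quantity_max defaults to the same value as quantity
    return Amount(to_float(low), to_float(high), unit,
                  RANGE=is_range, MULTIPLIER=multiplier)


text = collapse_duplicate_units("1 pound to 2 pound")  # "1-2 pound"
quantity_text, unit = text.split(" ", 1)
print(make_amount(quantity_text, unit))
print(make_amount("1x", "can"))
```

Keeping `quantity_max` populated even for non-range quantities means calling code can always read both fields without first checking the RANGE flag.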