Commit a3b20e4

Update the Makefile and add a warning to the readme
1 parent f620487 commit a3b20e4

2 files changed: 12 additions and 1 deletion

Makefile

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ test: activate

 install: venv activate base
 	./.venv/bin/python -m pip install git+https://github.com/rafelafrance/spell-well.git@main#egg=spell-well
-	./.venv/bin/python -m pip install git+https://github.com/rafelafrance/traiter.git@v2.2.3#egg=traiter
+	./.venv/bin/python -m pip install git+https://github.com/rafelafrance/traiter.git@master#egg=traiter
 	./.venv/bin/python -m pip install .
 	./.venv/bin/python -m spacy download en_core_web_md


README.md

Lines changed: 11 additions & 0 deletions
@@ -1,5 +1,16 @@
 # FloraTraiter ![Python application](https://github.com/rafelafrance/FloraTraiter/workflows/CI/badge.svg)[![DOI](https://zenodo.org/badge/649758239.svg)](https://zenodo.org/badge/latestdoi/649758239)

+## Note to people wanting to use these scripts.
+
+These modules were written before the Large Language Model (LLM) revolution occurred. The most recent LLMs, even the smaller ones like Gemma3 etc., get you about (guessing) 90% of the way to what this set of scripts does, they often do it better, and they most definitely do it much with less work. LLMs are great at pattern recognition and that's all that these modules do. So, if you want to start a trait/information extraction project of your own I'd recommend that you consider a LLM-based approach first.
+
+Rule-based parsing still has its uses for the next couple of years, albeit in a limited fashion, and with a lot less code than what I've generated here. I still use rules for:
+
+1. Generating some test data. The code in this repository is way overkill for that.
+2. Pre-processing text to get it into a format that gives LLMs an easier time of processing text. I do this less and less with each generation of LLMs.
+3. Post-processing LLM results. Sometimes a LLM will give you results that are correct but not quite in a useful format. I'll sometimes use rule-based parsers to tweak LLM output. Nothing in this repository does this.
+
+## Back to the regularly scheduled repository

 Extract traits about plants from authoritative literature.

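The third use named in the README note, post-processing LLM results with rules, might look roughly like the sketch below. It is not taken from this repository (the note says nothing here does this). The function name `normalize_length`, the regular expression, and the output keys are all hypothetical choices for illustration. It assumes the LLM returned a correct size as free text and that only the format needs tidying.

```python
import re

# Hypothetical rule: a number or a numeric range followed by a length unit.
# "mm" is listed before "m" so the longer unit is tried first.
LENGTH = re.compile(
    r"(?P<low>\d+(?:\.\d+)?)\s*"
    r"(?:(?:-|–|to)\s*(?P<high>\d+(?:\.\d+)?))?\s*"
    r"(?P<units>mm|cm|m)\b",
    re.IGNORECASE,
)

# Conversion factors to centimeters (an arbitrary target unit for this sketch).
TO_CM = {"mm": 0.1, "cm": 1.0, "m": 100.0}


def normalize_length(text):
    """Turn free-text like 'about 3-5 cm long' into a small dict, or None."""
    match = LENGTH.search(text)
    if match is None:
        return None
    factor = TO_CM[match["units"].lower()]
    low = round(float(match["low"]) * factor, 4)
    # A single value is treated as a range whose ends are equal.
    high = round(float(match["high"]) * factor, 4) if match["high"] else low
    return {"low_cm": low, "high_cm": high}


if __name__ == "__main__":
    for answer in ["Leaves about 3-5 cm long", "12 mm", "size not stated"]:
        print(answer, "->", normalize_length(answer))
```

A rule like this stays small because the LLM has already done the extraction. The rule only reshapes an answer that is correct but inconsistently formatted.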
