Train spaCy model inside pure Python script #11673
Hello. I am aware that training a spaCy model (say, NER) normally requires running some commands from the CLI. However, I need to train a spaCy model inside a Vertex AI Pipeline Component (which can be considered simply a "pure Python script"), so training from the CLI is not an option for my use case. My current attempt looks like this:

# train.py
# IMPORTANT: Assume all the necessary files are already available in the same directory
import spacy
import subprocess
subprocess.run(["python", "-m", "spacy", "init", "fill-config", "base_config.cfg", "config.cfg"])
subprocess.run(["python", "-m", "spacy", "train", "config.cfg",
                "--output", "my_model",
                "--paths.train", "train.spacy",
                "--paths.dev", "dev.spacy"])

This lets me carry on with the training, although it is not always stable. I don't know whether this is the best implementation, or whether there is something better or more recommended. Any ideas?
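A note on the subprocess approach: two common sources of flakiness are a bare "python" resolving to a different interpreter than the one running the script, and failures going unnoticed because the exit code is never checked. The sketch below is one possible way to guard against both, assuming those are the causes here (the thread does not say). The helper names build_spacy_command and run_spacy are hypothetical, not part of spaCy.

```python
import subprocess
import sys


def build_spacy_command(*args):
    """Build a `python -m spacy ...` command for the current interpreter."""
    # sys.executable is the interpreter running this script, so the child
    # process sees the same environment (and the same spaCy install).
    return [sys.executable, "-m", "spacy", *args]


def run_spacy(*args):
    """Run a spaCy CLI command; raise CalledProcessError on a non-zero exit."""
    # check=True turns a failed command into an exception instead of
    # letting the script continue silently.
    return subprocess.run(build_spacy_command(*args), check=True)
```

With these helpers, the two calls from the question would become run_spacy("init", "fill-config", "base_config.cfg", "config.cfg") followed by run_spacy("train", "config.cfg", "--output", "my_model", "--paths.train", "train.spacy", "--paths.dev", "dev.spacy").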
Replies: 2 comments 1 reply
Hi @dave-espinosa, the best way is to call a
After checking @ljvmiranda921's answer, as well as the spaCy code, I have implemented the following solution:

from pathlib import Path
from spacy.cli.download import download
from spacy.cli.init_config import fill_config
from spacy.cli.train import train

# Download the en_core_web_lg pipeline
download('en_core_web_lg')
# Fill base_config.cfg with defaults and write the result to config.cfg
fill_config(Path("config.cfg"), Path("base_config.cfg"))
# Train from config.cfg, save to my_model, and override the data paths
train(Path("config.cfg"), Path("my_model"), overrides={"paths.train": "train.spacy", "paths.dev": "dev.spacy"})

With it, I have managed to successfully train a spaCy NER model from a Python script (i.e., via python train.py). Thanks for your help.
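For anyone reusing this inside a pipeline component, the same calls can be wrapped in a function so the paths become parameters. This is only a sketch built on the calls shown above: build_overrides and train_from_script are hypothetical names, and the fill_config and train argument order is taken from this post, so it is worth checking against the installed spaCy version.

```python
from pathlib import Path


def build_overrides(train_path, dev_path):
    """Map data paths to the dotted config keys used in the post above."""
    return {"paths.train": str(train_path), "paths.dev": str(dev_path)}


def train_from_script(base_config, output_dir, train_path, dev_path,
                      config_path="config.cfg"):
    """Fill the base config and train in-process, without shelling out."""
    # Imported lazily so this module can be loaded where spaCy is absent.
    from spacy.cli.init_config import fill_config
    from spacy.cli.train import train

    config_path = Path(config_path)
    # Same argument order as in the post: output file first, base config second.
    fill_config(config_path, Path(base_config))
    train(config_path, Path(output_dir),
          overrides=build_overrides(train_path, dev_path))
```

Calling train_from_script("base_config.cfg", "my_model", "train.spacy", "dev.spacy") would then mirror the script above, minus the download step.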