Hyperparameter tuning of a spaCy NER model, with Vertex AI #11126

dave-espinosa · 2022-07-13T18:22:21Z

Hello everyone,

I want to perform hyperparameter tuning of a spaCy model using Vertex AI: Hyperparameter Tuning. (I have seen that there is a way to do this exact task using W&B, but I can't use it: my company already uses Google Cloud Platform for information-security reasons, and probably won't be very happy about me moving customer data to some other platform 😅.) I will take a deductive approach to this problem, so I have the following questions:

1. I know that the spaCy train command is normally run from the CLI; however, is there any way to run train as "Python-only code"? If so, where can I find a basic example? (I would NOT like to use subprocess for this because, in my experience, results can get a bit unstable, e.g. model metrics that cannot be retrieved.)
2. In the codelab mentioned in the first link, a basic TensorFlow model is created and trained, and its history is then used in a hypertune object (actually, only the latest val_accuracy is reported, i.e. hp_metric = history.history['val_accuracy'][-1]) to carry on with the hyperparameter tuning and the related optimization. I know that when one runs spaCy's train command, one gets a ConsoleLogger view of the metrics; however, is there a way to access the whole history of values obtained during training (say, in a dictionary or similar)? Are they compatible with hypertune?
3. TensorFlow models don't need a config.cfg file, but spaCy train does. When you run the train command you can override some settings in the config.cfg file, but you still DO need the file. How can I include the config.cfg file in a way that can be containerized AND is "compatible with changes during hyperparameter tuning"?

I think after getting a bit clearer with these preliminary questions, I’ll be able to come up with some code (and some derived, additional questions 😁...)

Thanks a lot, and sorry for gathering so many questions in a single post, but I felt they should be treated as one, since all of them aim to solve a bigger problem.

Answered by dave-espinosa

Jul 20, 2022

pmbaumgartner · 2022-07-13T19:08:43Z

Hey @dave-espinosa - thanks for the question. This is super interesting and I think will be useful for others looking to do the same thing.

1. There is a way to run a train command as code - see here for a basic example.
2. This is a little tricky, but I might have a solution. It looks like hypertune just needs metrics in the form of Dict[metric, List[metric_values]]. What I think you could do is write a custom logger that outputs your training values to a JSON file, reference this new logger in your model config, then in your code read that file back in and define your metric like they do in the Vertex example. Unfortunately I don't think there's currently a way to get this output directly from train, but this workaround should work.
3. I think the answer to (1) is also the answer to this. I would use a base config.cfg as a template, then pass your hyperparameters as a dict to the overrides argument of the train function.
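
A minimal sketch of the idea in points 1 and 3, assuming spaCy v3.2 or later (where `spacy.cli.train.train` can be imported) and a config.cfg that already contains the keys being overridden. The names `KEY_MAP`, `build_overrides` and `run_training` are hypothetical and only serve the illustration; they are not part of spaCy:

```python
from typing import Any, Dict

# Hypothetical mapping from tuning-parameter names to dotted config.cfg keys.
KEY_MAP = {"dropout": "training.dropout", "batch_size": "nlp.batch_size"}


def build_overrides(train_path: str, dev_path: str, **hyperparams: Any) -> Dict[str, Any]:
    """Build the dict passed as `overrides` (illustrative helper, not a spaCy API)."""
    overrides: Dict[str, Any] = {"paths.train": train_path, "paths.dev": dev_path}
    for name, value in hyperparams.items():
        overrides[KEY_MAP[name]] = value
    return overrides


def run_training(config_path: str, output_path: str, **hyperparams: Any) -> None:
    # Imported lazily so build_overrides can be used without spaCy installed.
    from spacy.cli.train import train

    train(
        config_path,
        output_path,
        overrides=build_overrides("./train.spacy", "./dev.spacy", **hyperparams),
    )


if __name__ == "__main__":
    print(build_overrides("./train.spacy", "./dev.spacy", dropout=0.2))
```

Keeping the name-to-key mapping in one place means a new hyperparameter only needs one extra entry, as long as the corresponding key exists in the base config.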

2 replies

dave-espinosa Jul 13, 2022
Author

Hello @pmbaumgartner, long time no see!

Thanks for your input. I will include some comments / newer questions on top of it.

There is a way to run a train command as code - see here for a basic example.

Thanks! I still have a couple of questions about this though; they will be developed further below 😉.

What I think you could do is write a custom logger [...]

I am trying to run the code you suggested, to get a bit more familiar with what I am getting. First, and as you suggested, I included the following modification on the config.cfg file:

[training.logger]
@loggers = "my_custom_logger.v1"
log_path = "my_file.tab"

Then I ran the following code from CLI:

!python -m spacy train config.cfg --output ./output --paths.train ./train.spacy --paths.dev ./dev.spacy --code ./functions.py

And instead of getting the logs via command output, I get the following message (at the time of writing this update, my_file.tab is still being generated; I'll update this case when I have the training finished):

============================= Training pipeline =============================
ℹ Pipeline: ['tok2vec', 'ner']
ℹ Initial learn rate: 0.001
Logging to my_file.tab

So far, so good. However, I ran into some problems when trying to use the train helper function to reproduce the previous results. More specifically, I tried the following code:

from spacy.cli.train import train

train(
    "./config.cfg",
    # "./output",  # <== Tested this, works OK; won't need it though
    overrides={
        "paths.train": "./train.spacy",
        "paths.dev": "./dev.spacy",
        "code": "./functions.py"  # Does not work, as `code` does not exist in `config.cfg`
    }
)

And this:

from spacy.cli.train import train

train(
    "./config.cfg",
    # "./output",  # <== Tested this, works OK; won't need it though
    code="./functions.py",  # Does not work, as `code` parameter does not exist in `train` helper function
    overrides={
        "paths.train": "./train.spacy",
        "paths.dev": "./dev.spacy",
    }
)

Looks like I tripped right at the start line 😅...

Any clues about this?

Thank you very much.

pmbaumgartner Jul 14, 2022

Ah yes, that is a little tricky. Typically from the CLI we use the --code argument to include functions in the registry by importing them. In this case, could you do something like from functions import <your_logger_function_name> at the top of this code? Alternatively, copy the code for that logger into this training script. If you go the import route, be sure you import the function name -- not the string that you're registering it with in the decorator.

Apologies for pointing you to source code, but you can see how we do this with the train function, with a utility import_code function, in the CLI here.
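
To illustrate why the import matters: a registration decorator only runs when the file that contains it is executed. Below is a standard-library sketch of loading a Python file by path so that its top-level code (including any decorators) runs. `load_code` is a hypothetical helper name, not spaCy's own utility:

```python
import importlib.util
from pathlib import Path
from types import ModuleType


def load_code(code_path: str, module_name: str = "user_code") -> ModuleType:
    """Execute a Python file so that its top-level decorators run."""
    path = Path(code_path)
    if not path.exists():
        raise FileNotFoundError(f"No such file: {path}")
    spec = importlib.util.spec_from_file_location(module_name, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load Python code from: {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

Under these assumptions, calling `load_code("./functions.py")` before calling train would have the same effect as a plain `import functions`, which is usually the simpler option when the file sits next to the training script.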

dave-espinosa · 2022-07-20T16:54:59Z

Hello @pmbaumgartner,

Sorry about the delay, got busy implementing your suggestions, as well as running some Q&A and troubleshooting on the Vertex AI side. Happily however, I think I have managed to implement a functional code for hyperparameter tuning of a spaCy NER model, using Vertex AI. To any future reader, this implementation can be considered as some sort of "version 0.1" and therefore, could be subject of improvement (i.e. "use it at your own risk").

Apologies for pointing you to source code, but you can see how we do this with the train function, with a utility import_code function, in the CLI here.

Spoiler alert: Did not do that, but ended up using subprocess Python library in the end 😁...

1. Disclaimers

About the code:
- The code originally developed is proprietary, and for that reason I am not leaving a link to any GitHub repo 😕; however, I will leave "enough breadcrumbs" so you can come up with projects of your own 🤞.
- The VM where I tested and ran this code does NOT have Prodigy installed; therefore some useful functions such as prodigy.components.preprocess.split_sentences cannot be accessed, so I implemented some reduced versions of my own.
- The code snippets here assume you have a GCP project and enough permissions to access (at least, but not limited to) Cloud Storage, Vertex AI, Artifact Registry and Cloud Logging. Needless to say, they also assume you are quite familiar with spaCy, the Linux CLI, and Vertex AI: Hyperparameter Tuning.
- Over-commenting is considered bad practice, so bear with me: I have "turned the amount of comments up a notch", since this makes it easier to slip in timely remarks and specific pieces of information.
- If you find any error or bug, or have an improvement to suggest, feel free to post it as a reply here. You can always reach me through the info available at my GitHub profile 🤓.

2. Project structure

The project structure I used, was:

spacy_hypertuner (*)
|
|-logger.py (**)
|-Dockerfile (*)
|-dockerignore (*)
|-config.cfg (**)
|-train.spacy (***)
|-dev.spacy (***)
|-history.jsonl (***)
|-trainer (**)
  |
  |-train_script.py (**)
  |-dataconverters (**)
    |-__init__.py (**)
    |-data_converters.py (**)

Asterisk hints:

(*): Present only in host directory, not in Docker image.
(**): Present in both host directory and in Docker Image.
(***): Automatically generated inside Docker image, but not originally available in host directory.

3. General approach

As seen in the "Vertex AI: Hyperparameter Tuning" Codelab, Vertex AI needs a Docker image that contains your training code. The hyperparameters are passed in as command-line arguments (easily handled with the argparse library); Vertex AI does this automatically for you. In the Codelab (4. Containerize your training application code), building the Docker image is relatively easy, as they use a dataset already available in tfds; in spaCy, however, you must access a .spacy file (or, more flexibly, as in my case, work your way from a .jsonl file obtained from a Prodigy ner.manual session). The upcoming sections provide some templates and related explanations; in general, the steps for the hyperparameter tuning that take place inside the Docker container are the following:

1. Get the hyperparameters to test (Vertex AI will do this automatically for you!)
   - Uses /trainer/train_script.py
2. Download and preprocess data (.jsonl files from Prodigy tagging stations) from a bucket in Google Cloud Storage
   - Uses /trainer/train_script.py and /trainer/dataconverters/data_converters.py
3. Train a spaCy NER model with the current hyperparameters, and retrieve the training 'history' (.jsonl)
   - Uses /trainer/train_script.py, logger.py, config.cfg and the automatically and internally generated files train.spacy, dev.spacy, history.jsonl.
4. Inform Vertex AI about the obtained results, and repeat until running out of hyperparameters or hitting an early stop
   - Uses /trainer/train_script.py
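
As a rough illustration of the first step: for each trial, the chosen hyperparameter values arrive on the container's command line as flags, which argparse then parses. The sketch below simulates one trial locally; the flag values are made up for the example, and `parse_trial_args` is a hypothetical name:

```python
import argparse
from typing import List, Optional


def parse_trial_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the hyperparameters of one trial from command-line flags."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch_size", required=True, type=int)
    parser.add_argument("--dropout", required=True, type=float)
    # With argv=None, argparse falls back to sys.argv[1:], as in a real trial.
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Simulates what one trial might receive; values are illustrative only.
    args = parse_trial_args(["--batch_size=64", "--dropout=0.2"])
    print(args.batch_size, args.dropout)
```

Accepting an explicit `argv` list makes the parser easy to try out locally before building the image.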

4. Files, quick explanations, hints and templates

As mentioned earlier, the exact code for my use case cannot be released publicly; however, I will try to leave functional code that you can start from and adapt / improve to match your needs.

4.1. `train_script.py`

This is the central script, which interacts with the hypertune library as well as with your own custom libraries (in the "Vertex AI: Hyperparameter Tuning" Codelab tutorial, this script is named task.py).

The suggested template script, goes as follows:

import argparse
import gcsfs
import hypertune
import json
import subprocess
from .dataconverters import data_converters  # You'll find out more about this, later


def get_args():
    '''Parses args. Must include all hyperparameters you want to tune.'''
    parser = argparse.ArgumentParser()
    
    parser.add_argument(
        '--batch_size',  # <== 'batch_size' is the 1st hyperparameter to tune
        required=True,
        type=int,
        help='batch size')
    
    parser.add_argument(
        '--dropout',  # <== 'dropout' is the 2nd hyperparameter to tune
        required=True,
        type=float,
        help='dropout')
    
    # Insert here any other hyperparameter to tune
    
    args = parser.parse_args()
    return args


def get_data():
    """
    Custom function to import several '.jsonl' files
    from Cloud Storage, merge them, and return them as
    a list of Python dictionaries.
    """
    data_to_open = [
        "sample-1",  # i.e. `sample-1.jsonl`, 10000 labeled texts
        "sample-2"   # i.e. `sample-2.jsonl`, 10000 labeled texts
    ]

    data = []
    gcs_file_system = gcsfs.GCSFileSystem()
    for dataset in data_to_open:
        gcs_json_path = "gs://your_bucket_name/your_path/{}.jsonl".format(dataset)
        with gcs_file_system.open(gcs_json_path) as f:
            sample_data = [json.loads(line) for line in f]
        data += sample_data
    
    return data


def preprocess_data(data: list):
    """
    Custom function to preprocess a list of dictionaries,
    (each with the format given by Prodigy 'ner.manual') 
    and output a list of spaCy Docs. Consider this as 
    'all the data available'. The splitting will be done
    afterwards, DO NOT INCLUDE IT HERE.
    """
    # Just take 'accepted samples'
    data = [obj for obj in data if (obj['answer'] == 'accept') and ('spans' in obj) and (len(obj['spans']) > 0)]
    # get rid of duplicate samples, based on the text.
    data = list({obj['text']:obj for obj in data}.values())
    # Convert into a list of spaCy Docs
    preproc_data = data_converters.prodigy2doc_list(data)

    # Insert here any other pre-processing stage...
    # modify `data_converters.py` to do so...
    # Do NOT forget you must output a list of spaCy Docs
    
    return preproc_data


def train_spacy_ner_model(
    preproc_data: list,
    train_split: int,
    dev_split: int,
    
    # Insert your own hyperparameters here
    batch_size: int,
    dropout: float,
    # ------------------------------------

) -> list:

    # Convert into '.spacy' files, accessible by this function
    data_converters.doclist2spacy(
        preproc_data,
        train_split,
        dev_split
    )

    # -----------------------------------------------

    # Training: spaCy runs the command from CLI
    # which will be run afterwards through the
    # Python library 'subprocess'. Notice how
    # this CLI command includes a '--code' parameter,
    # which points to the 'logger.py' script. This
    # script will be explained later on.

    cmd = """
           python -m spacy train config.cfg
           --paths.train ./train.spacy 
           --paths.dev ./dev.spacy 
           --code ./logger.py 
           --nlp.batch_size {0} 
           --training.dropout {1}""".format(
        batch_size,
        dropout,
        # Insert your own hyperparameters here &
        # do not forget to include the rest of 
        # your own hyperparameters, modifying
        # 'cmd' accordingly (!)
    )
    subprocess.run(cmd.split(), check=True)  # check=True raises if training fails

    # -----------------------------------------------

    # Reading jsonlines file back
    with open('./history.jsonl', 'r') as json_file:
        json_list = list(json_file)

    # Retrieving `history`
    history = [json.loads(json_str) for json_str in json_list]

    return history


def main():
    
    args = get_args()
    data = get_data()
    preproc_data = preprocess_data(data)
    history = train_spacy_ner_model(
        preproc_data,
        train_split=90,  # 90% train set split
        dev_split=10,    # 10% dev set split
        
        # Insert your own hyperparameters here
        batch_size=args.batch_size,
        dropout=args.dropout,
        # ------------------------------------
    )
    
    # Retrieving last training value
    last_training_value = history[-1]['f1_score']
    # last_training_step = history[-1]['step']  # Optional

    # # Just for show
    # print(last_training_value, type(last_training_value))
    
    hpt = hypertune.HyperTune()
    hpt.report_hyperparameter_tuning_metric(
        hyperparameter_metric_tag='f1_score',
        metric_value=last_training_value,
        # global_step=last_training_step  # Optional
    )

if __name__ == "__main__":
    main()
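
One possible refinement of the subprocess call in the template: build the command as a list of arguments instead of splitting a formatted string, so that paths containing spaces survive intact, and launch it with the interpreter that is already running (`sys.executable`). `build_train_command` and `run_training` are hypothetical helpers; the flags are the ones used in the template above:

```python
import subprocess
import sys
from typing import List


def build_train_command(batch_size: int, dropout: float, config: str = "config.cfg") -> List[str]:
    """Return the spaCy train command as an argument list."""
    return [
        sys.executable, "-m", "spacy", "train", config,
        "--paths.train", "./train.spacy",
        "--paths.dev", "./dev.spacy",
        "--code", "./logger.py",
        "--nlp.batch_size", str(batch_size),
        "--training.dropout", str(dropout),
    ]


def run_training(batch_size: int, dropout: float) -> None:
    # check=True raises CalledProcessError if spaCy exits with an error,
    # so a failed trial does not go on to read a stale history.jsonl.
    subprocess.run(build_train_command(batch_size, dropout), check=True)
```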

4.2. `data_converters.py`

This script has been built as a custom module in the custom library dataconverters. Include all your data pre-processing functions here.

In this case, just a bunch of demo functions will be included.

import json
import random
import re
import spacy
from spacy.tokens import DocBin, Span
from textacy.preprocessing import normalize


def prodigy2doc_single(text: str, spans: list):
    """
    Given a text with spans in Prodigy format return
    a spaCy Doc object.
    Parameters
    ----------
        text : `spans` are associated to this text.
        spans : In Prodigy format, i.e.:
            [
                    {
                         "start": int,
                         "end": int,
                         "token_start": int,
                         "token_end": int,
                         "label": str
                     },
                    ...
            ]
    Returns
    -------
        doc : spacy.tokens.doc.Doc
            A spaCy Doc object
    """
    nlp = spacy.blank("en")
    nlp.add_pipe("sentencizer")
    doc = nlp(text)
    if len(spans) > 0:
        doc.ents = [Span(doc, span["token_start"], span["token_end"] + 1, label=span["label"]) for span in spans]
    return doc


def prodigy2doc_list(prodigy_jsonl: list, verbose=False) -> list:
    """
    Given a list of texts with spans in Prodigy format return a
    list of spaCy Doc objects.
    
    Parameters
    ----------
        prodigy_jsonl : Annotated documents in Prodigy format.
            
            Example of an annotated document in Prodigy format:
            {
                "text": str,
                "_input_hash": int,
                "_task_hash": int,
                "_is_binary": boolean,
                "tokens": [
                    {
                        "text": str,
                         "start": int,
                         "end": int,
                         "id": int,
                         "ws": boolean
                     },
                    ...
                        ],
                "_view_id": str,
                "spans": [
                    {
                         "start": int,
                         "end": int,
                         "token_start": int,
                         "token_end": int,
                         "label": str
                     },
                    ...
                    ],
                "answer": str,
                "_timestamp": int
            }
        verbose : Provides messages in CLI if an error occurs.
        
    Returns
    -------
        docs : List of spaCy Doc objects.
    
    """
    docs = []
    count = 0
    for obj in prodigy_jsonl:
        if obj["answer"] == "accept" and "spans" in obj:
            try:
                doc = prodigy2doc_single(obj["text"], obj["spans"])
                docs.append(doc)
            except Exception:  # Skip samples whose spans cannot be converted
                if verbose:
                    print("Error. Sample of index {} will be ignored.".format(count))
        count += 1
    return docs


def doclist2spacy(
    doc_list: list,
    train_split: int = 0,
    dev_split: int = 0,
    test_split: int = 0,
    train_name: str = 'train',
    dev_name: str = 'dev',
    test_name: str = 'test'
):
    """
    Generates a train, validation and test sets, compatible
    with spaCy NLP framework. They will be generated in your
    current working directory.
    
    Parameter:
    ---------
    doc_list : spaCy Docs. Must contain NER entities.
    train_split : Percentage (0 to 100) of samples in 'doc_list', 
        which will become the training set
    dev_split : Percentage (0 to 100) of samples in 'doc_list', 
        which will become the dev (validation) set
    test_split : Percentage (0 to 100) of samples in 'doc_list', 
        which will become the test set
    train_name : Name of the train set, saved as '.spacy' type
    dev_name : Name of the dev set, saved as '.spacy' type
    test_name : Name of the test set, saved as '.spacy' type
    
    Return:
    ------
    None
    """
    # Operating with input list
    assert len(doc_list) > 1, "'doc_list' does not contain elements"
    random.shuffle(doc_list)
    total_n = len(doc_list)
    
    # Operating with `train_split` & `dev_split`
    assert 0<train_split<100, "`train_split` out of valid range"
    assert 0<dev_split<100, "`dev_split` out of valid range"
    assert train_split >= dev_split, "Recommended: `train_split` must be larger than `dev_split`"
    assert train_split > 50, "Recommended: `train_split` should be larger than 50[%]"
    train_n = int((total_n * train_split)/100)
    
    if test_split:  # Operating WITH `test_split`
        
        assert 0<test_split<100, "`test_split` out of valid range"
        assert train_split >= test_split, "Recommended: `train_split` must be larger than `test_split`"
        assert train_split+dev_split+test_split == 100, "`train_split+dev_split+test_split` must sum 100"
        dev_n = int((total_n * dev_split)/100)
        
        # Splitting datasets as a Python list
        train_set = doc_list[:train_n]
        dev_set = doc_list[train_n:train_n+dev_n]
        test_set = doc_list[train_n+dev_n:]
        
        # Obtaining DocBin from previous lists
        train_docbin = DocBin(docs=train_set)
        dev_docbin = DocBin(docs=dev_set)
        test_docbin = DocBin(docs=test_set)
        
        # Storing DocBins in local disk
        train_docbin.to_disk("./{}.spacy".format(train_name))
        dev_docbin.to_disk("./{}.spacy".format(dev_name))
        test_docbin.to_disk("./{}.spacy".format(test_name))
    
    else:  # Operating WITHOUT `test_split`
        assert train_split+dev_split == 100, "`train_split+dev_split` must sum 100"
        
        # Splitting datasets as a Python list
        train_set = doc_list[:train_n]
        dev_set = doc_list[train_n:]
        
        # Obtaining DocBin from previous lists
        train_docbin = DocBin(docs=train_set)
        dev_docbin = DocBin(docs=dev_set)
        
        # Storing DocBins in local disk
        train_docbin.to_disk("./{}.spacy".format(train_name))
        dev_docbin.to_disk("./{}.spacy".format(dev_name))
    
    return None
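
One thing to keep in mind when tuning: `random.shuffle` without a seed yields a different train/dev split in every trial, so the scores of different trials are measured on different dev sets. A minimal sketch of a seeded split follows; `split_items` and its `seed` parameter are hypothetical additions and not part of the module above:

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_items(items: Sequence[T], train_split: int, seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle a copy with a fixed seed and split it into train/dev lists."""
    if not 0 < train_split < 100:
        raise ValueError("`train_split` must be between 0 and 100 (exclusive)")
    shuffled = list(items)  # Copy, so the caller's sequence is left untouched
    random.Random(seed).shuffle(shuffled)
    train_n = (len(shuffled) * train_split) // 100
    return shuffled[:train_n], shuffled[train_n:]
```

With the same seed, every trial sees the same split, which makes the reported metric comparable across hyperparameter combinations.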

4.3. `logger.py`

This script will generate the file history.jsonl automatically, when you use the train_spacy_ner_model function, seen in the previous section.

import json
import spacy
import sys
from pathlib import Path
from spacy import Language
from typing import IO, Tuple, Callable, Dict, Any, Optional

@spacy.registry.loggers("spacy_history_logger.v1")
def custom_logger(log_path):
    def setup_logger(
        nlp: Language,
        stdout: IO=sys.stdout,
        stderr: IO=sys.stderr
    ) -> Tuple[Callable, Callable]:
        stdout.write(f"Logging to {log_path}\n")
        log_file = Path(log_path).open("w", encoding="utf8")
        
        def log_step(info: Optional[Dict[str, Any]]):
            if info:

                to_write = {
                    'epoch': info['epoch'],
                    'step': info['step'],
                    'score': info['score'],
                    'loss_ner': info['losses']['ner'],
                    'f1_score': info['other_scores']['ents_f']               
                }
                
                log_file.write(json.dumps(to_write))
                log_file.write("\n")
                
                # for pipe in nlp.pipe_names:
                #     log_file.write(f"{info['losses'][pipe]}\t")

        def finalize():
            log_file.close()

        return log_step, finalize

    return setup_logger

A sample of the generated file history.jsonl will end-up looking like this:

{"epoch": 0, "step": 0, "score": 0.0, "loss_ner": 51.999997556209564, "f1_score": 0.0}
{"epoch": 3, "step": 200, "score": 0.0, "loss_ner": 10947.428419470787, "f1_score": 0.0}
{"epoch": 7, "step": 400, "score": 0.0, "loss_ner": 13261.642906785011, "f1_score": 0.0}
{"epoch": 12, "step": 600, "score": 0.0, "loss_ner": 15275.071686983109, "f1_score": 0.0}
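
The template reports the last logged F1 score. As an alternative sketch, the best score over the run can be picked instead; whether that is preferable depends on which checkpoint you intend to keep. `parse_history` and `best_f1` are hypothetical helpers that operate on lines in the format shown above:

```python
import json
from typing import Dict, Iterable, List


def parse_history(lines: Iterable[str]) -> List[Dict]:
    """Parse JSON Lines content, ignoring blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def best_f1(history: List[Dict]) -> Dict:
    """Return the logged row with the highest 'f1_score' (first one on ties)."""
    return max(history, key=lambda row: row["f1_score"])
```

The `step` of the returned row could then be passed as `global_step` when reporting the metric.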

For reference, all the training data is available inside the info variable, a nested dictionary that in my case looked like this (LABEL_n has been edited from the original labels):

epoch <class 'int'>
step <class 'int'>
score <class 'float'>
other_scores <class 'dict'>
	 token_acc <class 'float'>
	 token_p <class 'float'>
	 token_r <class 'float'>
	 token_f <class 'float'>
	 ents_p <class 'float'>
	 ents_r <class 'float'>
	 ents_f <class 'float'>
	 ents_per_type <class 'dict'>
		 LABEL1 <class 'dict'>
			 p <class 'float'>
			 r <class 'float'>
			 f <class 'float'>
		 LABEL2 <class 'dict'>
			 p <class 'float'>
			 r <class 'float'>
			 f <class 'float'>
		 LABEL3 <class 'dict'>
			 p <class 'float'>
			 r <class 'float'>
			 f <class 'float'>
	 speed <class 'float'>
losses <class 'dict'>
	 tok2vec <class 'float'>
	 ner <class 'numpy.float64'>
checkpoints <class 'list'>
seconds <class 'int'>
words <class 'int'>
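
If you want to log more than a handful of fields, the nested `info` dictionary can be flattened into dotted keys before writing it out. A small sketch; `flatten_scalars` is a hypothetical helper, and it simply skips non-numeric values such as the `checkpoints` list:

```python
from numbers import Real
from typing import Any, Dict


def flatten_scalars(nested: Dict[str, Any], prefix: str = "") -> Dict[str, float]:
    """Flatten nested dicts into {'a.b.c': value}, keeping numeric leaves only."""
    flat: Dict[str, float] = {}
    for key, value in nested.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_scalars(value, prefix=f"{name}."))
        elif isinstance(value, Real) and not isinstance(value, bool):
            # Converting to float keeps the result JSON-serializable
            flat[name] = float(value)
    return flat
```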

4.4. `Dockerfile`

This image uses the CPU only. GPU code is being developed at the moment, but won't be available publicly (sorry!).

# Use the following base image if you run with CPU only
FROM python:3.9.13-slim-bullseye

WORKDIR /
	
# Use the following line if you run with CPU only
RUN pip install gcsfs spacy textacy cloudml-hypertune && \
    python -m spacy download en_core_web_lg

# Copies the trainer code to the docker image.
COPY logger.py config.cfg ./
COPY /trainer /trainer/

# Sets up the entry point to invoke the trainer.
ENTRYPOINT ["python", "-m", "trainer.train_script"]

4.5. `dockerignore`

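Making sure only the needed files get included in the image is important. Note that Docker only picks this file up if it is named .dockerignore (with a leading dot) and sits at the root of the build context.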

Dockerfile
dockerignore
*.spacy
*.jsonl
*.ipynb

4.6. `config.cfg`

I used the spaCy Quickstart to obtain base_config.cfg, with the following settings:

Language: English
Components: ner
Hardware: CPU
Optimize for: accuracy

Then I used init fill-config to obtain the config.cfg file. Finally, in that file, I modified the following section:

[training.logger]
@loggers = "spacy_history_logger.v1"
log_path = "history.jsonl"

And there you go. I hope this sheds some light for future readers interested in this topic.

Special thanks to the guys at 1Mentor Inc. for allowing me to share this code, freely.

Thank you!
