mcaimi/flan-t5-custom-kserve-engine

KServe Compatible Demo Model Server For Seq2Seq Models

This is a demo model server that runs inside OpenShift AI as a KServe model server. It exposes an inference endpoint backed by a sequence-to-sequence model (such as T5).

It is a companion project to the fine-tuning example found here.

Parameters

The server expects a JSON-encoded payload to start inference.

  • For v1 compatible requests:
 // example v1 payload:
 {
   "instances": [
     {
       "task": "anonymize",
       "source": "text string to be anonymized"
     }
   ]
 }
  • For v2 compatible requests:
{
  "inputs": [
    {
      "name": "anonymize",
      "shape": [1],
      "datatype": "BYTES",
      "data": ["text string to be anonymized"]
    }
  ]
}
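
The two payload shapes above can also be built programmatically. The following is a minimal Python sketch, not code from this repository: the helper names `build_v1_payload` and `build_v2_payload` are hypothetical, and it assumes that the v1 `task` field and the v2 input `name` carry the same task identifier, as the two examples suggest.

```python
import json


def build_v1_payload(text, task="anonymize"):
    """Wrap one input string in the v1 'instances' envelope."""
    return {"instances": [{"task": task, "source": text}]}


def build_v2_payload(text, task="anonymize"):
    """Wrap one input string in the v2 'inputs' envelope.

    Assumption: the input 'name' holds the task identifier, mirroring
    the 'task' field of the v1 payload.
    """
    return {
        "inputs": [
            {"name": task, "shape": [1], "datatype": "BYTES", "data": [text]}
        ]
    }


if __name__ == "__main__":
    sample = "text string to be anonymized"
    # json.dumps always emits strict JSON (no trailing commas, no comments).
    print(json.dumps(build_v1_payload(sample), indent=2))
    print(json.dumps(build_v2_payload(sample), indent=2))
```

Serializing with `json.dumps` rather than writing the payload by hand avoids syntax slips such as trailing commas, which strict JSON parsers reject.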

How to run

$ uv sync
$ MODEL_ID="path to the finetuned checkpoint" uv run model.py --model_name flant5-finetuned 
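
Once the server is running, a request can be sent from any HTTP client. The sketch below uses only the Python standard library and rests on several assumptions that this README does not confirm: that the server listens on `http://localhost:8080`, and that it serves the standard KServe routes `/v1/models/<name>:predict` and `/v2/models/<name>/infer`. The helpers `make_request` and `infer` are hypothetical names, and the response format is not documented here, so the reply is returned as parsed JSON without further interpretation.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: local server on port 8080
MODEL_NAME = "flant5-finetuned"     # matches --model_name in the command above


def make_request(payload, protocol="v1", base_url=BASE_URL, model_name=MODEL_NAME):
    """Build (but do not send) a POST request for the chosen protocol."""
    if protocol == "v1":
        path = f"/v1/models/{model_name}:predict"  # assumed standard v1 route
    elif protocol == "v2":
        path = f"/v2/models/{model_name}/infer"    # assumed standard v2 route
    else:
        raise ValueError(f"unknown protocol: {protocol!r}")
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def infer(payload, protocol="v1", timeout=60):
    """Send the payload and return the decoded JSON response."""
    request = make_request(payload, protocol)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `infer({"instances": [{"task": "anonymize", "source": "some text"}]}, "v1")` would post a v1 request to the running server. Adjust `BASE_URL` when the server is reached through an OpenShift route instead of a local port.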

About

This repository has been moved to https://codeberg.org/mcaimi/flan-t5-custom-kserve-engine. Please visit the new location for the latest updates.
