tritonserver: unrecognized option '--t_exit_on_error=True' #88

@IlyaMescheryakov1402

Description

I tried to build the clearml-serving images on my own but ran into an issue with Triton. I suspect this happens because TritonHelper has an argument named t_exit_on_error and passes it to the tritonserver command as-is, while tritonserver expects --exit-on-error (see the Model Repository section of the logs below) instead of --t_exit_on_error.
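I would expect the conversion to look roughly like this (an illustrative sketch, not the actual TritonHelper code): strip the t_ prefix from the helper's keyword arguments and turn underscores into dashes, so t_exit_on_error=True becomes --exit-on-error=True rather than --t_exit_on_error=True:

    # Illustrative sketch only; the real clearml-serving helper may differ.
    def triton_flags(**kwargs):
        flags = []
        for name, value in kwargs.items():
            # Drop the helper's "t_" prefix before building the flag name
            if name.startswith("t_"):
                name = name[2:]
            # tritonserver flags use dashes, not underscores
            flags.append("--{}={}".format(name.replace("_", "-"), value))
        return flags

    print(triton_flags(t_exit_on_error=True, t_log_verbose=1))
    # ['--exit-on-error=True', '--log-verbose=1']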

Actual behavior

When the clearml-serving-triton container starts, it fails with the following error:

clearml-serving - Nvidia Triton Engine Controller
tritonserver: unrecognized option '--t_exit_on_error=True'
Usage: tritonserver [options]
  --help
	Print usage

Server:
  --id <string>
	Identifier for this server.
  --exit-timeout-secs <integer>
	Timeout (in seconds) when exiting to wait for in-flight
	inferences to finish. After the timeout expires the server exits even if
	inferences are still in flight.

Logging:
  --log-verbose <integer>
	Set verbose logging level. Zero (0) disables verbose logging
	and values >= 1 enable verbose logging.
  --log-info <boolean>
	Enable/disable info-level logging.
  --log-warning <boolean>
	Enable/disable warning-level logging.
  --log-error <boolean>
	Enable/disable error-level logging.
  --log-format <string>
	Set the logging format. Options are "default" and "ISO8601".
	The default is "default". For "default", the log severity (L) and
	timestamp will be logged as "LMMDD hh:mm:ss.ssssss". For "ISO8601",
	the log format will be "YYYY-MM-DDThh:mm:ssZ L".
  --log-file <string>
	Set the name of the log output file. If specified, log
	outputs will be saved to this file. If not specified, log outputs will
	stream to the console.

Model Repository:
  --model-store <string>
	Equivalent to --model-repository.
  --model-repository <string>
	Path to model repository directory. It may be specified
	multiple times to add multiple model repositories. Note that if a model
	is not unique across all model repositories at any time, the model
	will not be available.
  --exit-on-error <boolean>
	Exit the inference server if an error occurs during
	initialization.
  --disable-auto-complete-config
	If set, disables the triton and backends from auto
	completing model configuration files. Model configuration files must be
	provided and all required configuration settings must be specified.
  --strict-readiness <boolean>
	If true /v2/health/ready endpoint indicates ready if the
	server is responsive and all models are available. If false
	/v2/health/ready endpoint indicates ready if server is responsive even if
	some/all models are unavailable.
  --model-control-mode <string>
	Specify the mode for model management. Options are "none",
	"poll" and "explicit". The default is "none". For "none", the server
	will load all models in the model repository(s) at startup and will
	not make any changes to the load models after that. For "poll", the
	server will poll the model repository(s) to detect changes and will
	load/unload models based on those changes. The poll rate is
	controlled by 'repository-poll-secs'. For "explicit", model load and unload
	is initiated by using the model control APIs, and only models
	specified with --load-model will be loaded at startup.
  --repository-poll-secs <integer>
	Interval in seconds between each poll of the model
	repository to check for changes. Valid only when --model-control-mode=poll is
	specified.
  --load-model <string>
	Name of the model to be loaded on server startup. It may be
	specified multiple times to add multiple models. To load ALL models
	at startup, specify '*' as the model name with --load-model=* as the
	ONLY --load-model argument, this does not imply any pattern
	matching. Specifying --load-model=* in conjunction with another
	--load-model argument will result in error. Note that this option will only
	take effect if --model-control-mode=explicit is true.
  --model-config-name <string>
	The custom configuration name for models to load.The name
	should not contain any space character.For example:
	--model-config-name=h100. If --model-config-name is not set, Triton will use the
	default config.pbtxt.
  --model-load-thread-count <integer>
	The number of threads used to concurrently load models in
	model repositories. Default is 4.
  --model-load-retry-count <integer>
	The number of retry to load a model in model repositories.
	Default is 0.
  --model-namespacing <boolean>
	Whether model namespacing is enable or not. If true, models
	with the same name can be served if they are in different namespace.
  --enable-peer-access <boolean>
	Whether the server tries to enable peer access or not. Even
	when this options is set to true,  peer access could still be not
	enabled because the underlying system doesn't support it. The server
	will log a warning in this case. Default is true.

...

Expected behavior

Triton Inference Server starts and runs without any errors.
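For reference, once the flag name is translated correctly, the generated command should look something like this (the model repository path is illustrative):

    tritonserver --model-repository=/models --exit-on-error=True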
