Performance benchmark & optimization for the external tokenizer service #155

@delavet

In #137, an external tokenizer service based on UDS (Unix Domain Socket) was proposed.
This service aims to improve compatibility with vLLM tokenization by using the Python transformers library. Because it runs outside the main process, however, it may reduce tokenization performance. We therefore need a benchmark that lets us make an informed trade-off between external and internal tokenization.
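As a rough illustration of what such a benchmark could measure, the sketch below times the same tokenizer called in-process and through a Unix domain socket. Everything in it is an assumption made for illustration: `toy_tokenize` is a stand-in for the real tokenizer, the newline-delimited JSON framing is not the protocol from #137, and the server runs as a thread in the same process, whereas a real external service would be a separate process.

```python
import json
import os
import socket
import statistics
import tempfile
import threading
import time


def toy_tokenize(text):
    # Hypothetical stand-in for a real tokenizer: one deterministic ID per word.
    return [sum(word.encode()) % 50000 for word in text.split()]


def serve(sock_path, ready):
    # Minimal UDS server: one JSON request per line, one JSON reply per line.
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(sock_path)
    srv.listen(1)
    ready.set()
    conn, _ = srv.accept()
    with conn, conn.makefile("rb") as rfile, conn.makefile("wb") as wfile:
        for line in rfile:
            ids = toy_tokenize(json.loads(line)["text"])
            wfile.write(json.dumps({"ids": ids}).encode() + b"\n")
            wfile.flush()
    srv.close()


class UdsClient:
    def __init__(self, sock_path):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(sock_path)
        self.rfile = self.sock.makefile("rb")
        self.wfile = self.sock.makefile("wb")

    def tokenize(self, text):
        self.wfile.write(json.dumps({"text": text}).encode() + b"\n")
        self.wfile.flush()
        return json.loads(self.rfile.readline())["ids"]

    def close(self):
        self.rfile.close()
        self.wfile.close()
        self.sock.close()


def benchmark(fn, text, iterations=200):
    # Per-call latency in microseconds: median and an approximate p99.
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn(text)
        samples.append(time.perf_counter() - start)
    samples.sort()
    return {
        "p50_us": statistics.median(samples) * 1e6,
        "p99_us": samples[int(0.99 * (len(samples) - 1))] * 1e6,
    }


def run_comparison(text="the quick brown fox " * 64):
    sock_path = os.path.join(tempfile.mkdtemp(), "tok.sock")
    ready = threading.Event()
    server = threading.Thread(target=serve, args=(sock_path, ready), daemon=True)
    server.start()
    ready.wait()
    client = UdsClient(sock_path)
    try:
        # Both paths must agree before their timings are comparable.
        assert client.tokenize(text) == toy_tokenize(text)
        return {
            "internal": benchmark(toy_tokenize, text),
            "external_uds": benchmark(client.tokenize, text),
        }
    finally:
        client.close()
        server.join(timeout=1)


if __name__ == "__main__":
    for name, stats in run_comparison().items():
        print(name, stats)
```

Reporting percentiles rather than a single mean keeps tail latency visible, which is often what matters for a service on the request path. A real benchmark would also vary the prompt length and the number of concurrent clients.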
Additionally, the service itself can still be optimized, for example by using gRPC instead of HTTP+JSON. The same benchmark can then be used to measure the improvement each optimization brings.
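One component of the difference between a JSON transport and a binary one is the cost of encoding token IDs. The sketch below compares a JSON payload with a simple fixed-width binary layout. It is only an illustration under stated assumptions: the binary layout is made up for this example and is not the protobuf wire format used by gRPC, and the function names are hypothetical.

```python
import json
import struct
import timeit


def encode_json(ids):
    return json.dumps({"ids": ids}).encode("utf-8")


def decode_json(payload):
    return json.loads(payload)["ids"]


def encode_binary(ids):
    # Illustrative layout: 4-byte little-endian count, then unsigned 32-bit IDs.
    return struct.pack(f"<I{len(ids)}I", len(ids), *ids)


def decode_binary(payload):
    (count,) = struct.unpack_from("<I", payload, 0)
    return list(struct.unpack_from(f"<{count}I", payload, 4))


def compare(ids, repeats=200):
    # Payload size and mean encode+decode time for each format.
    codecs = (
        ("json", encode_json, decode_json),
        ("binary", encode_binary, decode_binary),
    )
    results = {}
    for name, enc, dec in codecs:
        payload = enc(ids)
        assert dec(payload) == ids  # round trip must be lossless
        seconds = timeit.timeit(lambda: dec(enc(ids)), number=repeats)
        results[name] = {
            "bytes": len(payload),
            "roundtrip_us": seconds / repeats * 1e6,
        }
    return results


if __name__ == "__main__":
    token_ids = list(range(100000, 100512))
    for name, stats in compare(token_ids).items():
        print(name, stats)
```

Serialization is only part of the picture: connection handling, HTTP framing, and the tokenizer itself also contribute, so an end-to-end benchmark is still needed to judge any single optimization.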
