bench: add script to measure query length impact on response time #350
Signed-off-by: lillyfuge <[email protected]>
👥 vLLM Semantic Team Notification
The following members have been identified for the changed files in this PR and have been automatically assigned: 📁
✅ Deploy Preview for vllm-semantic-router ready!
bench/bench_query_length.sh
Outdated
}

# Create result log file
log_file="test_results_$(date +%Y%m%d_%H%M%S).log"
If the log is saved to the current directory, the file can be added to .gitignore.
To avoid saving the file locally, I directly output the results to the shell for easy viewing.
Sorry, I forgot to delete the original code.
Signed-off-by: lillyfuge <[email protected]>
bench/bench_query_length.sh
Outdated
echo "$content"
}
#!/bin/bash
This is a duplicate.
Signed-off-by: lillyfuge <[email protected]>
What type of PR is this?
add script to measure query length impact on response time
What this PR does / why we need it:
Adds a test script that sends HTTP requests with incrementally increasing content lengths to the Semantic Router and measures the response time of each request. This helps analyze how query length affects inference latency, and the results can be integrated with metrics collection systems for deeper performance analysis (#338).
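For readers who have not opened the diff, the approach described above can be sketched roughly as follows. This is not the script from the PR: the endpoint URL, the port, the `auto` model name, the JSON payload shape, and the function names `generate_content`, `measure_request`, and `run_sweep` are all assumptions made for illustration. Results are printed to the shell rather than written to a log file, in line with the review discussion.

```shell
#!/bin/bash
# Illustrative sketch only. The URL, port, model name, and payload shape
# below are assumptions, not values taken from the PR.
ROUTER_URL="${ROUTER_URL:-http://localhost:8080/v1/chat/completions}"

# Build a query of exactly $1 characters by repeating a filler letter.
generate_content() {
  local length="$1"
  printf '%*s' "$length" '' | tr ' ' 'a'
}

# Send one request whose query is $1 characters long and print
# "<length> <seconds>", using curl's built-in total-time measurement.
measure_request() {
  local length="$1" content elapsed
  content="$(generate_content "$length")"
  elapsed="$(curl -s -o /dev/null -w '%{time_total}' \
    -H 'Content-Type: application/json' \
    -d "{\"model\": \"auto\", \"messages\": [{\"role\": \"user\", \"content\": \"$content\"}]}" \
    "$ROUTER_URL")"
  echo "$length $elapsed"
}

# Measure each query length passed as an argument, in order.
run_sweep() {
  local length
  for length in "$@"; do
    measure_request "$length"
  done
}
```

With a router listening at `ROUTER_URL`, a call such as `run_sweep 100 200 400 800` would print one line per query length, which can be redirected to a file or piped into a metrics collector if needed.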