feat: STT model params #4106

shaohuzhang1 · 2025-09-25T07:46:05Z

feat: STT model params

f2c-ci-robot · 2025-09-25T07:46:09Z

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

f2c-ci-robot · 2025-09-25T07:46:14Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

shaohuzhang1 · 2025-09-25T07:46:39Z

apps/models_provider/impl/volcanic_engine_model_provider/model/stt.py

+                'codec': params.get('codec', self.codec)
            }
        }
        return req
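
The added line follows a per-request override pattern: a value in params wins, otherwise the instance attribute is used. A minimal sketch of that pattern, assuming a hypothetical stand-in class (SpeechToTextSketch and build_audio_section are illustrative names, not part of the PR):

```python
class SpeechToTextSketch:
    """Hypothetical stand-in; only the fallback pattern mirrors the diff."""

    def __init__(self, codec="raw", rate=16000):
        # Instance-level defaults, set once at construction time.
        self.codec = codec
        self.rate = rate

    def build_audio_section(self, params=None):
        # A key present in params overrides the instance default;
        # a missing key falls back to the attribute on self.
        params = params or {}
        return {
            "rate": params.get("rate", self.rate),
            "codec": params.get("codec", self.codec),
        }


stt = SpeechToTextSketch()
default_audio = stt.build_audio_section()
override_audio = stt.build_audio_section({"codec": "opus"})
```

Note that dict.get only falls back when the key is absent; a key explicitly set to None is passed through as None.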


In your VolcanicEngineSpeechToText class, there are several areas that could be improved or checked:

Parameter Default Values: In the constructor and request construction method, parameters like nbest, workflow, etc., have fallbacks using the instance variables (self). Ensure these defaults align with expected behavior.

Consistent Parameter Handling: In the request construction method, ensure consistent handling of optional parameters. Mixing between direct access to instance variables and parameter dictionary checks can lead to confusion and errors.

Code Organization: Consider organizing constants or settings at the beginning of the file for better readability and maintainability.

Error Handling: Add basic error handling around network requests (e.g., exceptions) to improve robustness.

Logging Configuration: If you want logging in this class, configure it correctly to capture relevant information without cluttering logs with irrelevant details.

Here’s an optimized version with some suggestions applied:

from typing import Dict


class VolcanicEngineSpeechToText(MaxKBBaseModel, BaseSpeechToText):
    volcanic_app_id: str
    volcanic_cluster: str
    volcanic_api_url: str
    volcanic_token: str
    params: dict

    DEFAULT_PARAMS = {
        'nbest': 3,
        'workflow': 1,
        'show_language': True,
        'show_utterances': False,
        'result_type': 'full',
        'sequence': 1
    }

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.volcanic_api_url = kwargs.get('volcanic_api_url')
        self.volcanic_token = kwargs.get('volcanic_token')
        self.volcanic_app_id = kwargs.get('volcanic_app_id')
        self.volcanic_cluster = kwargs.get('volcanic_cluster')
        # Defaults first, so that caller-supplied params override them
        self.params = {**self.DEFAULT_PARAMS, **(kwargs.get('params') or {})}

    @staticmethod
    def is_cache_model():
        return True

    @classmethod
    def new_instance(cls, model_type, model_name, model_credential: Dict[str, object], **model_kwargs):
        # Assumes the stored credential uses the same volcanic_* key names
        return cls(
            volcanic_api_url=model_credential.get('volcanic_api_url'),
            volcanic_token=model_credential.get('volcanic_token'),
            volcanic_app_id=model_credential.get('volcanic_app_id'),
            volcanic_cluster=model_credential.get('volcanic_cluster'),
            params=model_kwargs,
        )

    def construct_request(self, reqid):
        # Work on a copy so that building a request never mutates self.params
        params = dict(self.params)
        user = {'uid': params.pop('uid', None)}
        audio = {
            key: params.pop(key, None)
            for key in ('format', 'rate', 'language', 'bits', 'channel', 'codec')
        }
        req = {
            'app': {
                'appid': self.volcanic_app_id,
                'cluster': self.volcanic_cluster,
                'token': self.volcanic_token,
            },
            'user': user,
            # Whatever is left after the user and audio keys were popped
            'request': {'reqid': reqid, **params},
            'audio': audio,
        }
        return req

Key Improvements:

Moved default parameter values to DEFAULT_PARAMS.

Used dictionary unpacking with the defaults first, so that caller-supplied params override them rather than the other way round.

Used one naming scheme (volcanic_*) for attributes, constructor arguments, and credential keys, so that every value read in construct_request is actually set in the constructor.

Passed model_kwargs through to the constructor as params when creating instances.

Popped the user and audio keys from a copy of the params, so that they are not repeated in the request section and self.params is left untouched between requests.
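
The error-handling suggestion in the list above is not reflected in the rewritten class. A minimal sketch of what it could look like, assuming a hypothetical send callable that stands in for the real network call and a hypothetical SpeechToTextError type (neither name comes from the PR):

```python
import logging

logger = logging.getLogger(__name__)


class SpeechToTextError(Exception):
    """Hypothetical error type raised when the recognition request fails."""


def recognize(send, payload):
    """Call send(payload) and translate transport failures into one error type.

    send is a hypothetical callable standing in for the real network call.
    """
    try:
        return send(payload)
    except (ConnectionError, TimeoutError) as exc:
        # Log once here, then re-raise with the original error chained.
        logger.warning("speech-to-text request failed: %s", exc)
        raise SpeechToTextError("speech-to-text request failed") from exc
```

Catching only transport-level exceptions keeps programming errors visible instead of folding them into the same failure path.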

shaohuzhang1 · 2025-09-25T07:47:02Z

apps/models_provider/impl/xf_model_provider/credential/stt.py


    def get_model_params_setting_form(self, model_name):
-        pass
+        return XunFeiSTTModelParams()


The provided code has a few issues that you might want to address:

Field Defaults: The default_value argument appears to have been removed from a forms.TextInputField. Since forms comes from the project's own common package, check that module's TextInputField signature to confirm which keyword sets a default before dropping or renaming it.

Class Naming Convention: XunFeiSTTModelParams is already PascalCase; spelling out the abbreviation (XunFeiSpeechToTextModelParams) may read more consistently next to the other class names.

TooltipLabel Usage: If TooltipLabel is used in this file, make sure it is imported and that it is defined elsewhere in your package.

Exception Handling: is_valid catches every exception. It should log or propagate the error rather than call traceback.print_exc(), which writes to the console even where console output is not wanted.

Here’s how you could make these adjustments:

# Import statements
from django.utils.translation import gettext_lazy as _

from common import forms
from common.exception.app_exception import AppApiException
from common.forms import BaseForm
from models_provider.base_model_provider import BaseModelCredential, ValidCode


# Class renaming convention
# The validators arguments are kept from the original suggestion;
# confirm that common.forms exposes them before relying on this.
class XunFeiSpeechToTextModelParams(BaseForm):
    language = forms.TextInputField(
        label=_('language'),
        help_text=_('If not passed, the default value is zh_cn'),
        validators=[forms.validators.DataRequired()]
    )
    domain = forms.TextInputField(
        label=_('domain'),
        help_text=_('If not passed, the default value is iat'),
        validators=[forms.validators.DataRequired()]
    )
    accent = forms.TextInputField(
        label=_('accent'),
        help_text=_('If not passed, the default value is Mandarin'),
        validators=[forms.validators.DataRequired()]
    )


class XunFeiSpeechToTextModelCredential(BaseForm, BaseModelCredential):
    spark_api_url = forms.TextInputField(
        "API URL",
        validators=[forms.validators.DataRequired()],
        # The keyword for the default is an assumption; check TextInputField
        default="wss://iat-api.xfyun.cn/v2/iat"
    )
    spark_app_id = forms.TextInputField("APP ID", validators=[forms.validators.DataRequired()])

    def is_valid(self, model_type: str, model_name, model_credential: dict, provider, **kwargs):
        """Validate the credentials by building the model and checking auth."""
        if not isinstance(model_credential, dict):
            raise AppApiException(ValidCode.valid_error.value, "Invalid credential type")
        try:
            model = provider.get_model(model_type=model_type, model_name=model_name,
                                       model_credential=model_credential, **kwargs)
            model.check_auth()
        except Exception as e:
            # Propagate with the original error chained instead of printing it
            raise AppApiException(ValidCode.valid_error.value, str(e)) from e
        return True

    def encryption_dict(self, model: dict):
        """Encrypt the sensitive fields in the model dictionary."""
        return {**model, 'spark_api_secret': super().encryption(model.get('spark_api_secret', ''))}

    def get_model_params_setting_form(self, model_name):
        return XunFeiSpeechToTextModelParams()

Notes:

Added the imports that the snippet relies on (_, BaseForm) and dropped the unused traceback import.

Made is_valid, encryption_dict, and get_model_params_setting_form instance methods, so that self and super() resolve.

Raised AppApiException, the exception that is actually imported, and chained the original error.

Kept get_model_params_setting_form returning the params form directly, as in the diff, rather than looking a class up by name.

These changes aim to improve readability and maintainability; the field keyword arguments still need to be checked against the project's forms module.
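
The point above about traceback.print_exc() can be illustrated with the standard logging module. This is a minimal sketch, assuming a hypothetical check_auth callable and a hypothetical CredentialCheckError type; it is not the project's is_valid implementation:

```python
import logging

logger = logging.getLogger("stt.credential")


class CredentialCheckError(Exception):
    """Hypothetical error type for a failed credential check."""


def check_credential(check_auth, raise_exception=False):
    """Run check_auth() and report failure through logging, not stdout."""
    try:
        check_auth()
    except Exception as exc:
        # logger.exception records the message plus the full traceback,
        # routed through whatever handlers the application configured.
        logger.exception("credential validation failed")
        if raise_exception:
            raise CredentialCheckError(str(exc)) from exc
        return False
    return True
```

The traceback still reaches the logs, but where it ends up is decided by the logging configuration rather than by the credential class.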

shaohuzhang1 · 2025-09-25T07:47:11Z

apps/models_provider/impl/aliyun_bai_lian_model_provider/model/stt.py

+
        with tempfile.NamedTemporaryFile(delete=False) as temp_file:
            # Save the uploaded file to a temporary file
            temp_file.write(audio_file.read())


A note on the speech_to_text method. The line:

dashscope.api_key = self.api_key

should stay as written. The DashScope Python SDK reads its key from the lowercase module attribute dashscope.api_key, so changing it to dashscope.API_KEY would only create an attribute the SDK does not read. Because the attribute is module-level, the assignment is shared by everything in the process that uses dashscope, which is worth keeping in mind if several models with different keys run at the same time.

Additionally, the parameters dictionary is printed twice (print(recognition_params)). Unless these calls are needed for debugging, they can be removed without affecting functionality, which gives cleaner code.

Here's the relevant part with the print statements removed:

def speech_to_text(self, audio_file):
    recognition_params = {
        'model': self.model,
        'format': 'mp3',
        'sample_rate': 16000,
        'callback': None,
        **self.params
    }
    dashscope.api_key = self.api_key

    with tempfile.NamedTemporaryFile(delete=False) as temp_file:
        # Write through the handle that NamedTemporaryFile already opened
        temp_file.write(audio_file.read())

    # Debug output removed:
    # print("Recognition Parameters:", recognition_params)

This keeps the authorization handling unchanged while removing the debug output from the application code.
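
Because the temporary file in the hunk above is created with delete=False, it stays on disk until something removes it. A minimal sketch of one way to guarantee cleanup, assuming a hypothetical process callable that stands in for the recognition call (with_temp_audio is an illustrative name, not part of the PR):

```python
import io
import os
import tempfile


def with_temp_audio(audio_file, process):
    """Copy audio_file to a temporary path, run process(path), then clean up."""
    with tempfile.NamedTemporaryFile(delete=False) as temp_file:
        temp_file.write(audio_file.read())
        temp_path = temp_file.name
    # The handle is closed here, so the data is flushed and the path
    # can be reopened by other code on any platform.
    try:
        return process(temp_path)
    finally:
        # delete=False means nothing else will remove the file.
        os.remove(temp_path)


size = with_temp_audio(io.BytesIO(b"abc"), os.path.getsize)
```

Closing the handle before calling process matters mainly on Windows, where a file that is still open cannot always be opened a second time.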

feat: STT model params

a5ef985

f2c-ci-robot bot added the do-not-merge/release-note-label-needed label Sep 25, 2025

shaohuzhang1 commented Sep 25, 2025

View reviewed changes

zhanweizhang7 merged commit e8c36a6 into v2 Sep 25, 2025
4 of 6 checks passed

zhanweizhang7 deleted the pr@v2@feat_stt_model_params branch September 25, 2025 07:47

3 participants
