feat: Integrate with Tencent SentienceRecognition model#3851

Merged
zhanweizhang7 merged 1 commit into v2 from pr@v2@feat_integrate_with_tencent_sentience_recognition_model
Aug 14, 2025

Conversation

@shaohuzhang1
Contributor

feat: Integrate with Tencent SentienceRecognition model

@f2c-ci-robot

f2c-ci-robot bot commented Aug 14, 2025

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@f2c-ci-robot

f2c-ci-robot bot commented Aug 14, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

    SecretKey = forms.PasswordInputField('SecretKey', required=True)

    def get_model_params_setting_form(self, model_name):
        return TencentSSTModelParams()
Contributor Author

The provided code looks mostly clean and well-structured, but there are a few suggestions for optimization and improvement:

  1. Comments: While comments can help clarify complex parts of the code, they should be concise and relevant to the functionality being described. You might want to remove comments that do not add significant explanation.

  2. Redundant gettext Calls: The method _validate_model_type calls gettext multiple times unnecessarily inside its body. Consider creating a helper function to handle localized strings and call it once at the start instead.

  3. Exception Handling in Model Initialization: In the __init__ method (if this is present), you might consider wrapping exception handling around the initialization of parent classes to ensure robustness. This is optional based on the context.

  4. Unused Methods: Ensure that all methods are necessary and contribute to the overall functionality. If methods like _validate_model_type are never called after creation, they may be candidates for removal or refactoring into utility functions.

  5. Field Validation Logic: There's no explicit validation logic for fields within each form field. However, assuming these fields have default values, ensuring valid data types or ranges would improve robustness.

Here’s an optimized version of your code with some minor adjustments:

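import traceback

# Assuming a Django project: gettext and _ are used below but were never imported.
from django.utils.translation import gettext, gettext_lazy as _

from common import forms
from common.exception.app_exception import AppApiException
from common.forms import BaseForm  # assumed to be exported by common.forms
from models_provider.base_model_provider import BaseModelCredential, ValidCode

# Entries are dicts so that value_field/text_field below refer to real keys.
ENG_SERVICE_TYPE_OPTIONS = [
    {"value": "8k_zh", "label": "Chinese telephone universal"},
    # ... other options ...
]

class TencentSSTModelParams(BaseForm):
    EngSerViceType = forms.SingleSelect(
        tooltip_label=_('Engine model type'),
        required=True,
        default_value='16k_zh',
        option_list=ENG_SERVICE_TYPE_OPTIONS,
        value_field='value',
        text_field='label'
    )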

def is_valid_model_type(model_type, provider_types):
    """Check if the model type is supported by the provider."""
    return any(type_["value"] == model_type for type_ in provider_types)

class TencentSTTModelCredential(BaseForm, BaseModelCredential):
    REQUIRED_FIELDS = ["SecretId", "SecretKey"]

    @classmethod
    def validate_credential_fields(cls, credentials, raise_exception=False):
        """Validate the presence of required credential fields."""
        missing_keys = [key for key in cls.REQUIRED_FIELDS if key not in credentials]
        if missing_keys:
            msg = gettext('{keys} is required')
            raise AppApiException(ValidCode.valid_error.value, msg.format(keys=", ".join(missing_keys)))

    def __init__(self, secret_id=None, secret_key=None):
        """Initialize with encrypted Secret keys."""
        super().__init__()
        self.secret_id = secret_id
        self.secret_key = secret_key

    def encrypt_dictionary(self):
        """Return a dictionary of encryptions."""
        return {"SecretId": self.encrypt_secret_id(), "SecretKey": self.encrypt_secret_key()}

    def encrypt_secret_id(self):
        """Encrypt the secret ID."""
        if self.secret_id:
            return self.encryption_function(self.secret_id)
        return ""

    def encrypt_secret_key(self):
        """Encrypt the secret key."""
        if self.secret_key:
            return self.encryption_function(self.secret_key)
        return ""

    def is_valid(self, model_type, model_name, model_credential, model_params, provider,
                 raise_exception=False):
        """Verify model configuration against constraints."""
        try:
            # Raises AppApiException if a required credential field is missing.
            self.validate_credential_fields(model_credential)
            if not is_valid_model_type(model_type, provider.model_types()):
                return False
            model_instance = provider.get_model(model_type, model_name, creds=model_credential, **model_params)
            model_instance.check_auth()
        except Exception as e:
            traceback.print_exc()
            if raise_exception:
                error_message = gettext(
                    'Verification failed, please check whether the parameters are correct: {error}')
                raise AppApiException(ValidCode.valid_error.value, error_message.format(error=str(e)))
            return False
        return True

    SecretId = forms.PasswordInputField('SecretId', required=True)
    SecretKey = forms.PasswordInputField('SecretKey', required=True)

    def get_model_params_settings_form(self, model_name):
        """Retrieve the settings form for model parameters."""
        return TencentSSTModelParams()

# Example usage:
credential = TencentSTTModelCredential(secret_id="your-secret-id", secret_key="your-secret-key")
provider = SomeModelProvider()  # Replace with actual provider instance
form = TencentSSTModelParams()
model_credential = {"SecretId": "your-secret-id", "SecretKey": "your-secret-key"}
params = {
    # Add model parameter mappings here
}

# Argument order follows is_valid: type, name, credential dict, params, provider.
result = credential.is_valid("your-model-type", "default", model_credential, params, provider)

This revised version aims to simplify repeated operations such as validating fields and improving readability through the use of more descriptive variable names and function definitions where applicable.
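
The required-field check discussed in this comment can be looked at in isolation. The following is only a minimal sketch under stated assumptions, not the code merged in this PR: `CredentialError` and `check_required_fields` are hypothetical names, and the plain exception stands in for `AppApiException`. It shows one way to make the same function either return a boolean or raise, depending on `raise_exception`.

```python
from typing import Iterable, Mapping


class CredentialError(ValueError):
    """Hypothetical stand-in for AppApiException."""


def check_required_fields(
    credentials: Mapping[str, object],
    required: Iterable[str] = ("SecretId", "SecretKey"),
    raise_exception: bool = False,
) -> bool:
    """Return True when every required key is present and non-empty."""
    # A key counts as missing if it is absent or maps to an empty value.
    missing = [key for key in required if not credentials.get(key)]
    if not missing:
        return True
    if raise_exception:
        raise CredentialError(f"{', '.join(missing)} is required")
    return False
```

Returning a boolean by default keeps the helper usable in form validation, while `raise_exception=True` surfaces the names of the missing fields to the caller.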



except TencentCloudSDKException as err:
    print(err)
Contributor Author

The provided code appears generally well-structured and follows best practices for implementing a speech-to-text service on Tencent Cloud with the Tencent Cloud SDK for Python. However, a few improvements and considerations are worth noting:

  1. File Handling: The check_auth method opens an MP3 file named iat_mp3_16k.mp3. Ensure this file exists at the specified path within the script's directory.

  2. Encoding and Decoding: The line _v = base64.b64encode(buf) encodes the binary audio data as base64, which is correct for sending it in an HTTP request body. Note that b64encode returns bytes, so make sure the result is decoded to a string before it is placed in the JSON payload sent to Tencent Cloud services.

  3. Error Logging: When catching exceptions (TencentCloudSDKException), logging should include more details about the error, such as message, type, code, request ID, etc., to aid debugging if needed.

  4. Optional Parameters: In the constructor of TencentSpeechToText, you call new_instance() with all parameters from kwargs twice, which might lead to redundancy. Refactor this logic to reduce potential duplication.

  5. Security Considerations: If the script will be stored in version control repositories or shared externally, consider moving sensitive information like API keys (hunyuan_secret_id and hunyuan_secret_key) out of the source codebase. You can use environment variables or secure vaults to manage these credentials securely.

  6. Dependencies Check: Ensure that all necessary Python dependencies (like the Tencent Cloud SDK) are installed before running the script.

  7. Testing: Write comprehensive tests for the speech_to_text function, including edge cases and typical scenarios, to verify its correctness and reliability.
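
As an illustration of points 2 and 7, the payload construction can be pulled into a small pure function and tested without any network access. This is a hedged sketch, not part of the PR: `build_recognition_params` is a hypothetical helper name, and the field names are taken from the request dictionary in the code under review.

```python
import base64
import json


def build_recognition_params(audio: bytes, engine_type: str, voice_format: str = "mp3") -> dict:
    """Build a JSON-serializable parameter dict for a short-audio request."""
    return {
        "EngSerViceType": engine_type,
        "SourceType": 1,
        "VoiceFormat": voice_format,
        # b64encode returns bytes; decode to str so json.dumps accepts it.
        "Data": base64.b64encode(audio).decode("ascii"),
    }


# Quick offline check: the payload survives JSON serialization and
# the audio bytes can be recovered unchanged.
sample = b"\x00\xffabc"
params = build_recognition_params(sample, "16k_zh")
assert json.loads(json.dumps(params)) == params
assert base64.b64decode(params["Data"]) == sample
```

Because the base64 alphabet is plain ASCII, the encoded string needs no further escaping inside the JSON body.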

Here's a revised version of the code incorporating some of these suggestions:

import base64
import json
import os
from typing import *

from tencentcloud.asr.v20190614 import asr_client, models
from tencentcloud.common import credential
from tencentcloud.common.exception import TencentCloudSDKException
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile

from models_provider.base_model_provider import *
from models_provider.impl.base_stt import BaseSpeechToText

class TencentSpeechToText(MaxKBBaseModel, BaseSpeechToText):
    hunyuan_secret_id: str
    hunyuan_secret_key: str
    model: str
    params: dict
    
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.hunyuan_secret_id = kwargs.get('hunyuan_secret_id')
        self.hunyuan_secret_key = kwargs.get('hunyuan_secret_key')
        self.model = kwargs.get('model')
        self.params = kwargs.get('params')

    @staticmethod
    def is_cache_model():
        return False

    @staticmethod
    def new_instance(model_type, model_name, model_credential: Dict[str, object], **model_kwargs):
        return TencentSpeechToText(hunyuan_secret_id=model_credential['SecretId'], 
                                hunyuan_secret_key=model_credential['SecretKey'],
                                model=model_name,
                                params=model_kwargs, **model_kwargs)

    def check_auth(self):
        cwd = os.path.dirname(os.path.abspath(__file__))
        with open(f'{cwd}/iat_mp3_16k.mp3', 'rb') as f:
            self.speech_to_text(f)

    def speech_to_text(self, audio_file):
        try:
            cred = credential.Credential(self.hunyuan_secret_id, self.hunyuan_secret_key)
            httpProfile = HttpProfile(endpoint="asr.tencentcloudapi.com")
            clientProfile = ClientProfile(httpProfile=httpProfile)
            client = asr_client.AsrClient(cred, "", clientProfile)

            buf = audio_file.read()
            # Base64 is an encoding, not encryption, hence "encoded_buf".
            encoded_buf = base64.b64encode(buf).decode()

            req = models.SentenceRecognitionRequest()
            params = {
                "EngSerViceType": self.params.get('EngSerViceType', ''),
                "SourceType": 1,
                "VoiceFormat": "mp3",
                "Data": encoded_buf,
            }
            req.from_json_string(json.dumps(params))

            resp = client.SentenceRecognition(req)
            return resp.Result
        except TencentCloudSDKException as err:
            print(f"An error occurred during transcription:\n{err}")

This revised version includes additional checks for essential resources and enhanced error handling.
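
The error-logging suggestion (point 3) could look roughly like the sketch below. It is an illustration under assumptions rather than the PR's code: `describe_sdk_error` and `FakeSDKError` are hypothetical names, and the attribute names `code`, `message` and `requestId` are assumed to match what the SDK exception exposes; `getattr` with a default keeps the helper safe if they do not.

```python
import logging

logger = logging.getLogger(__name__)


def describe_sdk_error(err: Exception) -> str:
    """Format an exception together with any diagnostic attributes it carries."""
    # Attribute names are assumptions; missing ones fall back to None.
    code = getattr(err, "code", None)
    message = getattr(err, "message", None) or str(err)
    request_id = getattr(err, "requestId", None)
    return f"{type(err).__name__}: code={code} message={message} request_id={request_id}"


class FakeSDKError(Exception):
    """Hypothetical stand-in for the SDK exception, used only for this demo."""

    def __init__(self, code, message, requestId):
        super().__init__(message)
        self.code = code
        self.message = message
        self.requestId = requestId


try:
    raise FakeSDKError("AuthFailure", "signature mismatch", "req-123")
except FakeSDKError as err:
    # Log through the logging module instead of print so the record
    # carries a level and can be routed by the application's handlers.
    logger.error(describe_sdk_error(err))
```

In the real method, the same helper would be called from the `except TencentCloudSDKException` branch, after which the exception can be re-raised or converted so that callers do not silently receive `None`.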

TencentSpeechToText),
]

tencent_embedding_model_info = _create_model_info(
Contributor Author

There are no significant irregularities or potential issues in this code snippet. The code appears well-structured with clear imports and function definitions for initializing Tencent's various models.

One minor suggestion would be to refactor the duplicate _create_model_info calls inside the loop into separate utility functions if you plan on reusing them elsewhere. This could improve readability:

def create_stt_model_info():
    # Named "stt" because the entry registers a speech-to-text (ModelTypeConst.STT) model.
    return _create_model_info(
        'asr-sentence',
        _("This interface is used to recognize short audio files within 60 seconds. Supports Mandarin Chinese, English, Cantonese, Japanese, Vietnamese, Malay, Indonesian, Filipino, Thai, Portuguese, Turkish, Arabic, Hindi, French, German, and 23 Chinese dialects."),
        ModelTypeConst.STT,
        TencentSTTModelCredential,
        TencentSpeechToText
    )

_models_info += [
    create_stt_model_info()
]

However, this refactoring is optional unless you intend to create multiple instances of the same model info.

@zhanweizhang7 zhanweizhang7 merged commit f9f475a into v2 Aug 14, 2025
3 of 6 checks passed
@zhanweizhang7 zhanweizhang7 deleted the pr@v2@feat_integrate_with_tencent_sentience_recognition_model branch August 14, 2025 03:23
2 participants