feat: Support tag in knowledge_workflow #4468

shaohuzhang1 · 2025-12-09T04:04:05Z

feat: Support tag in knowledge_workflow

f2c-ci-robot · 2025-12-09T04:04:10Z

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

f2c-ci-robot · 2025-12-09T04:04:15Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

shaohuzhang1 · 2025-12-09T04:04:36Z

apps/application/flow/step_node/knowledge_write_node/impl/base_knowledge_write_node.py

+                batch_add_document_tag(document_tag_id_map)

        problem_model_list, problem_paragraph_mapping_list = (
            ProblemParagraphManage(problem_paragraph_object_list, knowledge_id).to_problem_model_list()


The provided Python code is generally well-structured and follows Django best practices. However, there are a few areas where improvements could be made:

Imports: The typing.Any import can often lead to type-checking errors. It might be better to specify more precise types where possible.

Method Names: Method names like convert_uuid_to_str, link_file, etc., are good for clarity but consider using underscores consistently (snake_case) instead of hyphens (-).

Error Handling: Ensure that error handling is robust enough to manage any exceptions that may arise during file operations or database interactions.

Database Operations: For larger datasets, ensure that transactions are used to group multiple SQL queries into one transaction block, which can improve performance and consistency.

Serialization: Consider simplifying the serialization process by using Django's built-in serializers where appropriate, which can reduce boilerplate code.

Functionality: Verify that all functions, particularly those related to data insertion, handle edge cases and invalid inputs gracefully.

Code Comments: Add comments explaining complex logic or sections of the code, especially around multi-line comments or blocks of repeated code.

Here are some specific suggestions:

# Example refinement: Using snake_case for method names def convert_uuid_to_str(obj): # Implementation... pass # Adding error handling try: link_file(source_file_id, document_id) except Exception as e: print(f"Error linking file: {e}") # Simplified serialization with Django SerializerUtils if applicable from django.core.exceptions import ValidationError def create_serializer(data): try: serializer = MyDjangoModelSerializer(data=data) serializer.is_valid(raise_exception=True) return serializer.validated_data except ValidationError as e: raise SerialzationException(str(e)) # Improved function for tag management @transaction.atomic def save_knowledge_tags(concerned_document_id, concerned_tags): old_tag_models = DocumentTag.objects.filter(document_id__in=[concerned_document_id]) new_tag_list = [] existing_tag_dict = {} for tag in concerned_tags: key_value_pair = (tag['key'], tag['value']) if key_value_pair not in existing_tag_dict.values(): new_tag = DocmumentTag(document_id=concerned_document_id, **key_value_pair) new_tag_list.append(new_tag) existing_tag_dict[key_value_pair] = f"{new_tag.id}" if new_tag_list: DocumentTag.objects.bulk_create(new_tag_list) # Refactoring batch_add_document_tag @transaction.atomic def batch_add_document_tag(relation_map): existing_relations = set(QuerySet(...).filter(...).values_list('...')) new_entries = [] for doc_id in relation_map: for tag_id in relation_map[doc_id]: if (doc_id, tag_id) not in existing_relations: entry = {...} new_entries.append(entry) if new_entries: _queryset.objects.bulk_create(..., new_entries)

These changes make the code cleaner, more maintainable, and safer from potential runtime errors when interacting with databases or external services.

shaohuzhang1 · 2025-12-09T04:04:45Z

apps/models_provider/impl/aliyun_bai_lian_model_provider/credential/stt/default_stt.py

+            {'label': _('Real-time speech recognition - Fun-ASR/Paraformer'),
             'value': 'other'}
        ])
    api_url = forms.TextInputField(_('API URL'), required=True, relation_show_field_dict={'type': ['qwen', 'omni']})


The code looks mostly clean, but here are some suggestions:

The option_list should be indented under the type field definition.

Here's the corrected version with indentation added:

class AliyunBaiLianDefaultSTTModelCredential(BaseForm, BaseModelCredential): type = forms.SingleSelect( _("API"), required=True, text_field='label', default_value='qwen', provider='', method='', value_field='value', option_list=[ {'label': _('Audio file recognition - Tongyi Qwen'), 'value': 'qwen'}, {'label': _('Qwen-Omni'), 'value': 'omni'}, {'label': _('Real-time speech recognition - Fun-ASR/Paraformer'), 'value': 'other'} ] ) api_url = forms.TextInputField( _('API URL'), required=True, relation_show_field_dict={'type': ['qwen', 'omni']} )

Consider renaming the api_url to something more descriptive like speech_recognition_endpoint, as it suggests that this field is related to speech recognition rather than simply an API URL.

Although not strictly necessary from a technical perspective, ensuring consistent coding style (e.g., using consistent spacing around operators and closing braces) can improve readability.

These changes make the code more readable and maintainable while maintaining its functional correctness.

shaohuzhang1 · 2025-12-09T04:04:54Z

apps/models_provider/impl/aliyun_bai_lian_model_provider/model/stt/omni_stt.py

+                extra_body = {'enable_thinking': False, **self.params},
            )
            result = []
            for chunk in completion:


The provided code checks for certain irregularities that need addressing:

Irregularity/Issue

Missing self Parameter: The extra_body parameter is missing the self. prefix before it. This will raise an AttributeError because Python needs to access class attributes using the instance self.

extra_body=self.params

extra_body=self.extra_body

### Optimization Suggestion (if applicable) There doesn't appear to be any immediate optimization opportunities within this function, as it primarily involves processing audio and sending requests. ### Additional Considerations - **Exception Handling**: Ensure there's proper error handling around HTTP requests if they might fail unexpectedly. ```python except Exception as e: print(f"An error occurred in speech_to_text: {e}")

Logging: Implement logging statements to track the flow of execution and errors more effectively.

import logging logger = logging.getLogger(__name__) logger.setLevel(logging.INFO) formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s') handler = logging.StreamHandler() handler.setFormatter(formatter) logger.addHandler(handler) try: completion_stream = openai.Completion.create( ... additional_body={'enable_thinking': False, **self.params} ) except Exception as e: logger.error(f"Error generating text from AI: {e}") return None

These changes should help ensure that the code functions correctly without syntax errors and improves robustness through some basic error handling practices.

feat: Support tag in knowledge_workflow

e77bdc7

f2c-ci-robot bot added the do-not-merge/release-note-label-needed label Dec 9, 2025

shaohuzhang1 commented Dec 9, 2025

View reviewed changes

zhanweizhang7 merged commit 6e87cef into v2 Dec 9, 2025
3 of 6 checks passed

zhanweizhang7 deleted the pr@v2@feat_support branch December 9, 2025 04:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: Support tag in knowledge_workflow #4468

feat: Support tag in knowledge_workflow #4468

shaohuzhang1 commented Dec 9, 2025

Uh oh!

f2c-ci-robot bot commented Dec 9, 2025

Uh oh!

f2c-ci-robot bot commented Dec 9, 2025

Uh oh!

shaohuzhang1 Dec 9, 2025

Uh oh!

shaohuzhang1 Dec 9, 2025

Uh oh!

shaohuzhang1 Dec 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

feat: Support tag in knowledge_workflow #4468

feat: Support tag in knowledge_workflow #4468

Conversation

shaohuzhang1 commented Dec 9, 2025

Uh oh!

f2c-ci-robot bot commented Dec 9, 2025

Uh oh!

f2c-ci-robot bot commented Dec 9, 2025

Uh oh!

shaohuzhang1 Dec 9, 2025

Choose a reason for hiding this comment

Uh oh!

shaohuzhang1 Dec 9, 2025

Choose a reason for hiding this comment

Uh oh!

shaohuzhang1 Dec 9, 2025

Choose a reason for hiding this comment

Irregularity/Issue

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants