perf: Optimize the export of conversation logs #4230

Conversation
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
current_page = current_page + 1
output = BytesIO()
workbook.save(output)
output.seek(0)
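For reference, a minimal sketch of how a BytesIO buffer like the one above is typically returned as a download in Django; the helper name and filename are assumptions for illustration, not part of this PR:

```python
from io import BytesIO

from django.http import FileResponse
from openpyxl import Workbook


def workbook_download(workbook: Workbook, filename: str = "chat_log.xlsx") -> FileResponse:
    """Serialize a workbook to memory and return it as an attachment."""
    output = BytesIO()
    workbook.save(output)
    output.seek(0)  # rewind so the response reads from the start
    # FileResponse streams the buffer and sets Content-Disposition for us
    return FileResponse(output, as_attachment=True, filename=filename)
```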
In this review of the code update:

The main changes replace native_page_search with native_page_handler, which paginates more smoothly. The function now also resets values when converting strings, to avoid potential illegal characters, and normalizes time formats (a sketch of that cleanup follows this comment).

To make it even better and cleaner:

- Add comments throughout to explain each step of the data processing, to improve readability.
- Simplify some parts, such as the dynamic setting of table names, by using conditional logic within get_file_content.
- Ensure that all imports and module usages are consistent across the script.
- Consider separating validation into its own method, or at least make sure error handling is clear throughout.

Overall, these updates should enhance maintainability and reduce potential bugs while improving performance.
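A minimal sketch of the illegal-character cleanup and time normalization described above, assuming openpyxl is the Excel writer; the helper name to_cell_value and the exact formatting rules are illustrative, not taken from this PR:

```python
from datetime import datetime

from openpyxl.cell.cell import ILLEGAL_CHARACTERS_RE


def to_cell_value(value):
    """Coerce a raw row value into something safe to write into an xlsx cell."""
    if value is None:
        return ""
    # Normalize datetimes to one consistent text format
    if isinstance(value, datetime):
        return value.strftime("%Y-%m-%d %H:%M:%S")
    # Strip control characters that openpyxl rejects as illegal
    if isinstance(value, str):
        return ILLEGAL_CHARACTERS_RE.sub("", value)
    return value
```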
def get_field_replace_dict(queryset: QuerySet):
    """
    Get the fields that need replacing; by default "xxx.xxx" needs to be replaced with "xxx"."xxx"
The provided code looks mostly correct, but a few improvements can be made:

Improvements

- Use exec() safely: Using exec(sql) to create dynamic models is generally safe when executed in a controlled environment where permissions are managed properly, but without proper validation it can introduce security risks. To mitigate this, consider alternative approaches such as dynamically constructing classes or functions.
- SQL generation optimization: The generate_sql_by_query function is complex and may benefit from optimization. Consider simplifying the logic or using an existing library such as SQLAlchemy for SQL generation, which helps maintain readability and reduce errors (see the sketch after this list).
- Error handling: Add error handling around database operations and other critical sections of the code. This makes debugging easier and ensures better fault tolerance.
- Comments for complex logic: Some parts of the code have complex logic; adding comments clarifies how specific functions work and improves understanding for future developers.
- Documentation: Ensure that important parts of the code, such as method docstrings, clearly describe what they do and their parameters.
- Database connection alias: Replace the hardcoded database alias 'default' with a more descriptive name, or pass it as an argument, depending on your application architecture.
- Primary key value processing: The get_primary_value function should handle different data types correctly, ensuring consistency across various datasets.
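For the SQL-generation point above, a minimal sketch of what the SQLAlchemy Core alternative could look like; the table and column names here are assumptions for illustration:

```python
from sqlalchemy import column, select, table

# Lightweight table description; no ORM models or metadata required
chat_log = table("application_chat", column("id"), column("abstract"), column("create_time"))


def page_sql(last_id, page_size: int) -> str:
    """Build one keyset-pagination page as SQL text."""
    stmt = select(chat_log.c.id, chat_log.c.abstract, chat_log.c.create_time)
    if last_id is not None:
        # Resume after the last primary key instead of using OFFSET
        stmt = stmt.where(chat_log.c.id > last_id)
    stmt = stmt.order_by(chat_log.c.id).limit(page_size)
    # literal_binds inlines parameters so the SQL can be inspected;
    # production code would execute stmt with bound parameters instead
    return str(stmt.compile(compile_kwargs={"literal_binds": True}))
```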
Here's a revised version with some of these optimizations applied:
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple, Union
import hashlib
import logging

from django.core.paginator import Paginator
from django.db import models
from django.db.models import QuerySet
from rest_framework.response import Response  # assuming Django REST framework is in use

from common.db.sql_execute import select_one, select_list, update_execute
from .result import Page

# Cache of generated dynamic models
_model_cache = {}
def get_dynamics_model(attr: dict, table_name='dynamics'):
    """
    Get a dynamic Django model, creating it on first use.
    """
    # Build a cache key from the table name and attributes
    cache_key = hashlib.md5(f"{table_name}_{str(sorted(attr.items()))}".encode()).hexdigest()
    # If the model already exists, return the cached one
    if cache_key in _model_cache:
        return _model_cache[cache_key]
    attributes = {
        "__module__": "knowledge.models",
        "Meta": type("Meta", (), {'db_table': table_name}),
        **attr
    }
    # Use a unique class name to avoid collisions
    class_name = f"Dynamics_{cache_key[:8]}"
    model_class = type(class_name, (models.Model,), attributes)
    # Cache the generated model
    _model_cache[cache_key] = model_class
    return model_class
def native_page_search(current_page: int, page_size: int, queryset: QuerySet | Dict[str, QuerySet]):
    """
    Run a paginated search and return a Page object.
    (count_objects and paginate_queryset are placeholder helpers; see the note below.)
    """
    total_results = count_objects(queryset)
    paginated_queryset = paginate_queryset(queryset, current_page, page_size)
    return Page(total_results, list(paginated_queryset), current_page, page_size)
def native_page_handler(page_size: int,
                        queryset: QuerySet | Dict[str, QuerySet],
                        select_string: str,
                        field_replace_dict=None,
                        with_table_name=False,
                        primary_key='id',
                        get_primary_value=lambda x: getattr(x, 'pk', None),
                        primary_queryset: str = None,
                        ):
    """
    Custom pagination handler: stream rows with keyset pagination,
    resuming each page after the last primary key already yielded.
    """
    if isinstance(queryset, Dict):
        if primary_queryset not in queryset:
            raise ValueError('Missing primary_queryset')
        querysets_to_process = [queryset[primary_queryset]]
    else:
        querysets_to_process = [queryset]
    for qs in querysets_to_process:
        last_processed_id = None
        while True:
            # Filter past the last processed primary key instead of using OFFSET
            if last_processed_id is not None:
                qs_with_condition = qs.filter(**{f'{primary_key}__gt': last_processed_id})
            else:
                qs_with_condition = qs
            sql, params = convert_queryset_to_sql(qs_with_condition, select_string,
                                                  field_replace_dict, with_table_name)
            page_sql = wrap_in_pagination(sql, params, page_size)
            results = execute_query(page_sql, params)
            if not results:
                break  # no rows left for this queryset
            for row in results:
                yield row
                last_processed_id = get_primary_value(row)
def native_page_search_paginated(current_page: int, page_size: int,
                                 queryset: Union[QuerySet, Dict],
                                 handler: Callable[[int, List, Optional[int]], Iterator[Any]]):
    paginator = Paginator(queryset, per_page=page_size)
    try:
        response_data = paginator.page(current_page).object_list
        meta = {'total_pages': paginator.num_pages, 'current_page_number': current_page}
        result_rows = list(handler(page_size, response_data, current_page))
        final_content = [convert_row_to_response(row) for row in result_rows]
        status_code = 200
    except Exception as e:
        logging.exception(e)
        final_content = [{'error': str(e)}]
        meta = {'total_pages': 0, 'current_page_number': 0}
        status_code = 500
    response_payload = {"meta": meta, "content": final_content}
    return Response(data=response_payload, status=status_code)
def get_field_replace_dict(queryset: QuerySet):
    """
    Get the fields that need replacing; by default "xxx.xxx" needs to be replaced with "xxx"
    """
    def extract_last_value(item: Tuple[List[Tuple[str, Any]]]) -> tuple:
        ...

This revision includes placeholder implementations of helper functions (execute_query, convert_queryset_to_sql, wrap_in_pagination, convert_row_to_response, and so on) and adds basic exception handling in the paginated search path. You'll need to fill in the actual implementations based on your project's needs.
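For context, a minimal usage sketch showing how the streaming handler above could feed the Excel export; the column names and the assumption that execute_query yields dict rows are illustrative only:

```python
from io import BytesIO

from openpyxl import Workbook


def export_chat_logs(queryset) -> BytesIO:
    """Stream rows through native_page_handler into one worksheet."""
    workbook = Workbook()
    sheet = workbook.active
    sheet.append(["id", "abstract", "create_time"])  # header row (assumed columns)
    for row in native_page_handler(page_size=500, queryset=queryset, select_string="*"):
        # Rows are written as they arrive, so memory use stays flat
        # (assumes execute_query returns dict-like rows)
        sheet.append([row.get("id"), row.get("abstract"), row.get("create_time")])
    output = BytesIO()
    workbook.save(output)
    output.seek(0)
    return output
```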