Merged
3 changes: 2 additions & 1 deletion apps/dataset/serializers/document_serializers.py
@@ -1106,7 +1106,7 @@ def batch_save(self, instance_list: List[Dict], with_valid=True):
'order_by_query': QuerySet(Document).order_by('-create_time', 'id')
}, select_string=get_file_content(
os.path.join(PROJECT_DIR, "apps", "dataset", 'sql', 'list_document.sql')),
with_search_one=False), dataset_id
with_search_one=False), dataset_id

@staticmethod
def _batch_sync(document_id_list: List[str]):
@@ -1263,6 +1263,7 @@ def save_image(image_list):
exist_image_list = [str(i.get('id')) for i in
QuerySet(Image).filter(id__in=[i.id for i in image_list]).values('id')]
save_image_list = [image for image in image_list if not exist_image_list.__contains__(str(image.id))]
save_image_list = list({img.id: img for img in save_image_list}.values())
if len(save_image_list) > 0:
QuerySet(Image).bulk_create(save_image_list)

The code looks mostly clean on initial inspection. However, there are a few suggestions for improvement:

  1. In batch_save method:

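    • The with_search_one=False), dataset_id line appears twice in the diff because it was removed and re-added with what looks like a whitespace-only change. Confirm that select_string and with_search_one are each passed exactly once in the resulting call.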
  2. In _batch_sync method:

    • This method appears to iterate over the given documents, check whether each one exists in the database, and filter out those that already exist before acting on the rest. If this logic remains unchanged, consider checking that the IDs in document_id_list are unique.
  3. When filtering images by ID in the save_image function:

    • Set operations apply to sets, not lists. Building exist_image_list as a set and testing membership with not in, rather than calling __contains__ directly, makes the filter more efficient and easier to read.
  4. In save_image function when creating new images:

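    • Using {img.id: img for img in save_image_list} builds a dictionary keyed by ID and then converts its values back to a list, which removes duplicate IDs while keeping the image objects. Note that list(set([img.id for img in save_image_list])) would return only the IDs, not the objects that bulk_create needs, so the dictionary approach is the appropriate one here.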
  5. General recommendation:

    • Ensure consistency across your methods regarding parameters used, such as whether with_search_one is always included or omitted.
  6. Security note from commit message:

    • Make sure you're handling file paths carefully to prevent security vulnerabilities like path traversal.
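
The save_image points in items 3 and 4 can be illustrated with a minimal sketch. It is not the project's code: the Image dataclass below is a hypothetical stand-in for the Django model, and existing_ids stands in for the result of the QuerySet(Image).filter(...) lookup. It only shows how set membership and a dictionary keyed by ID behave.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Image:
    # Hypothetical stand-in for the Django Image model; only `id` matters here.
    id: str
    name: str = ""


def select_images_to_save(image_list: Iterable[Image], existing_ids: Set[str]) -> List[Image]:
    """Return images whose id is not already stored, with duplicate ids collapsed."""
    # Membership tests against a set are O(1); `not in` replaces the explicit __contains__ call.
    new_images = [image for image in image_list if str(image.id) not in existing_ids]
    # A dict keyed by id keeps one object per id (the last one seen),
    # in the order each id first appeared.
    return list({image.id: image for image in new_images}.values())


# Assumed sample data: "a" is duplicated and "c" already exists in storage.
images = [Image("a", "first"), Image("b"), Image("a", "second"), Image("c")]
to_save = select_images_to_save(images, existing_ids={"c"})
```

With this sample input, to_save holds two objects, one for "a" and one for "b". A set of IDs alone would lose the objects themselves, which is why the dictionary is used.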

Overall, the code seems functional and well-structured. Just ensure that any changes or improvements align with your application's requirements and best practices.
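
For the file-path note in item 6, one common guard is to resolve the joined path and confirm it stays under the intended base directory. The sketch below is an assumption-laden illustration: safe_join is a hypothetical helper and /srv/project is a made-up base directory, neither taken from this repository. In the diff the path is built from PROJECT_DIR and literal strings, so a check like this matters mainly if a path component ever comes from user input.

```python
import os


def safe_join(base_dir: str, *parts: str) -> str:
    """Join parts onto base_dir and reject results that escape base_dir."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, *parts))
    # commonpath equals base only when candidate is base itself or lies beneath it.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"path escapes base directory: {candidate}")
    return candidate


# Hypothetical base directory, used for illustration only.
base = "/srv/project"
sql_path = safe_join(base, "apps", "dataset", "sql", "list_document.sql")
```

A call such as safe_join(base, "..", "etc", "passwd") raises ValueError because the resolved path falls outside the base directory.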
