Merged
@@ -161,18 +161,30 @@ def _generate_problem_list(
         return list(set(problem_list))
 
     def get_details(self, index: int, **kwargs):
+        paragraph_list = self.context.get('paragraph_list', [])
+        # Keep only the first 5 paragraphs of each document
+        limited_paragraph_list = []
+        for doc in paragraph_list:
+            if doc.get('paragraphs'):
+                doc_copy = doc.copy()
+                doc_copy['paragraphs'] = doc['paragraphs'][:5]
+                limited_paragraph_list.append(doc_copy)
+            else:
+                limited_paragraph_list.append(doc)
+        paragraph_list = limited_paragraph_list
+
         return {
             'name': self.node.properties.get('stepName'),
             "index": index,
             'run_time': self.context.get('run_time'),
             'type': self.node.type,
             'status': self.status,
             'err_message': self.err_message,
-            'paragraph_list': self.context.get('paragraph_list', []),
+            'paragraph_list': paragraph_list,
             'limit': self.context.get('limit'),
             'chunk_size': self.context.get('chunk_size'),
             'with_filter': self.context.get('with_filter'),
             'patterns': self.context.get('patterns'),
             'split_strategy': self.context.get('split_strategy'),
-            'document_list': self.context.get('document_list', []),
+            # 'document_list': self.context.get('document_list', []),
         }
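
The truncation introduced in this hunk can be tried in isolation. The sketch below lifts the loop into a hypothetical standalone helper, `limit_paragraphs` (the helper name, the `max_paragraphs` parameter, and the sample data are assumptions for illustration, not part of the PR). It suggests that, because `doc.copy()` is a shallow copy and slicing builds a new list, the documents stored in the context keep all of their paragraphs.

```python
from typing import Any, Dict, List


def limit_paragraphs(paragraph_list: List[Dict[str, Any]],
                     max_paragraphs: int = 5) -> List[Dict[str, Any]]:
    """Return a new list in which each document keeps at most max_paragraphs paragraphs."""
    limited = []
    for doc in paragraph_list:
        if doc.get('paragraphs'):
            doc_copy = doc.copy()  # shallow copy: only the top-level dict is duplicated
            doc_copy['paragraphs'] = doc['paragraphs'][:max_paragraphs]  # slicing creates a new list
            limited.append(doc_copy)
        else:
            limited.append(doc)  # nothing to truncate, so the original dict is reused
    return limited


# Hypothetical sample data, for illustration only
docs = [
    {'name': 'a.md', 'paragraphs': [{'content': f'p{i}'} for i in range(8)]},
    {'name': 'empty.md', 'paragraphs': []},
]
limited_docs = limit_paragraphs(docs)
print(len(limited_docs[0]['paragraphs']))  # 5
print(len(docs[0]['paragraphs']))          # 8, the original is untouched
```

The paragraph dicts themselves are still shared between the original and the truncated copy, so mutating a paragraph through one would be visible through the other.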

The code is structured correctly and has no obvious syntax errors. A few suggestions for improvement:

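  1. Variable Naming Consistency: Use consistent naming for variables and parameters across the function. For example, get_details() rebinds paragraph_list to the truncated list; using limited_paragraph_list directly in the returned dict would make the intent clearer.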

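  2. Error Handling for context Fields: Check that context fields such as paragraph_list and document_list hold usable values before iterating over them. self.context.get('paragraph_list', []) falls back to [] only when the key is missing; if the key is present with a None value, the for loop would raise a TypeError.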

  3. Optimization Suggestion:

    • In _generate_problem_list(), consider de-duplicating on a unique key within each document's paragraphs or sections rather than on whole items. A sketch, assuming each item is a dict with a 'key' field and that the documents are passed in as documents (both are assumptions, not the method's actual signature):
      def _generate_problem_list(self, documents) -> List[str]:
          # Collect each distinct key once across paragraphs and sections
          problem_set = {
              item['key']
              for doc in documents
              for item in doc.get('paragraphs', []) + doc.get('sections', [])
              if item.get('key') is not None
          }
          return list(problem_set)

    With this approach, an item that appears in both a section and a paragraph is returned only once.

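  4. Code Formatting: Keep indentation, spacing, and quoting consistent to improve readability. The returned dict, for instance, mixes single-quoted keys ('name') with a double-quoted one ("index").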

Here is a standalone version of the de-duplication helper incorporating some of these suggestions:

from typing import Iterable, List

def generate_problem_list(items_list: Iterable[Iterable[dict]]) -> List[str]:
    # Collect one identifier per item, preferring 'problemKey' over 'name'
    problems = {
        item.get('problemKey', item.get('name'))
        for sublist in items_list
        for item in sublist
    }
    # Drop items that had neither field, then sort case-insensitively
    return sorted((p for p in problems if p is not None), key=str.lower)

# Example call; your_data is hypothetical sample data standing in for the real nested item lists
your_data = [[{'problemKey': 'b'}, {'name': 'A'}], [{'name': 'b'}]]
problems_list = generate_problem_list(your_data)

print(problems_list)  # ['A', 'b']

Replace the sample data with the actual data processing steps as needed. These adjustments should make the handling of context-related data more consistent and easier to maintain.
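
To make suggestion 2 concrete: `dict.get(key, default)` returns the default only when the key is absent, so a key that is present with a `None` value still yields `None`. Below is a minimal sketch of a guard, assuming the context behaves like a plain dict; the helper name `get_list` and the sample context are hypothetical.

```python
from typing import Any, Dict, List


def get_list(context: Dict[str, Any], key: str) -> List[Any]:
    """Return context[key] if it is a list, otherwise an empty list."""
    value = context.get(key)
    return value if isinstance(value, list) else []


# Hypothetical context in which the key exists but holds None
context = {'paragraph_list': None}
print(context.get('paragraph_list', []))    # None: the default is not used
print(get_list(context, 'paragraph_list'))  # []
print(get_list(context, 'document_list'))   # []
```

A shorter alternative is `context.get(key) or []`, which also replaces other falsy values such as an empty string; the `isinstance` check is stricter about what counts as a list.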
