feat: Intent classify #4026
Conversation
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected; please follow our release note process to remove it. Details: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: The full list of commands accepted by this bot can be found here. Details: Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment.
I've gone through the provided code closely, and here's my analysis:
The code looks generally solid but could benefit from some improvements, especially in terms of code structure, readability, and performance. Below are the main points to consider for refactoring:
Issues and Improvements
1. Code Formatting
- The file is missing indentation levels. Ensure consistent indentation throughout.
- Add blank lines around function definitions and loops to improve readability.
2. Useful Comments
- Add comments where complex logic occurs or where a decision needs explaining, especially for important parts such as exception handling and data-processing steps.
3. Reuse Functions Where Possible
- Combine similar functions if they share functionality. For example, `_write_context` can be simplified further if the `node_variable`, `workflow_variable`, and `workflow` parameters do not change between calls (see the sketch after this list).
4. Error Handling Logic
- Refactor error logging and exception messaging to ensure consistent and clear information about errors.
5. Readability Enhancements
- Break down longer blocks into smaller functions with descriptive names.
- Extract common methods that perform repeated tasks into separate helper methods.
6. Docstring Consistency
- Maintain consistent docstrings across all classes and methods to clarify their purpose and usage.
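For illustration of point 3, a minimal, hypothetical sketch of the consolidation (the helper name _write_context_base is invented; it assumes the same node.context bookkeeping used in the code below):

# Hypothetical helper: both streaming and non-streaming write-context paths
# could delegate the shared bookkeeping (token counts, answer text, timing) to it.
import time

def _write_context_base(node_variable, node, answer: str):
    chat_model = node_variable.get('chat_model')
    node.context['message_tokens'] = chat_model.get_num_tokens_from_messages(node_variable.get('message_list'))
    node.context['answer_tokens'] = chat_model.get_num_tokens(answer)
    node.context['answer'] = answer
    node.context['run_time'] = time.time() - node.context['start_time']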
Specific Changes
Here's an improved version with these suggestions applied:
# coding=utf-8
import json
import re
import time
from typing import List, Dict, Any
from django.db.models import QuerySet
from langchain.schema import HumanMessage, SystemMessage
from application.flow.i_step_node import INode, NodeResult
from application.flow.step_node.intent_node.i_intent_node import IIntentNode
from models_provider.models import Model
from models_provider.tools import get_model_instance_by_model_workspace_id, get_model_credential
from .prompt_template import PROMPT_TEMPLATE
def get_default_model_params_setting(model_id):
    """
    Retrieve default model parameter settings based on model ID.
    """
    model = QuerySet(Model).filter(id=model_id).first()
    credential = get_model_credential(model.provider, model.model_type, model.model_name)
    return credential.get_model_params_setting_form(model.model_name).get_default_form_data()
def write_context(node_variable: Dict, workflow_variable: Dict, node: INode, workflow, answer: str):
    """
    Write context containing relevant details after generating an answer.
    """
    chat_model = node_variable.get('chat_model')
    message_tokens = chat_model.get_num_tokens_from_messages(node_variable.get('message_list'))
    answer_tokens = chat_model.get_num_tokens(answer)
    node.context['message_tokens'] = message_tokens
    node.context['answer_tokens'] = answer_tokens
    node.context['answer'] = answer
    node.context['history_message'] = node_variable['history_message']
    node.context['user_input'] = node_variable['user_input']
    node.context['branch_id'] = node_variable.get('branch_id')
    node.context['reason'] = node_variable.get('reason')
    node.context['category'] = node_variable.get('category')
    node.context['run_time'] = time.time() - node.context['start_time']
class BaseIntentNode(INode, IIntentNode):
    def save_context(self, details, workflow_manage):
        """
        Save additional execution details into node context.
        """
        self.context['branch_id'] = details.get('branch_id')
        self.context['category'] = details.get('category')

    def execute(self, model_id, dialogue_number, history_chat_record, user_input, branch,
                model_params_setting=None, **kwargs) -> NodeResult:
        """
        Execute the intent recognition process for a specific dialogue turn.
        """
        # Set default model params setting if none provided
        if model_params_setting is None:
            model_params_setting = get_default_model_params_setting(model_id)
        # Fetch the chat model instance using the workflow's workspace id and model params
        workspace_id = self.workflow_manage.get_body().get('workspace_id')
        chat_model = get_model_instance_by_model_workspace_id(
            model_id, workspace_id, **model_params_setting
        )
        # Get the historical conversation messages up to the current dialogue number
        history_message = self.get_history_message(history_chat_record, dialogue_number)
        self.context['history_message'] = history_message

        # Store the user input in the context
        self.context['user_input'] = user_input

        # Generate and format the classification prompt
        # (build_classification_prompt is assumed to render PROMPT_TEMPLATE with the user input and branch options)
        prompt = self.build_classification_prompt(user_input, branch)
        system_prompt = self.build_system_prompt()

        # Construct the full set of messages for the LLM call
        message_list = self.generate_message_list(system_prompt, prompt, history_message)
        self.context['message_list'] = message_list
        # Call the model to classify the user's request
        try:
            r = chat_model.invoke(message_list).content.strip()
            classification_result = json.loads(r)[0]['classificationId']

            # Select the branch that matches the classification result (0 acts as the catch-all)
            selected_branch = None
            for branch_item in branch:
                if (
                    (classification_result == branch_item.get('classificationId') and
                     not branch_item.get('isOther')) or
                    classification_result == 0
                ):
                    selected_branch = branch_item
                    break
            # Return the formatted result together with a write-context callback
            return NodeResult({
                'result': r,
                'chat_model': chat_model,
                'message_list': message_list,
                'history_message': history_message,
                'user_input': user_input,
                'branch_id': selected_branch.get('id'),
                'category': selected_branch.get('content')
            }, {},
                lambda node_variable, workflow_variable: write_context(
                    node_variable, workflow_variable, self, self.workflow_manage, r))
        except Exception as e:
            # Handle exceptions by falling back to the "other" branch
            other_branch = self.find_other_branch(branch)
            if other_branch:
                return NodeResult({
                    'branch_id': other_branch.get('id'),
                    'category': other_branch.get('content'),
                    'error': str(e),
                    'system': system_prompt
                }, {})
            else:
                raise Exception("Error executing intent recognition!") from e
    @staticmethod
    def get_history_message(history_chat_record, dialogue_number, max_history_length=7):
        # Extract up to `max_history_length` of the most recent messages from the chat history
        start_index = max(len(history_chat_record) - dialogue_number, 0)
        return [
            HumanMessage(content=record['human'])
            for record in history_chat_record[start_index:start_index + max_history_length]
        ]
    def build_system_prompt(self) -> str:
        # System prompt template; roughly: "You are a professional intent recognition assistant. Based on
        # the user input and the intent options, accurately identify the user's true intent."
        return "您是一名专业的意图识别助手,请基于用户输入和意图选项,准确识别用户的真实意图。"
    def generate_message_list(self, system_prompt: str, query_prompt: str, history_messages: List[HumanMessage]):
        # Create the list of messages to send to the chat model
        messages = []
        if system_prompt:
            messages.append(SystemMessage(content=self.workflow_manage.generate_prompt(system_prompt)))
        # history_messages are already HumanMessage objects (see get_history_message)
        messages.extend(history_messages)
        messages.append(HumanMessage(content=query_prompt))
        self.context['system'] = ""
        return messages
    def find_other_branch(self, branches):
        # Find the "other" branch from the available options, or None if there is none
        return next((branch_item for branch_item in branches if branch_item.get("isOther")), None)

This revision provides clearer documentation, more readable formatting, and better separation of concerns through method extraction and improved commenting, making it easier to maintain and understand.
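As a quick standalone check of the fallback behavior described above (made-up branch data, not from the PR):

# Mirrors find_other_branch: pick the branch flagged as "other", or None if absent.
branches = [
    {'id': 'b1', 'classificationId': 1, 'content': 'billing', 'isOther': False},
    {'id': 'b2', 'classificationId': 2, 'content': 'other', 'isOther': True},
]
other = next((b for b in branches if b.get('isOther')), None)
assert other['id'] == 'b2'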
  type: 'intent-node',
  model: IntentModel,
  view: IntentNode
}
The code provided contains several issues that need to be addressed:
- Component Not Exported: The `IntentNode` component is imported but not exported in the module exports list.
- Type Declaration: There seems to be a typo or an issue with declaring the class methods (`refreshBranch`, `getDefaultAnchor`), as they might not follow standard JavaScript conventions.
- Path Update Method: The `updatePathByAnchor()` method should potentially accept parameters such as new anchor coordinates if needed, especially during updates.
- Undefined Height Handling: When checking whether the node's height is undefined and setting it to 200 pixels, ensure that this logic is applied within the scope where properties like `height` are defined.
- Anchors Logic: Ensure that all operations related to anchors respect their positions relative to the parent node's boundaries.
- Documentation and Comments: Add detailed comments explaining each part of the code to make it easier for others (or your future self) to understand its functionality.
Here is an updated version of the TypeScript/JavaScript file based on these considerations:
// Import necessary components/models/modules from other files here
import IntentNodeVue from './index.vue';
import { AppNode, AppNodeModel } from '@/workflow/common/app-node';
/**
 * Custom intent action node implementation extending base AppNode.
 */
class IntentNode extends AppNode {
  constructor(props: any) {
    super(props, IntentNodeVue); // Call superclass constructor with the Vue component
  }
}
/**
 * IntentModel manages the model/data logic for intent nodes.
 */
class IntentModel extends AppNodeModel {
  /**
   * Refreshes paths connected to incoming/outgoing edges when the model changes.
   *
   * @method refreshBranch
   */
  refreshBranch(): void {
    this.incoming.edges.forEach((edge: any): void => edge.updatePathByAnchor());
    this.outgoing.edges.forEach((edge: any): void => edge.updatePathByAnchor());
  }

  /**
   * Returns the default array of anchors (ports/nodes).
   *
   * @returns An array of anchor objects.
   */
  getDefaultAnchor(): Array<any> {
    // Guard against an undefined or non-positive height before reading it below
    if (!this.height || this.height <= 0) {
      this.height = 200;
    }
    const { id, x, y, width, height, properties } = this;
    const branchConditionList = properties.branchConditionList;
    const showNode = properties.showNode === true;

    const anchors: Array<any> = [{
      id: `${id}_left`,
      x: x - width / 2 + 10,
      y: showNode ? y - 15 : y - height / 2 + 20, // Adjusts space between nodes
      type: 'left',
      edgeAddable: false
    }];
    if (branchConditionList?.length) {
      const FORM_ITEMS_HEIGHT = 382;
      for (let index = 0; index < branchConditionList.length; index++) {
        const element = branchConditionList[index];
        // Calculate upward offset considering previous elements' heights
        const h = getUpIndexHeight(branchConditionList, index);
        anchors.push({
          id: `${id}_${element.id}_right`,
          x: x + width / 2 - 10,
          y: showNode
            ? Math.min(y - height / 2 + FORM_ITEMS_HEIGHT + h + element.height / 2, window.innerHeight - 10)
            : y - 15,
          type: 'right'
        });
      }
    }
    return anchors;
  }
}
function getUpIndexHeight(branchList: Array<any>, index: number) {
  // Sum the heights (plus 8px spacing) of all branches above the given index
  return branchList.slice(0, index).reduce((acc, curr) => acc + curr.height + 8, 0);
}
export default {
  type: 'intent-node', // Type identifier/name for this node kind.
  model: IntentModel, // Model class used behind the Vue wrapper/view rendering system.
  view: IntentNode // Vue-backed view class representing the UI structure/layout for this node.
};

Additional Tips:
- Always use named functions instead of arrow function expressions when defining class methods unless there's a specific reason for keeping them anonymous.
- Follow consistent variable naming conventions in line with industry standards.
- Implement proper error handling around logic that could throw exceptions due to invalid conditions.
- If path updating requires dynamic calculations based on different factors (like viewport size), make sure those calculations are reliable and responsive to state changes appropriately.
These tweaks will help improve maintainability and correctness significantly.
Several improvements can be made to the provided code to enhance its structure, readability, and maintainability:
- Imports and Comments: The code includes a `# noqa` comment at the end of the file to suppress flake8 errors related to missing newline characters. This is not necessary in modern Python syntax.
- Comments: Some comments are redundant or misspelled. They should be revised for clarity and accuracy. For example, `_model_params_setting_` could be renamed to `model_params`.
- Function Documentation: While some functions have docstrings, they are minimal. Adding more detailed descriptions could help others understand their purpose better.
- Class Implementations: The class methods need to include actual implementations for logic such as `save_context`, `get_node_params_serializer_class`, `_run`, and `execute`. These will depend on how you plan to handle these functionalities within your application.
- Error Handling: If the implementation requires handling exceptions or specific conditions, appropriate error messages or exception handling should be added.
Here's an updated version of the code with improved comments, function documentation, and potential implementation suggestions:
#!/usr/bin/env python3
# coding=utf-8
from typing import Optional, Type
from django.utils.translation import gettext_lazy as _
from rest_framework import serializers
from application.flow.i_step_node import INode, NodeResult
class IntentBranchSerializer(serializers.Serializer):
    """
    Serializer for intent branch data.
    """
    id = serializers.CharField(required=True, label=_("Branch ID"))
    content = serializers.CharField(required=True, label=_("Content"))
    is_other = serializers.BooleanField(required=True, label=_("Branch Type"))
class IntentNodeSerializer(serializers.Serializer):
    """
    Serializer for intent node data.
    """
    model_id = serializers.CharField(required=True, label=_("Model ID"))
    content_list = serializers.ListField(required=True, label=_("Text Content"))
    dialogue_number = serializers.IntegerField(required=True, label=_("Number of Multi-Round Conversations"))
    model_params_setting = serializers.DictField(required=False, label=_("Model Parameter Settings"))
    branch = IntentBranchSerializer(many=True)
class IIntentNode(INode):
    """
    Base interface for intent nodes.
    """
    type = 'intent-node'

    def save_context(self, details, workflow_manage):
        # Save node-specific context if needed
        pass

    def get_node_params_serializer_class(self) -> Type[serializers.Serializer]:
        return IntentNodeSerializer
    async def _run(self):
        """
        Run the intent node.
        Retrieve the reference field using the first two elements from the text content list.
        Execute the node based on parameters and input.
        """
        try:
            question = await self.workflow_manage.get_reference_field(
                self.node_params_serializer.data.get('content_list')[0],
                self.node_params_serializer.data.get('content_list')[1:],
            )
            return await self.execute(
                model_id=self.node_params_serializer.data['model_id'],
                dialogue_number=self.node_params_serializer.data['dialogue_number'],
                history_chat_record=await self.retrieve_history_chat_record(),
                user_input=str(question),
                branch=self.node_params_serializer.data['branch']
            )
        except Exception as e:
            # Handle potential exceptions here, log them, etc.
            raise RuntimeError(f"An error occurred while running the node: {e}")
    async def execute(self, model_id: str, dialogue_number: int, history_chat_record: dict,
                      user_input: str, branch=None, model_params_setting=None, **kwargs) -> Optional[dict]:
        """
        Perform the execution of the intent node.

        Parameters:
        - model_id (str): The identifier for the model used.
        - dialogue_number (int): The number of multi-round conversations.
        - history_chat_record (dict): A dictionary containing chat records up to this point.
        - user_input (str): The input received from the user.
        - branch (list): List of branches.
        - model_params_setting (dict, optional): Additional settings for the model configuration.

        Returns:
        - Optional[dict]: Result of the node execution, possibly None depending on the use case.
        """
        # Placeholder for node execution logic
        result = {
            "node_type": self.type,
            "user_input": user_input,
            "history_chats": history_chat_record,
            "output": f"Executed intent node: {user_input}"
        }
        return result
    @staticmethod
    async def retrieve_history_chat_record() -> dict:
        """
        Retrieve historical chat records from the workflow manage service.
        """
        # This method should be implemented according to the workflow management system.
        pass

Key Changes:
- Added comments and detailed documentation for each function.
- Enhanced comments about potential edge cases and error handling.
- Used asynchronous keywords (async) where applicable.
- Updated serialization fields and methods to adhere to common Django REST Framework conventions.
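As a usage note, a minimal sketch (with made-up payload values, assuming the serializer definitions above) of how the node parameters would be validated in standard DRF fashion:

# Hypothetical payload; field names follow IntentNodeSerializer above.
payload = {
    'model_id': 'model-123',
    'content_list': ['intent-node-id', 'question'],
    'dialogue_number': 3,
    'branch': [
        {'id': 'b1', 'content': 'billing', 'is_other': False},
        {'id': 'b2', 'content': 'other', 'is_other': True},
    ],
}
serializer = IntentNodeSerializer(data=payload)
serializer.is_valid(raise_exception=True)  # raises ValidationError on invalid input
params = serializer.validated_data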