[WIP] feat(eval): support evaluation visualization #107
zhijianma wants to merge 11 commits into agentscope-ai:main
Summary of Changes
Hello @zhijianma, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the application's capability to display and analyze evaluation data. It provides users with a rich, interactive interface to import evaluation results, view high-level summaries, dive into specific metrics, and trace the execution paths of individual tasks. The changes span both frontend and backend, establishing a robust foundation for future evaluation-centric features.
Code Review
This pull request introduces a new evaluation visualization feature, including new UI components, data models, and API endpoints. The changes are extensive and well-structured, providing a comprehensive view of evaluation results and task details. The use of context providers for managing evaluation and task data is a good architectural choice. There are a few areas where the code could be improved for robustness, consistency, and maintainability.
// TODO: check if the request is from localhost
children: item.isDirectory ? [] : undefined,
isLeaf: !item.isDirectory,
icon: item.isDirectory ? <FolderOutlined /> : <FileOutlined />,
selectable: item.isDirectory,
The selectable property is hardcoded to item.isDirectory. This means only directories will be selectable in the tree. However, the Props interface allows type: 'file' | 'directory' | 'both'. If the type prop is 'file' or 'both', this logic would be incorrect. The selectable property should dynamically depend on the type prop to allow selection of files or both files and directories as needed.
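One way to make selectability follow the type prop is sketched below. This is a minimal illustration, not the PR's code: the names SelectType, FileItem, and isSelectable are hypothetical, and only the three type values and the isDirectory flag come from the snippet and comment above.

```typescript
// Hypothetical names for illustration; only the 'file' | 'directory' | 'both'
// union and the isDirectory flag are taken from the reviewed code.
type SelectType = 'file' | 'directory' | 'both';

interface FileItem {
    title: string;
    isDirectory: boolean;
}

// Decide whether a tree node may be selected for the given `type` prop.
function isSelectable(item: FileItem, type: SelectType): boolean {
    if (type === 'both') {
        // Files and directories are both allowed.
        return true;
    }
    // 'directory' allows only directories; 'file' allows only files.
    return type === 'directory' ? item.isDirectory : !item.isDirectory;
}
```

With a helper along these lines, the tree node could set selectable: isSelectable(item, type) instead of selectable: item.isDirectory.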
callback({
    success: true,
    message: 'Directory listed successfully',
    data: {
        title: fileName,
        isDirectory: stats.isDirectory(),
        modifiedTime: stats.mtime,
    },
} as ResponseBody);
});
The callback function is being called inside the map function, which means it will be invoked for each file found in the directory. This is incorrect, as a socket callback should typically be called only once with the complete result. The callback should be called after the map operation, passing the entire fileNames array as data.
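The suggestion above could look roughly like the following sketch, which collects every entry first and then replies exactly once. It is an assumption-laden illustration: the listDirectory helper and the FileEntry type are hypothetical, the ResponseBody shape is inferred from the fields used in the snippet, and synchronous fs calls are used only to keep the example short.

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Assumed shape, inferred from the fields used in the reviewed snippet.
interface ResponseBody {
    success: boolean;
    message: string;
    data?: unknown;
}

// Hypothetical entry type mirroring the fields sent in the original code.
interface FileEntry {
    title: string;
    isDirectory: boolean;
    modifiedTime: Date;
}

// Hypothetical helper: build the full listing, then invoke the callback once.
function listDirectory(
    dirPath: string,
    callback: (res: ResponseBody) => void,
): void {
    let entries: FileEntry[];
    try {
        entries = fs.readdirSync(dirPath).map((fileName) => {
            const stats = fs.statSync(path.join(dirPath, fileName));
            // Return the entry instead of replying from inside map().
            return {
                title: fileName,
                isDirectory: stats.isDirectory(),
                modifiedTime: stats.mtime,
            };
        });
    } catch (error) {
        // Single failure reply; nothing has been sent before this point.
        callback({ success: false, message: `Error: ${error}` });
        return;
    }
    // Single success reply carrying the whole array.
    callback({
        success: true,
        message: 'Directory listed successfully',
        data: entries,
    });
}
```

Keeping the success reply outside the try block means an exception thrown by the callback itself cannot trigger a second, contradictory reply from the catch branch.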
// checked={isChecked(tagRecord.tag, tableRequestParams)}
checked={(
    (tableRequestParams
        .filters?.tags
        ?.value as string[]) ||
    []
).includes(tagRecord.tag)}
const formatNumber = (num: number, decimals: number = 6): number => {
    return parseFloat(num.toFixed(decimals));
};
} catch (error) {
    console.error('Error in getEvaluationTask:', error);
}

const requiredBenchmarkFields = [
    'name',
    'description',
    // 'total_tasks',
if (fs.existsSync(dirPath)) {
    // Get all files and folders in this directory; for each, only its name, whether it is a directory, and its modification time
        window.removeEventListener('resize', handleResize);
        chartInstanceRef.current?.dispose();
    };
}, [theme, onChartReady]);
socket.on(
    SocketEvents.client.getEvaluationResult,
    async (
        evaluationDir: string,
        callback: (res: ResponseBody) => void,
    ) => {
        try {
            const data = await FileDao.getJSONFile<EvalResult>(
                path.join(evaluationDir, 'evaluation_result.json'),
            );
            callback({
                success: true,
                message: 'Get evaluation result successfully',
                data: data,
            } as ResponseBody);
        } catch (error) {
            console.error(error);
            callback({
                success: false,
                message: `Error: ${error}`,
            } as ResponseBody);
        }
    },
Description
[Please describe the background, purpose, changes made, and how to test this PR]
Checklist
Please check the following items before code is ready to be reviewed.
npm run format command in the root directory