Refactor and new function suggestions. #82

The current version of deepks-kit is hard to use and maintain. Here are some problems and suggestions for improvement.
# New functions for users
1. Conversion of the final model. When the training process finishes for deepks+abacus, the final step should convert `model.pth` to `model.ptg`, since the final model is the one most frequently used in later work.
2. Visualization of the SCF process. In deepks-abacus, the vast majority of the time in each iteration is spent in the SCF process. However, one can only see the overall progress of the iterations by watching the `RECORD` file, and the `tag_0_finished` files are generated only in the init step, which makes it quite difficult to check the training process. There should also be a convenient way to stop a run and restart it at any point.
3. MPI/OpenMP parallelization of deepks-kit runs.
4. Check of data files in `.npy`. The size of each npy file should be checked at the very beginning of a run, which would spare users the tedious work of checking the files themselves.
5. A function to automatically split a whole dataset into a training set and a test set. Currently, users are required to prepare separate npy files for training and testing, which means additional work. It would be better to add a function and an input parameter that let users split the dataset the way they prefer directly in deepks-kit.
6. Update of docs. Both user docs and developer docs should be updated.
7. Compact input file. There are too many input files, and the parameter list is too long. Users usually only need to modify a few parameters in actual use, so the reference input file should retain only the necessary parameters, with the complete parameter list and explanations moved to the user documentation.
8. Dependency update. The current deepks-kit does not support the newest versions of ***ruamel-yaml*** and ***numpy***.
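
For item 4, a minimal sketch of what such a pre-run check could look like. It assumes that every data file in a system folder stores the number of frames as its first dimension; the function name `check_npy_frames` and the file names in the usage note are illustrative, not part of deepks-kit.

```python
from pathlib import Path

import numpy as np


def check_npy_frames(system_dir, names):
    """Return a list of problems found among the .npy files of one system folder.

    Assumption: each file listed in `names` must exist and all of them must
    share the same first dimension (the number of frames).
    """
    system_dir = Path(system_dir)
    problems = []
    nframes = {}
    for name in names:
        path = system_dir / name
        if not path.is_file():
            problems.append(f"{path}: missing")
            continue
        # mmap_mode="r" reads the header without loading the whole array
        arr = np.load(path, mmap_mode="r")
        if arr.ndim == 0:
            problems.append(f"{path}: scalar array, expected a leading frame axis")
            continue
        nframes[name] = arr.shape[0]
    # All files that were found must agree on the number of frames
    if len(set(nframes.values())) > 1:
        problems.append(f"{system_dir}: inconsistent frame counts {nframes}")
    return problems
```

Calling something like `check_npy_frames("systems/group.00", ["atom.npy", "energy.npy"])` once per system before the first iteration would report all problems together instead of failing in the middle of a run.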

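For item 5, the split could be sketched roughly as below. This assumes all arrays of a dataset share the frame axis as their first dimension; `split_indices`, `split_arrays`, `test_fraction` and `seed` are hypothetical names, and a random split is only one of the strategies users might want.

```python
import numpy as np


def split_indices(n_frames, test_fraction=0.2, seed=0):
    """Return sorted (train_idx, test_idx) for a reproducible random split."""
    if n_frames < 2:
        raise ValueError("need at least two frames to split")
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_frames)
    # Keep at least one frame on each side of the split
    n_test = min(n_frames - 1, max(1, int(round(n_frames * test_fraction))))
    return np.sort(perm[n_test:]), np.sort(perm[:n_test])


def split_arrays(arrays, train_idx, test_idx):
    """Apply the same frame selection to every array in a dict of arrays."""
    train = {key: value[train_idx] for key, value in arrays.items()}
    test = {key: value[test_idx] for key, value in arrays.items()}
    return train, test
```

Using one index set for all arrays keeps the frames of the different files aligned, and a fixed seed makes the split repeatable across restarts.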
# Refactor suggestions
1. File structure optimization. At present, the outermost structure is relatively clear, but individual files contain too many functions, so many files are very long and inconvenient to maintain. It is recommended to separate utils folders and files by functionality. (For example, train.py contains all training-related functions and classes; the class implementations should be split out into separate files such as evaluator.py.)
2. Independent default value file. Currently, function implementations and default value settings (variables named in capitals) are written together across different files. It would be better to combine all default values into one file, which makes them easier to maintain in the future.
3. Make the purpose and usage of some functions clearer. Some functions combine several behaviors and select between them only by the type of the input parameters, which makes it difficult to tell which behavior is used at a given call site. *check_share_folder()* is one example. Such functions need to be rethought and given a more reasonable implementation.
4. Simplify some functions. Some functions, such as *gather_stats_abacus()*, are long but repeat similar operations; they should be simplified.
5. Add the necessary comments and headers.
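
As one possible shape for item 2 (and for the compact input file suggested above), all defaults could live in a single module and be merged with the few values a user actually sets. The keys and values in `DEFAULTS` below are placeholders for illustration, not the real deepks-kit defaults, and `merge_defaults` is a hypothetical helper.

```python
import copy

# Placeholder defaults for illustration only; not the actual deepks-kit values.
DEFAULTS = {
    "n_iter": 5,
    "train_input": {"n_epoch": 5000, "start_lr": 1e-4},
    "scf_input": {"conv_tol": 1e-7},
}


def merge_defaults(user, defaults=DEFAULTS):
    """Return a new dict in which `user` values recursively override `defaults`."""
    merged = copy.deepcopy(defaults)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections instead of replacing them wholesale
            merged[key] = merge_defaults(value, merged[key])
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

With this layout a user file only needs to contain something like `{"train_input": {"n_epoch": 100}}`; everything else is taken from the defaults module, which also becomes the single place to document each parameter.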

# Bugs
1. Support for ***pyscf***. Neither the **master** branch nor the **develop** branch supports the newest ***pyscf***. Whether to continue supporting pyscf should be considered.
