
Property prediction model training fails when training data are very similar #5021

@Chengqian-Zhang

Description


Summary

A researcher is training a d33 (piezoelectric coefficient) prediction model on two training structures using the property prediction head. Both structures are perovskite-type (ABO3). Their composition (Pb1000Ti340Nb400Mg140In120O3000) and the spatial arrangement of atomic sites are identical. The only difference is how Ti, Nb, Mg, and In randomly occupy the B sites of the 10×10×10 perovskite supercell. The two pictures show the x-direction projection of the B sites in each structure. I used pymatgen's StructureMatcher to confirm that the two structures do not match. The labels of the two structures are 411 and 587, respectively.

[Images: x-direction projections of the B sites in the two structures]
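To make the setup concrete, here is a minimal sketch of how two such configurations can share the same composition while differing in B-site ordering. It is not the script used to build the actual data: the function name `random_b_site_occupancy` and the seeds are hypothetical, and only the B-site counts and the 10×10×10 supercell size come from the description above.

```python
import random
from collections import Counter

# B-site counts taken from the composition Pb1000Ti340Nb400Mg140In120O3000.
B_SITE_COUNTS = {"Ti": 340, "Nb": 400, "Mg": 140, "In": 120}
SUPERCELL = (10, 10, 10)  # one B site per ABO3 unit cell -> 1000 B sites


def random_b_site_occupancy(seed):
    """Hypothetical helper: map each (i, j, k) cell index to a B-site species."""
    # Build a list holding each species as many times as it occurs, then shuffle.
    species = [el for el, n in B_SITE_COUNTS.items() for _ in range(n)]
    random.Random(seed).shuffle(species)
    nx, ny, nz = SUPERCELL
    cells = [(i, j, k) for i in range(nx) for j in range(ny) for k in range(nz)]
    return dict(zip(cells, species))


# Two independent random decorations (the seeds are arbitrary illustrations).
config_a = random_b_site_occupancy(seed=1)
config_b = random_b_site_occupancy(seed=2)

# The composition is identical, but the site-by-site ordering differs.
same_composition = Counter(config_a.values()) == Counter(config_b.values())
n_different_sites = sum(config_a[c] != config_b[c] for c in config_a)

print("same composition:", same_composition)
print("B sites with a different species:", n_different_sites)
```

Any descriptor that only sees the overall composition would treat `config_a` and `config_b` as identical, so the model has to resolve the local ordering to separate the two labels.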

I used the DPA-3.1-3M settings and trained for 1000 steps (500 epochs), but the model shows no predictive accuracy. The dp test result is:

# d33_database/1 - 0: data_property pred_property
4.111499938964843750e+02 4.762894698179545685e+02
# d33_database/2 - 0: data_property pred_property
5.865300292968750000e+02 4.762894713410120744e+02

It seems that the dp model cannot distinguish between the two similar structures.
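For reference, the gap can be quantified directly from the numbers in the dp test output. The snippet below is just arithmetic on the values copied from that output; the variable names are my own.

```python
# (label, prediction) pairs copied from the dp test output above.
results = {
    "d33_database/1": (4.111499938964843750e+02, 4.762894698179545685e+02),
    "d33_database/2": (5.865300292968750000e+02, 4.762894713410120744e+02),
}

labels = [label for label, _ in results.values()]
preds = [pred for _, pred in results.values()]

# How far apart the two labels are versus how far apart the two predictions are.
label_gap = abs(labels[0] - labels[1])
pred_gap = abs(preds[0] - preds[1])

# Mean absolute error over the two training structures.
mae = sum(abs(l - p) for l, p in zip(labels, preds)) / len(labels)

print(f"label gap:      {label_gap:.4f}")
print(f"prediction gap: {pred_gap:.3e}")
print(f"MAE:            {mae:.4f}")
```

The labels differ by roughly 175, while the two predictions differ only on the order of 1e-6, so the model output is effectively the same for both structures.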

The input.json and data files needed to reproduce this are:
input.json
d33_database.tgz

DeePMD-kit Version

3.1.0

Backend and its version

pytorch

Python Version, CUDA Version, GCC Version, LAMMPS Version, etc

Python version: 3.12.11
CUDA Version: 12.4

Details

See above
