From dafcc80d650d9c6aa324248cf8bb9e1183ff5199 Mon Sep 17 00:00:00 2001
From: Atharva Rai
Date: Tue, 20 May 2025 13:52:00 +0530
Subject: [PATCH 1/6] Add chemical properties description to csd-1000r dataset documentation

---
 src/skmatter/datasets/descr/csd-1000r.rst | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/src/skmatter/datasets/descr/csd-1000r.rst b/src/skmatter/datasets/descr/csd-1000r.rst
index 7eed4e4f8..ccb3336d6 100644
--- a/src/skmatter/datasets/descr/csd-1000r.rst
+++ b/src/skmatter/datasets/descr/csd-1000r.rst
@@ -43,6 +43,19 @@ The representations were computed with [C1]_ using the hyperparameters:
 
 Of the 2'520 resulting features, 100 were selected via FPS using [C2]_.
 
+Chemical Properties
+-------------------
+
+The CSD-1000R dataset consists of 100 atomic environments selected from crystal structures in the Cambridge Structural Database (CSD). These environments represent a diverse set of chemical compositions and bonding types, including:
+
+- Metals, metalloids, and non-metals
+- Covalent, ionic, and metallic bonding environments
+- Various coordination numbers and geometries
+
+The dataset captures local chemical environments relevant for modeling properties such as nuclear magnetic resonance (NMR) chemical shieldings, aiding in the understanding of structure-property relationships in materials chemistry.
+
+For more detailed chemical information, users can refer to the original Cambridge Structural Database or the publication by Ceriotti et al. (2019).
+
 References
 ----------
 

From 2a459143e7428c2f30159322258f06e6f5a278d6 Mon Sep 17 00:00:00 2001
From: Atharva Rai
Date: Thu, 22 May 2025 11:00:53 +0530
Subject: [PATCH 2/6] Add CSD website and publication link to CSD-1000R dataset description

---
 src/skmatter/datasets/descr/csd-1000r.rst | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/skmatter/datasets/descr/csd-1000r.rst b/src/skmatter/datasets/descr/csd-1000r.rst
index ccb3336d6..2dba95eeb 100644
--- a/src/skmatter/datasets/descr/csd-1000r.rst
+++ b/src/skmatter/datasets/descr/csd-1000r.rst
@@ -46,7 +46,8 @@ Of the 2'520 resulting features, 100 were selected via FPS using [C2]_.
 Chemical Properties
 -------------------
 
-The CSD-1000R dataset consists of 100 atomic environments selected from crystal structures in the Cambridge Structural Database (CSD). These environments represent a diverse set of chemical compositions and bonding types, including:
+The CSD-1000R dataset consists of 100 atomic environments selected from crystal structures in the `Cambridge Structural Database (CSD) . These environments represent a diverse set of chemical compositions and bonding types, including:
+
 
 - Metals, metalloids, and non-metals
 - Covalent, ionic, and metallic bonding environments
@@ -54,7 +55,8 @@ The CSD-1000R dataset consists of 100 atomic environments selected from crystal
 
 The dataset captures local chemical environments relevant for modeling properties such as nuclear magnetic resonance (NMR) chemical shieldings, aiding in the understanding of structure-property relationships in materials chemistry.
 
-For more detailed chemical information, users can refer to the original Cambridge Structural Database or the publication by Ceriotti et al. (2019).
+For more detailed chemical information, users can refer to the original Cambridge Structural Database or the `publication by Ceriotti et al. (2019) `_.
+
 
 References
 ----------

From c2815fdbf95495ad63311df10dc92dde59acd700 Mon Sep 17 00:00:00 2001
From: Atharva Rai
Date: Thu, 22 May 2025 11:13:58 +0530
Subject: [PATCH 3/6] Add CSD website and publication link to CSD-1000R dataset description

---
 src/skmatter/datasets/descr/csd-1000r.rst | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/skmatter/datasets/descr/csd-1000r.rst b/src/skmatter/datasets/descr/csd-1000r.rst
index 2dba95eeb..2174d1563 100644
--- a/src/skmatter/datasets/descr/csd-1000r.rst
+++ b/src/skmatter/datasets/descr/csd-1000r.rst
@@ -46,8 +46,7 @@ Of the 2'520 resulting features, 100 were selected via FPS using [C2]_.
 Chemical Properties
 -------------------
 
-The CSD-1000R dataset consists of 100 atomic environments selected from crystal structures in the `Cambridge Structural Database (CSD) . These environments represent a diverse set of chemical compositions and bonding types, including:
-
+The CSD-1000R dataset consists of 100 atomic environments selected from crystal structures in the Cambridge Structural Database (CSD) [C3]_. These environments represent a diverse set of chemical compositions and bonding types, including:
 
 - Metals, metalloids, and non-metals
 - Covalent, ionic, and metallic bonding environments
@@ -55,7 +54,8 @@ The CSD-1000R dataset consists of 100 atomic environments selected from crystal
 
 The dataset captures local chemical environments relevant for modeling properties such as nuclear magnetic resonance (NMR) chemical shieldings, aiding in the understanding of structure-property relationships in materials chemistry.
 
-For more detailed chemical information, users can refer to the original Cambridge Structural Database or the `publication by Ceriotti et al. (2019) `_.
+For more detailed chemical information, users can refer to the original Cambridge Structural Database [C3]_ or the publication by Ceriotti et al. (2019) [C4]_
+
 
 
 References
@@ -63,6 +63,10 @@ References
 
 .. [C1] https://github.com/lab-cosmo/librascal commit ade202a6
 .. [C2] https://github.com/lab-cosmo/scikit-matter commit 4ed1d92
+.. [C3] https://www.ccdc.cam.ac.uk/structures/
+.. [C4] https://www.nature.com/articles/s41597-019-0224-1
+
+
 
 Reference Code
 --------------

From ec231bffb1ea9ee9578b0c65ae855e0df18cd21d Mon Sep 17 00:00:00 2001
From: Atharva Rai
Date: Thu, 22 May 2025 11:42:23 +0530
Subject: [PATCH 4/6] Fix RST formatting and citation issues in csd-1000r.rst

---
 src/skmatter/datasets/descr/csd-1000r.rst | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/skmatter/datasets/descr/csd-1000r.rst b/src/skmatter/datasets/descr/csd-1000r.rst
index 2174d1563..d66178ff9 100644
--- a/src/skmatter/datasets/descr/csd-1000r.rst
+++ b/src/skmatter/datasets/descr/csd-1000r.rst
@@ -32,7 +32,7 @@ The representations were computed with [C1]_ using the hyperparameters:
 +---------------------------+------------+
 | max_angular:              | 6          |
 +---------------------------+------------+
-| gaussian_sigma_constant": | 0.4        |
+| gaussian_sigma_constant:  | 0.4        |
 +---------------------------+------------+
 | gaussian_sigma_type:      | "Constant"|
 +---------------------------+------------+
@@ -123,3 +123,6 @@ Reference Code
     properties_select = [
         frames[fi].arrays["CS_local"][ci] for fi, ci in zip(f_selected, ci_selected)
     ]
+
+
+.. [Ceriotti2019] Ceriotti, M. et al. Science Advances, 2019.
From 9defa7f536d241059246e5d757bb991906afbe93 Mon Sep 17 00:00:00 2001
From: Atharva Rai
Date: Thu, 22 May 2025 12:54:37 +0530
Subject: [PATCH 5/6] docs: update CSD-1000R dataset description with improved formatting and citations

---
 src/skmatter/datasets/descr/csd-1000r.rst | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/skmatter/datasets/descr/csd-1000r.rst b/src/skmatter/datasets/descr/csd-1000r.rst
index d66178ff9..b40b241b4 100644
--- a/src/skmatter/datasets/descr/csd-1000r.rst
+++ b/src/skmatter/datasets/descr/csd-1000r.rst
@@ -25,7 +25,7 @@ The representations were computed with [C1]_ using the hyperparameters:
 
 +---------------------------+------------+
 | key                       | value      |
-+---------------------------+------------+
++===========================+============+
 | interaction_cutoff:       | 3.5        |
 +---------------------------+------------+
 | max_radial:               | 6          |
@@ -46,17 +46,20 @@ Of the 2'520 resulting features, 100 were selected via FPS using [C2]_.
 Chemical Properties
 -------------------
 
-The CSD-1000R dataset consists of 100 atomic environments selected from crystal structures in the Cambridge Structural Database (CSD) [C3]_. These environments represent a diverse set of chemical compositions and bonding types, including:
+The CSD-1000R dataset consists of 100 atomic environments selected from crystal structures
+in the Cambridge Structural Database (CSD) [C3]_. These environments represent a diverse set
+of chemical compositions and bonding types, including:
 
 - Metals, metalloids, and non-metals
 - Covalent, ionic, and metallic bonding environments
 - Various coordination numbers and geometries
 
-The dataset captures local chemical environments relevant for modeling properties such as nuclear magnetic resonance (NMR) chemical shieldings, aiding in the understanding of structure-property relationships in materials chemistry.
-
-For more detailed chemical information, users can refer to the original Cambridge Structural Database [C3]_ or the publication by Ceriotti et al. (2019) [C4]_
-
-
+The dataset captures local chemical environments relevant for modeling properties such as
+nuclear magnetic resonance (NMR) chemical shieldings, aiding in the understanding of
+structure-property relationships in materials chemistry.
 
+For more detailed chemical information, users can refer to the original Cambridge
+Structural Database [C3]_ or the publication by Ceriotti et al. (2019) [C4]_.
 
 References
 ----------
@@ -65,8 +68,8 @@ References
 .. [C2] https://github.com/lab-cosmo/scikit-matter commit 4ed1d92
 .. [C3] https://www.ccdc.cam.ac.uk/structures/
 .. [C4] https://www.nature.com/articles/s41597-019-0224-1
-
-
+.. [Ceriotti2019] M. Ceriotti et al. "Chemical Shifts in Molecular Solids by Machine Learning Datasets",
+   Materials Cloud Archive 2019.0023/v2 (2019), https://doi.org/10.24435/materialscloud:2019.0023/v2.
 
 Reference Code
 --------------
@@ -123,6 +126,3 @@ Reference Code
     properties_select = [
         frames[fi].arrays["CS_local"][ci] for fi, ci in zip(f_selected, ci_selected)
     ]
-
-
-.. [Ceriotti2019] Ceriotti, M. et al. Science Advances, 2019.

From fab9f12bd024f34b1c191dce4852bca44ad2124f Mon Sep 17 00:00:00 2001
From: Atharva Rai
Date: Sat, 24 May 2025 14:44:05 +0530
Subject: [PATCH 6/6] fix: remove duplicate citation and fix linting issues in csd-1000r.rst

---
 src/skmatter/datasets/descr/csd-1000r.rst | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/src/skmatter/datasets/descr/csd-1000r.rst b/src/skmatter/datasets/descr/csd-1000r.rst
index b40b241b4..f1e070338 100644
--- a/src/skmatter/datasets/descr/csd-1000r.rst
+++ b/src/skmatter/datasets/descr/csd-1000r.rst
@@ -46,19 +46,19 @@ Of the 2'520 resulting features, 100 were selected via FPS using [C2]_.
 Chemical Properties
 -------------------
 
-The CSD-1000R dataset consists of 100 atomic environments selected from crystal structures
-in the Cambridge Structural Database (CSD) [C3]_. These environments represent a diverse set
-of chemical compositions and bonding types, including:
+The CSD-1000R dataset consists of 100 atomic environments selected from crystal
+structures in the Cambridge Structural Database (CSD) [C3]_. These environments
+represent a diverse set of chemical compositions and bonding types, including:
 
 - Metals, metalloids, and non-metals
 - Covalent, ionic, and metallic bonding environments
 - Various coordination numbers and geometries
 
-The dataset captures local chemical environments relevant for modeling properties such as
-nuclear magnetic resonance (NMR) chemical shieldings, aiding in the understanding of
-structure-property relationships in materials chemistry.
+The dataset captures local chemical environments relevant for modeling properties
+such as nuclear magnetic resonance (NMR) chemical shieldings, aiding in the
+understanding of structure-property relationships in materials chemistry.
 
-For more detailed chemical information, users can refer to the original Cambridge
+For more detailed chemical information, users can refer to the original Cambridge
 Structural Database [C3]_ or the publication by Ceriotti et al. (2019) [C4]_.
 
 References
@@ -68,8 +68,6 @@ References
 .. [C2] https://github.com/lab-cosmo/scikit-matter commit 4ed1d92
 .. [C3] https://www.ccdc.cam.ac.uk/structures/
 .. [C4] https://www.nature.com/articles/s41597-019-0224-1
-.. [Ceriotti2019] M. Ceriotti et al. "Chemical Shifts in Molecular Solids by Machine Learning Datasets",
-   Materials Cloud Archive 2019.0023/v2 (2019), https://doi.org/10.24435/materialscloud:2019.0023/v2.
 
 Reference Code
 --------------