Commit 9128804
Merge pull request #34 from aced-differentiate/update-sl-interfaces
Update sl interfaces
2 parents c3f9813 + bdc4f8a commit 9128804

File tree

13 files changed (+2287, −1039 lines)

docs/README.md

Lines changed: 4 additions & 1 deletion

````diff
@@ -16,7 +16,7 @@ For additional details please see the User Guide, Tutorials, and API sections.
 One of the core philosophies of AutoCat is to provide modular and extensible tooling to
 facilitate closed-loop computational materials discovery workflows. Within this submodule
 are classes for defining a design space, featurization,
-regression, and defining a closed-loop sequential learning iterator. The
+regression, selecting candidate systems, and defining a closed-loop sequential learning iterator. The
 key classes intended for each of these purposes are:
 
 - [**`DesignSpace`**](User_Guide/Learning/sequential#designspace): define a design space to explore
@@ -25,6 +25,9 @@ key classes intended for each of these purposes are:
 
 - [**`Predictor`**](User_Guide/Learning/predictors): a regressor for predicting materials properties
 
+- [**`CandidateSelector`**](User_Guide/Learning/sequential.md#candidateselector): propose candidate system(s)
+for evaluation
+
 - [**`SequentialLearner`**](User_Guide/Learning/sequential#sequentiallearner): define a closed-loop iterator
 
 
````
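The four classes above cooperate in one closed loop: the design space holds systems and labels, the predictor is retrained, the selector proposes the next system, and the loop iterates. A rough, hedged sketch of that composition (stub classes with invented minimal APIs for illustration, not AutoCat's actual interfaces):

```python
# Illustrative stubs only; AutoCat's real classes have richer APIs.

class DesignSpace:
    """Holds candidate systems and their labels (None = unlabelled)."""
    def __init__(self, systems, labels):
        self.systems = list(systems)
        self.labels = list(labels)

class Predictor:
    """Fits on labelled systems, predicts (value, uncertainty) pairs."""
    def fit(self, systems, labels):
        self.mean = sum(labels) / len(labels)
    def predict(self, systems):
        # Toy model: predict the training mean; uncertainty grows with index.
        return [(self.mean, 0.1 * i) for i, _ in enumerate(systems)]

class CandidateSelector:
    """Picks the unlabelled system with maximum uncertainty ("MU")."""
    def choose(self, predictions, unlabelled_indices):
        return max(unlabelled_indices, key=lambda i: predictions[i][1])

def sequential_learning_loop(design_space, predictor, selector, n_loops, oracle):
    # Assumes at least one label is known initially (the seed training data).
    for _ in range(n_loops):
        labelled = [i for i, l in enumerate(design_space.labels) if l is not None]
        unlabelled = [i for i, l in enumerate(design_space.labels) if l is None]
        if not unlabelled:
            break
        predictor.fit([design_space.systems[i] for i in labelled],
                      [design_space.labels[i] for i in labelled])
        preds = predictor.predict(design_space.systems)
        pick = selector.choose(preds, unlabelled)
        design_space.labels[pick] = oracle(design_space.systems[pick])  # "evaluate"
    return design_space
```

In a real workflow the `oracle` step is the expensive evaluation (e.g. a DFT calculation), which is why the selector tries to make each pick count.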
docs/Tutorials/pred_h.md

Lines changed: 71 additions & 29 deletions
````diff
@@ -14,14 +14,14 @@ function.
 ```py
 >>> # Generate the clean single-atom alloy structures
 >>> from autocat.saa import generate_saa_structures
->>> from autocat.utils import extract_structures
+>>> from autocat.utils import flatten_structures_dict
 >>> saa_struct_dict = generate_saa_structures(
 ...     ["Fe", "Cu", "Au"],
 ...     ["Pt", "Pd", "Ni"],
 ...     facets={"Fe":["110"], "Cu":["111"], "Au":["111"]},
 ...     n_fixed_layers=2,
 ... )
->>> saa_structs = extract_structures(saa_struct_dict)
+>>> saa_structs = flatten_structures_dict(saa_struct_dict)
 ```
 
 Now that we have the clean structures, let's adsorb hydrogen on the surface.
````
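For orientation, the renamed helper's job is to turn the nested dict returned by the generation functions into a flat list of structures. A minimal sketch of that flattening idea (the real `flatten_structures_dict` in `autocat.utils` may differ in details, such as how leaf entries are detected):

```python
# Sketch of a "flatten nested structures dict -> list" helper.
# Illustrative only; AutoCat's implementation may detect leaves differently.

def flatten_structures_dict(d):
    """Recursively collect leaf values from a nested dict into a flat list."""
    structures = []
    for value in d.values():
        if isinstance(value, dict):
            structures.extend(flatten_structures_dict(value))
        else:
            structures.append(value)
    return structures

# e.g. {"Fe": {"Pt": {"110": slab}}} -> [slab]
```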
````diff
@@ -33,14 +33,10 @@ function.
 ```py
 >>> # Adsorb hydrogen onto each of the generated SAA surfaces
 >>> from autocat.adsorption import place_adsorbate
+>>> from ase import Atoms
 >>> ads_structs = []
 >>> for clean_struct in saa_structs:
-...     ads_dict = place_adsorbate(
-...         clean_struct,
-...         "H",
-...         (0.,0.)
-...     )
-...     ads_struct = extract_structures(ads_dict)[0]
+...     ads_struct = place_adsorbate(clean_struct, Atoms("H"))
 ...     ads_structs.append(ads_struct)
 ```
 
````
````diff
@@ -62,6 +58,16 @@ if any of the labels for a structure are unknown, it can be included as a `numpy
 ```py
 >>> from autocat.learning.sequential import DesignSpace
 >>> design_space = DesignSpace(ads_structs, labels)
+>>> design_space
++-------------------------+-------------------------------------------+
+|                         | DesignSpace                               |
++-------------------------+-------------------------------------------+
+| total # of systems      | 9                                         |
+| # of unlabelled systems | 0                                         |
+| unique species present  | ['Fe', 'H', 'Pt', 'Pd', 'Ni', 'Cu', 'Au'] |
+| maximum label           | 1.0173326963281424                        |
+| minimum label           | -1.4789390894451206                       |
++-------------------------+-------------------------------------------+
 ```
 
 ## Setting up a `Predictor`
````
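The repr above reports a count of unlabelled systems; with NaN used as the unknown-label marker (as the hunk's context suggests for unknown labels), that bookkeeping reduces to counting NaNs. A small illustrative sketch, not AutoCat code:

```python
import math

# Sketch: with NaN marking unknown labels, the DesignSpace summary fields
# ("# of unlabelled systems", max/min label) are simple reductions.
labels = [0.3, math.nan, -1.1, math.nan, 0.9]
num_unlabelled = sum(math.isnan(l) for l in labels)
known = [l for l in labels if not math.isnan(l)]
print(num_unlabelled, max(known), min(known))  # -> 2 0.9 -1.1
```

Note that `math.nan == math.nan` is `False`, which is why an explicit `isnan` check is needed rather than a membership test.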
````diff
@@ -71,43 +77,79 @@ When setting up our `Predictor` we now have two choices to make:
 1. The technique to be used for featurizing the systems
 2. The regression model to be used for training and predictions
 
-Internally, the `Predictor` will contain a `Featurizer` object which contains all of
-our choices for how to featurize the systems. Our choice of featurizer class and
-the associated kwargs are specified via the `featurizer_class` and
-`featurization_kwargs` arguments, respectively. By providing the design space structures
+Internally, the `Predictor` will contain a `Featurizer` object (that the user supplies)
+which stores all of our choices for how to featurize the systems. Our choice of
+featurizer class and the associated kwargs are specified via the `featurizer_class` and
+`kwargs` arguments, respectively. By providing the design space structures
 some of the kwargs related to the featurization (e.g. maximum structure size) can be
 automatically obtained.
 
-Similarly, we can specify the regressor to be used within the `model_class` and
-`model_kwargs` arguments. The class should be "`sklearn`-like" with `fit` and
-`predict` methods.
+Let's featurize the hydrogen environment via `dscribe`'s `SOAP` class
+```py
+>>> from autocat.learning.featurizers import Featurizer
+>>> from dscribe.descriptors.soap import SOAP
+>>> featurizer = Featurizer(
+...     featurizer_class=SOAP,
+...     kwargs={"rcut": 7.0, "nmax": 8, "lmax": 8},
+...     design_space_structures=design_space.design_space_structures
+... )
+>>> featurizer
++-----------------------------------+-------------------------------------------+
+|                                   | Featurizer                                |
++-----------------------------------+-------------------------------------------+
+| class                             | dscribe.descriptors.soap.SOAP             |
+| kwargs                            | {'rcut': 7.0, 'nmax': 8, 'lmax': 8}       |
+| species list                      | ['Fe', 'Ni', 'Pt', 'Pd', 'Au', 'Cu', 'H'] |
+| maximum structure size            | 37                                        |
+| preset                            | None                                      |
+| design space structures provided? | True                                      |
++-----------------------------------+-------------------------------------------+
+```
 
-Let's featurize the hydrogen environment via `dscribe`'s `SOAP` class with
-`sklearn`'s `GaussianProcessRegressor` for regression.
+Similarly, we can specify the regressor to be used. The class should
+be "`sklearn`-like" with `fit` and `predict` methods.
 
+Here we will use `sklearn`'s `GaussianProcessRegressor` for regression.
 ```py
 >>> from sklearn.gaussian_process import GaussianProcessRegressor
 >>> from sklearn.gaussian_process.kernels import RBF
->>> from dscribe import SOAP
->>> from autocat.learning.predictors import Predictor
 >>> kernel = RBF(1.5)
->>> model_kwargs={"kernel": kernel}
->>> featurization_kwargs={
-...     "design_space_structures": design_space.design_space_structures,
-...     "kwargs": {"rcut": 7.0, "nmax": 8, "lmax": 8}
-... }
+>>> regressor = GaussianProcessRegressor(kernel=kernel)
+```
+
+Now that we have both our `Featurizer` and regressor, we can construct
+a `Predictor` object.
+
+```py
+>>> from autocat.learning.predictors import Predictor
 >>> predictor = Predictor(
-...     model_class=GaussianProcessRegressor,
-...     model_kwargs=model_kwargs,
-...     featurizer_class=SOAP,
-...     featurization_kwargs=featurization_kwargs,
+...     regressor=regressor,
+...     featurizer=featurizer,
 ... )
+>>> predictor
++-----------+------------------------------------------------------------------+
+|           | Predictor                                                        |
++-----------+------------------------------------------------------------------+
+| regressor | <class 'sklearn.gaussian_process._gpr.GaussianProcessRegressor'> |
+| is fit?   | False                                                            |
++-----------+------------------------------------------------------------------+
++-----------------------------------+-------------------------------------------+
+|                                   | Featurizer                                |
++-----------------------------------+-------------------------------------------+
+| class                             | dscribe.descriptors.soap.SOAP             |
+| kwargs                            | {'rcut': 7.0, 'nmax': 8, 'lmax': 8}       |
+| species list                      | ['Fe', 'Ni', 'Pt', 'Pd', 'Au', 'Cu', 'H'] |
+| maximum structure size            | 37                                        |
+| preset                            | None                                      |
+| design space structures provided? | True                                      |
++-----------------------------------+-------------------------------------------+
 ```
 
 ## Training and making predictions
 
 With our newly defined `Predictor` we can train it using data from our
-`DesignSpace` and the `fit` method.
+`DesignSpace` and the `fit` method. Again, please note we are using random labels
+here, solely for demonstration purposes.
 
 ```py
 >>> train_structures = design_space.design_space_structures[:5]
````
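The requirement that the regressor be "`sklearn`-like" is duck typing: any object exposing `fit` and `predict` should slot in. A minimal hypothetical regressor satisfying that contract (illustrative only; not part of AutoCat or `sklearn`):

```python
class MeanRegressor:
    """A minimal "sklearn-like" regressor: anything with fit/predict works."""
    def fit(self, X, y):
        # Ignore the features entirely; just remember the mean target.
        self.mean_ = sum(y) / len(y)
        return self
    def predict(self, X):
        return [self.mean_ for _ in X]

reg = MeanRegressor().fit([[0], [1], [2]], [1.0, 2.0, 3.0])
print(reg.predict([[5], [6]]))  # -> [2.0, 2.0]
```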

docs/Tutorials/sl.md

Lines changed: 115 additions & 22 deletions
````diff
@@ -10,13 +10,13 @@ the structures will be clean mono-elemental surfaces which we can generate via
 ```py
 >>> # Generate the clean surfaces
 >>> from autocat.surface import generate_surface_structures
->>> from autocat.utils import extract_structures
+>>> from autocat.utils import flatten_structures_dict
 >>> surfs_dict = generate_surface_structures(
 ...     ["Pt", "Cu", "Li", "Ti"],
 ...     n_fixed_layers=2,
 ...     default_lat_param_lib="pbe_fd"
 ... )
->>> surfs = extract_structures(surfs_dict)
+>>> surfs = flatten_structures_dict(surfs_dict)
 ```
 
 In this case we specified that the default lattice parameters
````
````diff
@@ -30,14 +30,24 @@ to your design space!
 ```py
 >>> # Generate the labels for each structure
 >>> import numpy as np
->>> labels = np.random.uniform(-1.5,1.5,size=len(ads_structs))
+>>> labels = np.random.uniform(-1.5,1.5,size=len(surfs))
 ```
 
 Taking the structures and labels we can define our `DesignSpace`.
 
 ```py
 >>> from autocat.learning.sequential import DesignSpace
 >>> design_space = DesignSpace(surfs, labels)
+>>> design_space
++-------------------------+--------------------------+
+|                         | DesignSpace              |
++-------------------------+--------------------------+
+| total # of systems      | 10                       |
+| # of unlabelled systems | 0                        |
+| unique species present  | ['Pt', 'Cu', 'Li', 'Ti'] |
+| maximum label           | 1.1205404366846423       |
+| minimum label           | -1.3259701029215702      |
++-------------------------+--------------------------+
 ```
 
 ## Doing a single simulated sequential learning run
````
````diff
@@ -51,35 +61,117 @@ returned at the end of the run.
 As before, we will need to make choices with regard to the `Predictor` settings.
 In this case we will use a `SineMatrix` featurizer alongside a `GaussianProcessRegressor`.
 
+```py
+>>> from sklearn.gaussian_process import GaussianProcessRegressor
+>>> from sklearn.gaussian_process.kernels import RBF
+>>> from dscribe.descriptors.sinematrix import SineMatrix
+>>> from autocat.learning.featurizers import Featurizer
+>>> from autocat.learning.predictors import Predictor
+>>> kernel = RBF(1.5)
+>>> regressor = GaussianProcessRegressor(kernel=kernel)
+>>> featurizer = Featurizer(
+...     featurizer_class=SineMatrix,
+...     design_space_structures=design_space.design_space_structures
+... )
+>>> predictor = Predictor(regressor=regressor, featurizer=featurizer)
+>>> predictor
++-----------+------------------------------------------------------------------+
+|           | Predictor                                                        |
++-----------+------------------------------------------------------------------+
+| regressor | <class 'sklearn.gaussian_process._gpr.GaussianProcessRegressor'> |
+| is fit?   | False                                                            |
++-----------+------------------------------------------------------------------+
++-----------------------------------+-------------------------------------------+
+|                                   | Featurizer                                |
++-----------------------------------+-------------------------------------------+
+| class                             | dscribe.descriptors.sinematrix.SineMatrix |
+| kwargs                            | None                                      |
+| species list                      | ['Li', 'Ti', 'Pt', 'Cu']                  |
+| maximum structure size            | 36                                        |
+| preset                            | None                                      |
+| design space structures provided? | True                                      |
++-----------------------------------+-------------------------------------------+
+```
+
 We also need to select parameters with regard to candidate selection.
 This includes the acquisition function to be used,
 target window (if applicable), and number of candidates to pick at each iteration.
+This can be done via the `CandidateSelector` object.
 Let's use a maximum uncertainty acquisition function to pick candidates based on their
-associated uncertainty values. We'll also restrict the run to conduct 5 iterations.
+associated uncertainty values.
+
+```py
+>>> from autocat.learning.sequential import CandidateSelector
+>>> candidate_selector = CandidateSelector(
+...     acquisition_function="MU",
+...     num_candidates_to_pick=1
+... )
+>>> candidate_selector
++-------------------------------+--------------------+
+|                               | Candidate Selector |
++-------------------------------+--------------------+
+| acquisition function          | MU                 |
+| # of candidates to pick       | 1                  |
+| target window                 | None               |
+| include hhi?                  | False              |
+| include segregation energies? | False              |
++-------------------------------+--------------------+
+```
+
+Now we have everything we need to conduct a simulated sequential learning loop.
+We'll restrict the run to conduct 5 iterations.
 
 ```py
->>> from sklearn.gaussian_process import GaussianProcessRegressor
->>> from dscribe import SineMatrix
 >>> from autocat.learning.sequential import simulated_sequential_learning
->>> kernel = RBF(1.5)
->>> model_kwargs = {"kernel": kernel}
->>> featurization_kwargs = {
-...     "design_space_structures": design_space.design_space_structures,
-... }
->>> predictor_kwargs = {
-...     "model_class": GaussianProcessRegressor,
-...     "model_kwargs": model_kwargs,
-...     "featurizer_class": SineMatrix,
-...     "featurization_kwargs": featurization_kwargs
-... }
->>> candidate_selection_kwargs = {"aq": "MU"}
 >>> sim_seq_learn = simulated_sequential_learning(
 ...     full_design_space=design_space,
+...     candidate_selector=candidate_selector,
+...     predictor=predictor,
 ...     init_training_size=1,
 ...     number_of_sl_loops=5,
-...     candidate_selection_kwargs=candidate_selection_kwargs,
-...     predictor_kwargs=predictor_kwargs,
 ... )
+>>> sim_seq_learn
++----------------------------------+--------------------+
+|                                  | Sequential Learner |
++----------------------------------+--------------------+
+| iteration count                  | 6                  |
+| next candidate system structures | ['Cu36']           |
+| next candidate system indices    | [5]                |
++----------------------------------+--------------------+
++-------------------------------+--------------------+
+|                               | Candidate Selector |
++-------------------------------+--------------------+
+| acquisition function          | MU                 |
+| # of candidates to pick       | 1                  |
+| target window                 | None               |
+| include hhi?                  | False              |
+| include segregation energies? | False              |
++-------------------------------+--------------------+
++-------------------------+--------------------------+
+|                         | DesignSpace              |
++-------------------------+--------------------------+
+| total # of systems      | 10                       |
+| # of unlabelled systems | 4                        |
+| unique species present  | ['Pt', 'Cu', 'Li', 'Ti'] |
+| maximum label           | 0.9712050050259604       |
+| minimum label           | -1.3259701029215702      |
++-------------------------+--------------------------+
++-----------+------------------------------------------------------------------+
+|           | Predictor                                                        |
++-----------+------------------------------------------------------------------+
+| regressor | <class 'sklearn.gaussian_process._gpr.GaussianProcessRegressor'> |
+| is fit?   | True                                                             |
++-----------+------------------------------------------------------------------+
++-----------------------------------+-------------------------------------------+
+|                                   | Featurizer                                |
++-----------------------------------+-------------------------------------------+
+| class                             | dscribe.descriptors.sinematrix.SineMatrix |
+| kwargs                            | None                                      |
+| species list                      | ['Li', 'Ti', 'Pt', 'Cu']                  |
+| maximum structure size            | 36                                        |
+| preset                            | None                                      |
+| design space structures provided? | True                                      |
++-----------------------------------+-------------------------------------------+
 ```
 
 Within the returned `SequentialLearner` object we now have information we can use
````
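For intuition, the "MU" acquisition function configured above ranks the unlabelled systems by predicted uncertainty and keeps the top `num_candidates_to_pick`. A hedged pure-Python sketch of that selection rule (illustrative only, not AutoCat's implementation):

```python
# Sketch of "MU" (maximum uncertainty) candidate selection.
# `uncertainties` plays the role of the predictor's per-system uncertainty.

def select_mu_candidates(uncertainties, unlabelled_indices, num_candidates_to_pick=1):
    """Return the unlabelled indices with the largest predicted uncertainty."""
    ranked = sorted(unlabelled_indices, key=lambda i: uncertainties[i], reverse=True)
    return ranked[:num_candidates_to_pick]

uncertainties = [0.05, 0.40, 0.10, 0.80, 0.20]
print(select_mu_candidates(uncertainties, [1, 2, 3, 4], num_candidates_to_pick=2))
# -> [3, 1]
```

Restricting the ranking to unlabelled indices is what keeps the loop from re-proposing systems it has already evaluated.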
````diff
@@ -97,12 +189,13 @@ of running in parallel (since this is an embarrassingly parallel operation). Her
 three independent runs in serial.
 
 ```py
+>>> from autocat.learning.sequential import multiple_simulated_sequential_learning_runs
 >>> runs_history = multiple_simulated_sequential_learning_runs(
 ...     full_design_space=design_space,
+...     candidate_selector=candidate_selector,
+...     predictor=predictor,
 ...     init_training_size=1,
 ...     number_of_sl_loops=5,
-...     candidate_selection_kwargs=candidate_selection_kwargs,
-...     predictor_kwargs=predictor_kwargs,
 ...     number_of_runs=3,
 ...     # number_of_parallel_jobs=N if you wanted to run in parallel
 ... )
````

0 commit comments