This repository was archived by the owner on Sep 11, 2023. It is now read-only.
  
  
Object Storage
Frank Noe edited this page Apr 16, 2016 · 10 revisions
What? Implement a method to easily save/load PyEMMA high-level objects to/from disk.
Why? To facilitate interactive or scripted analysis with large datasets. Without adaptations, the behavior of the pickle module is not well defined, because it is not defined a priori which attributes are data belonging to an object and which are just links to other resources.
- Estimation parameters get_params() and set_params(): Input parameters used to construct the estimator object.
- Estimation state get_state() and set_state(): State variables set by estimation. This includes estimates that connect data and model, such as convergence information.
Can be mixed in to estimator or standalone.
- Model parameters get_model_params() and set_model_params(): Estimated or set parameters of the model.
All of these are subclasses of Model and inherit the model I/O properties.
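The three groups of accessors above can be sketched as follows. This is only an illustration under stated assumptions: ToyEstimator and ToyModel are hypothetical names, not PyEMMA classes, and the estimate is a plain row-normalized count matrix rather than any PyEMMA estimator.

```python
import numpy as np


class ToyModel:
    """Hypothetical model: holds only estimated (or user-set) parameters."""

    def __init__(self, P=None):
        self.P = P

    def get_model_params(self):
        return {"P": self.P}

    def set_model_params(self, P=None):
        self.P = P


class ToyEstimator:
    """Hypothetical estimator separating parameters, state and model."""

    def __init__(self, lag=1):
        self.lag = lag            # estimation parameter (constructor input)
        self.count_matrix = None  # estimation state, filled by estimate()
        self.model = ToyModel()   # model parameters live in the model

    def get_params(self):
        return {"lag": self.lag}

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)

    def get_state(self):
        return {"count_matrix": self.count_matrix}

    def set_state(self, **state):
        for name, value in state.items():
            setattr(self, name, value)

    def estimate(self, dtraj):
        dtraj = np.asarray(dtraj)
        n = dtraj.max() + 1
        C = np.zeros((n, n))
        # count transitions i -> j observed at the chosen lag
        np.add.at(C, (dtraj[:-self.lag], dtraj[self.lag:]), 1)
        self.count_matrix = C
        # row-normalize; assumes every state has at least one outgoing count
        self.model.set_model_params(P=C / C.sum(axis=1, keepdims=True))
        return self


est = ToyEstimator(lag=1).estimate([1, 0, 0, 0, 1, 1, 0])
```

With this split, saving only est.model.get_model_params() stores the model alone, while adding get_params() and get_state() stores the full parametrized estimator.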
Estimator/Model save and load:
    import pyemma
    from pyemma import msm
    # save parametrized estimator
    mle = msm.estimate_markov_model([1, 0, 0, 0, 1, 1, 0], 1)
    mle.save('msm_mle.pyemma')
    # load parametrized estimator
    mle_recovered = pyemma.load('msm_mle.pyemma')
    mle_recovered.cktest(2)  # this works if estimation data was stored too
    # save just the model
    mle.model.save('msm.pyemma')
    # load just the model
    msm_recovered = pyemma.load('msm.pyemma')
    print(msm_recovered.stationary_distribution)  # this works with model parameters alone

We can implement object storage with the pickle or cPickle modules.
- __getstate__() and __setstate__() need to be overloaded in order to save/load the desired content of Estimators, Models etc.
- Does pickle have efficient protocols (compressed and fast)? Compare to np.savez_compressed.
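A minimal sketch of the pickle route, assuming a hypothetical ToyModel class (not a PyEMMA class) with one stored parameter and one cache that should not be written to disk. The pickle protocols themselves do not compress, so the sketch layers zlib from the standard library on top as one possible way to get compressed output.

```python
import pickle
import zlib

import numpy as np


class ToyModel:
    """Hypothetical model with one stored parameter and one derived cache."""

    def __init__(self, P):
        self.P = np.asarray(P, dtype=float)
        self._cache = {}  # derived quantities, not worth storing

    def __getstate__(self):
        # store only the data that belongs to the object
        return {"P": self.P}

    def __setstate__(self, state):
        self.P = state["P"]
        self._cache = {}  # start with an empty cache after loading


model = ToyModel([[0.9, 0.1], [0.2, 0.8]])
model._cache["note"] = "not pickled"

# binary protocol, uncompressed
blob = pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)

# same bytes with zlib compression layered on top
compressed = zlib.compress(blob)
restored_from_compressed = pickle.loads(zlib.decompress(compressed))
```

Whether this is competitive with np.savez_compressed in size and speed would still have to be measured on realistic PyEMMA objects.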
We can implement object storage with np.savez_compressed and np.load
- suggested here: http://www.benfrederickson.com/dont-pickle-your-data/
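A minimal sketch of the NumPy route, assuming the model parameters are available as a dict of array-like values. The helper names save_params and load_params and the file name are hypothetical and chosen only for illustration.

```python
import os
import tempfile

import numpy as np


def save_params(path, params):
    """Write a dict of array-like parameters to a compressed .npz archive."""
    np.savez_compressed(path, **params)


def load_params(path):
    """Read the archive back into a plain dict of arrays."""
    with np.load(path) as archive:
        return {name: archive[name] for name in archive.files}


# example parameters; scalars are stored as 0-d arrays
params = {"P": np.array([[0.9, 0.1], [0.2, 0.8]]), "lag": np.array(1)}
path = os.path.join(tempfile.mkdtemp(), "msm_params.npz")
save_params(path, params)
loaded = load_params(path)
```

This only covers values that map cleanly onto arrays; rebuilding the actual estimator or model object from the loaded dict would need a constructor or a set_model_params() style call on top.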