- ## P2B2: Autoencoder Compressed Representation for Molecular Dynamics Simulation Data
+ ## P2B2: Predictive, Recurrent Autoencoder Compressed Representation for Molecular Dynamics Simulation Data
**Overview**: Cut down on manual inspection time for molecular simulation data
### Benchmark Specs Requirements
#### Description of the Data
- * Data source: MD simulation output as PDB files (coarse-grained bead simulation)
- * Input dimensions: ~1.26e6 per time step (6000 lipids x 30 beads per lipid x (position + velocity + type))
- * Output dimensions: 1 x N_Frame (N=100 hidden units)
- * Latent representation dimension:
- * Sample size: O(10^6) for simulation requiring O(10^8) time steps
- * Notes on data balance and other issues: unlabeled data with rare events
+ * See the Pilot2 README for a description of the data
#### Expected Outcomes
* 'Telescope' into data: Find regions of interest that exhibit a higher level of structure than the rest of the data
@@ -35,5 +30,46 @@ Using virtualenv
```
cd P2B2
workon keras
- python __main__.py --home-dir=${HOME}/.virtualenvs/keras/lib/python2.7/site-packages --look-back 15 --train --epochs 20 --learning-rate 0.01 --cool --seed --batch-size 10 --seed
+ python p2b2_baseline_keras1.py
```
+ ### Scaling Options
+ * ```--case=FULL``` Design autoencoder for data frame with coordinates for all beads
+ * ```--case=CENTER``` Design autoencoder for data frame with coordinates of the center-of-mass
+ * ```--case=CENTERZ``` Design autoencoder for data frame with z-coordinate of the center-of-mass
+
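As a rough illustration of what each case selects from a single frame, here is a minimal sketch. The `select_features` helper, the `(num_lipids, num_beads, 3)` frame layout, and the unweighted (unit-mass) center of mass are assumptions for illustration only, not code from the benchmark.

```python
import numpy as np

def select_features(frame, case="CENTERZ"):
    """Hypothetical helper showing what each --case option would select.

    `frame` is assumed to be an array of shape (num_lipids, num_beads, 3)
    holding bead x/y/z coordinates; the center of mass is approximated as
    the unweighted mean over beads.
    """
    if case == "FULL":
        return frame.reshape(-1)                # every bead coordinate
    if case == "CENTER":
        return frame.mean(axis=1).reshape(-1)   # x/y/z of each lipid's center of mass
    if case == "CENTERZ":
        return frame.mean(axis=1)[:, 2]         # only the z-coordinate of each center of mass
    raise ValueError("unknown case: %s" % case)

frame = np.random.rand(3040, 30, 3)             # e.g. a ~3k-lipid system, 30 beads per lipid
print(select_features(frame, "FULL").shape)     # (273600,)
print(select_features(frame, "CENTER").shape)   # (9120,)
print(select_features(frame, "CENTERZ").shape)  # (3040,)
```

The 3040-feature frames in the example run below appear consistent with a CENTERZ-style reduction of a ~3k-lipid system, where each lipid contributes a single value per frame.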
+ ### Expected Results
+
+ (keras) vanessen1@vandamme:~/Research/DeepLearning/ECP CANDLE/Benchmarks/Benchmarks.git/Pilot2/P2B2$ python p2b2_baseline_keras1.py
+ Using Theano backend.
+ {'num_hidden': [], 'num_recurrent': [16, 16, 16], 'noise_factor': 0, 'learning_rate': 0.01, 'batch_size': 32, 'look_forward': 1, 'epochs': 1, 'weight_decay': 0.0005, 'look_back': 10, 'cool': 'True'}
+ Reading Data...
+ Reading Data Files... 3k_Disordered->3k_run10_10us.35fs-DPPC.10-DOPC.70-CHOL.20.dir
+ ('X_train type and shape:', dtype('float64'), (89, 10, 3040))
+ ('X_train.min():', 38.831248919169106)
+ ('X_train.max():', 100.46649742126465)
+ Define the model and compile
+ using mlp network
+ Autoencoder Regression problem
+ ____________________________________________________________________________________________________
+ Layer (type)                     Output Shape          Param #     Connected to
+ ====================================================================================================
+ input_1 (InputLayer)             (None, 10, 3040)      0
+ ____________________________________________________________________________________________________
+ timedistributed_1 (TimeDistribut (None, 10, 3040)      9244640     input_1[0][0]
+ ====================================================================================================
+ Total params: 9,244,640
+ Trainable params: 9,244,640
+ Non-trainable params: 0
+ ____________________________________________________________________________________________________
+ 0%|                                    | 0/1 [00:00<?, ?it/s]
+ Loss on epoch 0: 47.9408
+ 100%|------------------------------------| 1/1 [00:35<00:00, 35.68s/it]
+ Cooling Learning Rate by factor of 10...
+ 0%|                                    | 0/1 [00:00<?, ?it/s]
+ Loss on epoch 0: 30.2241
+ 100%|------------------------------------| 1/1 [00:35<00:00, 35.29s/it]
+ Cooling Learning Rate by factor of 10...
+ 0%|                                    | 0/1 [00:00<?, ?it/s]
+ Loss on epoch 0: 23.4605
+ 100%|------------------------------------| 1/1 [00:35<00:00, 35.72s/it]
+ (keras)
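To connect the printed configuration to the summary above, the following is a minimal sketch, not the benchmark's actual code: with `num_hidden` empty, the model reduces to a single `TimeDistributed` dense layer that maps each of the `look_back = 10` frames of 3,040 features back to 3,040 outputs, reproducing the 3040 x 3040 + 3040 = 9,244,640 parameter count. The optimizer, loss, and activation below are assumptions; see `p2b2_baseline_keras1.py` for the real model.

```python
# Minimal sketch only: reproduces the input shape and parameter count shown in
# the summary above, assuming 10-frame windows of 3040 features each; the
# optimizer, loss, and linear activation are assumptions, not the benchmark's.
from keras.layers import Input, Dense, TimeDistributed
from keras.models import Model

look_back, frame_size = 10, 3040  # window length and features per frame

frames_in = Input(shape=(look_back, frame_size))
# The same Dense(3040) is applied independently to every frame in the window:
# 3040 * 3040 weights + 3040 biases = 9,244,640 parameters.
frames_out = TimeDistributed(Dense(frame_size, activation='linear'))(frames_in)

autoencoder = Model(frames_in, frames_out)
autoencoder.compile(optimizer='rmsprop', loss='mean_squared_error')
autoencoder.summary()  # should report 9,244,640 trainable parameters
```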