Commit fcb7d30

iamazeem authored and heuermh committed
Updated README.
1 parent 15510cb commit fcb7d30

1 file changed: +27 -26 lines changed

README.md

Lines changed: 27 additions & 26 deletions
@@ -1,66 +1,67 @@
 # BioJava-Spark
-Algorithms that are built around BioJava and are running on Apache Spark
 
+Algorithms that are built around BioJava and are running on Apache Spark
 
 [![Build Status](https://travis-ci.org/biojava/biojava-spark.svg?branch=master)](https://travis-ci.org/biojava/biojava-spark)
 [![License](http://img.shields.io/badge/license-LGPL_2.1-blue.svg?style=flat)](https://github.com/biojava/biojava/blob/master/LICENSE)
 [![Status](http://img.shields.io/badge/status-experimental-red.svg?style=flat)](https://github.com/biojava/biojava-spark)
 [![Version](http://img.shields.io/badge/version-0.2.1-blue.svg?style=flat)](https://github.com/biojava/biojava-spark/)
 
-# Starting up
+## Starting up
 
 ### Some initial instructions can be found on the mmtf-spark project
-https://github.com/rcsb/mmtf-spark
+https://github.com/sbl-sdsc/mmtf-spark
+
 ## First download and untar a Hadoop sequence file of the PDB (~7 GB download)
+
 ```bash
 wget http://mmtf.rcsb.org/v1.0/hadoopfiles/full.tar
 tar -xvf full.tar
 ```
 Or you can get a C-alpha, phosphate, ligand only version (~800 Mb download)
+
 ```bash
 wget http://mmtf.rcsb.org/v1.0/hadoopfiles/reduced.tar
 tar -xvf reduced.tar
 ```
 ### Second add the biojava-spark dependecy to your pom
 
 ```xml
-<dependency>
-<groupId>org.biojava</groupId>
-<artifactId>biojava-spark</artifactId>
-<version>0.2.1</version>
-</dependency>
+<dependency>
+<groupId>org.biojava</groupId>
+<artifactId>biojava-spark</artifactId>
+<version>0.2.1</version>
+</dependency>
 ```
 
-
-
 ## Extra Biojava examples
 
 ### Do some simple quality filtering
 
 ```java
-float maxResolution = 3.0f;
-float maxRfree = 0.3f;
-StructureDataRDD structureData = new StructureDataRDD("/path/to/file")
-.filterResolution(maxResolution)
-.filterRfree(maxRfree);
+float maxResolution = 3.0f;
+float maxRfree = 0.3f;
+StructureDataRDD structureData = new StructureDataRDD("/path/to/file")
+.filterResolution(maxResolution)
+.filterRfree(maxRfree);
 ```
 
 ### Summarsing the elements in the PDB
+
 ```java
-Map<String, Long> elementCountMap = BiojavaSparkUtils.findAtoms(structureData).countByElement();
+Map<String, Long> elementCountMap = BiojavaSparkUtils.findAtoms(structureData).countByElement();
 ```
 
 ### Finding inter-atomic contacts from the PDB
 
 ```java
-Double mean = BiojavaSparkUtils.findContacts(structureData,
-new AtomSelectObject()
-.groupNameList(new String[] {"PRO","LYS"})
-.elementNameList(new String[] {"C"})
-.atomNameList(new String[] {"CA"}),
-cutoff)
-.getDistanceDistOfAtomInts("CA", "CA")
-.mean();
-System.out.println("\nMean PRO-LYS CA-CA distance: "+mean);
+Double mean = BiojavaSparkUtils.findContacts(structureData,
+new AtomSelectObject()
+.groupNameList(new String[] {"PRO","LYS"})
+.elementNameList(new String[] {"C"})
+.atomNameList(new String[] {"CA"}),
+cutoff)
+.getDistanceDistOfAtomInts("CA", "CA")
+.mean();
+System.out.println("\nMean PRO-LYS CA-CA distance: " + mean);
 ```
-

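The "Do some simple quality filtering" snippet in the README chains a resolution filter and an R-free filter. As a rough illustration of what such a filter does, here is a minimal plain-Java sketch that does not use BioJava-Spark or Spark at all. The `Entry` class, the `filterByQuality` method, and the sample IDs and values are hypothetical, and treating both thresholds as inclusive upper bounds is an assumption, not something the README states.

```java
import java.util.ArrayList;
import java.util.List;

public class QualityFilterSketch {

    /** Hypothetical stand-in for one structure entry; not a BioJava-Spark type. */
    static final class Entry {
        final String id;
        final float resolution; // in Angstroms; lower means better resolved
        final float rFree;      // R-free value; lower means better model fit

        Entry(String id, float resolution, float rFree) {
            this.id = id;
            this.resolution = resolution;
            this.rFree = rFree;
        }
    }

    /**
     * Keeps entries whose resolution and R-free do not exceed the given maxima.
     * Inclusive bounds are an assumption made for this sketch.
     */
    static List<Entry> filterByQuality(List<Entry> entries, float maxResolution, float maxRfree) {
        List<Entry> kept = new ArrayList<>();
        for (Entry e : entries) {
            if (e.resolution <= maxResolution && e.rFree <= maxRfree) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample data, chosen only to exercise both thresholds.
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("A", 1.8f, 0.22f)); // passes both checks
        entries.add(new Entry("B", 3.5f, 0.25f)); // fails the resolution check
        entries.add(new Entry("C", 2.0f, 0.35f)); // fails the R-free check

        for (Entry e : filterByQuality(entries, 3.0f, 0.3f)) {
            System.out.println(e.id);
        }
    }
}
```

With the same thresholds as the README snippet (3.0 and 0.3), only the first sample entry survives. In the README the two conditions are applied as separate chained calls on a `StructureDataRDD`; the sketch folds them into one loop for brevity.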