python/README.md
28 additions & 12 deletions
@@ -1,16 +1,16 @@
1
1
# ibmos2spark
2
2
3
-
The package sets Spark Hadoop configurations for connecting to
3
+
The package sets Spark Hadoop configurations for connecting to
4
4
IBM Bluemix Object Storage and Softlayer Account Object Storage instances. This package uses the new [stocator](https://github.com/SparkTC/stocator) driver, which implements the `swift2d` protocol, and is available
5
-
on the latest IBM Apache Spark Service instances (and through IBM Data Science Experience).
5
+
on the latest IBM Apache Spark Service instances (and through IBM Data Science Experience).
6
6
7
7
8
-
Using the `stocator` driver connects your Spark executor nodes directly
8
+
Using the `stocator` driver connects your Spark executor nodes directly
9
9
to your data in object storage.
10
10
This is an optimized, high-performance method to connect Spark to your data. All IBM Apache Spark kernels
11
-
are instantiated with the `stocator` driver in the Spark kernel's classpath.
12
-
You can also run this locally by installing the [stocator driver](https://github.com/SparkTC/stocator)
13
-
and adding it to your local Apache Spark kernel's classpath.
11
+
are instantiated with the `stocator` driver in the Spark kernel's classpath.
12
+
You can also run this locally by installing the [stocator driver](https://github.com/SparkTC/stocator)
13
+
and adding it to your local Apache Spark kernel's classpath.
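
For a local setup, one way to put the driver on the classpath is to pass the jar to `spark-submit` with `--jars`. The following is a minimal sketch, assuming you have built or downloaded the stocator jar yourself. The jar path and script name are placeholders, and `build_submit_command` is a hypothetical helper, not part of ibmos2spark.

```python
import shlex


def build_submit_command(stocator_jar, script):
    """Return a spark-submit argv that ships the stocator jar with the job.

    --jars adds the listed jars to the driver and executor classpaths.
    """
    return ["spark-submit", "--jars", stocator_jar, script]


# Placeholder paths; substitute the location of your own jar and script.
cmd = build_submit_command("/path/to/stocator.jar", "my_job.py")

# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(part) for part in cmd))
```

The list form can be passed directly to `subprocess.run` if you prefer to launch the job from Python.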
The usage of this package depends on *from where* your Object Storage instance was created. This package
24
-
is intended to connect to IBM's Object Storage instances obtained from Bluemix or Data Science Experience
25
-
(DSX) or from a separate account on IBM Softlayer. The instructions below show how to connect to
26
-
either type of instance.
24
+
is intended to connect to IBM's Object Storage instances (Swift OS). These instances can be obtained from Bluemix or Data Science Experience (DSX), or from a separate account on IBM Softlayer. The package also supports IBM Cloud Object Storage (COS).
25
+
The instructions below show how to connect to each type of instance.
27
26
28
27
The connection setup is essentially the same; what differs is how you deliver the
29
28
credentials. If your Object Storage was created with Bluemix/DSX, with a few clicks on the side-tab
30
29
within a DSX Jupyter notebook, you can obtain your account credentials in the form of a Python dictionary.
31
30
If your Object Storage was created with a Softlayer account, each part of the credentials will
32
-
be found as text that you can copy and paste into the example code below.
31
+
be found as text that you can copy and paste into the example code below.
32
+
33
+
### CloudObjectStorage / Data Science Experience
34
+
```python
35
+
import ibmos2spark
36
+
37
+
credentials = {
38
+
'endpoint': 'https://s3-api.objectstorage.softlayer.net/',  # example only; your endpoint URL may differ
39
+
'access_key': '',
40
+
'secret_key': ''
41
+
}
42
+
43
+
cos = ibmos2spark.CloudObjectStorage(sc, credentials)  # sc is the SparkContext instance
44
+
45
+
bucket_name = 'some_bucket_name'
46
+
object_name = 'file1'
47
+
data = sc.textFile(cos.url(object_name, bucket_name))
48
+
```
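
Because the example above ships with empty `access_key` and `secret_key` values, it can help to check the dictionary before handing it to `CloudObjectStorage`. This is a minimal sketch; `missing_cos_credentials` is a hypothetical helper, not part of ibmos2spark, and it only assumes the three keys shown in the example.

```python
# The three keys used in the CloudObjectStorage example.
REQUIRED_KEYS = ("endpoint", "access_key", "secret_key")


def missing_cos_credentials(credentials):
    """Return the required keys that are absent or empty in `credentials`."""
    return [key for key in REQUIRED_KEYS if not credentials.get(key)]


credentials = {
    "endpoint": "https://s3-api.objectstorage.softlayer.net/",  # example only
    "access_key": "",
    "secret_key": "",
}

missing = missing_cos_credentials(credentials)
if missing:
    print("Fill in these credential fields first:", ", ".join(missing))
```

Running the check before creating the connection object surfaces a forgotten key early, rather than as a failure when Spark first tries to read from the bucket.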
33
49
34
50
### Bluemix / Data Science Experience
35
51
36
52
```python
37
53
import ibmos2spark
38
54
39
-
#To obtain these credentials in IBM Spark, click the "insert to code"
55
+
#To obtain these credentials in IBM Spark, click the "insert to code"
40
56
#button below your data source found on the panel to the right of your notebook.
41
57
42
58
credentials = {
@@ -78,7 +94,7 @@ data = sc.textFile(slos.url(container_name, object_name))