Commit 8267929

PYTHONSDK-97: Adding documentation and code samples for new helper functions.
1 parent: 6b6c2c1

5 files changed (+229, −21 lines)

README.md

Lines changed: 26 additions & 21 deletions

````diff
@@ -1,29 +1,20 @@
-Spectra S3 Python3 SDK
---------------
-
+# Spectra S3 Python3 SDK
 [![Apache V2 License](http://img.shields.io/badge/license-Apache%20V2-blue.svg)](https://github.com/SpectraLogic/ds3_python3_sdk/blob/master/LICENSE.md)
 
 An SDK conforming to the Spectra S3 [specification](https://developer.spectralogic.com/doc/ds3api/1.2/wwhelp/wwhimpl/js/html/wwhelp.htm) for Python 3.6
 
-Contact Us
-----------
-
+## Contact Us
 Join us at our [Google Groups](https://groups.google.com/d/forum/spectralogicds3-sdks) forum to ask questions, or see frequently asked questions.
 
-Installing
-----------
-
+## Installing
 To install the ds3_python3_sdk, either clone the latest code, or download a release bundle from [Releases](http://github.com/SpectraLogic/ds3_python3_sdk/releases). Once the code has been downloaded, cd into the bundle, and install it with `sudo python3 setup.py install`
 
 Once `setup.py` completes, the ds3_python3_sdk should be installed and available to be imported into python scripts.
 
-Documentation
--------------
+## Documentation
 The documentation for the SDK can be found at [http://spectralogic.github.io/ds3_python3_sdk/sphinx/v3.4.1/](http://spectralogic.github.io/ds3_python3_sdk/sphinx/v3.4.1/)
 
-SDK
----
-
+## SDK
 The SDK provides an interface for a user to add Spectra S3 functionality to their existing or new python application. In order to take advantage of the SDK you need to import the `ds3` python package and module. The following is an example that creates a Spectra S3 client from environment variables, creates a bucket, and lists all the buckets that are visible to the user.
 
 ```python
@@ -40,8 +31,7 @@ for bucket in getServiceResponse.result['BucketList']:
     print(bucket['Name'])
 ```
 
-Client
----------
+## Client
 In the ds3_python3_sdk there are two ways that you can create a `Client` instance: environment variables, or manually. `ds3.createClientFromEnv` will create a `Client` using the following environment variables:
 
 * `DS3_ENDPOINT` - The URL to the DS3 Endpoint
@@ -61,10 +51,27 @@ client = ds3.Client("endpoint", ds3.Credentials("access_key", "secret_key"))
 
 The proxy URL can be passed in as the named parameter `proxy` to `Client()`.
 
-Putting Data
-------------
+## Examples Communicating with the BP
+
+[An example of using getService and getBucket to list all accessible buckets and objects](samples/listAll.py)
 
-To put data to a Spectra S3 appliance you have to do it inside of the context of what is called a Bulk Job. Bulk Jobs allow the Spectra S3 appliance to plan how data should land to cache, and subsequently get written/read to/from tape. The basic flow of every job is:
+### HELPERS: Simple way of moving data to/from a file system
+There are helper utilities for putting and getting data to a BP. These are designed to simplify the user workflow so
+that you don't have to worry about BP job management. The helpers will create BP jobs as necessary, and transfer data
+in parallel to improve performance.
+
+#### How to move everything:
+- [An example of putting ALL files in a directory to a BP bucket](samples/putting_all_files_in_directory.py)
+- [An example of getting ALL objects in a bucket and landing them in a directory](samples/getting_all_objects_in_bucket.py)
+
+#### How to move some things:
+If you only want to move some items in a directory/bucket, you can specify them individually. These examples show how
+to put and get a specific file, but the principle can be expanded to transferring multiple items at once.
+- [An example of putting ONE file to a BP bucket](samples/putting_one_file_in_directory.py)
+- [An example of getting ONE object in a bucket](samples/getting_one_file_in_directory.py)
+
+### Moving data the old way
+To put data to a Spectra S3 appliance you have to do it inside the context of what is called a Bulk Job. Bulk Jobs allow the Spectra S3 appliance to plan how data should land to cache, and subsequently get written/read to/from tape. The basic flow of every job is:
 
 * Generate the list of objects that will either be sent to or retrieved from Spectra S3
 * Send a bulk put/get to Spectra S3 to plan the job
@@ -76,6 +83,4 @@ To put data to a Spectra S3 appliance you have to do it inside of the context of
 
 [An example of getting data with the Python SDK](samples/gettingData.py)
 
-[An example of using getService and getBucket to list all accessible buckets and objects](samples/listAll.py)
-
 [An example of how to give objects on the server a different name than what is on the filesystem, and how to delete objects by folder](samples/renaming.py)
````
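The bulk-job flow described above (build an object list, plan the job, then move data chunk by chunk) can be sketched in plain Python. `plan_chunks` and its parameters are illustrative names for the planning step, not part of the ds3 API:

```python
from typing import Dict, List, Tuple


def plan_chunks(objects: Dict[str, int], max_chunk_bytes: int) -> List[List[Tuple[str, int]]]:
    """Illustrative stand-in for the appliance's job planning: group
    (name, size) pairs into chunks whose total stays under max_chunk_bytes."""
    chunks: List[List[Tuple[str, int]]] = []
    current: List[Tuple[str, int]] = []
    current_bytes = 0
    for name, size in sorted(objects.items()):
        # Start a new chunk when adding this object would exceed the limit.
        if current and current_bytes + size > max_chunk_bytes:
            chunks.append(current)
            current, current_bytes = [], 0
        current.append((name, size))
        current_bytes += size
    if current:
        chunks.append(current)
    return chunks


# Three small "objects"; with a 100-byte chunk limit they split into two chunks.
plan = plan_chunks({"beowulf.txt": 60, "sherlock.txt": 50, "ulysses.txt": 30},
                   max_chunk_bytes=100)
for i, chunk in enumerate(plan):
    print("chunk", i, [name for name, _ in chunk])
```

In the real SDK the appliance performs this planning server-side; the sketch only shows why a job is transferred as a sequence of chunks rather than one monolithic stream.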
samples/getting_all_objects_in_bucket.py

Lines changed: 50 additions & 0 deletions

```python
# Copyright 2021 Spectra Logic Corporation. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License"). You may not use
# this file except in compliance with the License. A copy of the License is located at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# or in the "license" file accompanying this file.
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.

import tempfile

from os import path, walk
from ds3 import ds3, ds3Helpers

# This example gets ALL objects within the bucket books and lands them in a temp folder.
# It uses the new helper functions, which create and manage the BP jobs behind the scenes.
#
# This assumes that there exists a bucket called books on the BP and that it contains objects.
# Running the putting_all_files_in_directory.py example will create this setup.

# The bucket that contains the objects.
bucket_name = "books"

# The directory on the file system where the objects will be landed.
# In this example, we are using a temporary directory for easy cleanup.
destination_directory = tempfile.TemporaryDirectory(prefix="books-dir")

# Create a client which will be used to communicate with the BP.
client = ds3.createClientFromEnv()

# Create the helper to gain access to the new data movement utilities.
helper = ds3Helpers.Helper(client=client)

# Retrieve all the objects in the desired bucket and land them in the specified directory.
#
# You can optionally specify an objects_per_bp_job and max_threads to tune performance.
get_job_ids = helper.get_all_files_in_bucket(destination_dir=destination_directory.name, bucket=bucket_name)
print("BP get job IDs: " + str(get_job_ids))

# Verify that all the files have been landed in the folder.
for root, dirs, files in walk(top=destination_directory.name):
    for name in files:
        print("File: " + path.join(root, name))
    for name in dirs:
        print("Dir: " + path.join(root, name))

# Clean up the temp directory where we landed the files.
destination_directory.cleanup()
```
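The helpers transfer data in parallel, and the sample above notes that a `max_threads` knob is available for tuning. The effect of such a knob can be sketched with the standard library alone; `download_object` below is a placeholder for per-object work, not an SDK call:

```python
from concurrent.futures import ThreadPoolExecutor


def download_object(name: str) -> str:
    # Placeholder for the per-object transfer the helper runs in parallel.
    return "downloaded " + name


object_names = ["beowulf.txt", "sherlock.txt", "ulysses.txt"]

# A max_threads-style knob: max_workers bounds how many objects move at once.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(download_object, object_names))

print(results)
```

Raising the worker count helps when transfers are I/O-bound, but past the appliance's capacity extra threads add no throughput, which is why the knob is worth tuning rather than maximizing.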
samples/getting_one_file_in_directory.py

Lines changed: 57 additions & 0 deletions

```python
# Copyright 2021 Spectra Logic Corporation. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License"). You may not use
# this file except in compliance with the License. A copy of the License is located at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# or in the "license" file accompanying this file.
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.

import tempfile

from os import path, walk
from ds3 import ds3, ds3Helpers

# This example gets ONE object from the bucket books and lands it in a temp folder.
# It uses the new helper functions, which create and manage the BP job behind the scenes.
#
# This assumes that there exists a bucket called books on the BP and that it contains the object beowulf.txt.
# Running the putting_one_file_in_directory.py example will create this setup.

# The bucket that contains the objects.
bucket_name = "books"

# The directory on the file system where the object will be landed.
# In this example, we are using a temporary directory for easy cleanup.
destination_directory = tempfile.TemporaryDirectory(prefix="books-dir")

# Create a client which will be used to communicate with the BP.
client = ds3.createClientFromEnv()

# Create the helper to gain access to the new data movement utilities.
helper = ds3Helpers.Helper(client=client)

# Create a HelperGetObject for each item you want to retrieve from the BP bucket.
# This example only gets one object, but you can transfer more than one at a time.
# For each object you must specify the name of the object on the BP, and the file path where you want to land the file.
# Optionally, if versioning is enabled on your bucket, you can specify which version to retrieve.
# If you don't specify a version, the most recent one will be retrieved.
file_path = path.join(destination_directory.name, "beowulf.txt")
get_objects = [ds3Helpers.HelperGetObject(object_name="beowulf.txt", destination_path=file_path)]

# Retrieve the objects from the desired bucket.
# You can optionally specify max_threads to tune performance.
get_job_id = helper.get_objects(get_objects=get_objects, bucket=bucket_name)
print("BP get job ID: " + get_job_id)

# Verify that all the files have been landed in the folder.
for root, dirs, files in walk(top=destination_directory.name):
    for name in files:
        print("File: " + path.join(root, name))
    for name in dirs:
        print("Dir: " + path.join(root, name))

# Clean up the temp directory where we landed the files.
destination_directory.cleanup()
```
samples/putting_all_files_in_directory.py

Lines changed: 47 additions & 0 deletions

```python
# Copyright 2021 Spectra Logic Corporation. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License"). You may not use
# this file except in compliance with the License. A copy of the License is located at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# or in the "license" file accompanying this file.
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.

import os

from ds3 import ds3, ds3Helpers

# This example puts ALL files within the sub-folder /samples/resources to the bucket called books.
# It uses the new helper functions, which create and manage the BP jobs behind the scenes.

# The bucket where the files will be landed.
bucket_name = "books"

# The directory that contains the files to be archived to the BP.
# In this example, we are moving all files in the ds3_python3_sdk/samples/resources folder.
directory_with_files = os.path.join(os.path.dirname(str(__file__)), "resources")

# Create a client which will be used to communicate with the BP.
client = ds3.createClientFromEnv()

# Make sure the bucket that we will be sending objects to exists.
client.put_bucket(ds3.PutBucketRequest(bucket_name))

# Create the helper to gain access to the new data movement utilities.
helper = ds3Helpers.Helper(client=client)

# Archive all the files in the desired directory to the specified bucket.
# Note that the files' object names will be relative to the root directory you specified.
# For example: resources/beowulf.txt will be named just beowulf.txt in the BP bucket.
#
# You can optionally specify an objects_per_bp_job and max_threads to tune performance.
put_job_ids = helper.put_all_objects_in_directory(source_dir=directory_with_files, bucket=bucket_name)
print("BP put job IDs: " + str(put_job_ids))

# We now verify that all our objects have been sent to DS3.
bucketResponse = client.get_bucket(ds3.GetBucketRequest(bucket_name))

for obj in bucketResponse.result['ContentsList']:
    print(obj['Key'])
```
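The relative-naming rule in the comments above (resources/beowulf.txt becomes just beowulf.txt) can be sketched with `os.path.relpath`; the little directory tree below is constructed purely for the demonstration:

```python
import os
import tempfile

# Build a small source tree: <root>/beowulf.txt and <root>/poems/iliad.txt.
root = tempfile.TemporaryDirectory(prefix="resources-")
os.makedirs(os.path.join(root.name, "poems"))
for rel in ("beowulf.txt", os.path.join("poems", "iliad.txt")):
    with open(os.path.join(root.name, rel), "w") as f:
        f.write("text")

# Object names are the file paths made relative to the chosen root,
# normalized to forward slashes as object-store keys usually are.
object_names = []
for dirpath, _, filenames in os.walk(root.name):
    for filename in filenames:
        full = os.path.join(dirpath, filename)
        object_names.append(os.path.relpath(full, start=root.name).replace(os.sep, "/"))

print(sorted(object_names))  # → ['beowulf.txt', 'poems/iliad.txt']
root.cleanup()
```

Choosing a deeper or shallower root therefore changes the resulting object names, which matters if other tools expect a particular key layout in the bucket.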
samples/putting_one_file_in_directory.py

Lines changed: 49 additions & 0 deletions

```python
# Copyright 2021 Spectra Logic Corporation. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License"). You may not use
# this file except in compliance with the License. A copy of the License is located at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# or in the "license" file accompanying this file.
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.

import os

from ds3 import ds3, ds3Helpers

# This example puts ONE file, /samples/resources/beowulf.txt, to the bucket called books.
# It uses the new helper functions, which create and manage a single BP job.

# The bucket where the file will be landed.
bucket_name = "books"

# The file path being put to the BP.
file_path = os.path.join(os.path.dirname(str(__file__)), "resources", "beowulf.txt")

# Create a client which will be used to communicate with the BP.
client = ds3.createClientFromEnv()

# Make sure the bucket that we will be sending objects to exists.
client.put_bucket(ds3.PutBucketRequest(bucket_name))

# Create the helper to gain access to the new data movement utilities.
helper = ds3Helpers.Helper(client=client)

# Create a HelperPutObject for each item you want to send to the BP.
# This example only puts one file, but you can send more than one at a time.
# For each object you must specify the name it will be called on the BP, the file path, and the size of the file.
file_size = os.path.getsize(file_path)
put_objects = [ds3Helpers.HelperPutObject(object_name="beowulf.txt", file_path=file_path, size=file_size)]

# Archive the file to the specified bucket.
# You can optionally specify max_threads to tune performance.
put_job_id = helper.put_objects(put_objects=put_objects, bucket=bucket_name)
print("BP put job ID: " + put_job_id)

# We now verify that all our objects have been sent to DS3.
bucketResponse = client.get_bucket(ds3.GetBucketRequest(bucket_name))

for obj in bucketResponse.result['ContentsList']:
    print(obj['Key'])
```
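Each `HelperPutObject` above carries three fields: an object name, a file path, and a size. Collecting those fields for a batch of files can be sketched with the standard library; `PutSpec` and `build_put_specs` are illustrative stand-ins, not the SDK's class or helper:

```python
import os
import tempfile
from dataclasses import dataclass
from typing import List


@dataclass
class PutSpec:
    # Illustrative stand-in for HelperPutObject's three fields.
    object_name: str
    file_path: str
    size: int


def build_put_specs(paths: List[str]) -> List[PutSpec]:
    # Derive the object name from the file name and read the size from disk.
    return [
        PutSpec(object_name=os.path.basename(p), file_path=p, size=os.path.getsize(p))
        for p in paths
    ]


# Demo: one temp file with 12 bytes of content.
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as f:
    f.write("hello books!")
    temp_path = f.name

specs = build_put_specs([temp_path])
print(specs[0].size)  # → 12
os.remove(temp_path)
```

Precomputing the size up front, as the sample does with `os.path.getsize`, lets the appliance plan the bulk job before any bytes are transferred.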