Commit 36b5afc

sirosen and ada-globus authored

Break up compute example and make it publishable (#8)

* Break up compute example and make it publishable

  Break the readme up into chunks with GitHub-specific rendering directives, then use the build script to concatenate those chunks and produce a monolithic example page for the docs site.

  In the process, fully review the example documentation, convert it to AsciiDoc, and make stylistic adjustments (most notably, bold terms to follow the docs.globus.org style guidance). Also switch many present-progressive phrasings to imperatives and remove excess descriptive text wherever possible.

  The doc build script itself (and the schema) gained support for arbitrary file inclusions from the example directory, in support of the structure, which does not currently match other existing examples.

* Apply suggestions from code review

Co-authored-by: Ada <107940310+ada-globus@users.noreply.github.com>

1 parent bdef277 commit 36b5afc

13 files changed: +381 −173 lines changed

compute_transfer_examples/README.md

Lines changed: 0 additions & 171 deletions
This file was deleted.
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
# configuration for conversion to docs.globus.org
title: 'Tar and Transfer Files with Compute'
short_description: |
  Use Globus Compute to bundle files into a tarball, which you then transfer
  using Globus Transfer.

  Two examples are included here, one in which the files are located on the
  server which runs Globus Compute, and one in which the files are on a user's
  machine and must be moved to the Compute host.

example_dir: 'compute_tar_and_transfer'
append_source_blocks: false
index_source:
  concat:
    files:
      - 'README.adoc'
      - 'register_function.adoc'
      - 'example_flow1.adoc'
      - 'example_flow2.adoc'
include_files:
  - 'compute_transfer_example_1_definition.json'
  - 'compute_transfer_example_1_schema.json'
  - 'compute_transfer_example_2_definition.json'
  - 'compute_transfer_example_2_schema.json'
  - 'register_compute_function.py'

menu_weight: 400
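The `index_source.concat.files` list drives the build script's concatenation step described in the commit message. The actual build script is not part of this diff; the following is a minimal sketch of what that step might look like, with a hypothetical helper name:

```python
import pathlib

def concat_index_source(example_dir: pathlib.Path, files: list[str]) -> str:
    """Concatenate the configured chunk files into one monolithic page.

    Hypothetical sketch of the docs build script's concat step: chunks are
    read in the order given by index_source.concat.files and joined with a
    blank line so adjacent AsciiDoc blocks stay separated.
    """
    parts = []
    for name in files:
        parts.append((example_dir / name).read_text())
    return "\n".join(parts)
```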
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
= Tar and Transfer with Globus Compute

These examples demonstrate how to build **flow**s that combine Globus Compute and Globus Transfer to process and move data.

Each of these examples creates an archive file from the user's files and transfers that archive to a destination.
In one case the source data is already on the server running Globus Connect Server and Globus Compute, and in the other it is on a source **collection** owned by the end user.

== Prerequisites

To run these examples, you must have a properly configured server and some local software installed.

You must have a co-located Globus Connect Server Collection and Globus Compute **endpoint**, either hosted on the same server or at least with access to a shared filesystem.

Globus Connect Server Collection::
+
You can follow
link:https://docs.globus.org/globus-connect-server/v5.4/[this guide for setting up a Globus Connect Server Collection]
to install Globus Connect Server and configure a **collection**.
+
For ease of use, we recommend using a Guest Collection.

Globus Compute Endpoint::
+
link:https://globus-compute.readthedocs.io/en/latest/endpoints/installation.html[This guide for setting up a Globus Compute Endpoint]
covers installation of the Globus Compute software.
+
This Compute **endpoint** must have read/write permissions on the same storage location where the Globus Connect Server **collection** is hosted.

Globus CLI::
+
You will also need the Globus CLI installed (link:https://docs.globus.org/cli/#installation[CLI installation docs]).
+
The Globus CLI documentation recommends installation with `pipx`, as in `pipx install globus-cli`.

Globus Compute SDK::
+
You must have the `globus-compute-sdk` Python package available.
We strongly recommend using a virtual environment for this installation, installing with `pip install globus-compute-sdk`.
+
You can follow
link:https://globus-compute.readthedocs.io/en/stable/quickstart.html#installation[the Globus Compute install documentation]
to install the Compute SDK client package in a virtualenv.

ifdef::env-github[]
== Next: Learn About the `do_tar` Compute **Function**

link:./register_function.adoc[Register the `do_tar` Compute **Function**.]
endif::[]
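For orientation before the registration step, here is a minimal sketch of what a `do_tar` **function** might look like. The actual function lives in `register_compute_function.py`, which is not shown in this diff; the name, signature, and archive layout here are assumptions:

```python
import os
import tarfile

def do_tar(src_paths, dest_path):
    """Bundle the given files into a gzipped tar archive at dest_path.

    Hypothetical sketch only: the real function is defined in
    register_compute_function.py and registered with Globus Compute.
    """
    # Ensure the output directory (e.g. one named after the run ID) exists.
    os.makedirs(os.path.dirname(dest_path), exist_ok=True)
    with tarfile.open(dest_path, "w:gz") as tf:
        for path in src_paths:
            # arcname keeps only the basename so the archive is flat.
            tf.add(path, arcname=os.path.basename(path))
    return dest_path

# Registration with Globus Compute would look roughly like:
#   from globus_compute_sdk import Client
#   function_id = Client().register_function(do_tar)
```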

compute_transfer_examples/compute_transfer_example_1_definition.json renamed to compute_transfer_examples/tar_and_transfer/compute_transfer_example_1_definition.json

File renamed without changes.

compute_transfer_examples/compute_transfer_example_1_schema.json renamed to compute_transfer_examples/tar_and_transfer/compute_transfer_example_1_schema.json

File renamed without changes.

compute_transfer_examples/compute_transfer_example_2_definition.json renamed to compute_transfer_examples/tar_and_transfer/compute_transfer_example_2_definition.json

File renamed without changes.

compute_transfer_examples/compute_transfer_example_2_schema.json renamed to compute_transfer_examples/tar_and_transfer/compute_transfer_example_2_schema.json

File renamed without changes.
Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
== Example Flow 1

In this first example, the Compute and Transfer **flow** takes a user-provided list of files that already exist in a preconfigured source **collection**.

The **flow** creates a tarfile from those files and transfers the tarfile to a user-provided destination collection.

The **flow** will:

1. Set constants for the **run**
2. Create an output directory named after the **run**'s ID on the source collection
3. Invoke the `do_tar` **function** to create a tar archive from the input source files and save it in the output directory
4. Transfer the resulting tarfile to the destination collection provided in the **flow** input
5. Delete the output directory on the source collection

=== Create the **Flow**

1. Edit `compute_transfer_example_1_definition.json` and replace the placeholder values:

- `gcs_endpoint_id`: The source **collection** ID
- `compute_endpoint_id`: The Compute **endpoint** ID
- `compute_function_id`: The UUID of the registered `do_tar` **function**

If the **collection** has a configured base path, also edit `gcs_base_path`.

2. Create the **flow**:
+
[source,bash,role=clippable-code]
----
globus flows create "Compute and Transfer Flow Example 1" \
    ./compute_transfer_example_1_definition.json \
    --input-schema ./compute_transfer_example_1_schema.json
----

3. Save the **flow** ID returned by this command.
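The **flow** ID can also be captured programmatically rather than copied by hand. This is a sketch assuming the command is run with the CLI's `--format json` option and that its output is a JSON object with an `id` field:

```python
import json

def extract_flow_id(create_output: str) -> str:
    """Pull the flow ID out of JSON-formatted `globus flows create` output.

    Assumes the create command was invoked with `--format json` appended,
    and that the resulting JSON object carries the new flow's UUID in "id".
    """
    return json.loads(create_output)["id"]
```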
ifndef::env-github[]
[.accordionize]
--
.compute_transfer_example_1_definition.json
[%collapsible]
====
[source,json,role=clippable-code]
----
include::compute_transfer_example_1_definition.json[]
----
====
.compute_transfer_example_1_schema.json
[%collapsible]
====
[source,json,role=clippable-code]
----
include::compute_transfer_example_1_schema.json[]
----
====
--
endif::[]
58+
=== Run the **Flow**
59+
60+
1. Create the **flow** input JSON file:
61+
+
62+
[source,json,role=clippable-code]
63+
----
64+
{
65+
"source_paths": ["/path/to/file1", "/path/to/file2"],
66+
"destination_path": "/path/to/your/destination/file.tar.gz",
67+
"destination_endpoint_id": "your-destination-endpoint-uuid"
68+
}
69+
----
70+
71+
2. Start the **flow**:
72+
+
73+
[source,bash,role=clippable-code]
74+
----
75+
globus flows start "$FLOW_ID" \
76+
--input "<FLOW INPUT FILE>" \
77+
--label "Compute and Transfer Flow Example 1 Run"
78+
----
79+
+
80+
And save the **run** ID for use in the next command.
81+
82+
3. Monitor the **run** progress:
83+
+
84+
[source,bash,role=clippable-code]
85+
----
86+
globus flows run show "<RUN_ID>"
87+
----
88+
** At this point, the **run** _may_ become `INACTIVE`, depending on the type of **collection** being used.
89+
** For inactive **run**s due to data access requirements, this can be resolved by resuming the **run** and following the prompts:
90+
+
91+
[source,bash,role=clippable-code]
92+
----
93+
globus flows run resume "<RUN_ID>"
94+
----
95+
+
96+
When prompted, run `globus session consent` and rerun `globus flows run resume` to resume the **run**.
97+
98+
ifdef::env-github[]
99+
== Next: Example Flow 2, with Data on a Separate **Collection**
100+
101+
link:./example_flow2.adoc[Example Flow 2.]
102+
endif::[]
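The monitoring step above can also be scripted. This sketch polls a **run** until it reaches a state needing no further waiting; it assumes `globus flows run show --format json` emits an object with a `status` field taking values like `ACTIVE`, `INACTIVE`, `SUCCEEDED`, and `FAILED`:

```python
import json
import subprocess
import time

# Statuses after which polling should stop (assumed Flows status values).
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}

def run_status(show_output: str) -> str:
    """Extract the status field from JSON-formatted `globus flows run show` output."""
    return json.loads(show_output)["status"]

def wait_for_run(run_id: str, interval: float = 10.0) -> str:
    """Poll a run until it completes, fails, or needs manual attention."""
    while True:
        result = subprocess.run(
            ["globus", "flows", "run", "show", run_id, "--format", "json"],
            capture_output=True, text=True, check=True,
        )
        status = run_status(result.stdout)
        # INACTIVE runs need manual intervention (`globus flows run resume`).
        if status in TERMINAL_STATUSES or status == "INACTIVE":
            return status
        time.sleep(interval)
```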
