Commit 1a4a43c

Merge pull request #11125 from IQSS/11057-globus-downloads
Globus support: download optimizations
2 parents 13cbc1e + 2ba5108 commit 1a4a43c

12 files changed: +652 -180 lines changed
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+## Globus framework improvements
+
+The improvements and optimizations in this release build on earlier work (such as PR #10781). They are based on experience gained at IQSS during the production rollout of the Large Data Storage services that utilize Globus.
+
+The changes in this PR (#11125) focus on improving Globus *downloads* (i.e., transfers from Dataverse-linked Globus volumes to users' Globus collections). Most importantly, the mechanism of "Asynchronous Task Monitoring", first introduced in #10781 for *uploads*, has been extended to handle downloads as well. This generally makes downloads more reliable, specifically in how Dataverse manages the temporary access rules granted to users: it minimizes the risk of subsequent downloads failing because of stale access rules left in place.
+
+See `globus-use-experimental-async-framework` under [Feature Flags](https://guides.dataverse.org/en/latest/installation/config.html#feature-flags) and [dataverse.files.globus-monitoring-server](https://guides.dataverse.org/en/latest/installation/config.html#dataverse-files-globus-monitoring-server) in the Installation Guide.
+
+Several other improvements make the underlying Globus framework more reliable and robust.
+
+

doc/sphinx-guides/source/installation/config.rst

Lines changed: 1 addition & 1 deletion
@@ -3501,7 +3501,7 @@ please find all known feature flags below. Any of these flags can be activated u
       - Turns off automatic selection of a dataset thumbnail from image files in that dataset. When set to ``On``, a user can still manually pick a thumbnail image or upload a dedicated thumbnail image.
       - ``Off``
     * - globus-use-experimental-async-framework
-      - Activates a new experimental implementation of Globus polling of ongoing remote data transfers that does not rely on the instance staying up continuously for the duration of the transfers and saves the state information about Globus upload requests in the database. Added in v6.4. Affects :ref:`:GlobusPollingInterval`. Note that the JVM option :ref:`dataverse.files.globus-monitoring-server` described above must also be enabled on one (and only one, in a multi-node installation) Dataverse instance.
+      - Activates a new experimental implementation of Globus polling of ongoing remote data transfers that does not rely on the instance staying up continuously for the duration of the transfers and saves the state information about Globus upload requests in the database. Added in v6.4; extended in v6.6 to cover download transfers, in addition to uploads. Affects :ref:`:GlobusPollingInterval`. Note that the JVM option :ref:`dataverse.files.globus-monitoring-server` described above must also be enabled on one (and only one, in a multi-node installation) Dataverse instance.
       - ``Off``
     * - index-harvested-metadata-source
       - Index the nickname or the source name (See the optional ``sourceName`` field in :ref:`create-a-harvesting-client`) of the harvesting client as the "metadata source" of harvested datasets and files. If enabled, the Metadata Source facet will show separate groupings of the content harvested from different sources (by harvesting client nickname or source name) instead of the default behavior where there is one "Harvested" grouping for all harvested content.

src/main/java/edu/harvard/iq/dataverse/DatasetPage.java

Lines changed: 16 additions & 15 deletions
@@ -5867,13 +5867,12 @@ public List<DatasetField> getDatasetSummaryFields() {
         return DatasetUtil.getDatasetSummaryFields(workingVersion, customFields);
     }
 
-    public boolean isShowPreviewButton(Long fileId) {
-        List<ExternalTool> previewTools = getPreviewToolsForDataFile(fileId);
+    public boolean isShowPreviewButton(DataFile dataFile) {
+        List<ExternalTool> previewTools = getPreviewToolsForDataFile(dataFile);
         return previewTools.size() > 0;
     }
 
-    public boolean isShowQueryButton(Long fileId) {
-        DataFile dataFile = datafileService.find(fileId);
+    public boolean isShowQueryButton(DataFile dataFile) {
 
         if(dataFile.isRestricted()
                 || !dataFile.isReleased()
@@ -5882,26 +5881,28 @@ public boolean isShowQueryButton(Long fileId) {
             return false;
         }
 
-        List<ExternalTool> fileQueryTools = getQueryToolsForDataFile(fileId);
+        List<ExternalTool> fileQueryTools = getQueryToolsForDataFile(dataFile);
         return fileQueryTools.size() > 0;
     }
 
-    public List<ExternalTool> getPreviewToolsForDataFile(Long fileId) {
-        return getCachedToolsForDataFile(fileId, ExternalTool.Type.PREVIEW);
+    public List<ExternalTool> getPreviewToolsForDataFile(DataFile dataFile) {
+        return getCachedToolsForDataFile(dataFile, ExternalTool.Type.PREVIEW);
     }
 
-    public List<ExternalTool> getQueryToolsForDataFile(Long fileId) {
-        return getCachedToolsForDataFile(fileId, ExternalTool.Type.QUERY);
+    public List<ExternalTool> getQueryToolsForDataFile(DataFile dataFile) {
+        return getCachedToolsForDataFile(dataFile, ExternalTool.Type.QUERY);
     }
-    public List<ExternalTool> getConfigureToolsForDataFile(Long fileId) {
-        return getCachedToolsForDataFile(fileId, ExternalTool.Type.CONFIGURE);
+
+    public List<ExternalTool> getConfigureToolsForDataFile(DataFile dataFile) {
+        return getCachedToolsForDataFile(dataFile, ExternalTool.Type.CONFIGURE);
     }
 
-    public List<ExternalTool> getExploreToolsForDataFile(Long fileId) {
-        return getCachedToolsForDataFile(fileId, ExternalTool.Type.EXPLORE);
+    public List<ExternalTool> getExploreToolsForDataFile(DataFile dataFile) {
+        return getCachedToolsForDataFile(dataFile, ExternalTool.Type.EXPLORE);
     }
 
-    public List<ExternalTool> getCachedToolsForDataFile(Long fileId, ExternalTool.Type type) {
+    public List<ExternalTool> getCachedToolsForDataFile(DataFile dataFile, ExternalTool.Type type) {
+        Long fileId = dataFile.getId();
         Map<Long, List<ExternalTool>> cachedToolsByFileId = new HashMap<>();
         List<ExternalTool> externalTools = new ArrayList<>();
         switch (type) {
@@ -5928,7 +5929,6 @@ public List<ExternalTool> getCachedToolsForDataFile(Long fileId, ExternalTool.Ty
         if (cachedTools != null) { //if already queried before and added to list
             return cachedTools;
         }
-        DataFile dataFile = datafileService.find(fileId);
         cachedTools = externalToolService.findExternalToolsByFile(externalTools, dataFile);
         cachedToolsByFileId.put(fileId, cachedTools); //add to map so we don't have to do the lifting again
         return cachedTools;
@@ -6728,6 +6728,7 @@ public boolean isGlobusTransferRequested() {
      * valid files to transfer.
      */
     public void startGlobusTransfer(boolean transferAll, boolean popupShown) {
+        logger.fine("inside startGlobusTransfer; "+(transferAll ? "transferAll" : "NOTtransferAll") + " " + (popupShown ? "popupShown" : "NOTpopupShown"));
         if (transferAll) {
             this.setSelectedFiles(workingVersion.getFileMetadatas());
         }
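
The DatasetPage.java change passes the `DataFile` object through to the per-file tool cache instead of its id, so a cache miss no longer needs a `datafileService.find(fileId)` lookup. A minimal, self-contained sketch of that pattern follows. `FileStub`, `ToolStub`, and `ToolCacheSketch` are hypothetical stand-ins, not Dataverse classes, and the content-type comparison is only an assumed placeholder for `findExternalToolsByFile`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for DataFile: only an id and a content type.
final class FileStub {
    final Long id;
    final String contentType;

    FileStub(Long id, String contentType) {
        this.id = id;
        this.contentType = contentType;
    }
}

// Hypothetical stand-in for ExternalTool.
final class ToolStub {
    final String name;
    final String contentType;

    ToolStub(String name, String contentType) {
        this.name = name;
        this.contentType = contentType;
    }
}

class ToolCacheSketch {
    private final Map<Long, List<ToolStub>> cachedToolsByFileId = new HashMap<>();
    private final List<ToolStub> allTools;
    int matchRuns = 0; // how many times the (assumed expensive) matching step ran

    ToolCacheSketch(List<ToolStub> allTools) {
        this.allTools = allTools;
    }

    // The caller hands over the file object it already holds; the id is
    // used only as the cache key, so no lookup by id is needed on a miss.
    List<ToolStub> getCachedToolsForFile(FileStub file) {
        Long fileId = file.id;
        List<ToolStub> cached = cachedToolsByFileId.get(fileId);
        if (cached != null) {
            return cached; // already computed for this file
        }
        cached = findToolsByFile(file);
        cachedToolsByFileId.put(fileId, cached);
        return cached;
    }

    // Placeholder matching rule (an assumption): same content type.
    private List<ToolStub> findToolsByFile(FileStub file) {
        matchRuns++;
        List<ToolStub> matches = new ArrayList<>();
        for (ToolStub tool : allTools) {
            if (tool.contentType.equals(file.contentType)) {
                matches.add(tool);
            }
        }
        return matches;
    }
}
```

Calling `getCachedToolsForFile` twice with the same file runs the matching step once; the second call is served from the map.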

src/main/java/edu/harvard/iq/dataverse/FileDownloadHelper.java

Lines changed: 12 additions & 2 deletions
@@ -72,13 +72,23 @@ public FileDownloadHelper() {
     // file downloads and multiple (batch) downloads - since both use the same
     // terms/etc. popup.
     public void writeGuestbookAndStartDownload(GuestbookResponse guestbookResponse, boolean isGlobusTransfer) {
+        logger.fine("inside FileDownloadHelper.writeGuestbookAndStartDownload() " + (isGlobusTransfer ? "Globus Transfer" : "NOT a Globus Transfer"));
         PrimeFaces.current().executeScript("PF('guestbookAndTermsPopup').hide()");
         guestbookResponse.setEventType(GuestbookResponse.DOWNLOAD);
         // Note that this method is only ever called from the file-download-popup -
         // meaning we know for the fact that we DO want to save this
         // guestbookResponse permanently in the database.
-        if(isGlobusTransfer) {
-            globusService.writeGuestbookAndStartTransfer(guestbookResponse, true);
+        // Do keep in mind that "true" in writeGuestbookAndStartTransfer() below
+        // would mean "DO SKIP writing the guestbookResponse", and "false" means
+        // "DO write ..."
+        if(isGlobusTransfer) {
+            // Note that *single-file* Globus transfers are NOT handled here.
+            // Instead they are coming in through this method with isGlobusTransfer=false,
+            // and then picked up by the fileDownloadService, below, which in turn
+            // recognizes them as Globus types via guestbookResponse.getFileFormat() == "GlobusTransfer"
+            // and treats them as such... I'm not super clear as to why they can't
+            // be handled here instead... Don't ask.
+            globusService.writeGuestbookAndStartTransfer(guestbookResponse, false);
         } else {
             if (guestbookResponse.getSelectedFileIds() != null) {
                 // this is a batch (multiple file) download.

src/main/java/edu/harvard/iq/dataverse/api/Datasets.java

Lines changed: 12 additions & 3 deletions
@@ -4123,7 +4123,7 @@ public Response requestGlobusUpload(@Context ContainerRequestContext crc, @PathP
         case 400:
             return badRequest("Unable to grant permission");
         case 409:
-            return conflict("Permission already exists");
+            return conflict("Permission already exists or no more permissions allowed");
         default:
             return error(null, "Unexpected error when granting permission");
         }
@@ -4494,7 +4494,7 @@ public Response requestGlobusDownload(@Context ContainerRequestContext crc, @Pat
         case 400:
             return badRequest("Unable to grant permission");
         case 409:
-            return conflict("Permission already exists");
+            return conflict("Permission already exists or no more permissions allowed");
         default:
             return error(null, "Unexpected error when granting permission");
         }
@@ -4548,8 +4548,17 @@ public Response monitorGlobusDownload(@Context ContainerRequestContext crc, @Pat
             return wr.getResponse();
         }
 
+        JsonObject jsonObject = null;
+        try {
+            jsonObject = JsonUtil.getJsonObject(jsonData);
+        } catch (Exception ex) {
+            logger.warning("Globus download monitoring: error parsing json: " + jsonData + " " + ex.getMessage());
+            return badRequest("Error parsing json body");
+
+        }
+
         // Async Call
-        globusService.globusDownload(jsonData, dataset, authUser);
+        globusService.globusDownload(jsonObject, dataset, authUser);
 
         return ok("Async call to Globus Download started");
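
The last Datasets.java hunk parses the request body before the asynchronous call is started, so a malformed body is reported to the client as a bad request instead of being handed to the background task. A minimal sketch of that ordering follows, under stated assumptions: `ParseThenDispatchSketch` and `handle` are hypothetical names, the parser is passed in as a plain function rather than `JsonUtil.getJsonObject`, and the returned strings only mimic HTTP responses.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.Function;

class ParseThenDispatchSketch {

    // Validates the body synchronously, then starts the background work.
    // The parser and the task are supplied by the caller (assumptions,
    // standing in for JSON parsing and the Globus download monitoring).
    static <T> String handle(String body, Function<String, T> parser, Consumer<T> asyncTask) {
        T parsed;
        try {
            parsed = parser.apply(body);
        } catch (RuntimeException ex) {
            // Reject early: the caller learns about the bad input immediately.
            return "400 Error parsing body";
        }
        // Only well-formed input reaches the asynchronous step.
        CompletableFuture.runAsync(() -> asyncTask.accept(parsed));
        return "200 Async call started";
    }
}
```

With `Integer::parseInt` as a stand-in parser, `handle("not-a-number", ...)` returns the 400 string without scheduling anything, while `handle("42", ...)` returns the 200 string and passes `42` to the background task.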

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+package edu.harvard.iq.dataverse.globus;
+
+/**
+ *
+ * @author landreev
+ */
+public class ExpiredTokenException extends Exception {
+    public ExpiredTokenException(String message) {
+        super(message);
+    }
+
+    public ExpiredTokenException(String message, Throwable cause) {
+        super(message, cause);
+    }
+
+}
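
The portion of the commit shown here does not include the call sites of `ExpiredTokenException`. As an illustration only, the sketch below shows one way a checked exception of this kind could be used: a call fails when its token has expired, and the caller refreshes the token once and retries. `TokenRetrySketch`, `Token`, `callWithToken`, and `callWithOneRefresh` are hypothetical names, and the refresh step is a placeholder rather than the Globus token flow.

```java
import java.time.Instant;

// Same shape as the class added in the commit, repeated to keep the sketch self-contained.
class ExpiredTokenException extends Exception {
    ExpiredTokenException(String message) {
        super(message);
    }
}

class TokenRetrySketch {

    // Hypothetical token: a value plus an expiry time.
    static final class Token {
        final String value;
        final Instant expiresAt;

        Token(String value, Instant expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    // Pretend remote call: fails with the checked exception once the token has expired.
    static String callWithToken(Token token, Instant now) throws ExpiredTokenException {
        if (!now.isBefore(token.expiresAt)) {
            throw new ExpiredTokenException("token expired at " + token.expiresAt);
        }
        return "ok:" + token.value;
    }

    // Retries exactly once with a fresh token (placeholder refresh logic).
    static String callWithOneRefresh(Token token, Instant now) {
        try {
            return callWithToken(token, now);
        } catch (ExpiredTokenException ex) {
            Token fresh = new Token("refreshed", now.plusSeconds(60));
            try {
                return callWithToken(fresh, now);
            } catch (ExpiredTokenException again) {
                return "failed";
            }
        }
    }
}
```

Because the exception is checked, the compiler forces every caller of `callWithToken` to either handle expiry or declare it, which keeps the expiry case from being silently ignored.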
