Commit 2d8054f

Merge branch 'develop' into 11918-template-apis
2 parents d73c3be + d2b6a46

30 files changed: +767 -141 lines
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+## Get Dataset/Dataverse Storage Driver API
+
+### Changed JSON response - breaking change!
+
+The API for getting the Storage Driver info has been changed/extended.
+/api/datasets/{identifier}/storageDriver
+/api/admin/dataverse/{dataverse-alias}/storageDriver
+Rather than returning just the name/id of the driver (with the key "message"), the API call now returns a JSON object with the driver's "name", "type", and "label", and booleans indicating whether the driver has "directUpload", "directDownload", and/or "uploadOutOfBand" enabled.
+
+This change also affects the /api/admin/dataverse/{dataverse-alias}/storageDriver API call. In addition, this call now supports an optional ?getEffective=true parameter to find the effective storageDriver (the driver that will be used for new datasets in the collection).
+
+See also [the guides](https://dataverse-guide--11664.org.readthedocs.build/en/11664/api/native-api.html#configure-a-dataset-to-store-all-new-files-in-a-specific-file-store), #11695, and #11664.
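To make the new response shape concrete, here is a hedged Python sketch of a client-side helper. The field names come from the release note above; the helper function, the fallback for old-style responses, and the concrete values are hypothetical.

```python
def parse_storage_driver_response(payload):
    """Extract driver info from a storageDriver API response.

    New-style responses (per this release note) carry data.name, data.type,
    data.label, and capability booleans; old-style responses carried only
    data.message. The old-style fallback here is an assumption for clients
    that must talk to both versions.
    """
    data = payload["data"]
    if "message" in data and "name" not in data:
        # Pre-change response: only the driver name/id was returned.
        return {"name": data["message"]}
    return {
        "name": data["name"],
        "type": data["type"],
        "label": data["label"],
        "directUpload": data.get("directUpload", False),
        "directDownload": data.get("directDownload", False),
        "uploadOutOfBand": data.get("uploadOutOfBand", False),
    }

# Example new-style response (values are illustrative only):
new_style = {
    "status": "OK",
    "data": {
        "name": "store1",
        "type": "s3",
        "label": "MyLabel",
        "directUpload": True,
        "directDownload": True,
        "uploadOutOfBand": False,
    },
}
info = parse_storage_driver_response(new_style)
```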

doc/release-notes/11695-change-api-get-storage-driver.md

Lines changed: 0 additions & 12 deletions
This file was deleted.
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+### External Vocabulary Mechanism enhancement
+
+- The external vocabulary mechanism (see https://github.com/gdcc/dataverse-external-vocab-support/) now supports
+assigning metadatablock dataset field types of fieldType textbox (multiline inputs) as managed fields. This new functionality is
+being leveraged to support automated generation of citation text for Related Publications entries. (A URL could be added once the work in the external vocabulary repo is done.)
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+It is now possible to define storage quotas on individual datasets. See the API guide for more information.
+The practical use case is for datasets in the top-level, root collection. This does not address the use case of a user creating multiple datasets, but there is an open dev issue for adding per-user storage quotas as well.
+
+A convenience API `/api/datasets/{id}/uploadlimits` has been added to show the remaining storage and/or number-of-files quotas, if present.
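A hedged Python sketch of how a client might interpret the `uploadlimits` payload follows. The helper name is hypothetical; the field names match the documented example response, and an empty `uploadLimits` object is taken to mean no quota applies.

```python
def upload_allowed(upload_limits, n_files, total_bytes):
    """Decide whether a proposed upload fits the remaining quotas
    reported by /api/datasets/{id}/uploadlimits.

    An empty uploadLimits object means no quota applies. A missing
    key means that particular limit is not defined.
    """
    files_left = upload_limits.get("numberOfFilesRemaining")
    bytes_left = upload_limits.get("storageQuotaRemaining")
    if files_left is not None and n_files > files_left:
        return False  # would exceed the remaining file-count quota
    if bytes_left is not None and total_bytes > bytes_left:
        return False  # would exceed the remaining storage quota
    return True

# Illustrative values only:
limits = {"numberOfFilesRemaining": 20, "storageQuotaRemaining": 1048576}
upload_allowed(limits, 5, 500_000)   # fits both limits
upload_allowed(limits, 25, 1_000)    # too many files
upload_allowed({}, 1_000, 10**12)    # no limits defined
```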

doc/sphinx-guides/source/admin/dataverses-datasets.rst

Lines changed: 11 additions & 7 deletions
@@ -56,17 +56,19 @@ To direct new files (uploaded when datasets are created or edited) for all datas
 (Note that for ``dataverse.files.store1.label=MyLabel``, you should pass ``MyLabel``.)

-The current driver can be seen using::
+A store assigned directly to a collection can be seen using::

   curl -H "X-Dataverse-key: $API_TOKEN" http://$SERVER/api/admin/dataverse/$dataverse-alias/storageDriver

-Or to recurse the chain of parents to find the effective storageDriver::
+This may be null. To get the effective storageDriver for a collection, which may be inherited from a parent collection or be the installation default, you can use::

   curl -H "X-Dataverse-key: $API_TOKEN" http://$SERVER/api/admin/dataverse/$dataverse-alias/storageDriver?getEffective=true
+
+This will never be null.

-(Note that for ``dataverse.files.store1.label=MyLabel``, ``store1`` will be returned.)
+(Note that for ``dataverse.files.store1.label=MyLabel``, the JSON response will include "name":"store1" and "label":"MyLabel".)

-and can be reset to the default store with::
+To delete a store assigned directly to a collection (so that the collection's effective store is inherited from its parent or is the global default), use::

   curl -H "X-Dataverse-key: $API_TOKEN" -X DELETE http://$SERVER/api/admin/dataverse/$dataverse-alias/storageDriver
@@ -261,15 +263,17 @@ To identify invalid data values in specific datasets (if, for example, an attemp
 Configure a Dataset to Store All New Files in a Specific File Store
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Configure a dataset to use a specific file store (this API can only be used by a superuser) ::
+Configure an individual dataset to use a specific file store (this API can only be used by a superuser) ::

   curl -H "X-Dataverse-key: $API_TOKEN" -X PUT -d $storageDriverLabel http://$SERVER/api/datasets/$dataset-id/storageDriver

-The current driver can be seen using::
+The effective store can be seen using::

   curl http://$SERVER/api/datasets/$dataset-id/storageDriver

-It can be reset to the default store as follows (only a superuser can do this) ::
+The output of the API will include the id, label, and type (for example, "file" or "s3"), as well as the support for direct download and upload.
+
+To remove an assigned store, and allow the dataset to inherit the store from its parent collection, use the following (only a superuser can do this) ::

   curl -H "X-Dataverse-key: $API_TOKEN" -X DELETE http://$SERVER/api/datasets/$dataset-id/storageDriver
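The store-inheritance rule these guide changes describe (a direct assignment may be absent, but the effective store falls back through parent collections to the installation default and so is never null) can be sketched as a hedged Python model. Collection names, the `assignments`/`parents` structures, and the default store id are all made up for illustration.

```python
INSTALLATION_DEFAULT = "file"  # assumed installation-default store id

def effective_store(alias, assignments, parents):
    """Walk the parent chain to find the effective storage driver.

    assignments maps a collection alias to its directly assigned store
    (absent means none); parents maps an alias to its parent alias
    (None at the root).
    """
    node = alias
    while node is not None:
        store = assignments.get(node)
        if store is not None:
            return store  # nearest direct assignment wins
        node = parents.get(node)
    return INSTALLATION_DEFAULT  # never returns None

parents = {"child": "root", "root": None}
assignments = {"root": "s3store"}
effective_store("child", assignments, parents)  # inherited from root
effective_store("child", {}, parents)           # installation default
```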

doc/sphinx-guides/source/api/changelog.rst

Lines changed: 1 addition & 1 deletion
@@ -14,14 +14,14 @@ v6.9
 - The way to set per-format size limits for tabular ingest has changed. JSON input is now used. See :ref:`:TabularIngestSizeLimit`.
 - In the past, the settings API would accept any key and value. This is no longer the case because validation has been added. See :ref:`settings_put_single`, for example.
 - For GET /api/notifications/all the JSON response has changed breaking the backward compatibility of the API.
+- For GET /api/admin/dataverse/{dataverse-alias}/storageDriver and /api/datasets/{identifier}/storageDriver the driver name is no longer returned in data.message. Instead, it is returned as data.name (along with other information about the storageDriver).

 v6.8
 ----

 - For POST /api/files/{id}/metadata passing an empty string ("description":"") or array ("categories":[]) will no longer be ignored. Empty fields will now clear out the values in the file's metadata. To ignore the fields simply do not include them in the JSON string.
 - For PUT /api/datasets/{id}/editMetadata the query parameter "sourceInternalVersionNumber" has been removed and replaced with "sourceLastUpdateTime" to verify that the data being edited hasn't been modified and isn't stale.
 - For GET /api/dataverses/$dataverse-alias/links the JSON response has changed breaking the backward compatibility of the API.
-- For GET /api/admin/dataverse/{dataverse-alias}/storageDriver and /api/datasets/{identifier}/storageDriver the driver name is no longer returned in data.message. This value is now returned in data.name.
 - For PUT /api/dataverses/$dataverse-alias/inputLevels custom input levels that had been previously set will no longer be deleted. To delete input levels send an empty list (deletes all), then send the new/modified list.
 - For GET /api/externalTools and /api/externalTools/{id} the responses are now formatted as JSON (previously the toolParameters and allowedApiCalls were a JSON object and array (respectively) that were serialized as JSON strings) and any configured "requirements" are included.

doc/sphinx-guides/source/api/native-api.rst

Lines changed: 72 additions & 5 deletions
@@ -1250,16 +1250,22 @@ Collection Storage Quotas

   curl -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/dataverses/$ID/storage/quota"

-Will output the storage quota allocated (in bytes), or a message indicating that the quota is not defined for the specific collection. The user identified by the API token must have the ``Manage`` permission on the collection.
+Will output the storage quota allocated (in bytes), or a message indicating that the quota is not defined for the collection. If this is an unpublished collection, the user must have the ``ViewUnpublishedDataverse`` permission.
+With an optional query parameter ``showInherited=true`` it will show the applicable quota potentially defined on the nearest parent when the collection does not have a quota configured directly.

+.. code-block::
+
+  curl -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/dataverses/$ID/storage/use"
+
+Will output the dynamically cached total storage size (in bytes) used by the collection. The user identified by the API token must have the ``Edit`` permission on the collection.

 To set or change the storage allocation quota for a collection:

 .. code-block::

-  curl -X POST -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/dataverses/$ID/storage/quota/$SIZE_IN_BYTES"
+  curl -X PUT -H "X-Dataverse-key:$API_TOKEN" -d $SIZE_IN_BYTES "$SERVER_URL/api/dataverses/$ID/storage/quota"

-This is API is superuser-only.
+This API is superuser-only.


 To delete a storage quota configured for a collection:

@@ -1268,9 +1274,70 @@ To delete a storage quota configured for a collection:

   curl -X DELETE -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/dataverses/$ID/storage/quota"

-This is API is superuser-only.
+This API is superuser-only.
+
+Storage Quotas on Individual Datasets
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block::
+
+  curl -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/datasets/$ID/storage/quota"
+
+Will output the storage quota allocated (in bytes), or a message indicating that the quota is not defined for this dataset. If this is an unpublished dataset, the user must have the ``ViewUnpublishedDataset`` permission.
+With an optional query parameter ``showInherited=true`` it will show the applicable quota potentially defined on the nearest parent collection when the dataset does not have a quota configured directly.
+
+.. code-block::
+
+  curl -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/datasets/$ID/storage/use"
+
+Will output the dynamically cached total storage size (in bytes) used by the dataset. The user identified by the API token must have the ``Edit`` permission on the dataset.
+
+To set or change the storage allocation quota for a dataset:
+
+.. code-block::
+
+  curl -X PUT -H "X-Dataverse-key:$API_TOKEN" -d $SIZE_IN_BYTES "$SERVER_URL/api/datasets/$ID/storage/quota"
+
+This API is superuser-only.
+
+
+To delete a storage quota configured for a dataset:
+
+.. code-block::
+
+  curl -X DELETE -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/datasets/$ID/storage/quota"
+
+This API is superuser-only.
+
+The following convenience API shows the dynamic values of the *remaining* storage size and/or file number quotas on the dataset, if present. For example:
+
+.. code-block::
+
+  curl -H "X-Dataverse-key: $API_TOKEN" "http://localhost:8080/api/datasets/$dataset-id/uploadlimits"
+  {
+    "status": "OK",
+    "data": {
+      "uploadLimits": {
+        "numberOfFilesRemaining": 20,
+        "storageQuotaRemaining": 1048576
+      }
+    }
+  }
+
+Or, when neither limit is present:
+
+.. code-block::
+
+  {
+    "status": "OK",
+    "data": {
+      "uploadLimits": {}
+    }
+  }
+
+This API requires the Edit permission on the dataset.

-Use the ``/settings`` API to enable or disable the enforcement of storage quotas that are defined across the instance via the following setting. For example,
+Use the ``/settings`` API to enable or disable the enforcement of storage quotas that are defined across the instance via the following setting:

 .. code-block::

doc/sphinx-guides/source/developers/big-data-support.rst

Lines changed: 1 addition & 1 deletion
@@ -165,7 +165,7 @@ Globus File Transfer
 Note: Globus file transfer is still experimental but feedback is welcome! See :ref:`support`.

 Users can transfer files via `Globus <https://www.globus.org>`_ into and out of datasets, or reference files on a remote Globus endpoint, when their Dataverse installation is configured to use a Globus accessible store(s)
-and a community-developed `dataverse-globus <https://github.com/scholarsportal/dataverse-globus>`_ app has been properly installed and configured.
+and a community-developed `dataverse-globus <https://github.com/gdcc/dataverse-globus>`_ app has been properly installed and configured.

 Globus endpoints can be in a variety of places, from data centers to personal computers.
 This means that from within the Dataverse software, a Globus transfer can feel like an upload or a download (with Globus Personal Connect running on your laptop, for example) or it can feel like a true transfer from one server to another (from a cluster in a data center into a Dataverse dataset or vice versa).

doc/sphinx-guides/source/installation/config.rst

Lines changed: 22 additions & 21 deletions
@@ -10,27 +10,6 @@ Once you have finished securing and configuring your Dataverse installation, you
 .. contents:: |toctitle|
 	:local:

-.. _comma-separated-config-values:
-
-Comma-separated configuration values
-------------------------------------
-
-Many configuration options (both MicroProfile/JVM settings and database settings) accept comma-separated lists. For all such settings, Dataverse applies consistent, lightweight parsing:
-
-- Whitespace immediately around commas is ignored (e.g., ``GET, POST`` is equivalent to ``GET,POST``).
-- Tokens are otherwise preserved exactly as typed. There is no quote parsing and no escape processing.
-- Embedded commas within a token are not supported.
-
-Examples include (but are not limited to):
-
-- :ref:`dataverse.cors.origin <dataverse.cors.origin>`
-- :ref:`dataverse.cors.methods <dataverse.cors.methods>`
-- :ref:`dataverse.cors.headers.allow <dataverse.cors.headers.allow>`
-- :ref:`dataverse.cors.headers.expose <dataverse.cors.headers.expose>`
-- :ref:`:UploadMethods`
-
-This behavior is implemented centrally and applies across all Dataverse settings that accept comma-separated values.
-
 .. _securing-your-installation:

 Securing Your Installation

@@ -2537,6 +2516,28 @@ Setting Up Integrations

 Before going live, you might want to consider setting up integrations to make it easier for your users to deposit or explore data. See the :doc:`/admin/integrations` section of the Admin Guide for details.

+.. _comma-separated-config-values:
+
+Comma-Separated Configuration Values
+------------------------------------
+
+Many configuration options (both MicroProfile/JVM settings and database settings) accept comma-separated lists. For all such settings, Dataverse applies consistent, lightweight parsing:
+
+- Whitespace immediately around commas is ignored (e.g., ``GET, POST`` is equivalent to ``GET,POST``).
+- Tokens are otherwise preserved exactly as typed. There is no quote parsing and no escape processing.
+- Embedded commas within a token are not supported.
+
+Examples include (but are not limited to):
+
+- :ref:`dataverse.cors.origin <dataverse.cors.origin>`
+- :ref:`dataverse.cors.methods <dataverse.cors.methods>`
+- :ref:`dataverse.cors.headers.allow <dataverse.cors.headers.allow>`
+- :ref:`dataverse.cors.headers.expose <dataverse.cors.headers.expose>`
+- :ref:`:UploadMethods`
+
+This behavior is implemented centrally and applies across all Dataverse settings that accept comma-separated values.
+
+
 .. _jvm-options:

 JVM Options
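The parsing rules for comma-separated settings (whitespace around commas ignored, no quoting or escaping, no embedded commas) can be sketched as a Python approximation. This is not the actual implementation; in particular, dropping empty tokens is an assumption rather than documented behavior.

```python
def parse_csv_setting(raw):
    """Approximate the documented parsing of comma-separated settings.

    Whitespace immediately around commas is ignored, tokens are otherwise
    preserved exactly as typed, and there is no quote or escape processing.
    """
    # Split on commas, trim whitespace around each token, skip empties
    # (the empty-token handling is an assumption, not documented).
    return [token.strip() for token in raw.split(",") if token.strip()]

parse_csv_setting("GET, POST")   # same result as "GET,POST"
parse_csv_setting("https://a.example, https://b.example")
```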

src/main/java/edu/harvard/iq/dataverse/DataFileServiceBean.java

Lines changed: 13 additions & 51 deletions
@@ -40,6 +40,7 @@
 import jakarta.inject.Named;
 import jakarta.persistence.*;
 import jakarta.persistence.criteria.*;
+import org.apache.commons.lang3.StringUtils;

 /**
  *
@@ -281,60 +282,21 @@ public List<FileMetadata> findFileMetadataByDatasetVersionId(Long datasetVersion
             .setMaxResults(maxResults)
             .getResultList();
     }
-
-    public List<FileMetadata> findFileMetadataByDatasetVersionIdLabelSearchTerm(Long datasetVersionId, String searchTerm, String userSuppliedSortField, String userSuppliedSortOrder){
-        FileSortFieldAndOrder sortFieldAndOrder = new FileSortFieldAndOrder(userSuppliedSortField, userSuppliedSortOrder);
-
-        String sortField = sortFieldAndOrder.getSortField();
-        String sortOrder = sortFieldAndOrder.getSortOrder();
-        String searchClause = "";
-        if(searchTerm != null && !searchTerm.isEmpty()){
-            searchClause = " and (lower(o.label) like '%" + searchTerm.toLowerCase() + "%' or lower(o.description) like '%" + searchTerm.toLowerCase() + "%')";
-        }
-
-        String queryString = "select o from FileMetadata o where o.datasetVersion.id = :datasetVersionId"
-                + searchClause
-                + " order by o." + sortField + " " + sortOrder;
-        return em.createQuery(queryString, FileMetadata.class)
-                .setParameter("datasetVersionId", datasetVersionId)
-                .getResultList();
-    }
-
-    public List<Integer> findFileMetadataIdsByDatasetVersionIdLabelSearchTerm(Long datasetVersionId, String searchTerm, String userSuppliedSortField, String userSuppliedSortOrder){
-        FileSortFieldAndOrder sortFieldAndOrder = new FileSortFieldAndOrder(userSuppliedSortField, userSuppliedSortOrder);
-
-        searchTerm = searchTerm.trim();
-        String sortField = sortFieldAndOrder.getSortField();
-        String sortOrder = sortFieldAndOrder.getSortOrder();
-        String searchClause = "";
-        if(searchTerm != null && !searchTerm.isEmpty()){
-            searchClause = " and (lower(o.label) like '%" + searchTerm.toLowerCase() + "%' or lower(o.description) like '%" + searchTerm.toLowerCase() + "%')";
-        }
-
-        //the createNativeQuary takes persistant entities, which Integer.class is not,
-        //which is causing the exception. Hence, this query does not need an Integer.class
-        //as the second parameter.
-        return em.createNativeQuery("select o.id from FileMetadata o where o.datasetVersion_id = " + datasetVersionId
-                + searchClause
-                + " order by o." + sortField + " " + sortOrder)
-                .getResultList();
-    }
-
-    public List<Long> findDataFileIdsByDatasetVersionIdLabelSearchTerm(Long datasetVersionId, String searchTerm, String userSuppliedSortField, String userSuppliedSortOrder){
+    public List<Long> findDataFileIdsByDatasetVersionIdLabelSearchTerm(Long datasetVersionId, String userSuppliedSearchTerm, String userSuppliedSortField, String userSuppliedSortOrder) {
         FileSortFieldAndOrder sortFieldAndOrder = new FileSortFieldAndOrder(userSuppliedSortField, userSuppliedSortOrder);
-
-        searchTerm = searchTerm.trim();
-        String sortField = sortFieldAndOrder.getSortField();
-        String sortOrder = sortFieldAndOrder.getSortOrder();
-        String searchClause = "";
-        if(searchTerm != null && !searchTerm.isEmpty()){
-            searchClause = " and (lower(o.label) like '%" + searchTerm.toLowerCase() + "%' or lower(o.description) like '%" + searchTerm.toLowerCase() + "%')";
+        // Normalize the user-supplied term once; null means "no search filter".
+        String searchTerm = !StringUtils.isBlank(userSuppliedSearchTerm) ? "%" + userSuppliedSearchTerm.trim().toLowerCase() + "%" : null;
+
+        String selectClause = "select o.datafile_id from FileMetadata o where o.datasetversion_id = " + datasetVersionId;
+        // The search term is bound as a positional parameter rather than concatenated into the SQL.
+        String searchClause = searchTerm != null ? " and (lower(o.label) like ? or lower(o.description) like ?)" : "";
+        String orderByClause = " order by o." + sortFieldAndOrder.getSortField() + " " + sortFieldAndOrder.getSortOrder();
+
+        Query query = em.createNativeQuery(selectClause + searchClause + orderByClause);
+        if (searchTerm != null) {
+            query.setParameter(1, searchTerm);
+            query.setParameter(2, searchTerm);
         }
-
-        return em.createNativeQuery("select o.datafile_id from FileMetadata o where o.datasetVersion_id = " + datasetVersionId
-                + searchClause
-                + " order by o." + sortField + " " + sortOrder)
-                .getResultList();
+        return query.getResultList();
     }

     public List<FileMetadata> findFileMetadataByDatasetVersionIdLazy(Long datasetVersionId, int maxResults, String userSuppliedSortField, String userSuppliedSortOrder, int firstResult) {
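The key improvement in this refactor is binding the user-supplied search term as a query parameter (`like ?`) instead of concatenating it into the SQL string. A small self-contained sketch of the same pattern, using Python's sqlite3 rather than JPA (the table, rows, and helper name are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table filemetadata (datafile_id integer, label text, description text)"
)
conn.executemany(
    "insert into filemetadata values (?, ?, ?)",
    [(1, "data.csv", "raw data"), (2, "notes.txt", "analysis notes")],
)

def find_ids(search_term):
    # As in the refactored Java method: the user-supplied term is bound
    # as a parameter, never concatenated into the SQL string, which
    # avoids SQL injection and quoting bugs.
    pattern = "%" + search_term.strip().lower() + "%"
    rows = conn.execute(
        "select datafile_id from filemetadata "
        "where lower(label) like ? or lower(description) like ? "
        "order by label asc",
        (pattern, pattern),
    ).fetchall()
    return [r[0] for r in rows]

find_ids("data")   # matches label 'data.csv' and description 'raw data'
```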
