Commit 020103f

Merge branch 'develop' of github.com:IQSS/dataverse into 11157-builtin-users-oidc-auth
2 parents fcc133a + c687d18 commit 020103f

16 files changed: +119 -16 lines changed
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+### video subtitles (vtt files)
+
+The `IQSS/dataverse` PR sets the content type for new(!) files with extension `vtt` to `text/vtt`,
+which is presented as "_Web Video Text Tracks_". The PR also enables full text indexing for these files,
+if [configured](https://guides.dataverse.org/en/latest/installation/config.html#solrfulltextindexing).
+
+The `gdcc/dataverse-previewer` PRs provide a new version of the video previewer.
+The new previewer version presents `vtt` files as subtitles for videos;
+the naming convention is `<video-basename>.<language-tag>.vtt`.
+The previewer does not rely on the content type.
+A proper content type may prompt users to request permission for the subtitles together with the video.
+
+Existing files with extension `vtt` will keep content type `application/octet-stream`, presented as "_Unknown_".
+The following query shows the number of files per extension with an "_Unknown_" content type:
+
+    SELECT substring(m.label from (length(label) - strpos(reverse(m.label), '.') + 2)) AS extension, COUNT(*) as count
+    FROM datafile f LEFT JOIN filemetadata m ON f.id = m.datafile_id
+    WHERE f.contenttype = 'application/octet-stream'
+    GROUP BY extension;
+
+If `vtt` does not appear in the result, you are done.
+Otherwise, you may want to update the content type for existing files and reindex those datasets.
+
+First figure out which datasets would need [reindexing](https://guides.dataverse.org/en/latest/admin/solr-search-index.html#manual-reindexing):
+
+    select distinct
+        o.protocol, o.authority, o.identifier,
+        v.versionnumber, v.minorversionnumber, v.versionstate
+    from datafile f
+        left join filemetadata m on f.id = m.datafile_id
+        left join datasetversion v on v.id = m.datasetversion_id
+        left join dvobject o on o.id = v.dataset_id
+    WHERE contenttype = 'application/octet-stream'
+    AND 'vtt' = substring(m.label from (length(label) - strpos(reverse(m.label), '.') + 2))
+    ;
+
+Then update the content type for the files:
+
+    UPDATE datafile SET contenttype = 'text/vtt' WHERE id IN (
+        SELECT datafile_id FROM filemetadata m
+        WHERE contenttype = 'application/octet-stream'
+        AND 'vtt' = substring(m.label from (length(label) - strpos(reverse(m.label), '.') + 2))
+    );
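
The `substring(... from ...)` expression used in the queries above takes everything after the last dot in the file label. As a rough illustration only, the following Python sketch mirrors that arithmetic step by step. The function name `extension_of` and the sample labels in the usage note are hypothetical and not part of Dataverse.

```python
def extension_of(label: str) -> str:
    """Mirror of the SQL expression: the text after the last dot, or '' if there is no dot."""
    # strpos() is 1-based and returns 0 when the character is absent;
    # str.find() is 0-based and returns -1, so adding 1 gives the same value.
    pos_in_reversed = label[::-1].find(".") + 1
    # 1-based start position, as passed to substring(label from start).
    start = len(label) - pos_in_reversed + 2
    # Convert the 1-based start to a 0-based Python slice.
    return label[start - 1:]


if __name__ == "__main__":
    for name in ["lecture.en.vtt", "README", "archive.", "CLIP.VTT"]:
        print(repr(name), "->", repr(extension_of(name)))
```

With these sample labels, a file without a dot yields an empty extension, and `CLIP.VTT` yields `VTT`. Because string equality in PostgreSQL is normally case-sensitive, the `'vtt' = ...` condition would probably not match upper-case extensions; whether such files exist in a given installation would need to be checked separately.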
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+The "string" type has been added as a new field type for metadata fields.
+
+In contrast to "text" fields, "string" fields are stored and indexed exactly as provided, without any text analysis or transformations.
+
+This field type is suitable for fields like IDs (e.g. ORCIDs) or enums, where exact matches are required when searching.
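
To make the difference between analyzed "text" matching and exact "string" matching concrete, here is a minimal Python sketch. It is not how Solr or Dataverse implement either field type: the `analyze` function is a deliberately simplified stand-in for text analysis (lowercasing and splitting on non-alphanumeric characters), and all function names and sample identifiers are assumptions made for illustration.

```python
import re


def analyze(value: str) -> list[str]:
    """Simplified stand-in for text analysis: lowercase, then split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", value.lower()) if token]


def text_match(stored: str, query: str) -> bool:
    """Analyzed match: true if any query token also occurs among the stored tokens."""
    return bool(set(analyze(stored)) & set(analyze(query)))


def string_match(stored: str, query: str) -> bool:
    """Exact match: the value is compared as provided, with no transformation."""
    return stored == query


if __name__ == "__main__":
    stored_id = "0000-0002-1825-0097"   # sample identifier in ORCID format
    other_id = "0000-0001-5109-3700"    # a different sample identifier
    print(text_match(stored_id, other_id))     # shares the token "0000"
    print(string_match(stored_id, other_id))   # differs as a whole value
    print(string_match(stored_id, stored_id))
```

Under this toy analysis, the two different identifiers still match as "text" because they share the token `0000`, while the exact comparison tells them apart. That is the kind of false positive the release note suggests a "string" field avoids.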
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+### Backward Incompatibilities
+
+An undocumented Search API parameter called "show_my_data" has been removed. It was never exercised by tests and is believed to be unused. API users should use the [MyData] API instead. See the [API changelog](https://dataverse-guide--11375.org.readthedocs.build/en/11375/api/changelog.html), #11287 and #11375.
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+### Tabular Tags can now be replaced
+
+Previously, the API `POST /files/{id}/metadata/tabularTags` could only add new tags to a file's list of tabular tags. With the query parameter `?replace=true`, the existing list is now replaced by the submitted tags.
+
+See also [the guides](https://dataverse-guide--11359.org.readthedocs.build/en/11359/api/native-api.html#updating-file-tabular-tags), #11292, and #11359.
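
The add-versus-replace behavior described above can be sketched in a few lines of Python. This is a model of the semantics, not the server code: the helper names `tabular_tags_url` and `apply_tabular_tags` are hypothetical, the tag names are example values, and skipping duplicate tags is an assumption that the release note does not confirm. The URL shape follows the curl examples in the API guide.

```python
from urllib.parse import urlencode


def tabular_tags_url(server_url: str, file_id: int, replace: bool = False) -> str:
    """Build the endpoint URL, appending ?replace=true only when requested."""
    url = f"{server_url}/api/files/{file_id}/metadata/tabularTags"
    if replace:
        url += "?" + urlencode({"replace": "true"})
    return url


def apply_tabular_tags(existing: list[str], requested: list[str], replace: bool = False) -> list[str]:
    """Model the resulting tag list: start empty when replacing, otherwise keep existing tags."""
    result = [] if replace else list(existing)
    for tag in requested:
        if tag not in result:  # assumption: duplicate tags are not stored twice
            result.append(tag)
    return result


if __name__ == "__main__":
    print(tabular_tags_url("https://demo.dataverse.org", 24, replace=True))
    print(apply_tabular_tags(["Survey"], ["Panel"]))                # tags are added
    print(apply_tabular_tags(["Survey"], ["Panel"], replace=True))  # list is replaced
    print(apply_tabular_tags(["Survey"], [], replace=True))         # all tags removed
```

In this model, an empty request only clears the list when `replace` is set; without it, the existing tags are left unchanged.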

doc/sphinx-guides/source/admin/metadatacustomization.rst

Lines changed: 7 additions & 0 deletions
@@ -144,6 +144,7 @@ Each of the three main sections own sets of properties:
 | | | | \• email |
 | | | | \• text |
 | | | | \• textbox |
+| | | | \• string |
 | | | | \• url |
 | | | | \• int |
 | | | | \• float |
@@ -315,6 +316,12 @@ FieldType definitions
 |               | section of the Dataset + File      |
 |               | Management page in the User Guide. |
 +---------------+------------------------------------+
+| string        | Any text may be entered into this  |
+|               | field. The value is stored and     |
+|               | indexed exactly as provided,       |
+|               | without any text analysis or       |
+|               | transformations.                   |
++---------------+------------------------------------+
 | url           | If not empty, field must contain   |
 |               | a valid URL.                       |
 +---------------+------------------------------------+

doc/sphinx-guides/source/api/changelog.rst

Lines changed: 5 additions & 0 deletions
@@ -7,6 +7,11 @@ This API changelog is experimental and we would love feedback on its usefulness.
     :local:
     :depth: 1
 
+v6.7
+----
+
+- An undocumented :doc:`search` parameter called "show_my_data" has been removed. It was never exercised by tests and is believed to be unused. API users should use the :ref:`api-mydata` API instead.
+
 v6.6
 ----
 

doc/sphinx-guides/source/api/native-api.rst

Lines changed: 13 additions & 0 deletions
@@ -4669,6 +4669,8 @@ Updating File Tabular Tags
 
 Updates the tabular tags for an existing tabular file where ``ID`` is the database id of the file to update or ``PERSISTENT_ID`` is the persistent id (DOI or Handle) of the file. Requires a ``jsonString`` expressing the tabular tag names.
 
+The list of "tabularTags" will be added to the existing list unless the optional ``replace=true`` query parameter is included. Including this parameter causes the pre-existing tags to be deleted before the "tabularTags" are added. Sending an empty list with ``replace=true`` removes all of the pre-existing tags.
+
 The JSON representation of tabular tags (``tags.json``) looks like this::
 
   {
@@ -4698,6 +4700,9 @@ The fully expanded example above (without environment variables) looks like this
   curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X POST \
     "http://demo.dataverse.org/api/files/24/metadata/tabularTags" \
     -H "Content-type:application/json" --upload-file tags.json
+  curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X POST \
+    "http://demo.dataverse.org/api/files/24/metadata/tabularTags?replace=true" \
+    -H "Content-type:application/json" --upload-file tags.json
 
 A curl example using a ``PERSISTENT_ID``
 
@@ -4711,6 +4716,9 @@ A curl example using a ``PERSISTENT_ID``
   curl -H "X-Dataverse-key:$API_TOKEN" -X POST \
     "$SERVER_URL/api/files/:persistentId/metadata/tabularTags?persistentId=$PERSISTENT_ID" \
     -H "Content-type:application/json" --upload-file $FILE_PATH
+  curl -H "X-Dataverse-key:$API_TOKEN" -X POST \
+    "$SERVER_URL/api/files/:persistentId/metadata/tabularTags?persistentId=$PERSISTENT_ID&replace=true" \
+    -H "Content-type:application/json" --upload-file $FILE_PATH
 
 The fully expanded example above (without environment variables) looks like this:
 
@@ -4719,6 +4727,9 @@ The fully expanded example above (without environment variables) looks like this
   curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X POST \
     "https://demo.dataverse.org/api/files/:persistentId/metadata/tabularTags?persistentId=doi:10.5072/FK2/AAA000" \
     -H "Content-type:application/json" --upload-file tags.json
+  curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X POST \
+    "https://demo.dataverse.org/api/files/:persistentId/metadata/tabularTags?persistentId=doi:10.5072/FK2/AAA000&replace=true" \
+    -H "Content-type:application/json" --upload-file tags.json
 
 Note that the specified tabular tags must be valid. The supported tags are:
 
@@ -7423,6 +7434,8 @@ As a superuser::
 
 Note that this API is probably only useful for testing.
 
+.. _api-mydata:
+
 MyData
 ------
 

src/main/java/edu/harvard/iq/dataverse/DatasetFieldType.java

Lines changed: 4 additions & 2 deletions
@@ -36,8 +36,8 @@ public class DatasetFieldType implements Serializable, Comparable<DatasetFieldTy
      * The set of possible metatypes of the field. Used for validation and layout.
      */
     public enum FieldType {
-        TEXT, TEXTBOX, DATE, EMAIL, URL, FLOAT, INT, NONE
-    };
+        TEXT, TEXTBOX, STRING, DATE, EMAIL, URL, FLOAT, INT, NONE
+    };
 
     @Id
     @GeneratedValue(strategy = GenerationType.IDENTITY)
@@ -558,6 +558,8 @@ public SolrField getSolrField() {
             solrType = SolrField.SolrType.INTEGER;
         } else if (fieldType.equals(FieldType.FLOAT)) {
             solrType = SolrField.SolrType.FLOAT;
+        } else if (fieldType.equals(FieldType.STRING)) {
+            solrType = SolrField.SolrType.STRING;
         }
 
         Boolean anyParentAllowsMultiplesBoolean = false;

src/main/java/edu/harvard/iq/dataverse/api/Files.java

Lines changed: 5 additions & 1 deletion
@@ -1,5 +1,6 @@
 package edu.harvard.iq.dataverse.api;
 
+import com.google.api.client.util.Lists;
 import com.google.gson.Gson;
 import com.google.gson.JsonObject;
 import edu.harvard.iq.dataverse.*;
@@ -947,7 +948,7 @@ public Response setFileCategories(@Context ContainerRequestContext crc, @PathPar
     @AuthRequired
     @Path("{id}/metadata/tabularTags")
     @Produces(MediaType.APPLICATION_JSON)
-    public Response setFileTabularTags(@Context ContainerRequestContext crc, @PathParam("id") String dataFileId, String jsonBody) {
+    public Response setFileTabularTags(@Context ContainerRequestContext crc, @PathParam("id") String dataFileId, String jsonBody, @QueryParam("replace") boolean replaceData) {
         return response(req -> {
             DataFile dataFile = execCommand(new GetDataFileCommand(req, findDataFileOrDie(dataFileId)));
             if (!dataFile.isTabularData()) {
@@ -957,6 +958,9 @@ public Response setFileTabularTags(@Context ContainerRequestContext crc, @PathPa
             try (StringReader stringReader = new StringReader(jsonBody)) {
                 jsonObject = Json.createReader(stringReader).readObject();
                 JsonArray requestedTabularTagsJson = jsonObject.getJsonArray("tabularTags");
+                if (replaceData) {
+                    dataFile.setTags(Lists.newArrayList());
+                }
                 for (JsonValue jsonValue : requestedTabularTagsJson) {
                     JsonString jsonString = (JsonString) jsonValue;
                     try {

src/main/java/edu/harvard/iq/dataverse/api/Search.java

Lines changed: 2 additions & 12 deletions
@@ -66,7 +66,6 @@ public Response search(
             @QueryParam("fq") final List<String> filterQueries,
             @QueryParam("show_entity_ids") boolean showEntityIds,
             @QueryParam("show_api_urls") boolean showApiUrls,
-            @QueryParam("show_my_data") boolean showMyData,
             @QueryParam("query_entities") boolean queryEntities,
             @QueryParam("metadata_fields") List<String> metadataFields,
             @QueryParam("geo_point") String geoPointRequested,
@@ -97,8 +96,8 @@ public Response search(
             objectTypeCountsMap.put(SearchConstants.UI_DATASETS, 0L);
             objectTypeCountsMap.put(SearchConstants.UI_FILES, 0L);
 
-            // users can't change these (yet anyway)
-            boolean dataRelatedToMe = showMyData; //getDataRelatedToMe();
+            // hard-coded to false since dataRelatedToMe is only used by MyData (DataRetrieverAPI)
+            boolean dataRelatedToMe = false;
 
             try {
                 // we have to add "" (root) otherwise there is no permissions check
@@ -290,15 +289,6 @@ public boolean tokenLessSearchAllowed() {
         return tokenLessSearchAllowed;
     }
 
-    private boolean getDataRelatedToMe() {
-        /**
-         * @todo support Data Related To Me:
-         * https://github.com/IQSS/dataverse/issues/1299
-         */
-        boolean dataRelatedToMe = false;
-        return dataRelatedToMe;
-    }
-
     private int getNumberOfResultsPerPage(int numResultsPerPage) {
         /**
          * @todo should maxLimit be configurable?
