Merged
9 changes: 9 additions & 0 deletions src/main/java/edu/harvard/iq/dataverse/DatasetPage.java
@@ -332,6 +332,7 @@ public enum DisplayMode {
private List<SelectItem> linkingDVSelectItems;
private Dataverse linkingDataverse;
private Dataverse selectedHostDataverse;
+ private Boolean hasDataversesToChoose;

public Dataverse getSelectedHostDataverse() {
return selectedHostDataverse;
@@ -1781,6 +1782,14 @@ public void setDataverseTemplates(List<Template> dataverseTemplates) {
this.dataverseTemplates = dataverseTemplates;
}

+ public boolean isHasDataversesToChoose() {
+ // TODO we actually need to look for dataverses where a user has permission to add a dataset
Contributor Author

I tried to create a new dataset as a user with no permission on any of the 100K numbered test dataverses. Searching with 2 digits took a few seconds.
So a variant of DataverseServiceBean.filterDataversesForHosting (on github) that has no pattern for the query and exits after one (or two?) successful authorisation checks would take far too long for users with permission on too few dataverses.

A more efficient query might be something like

SELECT dv
FROM roleassignment ra
JOIN dataverse      dv
  on dv.id = ra.definitionpoint_id
where ra.assigneeidentifier in ('@user001', ':authenticated-users');

However, RoleAssignment.findAssigneesWithPermissionOnDvObject (on github) is dazzling me.

Member

Yeah - that query is more complex because it has to look for filedownloads configured on the dataset owning a file, etc. I haven't fully thought it through, but this custom code from QDR might be a reasonable model: https://github.com/QualitativeDataRepository/dataverse/blob/b9c0bdc374518cc499eba564e8ad04ec059bc1ff/src/main/java/edu/harvard/iq/dataverse/PermissionServiceBean.java#L1014-L1051 - it checks for superuser first and, if the user isn't one, gets the list of roles that include the relevant permission (AddDataset in this case) and looks for a case where the user has one of those roles on something. I think you could constrain it further by requiring the roleassignment's dvobject to have dtype='Dataverse'. Setting the limit to 2 would then let you see whether that user has a choice somewhere. I think this is reasonably performant, but I haven't tested it on a very large database (in the case in that PR, it was orders of magnitude faster than the code I replaced).

Do you want to try that? And maybe cache the result so it isn't called 7 times and call this PR ready to move forward?

Contributor Author

@jo-pol Oct 9, 2025

Looking at the dazzling query PermissionServiceBean.LIST_ALL_DATAVERSES_USER_HAS_PERMISSION, I might have to do something with groups too. That might be more work than @janvanmansum bargained for.

Anyway, even with a simplified solution I get:

jakarta.ejb.EJBException: An exception occurred while creating a query in EntityManager: 
Exception Description: Syntax error parsing [SELECT dv FROM roleassignment ra JOIN dataverse      dv   ON dv.id = ra.definitionpoint_id WHERE ra.assigneeidentifier = '@user001' OR ra.assigneeidentifier = ':authenticated-users' limit 2]. 
[90, 182] The expression is not a valid conditional expression.

The query works in psql.

Contributor Author

The simplified attempt: DANS-KNAW-jp@4fcf8e1

Contributor Author

I hacked PermissionServiceBean.findPermittedCollections by adding "limit 2" in a case where it would otherwise have returned 50K instances. It responded quickly. So I need

permissionService.findPermittedCollections(req, user, Set.of(Permission.AddDataset))

with a variant of the query in the simplified attempt.

Member

@qqmyers Oct 9, 2025

@jo-pol - Not sure where my comment went, so I'm roughly retyping it.
The main issue with the 'simplified attempt' is that it's a native SQL query, not JPQL. That could be addressed by using createNativeQuery() or by rewriting it in JPQL. I suggest the following code (untested). It handles all the groups the user might be in and limits which roles are checked, while keeping your check that the definition points are Dataverses.

I'm not sure whether it is more efficient than findPermittedCollections, but it certainly looks simpler. The code below does split finding the roles and groups into separate queries (less efficient?), but it may be more efficient to check the role ids (which hopefully are indexed) rather than doing the bit-level OR on the permissions. I'm not sure how well postgres can optimize either case.

        Set<RoleAssignee> ras = new HashSet<>(groupService.groupsFor(req));
        ras.add(user);

        List<String> raIds = ras.stream().map(RoleAssignee::getIdentifier).collect(Collectors.toList());

        // Collect the ids of all roles that include the AddDataset permission
        List<Long> roleIds = new ArrayList<>();
        for (DataverseRole role : roleService.findAll()) {
            if (role.permissions().contains(Permission.AddDataset)) {
                roleIds.add(role.getId());
            }
        }
        if (roleIds.isEmpty()) {
            // An IN clause with an empty collection is not portable JPQL
            return false;
        }
        // Just check for more than one matching record
        // More efficient than counting all records
        // getResultList() returns an empty list; it does not throw NoResultException
        return em.createQuery(
                "SELECT ra.id FROM RoleAssignment ra " +
                    "JOIN Dataverse dv ON dv.id = ra.definitionPoint.id " +
                    "WHERE ra.assigneeIdentifier IN :assigneeIdentifiers " +
                    "AND ra.role.id IN :roleIds",
                Long.class)
            .setParameter("assigneeIdentifiers", raIds)
            .setParameter("roleIds", roleIds)
            .setMaxResults(2)
            .getResultList().size() > 1;
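
To make the intent of the "limit 2" check concrete, here is a rough in-memory sketch in plain Java. It is only an illustration under stated assumptions: `Assignment` and `HostChoiceCheck` are hypothetical stand-ins, not Dataverse classes, and a list scan replaces the database query.

```java
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-in for one role assignment row (not the Dataverse entity).
record Assignment(String assigneeIdentifier, long roleId, String definitionPointType) {}

final class HostChoiceCheck {
    private HostChoiceCheck() {}

    // True when at least two assignments match; at most two matches are kept.
    static boolean hasChoice(List<Assignment> assignments,
                             Set<String> assigneeIds,
                             Set<Long> roleIds) {
        if (assigneeIds.isEmpty() || roleIds.isEmpty()) {
            return false;
        }
        return assignments.stream()
                // Only assignments defined on a collection count
                .filter(a -> "Dataverse".equals(a.definitionPointType()))
                // The user or one of the user's groups holds the assignment
                .filter(a -> assigneeIds.contains(a.assigneeIdentifier()))
                // The role is one of those that grant the permission
                .filter(a -> roleIds.contains(a.roleId()))
                .limit(2)
                .count() > 1;
    }
}
```

As in the query it mirrors, two matching assignments on the same collection count as two hits, so this is an approximation of "more than one collection to choose from".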

Contributor Author

@qqmyers

"That could be addressed by using createNativeQuery()"

That is how the permissionService executes the query anyway and how I reused it with limit 2 added. Reusing existing logic reduces maintenance risks IMHO. Updated the cover message with how I tested.

+ if (this.hasDataversesToChoose == null) {
+ this.hasDataversesToChoose = dataverseService.getDataverseCount() > 1;
+ }
+ return this.hasDataversesToChoose;
+ }

public Template getDefaultTemplate() {
return defaultTemplate;
}
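
The null-checked `Boolean` field in `isHasDataversesToChoose()` above is a lazy cache, so the check runs once even if the page asks for the value several times. A minimal sketch of that pattern in isolation, assuming single-threaded use; `CachedFlag` is a hypothetical helper and not part of Dataverse:

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: computes a boolean on first use and reuses it afterwards.
// Not thread-safe; assumes the owning object is used by one thread at a time.
final class CachedFlag {
    private final BooleanSupplier expensiveCheck;
    private Boolean cached; // null means "not computed yet"

    CachedFlag(BooleanSupplier expensiveCheck) {
        this.expensiveCheck = expensiveCheck;
    }

    boolean get() {
        if (cached == null) {
            cached = expensiveCheck.getAsBoolean();
        }
        return cached;
    }
}
```

Calling `get()` repeatedly on one instance invokes the supplier only on the first call; the cached value is never refreshed, which fits a value that should stay stable while one page is being rendered.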
17 changes: 16 additions & 1 deletion src/main/java/edu/harvard/iq/dataverse/DataverseConverter.java
@@ -13,19 +13,34 @@
import jakarta.faces.convert.Converter;
import jakarta.faces.convert.FacesConverter;

+ import java.util.logging.Logger;

/**
*
* @author skraffmiller
*/
@FacesConverter("dataverseConverter")
public class DataverseConverter implements Converter {
+ private static final Logger logger = Logger.getLogger(DataverseConverter.class.getCanonicalName());


//@EJB
DataverseServiceBean dataverseService = CDI.current().select(DataverseServiceBean.class).get();

@Override
public Object getAsObject(FacesContext facesContext, UIComponent component, String submittedValue) {
- return dataverseService.find(new Long(submittedValue));
+ if (submittedValue == null || !submittedValue.matches("[0-9]+")) {
+ logger.fine("Submitted value is not a host dataverse number but: " + submittedValue);
+ return CDI.current().select(DatasetPage.class).get().getSelectedHostDataverse();
+ }
+ else {
+ try {
+ return dataverseService.find(Long.parseLong(submittedValue));
+ } catch (NumberFormatException e) {
+ logger.warning("Submitted value is out of range for a Long: " + submittedValue);
+ return CDI.current().select(DatasetPage.class).get().getSelectedHostDataverse();
+ }
+ }
//return dataverseService.findByAlias(submittedValue);
}
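
The converter change above guards the id twice: the digit pattern rejects non-numeric input, and the `NumberFormatException` handler covers digit strings too large for a `long`. A small standalone sketch of that two-step check, where `SubmittedIds.parseId` is a hypothetical helper and not code from this PR:

```java
import java.util.OptionalLong;

// Hypothetical helper illustrating the two guards; not part of Dataverse.
final class SubmittedIds {
    private SubmittedIds() {}

    static OptionalLong parseId(String submittedValue) {
        // Guard 1: null, empty, or anything that is not purely ASCII digits
        if (submittedValue == null || !submittedValue.matches("[0-9]+")) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(submittedValue));
        } catch (NumberFormatException e) {
            // Guard 2: all digits, but larger than Long.MAX_VALUE
            return OptionalLong.empty();
        }
    }
}
```

A caller can then fall back to a default, such as the currently selected host, whenever the result is empty.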

2 changes: 1 addition & 1 deletion src/main/webapp/dataset.xhtml
@@ -777,7 +777,7 @@
<!-- Create/Edit editMode -->
<ui:fragment rendered="#{DatasetPage.editMode == 'METADATA' or DatasetPage.editMode == 'CREATE'}">
<p:focus context="datasetForm"/>
- <div class="form-group">
+ <div class="form-group" jsf:rendered="#{DatasetPage.hasDataversesToChoose and DatasetPage.editMode == 'CREATE'}">
<label jsf:for="#{DatasetPage.editMode == 'CREATE' ? 'selectHostDataverse_input' : 'select_HostDataverse_Static'}" class="col-md-3 control-label">
#{bundle.hostDataverse}
<span class="glyphicon glyphicon-question-sign tooltip-icon"