/v1/datasets/tags returns 404 due to @validate_dataset_token expecting dataset_id #26605

@skimhiro

Description

Self Checks

  • I have read the Contributing Guide and Language Policy.
  • This is only for bug report, if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report, otherwise it will be closed.
  • [Chinese users & Non English User] Please submit in English, otherwise it will be closed :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

1.9.1

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Endpoint:

GET /v1/datasets/tags

Actual Response:

{
  "code": "not_found",
  "message": "Dataset not found. You have requested this URI [/v1/datasets/tags] but did you mean /v1/datasets/tags or /v1/datasets/<uuid:dataset_id>/tags or /v1/datasets/<uuid:dataset_id>/documents/<uuid:document_id>/segments/<uuid:segment_id>/child_chunks/<uuid:child_chunk_id> ?",
  "status": 404
}

🧩 Problem Description

When calling /v1/datasets/tags, the API incorrectly returns a 404 Dataset not found error.
The root cause is that the @validate_dataset_token decorator assumes the presence of a dataset_id parameter, even for endpoints that do not include one (like /datasets or /datasets/tags).

This results in a failed lookup (DatasetService.get_dataset(None)), which raises a NotFound("Dataset not found.").
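
The failure mode described above can be sketched in isolation. This is a minimal, framework-free illustration, not Dify's actual decorator: `NotFound`, `get_dataset`, `validate_dataset_token_unguarded`, and the in-memory `_DATASETS` store are all hypothetical stand-ins, and the assumption is that the real decorator performs an equivalent unconditional lookup.

```python
from functools import wraps


class NotFound(Exception):
    """Hypothetical stand-in for the HTTP 404 exception raised by the API."""


# Hypothetical in-memory store standing in for the dataset table.
_DATASETS = {"ds-1": {"id": "ds-1"}}


def get_dataset(dataset_id):
    """Stand-in for DatasetService.get_dataset; returns None when missing."""
    return _DATASETS.get(dataset_id)


def validate_dataset_token_unguarded(view):
    """Assumed behaviour: always look up dataset_id, even when the route has none."""
    @wraps(view)
    def wrapper(*args, **kwargs):
        dataset_id = kwargs.get("dataset_id")  # None for /datasets/tags
        if get_dataset(dataset_id) is None:
            raise NotFound("Dataset not found.")
        return view(*args, **kwargs)
    return wrapper


@validate_dataset_token_unguarded
def list_tags():
    """Route without a dataset_id argument, like GET /v1/datasets/tags."""
    return []


def call_list_tags():
    """Translate the exception into a response body shaped like the one reported."""
    try:
        return list_tags()
    except NotFound as exc:
        return {"code": "not_found", "message": str(exc), "status": 404}
```

Calling `call_list_tags()` never reaches the view body, because the lookup with `None` fails first.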


✅ Expected Behavior

GET /v1/datasets/tags should return the list of knowledge-type tags successfully, as it did previously, without requiring a dataset ID.


⚙️ Root Cause

  • The decorator @validate_dataset_token is applied globally to dataset-related APIs.
  • It does not check whether dataset_id exists in the route arguments before attempting validation.
  • This causes all routes without dataset_id to fail.
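
One possible alternative to removing the decorator is to make the lookup conditional on `dataset_id` being present. The sketch below only illustrates that guard under stated assumptions: `validate_dataset_token_guarded`, `NotFound`, and `_DATASETS` are hypothetical names, and token validation itself is omitted.

```python
from functools import wraps


class NotFound(Exception):
    """Hypothetical stand-in for the HTTP 404 exception raised by the API."""


# Hypothetical in-memory store standing in for the dataset table.
_DATASETS = {"ds-1": {"id": "ds-1"}}


def validate_dataset_token_guarded(view):
    """Only validate the dataset when the route actually carries a dataset_id."""
    @wraps(view)
    def wrapper(*args, **kwargs):
        dataset_id = kwargs.get("dataset_id")
        # Skip the lookup entirely for routes such as /datasets/tags.
        if dataset_id is not None and dataset_id not in _DATASETS:
            raise NotFound("Dataset not found.")
        return view(*args, **kwargs)
    return wrapper


@validate_dataset_token_guarded
def list_tags():
    """Route without dataset_id; should pass through untouched."""
    return ["tag-a"]


@validate_dataset_token_guarded
def get_dataset_tags(dataset_id):
    """Route with dataset_id; still validated against the store."""
    return {"dataset_id": dataset_id}
```

With this guard, routes that include a `dataset_id` keep their 404 behaviour for unknown datasets, while routes without one are no longer rejected.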

🩹 Proposed Fix

Remove or bypass @validate_dataset_token for non-dataset-specific routes (e.g., /datasets, /datasets/tags).

Updated Code Example:

class DatasetTagsApi(DatasetApiResource):
    @service_api_ns.doc("list_dataset_tags")
    @service_api_ns.doc(description="Get all knowledge type tags")
    @service_api_ns.doc(
        responses={
            200: "Tags retrieved successfully",
            401: "Unauthorized - invalid API token",
        }
    )
    # @validate_dataset_token  # Removed decorator to avoid dataset_id check
    @service_api_ns.marshal_with(build_dataset_tag_fields(service_api_ns))
    def get(self, tenant_id):
        """Get all knowledge type tags."""
        assert isinstance(current_user, Account)
        cid = current_user.current_tenant_id
        assert cid is not None
        tags = TagService.get_tags("knowledge", cid)

        # Prepare response JSON
        response = []
        for tag in tags:
            binding_count = TagService.get_tag_binding_count(tag.id)
            response.append(
                {
                    "id": tag.id,
                    "name": tag.name,
                    "type": tag.type,
                    "binding_count": binding_count,
                }
            )
        return response, 200

✔️ Expected Behavior

Response

[
  {
    "id": "5ff031f2-0655-40b3-b955-8a057fa415d9",
    "name": "test2",
    "type": "knowledge",
    "binding_count": "1"
  },
  {
    "id": "e2d7cc22-b9b1-40cc-a135-8f81a0ea32c8",
    "name": "test",
    "type": "knowledge",
    "binding_count": "1"
  }
]

❌ Actual Behavior

No response
