Skip to content

Conversation

@s1st
Copy link

@s1st s1st commented Jan 10, 2026

Summary

Fixes langfuse/langfuse#10184

When datasets have folder-format names containing slashes (e.g., folder/subfolder/dataset), the SDK was returning 404 errors for get_run, get_runs, and delete_run operations because the Fern-generated API client doesn't properly URL-encode slashes in path parameters.

This PR adds wrapper methods in the main Langfuse client that properly URL-encode dataset and run names before passing them to the Fern client, following the same pattern already used for get_dataset().

Changes

  • Added get_dataset_run() wrapper method with debug logging
  • Added get_dataset_runs() wrapper method with debug logging
  • Added delete_dataset_run() wrapper method with debug logging
  • Added 4 test cases for folder-format dataset names

Test plan

  • All existing tests pass
  • New tests pass for folder-format dataset names with slashes
  • Verified no double-encoding occurs
poetry run pytest -s -v tests/test_datasets.py

Important

Adds wrapper methods in client.py for URL encoding dataset and run names, supporting folder-format names, with tests in test_datasets.py.

  • Behavior:
    • Adds get_dataset_run(), get_dataset_runs(), and delete_dataset_run() wrapper methods in client.py to handle URL encoding for dataset and run names with slashes.
    • Ensures compatibility with folder-format names (e.g., folder/subfolder/dataset).
  • Tests:
    • Adds 4 test cases in test_datasets.py to verify functionality with folder-format dataset names.
    • Tests cover fetching, listing, and deleting dataset runs with slashes in names.
  • Misc:
    • Debug logging added to new wrapper methods for better traceability.

This description was created by Ellipsis for 9062bf9. You can customize this summary. It will automatically update as commits are pushed.

Disclaimer: Experimental PR review

Greptile Overview

Greptile Summary

This PR adds three wrapper methods to handle URL encoding for dataset run operations when dataset/run names contain slashes (folder-format names like folder/subfolder/dataset).

Changes Made

New Wrapper Methods in client.py:

  • get_dataset_run() - Fetches a specific dataset run with proper URL encoding
  • get_dataset_runs() - Lists all runs for a dataset with proper URL encoding
  • delete_dataset_run() - Deletes a dataset run with proper URL encoding

All three methods follow the same pattern as the existing get_dataset() wrapper:

  1. Accept dataset/run names as parameters
  2. URL-encode them using _url_encode() (without is_url_param=True since these are path parameters, not query parameters)
  3. Pass encoded values to the Fern-generated API client
  4. Include debug logging and error handling

New Tests in test_datasets.py:

  • 4 comprehensive tests covering dataset/run operations with folder-format names
  • Tests verify that slashes are properly encoded and preserved throughout the request/response cycle

Implementation Details

The fix addresses the root cause: the Fern-generated API client passes parameters directly to httpx without encoding. For path parameters embedded in URLs (like /datasets/{name}/runs/{runName}), manual URL encoding is required before passing to the Fern client.

The _url_encode() method uses urllib.parse.quote(url, safe="") to encode all characters including slashes. The safe="" parameter is critical - without it, slashes would not be encoded, and URLs like /datasets/folder/subfolder/dataset/runs/run1 would be misinterpreted by the server.

Issue Found

One existing test still uses the direct API call langfuse.api.datasets.get_runs() instead of the new wrapper method, which could cause failures for datasets with special characters in their names.

Confidence Score: 4/5

  • This PR is generally safe to merge with one minor inconsistency that should be fixed.
  • Score of 4 reflects solid implementation following existing patterns correctly, with comprehensive tests covering the new functionality. The new wrapper methods properly handle URL encoding for path parameters, and imports are correctly placed per custom rules. However, one existing test (line 223 in test_datasets.py) still uses the direct API call instead of the new wrapper method, creating an inconsistency that could lead to test failures for datasets with special characters. This is a non-critical issue that doesn't affect the core functionality but should be fixed for consistency and robustness.
  • Pay attention to tests/test_datasets.py line 223 - this should be updated to use the new wrapper method for consistency.

Important Files Changed

File Analysis

Filename Score Overview
langfuse/_client/client.py 5/5 Adds three wrapper methods (get_dataset_run, get_dataset_runs, delete_dataset_run) with proper URL encoding for path parameters. Implementation follows existing patterns correctly, imports are properly placed at the top per custom rules.
tests/test_datasets.py 3/5 Adds 4 new tests for folder-format names. Tests are well-structured but existing test at line 223 still uses direct API call instead of new wrapper, creating inconsistency.

Sequence Diagram

sequenceDiagram
    participant User
    participant Langfuse as Langfuse Client
    participant Wrapper as Wrapper Method
    participant URLEncode as _url_encode()
    participant FernAPI as Fern API Client
    participant HTTPx as httpx
    participant Server as Langfuse Server

    User->>Langfuse: get_dataset_run("folder/dataset", "run/name")
    Langfuse->>Wrapper: Call wrapper method
    Wrapper->>URLEncode: _url_encode("folder/dataset")
    URLEncode-->>Wrapper: "folder%2Fdataset"
    Wrapper->>URLEncode: _url_encode("run/name")
    URLEncode-->>Wrapper: "run%2Fname"
    Wrapper->>FernAPI: api.datasets.get_run(encoded_dataset, encoded_run)
    FernAPI->>HTTPx: request("api/public/datasets/folder%2Fdataset/runs/run%2Fname")
    HTTPx->>Server: GET /api/public/datasets/folder%2Fdataset/runs/run%2Fname
    Server-->>HTTPx: 200 OK with DatasetRunWithItems
    HTTPx-->>FernAPI: Response
    FernAPI-->>Wrapper: DatasetRunWithItems
    Wrapper-->>Langfuse: DatasetRunWithItems
    Langfuse-->>User: DatasetRunWithItems
Loading

(2/5) Greptile learns from your feedback when you react with thumbs up/down!

@CLAassistant
Copy link

CLAassistant commented Jan 10, 2026

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Jan 10, 2026

Additional Comments (1)

tests/test_datasets.py
This test still uses the direct API call langfuse.api.datasets.get_runs(dataset_name) which bypasses the URL encoding wrapper. This means if someone creates a dataset with slashes in the name, this test would fail with a 404 error.

The test should use the new wrapper method instead:

    runs = langfuse.get_dataset_runs(dataset_name)

This ensures consistency with the fix being introduced in this PR and makes the test more robust for datasets with special characters in their names.

Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/test_datasets.py
Line: 223:223

Comment:
This test still uses the direct API call `langfuse.api.datasets.get_runs(dataset_name)` which bypasses the URL encoding wrapper. This means if someone creates a dataset with slashes in the name, this test would fail with a 404 error.

The test should use the new wrapper method instead:

```suggestion
    runs = langfuse.get_dataset_runs(dataset_name)
```

This ensures consistency with the fix being introduced in this PR and makes the test more robust for datasets with special characters in their names.

How can I resolve this? If you propose a fix, please make it concise.

Fixes langfuse/langfuse#10184

Added wrapper methods in the Langfuse client that properly URL-encode
dataset and run names before passing to the Fern API client:

- get_dataset_run(): Fetch a specific dataset run by name
- get_dataset_runs(): Fetch all runs for a dataset
- delete_dataset_run(): Delete a dataset run

This enables support for folder-format dataset names containing slashes
(e.g., "folder/subfolder/dataset") which previously returned 404 errors
due to double URL encoding.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@s1st s1st force-pushed the fix/dataset-folder-url-encoding branch from 9062bf9 to d71efd5 Compare January 10, 2026 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: dataset folder gives 404 in python sdk

2 participants