Add modern Uploads and Datasets APIs, deprecate legacy Table API #187

Merged: va3093 merged 5 commits into main from add-uploads-datasets-api on Nov 14, 2025

@va3093 (Contributor) commented Nov 14, 2025

Context

The Dune API has evolved to use versioned endpoints (/v1/uploads/* and /v1/datasets/*) as defined in the OpenAPI specification, but the Python client was still using legacy unversioned routes (/table/*). This PR modernizes the client to support the new API structure while maintaining full backward compatibility.

Key findings from the OpenAPI spec analysis:

  • /v1/uploads/* endpoints provide comprehensive table management (list, create, upload CSV, insert, clear, delete)
  • /v1/datasets/* endpoints enable dataset discovery and schema inspection across all dataset types
  • The legacy /table/* endpoints are not present in the current OpenAPI spec, indicating they should be deprecated

What Changed

New APIs Added

UploadsAPI (dune_client/api/uploads.py) - Modern table management

  • list_uploads() - List all uploaded tables with pagination
  • create_table() - Create empty table with defined schema
  • upload_csv() - Upload CSV with automatic schema inference
  • insert_data() - Insert data into existing table (CSV/NDJSON)
  • clear_table() - Remove all data while preserving structure
  • delete_table() - Permanently delete table and data
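To make the difference between clear_table() and delete_table() concrete, here is a toy in-memory stand-in. It is not the real UploadsAPI: the class name, the parameters (namespace, table_name, schema, rows) and the return values are assumptions chosen only for illustration, and nothing here talks to the Dune API.

```python
class InMemoryUploads:
    """Toy stand-in for illustration only; signatures are assumptions."""

    def __init__(self):
        # Maps (namespace, table_name) -> {"schema": [...], "rows": [...]}
        self._tables = {}

    def create_table(self, namespace, table_name, schema):
        # Create an empty table with a defined schema.
        self._tables[(namespace, table_name)] = {"schema": list(schema), "rows": []}

    def insert_data(self, namespace, table_name, rows):
        # Append rows to an existing table and report how many were added.
        self._tables[(namespace, table_name)]["rows"].extend(rows)
        return len(rows)

    def clear_table(self, namespace, table_name):
        # Drop the rows but keep the table and its schema.
        self._tables[(namespace, table_name)]["rows"].clear()

    def delete_table(self, namespace, table_name):
        # Remove the table entirely, schema included.
        del self._tables[(namespace, table_name)]

    def list_uploads(self):
        return sorted(f"{ns}.{name}" for ns, name in self._tables)


uploads = InMemoryUploads()
uploads.create_table("my_team", "prices", [{"name": "price", "type": "double"}])
inserted = uploads.insert_data("my_team", "prices", [{"price": 1.5}, {"price": 2.0}])

uploads.clear_table("my_team", "prices")
after_clear = uploads.list_uploads()   # table is still listed, rows are gone

uploads.delete_table("my_team", "prices")
after_delete = uploads.list_uploads()  # table no longer exists
```

The point of the sketch is only the lifecycle ordering: clearing leaves the table discoverable, deleting does not.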

DatasetsAPI (dune_client/api/datasets.py) - Dataset discovery

  • list_datasets() - Browse datasets with filtering by owner/type
  • get_dataset() - Get detailed dataset info including schema

Response Models

Added 17 new dataclass models to models.py, including:

  • Dataset models: DatasetType, DatasetOwner, DatasetColumn, Dataset, DatasetListResponse, DatasetResponse
  • Upload models: TableOwner, TableColumn, TableElement, UploadListResponse, UploadCreateResponse, CSVUploadResponse, InsertDataResponse, ClearTableResponse, DeleteTableResponse

All models follow existing patterns with DataClassJsonMixin and from_dict() constructors.
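As a rough illustration of the from_dict() pattern, the sketch below uses plain standard-library dataclasses instead of DataClassJsonMixin so that it runs without extra dependencies. The field names (full_name, handle, type, nullable) follow the model fixes listed in the follow-up commits in this conversation; the sample payload and the exact field sets are assumptions, not the client's actual definitions.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class TableColumn:
    name: str
    type: str
    nullable: bool = False

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "TableColumn":
        return cls(
            name=data["name"],
            type=data["type"],
            nullable=bool(data.get("nullable", False)),
        )


@dataclass
class TableOwner:
    handle: str
    type: str

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "TableOwner":
        return cls(handle=data["handle"], type=data["type"])


@dataclass
class TableElement:
    full_name: str
    owner: TableOwner
    columns: List[TableColumn]

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "TableElement":
        # Nested objects are built through their own from_dict constructors.
        return cls(
            full_name=data["full_name"],
            owner=TableOwner.from_dict(data["owner"]),
            columns=[TableColumn.from_dict(c) for c in data.get("columns", [])],
        )


# Hypothetical payload, shaped after the fields named in this PR.
payload = {
    "full_name": "dune.my_team.prices",
    "owner": {"handle": "my_team", "type": "team"},
    "columns": [
        {"name": "symbol", "type": "varchar"},
        {"name": "price", "type": "double", "nullable": True},
    ],
}
table = TableElement.from_dict(payload)
```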

Backward Compatibility

TableAPI deprecation - All 5 methods in dune_client/api/table.py now use the @deprecated decorator:

  • upload_csv() → Use UploadsAPI.upload_csv() instead
  • create_table() → Use UploadsAPI.create_table() instead
  • insert_table() → Use UploadsAPI.insert_data() instead
  • clear_data() → Use UploadsAPI.clear_table() instead
  • delete_table() → Use UploadsAPI.delete_table() instead

Deprecation warnings include the version (1.9.0) and clear migration guidance. All legacy methods remain fully functional.
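The behaviour can be approximated with the standard library alone. This is a minimal sketch, not the decorator this PR uses: the deprecated helper is hand-rolled, and the TableAPI stub with its return value is a placeholder.

```python
import functools
import warnings


def deprecated(version: str, reason: str):
    """Hand-rolled stand-in for a @deprecated decorator (an assumption)."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Warn at the caller's location, then delegate unchanged.
            warnings.warn(
                f"{func.__name__} is deprecated since {version}: {reason}",
                category=DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


class TableAPI:
    """Stub standing in for the legacy API; the method body is a placeholder."""

    @deprecated(version="1.9.0", reason="Use UploadsAPI.clear_table() instead")
    def clear_data(self) -> str:
        return "cleared"


# The legacy method still works, but a DeprecationWarning is recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = TableAPI().clear_data()

message = str(caught[0].message)
```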

Method Resolution Order (MRO)

Updated ExtendedAPI inheritance in api/extensions.py:

  • Placed UploadsAPI and DatasetsAPI before TableAPI in the inheritance list
  • This ensures modern methods take precedence over deprecated ones with the same name
  • Added type: ignore[misc] to suppress expected mypy conflicts between incompatible method signatures
  • The client correctly uses new UploadsAPI methods while keeping deprecated TableAPI available
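The effect of the base-class ordering can be checked with stub classes. The class names match the PR, but the method bodies below are placeholders that only return a label, so this shows Python's method resolution rather than the client's real behaviour.

```python
class UploadsAPI:
    def upload_csv(self) -> str:
        return "UploadsAPI.upload_csv"


class DatasetsAPI:
    def list_datasets(self) -> str:
        return "DatasetsAPI.list_datasets"


class TableAPI:
    def upload_csv(self) -> str:
        return "TableAPI.upload_csv"

    def insert_table(self) -> str:
        return "TableAPI.insert_table"


# Modern APIs are listed first, so their methods win on name clashes.
class ExtendedAPI(UploadsAPI, DatasetsAPI, TableAPI):
    pass


client = ExtendedAPI()
mro_names = [cls.__name__ for cls in ExtendedAPI.__mro__]

resolved = client.upload_csv()        # name clash: resolved to UploadsAPI
legacy_only = client.insert_table()   # no clash: TableAPI is still reachable
```

Methods that exist only on TableAPI, such as insert_table(), stay available because nothing earlier in the MRO defines them.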

Testing

Unit tests (10 new tests, all passing):

  • tests/unit/test_uploads_api.py - Mock-based tests for all UploadsAPI methods
  • tests/unit/test_datasets_api.py - Mock-based tests for all DatasetsAPI methods

E2E integration tests:

  • tests/e2e/test_uploads_integration.py - Real API tests for table lifecycle
  • tests/e2e/test_datasets_integration.py - Real API tests for dataset discovery

Configuration

Updated pyproject.toml:

  • Added PT009 to test file ignores to allow unittest-style assertions
  • This maintains consistency with the existing test style preferences

Verification

✅ All 48 unit tests pass
✅ MyPy strict type checking passes (21 source files)
✅ Ruff formatting check passes
✅ Ruff linting passes
✅ Deprecation warnings work correctly
✅ New API methods accessible via DuneClient
✅ Zero breaking changes - full backward compatibility maintained

This commit introduces the modern /v1/uploads and /v1/datasets API
endpoints while maintaining backward compatibility with the legacy
/table endpoints.

Changes:
- Add UploadsAPI with /v1/uploads/* endpoints for table management
- Add DatasetsAPI with /v1/datasets/* endpoints for dataset discovery
- Deprecate all TableAPI methods with migration guidance
- Add comprehensive response models for new APIs
- Update ExtendedAPI to include new APIs with proper MRO
- Add unit and E2E tests for new functionality
- Configure ruff to allow unittest-style assertions in tests

All existing TableAPI methods remain functional but emit deprecation
warnings pointing users to the new UploadsAPI methods.

@cursor (bot) left a comment

Comment @cursor review or bugbot run to trigger another review on this PR

namespace=self.test_namespace,
table_name=self.test_table_name,
)
self.assertIsInstance(delete_result, DeleteTableResponse)

Bug: Namespace Confusion Leads to Orphaned Tables

The test_upload_csv_and_delete test calls upload_csv() which returns only table_name in the response (not namespace), then attempts to delete the table using a hardcoded namespace "test". The CSV upload endpoint likely creates tables in the authenticated user's namespace, not "test", causing the delete operation to target the wrong namespace and potentially fail or leave orphaned tables.
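One way to avoid the hardcoded namespace, in line with the DUNE_NAMESPACE change mentioned in the follow-up commits, is to resolve it from the environment and fail loudly when it is missing. The helper name and the error message are hypothetical; this is a sketch, not the fix that was merged.

```python
import os
from typing import Mapping, Optional


def resolve_test_namespace(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the namespace that owns the test tables (hypothetical helper)."""
    env = os.environ if env is None else env
    namespace = env.get("DUNE_NAMESPACE", "").strip()
    if not namespace:
        # Failing early is safer than deleting from a guessed namespace.
        raise RuntimeError(
            "Set DUNE_NAMESPACE to the namespace that owns the test tables"
        )
    return namespace


# Passing a plain dict keeps the example independent of the real environment.
namespace = resolve_test_namespace({"DUNE_NAMESPACE": " my_team "})
```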

@bh2smith (Collaborator) left a comment

Nice!

- Remove double /v1 prefix from all routes (api_version already includes it)
- Update Dataset models to match actual API responses (full_name instead of slug/namespace)
- Update Table models to match actual API responses (full_name format)
- Change owner from id to handle+type fields
- Add nullable field to column models
- Change InsertDataResponse to use 'name' field instead of 'table_name'
- Update datasets integration tests to use full_name and required filters
- Update uploads integration tests to use DUNE_NAMESPACE env var and handle CSV upload naming
- Fix column types in test schemas (int -> integer)
- Change Dataset mocks to use full_name instead of slug/namespace
- Update owner structure to use handle+type instead of id+handle
- Add nullable field to all column mocks
- Change metadata from description field to dict
- Update TableElement mocks to use full_name format
- Change InsertDataResponse mock to use 'name' instead of 'table_name'
- Fix all route expectations to remove /v1 prefix (already in api_version)
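
The double-prefix issue fixed above is easy to reproduce in isolation. In this sketch the build_url helper, the host, and the constant names are placeholders rather than the client's code; it only shows why a route must not repeat a version segment that api_version already supplies.

```python
def build_url(base_url: str, api_version: str, route: str) -> str:
    """Join the pieces the way a simple client might (hypothetical helper)."""
    return f"{base_url.rstrip('/')}{api_version}{route}"


BASE_URL = "https://api.example.com"  # placeholder host
API_VERSION = "/v1"

# Route repeats the version: the prefix appears twice.
buggy = build_url(BASE_URL, API_VERSION, "/v1/uploads")

# Route is version-free: api_version supplies the prefix once.
fixed = build_url(BASE_URL, API_VERSION, "/uploads")
```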
@va3093 force-pushed the add-uploads-datasets-api branch from 94ace33 to 9a497df on November 14, 2025 14:35
@va3093 force-pushed the add-uploads-datasets-api branch from 9a497df to 1f6cb0a on November 14, 2025 15:07
@va3093 merged commit 3dacf5e into main on Nov 14, 2025
2 checks passed
@va3093 deleted the add-uploads-datasets-api branch on November 14, 2025 19:59