feat: implement comprehensive dataset management functionality (#29)
Patrick-Ehimen merged 3 commits into main
Conversation
Reviewer's Guide

This PR introduces end-to-end dataset management support by extending the SDK, service layer, and MCP server tooling. The SDK wrapper gains create/update/get/list/delete dataset methods with progress tracking and versioning; the real and mock LighthouseService implementations are updated with dataset storage, caching, and conversion between SDK and service models; new MCP tools are registered and implemented for dataset operations; and corresponding types and test scaffolding are added.

Sequence diagram for dataset creation via MCP tool and service:

sequenceDiagram
actor User
participant MCPServer
participant LighthouseCreateDatasetTool
participant ILighthouseService
participant LighthouseAISDK
User->>MCPServer: Request to create dataset
MCPServer->>LighthouseCreateDatasetTool: execute(args)
LighthouseCreateDatasetTool->>ILighthouseService: createDataset(params)
ILighthouseService->>LighthouseAISDK: createDataset(filePaths, options)
LighthouseAISDK-->>ILighthouseService: DatasetInfo
ILighthouseService-->>LighthouseCreateDatasetTool: Dataset
LighthouseCreateDatasetTool-->>MCPServer: ToolResult
MCPServer-->>User: Dataset creation result
Sequence diagram for updating a dataset via MCP tool and service:

sequenceDiagram
actor User
participant MCPServer
participant LighthouseUpdateDatasetTool
participant ILighthouseService
participant LighthouseAISDK
User->>MCPServer: Request to update dataset
MCPServer->>LighthouseUpdateDatasetTool: execute(args)
LighthouseUpdateDatasetTool->>ILighthouseService: updateDataset(params)
ILighthouseService->>LighthouseAISDK: updateDataset(datasetId, options)
LighthouseAISDK-->>ILighthouseService: DatasetInfo
ILighthouseService-->>LighthouseUpdateDatasetTool: Dataset
LighthouseUpdateDatasetTool-->>MCPServer: ToolResult
MCPServer-->>User: Dataset update result
Class diagram for new and updated dataset types:

classDiagram
class DatasetOptions {
+string name
+string description
+boolean encrypt
+Record<string, any> metadata
+string[] tags
+onProgress(progress: ProgressInfo)
}
class DatasetInfo {
+string id
+string name
+string description
+string[] files
+string version
+Date createdAt
+Date updatedAt
+boolean encrypted
+Record<string, any> metadata
+string[] tags
+number totalSize
+number fileCount
}
class ListDatasetsResponse {
+DatasetInfo[] datasets
+number total
+boolean hasMore
+string cursor
}
DatasetOptions --> ProgressInfo
ListDatasetsResponse --> DatasetInfo
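Rendered as TypeScript, the shapes in the diagram above might look like the following sketch; which fields are optional (for example `cursor` and `onProgress`) is an assumption, since the diagram does not mark optionality.

```typescript
// Sketch of the dataset types from the class diagram; optionality and
// the exact shape of ProgressInfo are assumptions for illustration.
export interface ProgressInfo {
  loaded: number; // assumed: bytes transferred so far
  total: number;  // assumed: total bytes to transfer
}

export interface DatasetOptions {
  name: string;
  description?: string;
  encrypt?: boolean;
  metadata?: Record<string, any>;
  tags?: string[];
  onProgress?: (progress: ProgressInfo) => void;
}

export interface DatasetInfo {
  id: string;
  name: string;
  description: string;
  files: string[]; // CIDs of member files
  version: string;
  createdAt: Date;
  updatedAt: Date;
  encrypted: boolean;
  metadata: Record<string, any>;
  tags: string[];
  totalSize: number;
  fileCount: number;
}

export interface ListDatasetsResponse {
  datasets: DatasetInfo[];
  total: number;
  hasMore: boolean;
  cursor?: string;
}
```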
Class diagram for the ILighthouseService interface (updated):

classDiagram
class ILighthouseService {
+createDataset(params): Promise<Dataset>
+updateDataset(params): Promise<Dataset>
+getDataset(datasetId): Promise<Dataset | undefined>
+listDatasets(params): Promise<DatasetsListResult>
+deleteDataset(datasetId, deleteFiles): Promise<void>
+clear(): void
}
class Dataset {
+string id
+string name
+string description
+UploadResult[] files
+Record<string, any> metadata
+string version
+Date createdAt
+Date updatedAt
+boolean encrypted
+AccessCondition[] accessConditions
}
class DatasetsListResult {
+Dataset[] datasets
+number total
+boolean hasMore
}
ILighthouseService --> Dataset
ILighthouseService --> DatasetsListResult
DatasetsListResult --> Dataset
Class diagram for LighthouseAISDK dataset management methods:

classDiagram
class LighthouseAISDK {
+createDataset(filePaths, options): Promise<DatasetInfo>
+updateDataset(datasetId, options): Promise<DatasetInfo>
+getDataset(datasetId): Promise<DatasetInfo>
+listDatasets(limit, offset): Promise<ListDatasetsResponse>
+deleteDataset(datasetId, deleteFiles): Promise<void>
}
LighthouseAISDK --> DatasetInfo
LighthouseAISDK --> ListDatasetsResponse
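A hypothetical call sequence against the methods listed above might look like this; the SDK instance is taken as a structurally typed parameter so the sketch does not assume constructor details, and the option names mirror DatasetOptions from the earlier diagram.

```typescript
// Hypothetical usage sketch; SdkLike is an assumed structural subset of
// LighthouseAISDK, not the SDK's real type.
type SdkLike = {
  createDataset(filePaths: string[], options: Record<string, unknown>): Promise<{ id: string }>;
  listDatasets(limit: number, offset: number): Promise<{ datasets: unknown[]; hasMore: boolean }>;
};

async function createAndList(sdk: SdkLike): Promise<{ id: string; firstPage: number }> {
  const info = await sdk.createDataset(["./data/train.csv"], {
    name: "training-data",
    description: "example dataset",
    encrypt: false,
  });
  // First page of up to 10 datasets (limit = 10, offset = 0).
  const page = await sdk.listDatasets(10, 0);
  return { id: info.id, firstPage: page.datasets.length };
}
```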
Class diagram for the new MCP dataset tools:

classDiagram
class LighthouseCreateDatasetTool {
+execute(args): Promise<ProgressAwareToolResult>
+static getDefinition(): MCPToolDefinition
}
class LighthouseListDatasetsTool {
+execute(args): Promise<ProgressAwareToolResult>
+static getDefinition(): MCPToolDefinition
}
class LighthouseGetDatasetTool {
+execute(args): Promise<ProgressAwareToolResult>
+static getDefinition(): MCPToolDefinition
}
class LighthouseUpdateDatasetTool {
+execute(args): Promise<ProgressAwareToolResult>
+static getDefinition(): MCPToolDefinition
}
LighthouseCreateDatasetTool --> MCPToolDefinition
LighthouseListDatasetsTool --> MCPToolDefinition
LighthouseGetDatasetTool --> MCPToolDefinition
LighthouseUpdateDatasetTool --> MCPToolDefinition
Hey there - I've reviewed your changes - here's some feedback:
- There’s a lot of duplicated logic mapping SDK’s DatasetInfo to your internal Dataset type across the service layer—consider extracting a shared converter to reduce repetition and prevent drift.
- I noticed you registered create, list, get, and update dataset tools but did not add a deleteDataset tool to the registry—adding that would round out the full CRUD support.
- You’re generating dataset IDs and version strings inline in multiple methods; pulling that into a shared utility or factory would keep things consistent and make future changes easier.
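The shared converter suggested in the first point might look like the sketch below; the field names follow the diagrams earlier in this review, while the exact service-side file shape is an assumption.

```typescript
// Hypothetical shared converter from the SDK's DatasetInfo to the
// service-layer Dataset, so every service method reuses one mapping.
interface SdkDatasetInfo {
  id: string; name: string; description: string;
  files: string[]; version: string;
  createdAt: Date; updatedAt: Date; encrypted: boolean;
  metadata: Record<string, any>;
}

interface ServiceDataset {
  id: string; name: string; description: string;
  files: { cid: string; size: number; encrypted: boolean }[];
  metadata: Record<string, any>; version: string;
  createdAt: Date; updatedAt: Date; encrypted: boolean;
}

export function toServiceDataset(info: SdkDatasetInfo): ServiceDataset {
  return {
    id: info.id,
    name: info.name,
    description: info.description || "",
    // Size is unknown at this layer; 0 is a documented placeholder.
    files: info.files.map((cid) => ({ cid, size: 0, encrypted: info.encrypted })),
    metadata: info.metadata,
    version: info.version,
    createdAt: info.createdAt,
    updatedAt: info.updatedAt,
    encrypted: info.encrypted,
  };
}
```

With this in place, create, update, and get can each call `toServiceDataset` instead of repeating the mapping inline.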
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- There’s a lot of duplicated logic mapping SDK’s DatasetInfo to your internal Dataset type across the service layer—consider extracting a shared converter to reduce repetition and prevent drift.
- I noticed you registered create, list, get, and update dataset tools but did not add a deleteDataset tool to the registry—adding that would round out the full CRUD support.
- You’re generating dataset IDs and version strings inline in multiple methods; pulling that into a shared utility or factory would keep things consistent and make future changes easier.
## Individual Comments
### Comment 1
<location> `apps/mcp-server/src/services/LighthouseService.ts:399-401` </location>
<code_context>
+ id: datasetInfo.id,
+ name: datasetInfo.name,
+ description: datasetInfo.description || "",
+ files: datasetInfo.files.map((hash) => ({
+ cid: hash,
+ size: 0, // Would need to fetch individual file info
+ encrypted: datasetInfo.encrypted,
+ accessConditions: params.accessConditions,
</code_context>
<issue_to_address>
**suggestion:** File size is set to 0 for all dataset files, which may mislead consumers.
If file size is needed, fetch or calculate it during dataset creation. Otherwise, indicate in documentation or types that size is unavailable.
Suggested implementation:
```typescript
files: datasetInfo.files.map((hash) => ({
cid: hash,
// File size is currently unavailable; set to 0 as a placeholder.
// Update this logic to fetch actual file size if/when available.
size: 0,
encrypted: datasetInfo.encrypted,
accessConditions: params.accessConditions,
tags: params.tags,
uploadedAt: datasetInfo.createdAt,
originalPath: "",
hash: hash,
})),
```
If you have a type definition for the file object (e.g., `DatasetFile`), update its documentation to clarify that `size` may be unavailable and is set to 0 as a placeholder. For example:
```typescript
// In types/DatasetFile.ts or similar
/**
* File size in bytes. May be 0 if unavailable.
*/
size: number;
```
If you later implement logic to fetch the actual file size, replace the placeholder with the real value.
</issue_to_address>
### Comment 2
<location> `apps/mcp-server/src/services/LighthouseService.ts:403` </location>
<code_context>
+ cid: hash,
+ size: 0, // Would need to fetch individual file info
+ encrypted: datasetInfo.encrypted,
+ accessConditions: params.accessConditions,
+ tags: params.tags,
+ uploadedAt: datasetInfo.createdAt,
</code_context>
<issue_to_address>
**suggestion:** Access conditions are passed to the Dataset object but not handled in updateDataset.
If updates to access conditions are needed, add support in updateDataset; otherwise, document that access conditions cannot be changed after creation.
Suggested implementation:
```typescript
/**
* Updates a dataset.
* Note: accessConditions can be updated via this function.
*/
async function updateDataset(datasetId: string, updateParams: Partial<Dataset>) {
// Fetch the existing dataset
const dataset = await getDatasetById(datasetId);
// Update allowed fields
if (updateParams.name !== undefined) dataset.name = updateParams.name;
if (updateParams.description !== undefined) dataset.description = updateParams.description;
if (updateParams.tags !== undefined) dataset.tags = updateParams.tags;
if (updateParams.metadata !== undefined) dataset.metadata = updateParams.metadata;
// Support updating accessConditions
if (updateParams.accessConditions !== undefined) {
dataset.accessConditions = updateParams.accessConditions;
// If files also need accessConditions updated, update them as well
if (dataset.files) {
dataset.files = dataset.files.map(file => ({
...file,
accessConditions: updateParams.accessConditions
}));
}
}
// Save the updated dataset
await saveDataset(dataset);
return dataset;
}
```
```typescript
/**
* Updates a dataset.
* Note: accessConditions cannot be changed after creation.
*/
async function updateDataset(datasetId: string, updateParams: Partial<Dataset>) {
// Fetch the existing dataset
const dataset = await getDatasetById(datasetId);
// Update allowed fields
if (updateParams.name !== undefined) dataset.name = updateParams.name;
if (updateParams.description !== undefined) dataset.description = updateParams.description;
if (updateParams.tags !== undefined) dataset.tags = updateParams.tags;
if (updateParams.metadata !== undefined) dataset.metadata = updateParams.metadata;
// Do NOT allow updating accessConditions after creation
// If updateParams.accessConditions is present, ignore it and document this behavior
// Save the updated dataset
await saveDataset(dataset);
return dataset;
}
```
If you choose Option 1, ensure that the `Dataset` type/interface includes `accessConditions` as an updatable field.
If you choose Option 2, you may want to throw an error or log a warning if `updateParams.accessConditions` is provided, to make the restriction explicit.
</issue_to_address>
### Comment 3
<location> `apps/mcp-server/src/services/MockLighthouseService.ts:479-481` </location>
<code_context>
+
+ // Optionally delete associated files
+ if (deleteFiles) {
+ for (const file of dataset.files) {
+ this.fileStore.delete(file.cid);
+ this.currentStorageSize -= file.size;
+ }
+ }
</code_context>
<issue_to_address>
**issue (bug_risk):** Deleting files from fileStore may result in negative currentStorageSize.
Add a check to ensure currentStorageSize does not drop below zero when deleting files, accounting for possible inaccuracies in file sizes or repeated deletions.
</issue_to_address>
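A minimal guard along these lines could address the issue; `fileStore` and `currentStorageSize` mirror the fields in the excerpt above, and the class wrapper is only scaffolding for the sketch.

```typescript
// Sketch: delete dataset files while keeping currentStorageSize non-negative.
class MockStorage {
  fileStore = new Map<string, { size: number }>();
  currentStorageSize = 0;

  deleteFiles(files: { cid: string; size: number }[]): void {
    for (const file of files) {
      // Map.delete returns false if the entry was already gone, so a
      // repeated deletion never adjusts the counter; the clamp also
      // absorbs any inaccuracies in recorded file sizes.
      if (this.fileStore.delete(file.cid)) {
        this.currentStorageSize = Math.max(0, this.currentStorageSize - file.size);
      }
    }
  }
}
```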
### Comment 4
<location> `apps/mcp-server/src/tools/LighthouseCreateDatasetTool.ts:128` </location>
<code_context>
+ return "filePaths is required and must be a non-empty array";
+ }
+
+ if (params.filePaths.length > 50) {
+ return "Maximum 50 files allowed per dataset";
+ }
</code_context>
<issue_to_address>
**suggestion:** Maximum file count per dataset is hardcoded to 50.
Consider moving this limit to a constant or configuration file if it may change in the future.
Suggested implementation:
```typescript
const MAX_FILES_PER_DATASET = 50;
if (params.filePaths.length > MAX_FILES_PER_DATASET) {
return `Maximum ${MAX_FILES_PER_DATASET} files allowed per dataset`;
}
```
If you have a central configuration file for such constants, consider moving `MAX_FILES_PER_DATASET` there and importing it instead of defining it locally.
</issue_to_address>
### Comment 5
<location> `apps/mcp-server/src/tools/LighthouseListDatasetsTool.ts:136-152` </location>
<code_context>
+ description: dataset.description,
+ version: dataset.version,
+ fileCount: dataset.files.length,
+ totalSize: dataset.files.reduce((sum, file) => sum + file.size, 0),
+ encrypted: dataset.encrypted,
+ tags: dataset.metadata?.keywords || [],
</code_context>
<issue_to_address>
**suggestion:** Total size calculation may be inaccurate if file sizes are not set.
If file sizes are missing or zero, either exclude totalSize from the response or clearly document its limitations.
```suggestion
datasets: response.datasets.map((dataset) => {
// Check if all files have a valid size
const allFilesHaveValidSize = dataset.files.every(
(file) => typeof file.size === 'number' && file.size > 0
);
const datasetResponse: any = {
id: dataset.id,
name: dataset.name,
description: dataset.description,
version: dataset.version,
fileCount: dataset.files.length,
encrypted: dataset.encrypted,
tags: dataset.metadata?.keywords || [],
createdAt: dataset.createdAt.toISOString(),
updatedAt: dataset.updatedAt.toISOString(),
metadata: {
author: dataset.metadata?.author,
license: dataset.metadata?.license,
category: dataset.metadata?.category,
},
};
// Only include totalSize if all files have a valid size
if (allFilesHaveValidSize) {
// Note: totalSize is only accurate if all file sizes are set and non-zero
datasetResponse.totalSize = dataset.files.reduce((sum, file) => sum + file.size, 0);
}
return datasetResponse;
}),
```
</issue_to_address>
### Comment 6
<location> `apps/mcp-server/src/__tests__/dataset-tools.test.ts:4` </location>
<code_context>
+/**
+ * Dataset Tools Tests
+ */
+
+import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
+import { Logger } from "@lighthouse-tooling/shared";
</code_context>
<issue_to_address>
**issue (testing):** No test cases found for dataset management tools.
Please add thorough unit and integration tests for all new dataset management tools and service methods, covering core functionality, edge cases, and error handling, especially for file operations, metadata, and access control.
</issue_to_address>
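The requested coverage might start from the validation rules quoted elsewhere in this review (non-empty `filePaths`, at most 50 files). `validateCreateArgs` below is a hypothetical stand-in for the tool's real validator, shown with plain assertions so the sketch stays self-contained; in the repo these would become vitest `describe`/`it` cases.

```typescript
// Hypothetical validator mirroring the create-dataset rules in the review.
const MAX_FILES_PER_DATASET = 50;

function validateCreateArgs(params: { filePaths?: string[] }): string | null {
  if (!params.filePaths || params.filePaths.length === 0) {
    return "filePaths is required and must be a non-empty array";
  }
  if (params.filePaths.length > MAX_FILES_PER_DATASET) {
    return `Maximum ${MAX_FILES_PER_DATASET} files allowed per dataset`;
  }
  return null; // valid input
}

// Behaviors a real test suite should pin down:
console.assert(validateCreateArgs({}) !== null, "missing filePaths rejected");
console.assert(validateCreateArgs({ filePaths: [] }) !== null, "empty array rejected");
console.assert(validateCreateArgs({ filePaths: ["a.csv"] }) === null, "valid input accepted");
console.assert(validateCreateArgs({ filePaths: Array(51).fill("f") }) !== null, "over-limit rejected");
```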
Description
This PR implements comprehensive dataset management functionality for the Lighthouse AI integration system. The feature enables AI agents to create, manage, and organize collections of files with metadata, versioning, and access control through the MCP protocol.
Key Features
SDK Wrapper Enhancements
New MCP Tools
Service Layer Improvements
The implementation maintains backward compatibility while adding powerful new dataset management capabilities for AI agents.
Summary by Sourcery
Implement comprehensive dataset management support across the SDK, service layer, and MCP tools, enabling AI agents to manage collections of files with rich metadata, pagination, versioning, and access control.