feat(webapi): batch task submission via dedicated endpoint #2694

Draft · wants to merge 1 commit into base: develop

Conversation

@yaugenst-flex (Collaborator) commented Jul 24, 2025

from tidy3d.web import Batch

# `sims` maps task names to td.Simulation objects
batch = Batch(simulations=sims, use_batch_endpoint=True)
batch_data = batch.run()

Greptile Summary

This PR introduces a new batch task submission feature that allows submitting multiple simulations via a dedicated server endpoint instead of individual API calls. The implementation adds a new BatchTask class and refactors the existing task management system into a cleaner class hierarchy.

Key Changes:

  • New Batch Endpoint: Introduces upload_batch() function and BatchTask class that can create multiple simulation tasks in a single API call via the new /batch-tasks endpoint
  • Class Hierarchy Refactor: Creates BaseTask abstract base class and renames SimulationTask to Task (with backward compatibility alias), establishing a cleaner inheritance structure
  • Batch Types: Adds BatchType enum with different batch processing types (BATCH, INVDES, PERMUTATION, PARALLEL, MONTE_CARLO, RF_SWEEP) for server-side optimization
  • Container Integration: Updates Batch class with use_batch_endpoint flag and batch_type field, allowing users to opt into the new batch submission method
  • Performance Optimization: Reduces HTTP overhead by creating all tasks in a single batch request, then uploading simulation files individually

The feature maintains full backward compatibility by defaulting use_batch_endpoint=False, ensuring existing code continues to work unchanged. The implementation follows established patterns in the codebase with proper validation, error handling, and HTTP communication through existing utilities.
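
For illustration, a minimal sketch of how the lower-level batch path might be exercised directly, assuming the upload_batch() keyword names and the (batch_id, task_ids) return value implied by the sequence diagram below; the Batch(use_batch_endpoint=True) flow shown above remains the intended entry point:

from tidy3d.web.api.webapi import upload_batch  # new function in this PR

# `sims` is assumed to be a dict mapping task names to td.Simulation objects,
# as in the snippet above. A single POST hits the /batch-tasks endpoint and the
# server pre-assigns one task id per simulation, avoiding per-task create calls.
batch_id, task_ids = upload_batch(simulations=sims, folder_name="default")

# `task_ids` maps task names to server-assigned ids; simulation files are then
# uploaded individually against those ids.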

Important Files Changed

Filename (score): overview
tidy3d/web/core/task_core.py (4/5): Adds BatchTask class and refactors SimulationTask into a BaseTask hierarchy with batch creation methods
tidy3d/web/core/types.py (5/5): Adds BatchType enum for server-side batch processing optimization with minimal, focused changes
tidy3d/web/api/container.py (4/5): Introduces batch endpoint configuration and splits job creation logic between individual and batch approaches
tidy3d/web/api/webapi.py (4/5): Implements the new upload_batch() function and refactors from SimulationTask to Task throughout the module
tests/test_web/test_webapi.py (5/5): Adds comprehensive test coverage for batch endpoint functionality and backward compatibility

Confidence score: 4/5

• This PR is generally safe to merge with robust backward compatibility and comprehensive test coverage
• Score reflects some unimplemented server-side functionality and extensive TODO comments indicating ongoing development
• Files need attention: tidy3d/web/core/task_core.py for placeholder methods and tidy3d/web/api/container.py for mixed batch/individual handling patterns

Sequence Diagram

sequenceDiagram
    participant User
    participant Batch
    participant BatchTask
    participant webapi
    participant Server
    participant Task

    User->>Batch: create Batch(simulations, use_batch_endpoint=True)
    User->>Batch: access .jobs property
    
    Batch->>Batch: _create_jobs_from_batch()
    Batch->>webapi: upload_batch(simulations, folder_name, ...)
    
    webapi->>webapi: validate simulations
    webapi->>BatchTask: create(simulations, folder_name, ...)
    
    BatchTask->>Server: POST /projects/{folder_id}/batch-tasks
    Server-->>BatchTask: {batchId, tasks: [{taskId, taskName}]}
    
    BatchTask-->>webapi: (batch_id, task_ids)
    webapi-->>Batch: (batch_id, task_ids)
    
    loop for each simulation
        Batch->>Task: get(task_id)
        Task-->>Batch: task instance
        Batch->>Task: upload_simulation(stub, ...)
        Task->>Server: upload simulation file
    end
    
    Batch-->>User: jobs dict with pre-assigned task_ids
    
    User->>Batch: start()
    loop for each job
        Batch->>Task: start()
        Task->>Server: POST /tasks/{task_id}/submit
    end
    
    User->>Batch: monitor()
    loop monitoring
        Batch->>Task: get status
        Task->>Server: GET /tasks/{task_id}/detail
        Server-->>Task: task status
        Task-->>Batch: status
    end
    
    User->>Batch: load(path_dir)
    loop for each job
        Batch->>Task: download(path)
        Task->>Server: download simulation data
        Server-->>Task: simulation data
    end
    Batch-->>User: BatchData

@yaugenst-flex self-assigned this Jul 24, 2025
@yaugenst-flex mentioned this pull request Jul 28, 2025

@greptile-apps (bot) left a comment

5 files reviewed, 7 comments

Comment on lines +300 to +303
# TODO: Replace with batch delete endpoint when available
# DELETE /tidy3d/projects/{folder_id}/batch/{batch_id}
for task_id in self.task_ids.values():
    http.delete(f"tidy3d/tasks/{task_id}")

logic: Deleting tasks individually in a loop could leave the batch in an inconsistent state if some deletions fail; consider error handling and a rollback strategy.
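
One possible shape for that error handling, assuming the same task_ids mapping and http client as the snippet above (a sketch, not the PR's implementation):

def delete(self) -> None:
    """Delete all tasks in the batch, collecting per-task failures instead of aborting mid-loop."""
    # TODO: Replace with batch delete endpoint when available
    # DELETE /tidy3d/projects/{folder_id}/batch/{batch_id}
    failures: dict[str, Exception] = {}
    for task_name, task_id in self.task_ids.items():
        try:
            http.delete(f"tidy3d/tasks/{task_id}")
        except Exception as exc:  # record and continue so every deletion is attempted
            failures[task_name] = exc
    if failures:
        raise RuntimeError(
            f"Failed to delete {len(failures)} of {len(self.task_ids)} batch tasks: {sorted(failures)}"
        )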

Comment on lines +318 to +321
# TODO: Replace with batch submit endpoint when available
# POST /tidy3d/projects/{folder_id}/batch/{batch_id}/submit
for task_id in self.task_ids.values():
    http.post(f"tidy3d/tasks/{task_id}/submit")

logic: Submitting tasks individually in a loop could leave the batch partially submitted if some calls fail; consider an atomic batch submit endpoint or explicit error handling.
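
A sketch of a fail-fast variant that at least reports which tasks were already submitted before the failure (names as in the snippet above; not the PR's implementation):

def submit(self) -> None:
    """Submit all tasks in the batch, reporting partial submission on failure."""
    # TODO: Replace with batch submit endpoint when available
    # POST /tidy3d/projects/{folder_id}/batch/{batch_id}/submit
    submitted: list[str] = []
    for task_name, task_id in self.task_ids.items():
        try:
            http.post(f"tidy3d/tasks/{task_id}/submit")
        except Exception as exc:
            raise RuntimeError(
                f"Batch submit failed at task '{task_name}'; "
                f"{len(submitted)} task(s) were already submitted: {submitted}"
            ) from exc
        submitted.append(task_name)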

Comment on lines +78 to +79
PERMUTATION = "Permutation"
PARALLEL = "Parallel"

style: Inconsistent naming convention: 'Permutation' and 'Parallel' use title case while others use uppercase. Consider using 'PERMUTATION' and 'PARALLEL' for consistency.

MONTE_CARLO = "MONTE_CARLO"
RF_SWEEP = "RF_SWEEP"

DEFAULT = "RF_SWEEP" # noqa: PIE796

style: The PIE796 noqa comment suppresses a 'prefer using a unique enum value' warning, but having DEFAULT = RF_SWEEP creates an alias that could cause confusion. Consider using a separate method like get_default() instead.
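
A sketch combining both style suggestions, i.e. uppercase member values throughout and a get_default() classmethod in place of the DEFAULT alias. The string values are part of the server contract, so whether they can actually be renamed is an assumption here:

from enum import Enum


class BatchType(str, Enum):
    """Server-side batch processing types (values illustrative only)."""

    BATCH = "BATCH"
    INVDES = "INVDES"
    PERMUTATION = "PERMUTATION"
    PARALLEL = "PARALLEL"
    MONTE_CARLO = "MONTE_CARLO"
    RF_SWEEP = "RF_SWEEP"

    @classmethod
    def get_default(cls) -> "BatchType":
        """Default batch type, without aliasing an existing member."""
        return cls.RF_SWEEP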

)

def batch_request_matcher(request):
    import json

style: Redundant import of json module - already imported at line 4

Suggested change (drop the redundant local import of json):

def batch_request_matcher(request):
    json_data = json.loads(request.body)
    assert "tasks" in json_data
    assert "batchType" in json_data
    assert "groupName" in json_data
    for task in json_data["tasks"]:
        assert "groupName" in task
        assert task["groupName"] == json_data["groupName"]
    return True, None

Comment on lines +809 to +810
assert job.task_id == f"task_id_{i}"
assert job._cached_properties.get("task_id") == f"task_id_{i}"

style: Accessing _cached_properties directly may be fragile - consider using a public method if this internal access pattern is common
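
A hypothetical public accessor that would keep the cache lookup inside the class so the test can assert against a stable API (name and placement are illustrative, not part of the PR):

# Hypothetical addition to the Job container:
@property
def has_assigned_task_id(self) -> bool:
    """Whether a task_id has already been assigned (e.g. pre-assigned by the batch endpoint)."""
    return "task_id" in self._cached_properties

The test could then assert job.has_assigned_task_id instead of reaching into _cached_properties directly.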

Comment on lines +354 to +357
for task_name, simulation in simulations.items():
    if isinstance(simulation, (ModeSolver, ModeSimulation)):
        simulation = get_reduced_simulation(simulation, reduce_simulation)
        simulations[task_name] = simulation

logic: Modifying simulations dict during iteration could cause unexpected behavior. Consider creating a new dict or processing modifications separately.
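
One way to address this, using the same names as the snippet above: build the reduced mapping with a comprehension and rebind it rather than writing back into the dict being iterated. (Reassigning existing keys does not resize a dict in CPython, but the comprehension makes the intent explicit.)

# Sketch: produce a new mapping instead of mutating `simulations` during iteration.
reduced_simulations = {
    task_name: (
        get_reduced_simulation(simulation, reduce_simulation)
        if isinstance(simulation, (ModeSolver, ModeSimulation))
        else simulation
    )
    for task_name, simulation in simulations.items()
}
simulations = reduced_simulations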

@greptile-apps (bot) left a comment

5 files reviewed, 1 comment


def _create_jobs_from_batch(self) -> dict[TaskName, Job]:
    """Create jobs from batch submission response."""
    batch_id, task_ids = self._submit_batch()

logic: The batch_id returned from _submit_batch is not stored or used anywhere, potentially missing functionality for batch tracking
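
A sketch of how the id could be retained for later batch-level calls; on a pydantic-based container this would likely require a declared private attribute (e.g. PrivateAttr), which is an assumption here:

def _create_jobs_from_batch(self) -> dict[TaskName, Job]:
    """Create jobs from batch submission response."""
    batch_id, task_ids = self._submit_batch()
    # Keep the batch id so future batch-level endpoints (delete, submit, status)
    # can target /projects/{folder_id}/batch/{batch_id} directly.
    self._batch_id = batch_id
    ...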
