Add automatic PR batching for coordinate edit submissions #2651

Merged
CommanderStorm merged 31 commits into main from copilot/add-batching-system-for-edits
Mar 16, 2026
Conversation

Contributor

Copilot AI commented Feb 3, 2026

Proposed Change(s)

Coordinate edit submissions now batch automatically into a single open PR instead of creating individual PRs per edit. The first edit creates a PR with the "batch-in-progress" label, and subsequent edits are added to that PR until it's merged or closed.

Implementation:

  • batch_processor module (~120 lines with tests): Located at server/src/routes/feedback/batch_processor/mod.rs
    • find_open_batch_pr(): Searches for open PR with batch-in-progress label using GitHub::find_pr_with_label() with full pagination support
    • update_batch_pr_metadata(): Updates PR title, labels, and description based on edit types using GitHub struct methods
    • Unit tests for label deduplication and description formatting logic
    • BATCH_LABEL constant is public for use in other modules
  • propose_edits endpoint: Modified to create batch PRs with proper labeling or add to existing batch PR
    • First edit creates a new PR with "batch-in-progress" label and a descriptive title using extract_subject() (e.g., "chore(data): coordinate edit for mi.03.11.170")
    • Subsequent edits find and commit to the existing batch PR, updating the title to "chore(data): batch coordinate edits (N edits)"
    • Passes edit description to batch processor for appending
    • Returns PR URL (batch or individual)
  • GitHub struct extensions (server/src/external/github.rs): Added 6 new methods with pagination support for clean architecture
    • find_pr_with_label(): Find open PR with specific label (iterates through all pages)
    • update_pr_labels(): Update PR labels
    • update_pr_title(): Update PR title (takes &str parameter)
    • get_pr_commit_count(): Get number of commits in PR (counts across all pages)
    • get_pr_description(): Get current PR description/body
    • update_pr_description(): Update PR description/body (takes &str parameter)

Behavior:

// Before: Each edit → new individual PR
POST /api/feedback/propose_edits → PR #1234
POST /api/feedback/propose_edits → PR #1235

// After: Edits automatically batch into a single PR
POST /api/feedback/propose_edits → PR #5000 (creates batch PR with "batch-in-progress" label, helpful title: "chore(data): coordinate edit for mi.03.11.170")
POST /api/feedback/propose_edits → PR #5000 (finds and commits to batch PR, title updates to "chore(data): batch coordinate edits (2 edits)")
POST /api/feedback/propose_edits → PR #5000 (continues batching, title: "chore(data): batch coordinate edits (3 edits)")
// After PR #5000 is merged/closed:
POST /api/feedback/propose_edits → PR #5001 (new batch PR created with helpful context)
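
The find-or-create flow in this walk-through could be sketched roughly as below. This is a simplified in-memory model, not the merged code: `Pr`, `InMemoryStore`, and `propose_edit` are hypothetical stand-ins for the `GitHub` struct and the endpoint, and the sketch assumes one commit per edit so that the commit count can be used as the edit count in the title.

```rust
/// Label marking the open batch PR (value taken from the PR description).
pub const BATCH_LABEL: &str = "batch-in-progress";

/// Hypothetical, simplified view of a pull request.
#[derive(Debug, Clone, PartialEq)]
pub struct Pr {
    pub number: u64,
    pub title: String,
    pub labels: Vec<String>,
    pub commits: usize,
    pub open: bool,
}

/// Hypothetical in-memory stand-in for the `GitHub` struct.
#[derive(Default)]
pub struct InMemoryStore {
    pub prs: Vec<Pr>,
}

impl InMemoryStore {
    /// Returns the index of the first open PR carrying `label`, if any.
    fn find_pr_with_label(&self, label: &str) -> Option<usize> {
        self.prs
            .iter()
            .position(|pr| pr.open && pr.labels.iter().any(|l| l.as_str() == label))
    }

    /// Adds an edit: commits to the open batch PR, or opens a new one.
    /// Returns the number of the PR the edit ended up in.
    pub fn propose_edit(&mut self, subject: &str) -> u64 {
        if let Some(idx) = self.find_pr_with_label(BATCH_LABEL) {
            // Existing batch: add a commit and switch to the generic batch title.
            let pr = &mut self.prs[idx];
            pr.commits += 1;
            pr.title = format!("chore(data): batch coordinate edits ({} edits)", pr.commits);
            pr.number
        } else {
            // No open batch: the first edit keeps its descriptive title.
            let number = self.prs.len() as u64 + 1;
            self.prs.push(Pr {
                number,
                title: format!("chore(data): {subject}"),
                labels: vec![BATCH_LABEL.to_string()],
                commits: 1,
                open: true,
            });
            number
        }
    }
}
```

Once the batch PR is marked as no longer open (merged or closed), the next call to `propose_edit` falls into the `else` branch and starts a fresh batch.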

PR Description Format:
Each edit's description is appended to the batch PR with separators:

## Batched Coordinate Edits

### Edit #1
[Edit 1 description here]

---

### Edit #2
[Edit 2 description here]
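
A minimal sketch of how such a description might be assembled is shown below. The function name `append_edit_description` and its signature are assumptions made for illustration; only the heading, the `### Edit #N` entries, and the `---` separator come from the format above.

```rust
/// Heading placed at the top of a batch PR description (from the format above).
const BATCH_HEADING: &str = "## Batched Coordinate Edits";

/// Appends `edit` as entry number `edit_number` (1-based) to `current`.
/// Hypothetical helper: the name and signature are not taken from the PR.
pub fn append_edit_description(current: Option<&str>, edit_number: usize, edit: &str) -> String {
    let entry = format!("### Edit #{edit_number}\n{}", edit.trim());
    match current.map(str::trim).filter(|body| !body.is_empty()) {
        // Existing body: separate entries with a horizontal rule.
        Some(body) => format!("{body}\n\n---\n\n{entry}"),
        // First edit: start with the batch heading.
        None => format!("{BATCH_HEADING}\n\n{entry}"),
    }
}
```

Treating an empty or whitespace-only body the same as a missing one keeps the first entry from starting with a stray separator.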

Pagination Support:

  • find_pr_with_label() iterates through all pages of open PRs (no longer limited to first 100)
  • get_pr_commit_count() counts commits across all pages (supports batches with >100 edits)
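
The page-walking idea can be illustrated without any GitHub client. In the sketch below, `PrSummary` and the `fetch_page` closure are hypothetical stand-ins for the real API call, and the loop assumes that an empty page marks the end of the results.

```rust
/// Hypothetical, simplified pull request as returned by one page of results.
#[derive(Debug, Clone, PartialEq)]
pub struct PrSummary {
    pub number: u64,
    pub labels: Vec<String>,
}

/// Walks pages (1-based) until a PR with `label` is found or a page comes back empty.
/// `fetch_page` stands in for the real API call and is an assumption of this sketch.
pub fn find_pr_with_label<F>(label: &str, mut fetch_page: F) -> Option<PrSummary>
where
    F: FnMut(u32) -> Vec<PrSummary>,
{
    let mut page = 1;
    loop {
        let prs = fetch_page(page);
        if prs.is_empty() {
            return None; // ran out of pages without a match
        }
        let hit = prs
            .into_iter()
            .find(|pr| pr.labels.iter().any(|l| l.as_str() == label));
        if hit.is_some() {
            return hit;
        }
        page += 1;
    }
}
```

Returning as soon as a match is found avoids fetching the remaining pages. A commit counter would use the same loop but sum the page lengths instead of stopping early.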

Clean architecture approach:

  • All GitHub API operations go through GitHub struct methods
  • Batch processor focuses on business logic without direct octocrab usage
  • No library restructuring (lib.rs/app/mod.rs)
  • Minimal visibility changes to existing code
  • Made BATCH_LABEL constant public for cross-module access
  • Consistent structured logging throughout
  • Reuses existing extract_subject() method for first PR title (fixes clippy dead_code lint)

Testing:

  • Added unit tests for label deduplication logic
  • Added unit tests for description formatting (first and subsequent edits)
  • Added test for batch label constant validation
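
The label deduplication covered by these tests might look something like the following. This is a sketch under assumptions: `merge_labels` is a hypothetical name, and the rule that the batch label is always kept is inferred from the description rather than confirmed by the diff.

```rust
/// Label marking the open batch PR (value taken from the PR description).
pub const BATCH_LABEL: &str = "batch-in-progress";

/// Merges `new` into `existing`, keeping first-seen order and dropping duplicates.
/// The batch label is appended if neither input already contains it.
pub fn merge_labels(existing: &[String], new: &[&str]) -> Vec<String> {
    let mut merged: Vec<String> = Vec::new();
    let candidates = existing
        .iter()
        .map(String::as_str)
        .chain(new.iter().copied())
        .chain(std::iter::once(BATCH_LABEL));
    for label in candidates {
        // A linear scan is fine here: PRs carry only a handful of labels.
        if !merged.iter().any(|seen| seen.as_str() == label) {
            merged.push(label.to_string());
        }
    }
    merged
}
```

Keeping insertion order (rather than collecting into a set) makes the result deterministic, which is convenient when asserting on it in unit tests.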

Bug Fixes:

  • Fixed compilation error: made BATCH_LABEL constant public
  • Fixed incorrect log variable reference in success path
  • Fixed clippy dead_code lint by using extract_subject() for first PR title

No configuration required. No CI changes.

Checklist

  • Documentation
    • I have updated the documentation
    • No need to update the documentation
Original prompt

Add Batching System for Coordinate Edit Submissions

Problem

Currently, each coordinate edit submission creates one pull request immediately. Users submit edits one at a time, creating a flood of individual PRs that overwhelm our CI and review processes.

Current behavior (line 180 in server/src/routes/feedback/proposed_edits/mod.rs):

let branch_name = format!("usergenerated/request-{}", rand::random::<u16>());

Each request immediately creates a PR.

Proposed Solution

Implement a batching system that queues edit requests and periodically creates PRs with multiple edits combined.

Requirements

1. Database Schema

Create a new table to queue pending edits:

CREATE TABLE pending_edit_batches (
    id SERIAL PRIMARY KEY,
    edit_data JSONB NOT NULL,
    token_id VARCHAR(255) NOT NULL,
    submitted_at TIMESTAMP NOT NULL DEFAULT NOW(),
    processed_at TIMESTAMP,
    batch_pr_url TEXT,
    status VARCHAR(50) DEFAULT 'pending' CHECK (status IN ('pending', 'processing', 'completed', 'failed'))
);
CREATE INDEX idx_pending_status ON pending_edit_batches(status, submitted_at);
CREATE INDEX idx_submitted_at ON pending_edit_batches(submitted_at) WHERE status = 'pending';

2. Modify propose_edits Endpoint

File: server/src/routes/feedback/proposed_edits/mod.rs

Changes needed:

  • Add database insertion logic to queue edits instead of immediately processing
  • Return a tracking ID to the user
  • Add environment variable BATCH_ENABLED (default: true) to enable/disable batching
  • When batching is disabled, fall back to current behavior

New response:

{
  "tracking_id": 12345,
  "status": "queued",
  "message": "Your edit has been queued and will be processed in the next batch"
}

3. Create Batch Processor

New file: server/src/batch_processor/mod.rs

Implement:

  • Function to fetch pending edits from database
  • Batch grouping logic (group by time window)
  • Aggregate multiple EditRequest objects into one
  • Apply all changes in a single commit
  • Create one PR with descriptive title listing all changes
  • Update database records with PR URL and status

Batching logic:

  • Group edits submitted within a time window (env var: BATCH_WINDOW_HOURS, default: 6)
  • Maximum edits per batch (env var: BATCH_MAX_EDITS, default: 50)
  • If max reached before time window, create batch early
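
The grouping rule requested in this original prompt (which the merged PR replaced with PR-based batching) could be sketched as follows. `PendingEdit` and `group_into_batches` are hypothetical names, timestamps are plain seconds rather than database values, and the input is assumed to be sorted by submission time.

```rust
use std::time::Duration;

/// Hypothetical queued edit; `submitted_at` is seconds since the Unix epoch.
#[derive(Debug, Clone, PartialEq)]
pub struct PendingEdit {
    pub id: u32,
    pub submitted_at: u64,
}

/// Splits edits (sorted by `submitted_at`) into batches. A batch is closed when
/// the next edit falls outside `window` of the batch's first edit, or when the
/// batch already holds `max_edits` entries.
pub fn group_into_batches(
    edits: &[PendingEdit],
    window: Duration,
    max_edits: usize,
) -> Vec<Vec<PendingEdit>> {
    let mut batches: Vec<Vec<PendingEdit>> = Vec::new();
    for edit in edits {
        let fits = batches.last().is_some_and(|batch| {
            batch.len() < max_edits
                && edit.submitted_at.saturating_sub(batch[0].submitted_at) < window.as_secs()
        });
        if fits {
            batches.last_mut().expect("checked above").push(edit.clone());
        } else {
            // Either the first edit overall, or the previous batch is full or too old.
            batches.push(vec![edit.clone()]);
        }
    }
    batches
}
```

With the defaults from the prompt, the call would pass a six-hour window and a maximum of 50 edits per batch.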

Branch naming:

let branch_name = format!("usergenerated/batch-{}", chrono::Utc::now().format("%Y%m%d-%H%M%S"));

PR title format:

chore(data): batch coordinate edits ({count} edits)

PR description format:

## Batched Edit Submission

This PR contains {count} coordinate edits submitted between {start_time} and {end_time}.

### Edits included:
- Edit #{id}: coordinate edit for `{room_id}` ({timestamp})
- Edit #{id}: add image for `{room_id}` ({timestamp})
...

### Additional context:
{aggregated additional_context from all edits}

4. Add GitHub Actions Workflow

New file: .github/workflows/process-edit-batches.yml

name: Process Edit Batches

on:
  schedule:
    # Run every 6 hours
    - cron: "0 */6 * * *"
  workflow_dispatch:

permissions:
  contents: write
  pull-requests: write

jobs:
  process-batches:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Rust
        uses: actions-rust-lang/setup-rust-toolchain@v1
      - name: Process batches
        env:
          POSTGRES_USER: ${{ secrets.POSTGRES_USER }}
          POSTGRES_PASSWORD: ${{ secrets.POSTGRES_PASSWORD }}
          POSTGRES_URL: ${{ secrets.POSTGRES_URL }}
          POSTGRES_DB: ${{ secrets.POSTGRES_DB }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          cd server
          cargo run --bin batch_processor

5. Create Batch Processor Binary

New file: server/src/bin/batch_processor.rs

Command-line tool that:

  • Connects to database
  • Calls batch processor logic
  • Can be run manually or via GitHub Actions

6. Update API Documentation

File: server/src/routes/feedback/proposed_edits/mod.rs

Update the #[utoipa::path] documentation to reflect:

  • New response format with tracking ID
  • Explanation that edits are batched
  • Link to track PR status

7. Migration File

New file: data/migrations/YYYYMMDD_add_edit_batching.sql

Include the table creation SQL from requirement #1.

8. Configuration

Add to server/README.md environment variables section:

| variable | module | usage/description |
| --- | --- | --- |
| BATCH_ENABLED | batch_processor | Enable/disable batching (default: true) |
| BATCH_WINDOW_HOURS | batch_processor | Hours to wait before creating batch (default: 6) |
| BATCH_MAX_EDITS | batch_processor | Max edits per batch (default: 50) |

Testing Requirements

  1. Unit tests for batch grouping logic
  2. Integration test simulating multiple edit submissions and batch processing...

This pull request was created from Copilot chat.


@CommanderStorm
Member

@copilot instead of batching in the DB, we need to update PRs for better persistence.

Copilot AI and others added 6 commits February 3, 2026 10:06
…d workflow

Co-authored-by: CommanderStorm <26258709+CommanderStorm@users.noreply.github.com>
…ence

Co-authored-by: CommanderStorm <26258709+CommanderStorm@users.noreply.github.com>
Co-authored-by: CommanderStorm <26258709+CommanderStorm@users.noreply.github.com>
… CI workflow

Co-authored-by: CommanderStorm <26258709+CommanderStorm@users.noreply.github.com>
Co-authored-by: CommanderStorm <26258709+CommanderStorm@users.noreply.github.com>
…r logging

Co-authored-by: CommanderStorm <26258709+CommanderStorm@users.noreply.github.com>
Copilot AI changed the title [WIP] Add batching system for coordinate edit submissions Add automatic PR batching for coordinate edit submissions Feb 3, 2026
Copilot AI requested a review from CommanderStorm February 3, 2026 10:31
Contributor Author

Copilot AI commented Feb 3, 2026

@copilot instead of batching in the DB, we need to update PRs for better persistence.

Already implemented in commit 75b2584. The system now:

  • Finds or creates an open PR with "batch-in-progress" label
  • Adds each edit as a commit to that PR
  • Updates PR title and labels with each new edit
  • No database storage - uses GitHub PRs as persistent storage

Each call to add_edit_to_batch_pr() commits directly to the batch PR branch and updates the PR metadata.

@CommanderStorm
Member

@copilot there are tons of hard-to-review, unrelated changes. Remove all unnecessary changes.

…ementation

Co-authored-by: CommanderStorm <26258709+CommanderStorm@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 3, 2026

@copilot there are tons of hard-to-review, unrelated changes. Remove all unnecessary changes.

Removed all unnecessary changes in commit 279a1eb:

Removed:

  • lib.rs and app/mod.rs (infrastructure changes)
  • Making EditRequest/Edit fields public
  • Serialize derives on types
  • main.rs restructuring

Kept (minimal):

  • batch_processor module (102 lines) under routes/feedback
  • propose_edits modifications (+40 lines) to use batch PR when available
  • One line in feedback/mod.rs to expose batch_processor

The batch processor now accesses internal structure directly since it's in the same module tree, eliminating need for public APIs.

Member

@CommanderStorm CommanderStorm left a comment

@copilot please make sure to extend the GitHub struct with new functions to do things with GitHub for a clean architecture

…tecture

Co-authored-by: CommanderStorm <26258709+CommanderStorm@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 3, 2026

@copilot please make sure to extend the GitHub struct with new functions to do things with GitHub for a clean architecture

Refactored in commit 2e61cde.

Added to GitHub struct (external/github.rs):

  • find_pr_with_label() - Find open PR with specific label
  • update_pr_labels() - Update PR labels
  • update_pr_title() - Update PR title
  • get_pr_commit_count() - Get commit count

Batch processor now uses these methods instead of direct octocrab calls, following the clean architecture pattern established in the codebase.

@CommanderStorm
Member

@copilot the description is not updated correctly, only the title and labels.

Append the necessary lines.
Refactoring might be necessary to make this simple.

Comment thread server/src/routes/feedback/batch_processor/mod.rs Outdated
Comment thread server/src/routes/feedback/batch_processor/mod.rs Outdated
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Comment thread server/src/external/github.rs Outdated
Comment thread server/src/external/github.rs Outdated
Comment thread server/src/routes/feedback/batch_processor/mod.rs Outdated
Comment thread server/src/routes/feedback/batch_processor/mod.rs Outdated
Comment thread server/src/routes/feedback/proposed_edits/mod.rs Outdated
CommanderStorm and others added 2 commits February 4, 2026 15:10
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Comment thread server/src/routes/feedback/batch_processor/mod.rs Outdated
@CommanderStorm
Member

@copilot

error[E0425]: cannot find value `e` in this scope
  --> server/src/routes/feedback/batch_processor/mod.rs:13:26
   |
13 |             info!(error=?e, %pr_number, "Found open batch PR");
   |                          ^ not found in this scope

error[E0425]: cannot find value `BATCH_LABEL` in this scope
   --> server/src/routes/feedback/proposed_edits/mod.rs:221:29
    |
221 |                 labels.push(BATCH_LABEL.to_string());
    |                             ^^^^^^^^^^^ not found in this scope
    |
note: constant `crate::feedback::batch_processor::BATCH_LABEL` exists but is inaccessible
   --> server/src/routes/feedback/batch_processor/mod.rs:5:1
    |
5   | const BATCH_LABEL: &str = "batch-in-progress";
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ not accessible

… variable reference

Co-authored-by: CommanderStorm <26258709+CommanderStorm@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 12, 2026

@copilot

error[E0425]: cannot find value e in this scope
...

Fixed compilation errors in commit 953e48e:

Error 1 - error=?e in wrong scope:

  • Removed incorrect error=?e from the Ok(Some) branch (line 13)
  • Updated to use structured logging: info!(%pr_number, %branch, "Found open batch PR")

Error 2 - BATCH_LABEL not accessible:

  • Made BATCH_LABEL constant public in batch_processor module
  • Updated proposed_edits to use super::batch_processor::BATCH_LABEL

Code now compiles successfully.
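
The visibility fix can be reproduced in isolation. The sketch below mirrors the module names from the compiler output, but the layout is simplified and `initial_labels` is a hypothetical helper; without the `pub` on the constant, the `use` line fails because private items are only visible inside their own module and its children.

```rust
mod feedback {
    pub mod batch_processor {
        /// `pub` makes the constant reachable from sibling modules.
        pub const BATCH_LABEL: &str = "batch-in-progress";
    }

    pub mod proposed_edits {
        // A sibling module reaches the constant through `super`.
        use super::batch_processor::BATCH_LABEL;

        /// Hypothetical helper: labels for a freshly created batch PR.
        pub fn initial_labels() -> Vec<String> {
            vec![BATCH_LABEL.to_string()]
        }
    }
}
```

A narrower alternative would be `pub(super) const`, which exposes the constant to the parent `feedback` module and its children only.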

@CommanderStorm
Member

@copilot fix the clippy lint and make sure that the PR titles are helpful and have the needed context.

Batched PRs are OK to lose context; the first PR is not ideal, though.

Co-authored-by: CommanderStorm <26258709+CommanderStorm@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 12, 2026

@copilot fix the clippy lint and make sure that the PR titles are helpful and have the needed context.

Batched PRs are OK to lose context; the first PR is not ideal, though.

Fixed in commit f078b09.

Clippy lint fixed:

  • Resolved dead_code warning by using extract_subject() method for first PR title

PR titles now have helpful context:

  • First PR: Uses extract_subject() to create descriptive titles with specific context
    • Single coordinate edit: chore(data): coordinate edit for mi.03.11.170
    • Single image: chore(data): add image for mi.00.03.001
    • Multiple edits: chore(data): edited images for mi.00.03.001 and coordinates for mi.03.11.170
  • Batch PR updates: Generic title chore(data): batch coordinate edits (N edits)

The first PR now provides clear context about what's being edited, while subsequent batch updates use the generic batching title.

@CommanderStorm CommanderStorm merged commit f82b917 into main Mar 16, 2026
35 checks passed
@CommanderStorm CommanderStorm deleted the copilot/add-batching-system-for-edits branch March 16, 2026 10:58