CLAUDE.md - Confluence Test Data Generator

Project Overview

Purpose: Generate realistic test data for Confluence Cloud instances using production-based multipliers derived from analyzing thousands of real Confluence backups.

Key Features:

Async concurrency for high-volume content creation
Intelligent rate limit handling with exponential backoff and adaptive throttling
Production-based multipliers loaded from CSV file
Content-only mode (--content-only) for scale testing (spaces, pages, blogposts only)
Checkpointing for resumable large-scale runs
Benchmarking with time extrapolation for planning large runs
Pre-generated random text pool for reduced CPU overhead at scale
Optimized connection pooling for both sync and async HTTP sessions

Target User: Teams who need to test Confluence backup/restore scenarios with realistic data volumes.

Related Project: This mirrors the architecture of the Jira Test Data Generator.

File Structure

.
├── .ai/                          # AI-driven development resources
│   ├── README.md                # Directory overview and active plans
│   ├── plans/                   # Implementation plans by initiative
│   │   └── confluence-test-data-generator/
│   │       ├── PLAN.md          # Implementation plan
│   │       └── DESIGN.md        # Design document
│   └── guides/
│       └── AI_IMPLEMENTATION_GUIDE.md
├── confluence_data_generator.py  # Main orchestrator (all generators wired up)
├── confluence_user_generator.py  # Standalone user/group generator (DONE)
├── generators/                   # Modular generators package
│   ├── __init__.py              # Package exports
│   ├── base.py                  # ConfluenceAPIClient, RateLimitState (~694 lines)
│   ├── benchmark.py             # BenchmarkTracker, PhaseMetrics (~489 lines)
│   ├── blogposts.py             # BlogPostGenerator (DONE)
│   ├── checkpoint.py            # CheckpointManager (~696 lines)
│   ├── folders.py               # FolderGenerator (DONE)
│   ├── spaces.py                # SpaceGenerator (DONE)
│   ├── pages.py                 # PageGenerator (DONE)
│   ├── attachments.py           # AttachmentGenerator (DONE)
│   ├── comments.py              # CommentGenerator (DONE)
│   └── templates.py             # TemplateGenerator (DONE)
├── tests/                        # Unit tests (90%+ coverage required)
│   ├── conftest.py              # Shared pytest fixtures
│   ├── test_base.py             # ConfluenceAPIClient tests (73 tests)
│   ├── test_benchmark.py        # BenchmarkTracker tests (75 tests)
│   ├── test_blogposts.py        # BlogPostGenerator tests (49 tests)
│   ├── test_checkpoint.py       # CheckpointManager tests (65 tests)
│   ├── test_comments.py         # CommentGenerator tests (53 tests)
│   ├── test_folders.py          # FolderGenerator tests (39 tests)
│   ├── test_templates.py        # TemplateGenerator tests (23 tests)
│   ├── test_attachments.py      # AttachmentGenerator tests (53 tests)
│   ├── test_pages.py            # PageGenerator tests (52 tests)
│   ├── test_spaces.py           # SpaceGenerator tests (55 tests)
│   ├── test_cleanup.py          # Cleanup command tests (21 tests)
│   └── test_user_generator.py   # User generator tests (51 tests)
├── .github/workflows/
│   ├── test.yml                 # Tests with 90% coverage threshold
│   ├── lint.yml                 # Ruff linting
│   └── code-scanning.yml        # CodeQL security scanning
├── item_type_multipliers.csv    # Multiplier configuration
├── test_connectivity.py         # API connectivity test
├── requirements.txt             # Python dependencies
├── requirements-dev.txt         # Test dependencies
├── pytest.ini                   # Pytest configuration
├── ruff.toml                    # Ruff linter configuration
├── .env.example                 # API token template
├── README.md                    # User-facing documentation
└── CLAUDE.md                    # This file - for AI agents

Development Guidelines

Code Quality Requirements

Test Coverage: Minimum 90% coverage enforced by CI. Never lower this threshold.
Linting: All code must pass ruff check . and ruff format --check .
Security: CodeQL scanning enabled - all security alerts must be resolved

Task Completion Checklist

Before presenting a summary of completed work, always run the integration test first (see "Integration Testing Before User Review" in the Workflow section). This catches API-level issues that unit tests miss. Only present the summary after the integration test passes and cleanup is done.

Before marking any task complete, verify documentation is up to date:

Internal Documentation (in .ai/plans/):

Implementation plan status updated (mark tasks as COMPLETED)
Any API learnings or gotchas documented
Architecture decisions captured

External Documentation:

README.md - User-facing usage instructions and examples
CLAUDE.md - Technical details, patterns, and AI agent guidance
CLI help text in the code itself (--help output)

When to Update:

New feature added → Update README usage section + CLAUDE.md file structure
New CLI option → Update README CLI table + CLAUDE.md Command Line Options
Bug fix with learnings → Add to CLAUDE.md Coding Patterns section
API endpoint used → Verify CLAUDE.md API Endpoints section is current

Coding Patterns (Lessons from PR Reviews)

Security: Never Log Secrets

# BAD - CodeQL will flag this
print(f"Token: {'*' * 8}...{api_token[-4:]}")

# GOOD - No part of the secret revealed
print(f"Token: {'*' * 12} (configured)")

Don't Mutate Function Inputs

# BAD - Mutates caller's dictionary
@classmethod
def from_dict(cls, data: dict) -> "MyClass":
    value = data.pop("key", default)  # Modifies input!
    return cls(value=value, **data)

# GOOD - Work on a copy
@classmethod
def from_dict(cls, data: dict) -> "MyClass":
    data = dict(data)  # Shallow copy
    value = data.pop("key", default)
    return cls(value=value, **data)

Use `replace()` for Atomic File Operations

# BAD - rename() fails if target exists
source_path.rename(target_path)

# GOOD - replace() does atomic overwrite
source_path.replace(target_path)

Document Thread-Safety Assumptions

class CheckpointManager:
    """Manages checkpoint file operations for resumable generation.

    Note: This class is designed for single-threaded use by the main
    orchestrator. Checkpoint updates are serialized through the orchestrator
    even when using async concurrency for API calls. Do not call checkpoint
    methods from multiple concurrent tasks.
    """

Serialize Versioning Per Resource + Retry on 409

Confluence uses Hibernate optimistic locking (HIBERNATEVERSION column). Three things cause 409 CONFLICT (StaleStateException):

Concurrent writes: Two requests read version N, both try to write N+1
Eventual consistency: Sequential GET after PUT returns a stale version
Cross-operation conflicts: Other writes (e.g. property updates) increment HIBERNATEVERSION even though the API version number doesn't change

The fix has three parts:

(a) Group versions by resource, process each resource sequentially
(b) Fetch version once, increment locally
(c) On failure, wait briefly, re-read the version from the API, and retry

# GOOD - sequential per resource, local tracking, retry with re-read on 409
async def _versions_for_page(page_id, n):
    data = await api_get(f"pages/{page_id}")
    current = data["version"]["number"]
    for _ in range(n):
        for retry in range(3):                         # Retry on conflict
            next_ver = current + 1
            success = await api_put(f"pages/{page_id}", version=next_ver)
            if success:
                current = next_ver
                break
            # Re-read version on failure (409 from cross-operation conflict)
            await asyncio.sleep(0.5 * (retry + 1))
            fresh = await api_get(f"pages/{page_id}")
            current = fresh["version"]["number"]

# Parallel across pages, sequential within each page
for batch in batched(page_ids, concurrency):
    tasks = [_versions_for_page(pid, versions_per_page[pid]) for pid in batch]
    await asyncio.gather(*tasks)

This applies to any operation that does read-modify-write on the same resource: page versions, blogpost versions, attachment versions, comment versions.

Suppress Error Logging for Caller-Handled Errors

When a caller has its own retry logic for specific HTTP errors (e.g., 409 conflicts), use suppress_errors to prevent the base class from logging ERROR on every attempt:

# BAD - base class logs ERROR on every 409, even when caller retries successfully
success, _ = await self._api_call_async("PUT", f"pages/{page_id}", data=update_data)

# GOOD - 409s logged at DEBUG; caller's retry loop handles them
success, _ = await self._api_call_async(
    "PUT", f"pages/{page_id}", data=update_data, suppress_errors=(409,)
)

Only Log Errors on Final Retry Failure

For transient errors (5xx, connection errors), log at DEBUG during retries and only escalate to ERROR when all retries are exhausted. This keeps logs clean when retries succeed.

Truncate HTML Error Responses

Confluence returns full HTML pages for 5xx errors. Use _truncate_error_response() to avoid dumping thousands of lines of HTML into logs:

# BAD - logs entire HTML page
self.logger.error(f"Response: {error_text}")

# GOOD - detects HTML and truncates
self.logger.error(f"Response: {self._truncate_error_response(error_text)}")

Attachment Uploads Require Separate Session

The base _api_call_async() hardcodes Content-Type: application/json. Attachment uploads need multipart form data, so AttachmentGenerator creates a dedicated _async_upload_session with X-Atlassian-Token: no-check header and no Content-Type (let aiohttp set it from FormData).

Always Handle Exceptions from `asyncio.gather`

When using asyncio.gather(*tasks, return_exceptions=True), you must check results for Exception instances and log them. This has been flagged in PRs #10, #14, #16, and #17 — it's the most common review finding.

# BAD - silently drops exceptions
results = await asyncio.gather(*tasks, return_exceptions=True)
for result in results:
    if isinstance(result, dict):
        created.append(result)

# GOOD - log exceptions and track errors
results = await asyncio.gather(*tasks, return_exceptions=True)
for result in results:
    if isinstance(result, dict):
        created.append(result)
    elif isinstance(result, Exception):
        self._record_error()
        self.logger.error(f"Task failed with exception: {result}")

Update Checkpoint Progress Before Completing a Phase

In the orchestrator, always call checkpoint.update_phase_count() after work completes but before _complete_phase(). If the process is interrupted between _start_phase and _complete_phase, the checkpoint's created_count will still be 0, causing a resume to re-create everything.

# BAD - resume retries all items if interrupted before _complete_phase
templates = self.template_gen.create_templates(spaces, remaining)
self.benchmark.end_phase("templates", len(templates))
self._complete_phase("templates")

# GOOD - checkpoint knows how many were created even if interrupted
templates = self.template_gen.create_templates(spaces, remaining)
self.benchmark.end_phase("templates", len(templates))
if self.checkpoint:
    self.checkpoint.update_phase_count("templates", len(templates))
self._complete_phase("templates")

Keep Sync and Async Implementations in Parity

When a generator has both sync and async methods, every behavior must exist in both paths: retry logic, checkpoint updates, error handling, request delays. Review both paths together — a fix applied to only one path will be flagged.

Use `request_delay` Not Hard-Coded Sleeps

Sync loops between API calls must use self.request_delay (passed from CLI --request-delay), not hard-coded values like time.sleep(0.1). Hard-coded sleeps ignore user configuration and behave differently from async paths.

# BAD - ignores CLI configuration
time.sleep(0.1)

# GOOD - respects user-configured delay
if self.request_delay > 0:
    time.sleep(self.request_delay)

Tests Must Assert the Behavior They Claim to Test

Test names and docstrings must match what is actually asserted. A test called test_type_alternation that only asserts is not None doesn't test alternation. A test called test_exception_handling that mocks a 400 response doesn't exercise the exception branch.

# BAD - claims to test alternation but doesn't verify it
def test_create_template_type_alternation(self):
    t0 = generator.create_template("TEST1", 0)
    assert t0 is not None  # Only checks existence

# GOOD - actually verifies the alternation behavior
def test_create_template_type_alternation(self):
    t0 = generator.create_template("TEST1", 0)
    t1 = generator.create_template("TEST1", 1)
    assert t0["templateType"] == "page"
    assert t1["templateType"] == "blogpost"

Keep Doc Counts Accurate

When CLAUDE.md or PLAN.md reference test counts (e.g., "23 tests"), verify the actual count matches. Run grep -c "def test_" tests/test_file.py before writing the number.

API Integration Gotchas

Verify endpoints exist before implementing. Confluence v2/v3 APIs are incomplete—many operations documented for v1 don't exist in newer APIs. Before writing code for a new API call:

Check the Confluence Cloud REST API docs
Test the endpoint manually with curl or Postman
Be prepared to fall back to v1 (/rest/api/) if v2 (/api/v2/) doesn't support the operation

When fixing Atlassian/Confluence API issues, always verify both the endpoint AND the resource naming conventions. Atlassian Cloud uses site-specific naming that differs from documentation examples:

Group names: Cloud uses confluence-users-{site-name} not just confluence-users
User IDs: Account IDs are long alphanumeric strings, not usernames
Space keys: Query params for lookup (?keys=KEY), not path segments (/spaces/KEY)
Labels vs Categories: Both use the same legacy endpoint with different prefixes (global vs team)

Always test against a real instance after API fixes—mock tests don't catch naming convention issues.

Space Permissions Use RBAC Role Assignments (Not Direct Permissions)

Confluence Cloud uses RBAC (role-based access control). The legacy v1 permission endpoint (POST /rest/api/space/{key}/permission) returns "not supported in roles-only mode", and the v2 POST permissions endpoint returns 405. Instead, use the v2 role-assignment API:

GET /api/v2/space-roles — list available roles (Collaborator, Viewer, Manager, Admin, View only)
GET /api/v2/spaces/{id}/role-assignments — list current assignments
POST /api/v2/spaces/{id}/role-assignments — add assignments (body must be a JSON array, not a single object)

The "View only" role is not available on standard spaces and returns 400 — filter it out before assigning.

Space permissions are not in the multiplier CSV. The count is computed as num_spaces × num_users × num_roles.

User Discovery Uses v1 CQL Search

The v2 /api/v2/users endpoint returns 400. Use the v1 CQL search instead:

GET /rest/api/search/user?cql=type="user"&limit=50&start=0
Response wraps user data: {"results": [{"user": {"accountId": ..., "accountType": ...}}], "totalSize": N}
Filter to accountType == "atlassian" to exclude app/bot users

Architecture & Design Patterns

Core Classes

`RateLimitState` (dataclass) - `generators/base.py`

Purpose: Track rate limiting state across API calls (thread-safe for async)
Fields:
- retry_after: Seconds to wait (from Retry-After header)
- consecutive_429s: Count of consecutive rate limit hits
- current_delay: Current exponential backoff delay
- max_delay: Maximum delay cap (60s)
- adaptive_delay: Auto-adjusted delay based on recent 429 rate
- _lock: asyncio.Lock for thread-safe updates in async context

`ConfluenceAPIClient` (base class) - `generators/base.py`

Purpose: Base class for all generators with shared API functionality
Key Methods:
- _api_call(): Synchronous API call with rate limiting
- _api_call_async(): Async API call with rate limiting (supports suppress_errors param)
- _truncate_error_response(): Truncate HTML/long error responses for clean logging
- _create_session(): Create requests session with connection pooling
- _get_async_session(): Get/create aiohttp session with connection pooling
- generate_random_text(): Get random text from pre-generated pool

`CheckpointManager` - `generators/checkpoint.py`

Purpose: Track progress and enable resumable data generation
Design Note: Single-threaded use only (see Thread-Safety above)
Key Features:
- JSON-based checkpoint file (confluence_checkpoint_{PREFIX}.json)
- Phase-level progress tracking (pending/in_progress/complete)
- Per-space tracking for pages and blogposts (scales to millions)
- Atomic file writes (temp file + replace)
- Content-only mode support

`BenchmarkTracker` - `generators/benchmark.py`

Purpose: Track performance metrics and provide time extrapolations
Key Features:
- Per-phase timing with items/second rate calculation
- Rate limit (429) and error tracking with percentages
- Time extrapolation based on observed rates

Confluence Content Types

Based on item_type_multipliers.csv, the tool creates:

Category	Types
Spaces	space, space_property, space_label, space_permission*, space_look_and_feel
Pages	page, page_label, page_property, page_restriction, page_version
Blogposts	blogpost, blogpost_label, blogpost_property, blogpost_restriction, blogpost_version
Attachments	attachment_v2, attachment_label, attachment_version, folder, folder_restriction
Comments	inline_comment, inline_comment_version, footer_comment, footer_comment_version
Other	template

*space_permission is not in the multiplier CSV. Its count is computed dynamically as num_spaces × num_discovered_users × num_available_roles. Users are auto-discovered from the instance via CQL search, and roles are fetched from the v2 Space Roles API. Confluence Cloud uses RBAC (role-based access control), so permissions are created as role assignments rather than direct permission grants.

Phase Order

Generation follows this order (defined in CheckpointManager.PHASE_ORDER):

spaces → space_properties → space_labels → space_permissions → space_look_and_feel
templates
pages → page_labels → page_properties → page_restrictions → page_versions
blogposts → blogpost_labels → blogpost_properties → blogpost_restrictions → blogpost_versions
attachments → attachment_labels → attachment_versions
folders → folder_restrictions
inline_comments → inline_comment_versions
footer_comments → footer_comment_versions

Content-Only Mode (--content-only): Only creates spaces, pages, and blogposts.

Rate Limiting Strategy

Priority Order:

Primary: Use Retry-After header from Confluence response
Fallback: Exponential backoff starting at 1s, doubling on each 429
Reset: Return to 1s delay on successful request
Max: Cap at 60s delay
Thread-safe: Uses asyncio.Lock for shared state in async context

Adaptive Throttling (async only):

On 429: Increase adaptive_delay by 100ms (caps at 1s)
On success: Decrease adaptive_delay by 10ms every 10 successes
Jitter: ±20% on rate limit backoff, ±10% on request delay

API Endpoints

Confluence Cloud REST API v2

Base URL: {confluence_url}/api/v2/

Endpoint	Method	Purpose
`spaces`	POST	Create space
`spaces/{id}`	GET	Get space details
`space-roles`	GET	List available space roles (RBAC)
`spaces/{id}/role-assignments`	GET/POST	List/add role assignments (array body for POST)
`folders`	POST	Create folder
`folders/{id}`	DELETE	Delete folder
`pages`	POST	Create page
`pages/{id}`	GET/PUT	Get/update page
`blogposts`	POST	Create blogpost
`footer-comments`	POST	Create footer comment
`inline-comments`	POST	Create inline comment

Confluence Cloud REST API v1 (Legacy)

Base URL: {confluence_url}/rest/api/

Endpoint	Method	Purpose
`space`	GET	List spaces
`space/{key}/label`	POST	Add space label (prefix: `global`) or category (prefix: `team`)
`space/{key}/property`	POST	Add space property
`search/user?cql=type="user"`	GET	Search for users (paginated via `start`/`limit`)
`user/current`	GET	Get current user
`content/{id}/child/attachment`	POST	Upload attachment (multipart form data)
`content/{id}/child/attachment/{att_id}/data`	POST	Upload new attachment version (multipart form data)
`content/{id}/label`	POST	Add label to content (pages, blogposts, attachments)
`template`	POST	Create content template (page or blogpost type)

Note on Labels vs Categories: Both use the same endpoint but with different prefixes in the request body:

Labels: [{"prefix": "global", "name": "label-name"}]
Categories: [{"prefix": "team", "name": "category-name"}]

Testing

Running Tests

# Install dependencies (use venv)
.venv/bin/pip install -r requirements-dev.txt

# Run all tests in parallel
.venv/bin/pytest -n auto

# Run with coverage report
.venv/bin/pytest -n auto --cov=generators --cov-report=term-missing

# Run specific test file
.venv/bin/pytest tests/test_checkpoint.py -v

# Run tests matching a pattern
.venv/bin/pytest -k "test_dry_run"

Test Connectivity

Before running the full generator, verify API credentials:

.venv/bin/python test_connectivity.py

Mocking Strategy

Sync HTTP calls: responses library
Async HTTP calls: aioresponses library
File I/O: tmp_path pytest fixture

End-to-End Verification

After implementing a fix, verify the complete data flow end-to-end before considering the task complete—especially for user/permission assignments. Mock tests validate API call structure but don't catch:

Incorrect resource naming conventions (site-specific names)
Missing or deprecated API endpoints
Permission/access issues with created resources
Data format differences between v1 and v2 APIs

When possible, run a quick manual test against a real Confluence instance after fixing API-related issues.

Command Line Options

Option	Required	Description	Default
`--url`	Yes	Confluence instance URL	-
`--email`	Yes	Atlassian account email	-
`--count`	Yes	Target number of content items (pages + blogposts)	-
`--prefix`	No	Label/property prefix	`TESTDATA`
`--size`	No	Size bucket: small, medium, large	`small`
`--spaces`	No	Override number of spaces	calculated
`--concurrency`	No	Max concurrent requests	`5`
`--request-delay`	No	Delay between API calls in sync loops (seconds)	`0.0`
`--settling-delay`	No	Delay before version creation to let Confluence settle	`0.0`
`--content-only`	No	Only create spaces, pages, blogposts	`false`
`--dry-run`	No	Preview without API calls	`false`
`--resume`	No	Resume from checkpoint	`false`
`--no-checkpoint`	No	Disable checkpointing	`false`
`--no-async`	No	Use synchronous mode	`false`
`--cleanup`	No	Delete all test spaces matching the prefix instead of generating data	`false`
`--yes`	No	Skip confirmation prompt during cleanup	`false`
`--verbose`	No	Enable debug logging	`false`

Note: API token is read from CONFLUENCE_API_TOKEN environment variable or .env file. Never pass tokens via command line.

Size Buckets

Based on Atlassian's sizing guide:

Bucket	Content (all versions)
S	Up to 500,000
M	500,000 - 2.5 million
L	2.5 million - 10 million
XL	10 million - 25 million

Dependencies

Required

aiohttp>=3.9.0: Async HTTP library
python-dotenv>=1.0.0: Load environment variables
requests>=2.31.0: Sync HTTP library

Development

ruff>=0.4.0: Linting and formatting
pytest>=8.0.0: Test framework
pytest-cov>=4.1.0: Coverage reporting
pytest-asyncio>=0.23.0: Async test support
pytest-xdist>=3.5.0: Parallel test execution

Python Version

Minimum: Python 3.12
Target: py312 (configured in ruff.toml)

Running Python Tools (IMPORTANT)

This project uses a virtual environment. Do NOT use system python, pip, or ruff - they won't work due to Homebrew's PEP 668 restrictions.

Option 1: Use venv binaries directly (preferred)

.venv/bin/python script.py
.venv/bin/ruff check .
.venv/bin/ruff format .
.venv/bin/pytest

Option 2: Use uvx for one-off tool runs

uvx ruff check .
uvx ruff format .
uvx pytest

Option 3: Activate the venv first

source .venv/bin/activate
python script.py
ruff check .
pytest

Quick reference:

Task	Command
Run linter	`.venv/bin/ruff check .`
Fix lint issues	`.venv/bin/ruff check --fix .`
Check formatting	`.venv/bin/ruff format --check .`
Fix formatting	`.venv/bin/ruff format .`
Run tests	`.venv/bin/pytest -n auto`
Run single test	`.venv/bin/pytest tests/test_file.py -v`
Install deps	`.venv/bin/pip install -r requirements.txt`

Workflow

After PR Merge

After the user confirms a PR is merged, always:

git checkout main
git pull origin main
Create a new feature branch for the next task

Never continue working on the old feature branch after its PR is merged.

Integration Testing Before User Review

Before asking the user to test new functionality, run integration tests yourself using credentials from .env:

Run the tool with minimal data using a unique prefix:

source .env
TEST_PREFIX="AITEST$(date +%H%M%S)"
.venv/bin/python confluence_data_generator.py \
    --url $CONFLUENCE_URL \
    --email $CONFLUENCE_EMAIL \
    --count 1 \
    --spaces 1 \
    --prefix $TEST_PREFIX

Check output for errors - any API failures, missing methods, wrong parameters
If errors occur:
- Fix the code
- Delete the test space via API (permanently deleted, not moved to trash)
- Re-run with the same or new prefix

Clean up after successful test:

source .env
# Delete the space (permanent - does NOT go to trash when deleted via API)
# Replace AITEST1234561 with actual space key from output
curl -s -u "$CONFLUENCE_EMAIL:$CONFLUENCE_API_TOKEN" -X DELETE \
    "$CONFLUENCE_URL/rest/api/space/${TEST_PREFIX}1"

Note: Deleting a space via the REST API permanently removes it — it does not go to the trash like UI deletion does. The API returns 202 (accepted) and the space is gone.

This catches issues like wrong method names, incorrect API parameters, and missing async methods before the user wastes time debugging.

Validate Fixes Before Committing

For Python projects, prefer running quick validation tests or API calls after fixes rather than assuming the change works:

# Quick unit test validation
.venv/bin/pytest tests/test_specific.py -v -k "test_name"

# Quick syntax/import check
.venv/bin/python -c "from generators.spaces import SpaceGenerator; print('OK')"

# Quick API validation (if credentials available)
.venv/bin/python test_connectivity.py

# Quick lint check
.venv/bin/ruff check . && .venv/bin/ruff format --check .

This catches issues early—before they're committed and before CI runs.

GitHub CLI (gh) Operations

Replying to PR Review Comments

When replying to individual review comments (code comments, not general PR comments), use:

# Correct format - note the -X POST and leading /
gh api -X POST /repos/{owner}/{repo}/pulls/{pr_number}/comments/{comment_id}/replies \
    -f body="Your reply here"

Common mistakes:

Missing -X POST (defaults to GET, returns 404)
Missing leading / in the path
Using repos/ instead of /repos/

Viewing PR Comments

# Get all review comments (code comments) with their IDs
gh api repos/{owner}/{repo}/pulls/{pr_number}/comments \
    --jq '.[] | "ID: \(.id) | Path: \(.path):\(.line) | Body: \(.body | split("\n")[0])"'

# Get general PR comments (not code comments)
gh pr view {pr_number} --comments

Adding a General PR Comment

gh pr comment {pr_number} --body "Your comment here"

Quick Reference for Common Tasks

"Add support for [new content type]"

Check Confluence API v2 docs for endpoint
Decide which generator module (or create new one)
Inherit from ConfluenceAPIClient
Add phase to CheckpointManager.PHASE_ORDER
Add multiplier key mapping in CheckpointManager.initialize()
Add tests with mocked responses
Update generators/__init__.py exports

"Fix rate limiting issue"

Check ConfluenceAPIClient._handle_rate_limit_async() for async
Check ConfluenceAPIClient._handle_rate_limit() for sync
Verify asyncio.Lock is used for shared state
Adjust max_delay or backoff multiplier in RateLimitState

"Add new CLI option"

Add to argparse in main orchestrator
Pass through to relevant generators
Update README.md
Update this file's Command Line Options section

"Complete a task"

Always check documentation before marking complete:

Internal docs: Update .ai/plans/ implementation plan (mark status, add learnings)
README.md: Add/update usage examples, CLI options, feature descriptions
CLAUDE.md: Update file structure, add patterns/gotchas, update API endpoints
Code docs: Ensure CLI --help text is accurate and helpful

Last Updated: 2026-02-10 AI Agent Note: This file is specifically for you. The user-facing docs are in README.md.

FilesExpand file tree

CLAUDE.md

Latest commit

History