
Commit 1ae5634

fix: improve CI test configuration without version freezing
- Add explicit CI and PROMPTCODE_TEST env vars
- Use --bail=1 to stop at first failure for easier debugging
- Reduce test timeout to 5s (from 10s)
- Keep using latest versions to catch issues early
1 parent 08c8283 commit 1ae5634

File tree

9 files changed (+556, −4 lines)
Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
---
allowed-tools: Bash(promptcode expert:*), Bash(promptcode preset list:*), Bash(promptcode generate:*), Bash(open -a Cursor:*), Read(/tmp/expert-*:*), Write(/tmp/expert-consultation-*.md), Task
description: Consult AI expert (O3/O3-pro) for complex problems with code context - supports ensemble mode for multiple models
---

Consult an expert about: $ARGUMENTS

## Instructions:

1. Analyze the request in $ARGUMENTS:
   - Extract the main question/problem
   - Identify if code context would help (look for keywords matching our presets)
   - Check for multiple model requests (e.g., "compare using o3 and gpt-5", "ask o3, gpt-5, and gemini")
   - Available models from our MODELS list: o3, o3-pro, o3-mini, gpt-5, gpt-5-mini, gpt-5-nano, sonnet-4, opus-4, gemini-2.5-pro, gemini-2.5-flash, grok-4
   - If 2+ models detected → use ensemble mode
   - For single model: determine preference (if user mentions "o3-pro" or "o3 pro", use o3-pro)

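The multi-model detection in step 1 can be sketched in shell. This is only an illustration: it checks a small subset of the MODELS list with naive substring matching, and the example arguments are made up.

```shell
# Count known model names mentioned in the request;
# two or more triggers ensemble mode.
ARGS="compare using o3 and gpt-5: how should we cache presets?"
COUNT=0
for M in o3 o3-pro gpt-5 gemini-2.5-pro grok-4; do
  case "$ARGS" in
    *"$M"*) COUNT=$((COUNT + 1)) ;;
  esac
done
if [ "$COUNT" -ge 2 ]; then MODE="ensemble"; else MODE="single"; fi
echo "$MODE ($COUNT models)"
```

Note that plain substring matching would also count "o3" inside "o3-pro"; a real implementation should match on word boundaries.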
2. If code context needed, list available presets:
   ```bash
   promptcode preset list
   ```
   Choose relevant preset(s) based on the question.

3. Prepare consultation file for review:
   - Create a consultation file at `/tmp/expert-consultation-{timestamp}.md`
   - Structure the file with:
     ```markdown
     # Expert Consultation

     ## Question
     {user's question}

     ## Context
     {any relevant context or background}
     ```
   - If a preset would help, append the code context:
     ```bash
     echo -e "\n## Code Context\n" >> "/tmp/expert-consultation-{timestamp}.md"
     promptcode generate --preset "{preset_name}" >> "/tmp/expert-consultation-{timestamp}.md"
     ```

4. Open consultation for user review (if Cursor is available):
   ```bash
   open -a Cursor "/tmp/expert-consultation-{timestamp}.md"
   ```

5. Estimate cost and get approval:
   - Model costs (from our pricing):
     - O3: $2/$8 per million tokens (input/output)
     - O3-pro: $20/$80 per million tokens (input/output)
     - GPT-5: $1.25/$10 per million tokens
     - GPT-5-mini: $0.25/$2 per million tokens
     - Sonnet-4: $5/$20 per million tokens
     - Opus-4: $25/$100 per million tokens
     - Gemini-2.5-pro: $3/$12 per million tokens
     - Grok-4: $5/$15 per million tokens
   - Calculate based on file size (roughly: file_size_bytes / 4 = tokens)

   **For single model:**
   - Say: "I've prepared the expert consultation (~{tokens} tokens). Model: {model}. You can edit the file to refine your question. Reply 'yes' to send to the expert (estimated cost: ${cost})."

   **For ensemble mode (multiple models):**
   - Calculate total cost across all models
   - Say: "I've prepared an ensemble consultation (~{tokens} tokens) with {models}. Total estimated cost: ${total_cost} ({model1}: ${cost1}, {model2}: ${cost2}, ...). Reply 'yes' to proceed with all models in parallel."

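The rough estimate in step 5 can be sketched in shell. The file content and the use of O3's input price are purely illustrative; a real estimate should also budget for output tokens.

```shell
# tokens ≈ file_size_bytes / 4;
# input cost ≈ tokens / 1,000,000 × per-million-token input price (O3: $2).
FILE="/tmp/expert-consultation-demo.md"
printf '%s' "How should we structure the plugin API?" > "$FILE"

BYTES=$(wc -c < "$FILE")
TOKENS=$((BYTES / 4))
COST=$(awk -v t="$TOKENS" 'BEGIN { printf "%.6f", t / 1000000 * 2 }')
echo "~${TOKENS} tokens, estimated O3 input cost \$${COST}"
```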
6. Execute based on mode:

   **Single Model Mode:**
   ```bash
   promptcode expert --prompt-file "/tmp/expert-consultation-{timestamp}.md" --model {model} --yes
   ```

   **Ensemble Mode (Parallel Execution):**
   - Use Task tool to run multiple models in parallel
   - Each task runs the same consultation file with different models
   - Store each result in separate file: `/tmp/expert-{model}-{timestamp}.txt`
   - Example for 3 models (run these in PARALLEL using Task tool):
     ```
     Task 1: promptcode expert --prompt-file "/tmp/expert-consultation-{timestamp}.md" --model o3 --yes > /tmp/expert-o3-{timestamp}.txt
     Task 2: promptcode expert --prompt-file "/tmp/expert-consultation-{timestamp}.md" --model gpt-5 --yes > /tmp/expert-gpt5-{timestamp}.txt
     Task 3: promptcode expert --prompt-file "/tmp/expert-consultation-{timestamp}.md" --model gemini-2.5-pro --yes > /tmp/expert-gemini-{timestamp}.txt
     ```
   - IMPORTANT: Launch all tasks at once for true parallel execution
   - Wait for all tasks to complete
   - Note: The --yes flag confirms we have user approval for the cost

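Outside the Task tool, the same fan-out can be approximated with plain-shell background jobs. The timestamp below is a fixed example value rather than a real `date` stamp.

```shell
# Launch all models at once, then block until every job has exited.
TS="20250113-120000"   # illustrative; normally $(date +%Y%m%d-%H%M%S)
PROMPT="/tmp/expert-consultation-${TS}.md"

for MODEL in o3 gpt-5 gemini-2.5-pro; do
  promptcode expert --prompt-file "$PROMPT" --model "$MODEL" --yes \
    > "/tmp/expert-${MODEL}-${TS}.txt" &
done
wait   # all three run concurrently; wait returns once every job finishes
```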
7. Handle the response:

   **Single Model Mode:**
   - If successful: Open response in Cursor (if available) and summarize key insights
   - If API key missing: Show appropriate setup instructions

   **Ensemble Mode (Synthesis):**
   - Read all response text files
   - Extract key insights from each model's response
   - Create synthesis report in `/tmp/expert-ensemble-synthesis-{timestamp}.md`:

     ```markdown
     # Ensemble Expert Consultation Results

     ## Question
     {original_question}

     ## Expert Responses

     ### {Model1} - ${actual_cost}, {response_time}s
     **Key Points:**
     - {key_point_1}
     - {key_point_2}
     - {key_point_3}

     ### {Model2} - ${actual_cost}, {response_time}s
     **Key Points:**
     - {key_point_1}
     - {key_point_2}
     - {key_point_3}

     ## Synthesis

     **Consensus Points:**
     - {point_agreed_by_multiple_models}
     - {another_consensus_point}

     **Best Comprehensive Answer:** {Model} provided the most thorough analysis, particularly strong on {specific_aspect}

     **Unique Insights:**
     - {Model1}: {unique_insight_from_model1}
     - {Model2}: {unique_insight_from_model2}

     **🏆 WINNER:** {winning_model} - {clear_reason_why_this_model_won}
     (If tie: "TIE - Both models provided equally valuable but complementary insights")

     **Performance Summary:**
     - Total Cost: ${total_actual_cost}
     - Total Time: {total_time}s
     - Best Value: {model_with_best_cost_to_quality_ratio}
     ```

   - Open synthesis in Cursor if available
   - IMPORTANT: Always declare a clear winner (or explicitly state if it's a tie)
   - Provide brief summary of which model performed best and why they won

   **Error Handling:**
   - If any model fails in ensemble mode, continue with successful ones
   - Report which models succeeded/failed
   - If OPENAI_API_KEY missing:
     ```
     To use expert consultation, set your OpenAI API key:
     export OPENAI_API_KEY=sk-...
     Get your key from: https://platform.openai.com/api-keys
     ```
   - For other errors: Report exact error message

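A minimal pre-flight check for the missing-key case might look like the following. This is a hypothetical helper sketch, not part of the promptcode CLI itself.

```shell
# Prints the setup instructions and returns non-zero when the key is absent.
check_openai_key() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "To use expert consultation, set your OpenAI API key:"
    echo "export OPENAI_API_KEY=sk-..."
    echo "Get your key from: https://platform.openai.com/api-keys"
    return 1
  fi
}
```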
## Important:
- Default to O3 model unless O3-pro explicitly requested or needed for complex reasoning
- For ensemble mode: limit to maximum 4 models to prevent resource exhaustion
- Always show cost estimate before sending
- Keep questions clear and specific
- Include relevant code context when asking about specific functionality
- NEVER automatically add --yes without user approval
- Reasoning effort defaults to 'high' (set in CLI) - no need to specify
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
---
allowed-tools: Bash(promptcode preset create:*), Bash(promptcode preset info:*), Glob(**/*), Grep, Write(.promptcode/presets/*.patterns)
description: Create a promptcode preset from description
---

Create a promptcode preset for: $ARGUMENTS

## Instructions:

1. Parse the description to understand what code to capture:
   - Look for keywords like package names, features, components, integrations
   - Identify if it's Python, TypeScript, or mixed code
   - Determine the scope (single package, cross-package feature, etc.)

2. Research the codebase structure:
   - Use Glob to explore relevant directories
   - Use Grep to find related files if needed
   - Identify the main code locations and any related tests/docs

3. Generate a descriptive preset name:
   - Use kebab-case (e.g., "auth-system", "microlearning-utils")
   - Keep it concise but descriptive

4. Create the preset:
   ```bash
   promptcode preset create "{preset_name}"
   ```
   This creates `.promptcode/presets/{preset_name}.patterns`

5. Edit the preset file to add patterns:
   - Start with a header comment explaining what the preset captures
   - Add inclusion patterns for the main code
   - Add patterns for related tests and documentation
   - Include common exclusion patterns:
     - `!**/__pycache__/**`
     - `!**/*.pyc`
     - `!**/node_modules/**`
     - `!**/dist/**`
     - `!**/build/**`

6. Test and report results:
   ```bash
   promptcode preset info "{preset_name}"
   ```
   Report the file count and estimated tokens.

## Common Pattern Examples:
- Python package: `python/cogflows-py/packages/{package}/src/**/*.py`
- TypeScript component: `ts/next/{site}/components/{component}/**/*.{ts,tsx}`
- Cross-package feature: Multiple specific paths
- Tests: `python/cogflows-py/packages/{package}/tests/**/*.py`
- Documentation: `**/{feature}/**/*.md`

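Putting step 5 and the pattern examples together, a complete preset file might look like the sketch below. The `auth-system` name and every path in it are hypothetical, chosen only to show the header comment, inclusion patterns, and common exclusions side by side.

```
# auth-system: authentication flow across backend and frontend
# Created: 2025-01-13

# Python backend
python/cogflows-py/packages/auth/src/**/*.py
python/cogflows-py/packages/auth/tests/**/*.py

# TypeScript frontend components
ts/next/app/components/auth/**/*.{ts,tsx}

# Documentation
**/auth/**/*.md

# Common exclusions
!**/__pycache__/**
!**/*.pyc
!**/node_modules/**
!**/dist/**
!**/build/**
```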
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
---
allowed-tools: Bash(promptcode preset info:*), Bash(promptcode preset list:*), Glob(.promptcode/presets/*.patterns), Read(.promptcode/presets/*.patterns:*)
description: Show detailed information about a promptcode preset
---

Show detailed information about promptcode preset: $ARGUMENTS

## Instructions:

1. Parse the arguments to identify the preset:
   - If exact preset name provided (e.g., "functional-framework"), use it directly
   - If description provided, infer the best matching preset:
     - Run `promptcode preset list` to see available presets
     - Read header comments from preset files in `.promptcode/presets/` if needed
     - Match based on keywords and context
     - Choose the most relevant preset

2. Run the promptcode info command with the determined preset name:
   ```bash
   promptcode preset info "{preset_name}"
   ```

3. If a preset was inferred from description, explain which preset was chosen and why.

The output will show:
- Preset name and path
- Description from header comments
- File count and token statistics
- Pattern details
- Sample files included
- Usage instructions
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
---
allowed-tools: Bash(promptcode preset list:*)
description: List all available promptcode presets with pattern counts
---

List all available promptcode presets.

Run the command:
```bash
promptcode preset list
```

This will display all available presets with their pattern counts. Use the preset names with other promptcode commands to work with specific code contexts.
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
---
allowed-tools: Bash(promptcode generate:*), Bash(promptcode preset list:*), Glob(.promptcode/presets/*.patterns), Read(.promptcode/presets/*.patterns:*)
description: Generate AI-ready prompt file from a promptcode preset
---

Generate prompt file from promptcode preset: $ARGUMENTS

## Instructions:

1. Parse arguments to understand what the user wants:
   - Extract preset name or description
   - Extract output path/filename if specified (e.g., "to ~/Desktop/analysis.txt", "in /tmp/", "as myfile.txt")

2. If inferring from description:
   - Run `promptcode preset list` to see available presets
   - Read header comments from `.promptcode/presets/*.patterns` files if needed
   - Match based on keywords and context
   - Choose the most relevant preset

3. Determine output path:
   - Default: `/tmp/promptcode-{preset-name}-{timestamp}.txt` where timestamp is YYYYMMDD-HHMMSS
   - If user specified just a folder: `{folder}/promptcode-{preset-name}-{timestamp}.txt`
   - If user specified filename without path: `/tmp/{filename}`
   - If user specified full path: use exactly as specified

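The default path construction in step 3 can be sketched in shell; the preset name below is illustrative.

```shell
# Build /tmp/promptcode-{preset-name}-{timestamp}.txt with a YYYYMMDD-HHMMSS stamp.
PRESET="functional-framework"
TS=$(date +%Y%m%d-%H%M%S)
OUT="/tmp/promptcode-${PRESET}-${TS}.txt"
echo "$OUT"
```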
4. Generate the prompt file:
   ```bash
   promptcode generate --preset "{preset_name}" --output "{output_path}"
   ```

5. Report results:
   - Which preset was used (especially important if inferred)
   - Full path to the output file
   - Token count and number of files included
   - Suggest next steps (e.g., "You can now open this file in your editor")

## Examples of how users might call this:
- `/promptcode-preset-to-prompt functional-framework`
- `/promptcode-preset-to-prompt microlearning analysis to ~/Desktop/`
- `/promptcode-preset-to-prompt the functional code as analysis.txt`

.github/workflows/release.yml

Lines changed: 5 additions & 4 deletions
```diff
@@ -8,7 +8,7 @@ on:
 
 jobs:
   test:
-    runs-on: ubuntu-20.04  # Use stable runner version
+    runs-on: ubuntu-24.04
     timeout-minutes: 10  # Add timeout to prevent hanging
     steps:
       - uses: actions/checkout@v4
@@ -31,16 +31,17 @@ jobs:
 
       - name: Setup Bun for CLI tests
         uses: oven-sh/setup-bun@v2
-        with:
-          bun-version: 1.0.4  # Pin to known stable version
 
       - name: CLI tests
         timeout-minutes: 5  # Add step-level timeout
+        env:
+          CI: true
+          PROMPTCODE_TEST: 1
         run: |
           cd packages/cli
           bun install --frozen-lockfile
           bun run build
-          bun test --timeout 10000  # Increase per-test timeout to 10s
+          PROMPTCODE_TEST=1 CI=true bun test --bail=1 --timeout 5000
 
   build-extension:
     needs: test
```
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
# Domain boundaries analysis preset
# Analyzing separation between CLI layer and agent command layers
# Created: 2025-01-13

# CLI Layer - Core functionality
packages/cli/src/commands/expert.ts
packages/cli/src/commands/cc.ts
packages/cli/src/commands/cursor.ts
packages/cli/src/utils/cost.ts
packages/cli/src/utils/environment.ts
packages/cli/src/providers/models.ts
packages/cli/src/providers/ai-provider.ts

# Agent Command Layer - Claude Code
.claude/commands/*.md

# Agent Templates
packages/cli/src/claude-templates/*.md
packages/cli/src/cursor-templates/*.mdc

# Integration utilities
packages/cli/src/utils/claude-integration.ts
packages/cli/src/utils/cursor-integration.ts
packages/cli/src/utils/integration-helper.ts

# Exclude test files and build artifacts
!**/*.test.ts
!**/dist/**
!**/node_modules/**
