feat: implement Phase 1 - Core Library/API (v0.7.0-alpha) #5
This commit implements the foundation for using promptext as a Go library,
transforming it from a CLI-only tool into a developer-friendly API while
maintaining 100% backward compatibility with the existing CLI.
## New Public API Package: pkg/promptext
Created a complete public API surface with the following components:
### Core API (promptext.go)
- Extract() - Main entry point for simple extraction
- Extractor type - Reusable extractor for multiple directories
- NewExtractor() - Factory function with builder pattern support
- Version constant for library versioning
### Functional Options (options.go)
Implemented clean, composable options pattern:
- WithExtensions() - Filter by file extensions
- WithExcludes() - Exclude patterns
- WithGitIgnore() - Control .gitignore respect
- WithDefaultRules() - Control built-in filtering
- WithRelevance() - Keyword-based relevance filtering
- WithTokenBudget() - AI model token limit enforcement
- WithFormat() - Output format selection
- WithVerbose() - Verbose logging
- WithDebug() - Debug logging with timing
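The options above follow Go's functional options pattern. The sketch below shows how such options might compose. It assumes a private `config` struct whose field names are illustrative, and a hypothetical `newConfig` helper that is not part of the PR.

```go
package main

import "fmt"

// config stands in for the library's private configuration struct.
// The field names are assumptions made for this sketch.
type config struct {
	extensions  []string
	tokenBudget int
}

// Option mutates a config. Options are applied in the order given,
// so a later option overrides an earlier one that sets the same field.
type Option func(*config)

// WithExtensions restricts extraction to the given file extensions.
func WithExtensions(exts ...string) Option {
	return func(c *config) { c.extensions = exts }
}

// WithTokenBudget sets the maximum number of tokens in the output.
func WithTokenBudget(maxTokens int) Option {
	return func(c *config) { c.tokenBudget = maxTokens }
}

// newConfig (hypothetical helper) applies options over zero-value defaults.
func newConfig(opts ...Option) config {
	var c config
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	c := newConfig(WithExtensions(".go", ".md"), WithTokenBudget(8000))
	fmt.Println(c.extensions, c.tokenBudget) // prints "[.go .md] 8000"
}
```

Because each option is an ordinary function value, callers can build a slice of options conditionally and pass it with `opts...`.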
### Type System (result.go)
Public types for structured data access:
- Result - Main result container with formatted output
- ProjectOutput - Complete project extraction data
- FileInfo - Individual file metadata and content
- DirectoryNode - Hierarchical directory structure
- GitInfo, Metadata, FileStatistics, BudgetInfo, FilterConfig
### Format System (format.go)
- Format type for output formats (PTX, JSONL, Markdown, XML, etc.)
- Formatter interface for custom formatters
- RegisterFormatter() for extensibility
- Result.As() for format conversion
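As a rough illustration of the registry idea, the sketch below looks up custom formatters first and falls back to built-in ones. The `Formatter` method set is not shown in this PR, so the single `Format(paths []string)` method, the `listFormatter` type, and the `builtin` map are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Formatter is a hypothetical shape; the real interface may differ.
type Formatter interface {
	Format(paths []string) (string, error)
}

// listFormatter (illustrative) renders one path per line with a prefix.
type listFormatter struct{ prefix string }

func (f listFormatter) Format(paths []string) (string, error) {
	var b strings.Builder
	for _, p := range paths {
		b.WriteString(f.prefix + p + "\n")
	}
	return b.String(), nil
}

var (
	mu     sync.RWMutex // guards custom
	custom = map[string]Formatter{}
	// builtin is never written after initialization, so it needs no lock.
	builtin = map[string]Formatter{"markdown": listFormatter{prefix: "- "}}
)

// RegisterFormatter adds or replaces a custom formatter under name.
func RegisterFormatter(name string, f Formatter) {
	mu.Lock()
	defer mu.Unlock()
	custom[name] = f
}

// GetFormatter checks custom formatters first, then built-in ones.
func GetFormatter(name string) (Formatter, error) {
	mu.RLock()
	f, ok := custom[name]
	mu.RUnlock()
	if ok {
		return f, nil
	}
	if f, ok := builtin[name]; ok {
		return f, nil
	}
	return nil, fmt.Errorf("unsupported format %q", name)
}

func main() {
	RegisterFormatter("bullets", listFormatter{prefix: "* "})
	f, err := GetFormatter("bullets")
	if err != nil {
		panic(err)
	}
	out, _ := f.Format([]string{"main.go", "go.mod"})
	fmt.Print(out)
}
```

Checking custom entries first lets a caller override a built-in format by registering under the same name; whether that is desirable is a design decision for the library.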
### Error Handling (errors.go)
Well-typed sentinel errors:
- ErrInvalidDirectory - Invalid/inaccessible directory
- ErrNoFilesMatched - No matching files found
- ErrTokenBudgetTooLow - Budget too low
- ErrInvalidFormat - Unsupported format
- Wrapped errors: DirectoryError, FilterError, FormatError
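A caller would typically distinguish these errors with `errors.Is` and `errors.As`. The sketch below is self-contained: `ErrInvalidDirectory` and `DirectoryError` are named in the PR, but the struct fields, the `extract` stub, and the two helper functions are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrInvalidDirectory mirrors the sentinel named in the PR.
var ErrInvalidDirectory = errors.New("invalid directory")

// DirectoryError wraps a sentinel with the offending path.
// The field layout is an assumption, not the library's definition.
type DirectoryError struct {
	Path string
	Err  error
}

func (e *DirectoryError) Error() string { return fmt.Sprintf("directory %q: %v", e.Path, e.Err) }

// Unwrap lets errors.Is and errors.As see the wrapped sentinel.
func (e *DirectoryError) Unwrap() error { return e.Err }

// extract is a stub standing in for promptext.Extract; it accepts only ".".
func extract(dir string) error {
	if dir != "." {
		return &DirectoryError{Path: dir, Err: ErrInvalidDirectory}
	}
	return nil
}

// isInvalidDir reports whether err wraps ErrInvalidDirectory.
func isInvalidDir(err error) bool { return errors.Is(err, ErrInvalidDirectory) }

// failedPath returns the path carried by a wrapped DirectoryError, if any.
func failedPath(err error) (string, bool) {
	var dirErr *DirectoryError
	if errors.As(err, &dirErr) {
		return dirErr.Path, true
	}
	return "", false
}

func main() {
	// Extra wrapping with %w does not hide the sentinel or the typed error.
	err := fmt.Errorf("extract failed: %w", extract("/missing"))
	fmt.Println(isInvalidDir(err)) // prints "true"
	if p, ok := failedPath(err); ok {
		fmt.Println(p) // prints "/missing"
	}
}
```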
### Documentation (doc.go)
Comprehensive package documentation with:
- Quick start examples
- Common usage patterns
- All available options
- Design principles
- Multiple use case examples
## Testing & Examples
### Unit Tests (promptext_test.go)
Complete test coverage including:
- Simple extraction scenarios
- Extension and exclusion filtering
- Multiple output formats
- Token budget enforcement
- Error conditions
- Extractor reusability
- Builder pattern
- Format conversion
All tests pass ✓
### Example Programs
Created practical examples demonstrating:
1. examples/basic/ - Fundamental usage patterns
   - Simple extraction with defaults
   - Extension filtering
   - Exclusion patterns
   - Token budgets
   - Format selection and conversion
   - Saving to files
   - Reusable extractors
   - Builder pattern
2. examples/token-budget/ - AI-focused extraction
   - Token budget enforcement
   - Relevance filtering by keywords
   - Combining relevance + budget
   - Optimizing for different AI models
   - Token efficiency analysis
3. examples/README.md - Comprehensive guide
   - Usage patterns
   - Common use cases
   - Available options reference
   - Error handling examples
## Documentation Updates
### README.md
Added new "Using as a Library" section with:
- Installation instructions
- Quick start guide
- Common patterns
- Complete options reference
- Output formats
- Error handling
- Links to examples and API docs
Updated Use Cases section to highlight library integration
## Design Principles
1. **Simple by Default** - Works with zero configuration
2. **Composable** - Options combine naturally
3. **Discoverable** - IDE autocomplete reveals options
4. **Safe** - Typed errors with errors.Is() support
5. **Extensible** - Custom formatters via RegisterFormatter()
## Backward Compatibility
✓ All existing CLI functionality works unchanged
✓ All existing tests pass
✓ CLI still builds and operates correctly
✓ No breaking changes to internal packages
## API Examples
Simple usage:
```go
result, err := promptext.Extract(".")
fmt.Println(result.FormattedOutput)
```
With options:
```go
result, err := promptext.Extract(".",
	promptext.WithExtensions(".go"),
	promptext.WithTokenBudget(8000),
	promptext.WithFormat(promptext.FormatPTX),
)
```
Reusable extractor:
```go
extractor := promptext.NewExtractor(
	promptext.WithRelevance("auth"),
	promptext.WithTokenBudget(5000),
)
result1, _ := extractor.Extract("/project1")
result2, _ := extractor.Extract("/project2")
```
Format conversion:
```go
result, _ := promptext.Extract(".")
markdown, _ := result.As(promptext.FormatMarkdown)
jsonl, _ := result.As(promptext.FormatJSONL)
```
## Next Steps (Phase 2+)
This Phase 1 implementation provides the foundation for:
- Phase 2: CLI migration to use library (v0.7.1)
- Phase 3: Advanced features (streaming, analysis) (v0.8.0)
- Phase 4: Documentation & examples expansion (v0.9.0)
- Phase 5: v1.0.0 with API stability guarantees
## Files Changed
pkg/promptext/
├── promptext.go - Main API and Extract() function
├── options.go - Functional options pattern
├── result.go - Public result types
├── format.go - Format system
├── errors.go - Error types
├── doc.go - Package documentation
└── promptext_test.go - Comprehensive tests
examples/
├── README.md - Examples guide
├── basic/main.go - Basic usage examples
└── token-budget/main.go - AI-focused examples
README.md - Updated with library documentation
Closes #<issue> (if applicable)
```go
// It first checks custom formatters, then falls back to built-in formatters.
func GetFormatter(formatStr string) (Formatter, error) {
	// Check custom formatters first
	if customFormatter, ok := customFormatters[formatStr]; ok {
```
Bug: Data Races in Public Library API
The global customFormatters map is accessed without synchronization in RegisterFormatter and GetFormatter. Concurrent calls to these functions from different goroutines will cause data races. Since this is a public library API, users may call RegisterFormatter from multiple goroutines during initialization, leading to undefined behavior.
```go
// extractor := promptext.NewExtractor().WithFormat(promptext.FormatJSONL)
func (e *Extractor) WithFormat(format Format) *Extractor {
	e.config.format = format
	return e
```
Bug: Builder Methods Violate Immutability Contract
The builder methods WithExtensions, WithExcludes, and WithFormat mutate the Extractor in place but the documentation claims they return "a new Extractor". This breaks immutability and causes unexpected behavior when reusing extractors. For instance, if an extractor is created once and builder methods are called on it later, the original configuration is modified, affecting all subsequent uses. The documentation promises a new instance but the implementation modifies the receiver.
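One way to honor the documented "returns a new Extractor" contract is to copy the receiver before changing it. The sketch below is a minimal illustration, not the PR's implementation: the `config` fields, the `clone` helper, and the use of a plain `string` for the format are simplifying assumptions.

```go
package main

import "fmt"

// config is a simplified stand-in for the library's private config.
type config struct {
	extensions []string
	format     string // the real API uses a Format type
}

// Extractor holds configuration that builder methods never mutate in place.
type Extractor struct{ config config }

// NewExtractor returns an Extractor with zero-value configuration.
func NewExtractor() *Extractor { return &Extractor{} }

// clone (hypothetical helper) copies the config, including the slice,
// so derived extractors do not share a backing array with the original.
func (e *Extractor) clone() *Extractor {
	c := e.config
	c.extensions = append([]string(nil), e.config.extensions...)
	return &Extractor{config: c}
}

// WithExtensions returns a copy of e restricted to the given extensions.
func (e *Extractor) WithExtensions(exts ...string) *Extractor {
	next := e.clone()
	next.config.extensions = append([]string(nil), exts...)
	return next
}

// WithFormat returns a copy of e that uses the given output format.
func (e *Extractor) WithFormat(format string) *Extractor {
	next := e.clone()
	next.config.format = format
	return next
}

func main() {
	base := NewExtractor().WithFormat("ptx")
	goOnly := base.WithExtensions(".go")
	jsOnly := base.WithExtensions(".js")
	// base is untouched; each derived extractor keeps its own extensions.
	fmt.Println(len(base.config.extensions), goOnly.config.extensions, jsOnly.config.extensions)
	// prints "0 [.go] [.js]"
}
```

The copy costs one small allocation per builder call, which is usually negligible next to a directory walk. The alternative is to keep the in-place mutation and correct the documentation instead.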
## Pull Request Review: Phase 1 - Core Library/API (v0.7.0-alpha)
### Overall Assessment
This is an excellent first phase implementation that successfully transforms promptext from a CLI-only tool into a developer-friendly library while maintaining 100% backward compatibility. The code demonstrates strong software engineering practices with a clean API design, comprehensive documentation, and good test coverage.
### ✅ Strengths
1. API Design Excellence
2. Code Quality
3. Test Coverage
4. Documentation
### Areas for Improvement
#### 🔴 Critical Issues
**1. Concurrency Safety Concern in Extractor** (pkg/promptext/promptext.go:189-214)
The builder methods ( [...]
```go
func (e *Extractor) WithExtensions(extensions ...string) *Extractor {
	e.config.extensions = extensions // Mutates in-place
	return e
}
```
Problem: If an [...]
Example issue:
```go
extractor := promptext.NewExtractor()
// These share the same config!
e1 := extractor.WithExtensions(".go")
e2 := extractor.WithExtensions(".js") // Overwrites e1's config
```
Recommendation: Either: [...]
I recommend option 3 for simplicity, since users can already use:
```go
extractor := promptext.NewExtractor(
	promptext.WithExtensions(".go"),
	promptext.WithExcludes("vendor/"),
)
```
**2. Global State in Custom Formatters** (pkg/promptext/format.go:49)
```go
var customFormatters = make(map[string]Formatter)
```
Problem: This global map is not protected by a mutex, making [...]
Recommendation: Add a [...]
```go
var (
	customFormatters = make(map[string]Formatter)
	formattersMu     sync.RWMutex
)

func RegisterFormatter(name string, formatter Formatter) {
	formattersMu.Lock()
	defer formattersMu.Unlock()
	customFormatters[name] = formatter
}

func GetFormatter(formatStr string) (Formatter, error) {
	formattersMu.RLock()
	customFormatter, ok := customFormatters[formatStr]
	formattersMu.RUnlock()
	// ... rest of code
}
```
#### 🟡 Medium Priority Issues
**3. Missing Input Validation** (pkg/promptext/options.go:40-56)
The [...]
```go
func WithExtensions(extensions ...string) Option {
	return func(c *config) {
		c.extensions = extensions // No validation
	}
}
```
Potential issues: [...]
Recommendation: Add validation:
```go
func WithExtensions(extensions ...string) Option {
	return func(c *config) {
		validated := make([]string, 0, len(extensions))
		for _, ext := range extensions {
			if ext == "" {
				continue // Skip empty
			}
			if !strings.HasPrefix(ext, ".") {
				ext = "." + ext // Auto-fix
			}
			validated = append(validated, ext)
		}
		c.extensions = validated
	}
}
```
**4. Token Budget Validation** (pkg/promptext/options.go:120-124)
```go
func WithTokenBudget(maxTokens int) Option {
	return func(c *config) {
		c.tokenBudget = maxTokens // Accepts negative values!
	}
}
```
Recommendation: Validate the budget is non-negative or document that 0 means unlimited and negative values are invalid.
**5. Error Context Could Be Improved** (pkg/promptext/promptext.go:152-155)
```go
procResult, err := processor.ProcessDirectory(procConfig, e.config.verbose)
if err != nil {
	return nil, fmt.Errorf("error processing directory: %w", err)
}
```
Issue: The error message doesn't include the directory path, making debugging harder.
Recommendation:
```go
return nil, fmt.Errorf("error processing directory %s: %w", absPath, err)
```
**6. Potential Directory Traversal Risk** (pkg/promptext/promptext.go:216-235)
The [...]
```go
absPath, err := filepath.Abs(dir)
if err != nil {
	return "", fmt.Errorf("failed to resolve absolute path: %w", err)
}
return absPath, nil
```
Consideration: While [...]
Recommendation: Document the security expectations or add validation if needed:
```go
// Optionally: Ensure the path doesn't escape a specific root
// Or document that callers must validate paths in server contexts
```
#### 🔵 Low Priority / Polish
**7. Unused Helper Functions** (pkg/promptext/promptext.go:254-262)
```go
func joinExtensions(extensions []string) string {
	return strings.Join(extensions, ",")
}

func joinExcludes(excludes []string) string {
	return strings.Join(excludes, ",")
}
```
These functions appear unused in the current code. Consider removing or using them.
**8. Test Coverage Gaps**
While test coverage is good, consider adding tests for: [...]
**9. Documentation Consistency**
In [...]
```go
// This is version 0.7.0 (Phase 1) of the library API.
// The API may evolve during the 0.x releases. Version 1.0.0 will provide
// API stability guarantees and backward compatibility.
```
This is great! Consider adding a [...]
### Performance Considerations
#### ✅ Good Practices Observed
#### 💡 Potential Optimizations
The [...]
```go
keywordStr := ""
for i, kw := range keywords {
	if i > 0 {
		keywordStr += " "
	}
	keywordStr += kw
}
```
Recommendation: Use [...]
```go
c.relevanceKeywords = strings.Join(keywords, " ")
```
### Security Assessment
#### ✅ Security Positives