Merged
40 changes: 20 additions & 20 deletions agent-os/product/mission.md
@@ -1,7 +1,7 @@
# Product Mission

## Pitch
pybpmn_parser is a Python library that helps developers, workflow engineers, and business analysts programmatically parse and manipulate BPMN 2.0 workflows by providing type-safe, extensible Python dataclasses that accurately represent all BPMN elements and their relationships.
`pybpmn_parser` is a Python library that helps developers, workflow engineers, and business analysts programmatically parse and manipulate BPMN 2.0 workflows by providing type-safe, extensible Python dataclasses that accurately represent all BPMN elements and their relationships.

## Users

@@ -13,19 +13,19 @@ pybpmn_parser is a Python library that helps developers, workflow engineers, and

### User Personas

**Workflow Developer** (25-45)
**Workflow Developer**
- **Role:** Software Engineer / Technical Lead
- **Context:** Building workflow automation systems for enterprise clients
- **Pain Points:** Manual XML parsing is error-prone, lack of type safety, no standard BPMN representation in Python
- **Pain Points:** Manual XML parsing is error-prone, lacks type safety, and has no standard BPMN representation in Python
- **Goals:** Reliable BPMN parsing, type-safe workflow manipulation, easy integration with existing systems

**Process Analyst** (30-50)
**Process Analyst**
- **Role:** Business Process Analyst / Consultant
- **Context:** Analyzing hundreds of BPMN diagrams across different departments
- **Pain Points:** No programmatic way to analyze BPMN files, manual inspection is time-consuming
- **Pain Points:** No programmatic way to analyze BPMN files; manual inspection is time-consuming
- **Goals:** Automated workflow analysis, bulk processing of BPMN files, extracting metrics and patterns

**Integration Engineer** (28-40)
**Integration Engineer**
- **Role:** Systems Integration Specialist
- **Context:** Connecting BPMN-based workflow engines with Python applications
- **Pain Points:** Incompatible formats, no standard Python representation, complex XML manipulation
@@ -34,27 +34,27 @@ pybpmn_parser is a Python library that helps developers, workflow engineers, and
## The Problem

### Lack of Type-Safe BPMN Parsing in Python
Python developers working with BPMN workflows struggle with raw XML manipulation and lack a robust, type-safe way to parse and work with BPMN 2.0 files. This leads to error-prone code and increases development time by 3-5x.
Python developers working with BPMN workflows struggle with raw XML manipulation and lack a robust, type-safe way to parse and work with BPMN 2.0 files.

**Our Solution:** Provide comprehensive Pydantic-based dataclasses that represent all BPMN elements with full type safety and validation.
**Our Solution:** Provide comprehensive data classes that represent all BPMN elements with full type safety and validation.

### Incompatible Vendor Extensions
Different BPMN tools (Camunda, Activiti, etc.) add proprietary extensions that break standard parsers. Teams spend weeks writing custom parsers for each vendor's format.
### Vendor Extensions
Different BPMN tools (Camunda, Activiti, etc.) add proprietary extensions that standard parsers don't recognize.

**Our Solution:** Extensible plugin system that handles vendor-specific extensions while maintaining core BPMN compatibility.
**Our Solution:** Use the de facto-standard Moddle extension definitions.

### Complex BPMN Validation
Validating BPMN files requires deep understanding of the specification and complex XML schema validation. Most teams skip validation, leading to runtime errors.
Validating BPMN files requires a deep understanding of the specification and complex XML schema validation. Most teams skip validation, leading to runtime errors.

**Our Solution:** Built-in XML schema validation and semantic validation that ensures BPMN correctness before processing.

## Differentiators

### Complete BPMN 2.0 Coverage
Unlike partial parsers that handle only basic elements, we provide comprehensive support for all BPMN 2.0 elements including choreographies, conversations, and complex event definitions. This results in 100% specification compliance.
### Nearly Complete BPMN 2.0 Coverage
Unlike partial parsers that handle only basic elements, we provide comprehensive support for all BPMN 2.0 elements except choreographies.

### Type-Safe by Design
Unlike dictionary-based parsers, we use Pydantic models with full type hints and validation. This catches errors at development time and provides excellent IDE support with auto-completion.
Unlike dictionary-based parsers, we use Python dataclasses with full type hints and validation. This catches errors at development time and provides excellent IDE support with auto-completion.

### Production-Ready Performance
Unlike academic parsers, we're optimized for real-world use with efficient XML parsing using lxml and lazy loading of large diagrams. This enables processing of enterprise-scale workflows with thousands of elements.
@@ -66,15 +66,15 @@ Unlike rigid parsers, our plugin system allows easy addition of vendor extension

### Core Features
- **Complete BPMN Parsing:** Parse any valid BPMN 2.0 XML file into fully-typed Python dataclasses
- **Type Safety:** Pydantic models with comprehensive validation ensure data integrity
- **Type Safety:** Comprehensive validation ensures data integrity
- **Schema Validation:** Validate BPMN files against official XML schemas before parsing
- **Error Reporting:** Clear, actionable error messages with line numbers and context

### Extension Features
- **Plugin System:** Register custom parsers for vendor-specific BPMN extensions
- **Camunda Support:** Built-in support for Camunda-specific extensions
- **Custom Elements:** Define and parse your own BPMN extensions
- **Namespace Handling:** Proper XML namespace management for mixed-vendor workflows
- **Plugin System:** Uses the Moddle extension definition for vendor-specific BPMN extensions.
- **Camunda Support:** Built-in support for Camunda-specific extensions.
- **Custom Elements:** Define and parse your own BPMN extensions.
- **Namespace Handling:** Proper XML namespace management for mixed-vendor workflows.

### Advanced Features
- **Lazy Loading:** Efficiently handle large BPMN files with thousands of elements
30 changes: 11 additions & 19 deletions agent-os/product/roadmap.md
@@ -1,31 +1,23 @@
# Product Roadmap

1. [ ] Documentation Site — Create comprehensive documentation website with API reference, tutorials, and examples using MkDocs Material at https://callowayproject.github.io/pybpmn_parser `M`
2. [ ] Parser Plugin — Develop a plugin architecture to allow for easy extension and customization of PyBPMN Parser functionality
1. [x] Documentation Site — Create a comprehensive documentation website with API reference, tutorials, and examples using MkDocs Material at https://callowayproject.github.io/pybpmn_parser `M`

3. [ ] Element Querying API — Implement fluent API for finding and filtering BPMN elements by type, ID, name, or custom predicates with chainable methods `S`

4. [ ] BPMN Serialization — Add capability to serialize modified Pydantic models back to valid BPMN 2.0 XML format with proper namespace handling `M`
2. [x] Moddle extension implementation — Implement Moddle as a plugin architecture to allow for easy extension and customization of PyBPMN Parser functionality

5. [ ] Performance Optimization — Implement lazy loading for large BPMN files, optimize parsing with caching, and add benchmarking suite for performance regression testing `M`

6. [ ] Semantic Validation — Enhance validator to check BPMN semantic rules beyond XML schema, including sequence flow connectivity, gateway rules, and event definitions `L`
3. [ ] Element Querying API — Implement fluent API for finding and filtering BPMN elements by type, ID, name, or custom predicates with chainable methods `S`

7. [ ] Camunda Extension Complete — Expand Camunda plugin to support all Camunda-specific elements including forms, external tasks, and execution listeners `M`
4. [ ] Performance Optimization — Implement lazy loading for large BPMN files, optimize parsing with caching, and add a benchmarking suite for performance regression testing `M`

8. [ ] Activiti ExtensionCreate plugin for Activiti BPMN extensions supporting custom service tasks, form properties, and execution listeners `M`
5. [ ] Semantic ValidationEnhance validator to check BPMN semantic rules beyond XML schema, including sequence flow connectivity, gateway rules, and event definitions `L`

9. [ ] BPMN Diagram MetricsAdd analysis module to calculate workflow complexity metrics, identify patterns, and generate statistical reports on BPMN diagrams `S`
6. [x] Camunda Extension CompleteExpand Camunda plugin to support all Camunda-specific elements, including forms, external tasks, and execution listeners `M`

10. [ ] Element Modification API — Implement safe modification methods for BPMN elements with automatic relationship updates and validation `M`
7. [x] Activiti Extension — Create plugin for Activiti BPMN extensions supporting custom service tasks, form properties, and execution listeners `M`

11. [ ] CLI Tool — Create command-line interface for parsing, validating, and analyzing BPMN files with output formats including JSON, YAML, and summary reports `S`
8. [ ] BPMN Diagram Metrics — Add analysis module to calculate workflow complexity metrics, identify patterns, and generate statistical reports on BPMN diagrams `S`

12. [ ] Async Processing — Add async/await support for parsing large BPMN collections and implement parallel processing for bulk operations `S`
9. [ ] Element Modification API — Implement safe modification methods for BPMN elements with automatic relationship updates and validation `M`

13. [ ] Visual ExportGenerate visual representations of parsed BPMN as SVG or PNG diagrams using graphviz or similar libraries for documentation purposes `L`
10. [ ] CLI ToolCreate a command-line interface for parsing, validating, and analyzing BPMN files with output formats including JSON, YAML, and summary reports `S`

> Notes
> - Include 4–12 items total
> - Order items by technical dependencies and product architecture
> - Each item should represent an end-to-end (frontend + backend) functional and testable feature
11. [ ] Async Processing — Add async/await support for parsing large BPMN collections and implement parallel processing for bulk operations `S`
9 changes: 0 additions & 9 deletions agent-os/product/tech-stack.md
@@ -9,7 +9,6 @@
## Core Libraries
- **XML Parsing:** lxml 6.0+ (high-performance XML processing)
- **Data Modeling:** Pydantic 2.11+ (data validation using Python type annotations)
- **XML Binding:** pydantic-xml 2.17+ (XML serialization/deserialization for Pydantic)
- **Functional Programming:** returns 0.23+ (type-safe error handling)
- **Schema Validation:** xmlschema 4.1+ (XML Schema validator)
- **XML to Dict:** xmltodict 0.14+ (XML to Python dict conversion)
@@ -64,14 +63,6 @@
- **Coverage Configuration:** Branch coverage with 90% minimum
- **Pre-commit Configuration:** .pre-commit-config.yaml (automated code quality checks)

## Supported BPMN Standards
- **BPMN 2.0:** Full specification compliance
- **XML Schema:** XSD validation support
- **Vendor Extensions:**
  - Camunda (built-in support)
  - Custom extensions via plugin system
  - Moodle types (in development)

## Performance Considerations
- **XML Parser:** lxml with C extensions for performance
- **Validation:** Lazy validation with streaming where possible
12 changes: 7 additions & 5 deletions pybpmn_parser/parse.py
@@ -2,17 +2,19 @@

from dataclasses import dataclass
from pathlib import Path
from typing import Optional
from typing import TYPE_CHECKING, Any, Optional

import xmltodict

from pybpmn_parser.bpmn.infrastructure.definitions import Definitions
from pybpmn_parser.bpmn.types import NAMESPACES
from pybpmn_parser.element_registry import ElementDescriptor
from pybpmn_parser.plugins import load_default_plugins
from pybpmn_parser.plugins.moddle import convert_moddle_registry, load_moddle_file
from pybpmn_parser.validator import validate

if TYPE_CHECKING:
    from pybpmn_parser.element_registry import ElementDescriptor


class ParseResult:
    """Result for parsing a BPMN file."""
@@ -41,8 +43,8 @@ class ParseContext:
    """Context for parsing BPMN elements from XML dictionaries."""

    def __init__(self):
        self.elements_by_id: dict[str, ElementDescriptor] = {}
        """A mapping from element ID to element descriptor."""
        self.elements_by_id: dict[str, Any] = {}
        """A mapping from element ID to element instance."""

        self.references: list[Reference] = []
        """A list of unresolved references."""
@@ -51,7 +53,7 @@ def add_reference(self, reference: Reference) -> None:
        """Add an unresolved reference."""
        self.references.append(reference)

    def add_element(self, element: Any) -> None:
    def add_element(self, element: ElementDescriptor) -> None:
        """Add a processed element."""
        if (id_value := getattr(element, "id", None)) or (id_value := getattr(element, "@id", None)):
            self.elements_by_id[id_value] = element
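
The effect of loosening `elements_by_id` to `dict[str, Any]` can be illustrated with a minimal, self-contained sketch. This is not the library's code: `ParseContext` below is a simplified stand-in that copies only the `add_element` check from the diff, and `Task` is a hypothetical element type invented for the example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Task:
    """Hypothetical element type, used only for this illustration."""

    id: str
    name: str


class ParseContext:
    """Simplified stand-in for the ParseContext shown in the diff."""

    def __init__(self) -> None:
        # Maps element ID -> parsed element instance (any type).
        self.elements_by_id: dict[str, Any] = {}

    def add_element(self, element: Any) -> None:
        """Register the element if it exposes a truthy `id` or `@id` attribute."""
        if (id_value := getattr(element, "id", None)) or (id_value := getattr(element, "@id", None)):
            self.elements_by_id[id_value] = element


ctx = ParseContext()
ctx.add_element(Task(id="task_1", name="Review order"))  # has an `id` attribute
ctx.add_element(object())  # no `id` or `@id` attribute, so it is skipped
ctx.add_element({"@id": "task_2"})  # getattr does not read dict keys, so it is skipped
```

In this sketch only the dataclass instance is registered. Because `getattr` looks up attributes rather than dictionary keys, a plain `dict` carrying an `"@id"` key would not be picked up by this check; whether that matters depends on what the real parser passes to `add_element`, which the diff does not show.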