refactor: redefined lean domain + clean architecture + issues: #13, #14 #16

Merged
cango91 merged 5 commits into main from refactor
Sep 3, 2025

@cango91 cango91 commented Sep 3, 2025

This is a fundamental architectural refactor that replaces the unnecessarily complex hierarchical rule engine with a lean domain model focused on the app’s true core: precise URL manipulation and tracking parameter removal.

The original main branch, while functional, suffered from over-engineering:

  • A hierarchical rule system with priorities (EXACT_HOST, SUBDOMAIN_WILDCARD) when the actual use case only required simple monotone union operations
  • Complex rule compilation, specificity calculations, and matching logic
  • Rule-based vocabulary that obscured the real domain: URL cleaning
  • Brittle testing around rule interactions rather than URL behavior

The “rule domain” was a false abstraction — Detracktor’s real domain is URL manipulation, not rule management.

  • Core Domain (domain/model/):
    • Url & UrlParts: Type-safe URL representation with validation
    • QueryPairs & QueryMap: Lossless query parameter handling with exact wire-format preservation and round-trip guarantees
    • QueryToken: Granular query component with proper encoding/decoding
    • UrlCodec: RFC-compliant percent encoding/decoding
  • Error Handling (domain/error/):
    • DomainResult: Functional error handling without exceptions
    • DomainError & ValidationError: Structured error representation
  • Application Layer (application/service/):
    • MatchService + Globby: Simple, safe pattern matching for decoded param names (glob-style only)
    • Replaces the complex hierarchical rule system with straightforward, composable predicates
    • Configuration Schema: Standardized JSON schema for rule definitions (typed in application/types/), designed for forward compatibility. More complex rules can be added by evolving the application layer and the layers above it, without touching the domain.

  1. Lossless URL Handling: Perfect round-trip conversion preserving exact wire format, order, duplicates, and edge cases (empty keys, valueless params)
  2. Type Safety: Value classes and strong typing prevent URL manipulation bugs
  3. Functional Error Handling: DomainResult eliminates exception-based control flow for better composability
  4. Testability: Pure functions with clear contracts, comprehensive test suite covering edge cases and RFC compliance
  5. Performance: Immutable structures, value classes, and optimized parsing
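
To make the "lossless round-trip" property concrete, here is a minimal sketch of how ordered query pairs could be kept in their exact wire form. The names `QueryPair`, `parseQuery`, `renderQuery`, and `removeWhere` are hypothetical and chosen for illustration; they are not the PR's actual `QueryPairs`/`QueryToken` API. For brevity the predicate receives the raw key, whereas the PR matches on decoded names.

```kotlin
// Hypothetical sketch, not Detracktor's real API.
// rawValue == null means "no '=' was present" (valueless param),
// while rawValue == "" means "key=" with an empty value.
data class QueryPair(val rawKey: String, val rawValue: String?)

// Split on '&' and on the FIRST '=' only; nothing is decoded or normalized,
// so order, duplicates, empty keys, and empty tokens all survive.
fun parseQuery(raw: String): List<QueryPair> =
    raw.split('&').map { token ->
        val eq = token.indexOf('=')
        if (eq < 0) QueryPair(token, null)
        else QueryPair(token.substring(0, eq), token.substring(eq + 1))
    }

// Exact inverse of parseQuery: re-join the untouched raw pieces.
fun renderQuery(pairs: List<QueryPair>): String =
    pairs.joinToString("&") { p ->
        if (p.rawValue == null) p.rawKey else "${p.rawKey}=${p.rawValue}"
    }

// Removal is a plain filter, so every surviving pair keeps its original bytes.
fun removeWhere(pairs: List<QueryPair>, shouldRemove: (String) -> Boolean): List<QueryPair> =
    pairs.filterNot { shouldRemove(it.rawKey) }
```

With this shape, `renderQuery(parseQuery(q))` returns `q` unchanged for inputs such as `a=1&&b&=x&c=`, and removing `utm_` parameters from `q=kotlin&utm_source=x&page=2` leaves `q=kotlin&page=2`.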

Refactor impact on open issues (as of v1.0.1-rc):

  • Privacy: Clipboard content exposure for non-URI data #13:
    • Now handled in the runtime UI: clipboard text is parsed via UrlParser; when it is invalid, the app shows a generic “Invalid URL” state and does not render the clipboard contents. No non-URI data is displayed in MainActivity’s preview.
    • The URL preview also respects privacy by hiding, by default, the values of parameters that do not match any rule.
  • Security: Regex Denial of Service in user-configurable patterns #14:
    • Domain layer is predicate-driven and free of regex-based operations.
    • Application layer uses Globby (custom glob matcher) with strict validation (Pattern wrapper), avoiding catastrophic backtracking. Future host/path matching remains non-regex; if regex is reintroduced, it will be guarded behind safe matchers (e.g., Android PatternMatcher with PATTERN_ADVANCED_GLOB).
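
As an illustration of why glob-only matching sidesteps ReDoS, the sketch below matches `*` and `?` with an iterative two-pointer scan that only ever backtracks to the most recent `*`, which bounds the work at roughly pattern length times text length. This is not the PR's Globby implementation; `globMatches`, `isValidGlob`, and the 128-character limit are assumptions made for the example.

```kotlin
// Hypothetical sketch of a non-regex glob matcher; not the actual Globby code.
// Supports '*' (any run of characters, possibly empty) and '?' (exactly one character).
fun globMatches(pattern: String, text: String): Boolean {
    var p = 0          // index into pattern
    var t = 0          // index into text
    var starP = -1     // position of the most recent '*' in pattern, or -1
    var starT = 0      // text position where that '*' started matching
    while (t < text.length) {
        when {
            // Remember the star and first try matching it against nothing.
            p < pattern.length && pattern[p] == '*' -> { starP = p; starT = t; p++ }
            // Literal or single-character wildcard match.
            p < pattern.length && (pattern[p] == '?' || pattern[p] == text[t]) -> { p++; t++ }
            // Mismatch: let the last star absorb one more character and retry.
            starP >= 0 -> { p = starP + 1; starT++; t = starT }
            else -> return false
        }
    }
    // Only trailing stars may remain once the text is consumed.
    while (p < pattern.length && pattern[p] == '*') p++
    return p == pattern.length
}

// Assumed validation rules (illustrative only): non-empty, bounded length, no control characters.
fun isValidGlob(pattern: String, maxLength: Int = 128): Boolean =
    pattern.isNotEmpty() && pattern.length <= maxLength && pattern.none { it.isISOControl() }
```

Under these assumptions, `globMatches("utm_*", "utm_source")` holds, and an adversarial input such as `a*a*a*a*b` against a long run of `a` characters is rejected after a bounded number of steps instead of exploding combinatorially.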

This is a fundamental architectural refactor that replaces the unnecessarily
complex hierarchical rule engine with a lean domain model focused on the app's
true core: precise URL manipulation and tracking parameter removal.

The original `main` branch, while functional, suffered from over-engineering:
- A hierarchical rule system with priorities (EXACT_HOST, SUBDOMAIN_WILDCARD)
  when the actual use case only required simple monotone union operations
- Complex rule compilation, specificity calculations, and matching logic
- Rule-based vocabulary that obscured the real domain: URL cleaning
- Brittle testing around rule interactions rather than URL behavior

The "rule domain" was a false abstraction - Detracktor's real domain is URL
manipulation, not rule management.
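
A rough sketch of what "simple monotone union" can mean in practice: every rule that applies to a host contributes its parameter names, and the result is the union, with no priorities or specificity ordering. The `Rule` type, the suffix-based host test, and `paramsToRemove` are hypothetical simplifications, not the project's actual rule model.

```kotlin
// Hypothetical sketch; the real rule schema lives in the application layer.
// An empty hostSuffix is treated here as a global rule that applies to every host.
data class Rule(val hostSuffix: String, val removeParams: Set<String>)

// A rule applies if it is global, matches the host exactly, or matches a parent domain.
fun ruleApplies(rule: Rule, host: String): Boolean =
    rule.hostSuffix.isEmpty() ||
        host == rule.hostSuffix ||
        host.endsWith("." + rule.hostSuffix)

// Union of all applicable rules. Adding a rule can only grow the result
// (monotone), so there is nothing to prioritize or override.
fun paramsToRemove(host: String, rules: List<Rule>): Set<String> =
    rules.filter { ruleApplies(it, host) }
        .flatMap { it.removeParams }
        .toSet()
```

Because the result is a plain set union, rule order is irrelevant and the outcome for a host can be tested by looking at URL behavior alone, rather than at how rules interact.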

**Core Domain (`domain/model/`):**
- `Url` & `UrlParts`: Type-safe URL representation with validation
- `QueryPairs` & `QueryMap`: Lossless query parameter handling with exact
  wire-format preservation and round-trip guarantees
- `QueryToken`: Granular query component with proper encoding/decoding
- `UrlCodec`: RFC-compliant percent encoding/decoding

**Error Handling (`domain/error/`):**
- `DomainResult<T>`: Functional error handling without exceptions
- `DomainError` & `ValidationError`: Structured error representation
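
For readers unfamiliar with the pattern, a result type of this kind is commonly modeled as a sealed hierarchy. The sketch below is an assumed shape, not the PR's actual definitions: the `Ok`/`Err` variants, the `map`/`flatMap` helpers, and the `parsePort` example are all hypothetical.

```kotlin
// Hypothetical shape of a functional result type; the real DomainResult may differ.
sealed interface DomainError { val message: String }
data class ValidationError(override val message: String) : DomainError

sealed interface DomainResult<out T> {
    data class Ok<T>(val value: T) : DomainResult<T>
    data class Err(val error: DomainError) : DomainResult<Nothing>
}

// Transform the success value; errors pass through untouched.
inline fun <T, R> DomainResult<T>.map(f: (T) -> R): DomainResult<R> = when (this) {
    is DomainResult.Ok -> DomainResult.Ok(f(this.value))
    is DomainResult.Err -> this
}

// Chain a step that can itself fail, without throwing.
inline fun <T, R> DomainResult<T>.flatMap(f: (T) -> DomainResult<R>): DomainResult<R> = when (this) {
    is DomainResult.Ok -> f(this.value)
    is DomainResult.Err -> this
}

// Illustrative validation step: failures are returned as values, never thrown.
fun parsePort(raw: String): DomainResult<Int> {
    val n = raw.toIntOrNull()
        ?: return DomainResult.Err(ValidationError("not a number: $raw"))
    return if (n in 1..65535) DomainResult.Ok(n)
    else DomainResult.Err(ValidationError("port out of range: $n"))
}
```

Callers compose steps with `map` and `flatMap` and handle both outcomes in a single exhaustive `when`, which is what makes the error path explicit in signatures instead of hidden in exceptions.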

**Application Layer (`application/pattern/`):**
- `PatternEngine`: Simple, testable pattern matching (EXACT/GLOB/REGEX)
- Replaces the complex hierarchical rule system with straightforward predicates

1. **Lossless URL Handling**: Perfect round-trip conversion preserving exact
   wire format, order, duplicates, and edge cases (empty keys, valueless params)

2. **Type Safety**: Value classes and strong typing prevent URL manipulation bugs

3. **Functional Error Handling**: `DomainResult<T>` eliminates exception-based
   control flow for better composability

4. **Testability**: Pure functions with clear contracts, comprehensive test suite
   covering edge cases and RFC compliance

5. **Performance**: Immutable structures, value classes, and optimized parsing

This commit establishes the domain foundation. Upcoming layers:

- **Infrastructure**: URL parsing adapters, configuration persistence
- **Application Services**: URL cleaning workflows, clipboard integration
- **Presentation**: Clean Architecture compliance with dependency inversion

The refactor also addresses the currently open issues (applicable to the latest version, `v1.0.1-rc`, at the time of writing):
- [#13: Privacy: Clipboard content exposure for non-URI data](#13):
    - #FIXME: Will be addressed in the application and presentation layers.
- [#14: Security: Regex Denial of Service (ReDoS) vulnerability in user-configurable patterns](#14):
    - The domain API is now modeled as a predicate-based URL manipulator, which removes all regex-based operations from the domain layer.
    - #FIXME: Planned to be addressed in the application layer by using Android's `PatternMatcher` with the `PATTERN_ADVANCED_GLOB` flag, which has safety guarantees, i.e. it is not vulnerable to ReDoS.

Comprehensive test suite with 822 test cases covering:
- URL parsing edge cases and RFC compliance
- Query parameter manipulation with lossless round-trips
- Pattern engine validation and compilation
- Error handling scenarios

The domain is complete, tested, and ready for the application layer.
…+ net services

- Remove legacy pattern engine and tests:
  - delete application/pattern/* and DefaultPatternEngineTest
- Introduce matching and rule services:
  - add application/service/match/{RuleEngine,Match}
  - add application/types/{Pattern,Settings}
- Settings infrastructure:
  - add application/repo/SettingsRepository and application/service/Settings
  - add runtime/android/repo/Settings
- Networking/integration services:
  - add application/service/net/HostCanonicalizer + runtime impl
  - add runtime/android/service/net/UrlParserImpl
- Runtime wiring + UI package move:
  - add runtime/android/Composition and update MainActivity
  - move presentation/ui/theme/* -> runtime/android/presentation/ui/theme/*
- Assets and schemas:
  - add assets {rules.schema.json, settings.schema.json, warnings.schema.json}
  - update assets/default_rules.json
- Errors:
  - add application/error/AppError
- Android tests and benchmarks:
  - add androidTest {GlobbyTest, ServiceIntegrationTest, HostCanonicalizerTest,
    UrlParserImplTest, ServicePerformanceBenchmark}
- Domain tweaks and version:
  - modify domain/error/DomainError and domain/model/Url
  - update version.properties

Rationale: continue Clean Architecture layering—wire application and runtime services,
formalize settings and net adapters, add schemas for validation, and replace the
over-abstracted pattern engine with a focused rule engine.
@cango91 cango91 merged commit e4c38a1 into main Sep 3, 2025
4 checks passed
@cango91 cango91 deleted the refactor branch September 3, 2025 02:14
