Commit a0268da

jwilger and claude authored
feat: implement AWS Bedrock provider handler (issue #28) (#136)
## Summary

This PR implements issue #28 - AWS Bedrock provider handler as the MVP provider for the proxy pattern. It establishes the provider abstraction pattern that will be used for all future LLM provider integrations.

## Implementation Details

### Provider Infrastructure

- ✅ Core provider trait abstraction with async methods
- ✅ Provider registry for URL-based routing
- ✅ Zero-copy streaming support for minimal latency
- ✅ Comprehensive error handling and type safety

### AWS Bedrock Provider

- ✅ SigV4 authentication pass-through (no credential storage)
- ✅ Support for InvokeModel and InvokeModelWithResponseStream endpoints
- ✅ Model-specific handling for Claude, Titan, and Llama models
- ✅ Request/response metadata extraction
- ✅ Token usage tracking and cost calculation
- ✅ Comprehensive test coverage

### Response Processing

- ✅ Model-specific token extraction from response bodies
- ✅ Automatic cost calculation based on current pricing
- ✅ Provider metadata aggregation for audit logging

### Documentation

- ✅ Comprehensive provider integration guide
- ✅ Step-by-step instructions for adding new providers
- ✅ Testing strategies and performance considerations

## Testing

- ✅ Unit tests for all provider components
- ✅ Integration tests with mocked AWS responses
- ✅ Test coverage for authentication, routing, streaming, and error handling
- ✅ All tests passing

## Key Design Decisions

1. **URL-based routing** - Following ADR-0011, providers are selected based on URL path prefix
2. **Zero-copy streaming** - Responses stream directly through without buffering
3. **Authentication pass-through** - No credential storage, headers forwarded as-is
4. **Type-driven development** - Extensive use of newtypes and validation at boundaries

## Files Added/Modified

### Core Provider Framework

- `src/providers/mod.rs` - Provider trait and registry
- `src/providers/response_processor.rs` - Response metadata processing

### Bedrock Provider

- `src/providers/bedrock/mod.rs` - Module organization
- `src/providers/bedrock/provider.rs` - Core provider implementation
- `src/providers/bedrock/auth.rs` - SigV4 authentication handling
- `src/providers/bedrock/models.rs` - Model-specific logic
- `src/providers/bedrock/types.rs` - Type definitions and pricing
- `src/providers/bedrock/tests.rs` - Comprehensive unit tests

### Integration Tests

- `tests/bedrock_integration.rs` - Integration tests with mocked responses

### Documentation

- `docs/provider-integration.md` - Comprehensive integration guide

## Next Steps

This PR establishes the foundation for provider integration. Future providers (OpenAI, Anthropic, etc.) can follow the same pattern established here.

Closes #28

---------

Co-authored-by: Claude <noreply@anthropic.com>
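
The URL-based routing listed under Key Design Decisions can be illustrated with a minimal, synchronous sketch. It is not the code from `src/providers/mod.rs` (which this page does not show): the `Provider` trait, the `Registry` type, and the `/bedrock/` prefix are all hypothetical names chosen for illustration, and the real trait is described as async.

```rust
/// Hypothetical stand-in for the provider trait (the real one is async).
trait Provider {
    fn name(&self) -> &'static str;
    /// Path prefix this provider claims. The value is an assumption.
    fn path_prefix(&self) -> &'static str;
}

/// Illustrative Bedrock provider with an assumed `/bedrock/` prefix.
struct Bedrock;

impl Provider for Bedrock {
    fn name(&self) -> &'static str {
        "bedrock"
    }
    fn path_prefix(&self) -> &'static str {
        "/bedrock/"
    }
}

/// Hypothetical registry that picks a provider by URL path prefix.
struct Registry {
    providers: Vec<Box<dyn Provider>>,
}

impl Registry {
    fn new() -> Self {
        Registry { providers: Vec::new() }
    }

    fn register(&mut self, provider: Box<dyn Provider>) {
        self.providers.push(provider);
    }

    /// Return the first registered provider whose prefix matches `path`.
    fn route(&self, path: &str) -> Option<&dyn Provider> {
        self.providers
            .iter()
            .map(|p| &**p)
            .find(|p| path.starts_with(p.path_prefix()))
    }
}

fn main() {
    let mut registry = Registry::new();
    registry.register(Box::new(Bedrock));

    // A path under the assumed prefix resolves to the Bedrock provider.
    let hit = registry.route("/bedrock/model/example-model/invoke");
    assert_eq!(hit.map(|p| p.name()), Some("bedrock"));

    // Anything else has no provider.
    assert!(registry.route("/unknown/path").is_none());
}
```

A linear scan is enough for a handful of providers; a larger registry might prefer a map keyed by the first path segment.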
1 parent 1fa9fc5 commit a0268da

25 files changed: +4827 -80 lines changed

Cargo.lock

Lines changed: 1385 additions & 58 deletions

Cargo.toml

Lines changed: 7 additions & 1 deletion
@@ -22,7 +22,7 @@ tracing-subscriber = { version = "0.3", features = ["env-filter"] }
 eventcore = "0.1.8"
 eventcore-postgres = "0.1.8"
 eventcore-macros = "0.1.8"
-nutype = { version = "0.6", features = ["serde", "new_unchecked"] }
+nutype = { version = "0.6", features = ["serde", "new_unchecked", "regex"] }
 derive_more = { version = "2.0", features = ["debug", "display", "from", "into"] }
 chrono = { version = "0.4", features = ["serde"] }
 anyhow = "1.0"
@@ -42,6 +42,12 @@ pin-project-lite = "0.2.16"
 crossbeam = "0.8.4"
 parking_lot = "0.12.4"
 urlencoding = "2.1.3"
+aws-config = "1.8.3"
+aws-sdk-bedrockruntime = "1.99.0"
+regex = "1.11.1"
+base64 = "0.22.1"
+rust_decimal = "1.37.2"
+currencies = { version = "0.4.1", features = ["serde"] }
 
 [dev-dependencies]
 tokio-test = "0.4"
Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
# 0019. Money Representation Strategy

Date: 2024-01-26
Status: Accepted

## Context

The Union Square proxy needs to track costs associated with LLM API calls for audit and analysis purposes. This requires:
1. Storing prices per thousand tokens (often fractional cents, e.g., $0.003)
2. Calculating actual costs based on token usage
3. Serializing/deserializing monetary values for storage and API responses
4. Ensuring proper rounding for billing purposes

We evaluated several approaches:
- Using floating-point numbers (rejected due to precision issues with money)
- Using `rust_decimal::Decimal` throughout
- Using dedicated money crates like `rusty-money`, `steel-cent`, or `currencies`
- Building our own money type
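
The precision issue behind the rejection of floating-point numbers can be seen in a few lines. This is a minimal sketch, not code from the PR: the function names are hypothetical, and integer micro-dollars stand in for an exact representation only so the example needs nothing beyond the standard library (the ADR itself selects `Decimal`).

```rust
/// Add two dollar amounts stored as binary floating point.
fn add_dollars_f64(a: f64, b: f64) -> f64 {
    a + b
}

/// Add two amounts stored as integer micro-dollars (1 dollar = 1_000_000).
fn add_micro_dollars(a: u64, b: u64) -> u64 {
    a + b
}

fn main() {
    // 0.1 and 0.2 have no exact binary representation, so the sum drifts.
    let float_sum = add_dollars_f64(0.1, 0.2);
    assert!(float_sum != 0.3);

    // The same amounts as integers add up exactly.
    let exact_sum = add_micro_dollars(100_000, 200_000);
    assert_eq!(exact_sum, 300_000);

    println!("float: {float_sum}, exact micro-dollars: {exact_sum}");
}
```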

## Decision Drivers

- **Precision**: Must accurately represent fractional cents for pricing
- **Type Safety**: Prevent mixing monetary values with regular numbers
- **Currency Support**: Should handle currency information (initially USD only)
- **Serialization**: Must support serde for JSON API responses
- **Rounding Rules**: Must support standard financial rounding (ceiling for costs)
- **Performance**: Should not significantly impact response times

## Considered Options

### Option 1: Decimal Everywhere
Use `rust_decimal::Decimal` for both prices and costs.

**Pros:**
- Simple, single type for all monetary values
- High fixed precision (96-bit mantissa, up to 28 decimal places)
- Good serde support

**Cons:**
- No currency information
- No type distinction between prices and money
- Easy to accidentally mix with non-monetary decimals

### Option 2: rusty-money Throughout
Use `rusty-money` crate for all monetary values.

**Pros:**
- Dedicated money type with currency support
- Type safety
- Rich API for money operations

**Cons:**
- No built-in serde support (deal breaker)
- Cannot represent fractional cents well

### Option 3: Hybrid Approach
Use `Decimal` for prices (per-thousand-tokens) and a money crate for final costs.

**Pros:**
- Appropriate types for each use case
- Type safety for actual money values
- Can represent fractional cent prices

**Cons:**
- Two different types to manage
- Potential confusion about when to use which

## Decision Outcome

We chose **Option 3: Hybrid Approach** using:
- `rust_decimal::Decimal` for price-per-thousand-tokens
- `currencies::Amount<USD>` for final cost calculations

### Implementation Details

```rust
/// Price per thousand tokens (can be fractional cents)
#[nutype(
    validate(predicate = |price| *price >= Decimal::ZERO),
    derive(Debug, Clone, Copy, PartialEq, Serialize, Deserialize, AsRef)
)]
pub struct PricePerThousandTokens(Decimal);

/// Calculate cost with ceiling rounding.
/// (Method excerpt: the enclosing `impl` block is not shown here.)
pub fn calculate_cost(
    &self,
    input_tokens: InputTokens,
    output_tokens: OutputTokens,
) -> Amount<USD> {
    // Calculate in Decimal for precision
    let total_cost_decimal = /* calculation */;

    // Convert to cents with ceiling rounding
    let total_cents = (total_cost_decimal * Decimal::from(100)).ceil();
    // Note: a failed conversion (negative or out-of-range value) falls back to 0 cents
    let cents_u64: u64 = total_cents.try_into().unwrap_or(0);

    Amount::<USD>::from_raw(cents_u64)
}
```

### Rounding Strategy

All costs are rounded UP to the next penny (ceiling rounding), which is standard practice for usage-based billing systems. This ensures:
- Providers are never under-compensated
- Consistent with industry practices
- Simple and predictable for users
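
The effect of ceiling rounding on a per-thousand-token price can be worked through with a small, self-contained sketch. It deliberately uses integer micro-dollars instead of the `Decimal` and `Amount<USD>` types chosen above, so it runs with the standard library alone. The function name, the constants, and the example prices are assumptions for illustration, not values from the PR or real provider pricing.

```rust
/// 1 cent = 10_000 micro-dollars.
const MICROS_PER_CENT: u128 = 10_000;
/// Prices are quoted per 1_000 tokens.
const TOKENS_PER_PRICE_UNIT: u128 = 1_000;

/// Cost in whole cents, rounded up. Prices are micro-dollars per thousand
/// tokens. Returns `None` if the intermediate sum or the result overflows.
fn cost_in_cents(
    input_tokens: u64,
    output_tokens: u64,
    input_price_micros: u64,
    output_price_micros: u64,
) -> Option<u64> {
    let input = u128::from(input_tokens) * u128::from(input_price_micros);
    let output = u128::from(output_tokens) * u128::from(output_price_micros);
    let numerator = input.checked_add(output)?;
    let denominator = TOKENS_PER_PRICE_UNIT * MICROS_PER_CENT;

    // Ceiling division: add one cent whenever there is a remainder.
    let cents = numerator / denominator + u128::from(numerator % denominator != 0);
    u64::try_from(cents).ok()
}

fn main() {
    // 1_000 input tokens at $0.003 per thousand plus 500 output tokens at
    // $0.015 per thousand is $0.0105, i.e. 1.05 cents, which rounds up to 2.
    assert_eq!(cost_in_cents(1_000, 500, 3_000, 15_000), Some(2));

    // An exact number of cents is not rounded further.
    assert_eq!(cost_in_cents(1_000, 0, 10_000, 0), Some(1));

    // No usage costs nothing.
    assert_eq!(cost_in_cents(0, 0, 3_000, 15_000), Some(0));
}
```

Returning `Option` keeps an overflow visible to the caller, in contrast to a silent fallback to zero.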

## Consequences

### Positive
- Type safety prevents mixing prices with costs
- Currency information is preserved in cost values
- Proper financial rounding is enforced
- Clear distinction between pricing models and actual charges
- Good serialization support for API responses

### Negative
- Breaking API change: `ProviderMetadata.cost_estimate` type changed
- Developers must understand when to use each type
- Additional dependency on `currencies` crate
- Conversion logic needed between Decimal and Amount

### Future Considerations
- Easy to extend to other currencies when needed
- Could add convenience methods for common conversions
- May want to create specialized types for different pricing models

## Links

- [currencies crate documentation](https://docs.rs/currencies/)
- [rust_decimal documentation](https://docs.rs/rust_decimal/)
- PR #136 - Initial implementation
Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
# 0020. Authentication Pass-Through Design

Date: 2024-01-26
Status: Accepted

## Context

Union Square acts as a proxy between applications and LLM providers. Each provider has different authentication mechanisms:
- AWS Bedrock uses SigV4 signatures
- OpenAI uses API keys in headers
- Anthropic uses API keys in headers
- Other providers may use OAuth, JWT, or other schemes

We need to handle authentication in a way that:
1. Maintains security without storing credentials
2. Supports all provider authentication methods
3. Allows the proxy to remain stateless
4. Enables audit logging without exposing sensitive data

## Decision Drivers

- **Security**: Must never store or log credentials
- **Transparency**: Proxy should not interfere with provider authentication
- **Statelessness**: No session management or credential caching
- **Flexibility**: Support diverse authentication schemes
- **Auditability**: Track who makes requests without storing how

## Considered Options

### Option 1: Store and Manage Credentials
Proxy stores credentials and makes requests on behalf of clients.

**Pros:**
- Centralized credential management
- Could implement credential rotation
- Clients don't need provider credentials

**Cons:**
- Major security risk
- Compliance nightmare
- Complex key management required
- Violates principle of least privilege

### Option 2: Re-sign Requests
Proxy validates incoming auth, then re-signs with its own credentials.

**Pros:**
- Can validate authentication
- Single set of provider credentials

**Cons:**
- Requires proxy to have provider credentials
- Breaks direct accountability
- Complex signature manipulation
- Different logic for each provider

### Option 3: Complete Pass-Through
Forward authentication headers exactly as received.

**Pros:**
- No credential storage
- Provider handles all validation
- Maintains direct accountability
- Simple and secure

**Cons:**
- Cannot validate authentication locally
- Relies on provider error messages

## Decision Outcome

We chose **Option 3: Complete Pass-Through** for all authentication.

### Implementation Pattern

```rust
// Extract auth headers without validation
fn extract_sigv4_headers(headers: &HeaderMap) -> Result<Vec<(HeaderName, HeaderValue)>, ProviderError> {
    let mut auth_headers = Vec::new();

    for header_name in &[AUTHORIZATION, AMZ_DATE, AMZ_SECURITY_TOKEN, AMZ_CONTENT_SHA256] {
        if let Some(value) = headers.get(header_name) {
            auth_headers.push((header_name.clone(), value.clone()));
        }
    }

    Ok(auth_headers)
}

// Forward exactly as received.
// (Excerpt: `forwarded_request` and `client` are built elsewhere and are not shown.)
async fn forward_request(&self, request: Request<Body>) -> Result<Response<Body>, ProviderError> {
    // Extract headers
    let auth_headers = extract_sigv4_headers(request.headers())?;

    // Forward unchanged
    for (name, value) in auth_headers {
        forwarded_request.headers_mut().insert(name, value);
    }

    // Let provider validate
    client.request(forwarded_request).await
}
```

### Security Considerations

1. **No Logging**: Authentication headers are never logged
2. **No Storage**: Credentials pass through memory only
3. **No Validation**: We check presence, not correctness
4. **Provider Errors**: Authentication failures return provider's error response
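
One way to honor the "No Logging" rule is to filter sensitive headers out before anything reaches the audit log. The sketch below is an assumption about how that could look, not the PR's implementation: it uses plain `(String, String)` pairs instead of the `HeaderMap` type, and the `loggable_headers` and `header` helpers and the `SENSITIVE_HEADERS` list are hypothetical. The SigV4 entries correspond to the standard header names for the constants in the implementation pattern.

```rust
/// Header names (lowercase) that must never be logged. The list is
/// illustrative: the four SigV4 headers plus the proxy's own API key header.
const SENSITIVE_HEADERS: [&str; 5] = [
    "authorization",
    "x-amz-date",
    "x-amz-security-token",
    "x-amz-content-sha256",
    "x-api-key",
];

/// Small helper for building owned header pairs in examples.
fn header(name: &str, value: &str) -> (String, String) {
    (name.to_string(), value.to_string())
}

/// Return only the headers that are safe to write to an audit log.
/// Sensitive headers are dropped entirely rather than masked.
fn loggable_headers(headers: &[(String, String)]) -> Vec<(String, String)> {
    headers
        .iter()
        .filter(|(name, _)| {
            !SENSITIVE_HEADERS
                .iter()
                .any(|sensitive| name.eq_ignore_ascii_case(sensitive))
        })
        .cloned()
        .collect()
}

fn main() {
    let incoming = vec![
        header("Authorization", "AWS4-HMAC-SHA256 Credential=EXAMPLE"),
        header("X-Amz-Date", "20240126T000000Z"),
        header("Content-Type", "application/json"),
    ];

    let logged = loggable_headers(&incoming);

    // Only the non-sensitive header survives.
    assert_eq!(logged, vec![header("Content-Type", "application/json")]);
}
```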

## Consequences

### Positive
- Zero credential storage risk
- Simple, stateless implementation
- Supports all authentication schemes
- Maintains accountability chain
- Minimal attack surface
- Easy to audit (what, not how)

### Negative
- Cannot provide early authentication validation
- Dependent on provider error messages
- No ability to implement credential rotation
- Cannot aggregate requests under proxy credentials

### Monitoring and Audit

We track:
- ✅ Which provider was called
- ✅ When the request was made
- ✅ Whether authentication succeeded (via response)
- ❌ What credentials were used
- ❌ Who made the request (unless in other headers)

### Future Considerations

If we need request attribution, we could:
- Add optional Union Square API keys for client identification
- Use mutual TLS for client authentication
- Add request signing for non-repudiation

But these would be IN ADDITION to, not INSTEAD OF, the pass-through authentication.

### Proxy Authentication

Union Square's authentication model (as per ADR-0006) uses API keys for client authentication. This is separate from provider authentication:

1. **Proxy Authentication** (Union Square API key) - Identifies the client application to the proxy
2. **Provider Authentication** (pass-through) - Authenticates the request to the LLM provider

HTTP proxy standards (RFC 7235) define Proxy-Authorization headers for proxy authentication, but:
- Traditional HTTP proxies are typically transparent to the application
- Union Square is an application-level proxy with value-added features
- We need to track usage per client for audit purposes

Our approach:
- Use `X-API-Key` header for Union Square authentication (already implemented)
- Keep provider authentication separate and pass it through unchanged
- This allows clients to use different provider credentials while sharing a proxy
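
The separation between proxy authentication and provider authentication can be sketched as a single pass over the incoming headers. This is an illustration under stated assumptions rather than the PR's code: the `SplitAuth` struct and the `split_auth` and `header` functions are hypothetical, headers are modeled as plain string pairs, and only the SigV4 header set from the implementation pattern is treated as provider authentication.

```rust
/// SigV4 header names (lowercase) forwarded to the provider unchanged.
const SIGV4_HEADERS: [&str; 4] = [
    "authorization",
    "x-amz-date",
    "x-amz-security-token",
    "x-amz-content-sha256",
];

/// Header that carries the Union Square API key, as named in the ADR.
const PROXY_KEY_HEADER: &str = "x-api-key";

/// Hypothetical result type: who the client is, and what to pass through.
#[derive(Debug, PartialEq)]
struct SplitAuth {
    proxy_api_key: Option<String>,
    provider_headers: Vec<(String, String)>,
}

/// Small helper for building owned header pairs in examples.
fn header(name: &str, value: &str) -> (String, String) {
    (name.to_string(), value.to_string())
}

/// Separate the proxy's API key from the provider's auth headers.
/// Provider header values are copied verbatim, never inspected or altered.
fn split_auth(headers: &[(String, String)]) -> SplitAuth {
    let mut proxy_api_key = None;
    let mut provider_headers = Vec::new();

    for (name, value) in headers {
        if name.eq_ignore_ascii_case(PROXY_KEY_HEADER) {
            proxy_api_key = Some(value.clone());
        } else if SIGV4_HEADERS.iter().any(|h| name.eq_ignore_ascii_case(h)) {
            provider_headers.push((name.clone(), value.clone()));
        }
    }

    SplitAuth { proxy_api_key, provider_headers }
}

fn main() {
    let incoming = vec![
        header("X-API-Key", "client-123"),
        header("Authorization", "AWS4-HMAC-SHA256 Credential=EXAMPLE"),
        header("X-Amz-Date", "20240126T000000Z"),
        header("Content-Type", "application/json"),
    ];

    let split = split_auth(&incoming);

    assert_eq!(split.proxy_api_key.as_deref(), Some("client-123"));
    assert_eq!(split.provider_headers.len(), 2);
    assert_eq!(split.provider_headers[0].1, "AWS4-HMAC-SHA256 Credential=EXAMPLE");
}
```

Whether `X-API-Key` is also forwarded upstream is not stated in the ADR; this sketch simply keeps it out of the provider set.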

## Links

- ADR-0006 - Authentication and Authorization (establishes API key pattern)
- AWS SigV4 documentation
- PR #136 - Bedrock provider implementation
