feat: Add S3ChatMemoryRepository #5088

@ybezsonov

Feature Request: S3ChatMemoryRepository for Spring AI

Expected Behavior

Spring AI should provide an S3-based ChatMemoryRepository implementation that allows developers to store chat conversation messages in Amazon S3, similar to existing implementations for JDBC, Redis, MongoDB, Neo4j, and CosmosDB.

The implementation should:

  1. Follow Spring AI patterns: Implement the ChatMemoryRepository interface with the same behavioral contracts as other repository implementations
  2. Provide Spring Boot auto-configuration: Enable seamless integration through configuration properties
  3. Support all Spring AI message types: Handle UserMessage, AssistantMessage, SystemMessage, and ToolResponseMessage with metadata preservation
  4. Offer cost optimization: Support S3 storage classes and work with MessageWindowChatMemory for automatic message limiting

Code Example

# Configuration via properties (application.properties)
spring.ai.chat.memory.repository.s3.bucket-name=my-chat-bucket
spring.ai.chat.memory.repository.s3.key-prefix=chat-memory
spring.ai.chat.memory.repository.s3.region=us-east-1
spring.ai.chat.memory.repository.s3.storage-class=STANDARD

// Programmatic configuration
@Bean
public S3ChatMemoryRepository s3ChatMemoryRepository(S3Client s3Client) {
    return S3ChatMemoryRepository.builder()
        .s3Client(s3Client)
        .bucketName("my-chat-bucket")
        .keyPrefix("chat-memory")
        .storageClass(StorageClass.STANDARD)
        .build();
}

// Usage with MessageWindowChatMemory
@Bean
public ChatMemory chatMemory(S3ChatMemoryRepository repository) {
    return MessageWindowChatMemory.builder()
        .chatMemoryRepository(repository)
        .maxMessages(20)  // Cost control through windowing
        .build();
}
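
A minimal sketch of the behavior the repository behind this builder would need, assuming the four ChatMemoryRepository operations are findConversationIds, findByConversationId, saveAll, and deleteByConversationId, and that saveAll replaces the stored conversation rather than appending to it. The class name, the Msg record, and the in-memory map standing in for the bucket are all hypothetical; a real implementation would serialize to JSON and call S3 instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch only: an in-memory map stands in for the S3 bucket (object key -> content). */
final class WholeObjectRepositorySketch {

    /** Simplified, hypothetical stand-in for a Spring AI message. */
    record Msg(String type, String content) {}

    private static final String SUFFIX = ".json";

    private final Map<String, List<Msg>> bucket = new ConcurrentHashMap<>();
    private final String prefix;

    WholeObjectRepositorySketch(String prefix) {
        this.prefix = prefix;
    }

    private String key(String conversationId) {
        return prefix + "/" + conversationId + SUFFIX;
    }

    /** Overwrites the whole conversation, the way a single object write would. */
    void saveAll(String conversationId, List<Msg> messages) {
        bucket.put(key(conversationId), List.copyOf(messages));
    }

    /** Returns the stored messages, or an empty list when no object exists. */
    List<Msg> findByConversationId(String conversationId) {
        return bucket.getOrDefault(key(conversationId), List.of());
    }

    /** Removes the conversation's object; a no-op if it is absent. */
    void deleteByConversationId(String conversationId) {
        bucket.remove(key(conversationId));
    }

    /** Derives conversation IDs from the stored object keys. */
    List<String> findConversationIds() {
        List<String> ids = new ArrayList<>();
        for (String k : bucket.keySet()) {
            ids.add(k.substring(prefix.length() + 1, k.length() - SUFFIX.length()));
        }
        return ids;
    }
}
```

Because each save rewrites the whole object in this model, a message window such as maxMessages(20) also caps the size of every write, which is where the cost control would come from.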

JSON Storage Format

Conversations would be stored as JSON objects in S3 with the pattern {prefix}/{conversationId}.json:

{
  "conversationId": "conv-123",
  "messages": [
    {
      "type": "USER",
      "content": "Hello, how are you?",
      "timestamp": 1703123456,
      "metadata": {
        "userId": "user-456"
      }
    },
    {
      "type": "ASSISTANT",
      "content": "I'm doing well, thank you!",
      "timestamp": 1703123457,
      "metadata": {
        "model": "gpt-4"
      }
    }
  ]
}
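
The {prefix}/{conversationId}.json pattern implies a two-way mapping: building a key when saving, and recovering the conversation ID from a key when listing. A minimal sketch of that mapping follows; the class and method names are hypothetical, and treating a trailing slash on the prefix as optional is an assumption rather than part of the proposal.

```java
import java.util.Optional;

/** Hypothetical helper: maps conversation IDs to object keys and back. */
final class S3KeyMappingSketch {

    private static final String SUFFIX = ".json";

    private final String prefix; // stored without trailing slashes

    S3KeyMappingSketch(String keyPrefix) {
        // Assumption: "chat-memory" and "chat-memory/" should behave the same.
        this.prefix = keyPrefix.replaceAll("/+$", "");
    }

    /** Builds the object key for a conversation. */
    String toKey(String conversationId) {
        if (conversationId == null || conversationId.isBlank()) {
            throw new IllegalArgumentException("conversationId must not be blank");
        }
        return prefix + "/" + conversationId + SUFFIX;
    }

    /** Recovers the conversation ID from a key, or empty if the key does not match the pattern. */
    Optional<String> toConversationId(String key) {
        String head = prefix + "/";
        if (!key.startsWith(head) || !key.endsWith(SUFFIX)) {
            return Optional.empty();
        }
        String id = key.substring(head.length(), key.length() - SUFFIX.length());
        return id.isEmpty() ? Optional.empty() : Optional.of(id);
    }
}
```

Returning Optional.empty() for non-matching keys lets a listing skip unrelated objects that happen to share the bucket.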

Current Behavior

Spring AI currently provides ChatMemoryRepository implementations for:

  • JDBC (relational databases)
  • Redis (in-memory data store)
  • MongoDB (document database)
  • Neo4j (graph database)
  • CosmosDB (multi-model database)

However, there is no implementation for Amazon S3, which is a popular choice for:

  • Cost-effective storage: S3 offers multiple storage classes (for example Standard, Standard-IA, and Glacier) for cost optimization
  • Durability and scalability: designed for 99.999999999% (11 nines) durability and virtually unlimited scalability
  • Lifecycle management: Automatic expiration and tiering policies
  • Cloud-native applications: Many applications already use S3 for other storage needs

Context

How has this issue affected you?

Many Spring AI applications run on AWS and would benefit from using S3 for chat memory storage because:

  1. Cost optimization: S3 storage typically costs less per GB than a managed database for chat history, especially with infrequent-access classes such as Standard-IA
  2. Operational simplicity: No need to manage database infrastructure for chat storage
  3. Compliance: S3 provides built-in encryption, versioning, and access controls
  4. Integration: Seamless integration with existing AWS infrastructure and IAM policies

What are you trying to accomplish?

  • Store chat conversation history in a cost-effective, durable manner
  • Leverage S3's lifecycle policies for automatic data management
  • Reduce operational overhead by using managed storage instead of databases
  • Maintain consistency with Spring AI patterns and conventions

What other alternatives have you considered?

  1. JDBC with RDS: More expensive and requires database management
  2. Redis: Requires memory management and is more expensive for long-term storage
  3. MongoDB: Requires database infrastructure and ongoing maintenance
  4. Custom implementation: Would require significant development effort and wouldn't follow Spring AI patterns

Are you aware of any workarounds?

Currently, developers must either:

  1. Use one of the existing repository implementations (which may not fit their architecture)
  2. Implement a custom ChatMemoryRepository for S3 (duplicating effort across projects)
  3. Store chat history outside of Spring AI's ChatMemory system

Benefits of S3 Implementation

  1. Cost-effective: S3 offers competitive pricing with multiple storage classes for cost optimization (see S3 Pricing)
  2. Serverless-friendly: Perfect for serverless applications that want to avoid database connections
  3. Automatic scaling: No capacity planning required
  4. Built-in features: Encryption, versioning, lifecycle policies, cross-region replication
  5. AWS ecosystem integration: Works seamlessly with other AWS services

Implementation Considerations

The implementation should:

  • Follow the same patterns as existing Spring AI repository implementations
  • Support Spring Boot auto-configuration with sensible defaults
  • Handle S3 pagination for listing conversations
  • Provide proper error handling and exception mapping
  • Support S3-compatible services (MinIO, Wasabi, etc.) through endpoint configuration
  • Include comprehensive testing with LocalStack integration tests
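
The pagination point above can be sketched as follows. An S3 ListObjectsV2 call returns at most 1,000 keys per response along with a continuation token, so listing conversations would mean following tokens until none remains. The Page record and the fetchPage function here are hypothetical stand-ins for the SDK call, so the loop can be shown without AWS dependencies; in the AWS SDK for Java v2 the corresponding pieces would be S3Client.listObjectsV2 and the response's nextContinuationToken(), or the listObjectsV2Paginator helper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Sketch only: follows continuation tokens across pages of a key listing. */
final class PaginatedListingSketch {

    /** Hypothetical stand-in for one list response; nextToken is null on the last page. */
    record Page(List<String> keys, String nextToken) {}

    /**
     * Collects keys from every page.
     *
     * @param fetchPage takes a continuation token (null for the first page) and returns that page
     */
    static List<String> listAllKeys(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String token = null; // null requests the first page
        do {
            Page page = fetchPage.apply(token);
            all.addAll(page.keys());
            token = page.nextToken();
        } while (token != null);
        return all;
    }
}
```

The do/while form guarantees the first page is always fetched, and an empty final page is handled without a special case.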

This feature would make Spring AI more accessible to AWS-native applications and provide a cost-effective storage option for chat memory.
