
# Architecture

This document describes the high-level architecture of the LLMCpp Spring Boot Chat application.

## Overview

The application is designed as a standalone Spring Boot application that provides a Command Line Interface (CLI) for interacting with a Large Language Model (LLM). It leverages the `java-llama.cpp` library to run GGUF models locally.

## Component Diagram

The following diagram illustrates the key components and their relationships:

```mermaid
classDiagram
    class LlmcppChatDemoApplication {
        +main(args)
    }
    class ChatRunner {
        +run(args)
    }
    class ChatServicesImpl {
        +startChatService()
    }
    class ChatbotServicesImpl {
        +generateResponse(question)
    }
    class ConsoleIOService {
        +readInput()
        +writeOutput()
    }
    class LlamaCppProperties {
        +String model
        +Double temperature
        +...
    }
    class LlamaModelComponent {
        -LlamaModel modelLlm
        +init()
        +getModelLlm()
        +generate(InferenceParameters)
    }
    class PromptComponent {
        -String promptContent
        +init()
    }

    LlmcppChatDemoApplication --> ChatRunner : triggers
    ChatRunner --> ChatServicesImpl : starts
    ChatServicesImpl --> ChatbotServicesImpl : uses
    ChatServicesImpl --> ConsoleIOService : uses I/O
    ChatbotServicesImpl --> LlamaModelComponent : uses
    ChatbotServicesImpl --> PromptComponent : uses
    ChatbotServicesImpl --> LlamaCppProperties : config
    LlamaModelComponent --> LlamaCppProperties : config
```

## Sequence Diagram: Chat Flow

The following sequence diagram shows the flow of control when the application starts and processes a user request:

```mermaid
sequenceDiagram
    participant User
    participant App as LlmcppChatDemoApplication
    participant Runner as ChatRunner
    participant Chat as ChatServicesImpl
    participant IO as ConsoleIOService
    participant Bot as ChatbotServicesImpl
    participant Model as LlamaModelComponent
    participant Prompt as PromptComponent

    App->>Runner: run()
    Runner->>Chat: startChatService()
    loop Chat Loop
        Chat->>IO: writeOutput("user >")
        IO-->>User: Display Prompt
        User->>IO: Input Question
        IO-->>Chat: question
        alt Input is "exit"
            Chat->>Runner: return
        else Valid Input
            Chat->>Bot: generateResponse(question)
            Bot->>Prompt: getPromptContent()
            Bot->>Model: generate(InferenceParameters)
            loop Stream Tokens
                Model-->>Bot: LlamaOutput token
                Bot->>IO: writeOutput(token)
                IO-->>User: Print token
            end
        end
    end
```

## Component Details

### 1. Application Entry & Lifecycle

- **`LlmcppChatDemoApplication`**: Bootstraps the Spring application context.
- **`ChatRunner`**: Implements `CommandLineRunner`. This is the preferred way to start a CLI application in Spring Boot: the runner executes only after the application context is fully refreshed, so every bean is initialized before the chat loop takes over the main thread.
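A minimal sketch of this wiring, with the `CommandLineRunner` interface inlined as a stand-in so the example compiles without Spring on the classpath, and `ChatServicesImpl` stood in by a plain `Runnable`:

```java
// Stand-in for org.springframework.boot.CommandLineRunner, shown inline
// so this sketch is self-contained; the real interface is identical.
@FunctionalInterface
interface CommandLineRunner {
    void run(String... args) throws Exception;
}

// In the real application this class carries @Component and the chat
// service is constructor-injected by Spring.
class ChatRunner implements CommandLineRunner {
    private final Runnable chatService; // stand-in for ChatServicesImpl

    ChatRunner(Runnable chatService) {
        this.chatService = chatService;
    }

    @Override
    public void run(String... args) {
        // Delegates to startChatService() in the real code; by this point
        // the Spring context is fully refreshed.
        chatService.run();
    }
}
```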

### 2. Chat Interface (`ChatServicesImpl`)

- **Role**: Handles the high-level chat logic.
- **Implementation**:
  - Uses `IOService` to interact with the user, decoupling the logic from `System.in`/`System.out`.
  - Implements an infinite loop that breaks on the "exit" command.
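The loop can be sketched as follows; the read/write/respond hooks are stand-ins introduced here for illustration (`IOService` and `ChatbotServicesImpl` in the real application):

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Chat loop sketch: prompt, read, answer, repeat until "exit".
class ChatLoop {
    static void run(Supplier<String> readInput,
                    Consumer<String> writeOutput,
                    Function<String, String> respond) {
        while (true) {
            writeOutput.accept("user > ");
            String question = readInput.get();
            if ("exit".equalsIgnoreCase(question.trim())) {
                return; // leave the loop; the application then shuts down
            }
            writeOutput.accept(respond.apply(question));
        }
    }
}
```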

### 3. I/O Abstraction (`IOService` / `ConsoleIOService`)

- **Role**: Provides a clean interface for reading from and writing to the user interface.
- **Implementation**: `ConsoleIOService` uses `Scanner` and `System.out`. This abstraction keeps the application testable and adaptable to other interfaces (e.g., a WebSocket or a GUI).
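A sketch of the abstraction; the constructor taking explicit streams is an assumption made here so the class is unit-testable (the real bean presumably wraps `System.in`/`System.out` directly):

```java
import java.io.InputStream;
import java.io.PrintStream;
import java.util.Scanner;

// Method names follow the class diagram above.
interface IOService {
    String readInput();
    void writeOutput(String text);
}

class ConsoleIOService implements IOService {
    private final Scanner scanner;
    private final PrintStream out;

    ConsoleIOService(InputStream in, PrintStream out) {
        this.scanner = new Scanner(in);
        this.out = out;
    }

    @Override
    public String readInput() {
        return scanner.nextLine();
    }

    @Override
    public void writeOutput(String text) {
        out.print(text);
    }
}
```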

### 4. Chat Logic (`ChatbotServicesImpl`)

- **Role**: Orchestrates response generation.
- **Responsibilities**:
  - Constructs the full prompt by injecting the user's question into the template.
  - Configures `InferenceParameters` (temperature, top-p, mirostat, etc.).
  - Calls the `LlamaModel` to generate text.
  - Streams the output directly to the console.
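The prompt-construction step can be sketched as plain string templating; the `{question}` placeholder name is an assumption made for illustration, since the actual token depends on the prompt file shipped with the application:

```java
// Sketch of prompt assembly: the user's question is spliced into the
// cached template. The resulting string is what the service would pass
// to InferenceParameters before calling LlamaModel.generate(...).
class PromptAssembler {
    static String buildPrompt(String template, String question) {
        // "{question}" is a hypothetical placeholder name.
        return template.replace("{question}", question);
    }
}
```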

### 5. Model Management (`LlamaModelComponent`)

- **Role**: Wrapper for the native `LlamaModel`.
- **Implementation**: Uses `LlamaCppProperties` for configuration.
- **Lifecycle**:
  - **Init**: Loads the GGUF model from the configured path.
  - **Destroy**: Closes the model to free native memory.
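The lifecycle can be sketched as follows; a stand-in `NativeModel` interface replaces the native `LlamaModel` so the example compiles without the native library, and the `@PostConstruct`/`@PreDestroy` annotations of the real component are noted in comments:

```java
import java.util.function.Supplier;

// Stand-in for java-llama.cpp's LlamaModel.
interface NativeModel extends AutoCloseable {
    @Override
    void close(); // releases native memory
}

class LlamaModelComponent implements AutoCloseable {
    private final Supplier<NativeModel> loader;
    private NativeModel model;

    LlamaModelComponent(Supplier<NativeModel> loader) {
        this.loader = loader;
    }

    // @PostConstruct in the real component: load the GGUF model from the
    // path configured in LlamaCppProperties.
    void init() {
        model = loader.get();
    }

    NativeModel getModelLlm() {
        return model;
    }

    // @PreDestroy in the real component: close the model so native
    // memory is freed on shutdown.
    @Override
    public void close() {
        if (model != null) {
            model.close();
        }
    }
}
```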

### 6. Prompt Management (`PromptComponent`)

- **Role**: Loads and caches the prompt template.
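A load-once-and-cache sketch; taking the template path as a constructor argument is an assumption made here for testability (the real component likely reads it from configuration or the classpath during `init()` under `@PostConstruct`):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class PromptComponent {
    private final Path templatePath;
    private String promptContent;

    PromptComponent(Path templatePath) {
        this.templatePath = templatePath;
    }

    // @PostConstruct in the real component: read the template once and
    // keep it in memory for the lifetime of the application.
    void init() throws IOException {
        promptContent = Files.readString(templatePath);
    }

    String getPromptContent() {
        return promptContent;
    }
}
```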

### 7. Configuration (`LlamaCppProperties`)

- **Role**: Type-safe configuration bean mapping all `llamacpp.*` properties.
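The shape of the bean can be modeled with a plain record; in the real application the class carries `@ConfigurationProperties(prefix = "llamacpp")` and Spring binds the fields from `application.properties`. Only `model` and `temperature` appear in the class diagram above; any further properties are elided:

```java
// Plain-record model of the configuration bean. Spring's binder would
// populate these fields from llamacpp.model, llamacpp.temperature, etc.
record LlamaCppProperties(String model, Double temperature) { }
```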