Commit 869a755 (parent c5c0027)

Raw Content Blocks, Anthropic Prompt Caching, and Cached Token Tracking

RubyLLM formats your messages properly for each provider. Sometimes, however, you may want to replace the content payload with a custom one, e.g., to enable Anthropic Prompt Caching. Raw Content Blocks (`RubyLLM::Content::Raw` and `RubyLLM::Providers::Anthropic::Content`) let you do exactly that. This commit also adds cached token tracking.

File tree: 38 files changed (+727 −85 lines)

docs/_advanced/rails.md

Lines changed: 25 additions & 0 deletions

@@ -28,6 +28,7 @@ After reading this guide, you will know:
 * How to use `acts_as_chat` and `acts_as_message` with your models
 * How to persist AI model metadata in your database with `acts_as_model`
 * How to send file attachments to AI models with ActiveStorage
+* How to store raw provider payloads (Anthropic prompt caching, etc.)
 * How to integrate streaming responses with Hotwire/Turbo Streams
 * How to customize the persistence behavior for validation-focused scenarios

@@ -87,6 +88,7 @@ rails db:migrate

 Your Rails app is now AI-ready!
+
 ### Adding a Chat UI

 Want a ready-to-use chat interface? Run the chat UI generator:

@@ -148,6 +150,29 @@ class Message < ApplicationRecord
 end
 ```

+### Working with Raw Provider Payloads and Anthropic Prompt Caching
+{: .d-inline-block }
+
+v1.9.0+
+{: .label .label-green }
+
+Providers like Anthropic expose advanced features (prompt caching, fine-grained metadata) by embedding rich structures inside each prompt block. Use `RubyLLM::Content::Raw` to persist those blocks alongside your conversation history:
+
+```ruby
+raw_block = RubyLLM::Content::Raw.new([
+  { type: 'text', text: 'Reusable analysis prompt', cache_control: { type: 'ephemeral' } },
+  { type: 'text', text: "Today's request: #{summary}" }
+])
+
+chat = Chat.create!(model: 'claude-sonnet-4-5')
+chat.ask(raw_block)
+```
+
+The v1.9 schema adds a `content_raw` column so raw payloads live alongside the plain-text `content` field. When you load messages via `acts_as_message`, RubyLLM reconstructs the original `Content::Raw` automatically.
+
+> Existing apps: run `rails generate ruby_llm:upgrade_to_v1_9` to add the cached-token tracking and raw content storage columns introduced in v1.9.0. New apps get the proper columns from the install generator.
+{: .note }
 ### Configuring RubyLLM

 Set up your API keys and other configuration in the initializer:

docs/_core_features/chat.md

Lines changed: 90 additions & 0 deletions

@@ -49,6 +49,8 @@ puts response.content
 # The response object contains metadata
 puts "Model Used: #{response.model_id}"
 puts "Tokens Used: #{response.input_tokens} input, #{response.output_tokens} output"
+puts "Cached Prompt Tokens: #{response.cached_tokens}" # v1.9.0+
+puts "Cache Writes: #{response.cache_creation_tokens}" # v1.9.0+
 ```

 The `ask` method adds your message to the conversation history with the `:user` role, sends the entire conversation history to the AI provider, and returns a `RubyLLM::Message` object containing the assistant's response.
@@ -307,6 +309,88 @@ puts JSON.parse(response.content)
 > Available parameters vary by provider and model. Always consult the provider's documentation for supported features. RubyLLM passes these parameters through without validation, so incorrect parameters may cause API errors. Parameters from `with_params` take precedence over RubyLLM's defaults, allowing you to override any aspect of the request payload.
 {: .warning }

+## Raw Content Blocks
+{: .d-inline-block }
+
+v1.9.0+
+{: .label .label-green }
+
+Most of the time you can rely on RubyLLM to format messages for each provider. When you need to send a custom payload as content, wrap it in `RubyLLM::Content::Raw`. The block is forwarded verbatim, with no additional processing.
+
+```ruby
+raw_block = RubyLLM::Content::Raw.new([
+  { type: 'text', text: 'Reusable analysis prompt' },
+  { type: 'text', text: "Today's request: #{summary}" }
+])
+
+chat = RubyLLM.chat
+chat.add_message(role: :system, content: raw_block)
+chat.ask(raw_block)
+```
+
+Use raw blocks sparingly: they bypass cross-provider safeguards, so it is your responsibility to ensure the payload matches the provider's expectations. `Chat#ask`, `Chat#add_message`, tool results, and streaming accumulators all understand `Content::Raw` values.
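Conceptually, `Content::Raw` is a thin value object whose payload is emitted verbatim. A minimal self-contained sketch of that idea (illustrative only — class and method names here are not the gem's actual implementation):

```ruby
# Illustrative sketch: a raw-content wrapper that forwards its payload untouched.
# RubyLLM::Content::Raw plays this role in the real library.
class RawContentSketch
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Serialization returns the wrapped payload exactly as given,
  # with no provider-specific formatting applied.
  def to_payload
    value
  end
end

blocks = RawContentSketch.new([
  { type: 'text', text: 'Reusable analysis prompt' }
])
```

Because the payload passes through untouched, anything you wrap must already match the target provider's content-block schema.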
+### Anthropic Prompt Caching
+{: .d-inline-block }
+
+v1.9.0+
+{: .label .label-green }
+
+One use case for Raw Content Blocks is Anthropic Prompt Caching.
+
+Anthropic lets you mark individual prompt blocks for caching, which can dramatically reduce costs in long conversations. RubyLLM provides a convenience builder that returns a `Content::Raw` instance with the proper structure:
+
+```ruby
+system_block = RubyLLM::Providers::Anthropic::Content.new(
+  "You are a release-notes assistant. Always group changes by subsystem.",
+  cache: true # shorthand for cache_control: { type: 'ephemeral' }
+)
+
+chat = RubyLLM.chat(model: '{{ site.models.anthropic_latest }}')
+chat.add_message(role: :system, content: system_block)
+
+response = chat.ask(
+  RubyLLM::Providers::Anthropic::Content.new(
+    "Summarize the API changes in this diff.",
+    cache_control: { type: 'ephemeral', ttl: '1h' }
+  )
+)
+```
+
+Need something even more custom? Build the payload manually and wrap it in `Content::Raw`:
+
+```ruby
+raw_prompt = RubyLLM::Content::Raw.new([
+  { type: 'text', text: File.read('/a/large/file'), cache_control: { type: 'ephemeral' } },
+  { type: 'text', text: "Today's request: #{summary}" }
+])
+
+chat.ask(raw_prompt)
+```
+
+The same idea applies to tool definitions:
+
+```ruby
+class ChangelogTool < RubyLLM::Tool
+  description "Formats commits into human-readable changelog entries."
+  param :commits, type: :array, desc: "List of commits to summarize"
+
+  with_params cache_control: { type: 'ephemeral' }
+
+  def execute(commits:)
+    # ...
+  end
+end
+```
+
+Providers that do not understand these extra fields silently ignore them, so you can reuse the same tools across models.
+See the [Tool Provider Parameters]({% link _core_features/tools.md %}#provider-specific-parameters) section for more detail.
 ### Custom HTTP Headers

 Some providers offer beta features or special capabilities through custom HTTP headers. The `with_headers` method lets you add these headers to your API requests while maintaining RubyLLM's security model.

@@ -502,9 +586,13 @@ response = chat.ask "Explain the Ruby Global Interpreter Lock (GIL)."

 input_tokens = response.input_tokens # Tokens in the prompt sent TO the model
 output_tokens = response.output_tokens # Tokens in the response FROM the model
+cached_tokens = response.cached_tokens # Tokens served from the provider's prompt cache (if supported) - v1.9.0+
+cache_creation_tokens = response.cache_creation_tokens # Tokens written to the cache (Anthropic/Bedrock) - v1.9.0+

 puts "Input Tokens: #{input_tokens}"
 puts "Output Tokens: #{output_tokens}"
+puts "Cached Prompt Tokens: #{cached_tokens}" # v1.9.0+
+puts "Cache Creation Tokens: #{cache_creation_tokens}" # v1.9.0+
 puts "Total Tokens for this turn: #{input_tokens + output_tokens}"

 # Estimate cost for this turn
@@ -523,6 +611,8 @@ total_conversation_tokens = chat.messages.sum { |msg| (msg.input_tokens || 0) +
 puts "Total Conversation Tokens: #{total_conversation_tokens}"
 ```

+`cached_tokens` captures the portion of the prompt served from the provider's cache (v1.9.0+). OpenAI reports this value automatically for prompts over 1024 tokens, while Anthropic and Bedrock/Claude expose both cache hits and cache writes. When the provider does not send cache data, these attributes remain `nil`, so guard them with `|| 0` before displaying or summing.
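Since the cache fields may be `nil`, cost estimates should guard them. A rough, nil-safe sketch — the method name and per-million-token prices below are hypothetical, not part of RubyLLM, and it assumes cached tokens are counted inside `input_tokens` as OpenAI reports them:

```ruby
# Hypothetical prices (USD per million tokens) - for illustration only.
INPUT_PRICE  = 3.0
CACHED_PRICE = 0.3   # cached prompt tokens are typically billed at a steep discount
OUTPUT_PRICE = 15.0

def estimate_turn_cost(input_tokens:, output_tokens:, cached_tokens: nil)
  cached   = cached_tokens || 0     # nil when the provider sends no cache data
  uncached = input_tokens - cached  # assumes cached tokens are a subset of input_tokens
  (uncached * INPUT_PRICE + cached * CACHED_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000.0
end

puts estimate_turn_cost(input_tokens: 2000, output_tokens: 500, cached_tokens: 1500)
```

Check your provider's own accounting before relying on this: some providers report cache reads inside the input-token count, others alongside it.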
 Refer to the [Working with Models Guide]({% link _advanced/models.md %}) for details on accessing model-specific pricing.

 ## Chat Event Handlers

docs/_core_features/tools.md

Lines changed: 24 additions & 0 deletions

@@ -86,6 +86,30 @@ end
 > ```
 {: .note }

+### Provider-Specific Parameters
+{: .d-inline-block }
+
+v1.9.0+
+{: .label .label-green }
+
+Some providers let you attach extra metadata to tool definitions (for example, Anthropic's `cache_control` directive for prompt caching). Use `with_params` on your tool class to declare the metadata once, and RubyLLM will merge it into the API payload when the provider understands it.
+
+```ruby
+class TodoTool < RubyLLM::Tool
+  description "Adds a task to the shared TODO list"
+  param :title, desc: "Human-friendly task description"
+
+  with_params cache_control: { type: 'ephemeral' }
+
+  def execute(title:)
+    Todo.create!(title:)
+    "Added “#{title}” to the list."
+  end
+end
+```
+
+Provider-specific tool parameters are passed through verbatim. They are currently implemented only for the Anthropic provider; other providers ignore `with_params` for now. Set `RUBYLLM_DEBUG=true` and keep an eye on your logs when rolling out new metadata.
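Under the hood, the effect is essentially a hash merge into the serialized tool definition. A self-contained sketch of that merge (illustrative; the real serialization lives inside RubyLLM's Anthropic provider code, and the method name here is invented):

```ruby
# Illustrative: merge provider-specific params into a tool-definition payload.
def tool_definition(name:, description:, input_schema:, provider_params: {})
  base = { name: name, description: description, input_schema: input_schema }
  base.merge(provider_params) # extra keys such as cache_control pass through verbatim
end

payload = tool_definition(
  name: 'todo_tool',
  description: 'Adds a task to the shared TODO list',
  input_schema: { type: 'object', properties: { title: { type: 'string' } } },
  provider_params: { cache_control: { type: 'ephemeral' } }
)
```

Because the merge happens last, provider params can in principle override the base fields, so keep the keys disjoint from `name`, `description`, and `input_schema`.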
 ## Returning Rich Content from Tools

 Tools can return `RubyLLM::Content` objects with file attachments, allowing you to pass images, documents, or other files from your tools to the AI model:

lib/generators/ruby_llm/install/templates/create_messages_migration.rb.tt

Lines changed: 3 additions & 0 deletions

@@ -3,8 +3,11 @@ class Create<%= message_model_name.gsub('::', '').pluralize %> < ActiveRecord::M
   create_table :<%= message_table_name %> do |t|
     t.string :role, null: false
     t.text :content
+    t.json :content_raw
     t.integer :input_tokens
     t.integer :output_tokens
+    t.integer :cached_tokens
+    t.integer :cache_creation_tokens
     t.timestamps
   end

Lines changed: 15 additions & 0 deletions

@@ -0,0 +1,15 @@
+class AddRubyLlmV19Columns < ActiveRecord::Migration<%= migration_version %>
+  def change
+    unless column_exists?(:<%= message_table_name %>, :cached_tokens)
+      add_column :<%= message_table_name %>, :cached_tokens, :integer
+    end
+
+    unless column_exists?(:<%= message_table_name %>, :cache_creation_tokens)
+      add_column :<%= message_table_name %>, :cache_creation_tokens, :integer
+    end
+
+    unless column_exists?(:<%= message_table_name %>, :content_raw)
+      add_column :<%= message_table_name %>, :content_raw, :json
+    end
+  end
+end
Lines changed: 49 additions & 0 deletions

@@ -0,0 +1,49 @@
+# frozen_string_literal: true
+
+require 'rails/generators'
+require 'rails/generators/active_record'
+require_relative '../generator_helpers'
+
+module RubyLLM
+  module Generators
+    # Generator to add v1.9 columns (cached tokens + raw content support) to existing apps.
+    class UpgradeToV19Generator < Rails::Generators::Base
+      include Rails::Generators::Migration
+      include RubyLLM::GeneratorHelpers
+
+      namespace 'ruby_llm:upgrade_to_v1_9'
+      source_root File.expand_path('templates', __dir__)
+
+      argument :model_mappings, type: :array, default: [], banner: 'message:MessageName'
+
+      desc 'Adds cached token columns and raw content storage fields introduced in v1.9.0'
+
+      def self.next_migration_number(dirname)
+        ::ActiveRecord::Generators::Base.next_migration_number(dirname)
+      end
+
+      def create_migration_file
+        parse_model_mappings
+
+        migration_template 'add_v1_9_message_columns.rb.tt',
+                           'db/migrate/add_ruby_llm_v1_9_columns.rb',
+                           migration_version: migration_version,
+                           message_table_name: message_table_name
+      end
+
+      def show_next_steps
+        say_status :success, 'Upgrade prepared!', :green
+        say <<~INSTRUCTIONS
+
+          Next steps:
+            1. Review the generated migration
+            2. Run: rails db:migrate
+            3. Restart your application server
+
+          📚 See the v1.9.0 release notes for details on cached token tracking and raw content support.
+
+        INSTRUCTIONS
+      end
+    end
+  end
+end

lib/ruby_llm/active_record/chat_methods.rb

Lines changed: 41 additions & 13 deletions

@@ -174,8 +174,16 @@ def on_tool_result(...)
 end

 def create_user_message(content, with: nil)
-  message_record = messages_association.create!(role: :user, content: content)
+  content_text, attachments, content_raw = prepare_content_for_storage(content)
+
+  message_record = messages_association.build(role: :user)
+  message_record.content = content_text
+  message_record.content_raw = content_raw if message_record.respond_to?(:content_raw=)
+  message_record.save!
+
   persist_content(message_record, with) if with.present?
+  persist_content(message_record, attachments) if attachments.present?
+
   message_record
 end

@@ -235,28 +243,25 @@ def persist_new_message
   @message = messages_association.create!(role: :assistant, content: '')
 end

-def persist_message_completion(message) # rubocop:disable Metrics/PerceivedComplexity
+# rubocop:disable Metrics/PerceivedComplexity
+def persist_message_completion(message)
   return unless message

   tool_call_id = find_tool_call_id(message.tool_call_id) if message.tool_call_id

   transaction do
-    content = message.content
-    attachments_to_persist = nil
-
-    if content.is_a?(RubyLLM::Content)
-      attachments_to_persist = content.attachments if content.attachments.any?
-      content = content.text
-    elsif content.is_a?(Hash) || content.is_a?(Array)
-      content = content.to_json
-    end
+    content_text, attachments_to_persist, content_raw = prepare_content_for_storage(message.content)

     attrs = {
       role: message.role,
-      content: content,
+      content: content_text,
       input_tokens: message.input_tokens,
       output_tokens: message.output_tokens
     }
+    attrs[:cached_tokens] = message.cached_tokens if @message.has_attribute?(:cached_tokens)
+    if @message.has_attribute?(:cache_creation_tokens)
+      attrs[:cache_creation_tokens] = message.cache_creation_tokens
+    end

     # Add model association dynamically
     attrs[self.class.model_association_name] = model_association

@@ -266,12 +271,15 @@ def persist_message_completion(message) # rubocop:disable Metrics/PerceivedCompl
       attrs[parent_tool_call_assoc.foreign_key] = tool_call_id
     end

-    @message.update!(attrs)
+    @message.assign_attributes(attrs)
+    @message.content_raw = content_raw if @message.respond_to?(:content_raw=)
+    @message.save!

     persist_content(@message, attachments_to_persist) if attachments_to_persist
     persist_tool_calls(message.tool_calls) if message.tool_calls.present?
   end
 end
+# rubocop:enable Metrics/PerceivedComplexity

 def persist_tool_calls(tool_calls)
   tool_calls.each_value do |tool_call|

@@ -331,6 +339,26 @@ def convert_to_active_storage_format(source)
   RubyLLM.logger.warn "Failed to process attachment #{source}: #{e.message}"
   nil
 end
+
+def prepare_content_for_storage(content)
+  attachments = nil
+  content_raw = nil
+  content_text = content
+
+  case content
+  when RubyLLM::Content::Raw
+    content_raw = content.value
+    content_text = nil
+  when RubyLLM::Content
+    attachments = content.attachments if content.attachments.any?
+    content_text = content.text
+  when Hash, Array
+    content_raw = content
+    content_text = nil
+  end
+
+  [content_text, attachments, content_raw]
+end
