feat: add thinking and cleaned_content properties to Message class (#65)

lfnovo · web-flow · commit b2432183fbea · 2026-01-16T16:47:34.000-03:00
* feat: add thinking and cleaned_content properties to Message class

Adds two new properties for handling models that include reasoning traces:
- thinking: extracts content inside <think> tags
- cleaned_content: returns content with <think> tags removed

Useful for models like Qwen3, DeepSeek R1, and others that include
chain-of-thought reasoning in their responses.
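
A minimal standalone sketch of the behavior described above, mirroring the
logic this commit adds to response.py. The helper names `extract_thinking`
and `strip_thinking` are hypothetical and exist only for illustration; in
the library the behavior is exposed as `Message.thinking` and
`Message.cleaned_content`.

```python
import re
from typing import Optional

# Same pattern as the one added to response.py in this commit
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def extract_thinking(content: Optional[str]) -> Optional[str]:
    """Illustrative stand-in for Message.thinking (hypothetical name)."""
    if not content:
        return None
    # Strip each block and drop the ones that are empty after stripping
    blocks = [m.strip() for m in THINK_PATTERN.findall(content) if m.strip()]
    return "\n\n".join(blocks) if blocks else None


def strip_thinking(content: Optional[str]) -> str:
    """Illustrative stand-in for Message.cleaned_content (hypothetical name)."""
    if not content:
        return ""
    cleaned = THINK_PATTERN.sub("", content)
    # Collapse runs of 3+ newlines left where a block was removed
    cleaned = re.sub(r"\n{3,}", "\n\n", cleaned)
    return cleaned.strip()


raw = "<think>First</think>\n\nPart one\n\n<think>Second</think>\n\nPart two"
print(repr(extract_thinking(raw)))  # 'First\n\nSecond'
print(repr(strip_thinking(raw)))    # 'Part one\n\nPart two'
```

Because the pattern requires a closing tag, an unclosed `<think>` (for
example, a response cut off mid-reasoning) is not matched: the reasoning
result is None and the text is returned with the tag still in place.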

* docs: add documentation for thinking and cleaned_content properties

- README.md: Added example in LLM Responses section
- docs/capabilities/llm.md: Added Handling Reasoning Traces section
- CLAUDE.md: Added Message Thinking Properties documentation
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -5,6 +5,16 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [2.15.0] - 2026-01-16
+
+### Added
+
+- **Message Thinking Properties** - Added `thinking` and `cleaned_content` properties to `Message` class
+  - `thinking`: Extracts content inside `<think>` tags (reasoning trace from models like Qwen3, DeepSeek R1)
+  - `cleaned_content`: Returns content with `<think>` tags removed (actual response)
+  - Multiple `<think>` blocks are concatenated
+  - `thinking` returns `None` if no tags are present or all `<think>` blocks are empty
+
 ## [2.14.1] - 2026-01-16
 
 ### Fixed
diff --git a/README.md b/README.md
@@ -323,6 +323,24 @@ async for chunk in model.achat_complete(messages):
     print(chunk.choices[0].delta.content, end="", flush=True)
 ```
 
+#### Handling Reasoning Traces
+
+Some models (like Qwen3, DeepSeek R1) include chain-of-thought reasoning in `<think>` tags. The `Message` class provides convenient properties to handle this:
+
+```python
+response = model.chat_complete(messages)
+msg = response.choices[0].message
+
+# Full response including reasoning
+msg.content  # "<think>Let me analyze...</think>\n\n{\"answer\": 42}"
+
+# Just the reasoning (returns None if no <think> tags)
+msg.thinking  # "Let me analyze..."
+
+# Just the actual response (with <think> tags removed)
+msg.cleaned_content  # "{\"answer\": 42}"
+```
+
 ### Embedding Responses
 
 ```python
diff --git a/docs/capabilities/llm.md b/docs/capabilities/llm.md
@@ -133,6 +133,26 @@ chunk = next(model.chat_complete(messages, stream=True))
 chunk.choices[0].delta.content                # Incremental content
 ```
 
+### Handling Reasoning Traces
+
+Some models (like Qwen3, DeepSeek R1) include chain-of-thought reasoning in `<think>` tags. The `Message` class provides convenient properties to parse these:
+
+```python
+response = model.chat_complete(messages)
+msg = response.choices[0].message
+
+# Full response with reasoning
+msg.content          # "<think>Let me analyze...</think>\n\n42"
+
+# Just the reasoning trace (returns None if no <think> tags)
+msg.thinking         # "Let me analyze..."
+
+# Just the actual answer (with <think> tags removed)
+msg.cleaned_content  # "42"
+```
+
+Multiple `<think>` blocks are concatenated with a blank line between them. If the response has no `<think>` tags, `thinking` returns `None` and `cleaned_content` returns the full content with surrounding whitespace stripped.
+
 ## Structured Output
 
 Request JSON-formatted responses (where supported):
@@ -264,6 +284,27 @@ asyncio.run(main())
 
 See [Resource Management](../advanced/connection-resource-management.md) for more details.
 
+### Handling Reasoning Traces
+
+```python
+# Works with models that include <think> tags (Qwen3, DeepSeek R1, etc.)
+model = AIFactory.create_language(
+    "openai-compatible", "qwen/qwen3-4b",
+    config={"base_url": "http://localhost:1234/v1"}
+)
+
+messages = [{"role": "user", "content": "What is 15 * 24?"}]
+response = model.chat_complete(messages)
+msg = response.choices[0].message
+
+# Get just the answer, without the reasoning
+print(msg.cleaned_content)  # "360"
+
+# Or inspect the reasoning for debugging
+if msg.thinking:
+    print(f"Model's reasoning: {msg.thinking}")
+```
+
 ## See Also
 
 - [Provider Setup Guides](../providers/README.md)
diff --git a/pyproject.toml b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "esperanto"
-version = "2.14.1"
+version = "2.15.0"
 description = "A light-weight, production-ready, unified interface for various AI model providers"
 authors = [
     { name = "LUIS NOVO", email = "lfnovo@gmail.com" }
diff --git a/src/esperanto/common_types/CLAUDE.md b/src/esperanto/common_types/CLAUDE.md
@@ -44,7 +44,7 @@ All providers convert their API responses to Esperanto's common types:
 
 ### Message Structure
 
-Chat messages follow OpenAI-style format (response.py:31):
+Chat messages follow OpenAI-style format (response.py:34):
 
 ```python
 Message(
@@ -55,6 +55,27 @@ Message(
 )
 ```
 
+### Message Thinking Properties
+
+The `Message` class provides properties for handling models that include reasoning traces (like Qwen3, DeepSeek R1):
+
+- **`thinking`**: Extracts content inside `<think>` tags (returns `None` if no tags)
+- **`cleaned_content`**: Returns content with `<think>` tags removed
+
+```python
+# Response from a model with reasoning traces
+msg = Message(
+    content="<think>Let me analyze this...</think>\n\n{\"answer\": 42}",
+    role="assistant"
+)
+
+msg.content          # "<think>Let me analyze this...</think>\n\n{\"answer\": 42}"
+msg.thinking         # "Let me analyze this..."
+msg.cleaned_content  # "{\"answer\": 42}"
+```
+
+Multiple `<think>` blocks are concatenated with `\n\n`. If content has no `<think>` tags, `thinking` returns `None` and `cleaned_content` returns the full content.
+
 ### Usage Tracking
 
 Token usage is standardized in `Usage` class (response.py:19):
@@ -239,3 +260,27 @@ print(msg["content"])  # "Hello"
 # Convert to dict
 msg_dict = msg.model_dump()  # {"content": "Hello", "role": "user", ...}
 ```
+
+### Handling Reasoning Traces
+
+```python
+from esperanto import AIFactory
+
+# Models like Qwen3, DeepSeek R1 include <think> tags
+model = AIFactory.create_language("openai-compatible", "qwen/qwen3-4b", config={...})
+response = model.chat_complete([{"role": "user", "content": "What is 2+2?"}])
+
+msg = response.choices[0].message
+
+# Full response with reasoning
+print(msg.content)
+# "<think>I need to add 2 and 2...</think>\n\n4"
+
+# Just the reasoning (useful for debugging/logging)
+print(msg.thinking)
+# "I need to add 2 and 2..."
+
+# Just the answer (useful for parsing/display)
+print(msg.cleaned_content)
+# "4"
+```
diff --git a/src/esperanto/common_types/response.py b/src/esperanto/common_types/response.py
@@ -1,9 +1,13 @@
 """Response types for Esperanto."""
 
+import re
 from typing import Any, Dict, List, Optional, Union
 
 from pydantic import BaseModel, ConfigDict, Field, model_validator
 
+# Regex pattern to match <think>...</think> blocks (including multiline)
+_THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)
+
 
 def to_dict(obj: Any) -> Dict[str, Any]:
     """Convert an object to a dictionary."""
@@ -50,6 +54,46 @@ def __getitem__(self, key: str) -> Any:
         """Enable dict-like access for backward compatibility."""
         return getattr(self, key)
 
+    @property
+    def thinking(self) -> Optional[str]:
+        """Extract content inside <think> tags (reasoning trace).
+
+        Returns the concatenated content of all <think>...</think> blocks
+        in the message. Returns None if no thinking tags are present or
+        if all thinking blocks are empty.
+
+        This is useful for models like Qwen3, DeepSeek R1, and others that
+        include chain-of-thought reasoning in their responses.
+        """
+        if not self.content:
+            return None
+        matches = _THINK_PATTERN.findall(self.content)
+        if not matches:
+            return None
+        # Strip each block, drop empty ones, then join with a blank line
+        non_empty = [match.strip() for match in matches if match.strip()]
+        if not non_empty:
+            return None
+        return "\n\n".join(non_empty)
+
+    @property
+    def cleaned_content(self) -> str:
+        """Get content with <think> tags removed (actual response).
+
+        Returns the message content with all <think>...</think> blocks
+        removed. If there are no thinking tags, returns the original content.
+
+        This is useful for getting the actual response from models that
+        include chain-of-thought reasoning in their responses.
+        """
+        if not self.content:
+            return ""
+        # Remove all <think>...</think> blocks
+        cleaned = _THINK_PATTERN.sub("", self.content)
+        # Collapse runs of 3+ newlines left behind into a single blank line
+        cleaned = re.sub(r"\n{3,}", "\n\n", cleaned)
+        return cleaned.strip()
+
     @model_validator(mode="before")
     @classmethod
     def convert_mock_content(cls, data: Any) -> Any:
diff --git a/tests/test_message_thinking.py b/tests/test_message_thinking.py
@@ -0,0 +1,157 @@
+"""Tests for Message.thinking and Message.cleaned_content properties."""
+
+import pytest
+
+from esperanto.common_types import Message
+
+
+class TestMessageThinkingProperty:
+    """Tests for the thinking property."""
+
+    def test_thinking_extracts_single_block(self):
+        """Test extracting content from a single think block."""
+        content = "<think>Let me think about this.</think>\n\nThe answer is 42."
+        msg = Message(content=content, role="assistant")
+        assert msg.thinking == "Let me think about this."
+
+    def test_thinking_extracts_multiline_block(self):
+        """Test extracting multiline content from think block."""
+        content = """<think>
+First, I need to consider X.
+Then, I should look at Y.
+Finally, Z is important.
+</think>
+
+The answer is 42."""
+        msg = Message(content=content, role="assistant")
+        assert "First, I need to consider X." in msg.thinking
+        assert "Then, I should look at Y." in msg.thinking
+        assert "Finally, Z is important." in msg.thinking
+
+    def test_thinking_concatenates_multiple_blocks(self):
+        """Test that multiple think blocks are concatenated."""
+        content = """<think>First thought</think>
+
+Some response
+
+<think>Second thought</think>
+
+More response"""
+        msg = Message(content=content, role="assistant")
+        assert msg.thinking == "First thought\n\nSecond thought"
+
+    def test_thinking_returns_none_without_tags(self):
+        """Test that thinking returns None when no think tags present."""
+        content = '{"name": "John", "age": 30}'
+        msg = Message(content=content, role="assistant")
+        assert msg.thinking is None
+
+    def test_thinking_returns_none_for_empty_content(self):
+        """Test that thinking returns None for None content."""
+        msg = Message(content=None, role="assistant")
+        assert msg.thinking is None
+
+    def test_thinking_returns_none_for_empty_think_tags(self):
+        """Test that thinking returns None when think tags are empty."""
+        content = "<think>\n\n</think>\n\n{\"result\": 42}"
+        msg = Message(content=content, role="assistant")
+        assert msg.thinking is None
+
+    def test_thinking_strips_whitespace(self):
+        """Test that thinking content is stripped of leading/trailing whitespace."""
+        content = "<think>   \n  Padded content  \n   </think>"
+        msg = Message(content=content, role="assistant")
+        assert msg.thinking == "Padded content"
+
+
+class TestMessageCleanedContentProperty:
+    """Tests for the cleaned_content property."""
+
+    def test_cleaned_content_removes_think_block(self):
+        """Test that cleaned_content removes think blocks."""
+        content = "<think>Let me think.</think>\n\n{\"answer\": 42}"
+        msg = Message(content=content, role="assistant")
+        assert msg.cleaned_content == '{"answer": 42}'
+
+    def test_cleaned_content_removes_multiple_blocks(self):
+        """Test that cleaned_content removes multiple think blocks."""
+        content = """<think>First thought</think>
+
+Some response
+
+<think>Second thought</think>
+
+More response"""
+        msg = Message(content=content, role="assistant")
+        cleaned = msg.cleaned_content
+        assert "<think>" not in cleaned
+        assert "</think>" not in cleaned
+        assert "First thought" not in cleaned
+        assert "Second thought" not in cleaned
+        assert "Some response" in cleaned
+        assert "More response" in cleaned
+
+    def test_cleaned_content_returns_full_content_without_tags(self):
+        """Test that cleaned_content returns full content when no think tags."""
+        content = '{"name": "John", "age": 30}'
+        msg = Message(content=content, role="assistant")
+        assert msg.cleaned_content == content
+
+    def test_cleaned_content_returns_empty_string_for_none(self):
+        """Test that cleaned_content returns empty string for None content."""
+        msg = Message(content=None, role="assistant")
+        assert msg.cleaned_content == ""
+
+    def test_cleaned_content_handles_empty_think_tags(self):
+        """Test that cleaned_content handles empty think tags correctly."""
+        content = "<think>\n\n</think>\n\n{\"result\": 42}"
+        msg = Message(content=content, role="assistant")
+        assert msg.cleaned_content == '{"result": 42}'
+
+    def test_cleaned_content_cleans_excessive_newlines(self):
+        """Test that cleaned_content normalizes excessive newlines."""
+        content = "<think>Thinking...</think>\n\n\n\n\nThe result"
+        msg = Message(content=content, role="assistant")
+        # Should not have more than 2 consecutive newlines
+        assert "\n\n\n" not in msg.cleaned_content
+        assert "The result" in msg.cleaned_content
+
+
+class TestMessageThinkingIntegration:
+    """Integration tests for thinking/cleaned_content with real-world examples."""
+
+    def test_qwen_style_response(self):
+        """Test parsing Qwen3-style response with think tags."""
+        content = """<think>
+Okay, the user wants a JSON with name and age for a fictional person.
+Let me create something reasonable.
+Name: Elena Voss
+Age: 34
+</think>
+
+{"name": "Elena Voss", "age": 34}"""
+        msg = Message(content=content, role="assistant")
+
+        assert msg.thinking is not None
+        assert "Elena Voss" in msg.thinking
+        assert "Age: 34" in msg.thinking
+
+        assert msg.cleaned_content == '{"name": "Elena Voss", "age": 34}'
+        assert "<think>" not in msg.cleaned_content
+
+    def test_empty_string_content(self):
+        """Test with empty string content."""
+        msg = Message(content="", role="assistant")
+        assert msg.thinking is None
+        assert msg.cleaned_content == ""
+
+    def test_original_content_unchanged(self):
+        """Test that original content property is unchanged."""
+        content = "<think>Reasoning</think>\n\nResult"
+        msg = Message(content=content, role="assistant")
+
+        # Original content should be preserved
+        assert msg.content == content
+        # Properties should parse it differently
+        assert msg.thinking == "Reasoning"
+        assert msg.cleaned_content == "Result"
diff --git a/uv.lock b/uv.lock