
Commit 23de918: init implementation

1 parent 2ab4e6e

File tree: 11 files changed, +1452 -1 lines

README.md

Lines changed: 2 additions & 1 deletion

```diff
@@ -343,7 +343,7 @@ Check this log file for connection issues, tool execution errors, and other diag
 | Approach                              | Slug               | Description                                                                                      |
 | ------------------------------------ | ------------------ | ---------------------------------------------------------------------------------------------- |
-| Cerebras Planning and Optimization    | `cepo`             | Combines Best of N, Chain-of-Thought, Self-Reflection, Self-Improvement, and various prompting techniques |
+| Cerebras Planning and Optimization    | `cepo`             | Combines Best of N, Chain-of-Thought, Self-Reflection, Self-Improvement, and various prompting techniques |
 | CoT with Reflection                   | `cot_reflection`   | Implements chain-of-thought reasoning with \<thinking\>, \<reflection\> and \<output\> sections |
 | PlanSearch                            | `plansearch`       | Implements a search algorithm over candidate plans for solving a problem in natural language    |
 | ReRead                                | `re2`              | Implements rereading to improve reasoning by processing queries twice                           |
@@ -359,6 +359,7 @@ Check this log file for connection issues, tool execution errors, and other diag
 | CoT Decoding                          | N/A for proxy      | Implements chain-of-thought decoding to elicit reasoning without explicit prompting             |
 | Entropy Decoding                      | N/A for proxy      | Implements adaptive sampling based on the uncertainty of tokens during generation               |
 | Thinkdeeper                           | N/A for proxy      | Implements the `reasoning_effort` param from OpenAI for reasoning models like DeepSeek R1       |
+| AutoThink                             | N/A for proxy      | Combines query complexity classification with steering vectors to enhance reasoning            |

 ## Implemented plugins
```

optillm/autothink/README.md

Lines changed: 95 additions & 0 deletions
# AutoThink

AutoThink is an adaptive thinking approach for Large Language Models that combines query complexity classification with steering vector guidance to enhance model reasoning capabilities.

## Overview

AutoThink combines several advanced techniques to optimize the thinking process of LLMs:

1. **Query Complexity Classification**: Uses an adaptive classifier to determine if a query requires HIGH or LOW complexity reasoning
2. **Token Budget Allocation**: Dynamically allocates thinking tokens based on query complexity
3. **Steering Vector Guidance**: Applies activation-based steering vectors to guide the model's reasoning process
4. **Controlled Thinking Process**: Manages explicit thinking phases with start and end tokens

## How It Works

### 1. Query Classification

AutoThink uses the `adaptive-classifier/llm-router` model to classify incoming queries:

- **HIGH**: Complex queries requiring deep reasoning, multi-step calculations, or thorough exploration
- **LOW**: Simpler queries requiring less extensive reasoning

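The classification can be exercised directly through the `ComplexityClassifier` wrapper added in this commit's `classifier.py` (requires the `adaptive-classifier` package; the printed scores are illustrative):

```python
from optillm.autothink.classifier import ComplexityClassifier

clf = ComplexityClassifier("adaptive-classifier/llm-router")

# predict() returns (label, score) tuples sorted by confidence,
# e.g. [("HIGH", 0.83), ("LOW", 0.17)]
print(clf.predict("Prove that the product of two odd integers is odd."))
print(clf.is_high_complexity("What is the capital of France?"))  # likely False
```
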
### 2. Token Budget

Based on the classification, AutoThink allocates different token budgets for the thinking phase:

- **HIGH**: 70-90% of max tokens allocated for thinking
- **LOW**: 20-40% of max tokens allocated for thinking

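The budget selection itself happens in the processor, which is not part of the excerpt shown here; a minimal sketch of the idea, assuming only the min/max token keys from the Configuration section below:

```python
def thinking_budget(is_high_complexity: bool, config: dict) -> tuple[int, int]:
    """Illustrative sketch: pick a (min, max) thinking-token range by complexity."""
    if is_high_complexity:
        return (config.get("high_complexity_min_tokens", 1024),
                config.get("high_complexity_max_tokens", 4096))
    return (config.get("low_complexity_min_tokens", 256),
            config.get("low_complexity_max_tokens", 1024))
```
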
### 3. Steering Vectors

AutoThink uses pre-extracted steering vectors from datasets like `codelion/Qwen3-0.6B-pts-steering-vectors`. These vectors represent different reasoning patterns:

- **Depth and thoroughness**: Encourages detailed, step-by-step reasoning
- **Numerical accuracy**: Promotes precise calculations and verification
- **Self-correction**: Facilitates error detection and correction
- **Exploration**: Supports considering multiple approaches
- **Organization**: Improves logical structure in responses

During inference, the model's internal activations are modified based on these vectors to enhance specific reasoning capabilities.

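The injection mechanics also live in the processor; conceptually, a steering vector is added to the hidden states of the configured `target_layer` during the forward pass. A minimal PyTorch sketch of that idea (illustrative only, not the exact implementation; the layer path varies by architecture):

```python
import torch

def make_steering_hook(vector: torch.Tensor, strength: float):
    """Return a forward hook that adds a scaled steering vector to a layer's hidden states."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + strength * vector.to(hidden.device, hidden.dtype)
        return (steered,) + output[1:] if isinstance(output, tuple) else steered
    return hook

# Illustrative attachment to layer 19 of a Qwen/Llama-style HF model:
# handle = model.model.layers[19].register_forward_hook(make_steering_hook(vec, 2.5))
# ... generate ...
# handle.remove()
```
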
### 4. Controlled Thinking Process

The generation process includes:

1. A thinking phase marked by `<think>` and `</think>` tokens
2. Automatic adjustment of thinking time based on query complexity
3. Dynamic application of steering vectors
4. Graceful transition to the final response

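Put together, a completion has roughly this shape (illustrative):

```
<think>
...complexity-appropriate reasoning, steered toward the patterns above...
</think>
The final answer presented to the user.
```
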
## Configuration

AutoThink can be configured with:

```python
{
    "model_name": "your-model-name",
    "classifier_model": "adaptive-classifier/llm-router",
    "steering_dataset": "codelion/Qwen3-0.6B-pts-steering-vectors",
    "target_layer": 19,  # Layer to apply steering vectors
    "high_complexity_min_tokens": 1024,
    "high_complexity_max_tokens": 4096,
    "low_complexity_min_tokens": 256,
    "low_complexity_max_tokens": 1024,
    "pattern_strengths": {  # Steering strength for each pattern
        "depth_and_thoroughness": 2.5,
        "numerical_accuracy": 2.0,
        "self_correction": 3.0,
        "exploration": 2.0,
        "organization": 1.5
    }
}
```

74+
## Usage
75+
76+
```python
77+
from optillm.autothink import autothink_decode
78+
79+
response = autothink_decode(
80+
model,
81+
tokenizer,
82+
messages,
83+
{
84+
"steering_dataset": "codelion/Qwen3-0.6B-pts-steering-vectors",
85+
"target_layer": 19
86+
}
87+
)
88+
```
89+
90+
## Benefits

- **Adaptive Resource Usage**: Models think more on complex problems and less on simple ones
- **Enhanced Reasoning**: Steering vectors guide the model toward better reasoning patterns
- **Efficiency**: Better performance without increasing model size
- **Customizability**: Can be tailored to different domains using domain-specific steering vector datasets

optillm/autothink/__init__.py

Lines changed: 7 additions & 0 deletions

```python
"""
AutoThink - Adaptive thinking approach for LLMs with query complexity classification and steering vectors.
"""

from .autothink import autothink_decode, AutoThinkProcessor

__all__ = ["autothink_decode", "AutoThinkProcessor"]
```

optillm/autothink/autothink.py

Lines changed: 88 additions & 0 deletions

```python
"""
AutoThink main implementation.

This module provides the main implementation of AutoThink, combining
query complexity classification with steering vectors to enhance reasoning.
"""

import logging
from typing import Dict, List, Any, Optional

from transformers import PreTrainedModel, PreTrainedTokenizer

# Alias the internal processor so the public wrapper class below does not
# shadow it; without the alias, _create_processor would instantiate the
# wrapper itself with the wrong argument order.
from .processor import AutoThinkProcessor as _InternalAutoThinkProcessor

logger = logging.getLogger(__name__)

class AutoThinkProcessor:
    """
    Main AutoThink processor class for external use.
    Wraps the internal processor implementation.
    """

    def __init__(self, model: PreTrainedModel, tokenizer: PreTrainedTokenizer, config: Optional[Dict[str, Any]] = None):
        """
        Initialize the AutoThink processor.

        Args:
            model: Language model
            tokenizer: Model tokenizer
            config: Configuration dictionary
        """
        self.config = config or {}
        self.processor = None
        self.model = model
        self.tokenizer = tokenizer

    def __call__(self, messages: List[Dict[str, str]]) -> str:
        """
        Process messages with AutoThink's controlled thinking.

        Args:
            messages: List of message dictionaries

        Returns:
            Generated response
        """
        # Create processor on first use to allow for model loading
        if self.processor is None:
            self.processor = self._create_processor()

        return self.processor.process(messages)

    def _create_processor(self):
        """Create the internal processor instance."""
        return _InternalAutoThinkProcessor(self.config, self.tokenizer, self.model)

def autothink_decode(
    model: PreTrainedModel,
    tokenizer: PreTrainedTokenizer,
    messages: List[Dict[str, str]],
    request_config: Optional[Dict[str, Any]] = None
) -> str:
    """
    Main plugin execution function with AutoThink's controlled thinking process.

    Args:
        model: Language model
        tokenizer: Model tokenizer
        messages: List of message dictionaries
        request_config: Optional configuration dictionary

    Returns:
        Generated response with thinking process
    """
    logger.info("Starting AutoThink processing")

    # Build the config dictionary from the optional request config
    config = {}
    if request_config:
        config.update(request_config)

    try:
        processor = AutoThinkProcessor(model, tokenizer, config)
        return processor(messages)
    except Exception as e:
        logger.error(f"Error in AutoThink processing: {str(e)}")
        raise
```
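
For completeness, a sketch of driving the wrapper class directly instead of calling `autothink_decode`; the checkpoint here is only an example, chosen to match the steering-vector dataset referenced above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from optillm.autothink import AutoThinkProcessor

# Example model choice (assumption): a Qwen3-0.6B checkpoint to pair with
# the codelion/Qwen3-0.6B-pts-steering-vectors dataset.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")

# The internal processor is created lazily on the first call
processor = AutoThinkProcessor(model, tokenizer, {"target_layer": 19})
reply = processor([{"role": "user", "content": "Explain why the sky is blue."}])
print(reply)
```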

optillm/autothink/classifier.py

Lines changed: 152 additions & 0 deletions

```python
"""
Query complexity classifier for AutoThink.

This module provides functionality to classify queries as HIGH or LOW complexity
using the adaptive-classifier model.
"""

import logging
import os
import sys
from typing import Dict, Any, Tuple, Optional, List, Union

logger = logging.getLogger(__name__)

class ComplexityClassifier:
    """
    Classifies queries as HIGH or LOW complexity for token budget allocation.
    Uses the adaptive-classifier model for classification.
    """

    def __init__(self, model_name: str = "adaptive-classifier/llm-router"):
        """
        Initialize the complexity classifier.

        Args:
            model_name: HuggingFace model name or path for the classifier
        """
        self.model_name = model_name
        self.classifier = None

        # Load model
        self._load_model()

    def _load_model(self):
        """Load the classification model using the adaptive-classifier library."""
        try:
            # Install adaptive-classifier on the fly if it is missing
            try:
                import adaptive_classifier
            except ImportError:
                logger.info("Installing adaptive-classifier library...")
                os.system(f"{sys.executable} -m pip install adaptive-classifier")
                import adaptive_classifier

            from adaptive_classifier import AdaptiveClassifier

            logger.info(f"Loading complexity classifier model: {self.model_name}")
            self.classifier = AdaptiveClassifier.from_pretrained(self.model_name)
            logger.info("Classifier loaded successfully")

        except Exception as e:
            logger.error(f"Error loading complexity classifier: {e}")
            # Fall back to heuristic classification if the model fails to load
            self.classifier = None

    def predict(self, text: str) -> List[Tuple[str, float]]:
        """
        Predict the complexity label for a given text.

        Args:
            text: The query text to classify

        Returns:
            List of (label, score) tuples sorted by confidence
        """
        if self.classifier is None:
            logger.warning("Classifier not loaded. Using fallback classification.")
            return self._fallback_classification(text)

        try:
            # Make prediction using the AdaptiveClassifier
            predictions = self.classifier.predict(text)
            logger.debug(f"Classifier predictions: {predictions}")

            # Make sure predictions are in the expected format
            if isinstance(predictions, list) and all(isinstance(p, tuple) and len(p) == 2 for p in predictions):
                # Sort by confidence (higher score = higher confidence)
                predictions.sort(key=lambda x: x[1], reverse=True)
                return predictions
            else:
                logger.warning(f"Unexpected prediction format: {predictions}")
                return self._fallback_classification(text)

        except Exception as e:
            logger.error(f"Error during classification: {e}")
            return self._fallback_classification(text)

    def _fallback_classification(self, text: str) -> List[Tuple[str, float]]:
        """
        Simple heuristic classification when the model isn't available.

        Args:
            text: The query text

        Returns:
            List of (label, score) tuples
        """
        # Keywords that tend to indicate a complex query
        complexity_indicators = [
            "explain", "analyze", "compare", "evaluate", "synthesize",
            "how", "why", "complex", "detail", "thorough", "comprehensive",
            "step by step", "calculate", "prove", "justify", "multiple",
            "consequences", "implications", "differentiate", "frameworks"
        ]

        # Count mentions of complexity indicators
        count = sum(1 for indicator in complexity_indicators if indicator.lower() in text.lower())

        # Calculate complexity probability based on count and text length
        text_length_factor = min(len(text) / 100, 2.0)  # Cap at 2.0
        indicator_factor = min(count / 3, 1.5)          # Cap at 1.5

        # Combined factor determines HIGH vs LOW
        complexity_score = text_length_factor * indicator_factor

        if complexity_score > 1.0:
            return [("HIGH", 0.7), ("LOW", 0.3)]
        else:
            return [("LOW", 0.8), ("HIGH", 0.2)]

    def is_high_complexity(self, text: str, threshold: float = 0.5) -> bool:
        """
        Determine if a query is high complexity.

        Args:
            text: The query text
            threshold: Confidence threshold for HIGH classification

        Returns:
            Boolean indicating if the query is high complexity
        """
        predictions = self.predict(text)

        for label, score in predictions:
            if label == "HIGH" and score >= threshold:
                return True

        return False

    def get_complexity_with_confidence(self, text: str) -> Tuple[str, float]:
        """
        Get the complexity label and confidence score.

        Args:
            text: The query text

        Returns:
            Tuple of (complexity_label, confidence_score)
        """
        predictions = self.predict(text)
        return predictions[0]  # Return highest-confidence prediction
```
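
As a worked example of the fallback heuristic: a 150-character query containing two indicator words scores `text_length_factor = min(150/100, 2.0) = 1.5` and `indicator_factor = min(2/3, 1.5) ≈ 0.67`, so `complexity_score ≈ 1.0`, which does not exceed the 1.0 threshold and is classified LOW. A 200-character query with three indicators scores `2.0 * 1.0 = 2.0` and is classified HIGH.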
