Claude/fix issue 417 chain of thought 011 c uu sh u85 jd b ngn qq2dc1 j #482
Open
ooples wants to merge 88 commits into master from claude/fix-issue-417-chain-of-thought-011CUuShU85JdBNgnQq2dc1J
Changes from 71 commits
ad1c2d4
Implement Chain-of-Thought and Advanced Reasoning Features (Issue #417)
claude 18329bb
fix: improve validation consistency and unicode handling in rag
ooples 7d46050
fix: complete all code quality improvements for rag advanced patterns
ooples a78b65c
perf: replace containskey with trygetvalue for single dictionary lookup
ooples abf199c
fix: restore net framework compatibility in rag advanced patterns
ooples 4d33429
fix: replace priorityqueue with list for net framework compatibility
ooples f7fd9c0
Merge branch 'master' into claude/fix-issue-417-chain-of-thought-011C…
ooples e752dcf
Refactor advanced reasoning retrievers to follow architecture guideli…
claude 319d6cd
Refactor VerifiedReasoningRetriever to inherit from RetrieverBase (Pa…
claude b6f1c8d
Merge master into chain-of-thought branch for continued development
claude 747ce07
Add foundational reasoning framework architecture
claude 3742afc
Add core reasoning component interfaces
claude 369efca
Add concrete reasoning implementations
claude 377bf90
Add core reasoning strategies and search algorithms
claude e72ee63
Add comprehensive verification and refinement system
claude 56dec6e
Add diversity sampling and contradiction detection
claude 2324cc4
Add domain-specific reasoners and benchmark infrastructure
claude 7e0cbf0
Add HumanEval benchmark and adaptive compute scaling
claude f36ed94
Add comprehensive reasoning framework documentation
claude 1552277
Add MATH benchmark and additional search algorithms
claude d6a6c8d
Add verification and reward model enhancements
claude 15d366a
Add ARC-AGI, MMLU, and MBPP benchmarks
claude af3483b
Add HellaSwag, BoolQ, PIQA, and WinoGrande benchmarks
claude 32d795e
Add TruthfulQA, LogiQA, DROP, and CommonsenseQA benchmarks
claude d5c0767
Add ScientificReasoner and LogicalReasoner domain experts
claude 358f939
Add complete RL training infrastructure
claude b37fb3e
Add comprehensive tests and documentation
claude 190f647
Add concrete implementations with real data loaders and runnable exam…
claude 866cbf6
Merge branch 'master' into claude/fix-issue-417-chain-of-thought-011C…
ooples f996a61
Merge branch 'master' into claude/fix-issue-417-chain-of-thought-011C…
ooples 2fd1525
Merge branch 'master' into claude/fix-issue-417-chain-of-thought-011C…
ooples 0f11053
fix: escape double quotes in LogiQABenchmark verbatim string
ooples 601dfbc
fix: resolve build errors - ThoughtNode ambiguity, IChatModel type pa…
ooples 680202b
fix: net462 compatibility - replace math.clamp with mathhelper.clamp …
ooples 93fa969
fix: net462 compatibility for data loaders - replace system.text.json…
ooples db35b3c
fix: correct reasoning config.default() method calls
ooples 5b57b6e
fix: correct triple question mark operator to double question mark
ooples fccf006
fix: resolve Chain, Dimension, JsonSerializer, ReasoningContext, Stri…
ooples ceab3a1
fix: resolve variable shadowing, async warnings, and nullable referen…
ooples 2f5787d
fix: resolve merge conflicts and nullable reference warnings
ooples af57ec9
fix: codeexecutionverifier waitforexit net462 compatibility
ooples 0b036bf
fix: escape quotes properly in logiqa benchmark verbatim string
ooples 0177eb5
fix: remove double-escaping in mbpp python code extraction regex
ooples c4f8c54
fix: add null guard for finalanswer in code reasoner language detection
ooples 2bcc57d
fix: add null check for finalanswer in process reward model
ooples de2cd5d
fix: add null check for reasoningcontext parameter in selfrefinemente…
ooples 4c2092c
fix: propagate cancellation token in processrewardmodel calculatestep…
ooples b542308
fix: add using system.linq directive to code reasoner for linq extens…
ooples 4f0a2ae
fix: use child.thought instead of root.thought when evaluating childr…
ooples 0d5e4c2
fix: use child.thought instead of root.thought in bestfirstsearch, mc…
ooples 68b67cc
fix: add null guards for generator, evaluator, and config in all sear…
ooples dab032a
fix: guard against empty problem sets in humaneval, gsm8k, and math b…
ooples c34eddb
fix: replace console.writeline with debug.writeline in data loaders f…
ooples 2124368
fix: add file existence check in gsm8k loadfromjsonarrayasync method
ooples de508a6
fix: pass rlconfig to reinforcementlearner constructor in training ex…
ooples 85723b7
fix: add thread-safe locking for reasoning trace in reasoningstrategy…
ooples a3caa82
fix: add cancellation token checking in outcomerewardmodel semantic s…
ooples a2ab0f4
fix: remove null-forgiving operator and add proper initialization in …
ooples a2f429a
fix: remove null-forgiving operator and add constructors with numops.…
ooples eca8eae
fix: remove null-forgiving operator from strategy and component classes
ooples a7a13b8
fix: ensure maxscalingfactor is at least 2.0 for monotonic hard-regio…
ooples 67efc27
fix: include argument parameter in evaluateargumentasync prompt
ooples 9360c9c
feat: implement production-ready checkpoint functionality for reinfor…
ooples 59c6ae8
fix: correct broken regex with invalid backreferences in contradictio…
ooples c26f1e9
fix: replace fragile findnodewiththought search with parent pointer …
ooples a951be2
fix: add process disposal and safe kill in code execution verifier
ooples b815a20
fix: correct sample data split to match getsampleproblems size
ooples 5a38fe4
fix: add unique counter to prevent sortedset from dropping duplicate …
ooples bc279aa
fix: add generic constraint to ensure t is numeric in mcts
ooples 95b6e14
fix: add validation for maxreasoningtimeseconds config property
ooples 0ef12df
fix: evaluate all child nodes against original query in beam search
ooples 50b2893
fix: remove positional bias in mcts child selection
ooples b95fccc
refactor: extract duplicated terminal detection to thoughtnode
ooples 6df678a
fix: restore training data when loading checkpoint
ooples ced5603
fix: treat verified zero scores as valid in processrewardmodel
ooples 952bcd1
fix: resolve 4 minor code quality issues
ooples 0d2b8b6
revert: remove incorrect generic constraint from mcts
ooples e3ae057
fix: add null safety checks and fix pattern asymmetry in contradictio…
ooples 23c534c
fix: remove duplicate strategy assignment and fix evaluation context …
ooples 00ae306
fix: add missing imports, improve error handling, and enhance fallbac…
ooples 0fb57e6
fix: add reproducible shuffling and cancellation token support
ooples 83771b5
fix: propagate cancellation token to contradiction detector llm calls
ooples 73f8581
fix: correct strategy names to match actual implementations
ooples f2af607
Merge branch 'master' into claude/fix-issue-417-chain-of-thought-011C…
ooples 66c110d
feat: implement production-ready cancellationtoken support in chat mo…
ooples e6f7f83
Merge branch 'master' into claude/fix-issue-417-chain-of-thought-011C…
ooples 373bff6
docs: add cancellationtoken xml documentation and fix mockchatmodel s…
ooples f75e0cc
Merge branch 'master' into claude/fix-issue-417-chain-of-thought-011C…
ooples
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,365 @@ | ||
| # Getting Started with AiDotNet Reasoning Framework | ||
|
|
||
| Welcome to the AiDotNet Reasoning Framework, a system for advanced AI reasoning in the style of models such as DeepSeek-R1 and OpenAI o1/o3. | ||
|
|
||
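| ## Table of Contents | ||
| - [Quick Start](#quick-start) | ||
| - [Installation](#installation) | ||
| - [Basic Concepts](#basic-concepts) | ||
| - [First Example](#first-example) | ||
| - [Next Steps](#next-steps) | ||
| - [Common Patterns](#common-patterns) | ||
| - [Troubleshooting](#troubleshooting) | ||
| - [Community & Support](#community--support) | ||
| - [What's Next?](#whats-next) | ||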
|
|
||
| ## Quick Start | ||
|
|
||
| ```csharp | ||
| using AiDotNet.Reasoning.Strategies; | ||
| using AiDotNet.Reasoning.Models; | ||
|
|
||
| // Initialize with your chat model | ||
| var chatModel = /* your IChatModel implementation */; | ||
| var strategy = new ChainOfThoughtStrategy<double>(chatModel); | ||
|
|
||
| // Solve a problem | ||
| var result = await strategy.ReasonAsync("What is 15 × 12?"); | ||
|
|
||
| Console.WriteLine($"Answer: {result.FinalAnswer}"); | ||
| Console.WriteLine($"Steps: {result.Chain.Steps.Count}"); | ||
| ``` | ||
|
|
||
| ## Installation | ||
|
|
||
| ### Prerequisites | ||
| - .NET 6.0 or higher | ||
| - A chat model implementation (OpenAI, Anthropic, etc.) | ||
|
|
||
| ### NuGet Package | ||
| ```bash | ||
| dotnet add package AiDotNet | ||
| ``` | ||
|
|
||
| ### From Source | ||
| ```bash | ||
| git clone https://github.com/ooples/AiDotNet.git | ||
| cd AiDotNet | ||
| dotnet build | ||
| ``` | ||
|
|
||
| ## Basic Concepts | ||
|
|
||
| ### 1. Reasoning Strategies | ||
|
|
||
| The framework provides three main reasoning strategies: | ||
|
|
||
| #### **Chain-of-Thought (CoT)** | ||
| Linear step-by-step reasoning - best for straightforward problems. | ||
|
|
||
| ```csharp | ||
| var cotStrategy = new ChainOfThoughtStrategy<double>(chatModel); | ||
| var result = await cotStrategy.ReasonAsync("Calculate the area of a circle with radius 5"); | ||
| ``` | ||
|
|
||
| #### **Self-Consistency** | ||
| Generates multiple reasoning paths and aggregates results - best for problems with multiple valid approaches. | ||
|
|
||
| ```csharp | ||
| var scStrategy = new SelfConsistencyStrategy<double>(chatModel); | ||
| var config = new ReasoningConfig { NumSamples = 5 }; | ||
| var result = await scStrategy.ReasonAsync("What is the capital of France?", config); | ||
| ``` | ||
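As a rough illustration of the aggregation step in Self-Consistency, the sketch below takes a majority vote over answers sampled from independent reasoning paths. `MajorityVote` is a hypothetical helper written for this example, not an AiDotNet API, and the framework's actual aggregation logic may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of AiDotNet): returns the most frequent answer
// and the fraction of samples that agree with it.
static (string Answer, double Agreement) MajorityVote(IReadOnlyList<string> answers)
{
    if (answers.Count == 0)
        throw new ArgumentException("At least one answer is required.", nameof(answers));

    // Group answers after trimming whitespace and ignoring case.
    var winner = answers
        .GroupBy(a => a.Trim(), StringComparer.OrdinalIgnoreCase)
        .OrderByDescending(g => g.Count())
        .First();

    return (winner.Key, (double)winner.Count() / answers.Count);
}

// Five sampled answers, standing in for NumSamples = 5.
var samples = new[] { "Paris", "paris", "Lyon", "Paris ", "Paris" };
var (answer, agreement) = MajorityVote(samples);
Console.WriteLine($"Answer: {answer}, agreement: {agreement:P0}");
```

The agreement ratio can serve as a simple confidence signal: a low value suggests the sampled paths disagree and that more samples may be worth the cost.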
|
|
||
| #### **Tree-of-Thoughts (ToT)** | ||
| Explores multiple paths with backtracking - best for complex problems requiring exploration. | ||
|
|
||
| ```csharp | ||
| var totStrategy = new TreeOfThoughtsStrategy<double>(chatModel); | ||
| var config = new ReasoningConfig { ExplorationDepth = 4, BranchingFactor = 3 }; | ||
| var result = await totStrategy.ReasonAsync("Solve this logic puzzle: ...", config); | ||
| ``` | ||
|
|
||
| ### 2. Configuration Presets | ||
|
|
||
| Choose the right preset for your use case: | ||
|
|
||
| ```csharp | ||
| // Fast: Quick answers for simple problems | ||
| var fastConfig = ReasoningConfig.Fast; // 3 steps, depth 2 | ||
|
|
||
| // Default: Balanced for most problems | ||
| var defaultConfig = ReasoningConfig.Default; // 10 steps, depth 3 | ||
|
|
||
| // Thorough: Deep exploration for hard problems | ||
| var thoroughConfig = ReasoningConfig.Thorough; // 20 steps, depth 5 | ||
| ``` | ||
|
|
||
| ### 3. Domain-Specific Reasoners | ||
|
|
||
| Use specialized reasoners for specific domains: | ||
|
|
||
| ```csharp | ||
| // Mathematics | ||
| var mathReasoner = new MathematicalReasoner<double>(chatModel); | ||
| var mathResult = await mathReasoner.SolveAsync("What is 347 + 892?"); | ||
|
|
||
| // Code Generation | ||
| var codeReasoner = new CodeReasoner<double>(chatModel); | ||
| var codeResult = await codeReasoner.GenerateCodeAsync( | ||
| "Write a function to find the factorial of n", | ||
| language: "python" | ||
| ); | ||
|
|
||
| // Science | ||
| var scienceReasoner = new ScientificReasoner<double>(chatModel); | ||
| var scienceResult = await scienceReasoner.SolveAsync( | ||
| "Calculate kinetic energy of 5kg object at 10m/s", | ||
| domain: "physics" | ||
| ); | ||
|
|
||
| // Logic | ||
| var logicReasoner = new LogicalReasoner<double>(chatModel); | ||
| var logicResult = await logicReasoner.SolveAsync( | ||
| "All A are B. All B are C. Therefore?", | ||
| logicType: "deductive" | ||
| ); | ||
| ``` | ||
|
|
||
| ## First Example | ||
|
|
||
| Let's build a complete example that solves a math problem with verification: | ||
|
|
||
| ```csharp | ||
| using System; | ||
| using System.Text; | ||
| using System.Threading.Tasks; | ||
| using AiDotNet.Reasoning.DomainSpecific; | ||
| using AiDotNet.Reasoning.Models; | ||
| using AiDotNet.Reasoning.Verification; | ||
|
|
||
| public class MathProblemSolver | ||
| { | ||
| private readonly IChatModel _chatModel; | ||
| private readonly MathematicalReasoner<double> _reasoner; | ||
| private readonly CalculatorVerifier<double> _verifier; | ||
|
|
||
| public MathProblemSolver(IChatModel chatModel) | ||
| { | ||
| _chatModel = chatModel; | ||
| _reasoner = new MathematicalReasoner<double>(chatModel); | ||
| _verifier = new CalculatorVerifier<double>(); | ||
| } | ||
|
|
||
| public async Task<string> SolveWithVerificationAsync(string problem) | ||
| { | ||
| // Step 1: Solve the problem | ||
| var result = await _reasoner.SolveAsync( | ||
| problem, | ||
| useVerification: true, | ||
| useSelfConsistency: false // Try setting to true for harder problems! | ||
| ); | ||
|
|
||
| if (!result.Success) | ||
| { | ||
| return $"Failed to solve: {result.ErrorMessage}"; | ||
| } | ||
|
|
||
| // Step 2: Verify the calculation | ||
| var verification = await _verifier.VerifyAsync(result.Chain); | ||
|
|
||
| // Step 3: Return results | ||
| var output = new StringBuilder(); | ||
| output.AppendLine($"Problem: {problem}"); | ||
| output.AppendLine($"\nReasoning Steps:"); | ||
|
|
||
| foreach (var step in result.Chain.Steps) | ||
| { | ||
| output.AppendLine($" {step.StepNumber}. {step.Content}"); | ||
| } | ||
|
|
||
| output.AppendLine($"\nFinal Answer: {result.FinalAnswer}"); | ||
| output.AppendLine($"Verification: {(verification.IsValid ? "✓ Correct" : "✗ Incorrect")}"); | ||
| output.AppendLine($"Confidence: {result.ConfidenceScore:P0}"); | ||
|
|
||
| return output.ToString(); | ||
| } | ||
| } | ||
|
|
||
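| // Usage (top-level statements must precede type declarations in C#, so place this in your entry point or a separate file) | ||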
| var chatModel = /* your chat model */; | ||
| var solver = new MathProblemSolver(chatModel); | ||
|
|
||
| var result = await solver.SolveWithVerificationAsync( | ||
| "A store has 347 apples. They sell 129 in the morning and 85 in the afternoon. How many apples are left?" | ||
| ); | ||
|
|
||
| Console.WriteLine(result); | ||
| ``` | ||
|
|
||
| **Example output** (model output will vary between runs): | ||
| ``` | ||
| Problem: A store has 347 apples... | ||
|
|
||
| Reasoning Steps: | ||
| 1. Start with initial amount: 347 apples | ||
| 2. Calculate morning sales: 347 - 129 = 218 | ||
| 3. Calculate afternoon sales: 218 - 85 = 133 | ||
|
|
||
| Final Answer: 133 apples | ||
| Verification: ✓ Correct | ||
| Confidence: 95% | ||
| ``` | ||
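To make the verification step in the example more concrete, here is a minimal sketch of the kind of arithmetic check a calculator-style verifier could perform on each reasoning step. `CheckArithmetic` is a hypothetical helper for illustration only. It handles a single binary expression per step and is not how `CalculatorVerifier` is necessarily implemented.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper (not an AiDotNet API): returns true or false when the step
// contains a binary expression "a op b = c", or null when there is nothing to check.
static bool? CheckArithmetic(string step)
{
    const string number = @"(-?\d+(?:\.\d+)?)";
    var match = Regex.Match(step, number + @"\s*([+\-*/])\s*" + number + @"\s*=\s*" + number);
    if (!match.Success)
        return null;

    double left = double.Parse(match.Groups[1].Value, CultureInfo.InvariantCulture);
    string op = match.Groups[2].Value;
    double right = double.Parse(match.Groups[3].Value, CultureInfo.InvariantCulture);
    double claimed = double.Parse(match.Groups[4].Value, CultureInfo.InvariantCulture);

    double actual = op switch
    {
        "+" => left + right,
        "-" => left - right,
        "*" => left * right,
        _ => left / right, // the regex only admits "/" here
    };

    // Compare with a small tolerance to absorb floating-point rounding.
    return Math.Abs(actual - claimed) < 1e-9;
}

var lines = new[]
{
    "Start with initial amount: 347 apples",
    "Calculate morning sales: 347 - 129 = 218",
    "Calculate afternoon sales: 218 - 85 = 133",
};

foreach (var line in lines)
{
    bool? verdict = CheckArithmetic(line);
    string label = verdict.HasValue ? (verdict.Value ? "ok" : "mismatch") : "nothing to check";
    Console.WriteLine($"{label}: {line}");
}
```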
|
|
||
| ## Next Steps | ||
|
|
||
| ### Learn More | ||
| - [API Documentation](./ApiReference.md) - Complete API reference | ||
| - [Tutorials](./Tutorials.md) - Step-by-step guides | ||
| - [Best Practices](./BestPractices.md) - Tips and patterns | ||
| - [Benchmarks](./Benchmarks.md) - Evaluation guide | ||
|
|
||
| ### Try These Examples | ||
| 1. **Solve GSM8K Math Problems**: See `examples/GSM8KExample.cs` | ||
| 2. **Generate Code with HumanEval**: See `examples/CodeGenerationExample.cs` | ||
| 3. **Train with Reinforcement Learning**: See `examples/RLTrainingExample.cs` | ||
| 4. **Build a Custom Reasoner**: See `examples/CustomReasonerExample.cs` | ||
|
|
||
| ### Key Features to Explore | ||
|
|
||
| #### 1. Verification System | ||
| ```csharp | ||
| // Critic-based verification | ||
| var criticModel = new CriticModel<double>(chatModel); | ||
| var critique = await criticModel.CritiqueStepAsync(step, context); | ||
|
|
||
| // Self-refinement | ||
| var refinementEngine = new SelfRefinementEngine<double>(chatModel); | ||
| var refined = await refinementEngine.RefineStepAsync(step, critique, context); | ||
| ``` | ||
|
|
||
| #### 2. Reward Models for RL | ||
| ```csharp | ||
| // Process Reward Model (step-by-step scoring) | ||
| var prm = new ProcessRewardModel<double>(chatModel); | ||
|
|
||
| // Outcome Reward Model (final answer scoring) | ||
| var orm = new OutcomeRewardModel<double>(chatModel); | ||
|
|
||
| // Hybrid (weighted combination of process and outcome scores) | ||
| var hybrid = new HybridRewardModel<double>(prm, orm, 0.5, 0.5); | ||
| ``` | ||
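One plausible reading of the two trailing `0.5` arguments is that they weight the process and outcome scores. The sketch below shows such a weighted combination under that assumption. `HybridReward` and its parameters are hypothetical names, and averaging the per-step scores is an illustrative choice rather than the library's confirmed behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not an AiDotNet API): combines the mean per-step score
// with the final-answer score using caller-supplied weights.
static double HybridReward(
    IReadOnlyList<double> stepScores,
    double outcomeScore,
    double processWeight,
    double outcomeWeight)
{
    // An empty chain contributes no process signal.
    double processScore = stepScores.Count > 0 ? stepScores.Average() : 0.0;
    return processWeight * processScore + outcomeWeight * outcomeScore;
}

// Three step scores and a fully correct final answer, weighted equally.
double reward = HybridReward(new[] { 0.8, 0.6, 1.0 }, outcomeScore: 1.0, processWeight: 0.5, outcomeWeight: 0.5);
Console.WriteLine($"Hybrid reward: {reward:F2}");
```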
|
|
||
| #### 3. Search Algorithms | ||
| ```csharp | ||
| // Monte Carlo Tree Search | ||
| var mcts = new MonteCarloTreeSearch<double>( | ||
| explorationConstant: 1.414, | ||
| simulationCount: 100 | ||
| ); | ||
|
|
||
| // Best-First Search | ||
| var bestFirst = new BestFirstSearch<double>(); | ||
|
|
||
| // Depth-First Search | ||
| var dfs = new DepthFirstSearch<double>(); | ||
| ``` | ||
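The `explorationConstant` of 1.414 is close to √2, the value commonly used in the standard UCT selection rule. The sketch below shows that rule in isolation. Whether `MonteCarloTreeSearch` uses exactly this formula is an assumption, and `Uct` is a hypothetical helper.

```csharp
using System;

// Standard UCT score: average value plus an exploration bonus that shrinks
// as a child is visited more often. Hypothetical helper, not an AiDotNet API.
static double Uct(double totalValue, int visits, int parentVisits, double explorationConstant)
{
    if (visits == 0)
        return double.PositiveInfinity; // unvisited children are tried first

    double exploitation = totalValue / visits;
    double exploration = explorationConstant * Math.Sqrt(Math.Log(parentVisits) / visits);
    return exploitation + exploration;
}

// Two children of a node visited 20 times.
double frequentlyVisited = Uct(totalValue: 6.0, visits: 10, parentVisits: 20, explorationConstant: 1.414);
double rarelyVisited = Uct(totalValue: 1.5, visits: 2, parentVisits: 20, explorationConstant: 1.414);
Console.WriteLine($"Frequently visited: {frequentlyVisited:F3}, rarely visited: {rarelyVisited:F3}");
```

A larger constant favors children with few visits, while a smaller one concentrates the search on children with high average value.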
|
|
||
| #### 4. Benchmarking | ||
| ```csharp | ||
| // Evaluate on GSM8K | ||
| var benchmark = new GSM8KBenchmark<double>(); | ||
| var results = await benchmark.EvaluateAsync( | ||
| async (problem) => { | ||
| var result = await reasoner.SolveAsync(problem); | ||
| return result.FinalAnswer; | ||
| }, | ||
| sampleSize: 100 | ||
| ); | ||
|
|
||
| Console.WriteLine($"Accuracy: {results.Accuracy:P2}"); | ||
| ``` | ||
|
|
||
| #### 5. Training with RL | ||
| ```csharp | ||
| var rewardModel = new HybridRewardModel<double>(prm, orm); | ||
| var learner = new ReinforcementLearner<double>(chatModel, rewardModel); | ||
|
|
||
| var trainingData = await LoadTrainingDataAsync(); | ||
| var validationData = await LoadValidationDataAsync(); | ||
|
|
||
| var results = await learner.TrainAsync(trainingData, validationData); | ||
| Console.WriteLine($"Best Accuracy: {results.BestAccuracy:P2}"); | ||
| ``` | ||
|
|
||
| ## Common Patterns | ||
|
|
||
| ### Pattern 1: Progressive Refinement | ||
| ```csharp | ||
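| // Illustrative cap on refinement rounds; tune for your workload | ||
| const int maxIterations = 3; | ||
| var iterations = 0; | ||
| var result = await strategy.ReasonAsync(problem); | ||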
|
|
||
| while (result.ConfidenceScore < 0.9 && iterations < maxIterations) | ||
| { | ||
| var critique = await critic.CritiqueChainAsync(result.Chain); | ||
| result = await refinement.RefineAsync(result, critique); | ||
| iterations++; | ||
| } | ||
| ``` | ||
|
|
||
| ### Pattern 2: Ensemble Reasoning | ||
| ```csharp | ||
| var strategies = new IReasoningStrategy<double>[] | ||
| { | ||
| new ChainOfThoughtStrategy<double>(chatModel), | ||
| new SelfConsistencyStrategy<double>(chatModel), | ||
| new TreeOfThoughtsStrategy<double>(chatModel) | ||
| }; | ||
|
|
||
| var results = await Task.WhenAll( | ||
| strategies.Select(s => s.ReasonAsync(problem)) | ||
| ); | ||
|
|
||
| var bestResult = results.OrderByDescending(r => r.ConfidenceScore).First(); | ||
| ``` | ||
|
|
||
| ### Pattern 3: Adaptive Compute | ||
| ```csharp | ||
| var scaler = new AdaptiveComputeScaler<double>(); | ||
| var difficulty = scaler.EstimateDifficulty(problem); | ||
| var config = scaler.ScaleConfig(problem, difficulty); | ||
|
|
||
| var result = await strategy.ReasonAsync(problem, config); | ||
| ``` | ||
|
|
||
| ## Troubleshooting | ||
|
|
||
| ### Issue: Low Confidence Scores | ||
| **Solution**: Use Self-Consistency or enable verification: | ||
| ```csharp | ||
| var config = new ReasoningConfig { NumSamples = 5 }; | ||
| var result = await scStrategy.ReasonAsync(problem, config); | ||
| ``` | ||
|
|
||
| ### Issue: Incomplete Reasoning | ||
| **Solution**: Increase max steps or use thorough config: | ||
| ```csharp | ||
| var config = ReasoningConfig.Thorough; // 20 steps instead of 10 | ||
| var result = await strategy.ReasonAsync(problem, config); | ||
| ``` | ||
|
|
||
| ### Issue: Wrong Answers | ||
| **Solution**: Enable verification, and consider the refinement loop from Pattern 1: | ||
| ```csharp | ||
| var result = await mathReasoner.SolveAsync(problem, useVerification: true); | ||
| ``` | ||
|
|
||
| ## Community & Support | ||
|
|
||
| - **Documentation**: https://docs.aidotnet.com | ||
| - **GitHub**: https://github.com/ooples/AiDotNet | ||
| - **Issues**: https://github.com/ooples/AiDotNet/issues | ||
| - **Discussions**: https://github.com/ooples/AiDotNet/discussions | ||
ooples marked this conversation as resolved.
|
||
|
|
||
| ## What's Next? | ||
|
|
||
| You're now ready to build advanced reasoning systems! Here are some ideas: | ||
|
|
||
| 1. **Build a Math Tutor**: Use MathematicalReasoner with step-by-step explanations | ||
| 2. **Create a Code Assistant**: Use CodeReasoner for code generation and debugging | ||
| 3. **Build a Logic Puzzle Solver**: Use LogicalReasoner with ToT strategy | ||
| 4. **Train Your Own Model**: Use the RL infrastructure to improve reasoning | ||
|
|
||
| Happy reasoning! 🚀 | ||