Merged
277 commits
bc5f451
fix: remove readonly from all RL agents and correct DeepReinforcement…
claude Nov 13, 2025
20aef17
fix: update all existing deep RL agents to inherit from DeepReinforce…
claude Nov 13, 2025
f4a4ce3
feat: add classical RL implementations (Tabular Q-Learning and SARSA)
claude Nov 13, 2025
1930a77
feat: add more classical RL algorithms (Expected SARSA, First-Visit MC)
claude Nov 13, 2025
13be43a
feat: add classical RL implementations (Expected SARSA, First-Visit MC)
claude Nov 13, 2025
28dff3a
feat: add n-step SARSA classical RL implementation
claude Nov 13, 2025
6d1a961
fix: update deep RL agents with .NET Framework compatibility and miss…
claude Nov 13, 2025
f43eec7
feat: add 5 classical RL implementations (MC and DP methods)
claude Nov 13, 2025
0fae751
feat: add Modified Policy Iteration (6/29 classical RL)
claude Nov 13, 2025
e948012
wip: add 15 options files and 1 agent for remaining classical RL algo…
claude Nov 13, 2025
51fdeb8
feat: add 3 eligibility trace algorithms (SARSA(λ), Q(λ), Watkins Q(λ))
claude Nov 13, 2025
b65b8c2
chore: prepare for final 12 classical RL algorithm implementations
claude Nov 13, 2025
17c41cc
feat: add 3 Planning algorithms (Dyna-Q, Dyna-Q+, Prioritized Sweeping)
claude Nov 13, 2025
80fbb36
feat: add 4 Bandit algorithms (ε-Greedy, UCB, Thompson Sampling, Grad…
claude Nov 13, 2025
4a76080
feat: add final 5 Advanced RL algorithms (Actor-Critic, Linear Q/SARS…
claude Nov 13, 2025
5090cc7
fix: use count instead of length for list assertion in uniform replay…
ooples Nov 13, 2025
85c3553
fix: correct loss function type name and collection syntax in td3options
ooples Nov 13, 2025
e1dc8ba
fix: correct loss function type name and collection syntax in ddpgopt…
ooples Nov 13, 2025
9a2f713
fix: validate ddpg options before base constructor call
ooples Nov 13, 2025
9584472
fix: validate double dqn options before base constructor and sync tar…
ooples Nov 13, 2025
dd6e242
fix: validate dqn options before base constructor call
ooples Nov 13, 2025
54b8aa3
fix: correct ornstein-uhlenbeck diffusion term sign
ooples Nov 13, 2025
4205cd6
fix: throw notsupportedexception in ddpg computegradients and applygr…
ooples Nov 13, 2025
d97191f
fix: return actual gradients not parameters in double dqn computegrad…
ooples Nov 13, 2025
3abe2a2
fix: apply gradient descent update in dueling dqn applygradients
ooples Nov 13, 2025
6c0ac1e
fix: return actual gradients not parameters in dueling dqn computegra…
ooples Nov 13, 2025
86b1b5b
fix: persist nextstate in trpo trajectory buffer
ooples Nov 13, 2025
9c87416
fix: run a3c workers sequentially to prevent environment corruption
ooples Nov 13, 2025
1d8c40b
fix: correct expectile gradient calculation in iql value function update
ooples Nov 13, 2025
bf36489
fix: apply correct mse gradient sign in iql q-network updates
ooples Nov 13, 2025
7e0f8b1
fix: include conservative penalty gradient in cql q-network updates
ooples Nov 13, 2025
fdc4830
fix: negate policy gradient for q-value maximization in cql
ooples Nov 13, 2025
e7b39cb
fix: mark sac policy gradient as not implemented with proper exception
ooples Nov 13, 2025
1506aaf
fix: mark reinforce policy gradient as not implemented with proper ex…
ooples Nov 13, 2025
0d3315d
fix: mark a2c as needing backpropagation implementation before updates
ooples Nov 13, 2025
86b905c
fix: mark a3c gradient computation as not implemented
ooples Nov 13, 2025
fe202df
fix: mark trpo policy update as not implemented with proper exception
ooples Nov 13, 2025
f9d5b21
fix: mark ddpg actor update as not implemented with proper exception
ooples Nov 13, 2025
93ee5bb
fix: remove unused aiDotNet.LossFunctions using directive from maddpg…
ooples Nov 13, 2025
cfef0bb
feat: implement production-ready reinforce policy gradient with prope…
ooples Nov 13, 2025
cddcf13
feat: implement production-ready a2c backpropagation with proper grad…
ooples Nov 13, 2025
7dbb227
feat: implement production-ready sac policy gradient with reparameter…
ooples Nov 13, 2025
065d366
feat: implement production-ready ddpg deterministic policy gradient
ooples Nov 13, 2025
404428a
feat: implement production-ready a3c gradient computation
ooples Nov 13, 2025
e04169d
feat: implement production-ready trpo importance-weighted policy grad…
ooples Nov 13, 2025
d59aace
fix: correct syntax errors - missing semicolon and params keyword
ooples Nov 13, 2025
b2ff5d7
fix: correct activation functions namespace import
ooples Nov 13, 2025
e39f13b
fix: net462 compatibility - add IsExternalInit shim and fix ambiguous…
ooples Nov 13, 2025
3ddd7aa
fix: remove duplicate SequenceContext class definition from DecisionT…
ooples Nov 13, 2025
c095e34
feat: implement Save/Load methods for SAC, REINFORCE, and A2C agents
ooples Nov 13, 2025
8b6c92e
fix: correct API method names and remove List<T> in Advanced RL agents
claude Nov 13, 2025
04e4128
docs: add comprehensive XML documentation to Advanced RL Options
claude Nov 13, 2025
0a729d7
fix: correct ModelMetadata properties in Advanced RL agents
claude Nov 13, 2025
1411dde
fix: batch replace incorrect API method names across all RL agents
claude Nov 13, 2025
564419b
fix: correct ModelMetadata properties across all RL agents
claude Nov 13, 2025
038eba1
fix: add IActivationFunction casts and fix collection expressions
claude Nov 13, 2025
62d13a8
fix: remove List<T> usage from GetParameters in 6 RL agents
claude Nov 13, 2025
52efe00
fix: remove redundant epsilon properties from 16 RL Options classes
claude Nov 13, 2025
66cd888
fix: qualify Experience type in SACAgent to resolve ambiguity
claude Nov 13, 2025
55b1a8b
fix: remove invalid override keywords from PredictAsync and TrainAsync
claude Nov 13, 2025
96d111d
fix: replace ReplayBuffer<T> with UniformReplayBuffer<T> and fix MCTS…
claude Nov 14, 2025
672b037
fix: rename Save/Load to SaveModel/LoadModel to match IModelSerialize…
claude Nov 14, 2025
427ebb6
fix: change base class to use Vector<T> instead of Matrix<T> and add …
claude Nov 14, 2025
1687bac
fix: add missing abstract method implementations to A3C, TD3, CQL, IQ…
claude Nov 14, 2025
ef8ce78
fix: correct Matrix/Vector usage in deep RL agent parameter methods
claude Nov 14, 2025
c8018f6
fix: correct Matrix/Vector usage in all remaining RL agent parameter …
claude Nov 14, 2025
25d7370
fix: correct GetActiveFeatureIndices and ComputeGradients signatures …
ooples Nov 14, 2025
27532b8
fix: update all RL agent ComputeGradients methods to return Vector<T>…
ooples Nov 14, 2025
ef9ce0a
fix: replace NumericOperations<T>.Instance with MathHelper.GetNumeric…
ooples Nov 14, 2025
7f878b4
fix: disambiguate denselayer constructor calls with explicit iactivat…
ooples Nov 14, 2025
2f956d5
fix: replace mathhelper exp log with numops exp log for generic type …
ooples Nov 14, 2025
2922e3c
fix: remove non-existent modelmetadata properties from rl agents
ooples Nov 14, 2025
e4977a9
fix: replace tasktype with neuralnetworktasktype for correct enum ref…
ooples Nov 14, 2025
7c085e7
fix: correct experience property names to capitalized (state/nextstat…
ooples Nov 14, 2025
a49ef3e
fix: replace updateweights with updateparameters for correct neural n…
ooples Nov 14, 2025
ba15a8b
fix: replace takelast with skip take pattern for net462 compatibility
ooples Nov 14, 2025
ae52038
fix: replace backward with backpropagate for correct neural network api
ooples Nov 14, 2025
545f758
fix: resolve actor-critic agents vector/tensor errors
ooples Nov 14, 2025
aa1f194
fix: resolve dqn family vector/tensor errors
ooples Nov 14, 2025
bb9c325
fix: resolve policy gradient agents vector/tensor errors
ooples Nov 14, 2025
ae14e77
fix: resolve cql agent vector/tensor conversion and api signature errors
ooples Nov 14, 2025
aae5129
fix: resolve constructor, type reference, and property errors
ooples Nov 14, 2025
93a77de
fix: resolve worldmodelsagent vector/tensor api conversion errors
ooples Nov 14, 2025
a048320
fix: resolve maddpg agent build errors - network architecture and ten…
ooples Nov 14, 2025
05eaaa6
fix: resolve planning agent computegradients vector/matrix type errors
ooples Nov 14, 2025
5ef84a2
fix: resolve epsilon greedy bandit agent matrix to vector conversion …
ooples Nov 14, 2025
2d37dad
fix: resolve ucb bandit agent matrix to vector conversion errors
ooples Nov 14, 2025
844c3d3
fix: resolve thompson sampling agent matrix to vector conversion errors
ooples Nov 14, 2025
f0d630c
fix: resolve gradient bandit agent matrix to vector conversion errors
ooples Nov 14, 2025
d7a4a40
fix: resolve qmix agent build errors - network architecture and tenso…
ooples Nov 14, 2025
e9823ec
fix: resolve monte carlo agent build errors - modeltype enum and vect…
ooples Nov 14, 2025
9558bf3
fix: resolve reinforce agent build errors - network architecture and …
ooples Nov 14, 2025
ca6fa32
fix: resolve sarsa lambda agent build errors - null assignment and lo…
ooples Nov 14, 2025
9a6ca7d
fix: apply batch fixes to rl agents - experience api and using direct…
ooples Nov 14, 2025
5391bad
fix: replace linearactivation with identityactivation and fix loss fu…
ooples Nov 14, 2025
e08cf16
fix: correct backpropagate calls to use single argument and initializ…
ooples Nov 14, 2025
c753cd6
fix: add activation function casts and fix experience property names …
ooples Nov 14, 2025
748d90a
fix: resolve 36 iqlAgent errors using proper api patterns
ooples Nov 14, 2025
8aed3ed
fix(rl): complete maddpgagent api migration to tensor-based neural ne…
ooples Nov 14, 2025
7fda0b0
fix(rl): complete td3agent api migration to tensor-based neural networks
ooples Nov 14, 2025
b13109b
fix(rl): complete a3c/trpo/sac/qmix api migration to tensor-based neu…
ooples Nov 14, 2025
d9ca7e9
fix(rl): complete muzero api migration and resolve remaining errors
ooples Nov 14, 2025
9446361
fix(rl): complete rainbowdqn api migration and resolve remaining errors
ooples Nov 14, 2025
9939a95
fix(rl): complete dreameragent api migration to tensor-based neural n…
ooples Nov 14, 2025
44beae2
fix(rl): complete batch api migration for duelingdqn and classical rl…
ooples Nov 14, 2025
17b1685
fix: resolve cs1503 type conversion errors in cql and ppo agents
ooples Nov 14, 2025
f973be0
fix: resolve CS8618 and CS1061 errors in reinforcement learning agent…
ooples Nov 14, 2025
7eca77b
fix: resolve all cs1061 missing member errors
ooples Nov 14, 2025
70f608f
fix: complete decisiontransformeragent tensor conversions and modelty…
ooples Nov 14, 2025
e70136f
fix: correct initializers in STLDecompositionOptions and ProphetOptions
ooples Nov 14, 2025
e8eb882
fix: resolve 32 errors in 4 RL agent files
ooples Nov 14, 2025
7362c9e
fix: resolve compilation errors in DDPG, QMIX, TRPO, MuZero, TabularQ…
ooples Nov 14, 2025
8365fb9
fix: manual error fixes for pr #481
ooples Nov 14, 2025
1e41347
feat: add core policy and exploration strategy interfaces
claude Nov 14, 2025
6d50476
feat: implement epsilon-greedy, gaussian noise, and no-exploration st…
claude Nov 14, 2025
8105265
feat: implement discrete and continuous policy classes
claude Nov 14, 2025
77b636b
feat: add policy options configuration classes
claude Nov 14, 2025
7c1659b
fix: correct numops usage and net462 compatibility in policy files
ooples Nov 14, 2025
22a0876
docs: add comprehensive policy base classes implementation prompt
ooples Nov 14, 2025
5be9a04
feat: add core policy and exploration strategy interfaces
claude Nov 14, 2025
f3d0128
feat: implement epsilon-greedy, gaussian noise, and no-exploration st…
claude Nov 14, 2025
d5c6bb6
feat: implement discrete and continuous policy classes
claude Nov 14, 2025
b8ad0a6
feat: add policy options configuration classes
claude Nov 14, 2025
90d070c
refactor: update policies and exploration strategies to inherit from …
claude Nov 14, 2025
160e890
feat: add advanced exploration strategies and policy implementations
claude Nov 14, 2025
5b6ebce
fix: update policy options classes with sensible default implementations
claude Nov 14, 2025
e49d4be
fix: pass vector<T> to cartpole step method in tests
ooples Nov 14, 2025
9d2b0dc
feat: complete comprehensive RL policy architecture
claude Nov 14, 2025
3a7b49a
fix: use vector<T> instead of tensor<T> in uniformreplaybuffertests
ooples Nov 14, 2025
8eda795
fix: remove epsilongreedypolicytests for non-existent type
ooples Nov 14, 2025
64fe6a7
docs: add comprehensive documentation to DiscretePolicyOptions and Co…
claude Nov 14, 2025
cbf5e8a
Merge remote-tracking branch 'origin/claude/pr-481-followup-01S72jB7k…
ooples Nov 14, 2025
a1a77d7
fix: complete production-ready fixes for qlambdaagent with all 6 issu…
ooples Nov 15, 2025
3d4f48a
fix: resolve all 6 critical issues in muzeroagent implementation
ooples Nov 15, 2025
6cf111c
fix: format predict method in duelingdqnagent for proper code structure
ooples Nov 15, 2025
81f933f
fix(rl): complete dreamer agent - all 9 pr review issues addressed
ooples Nov 15, 2025
309333b
fix(rl): complete agents 2-10 - all 47 pr review issues addressed
ooples Nov 15, 2025
520090d
fix(RL): implement agents 11-12 fixes (11 issues, 3 critical)
ooples Nov 15, 2025
44b33fb
fix(sarsa-lambda): implement serialization, fix clone, add random ins…
ooples Nov 15, 2025
bafd20b
fix(monte-carlo): implement serialization, fix clone, add random inst…
ooples Nov 15, 2025
e0b4595
fix: implement production fixes for sarsaagent (agent #16/17)
ooples Nov 15, 2025
fdb5955
fix(rl): address misc agent issues in dreameroptions, iql, td3, cartp…
ooples Nov 15, 2025
0a76c48
fix(q-learning): implement production fixes for doubleq, nstep, lstd …
ooples Nov 15, 2025
fd8721f
fix(monte-carlo): implement production fixes for first-visit, on-poli…
ooples Nov 15, 2025
86f68bc
fix(monte-carlo): complete production fixes for on-policy and every-v…
ooples Nov 15, 2025
c039bb1
fix(n-step-sarsa): implement production fixes for n-step sarsa agent …
ooples Nov 15, 2025
59c59a2
fix(dynamic-programming): implement production fixes for all dp agent…
ooples Nov 15, 2025
8691525
fix(qlambda): implement production fixes for q(lambda) agent (#28)
ooples Nov 15, 2025
4f414db
fix: implement planning agents production-ready fixes
ooples Nov 15, 2025
0653f0d
fix: implement bandit agents production-ready fixes
ooples Nov 15, 2025
915a7ff
fix: implement tabular qlearning production-ready fixes
ooples Nov 15, 2025
647388a
fix: add override keywords to policy class methods
ooples Nov 15, 2025
a91ec81
fix: correct policy dispose method hiding
ooples Nov 15, 2025
43b37ac
fix: remove empty dispose methods from policy classes
ooples Nov 15, 2025
ab3d1ad
fix: initialize random fields in bandit agents
ooples Nov 15, 2025
fe2d24d
fix: initialize _random field in bandit agent constructors
ooples Nov 15, 2025
058b8d4
fix: add validation to dreamer options and implement serialization fo…
ooples Nov 15, 2025
b3dc534
fix: preserve learned arm statistics in thompsonsampling clone
ooples Nov 15, 2025
2d9185a
fix: update setparameters to preserve qtable state keys
ooples Nov 15, 2025
403fe32
fix: implement PPO clipped objective with importance sampling ratio
ooples Nov 15, 2025
9e8630f
fix: implement serialization and clone for watkinsqlambda
ooples Nov 15, 2025
6e06267
fix: add previous action field to decisiontransformer trajectory buffer
ooples Nov 15, 2025
f655ae5
fix(ucb-bandit): clone now copies learned state (q-values, counts, st…
ooples Nov 15, 2025
41ad89e
fix(lspi): implement serialize/deserialize for model persistence
ooples Nov 15, 2025
53f7f27
fix: correct mcts backup to compute returns before updating qvalues
ooples Nov 15, 2025
8cf33bb
refactor: clarify setparameters logic in sarsaagent
ooples Nov 15, 2025
487858c
fix: use shared random instance from base class
ooples Nov 15, 2025
e1b68b5
fix: make td3options inherit from base and use init-only properties
ooples Nov 15, 2025
d61efd2
fix(modified-pi): normalize probabilities to prevent blow-up
ooples Nov 15, 2025
709ec33
fix: add validation for numarms in epsilongreedybandit
ooples Nov 15, 2025
f5f4a28
fix: implement deep copy of q-table in tabularqlearningagent clone me…
ooples Nov 15, 2025
94281c7
fix: preserve preferences and baseline in gradientbandit clone
ooples Nov 15, 2025
1528148
fix(n-step-q): epsilon decay per episode instead of per step
ooples Nov 15, 2025
52971aa
fix: use greedy action for n-step sarsa bootstrap value
ooples Nov 15, 2025
3cc93d3
fix: clear existing transitions before adding new one in deterministi…
ooples Nov 15, 2025
a704330
fix: copy network parameters in rainbowdqn clone
ooples Nov 15, 2025
ade3dc7
fix(qmix): implement proper td gradient flow through mixer and agents
ooples Nov 15, 2025
9c91db6
fix: implement applygradients for linear q-learning agent
ooples Nov 15, 2025
7c4595b
fix(trpo): correctly use nextstate from trajectory buffer
ooples Nov 15, 2025
a7cdd07
fix: make sac applygradients throw notsupportedexception
ooples Nov 15, 2025
cb66489
docs(sac-options): verify and document mse loss correctness
ooples Nov 15, 2025
daa8e1f
fix: add validation for nstepqlearningoptions properties
ooples Nov 15, 2025
cb193d0
docs(dreamer): clarify representation network training
ooples Nov 15, 2025
ba2bd67
docs(expected-sarsa): clarify deep copy in clone method
ooples Nov 15, 2025
d63237a
fix: initialize discountfactor in a3coptions constructor
ooples Nov 15, 2025
101bcfc
fix: lspiagent clone now copies learned weights and samples
ooples Nov 15, 2025
bc9ff09
fix: cql policy gradient now includes variance action gradient component
ooples Nov 15, 2025
cdea3a3
fix: dreamer gradient calculations for dynamics, representation, and …
ooples Nov 15, 2025
e5d7e6a
fix: modified policy iteration serialization now handles tuple transi…
ooples Nov 15, 2025
2af32f7
fix: expectedsarsa modelmetadata now includes featurecount and comple…
ooples Nov 16, 2025
652db72
fix: qmix agent serialization and save/load now properly implemented
ooples Nov 16, 2025
9d5c324
fix: prevent negative infinity in thompson sampling beta distribution
ooples Nov 16, 2025
c35daf2
fix: tabular qlearning setparameters now preserves state keys
ooples Nov 16, 2025
d0ccd86
fix: resolve 10 P0 critical issues for agent 5 work package
ooples Nov 16, 2025
327243e
fix: correct weighted importance sampling and watkins q lambda trace …
ooples Nov 16, 2025
df48c1a
fix: add maddpg validation and prevent a2c null options crash in base…
ooples Nov 16, 2025
54669d4
fix(dreamer): correct batch processing and gradient calculations
ooples Nov 16, 2025
9f0fc48
style(dreamer): remove fix comments from code
ooples Nov 16, 2025
2dec9f3
fix(maddpg): use target actors for target Q computation
ooples Nov 16, 2025
5e67039
fix(qmix): enforce monotonicity in mixing network weights
ooples Nov 16, 2025
18c322a
fix(dqn): sync target network in setparameters
ooples Nov 16, 2025
da35264
fix(sarsa): setparameters cannot restore Q-values without state keys
ooples Nov 16, 2025
6ca2e95
fix(doubleqlearning): setparameters cannot restore without state info…
ooples Nov 16, 2025
d50af0d
fix(decisiontransformer): use optimizer for gradient application
ooples Nov 16, 2025
eb23984
fix(nstepqlearning): setparameters cannot restore without state infor…
ooples Nov 16, 2025
7fbee16
style(epsilongreedybanditoptions): remove unused using directives
ooples Nov 16, 2025
498cbe8
style(sarsalambdaoptions): remove unused using directives
ooples Nov 16, 2025
9027482
fix(dynaq): applygradients throws exception for unsupported operation
ooples Nov 16, 2025
5d10250
fix(dreamer): applygradients throws exception for multi-network compl…
ooples Nov 16, 2025
d3ec55a
fix(montecarlo): savemodel/loadmodel throw exception for unsupported …
ooples Nov 16, 2025
73131cf
fix: remove resetgradients call from ddpg agent
ooples Nov 16, 2025
afd239d
fix: correct ppoagent cliprange property and linearsarsaagent deseria…
ooples Nov 16, 2025
386e34d
fix: add comprehensive null checks for td3agent discountfactor and mi…
ooples Nov 16, 2025
c289924
chore: remove investigation/report files and temporary scripts per CL…
ooples Nov 16, 2025
6905d24
Delete POLICY_BASE_CLASSES_PROMPT.md
ooples Nov 16, 2025
0af9c52
Delete src/ReinforcementLearning/INTEGRATION_PLAN.md
ooples Nov 16, 2025
c3e785a
Delete fix-ppo-rainbow-dueling-muzero.sh
ooples Nov 16, 2025
725e524
fix: resolve merge conflicts with master
ooples Nov 16, 2025
03ed1fd
fix: add savestate and loadstate methods to reinforcement learning ag…
ooples Nov 16, 2025
3d2ca7e
Merge branch 'claude/fix-issue-394-011CV3HkgfwwbaSAdrzrKd58' of https…
ooples Nov 16, 2025
ed16f3e
fix: add missing using directive for jsonconvert in lspi agent
ooples Nov 16, 2025
a8b244d
fix: add validation for state and action size in expected sarsa options
ooples Nov 16, 2025
5a932d9
fix: add defensive validation for expectedsarsaagent options
ooples Nov 16, 2025
e073989
fix: add constructor validation to expectedsarsaoptions
ooples Nov 16, 2025
e074d7a
fix: add deployment configuration to rl training path
ooples Nov 16, 2025
c297646
fix: replace dummy gradient vector with notsupportedexception in duel…
ooples Nov 16, 2025
e72d71e
fix: change applygradients to throw notsupportedexception in doubledq…
ooples Nov 17, 2025
eea9928
fix: preserve isfirstaction flag in montecarloexploringstarts clone m…
ooples Nov 17, 2025
bc6b317
fix: serialize/deserialize _isfirstaction flag in montecarloexploring…
ooples Nov 17, 2025
2801f59
fix: implement proper maddpg critic gradient descent updates
ooples Nov 17, 2025
e1b91ba
fix: throw notsupportedexception in maddpg applygradients
ooples Nov 17, 2025
873a506
fix: use base class random instance in everyvisitmontecarlo constructor
ooples Nov 17, 2025
e675b41
fix: use base random instance in firstvisitmontecarlo constructor
ooples Nov 17, 2025
2d47db4
fix: use base random instance in montecarloexploringstarts constructor
ooples Nov 17, 2025
790a1af
fix: use base class random in sarsalambdaagent
ooples Nov 17, 2025
72b7ef3
fix: add options validation in iql agent constructor
ooples Nov 17, 2025
e753aaa
fix: add options validation to maddpg agent constructor
ooples Nov 17, 2025
8928771
fix: pass seeded random to get normal random in iql agent
ooples Nov 17, 2025
32f6bf7
fix: add random parameter to mathhelper get normal random
ooples Nov 17, 2025
06ae821
fix: add validate method to iql options
ooples Nov 17, 2025
fa1840c
fix: use network.getgradients() for parameter updates in dqn and doub…
ooples Nov 17, 2025
b56650a
feat: add computeaverage helper method to firstvisitmontecarloagent
ooples Nov 17, 2025
367dc89
fix: implement proper deterministic policy gradient in maddpg actor u…
ooples Nov 17, 2025
776ca19
feat: include target networks in maddpg get/setparameters and synchro…
ooples Nov 17, 2025
f5f7ce2
fix: validate gradient vector length in dqn applygradients to prevent…
ooples Nov 17, 2025
f0fef0d
fix: increase state key precision from f4 to f8 in firstvisitmontecar…
ooples Nov 17, 2025
87612a2
fix(maddpg): implement per-agent reward tracking for competitive scen…
ooples Nov 17, 2025
39b6c3a
fix(maddpg): correct critic gradient application using per-parameter …
ooples Nov 17, 2025
401e9b4
fix(maddpg): use backpropagate return value for input gradients and c…
ooples Nov 17, 2025
471fc18
fix(maddpg): rename inner loop variable to avoid shadowing outer scop…
ooples Nov 17, 2025
This file was deleted.

14 changes: 14 additions & 0 deletions src/Compatibility/IsExternalInit.cs
@@ -0,0 +1,14 @@
// Compatibility shim for init-only setters in .NET Framework 4.6.2
// This type is required for C# 9+ init accessors to work in older frameworks
// See: https://github.com/dotnet/runtime/issues/45510

namespace System.Runtime.CompilerServices
{
    /// <summary>
    /// Reserved for use by the compiler for tracking metadata.
    /// This class allows the use of init-only setters in .NET Framework 4.6.2.
    /// </summary>
    internal static class IsExternalInit
    {
    }
}
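
For context, here is a minimal sketch of what this shim enables. It is not code from this PR: `ExampleOptions` and `LearningRate` are hypothetical names used only for illustration. The sketch assumes the project compiles with C# 9 or later and, when targeting net462, that the `IsExternalInit` shim above is part of the same assembly.

```csharp
using System;

// Hypothetical options class (not from this PR). The `init` accessor is the
// C# 9 feature that needs System.Runtime.CompilerServices.IsExternalInit,
// which .NET Framework 4.6.2 does not ship.
public class ExampleOptions
{
    // Settable only in an object initializer or a constructor.
    public double LearningRate { get; init; } = 0.001;
}

public static class Program
{
    public static void Main()
    {
        // Allowed: the value is assigned inside the object initializer.
        var options = new ExampleOptions { LearningRate = 0.01 };
        Console.WriteLine(options.LearningRate);

        // Not allowed: uncommenting the next line is a compile-time error,
        // because init-only properties cannot be reassigned after construction.
        // options.LearningRate = 0.5;
    }
}
```

On frameworks that already define `IsExternalInit` (.NET 5 and later), the shim is unnecessary. Multi-targeted projects commonly wrap it in a conditional compilation guard so the type is not declared twice.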