reward-shaping

Here are 39 public repositories matching this topic...

lcswillems / rl-starter-files

RL starter files to immediately train, visualize, and evaluate an agent without writing a single line of code

pytorch multi-process a3c minigrid ppo a2c reward-shaping preprocessed-observations

Updated May 12, 2024
Python

haizelabs / verdict

Inference-time scaling for LLMs-as-a-judge.

reward-shaping llm llm-as-a-judge test-time-compute inference-time-compute llm-judge test-time-scaling

Updated Nov 5, 2025
Jupyter Notebook

salesforce / MultiHopKG

Multi-hop knowledge graph reasoning learned via policy gradient with reward shaping and action dropout

reinforcement-learning pytorch knowledge-graph policy-gradient reward-shaping action-dropout multi-hop-reasoning

Updated Oct 6, 2025
Jupyter Notebook

lcswillems / torch-ac

Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms A2C and PPO

reinforcement-learning deep-reinforcement-learning pytorch recurrent-neural-networks multi-process a3c minigrid recurrent actor-critic proximal-policy-optimization ppo advantage-actor-critic a2c reward-shaping

Updated Oct 5, 2022
Python

philfung / awesome-reliable-robotics

Robotics research demonstrating reliability and robustness in the real world (continuously updated)

reinforcement-learning robotics robots manipulation imitation-learning manipulator-robotics robotic-arm fine-tuning reward-shaping vision-language-model

Updated Feb 23, 2026

yining043 / NeuOpt

This repo implements our paper, "Learning to Search Feasible and Infeasible Regions of Routing Problems with Flexible Neural k-Opt", which has been accepted at NeurIPS 2023.

reinforcement-learning deep-reinforcement-learning transformer vehicle-routing-problem vrp tsp neural-combinatorial-optimization k-opt reward-shaping learning-to-optimize

Updated Jul 24, 2024
Jupyter Notebook

kochlisGit / TraderNet-CRv2

TraderNet-CRv2 - Combining Deep Reinforcement Learning with Technical Analysis and Trend Monitoring on Cryptocurrency Markets

Updated Oct 2, 2023
Jupyter Notebook

csmile-1006 / ARP

Guide Your Agent with Adaptive Multimodal Rewards (NeurIPS 2023 Accepted)

reinforcement-learning deep-learning robotics imitation-learning reward-shaping vision-language

Updated Sep 25, 2023
Python

sidmohan0 / tesserack

Compiling strategy guides into reward functions for reinforcement learning. Uses Claude Vision to extract unit tests from game guides, then trains agents with dense, interpretable rewards.

machine-learning pokemon reinforcement-learning gameboy neural-network browser-game rl webgpu claude tensorflow-js reward-shaping llms

Updated Jan 30, 2026
JavaScript

niksaz / dota2-expert-demo

Dota 2 bot that is trained by Deep RL with expert demonstrations

reinforcement-learning tensorflow reward-shaping dota2-bot

Updated Oct 15, 2022
Python

holarissun / RewardShifting

Code for NeurIPS 2022 paper Exploiting Reward Shifting in Value-Based Deep RL

reinforcement-learning ensemble ensemble-learning rnd deep-q-network reward-design reward-shaping exploration-exploitation value-based-methods reward-engineering offline-reinforcement-learning dqn-rnd ensemble-rl

Updated Oct 29, 2023
Python

Digitalized-Energy-Systems / opfgym

A Gymnasium-compatible framework for creating reinforcement learning (RL) environments that solve the optimal power flow (OPF) problem. Contains five OPF benchmark environments for comparable research.

benchmark environment reinforcement-learning supervised-learning rl optimal-power-flow energy-system gymnasium opf pandapower contextual-bandit reward-design reward-shaping power-system environment-design action-shaping

Updated Mar 22, 2025
Python

mike-gimelfarb / bayesian-reward-shaping

Bayesian Reward Shaping Framework for Deep Reinforcement Learning

deep-reinforcement-learning bayesian-inference ensemble-model reward-shaping

Updated Mar 29, 2019
Python

tongzhoumu / DrS

Code for "DrS: Learning Reusable Dense Rewards for Multi-Stage Tasks"

reinforcement-learning robotics reward-shaping

Updated Apr 26, 2024
Python

awilliea / Risk-based_RL_for_Optimal_Trading_Execution

reinforcement-learning trading-bot multi-objective limit-order-book ddqn gym-environment reward-shaping execution-strategy order-placement tf-agents optimal-trading-execution morl

Updated Sep 18, 2020
Python

csmile-1006 / REDS_agent

Subtask-Aware Visual Reward Learning from Segmented Demonstrations (ICLR 2025 accepted)

reinforcement-learning visual-reinforcement-learning reward-shaping reward-learning reward-models

Updated Apr 11, 2025
Python

jbakams / slimebot-volleyball

3D gym environments to train RL agents to play the Slime Volleyball game in 3 dimensions using Webots as simulator.

reinforcement-learning 3d-environment incremental-learning webots gym-environment reward-shaping multi-agent-reinforcement-learning

Updated Aug 10, 2025
Python

sebastianbrzustowicz / Robot-Sumo-RL

Python + PyTorch. Advanced Reinforcement Learning (SAC/PPO/A2C) for ✨autonomous Robot Sumo combat featuring competitive self-play in continuous action spaces.