Commit fc34517

Add support for decision transformer (#795)
* Add support for decision transformer (Closes #794)
* Comment out supported decision transformer models
  Models are in the `onnx-community` org on HF
1 parent 52e6489 commit fc34517

5 files changed: +32 -2 lines changed

README.md

Lines changed: 2 additions & 1 deletion
@@ -250,7 +250,7 @@ You can refine your search by selecting the task you're interested in (e.g., [te

 | Task | ID | Description | Supported? |
 |--------------------------|----|-------------|------------|
-| [Reinforcement Learning](https://huggingface.co/tasks/reinforcement-learning) | n/a | Learning from actions by interacting with an environment through trial and error and receiving rewards (negative or positive) as feedback. | |
+| [Reinforcement Learning](https://huggingface.co/tasks/reinforcement-learning) | n/a | Learning from actions by interacting with an environment through trial and error and receiving rewards (negative or positive) as feedback. | |

@@ -276,6 +276,7 @@ You can refine your search by selecting the task you're interested in (e.g., [te
 1. **[ConvNeXTV2](https://huggingface.co/docs/transformers/model_doc/convnextv2)** (from Facebook AI) released with the paper [ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders](https://arxiv.org/abs/2301.00808) by Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie.
 1. **[DeBERTa](https://huggingface.co/docs/transformers/model_doc/deberta)** (from Microsoft) released with the paper [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
 1. **[DeBERTa-v2](https://huggingface.co/docs/transformers/model_doc/deberta-v2)** (from Microsoft) released with the paper [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
+1. **[Decision Transformer](https://huggingface.co/docs/transformers/model_doc/decision_transformer)** (from Berkeley/Facebook/Google) released with the paper [Decision Transformer: Reinforcement Learning via Sequence Modeling](https://arxiv.org/abs/2106.01345) by Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch.
 1. **[DeiT](https://huggingface.co/docs/transformers/model_doc/deit)** (from Facebook) released with the paper [Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877) by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
 1. **[Depth Anything](https://huggingface.co/docs/transformers/main/model_doc/depth_anything)** (from University of Hong Kong and TikTok) released with the paper [Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data](https://arxiv.org/abs/2401.10891) by Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao.
 1. **[DETR](https://huggingface.co/docs/transformers/model_doc/detr)** (from Facebook) released with the paper [End-to-End Object Detection with Transformers](https://arxiv.org/abs/2005.12872) by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko.

docs/snippets/5_supported-tasks.snippet

Lines changed: 1 addition & 1 deletion
@@ -67,4 +67,4 @@

 | Task | ID | Description | Supported? |
 |--------------------------|----|-------------|------------|
-| [Reinforcement Learning](https://huggingface.co/tasks/reinforcement-learning) | n/a | Learning from actions by interacting with an environment through trial and error and receiving rewards (negative or positive) as feedback. | |
+| [Reinforcement Learning](https://huggingface.co/tasks/reinforcement-learning) | n/a | Learning from actions by interacting with an environment through trial and error and receiving rewards (negative or positive) as feedback. | |

docs/snippets/6_supported-models.snippet

Lines changed: 1 addition & 0 deletions
@@ -21,6 +21,7 @@
 1. **[ConvNeXTV2](https://huggingface.co/docs/transformers/model_doc/convnextv2)** (from Facebook AI) released with the paper [ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders](https://arxiv.org/abs/2301.00808) by Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie.
 1. **[DeBERTa](https://huggingface.co/docs/transformers/model_doc/deberta)** (from Microsoft) released with the paper [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
 1. **[DeBERTa-v2](https://huggingface.co/docs/transformers/model_doc/deberta-v2)** (from Microsoft) released with the paper [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
+1. **[Decision Transformer](https://huggingface.co/docs/transformers/model_doc/decision_transformer)** (from Berkeley/Facebook/Google) released with the paper [Decision Transformer: Reinforcement Learning via Sequence Modeling](https://arxiv.org/abs/2106.01345) by Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch.
 1. **[DeiT](https://huggingface.co/docs/transformers/model_doc/deit)** (from Facebook) released with the paper [Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877) by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
 1. **[Depth Anything](https://huggingface.co/docs/transformers/main/model_doc/depth_anything)** (from University of Hong Kong and TikTok) released with the paper [Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data](https://arxiv.org/abs/2401.10891) by Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao.
 1. **[DETR](https://huggingface.co/docs/transformers/model_doc/detr)** (from Facebook) released with the paper [End-to-End Object Detection with Transformers](https://arxiv.org/abs/2005.12872) by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko.

scripts/supported_models.py

Lines changed: 16 additions & 0 deletions
@@ -299,6 +299,22 @@
             'sileod/deberta-v3-large-tasksource-nli',
         ],
     },
+    # TODO: Add back in v3
+    # 'decision-transformer': {
+    #     # Reinforcement learning
+    #     'reinforcement-learning': [
+    #         'edbeeching/decision-transformer-gym-hopper-expert',
+    #         'edbeeching/decision-transformer-gym-hopper-medium',
+    #         'edbeeching/decision-transformer-gym-hopper-medium-replay',
+    #         'edbeeching/decision-transformer-gym-hopper-expert-new',
+    #         'edbeeching/decision-transformer-gym-halfcheetah-expert',
+    #         'edbeeching/decision-transformer-gym-halfcheetah-medium',
+    #         'edbeeching/decision-transformer-gym-halfcheetah-medium-replay',
+    #         'edbeeching/decision-transformer-gym-walker2d-expert',
+    #         'edbeeching/decision-transformer-gym-walker2d-medium',
+    #         'edbeeching/decision-transformer-gym-walker2d-medium-replay',
+    #     ],
+    # },
     'deit': {
         # Image classification
         'image-classification': [

src/models.js

Lines changed: 12 additions & 0 deletions
@@ -5481,6 +5481,17 @@ export class EfficientNetForImageClassification extends EfficientNetPreTrainedMo
 }
 //////////////////////////////////////////////////

+//////////////////////////////////////////////////
+// Decision Transformer models
+export class DecisionTransformerPreTrainedModel extends PreTrainedModel { }
+
+/**
+ * The model builds upon the GPT2 architecture to perform autoregressive prediction of actions in an offline RL setting.
+ * Refer to the paper for more details: https://arxiv.org/abs/2106.01345
+ */
+export class DecisionTransformerModel extends DecisionTransformerPreTrainedModel { }
+
+//////////////////////////////////////////////////

 //////////////////////////////////////////////////
 // AutoModels, used to simplify construction of PreTrainedModels
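
The doc comment on `DecisionTransformerModel` mentions autoregressive action prediction in an offline RL setting. In the referenced paper, the model is conditioned on returns-to-go, meaning the sum of the current and all future rewards at each timestep. The sketch below shows only that bookkeeping, assuming undiscounted rewards. `returnsToGo` is a hypothetical helper written for illustration and is not part of this commit or of the library.

```javascript
// Hypothetical helper (not part of the commit): compute undiscounted
// returns-to-go, i.e. for each timestep t the sum of rewards[t..end].
function returnsToGo(rewards) {
  const out = new Array(rewards.length);
  let running = 0;
  // Walk backwards so each entry reuses the sum of the later rewards.
  for (let t = rewards.length - 1; t >= 0; --t) {
    running += rewards[t];
    out[t] = running;
  }
  return out;
}

// Example trajectory with four rewards.
const exampleRewards = [1, 0, 2, 3];
console.log(returnsToGo(exampleRewards)); // [ 6, 5, 5, 3 ]
```

The backward pass keeps the computation linear in the trajectory length, rather than re-summing the tail for every timestep.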
@@ -5607,6 +5618,7 @@ const MODEL_MAPPING_NAMES_ENCODER_ONLY = new Map([
     ['hifigan', ['SpeechT5HifiGan', SpeechT5HifiGan]],
     ['efficientnet', ['EfficientNetModel', EfficientNetModel]],

+    ['decision_transformer', ['DecisionTransformerModel', DecisionTransformerModel]],
 ]);

 const MODEL_MAPPING_NAMES_ENCODER_DECODER = new Map([
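
The new `decision_transformer` entry follows the same shape as its neighbours: a model type string mapped to a `[className, class]` pair. The diff does not show how the library consumes this map, so the following is only a minimal sketch of one plausible lookup. The stand-in classes, the `MODEL_MAPPING` constant, and the `resolveModelClass` function are all hypothetical names introduced for illustration.

```javascript
// Stand-in classes for illustration; in the commit the real
// DecisionTransformerModel ultimately extends PreTrainedModel.
class EfficientNetModel {}
class DecisionTransformerModel {}

// Same entry shape as in the diff: model type -> [class name, class].
const MODEL_MAPPING = new Map([
  ['efficientnet', ['EfficientNetModel', EfficientNetModel]],
  ['decision_transformer', ['DecisionTransformerModel', DecisionTransformerModel]],
]);

// Hypothetical resolver: look up the class registered for a model type.
function resolveModelClass(modelType) {
  const entry = MODEL_MAPPING.get(modelType);
  if (!entry) {
    throw new Error(`Unsupported model type: ${modelType}`);
  }
  const [, modelClass] = entry;
  return modelClass;
}

const ResolvedClass = resolveModelClass('decision_transformer');
console.log(ResolvedClass.name); // "DecisionTransformerModel"
```

Under this reading, registering a new architecture only requires defining its classes and adding one entry to the map, which matches the size of the change in this commit.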
