1 parent 60c917b commit 1cef541
README.md
@@ -1,5 +1,6 @@
# Reward-Model
Reward Model training framework for LLM RLHF. For an in-depth understanding of reward modeling, check out our [blog](https://explodinggradients.com/)
+The word nemesis originally meant the distributor of fortune, neither good nor bad, simply in due proportion to each according to what was deserved.
### Quick Start
* Inference
```python
@@ -16,5 +17,7 @@ tokenizer = AutoTokenizer.from_pretrained(MODEL)
python src/training.py --config-name <your-config-name>
```

+
## Contributions
* All contributions are welcome. Check out #issues
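The `* Inference` snippet above is cut off at the hunk boundary: only the opening ```python fence and the `tokenizer = AutoTokenizer.from_pretrained(MODEL)` context line survive. Below is a minimal sketch of what reward-model inference typically looks like with the Hugging Face `transformers` API, assuming the checkpoint loads as a single-logit sequence-classification head; the `MODEL` value and the scoring call are illustrative assumptions, not this repo's confirmed API.

```python
# Minimal inference sketch. Assumptions (not confirmed by the diff above):
# the reward model is a sequence-classification checkpoint with one output
# logit, and MODEL is a placeholder for the actual checkpoint name or path.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "path/or/hub-id-of-reward-model"  # hypothetical placeholder

tokenizer = AutoTokenizer.from_pretrained(MODEL)  # matches the hunk's context line
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
model.eval()

prompt = "Explain RLHF in one sentence."
response = "RLHF fine-tunes a model against human preference signals."

# Score the (prompt, response) pair; the scalar logit serves as the reward.
inputs = tokenizer(prompt, response, return_tensors="pt", truncation=True)
with torch.no_grad():
    reward = model(**inputs).logits.squeeze().item()
print(f"reward: {reward:.4f}")
```

For training, the `--config-name` flag in the quick start suggests a Hydra-style entry point, where the name selects a YAML config file; the config schema itself is not shown in this diff.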