Commit 7617e9f

Add note on baseline results and re-running tests
Added an important note regarding baseline performance metrics and a recommendation to re-run baselines.
1 parent 5ab294e commit 7617e9f

1 file changed, +8 -0 lines changed

README.md

Lines changed: 8 additions & 0 deletions
@@ -10,6 +10,14 @@

 [Paper](https://arxiv.org/abs/2412.21033) | [Website](https://gautierdag.github.io/plancraft/)

+### ⚠️ Important Note on Baseline Results
+
+The baseline performance metrics reported in the original paper are underreported due to a bug in the environment that has since been fixed.
+
+If you are using Plancraft, please re-run the baselines yourself using the code in this repository and use those as your point of comparison.
+
+For a full explanation, please see Issue [#2](/../../issues/2).
+
 ### Plancraft was accepted to COLM 2025!

 Plancraft is a minecraft environment that benchmarks planning in LLM agents with an oracle RAG retriever.
