Commit 0479c57
Assistant: Dissuade the model from guessing results of code execution (#9635)
Addresses #9580.
In the above issue, the model hallucinates the statistical results from
a code block execution in Ask mode, where it cannot actually execute
code.
The working theory was that the following lines in the default prompt
were to blame:
https://github.com/posit-dev/positron/blob/d9f3047f74ac2d01186b19efdbf9afaa6f115316/extensions/positron-assistant/src/md/prompts/chat/default.md?plain=1#L37-L39
However, through experimentation I see the same or similar issues
appearing even with these lines removed. It seems that (at least for
Claude 4 Sonnet) the model does indeed need some instruction on how to
handle the situation where it wants to run some code but cannot execute
it directly. Without guidance, it defaults to hallucinating (trying to
be as helpful as possible, I assume).
---
So, this PR takes a slightly stronger hand with the following changes:
* As described above, the "Present statistics and insights about the
data" line was removed from the base default prompt and added to the
Agent mode prompt, where it probably should always have been.
* A new prompt has been added that is included only when Ask mode is
active. Here we try to strongly state:
  - That the model cannot see the results of code blocks it emits.
  - That the model should stop if the rest of its response depends on the
    result of a code block it has emitted.
  - That the model should not try to run or present the output of code it
    has emitted.
* A similar prompt has been added for Edit mode, where the model also
cannot directly execute code.
* The Ask mode prompt is now also included for the inline editor.
And, though not strictly related, while I am here:
* Added the default prompt text to the `/explain` command. Without this
we lose the base Positron Assistant behaviour whenever `/explain` is
activated (which can now happen automatically!).
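
As a rough illustration of the layering described above, the sketch below composes a shared default prompt with a mode-specific addition. Everything in it is hypothetical: the names `ChatMode`, `DEFAULT_PROMPT`, `MODE_PROMPTS` and `buildSystemPrompt`, and the prompt wording itself, are assumptions made for this example and are not taken from the Positron source. Only the "Present statistics and insights about the data" line is quoted from this PR.

```typescript
// Hypothetical sketch only: these names and prompt strings are illustrative
// and do not reflect the actual positron-assistant implementation.
type ChatMode = "ask" | "edit" | "agent";

// Stand-in for the shared base prompt (default.md in the real extension).
const DEFAULT_PROMPT = "You are a helpful coding assistant.";

// Mode-specific additions. Ask and Edit modes cannot execute code, so they
// carry explicit guidance against guessing execution results.
const MODE_PROMPTS: Record<ChatMode, string> = {
  ask: [
    "You cannot execute code, and you cannot see the results of code blocks you emit.",
    "If the rest of your response depends on the result of a code block, stop after the code block.",
    "Do not try to run or present the output of code you have emitted.",
  ].join("\n"),
  edit: [
    "You cannot execute code directly in this mode.",
    "Do not present the output of code you have not run.",
  ].join("\n"),
  // Only Agent mode can run code, so only it is asked to report results.
  agent: "Present statistics and insights about the data.",
};

// Compose the base prompt with the addition for the active mode.
function buildSystemPrompt(mode: ChatMode): string {
  return [DEFAULT_PROMPT, MODE_PROMPTS[mode]].join("\n\n");
}
```

Under these assumptions, a command such as `/explain` would call `buildSystemPrompt` rather than use its own prompt in isolation, so the base behaviour is retained.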
---
## QA
Follow the reproduction steps in #9580. The model should no longer try
to output a statistical summary, but instead explain what the code would
do if the user runs it:
<img width="520" height="337" alt="Screenshot 2025-09-29 at 14 52 06"
src="https://github.com/user-attachments/assets/2c262436-8de0-4c15-8738-4d17173d9caf"
/>
---
Stopping hallucinations is a hard problem in general, but this change
should hopefully help to avoid this particular situation.
---------
Signed-off-by: George Stagg <[email protected]>
Co-authored-by: sharon <[email protected]>
1 parent 7426123 commit 0479c57
File tree
6 files changed, +43 -10 lines changed
- extensions/positron-assistant/src
  - commands
  - md/prompts/chat