Update: [AEA-5920] - Llama Prompt Engineering (#260)

kieran-wilkinson-4 · web-flow · commit 8b3444895c3d · 2025-12-23T15:51:01.000Z
## Summary

Improved bot responses

### Details

- Moved from "Amazon Nova" to "Meta Llama"
   - Nova does not accept a temperature of 0.0 (its minimum is 0.0001), and even that minimum still produces significant variation in responses
   - Nova's prompt engineering was very different to Anthropic's Claude
   - Llama is a lot more similar, meaning the prompts can be switched over much more easily
- Fixed regex for "Markdown" -> (Slack's) "Mrkdwn"
   - Fixed tests for regex
- Updated System Prompt to be shorter and stricter
   - Longer prompts weaken directions, leading to mistakes in responses or the bot ignoring directions altogether
   - Longer prompts also eat up tokens (which we want to minimise)
   - Providing examples strengthens understanding
- Increased logging throughout response generation
   - Inc. response from Bedrock, parsing, formatting, and Slack's block building
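
As a rough illustration of the "Markdown" to "Mrkdwn" fix, the sketch below isolates the emphasis and link substitutions that appear in the `slack_events.py` change. The function name `to_slack_emphasis` and the example URL are hypothetical and used only for this illustration; the two patterns are copied from the diff, and the rest of `convert_markdown_to_slack` is omitted.

```python
import re


def to_slack_emphasis(body: str) -> str:
    """Hypothetical helper mirroring two steps of convert_markdown_to_slack."""
    # Collapse a run of 2-10 '*' or '_' on each side of a span to one marker,
    # so **bold** becomes *bold* and __italics__ becomes _italics_.
    body = re.sub(r"([\*_]){2,10}([^*]+)([\*_]){2,10}", r"\1\2\1", body)
    # Convert Markdown links [text](url) to Slack links <url|text>.
    body = re.sub(r"\[([^\]]+)\]\(([^\)]+)\)", r"<\2|\1>", body)
    return body.strip()


# Same input and expected output as the updated tests in this change.
print(to_slack_emphasis("**Bold**, __italics__, and `code`."))
# *Bold*, _italics_, and `code`.

# Illustrative link (the URL is a placeholder, not taken from the project).
print(to_slack_emphasis("[docs](https://example.com)"))
# <https://example.com|docs>
```

One behaviour worth knowing when reading the pattern: the middle group `[^*]+` is greedy and allows underscores, so two `__...__` spans on the same line are treated as a single span (`__a__ and __b__` becomes `_a__ and __b_`).
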
diff --git a/packages/cdk/prompts/systemPrompt.txt b/packages/cdk/prompts/systemPrompt.txt
@@ -1,89 +1,40 @@
-# 1. Persona
-You are an AI assistant designed to provide guidance and references from your knowledge base to help users make decisions during onboarding.
-
-It is **VERY** important that you return **ALL** references found in the context for user examination.
-
----
-
-# 2. THINKING PROCESS & LOGIC
-Before generating a response, adhere to these processing rules:
-
-## A. Context Verification
-Scan the retrieved context for the specific answer
-1. **No information found**: If the information is not present in the context:
-   - Do NOT formulate a general answer.
-   - Do NOT user external resources (i.e., websites, etc) to get an answer.
-   - Do NOT infer an answer from the users question.
-
-## B. Question Analysis
-1.  **Detection:** Determine if the query contains one or multiple questions.
-2.  **Decomposition:** Split complex queries into individual sub-questions.
-3.  **Classification:** Identify if the question is Factual, Procedural, Diagnostic, Troubleshooting, or Clarification-seeking.
-4.  **Multi-Question Strategy:** Number sub-questions clearly (Q1, Q2, etc).
-5.  **No Information:** If there is no information supporting an answer to the query, do not try and fill in the information
-6. **Strictness:** Do not infer information, be strict on evidence.
-
-## C. Entity Correction
-- If you encounter "National Health Service Digital (NHSD)", automatically treat and output it as **"National Health Service England (NHSE)"**.
-
-## D. RAG Confidence Scoring
-```
-Evaluate retrieved context using these relevance score thresholds:
-- `Score > 0.9`     : **Diamond** (Definitive source)
-- `Score 0.8 - 0.9` : **Gold** (Strong evidence)
-- `Score 0.7 - 0.8` : **Silver** (Partial context)
-- `Score 0.6 - 0.7` : **Bronze** (Weak relevance)
-- `Score < 0.6`     : **Scrap** (Ignore completely)
-```
-
----
-
-# 3. OUTPUT STRUCTURE
-Construct your response in this exact order:
-
-1.  **Summary:** A concise overview (Maximum **100 characters**).
-2.  **Answer:** The core response using the specific "mrkdwn" styling defined below (Maximum **800 characters**).
-3.  **Separator:** A literal line break using `------`.
-4.  **Bibliography:** The list of all sources used.
-
----
-
-# 4. FORMATTING RULES ("mrkdwn")
-You must use a specific variation of markdown. Follow this table strictly:
-
-| Element | Style to Use | Example |
-| :--- | :--- | :--- |
-| **Headings / Subheadings** | Bold (`*`) | `*Answer:*`, `*Bibliography:*` |
-| **Source Names** | Bold (`*`) | `*NHS England*`, `*EPS*` |
-| **Citations / Titles** | Italic (`_`) | `_Guidance Doc v1_` |
-| **Quotes (>1 sentence)** | Blockquote (`>`) | `> text` |
-| **Tech Specs / Examples** | Blockquote (`>`) | `> param: value` |
-| **System / Field Names** | Inline Code (`` ` ``) | `` `PrescriptionID` `` |
-| **Technical Terms** | Inline Code (`` ` ``) | `` `HL7 FHIR` `` |
-| **Hyperlinks** | **NONE** | Do not output any URLs. |
-
----
-
-# 5. BIBLIOGRAPHY GENERATOR
-**Requirements:**
-- Return **ALL** retrieved documents from the context.
-- Title length must be **< 50 characters**.
-- Use the exact string format below (do not render it as a table or list).
-
-**Template:**
-```text
-<cit>source number||summary title||excerpt||relevance score||source name</cit>
-
-# 6. Example
+# 1. Persona & Logic
+You are an AI assistant for onboarding guidance. Follow these strict rules:
+* **Strict Evidence:** If the answer is missing, do not infer or use external knowledge. 
+* **The "List Rule":** If a term (e.g. `on-hold`) exists only in a list/dropdown without a specific definition in the text, you **must** state it is "listed but undefined." Do NOT invent definitions.
+* **Decomposition:** Split multi-part queries into numbered sub-questions (Q1, Q2).
+* **Correction:** Always output `National Health Service England (NHSE)` instead of `NHSD`.
+* **RAG Scores:** `>0.9`: Diamond | `0.8-0.9`: Gold | `0.7-0.8`: Silver | `0.6-0.7`: Bronze | `<0.6`: Scrap (Ignore).
+* **Smart Guidance:** If no information can be found, provide next step direction.
+
+# 2. Output Structure
+1. *Summary:* Concise overview (Max 200 chars).
+2. *Answer:* Core response in `mrkdwn` (Max 800 chars).
+3. *Next Steps:* If the answer contains no information, provide useful helpful directions.
+4. Separator: Use "------"
+5. Bibliography: All retrieved documents using the `<cit>` template.
+
+# 3. Formatting Rules (`mrkdwn`)
+Use British English.
+* **Bold (`*`):** Headings, Subheadings, Source Names (e.g. `*NHS England*`).
+* **Italic (`_`):** Citations and Titles (e.g. `_Guidance v1_`).
+* **Blockquote (`>`):** Quotes (>1 sentence) and Tech Specs/Examples.
+* **Inline Code (`\``):** System/Field Names and Technical Terms (e.g. `HL7 FHIR`).
+* **Links:** `<text|link>`
+
+# 4. Bibliography Template
+Return **ALL** sources using this exact format:
+<cit>index||summary||excerpt||relevance score</cit>
+
+# 5. Example
 """
 *Summary*
-Short summary text
+This is a concise, clear answer - without going into a lot of depth.
 
-* Answer *
+*Answer*
 A longer answer, going into more detail gained from the knowledge base and using critical thinking.
-
 ------
-<cit>1||A document||This is the precise snippet of the pdf file which answers the question.||0.98||very_helpful_doc.pdf</cit>
-<cit>2||Another file||A 500 word text excerpt which gives some inference to the answer, but the long citation helps fill in the information for the user, so it's worth the tokens.||0.76||something_interesting.txt</cit>
-<cit>3||A useless file||This file doesn't contain anything that useful||0.05||folder/another/some_file.txt</cit>
+<cit>1||Example name||This is the precise snippet of the pdf file which answers the question.||0.98</cit>
+<cit>2||Another example file name||A 500 word text excerpt which gives some inference to the answer, but the long citation helps fill in the information for the user, so it's worth the tokens.||0.76</cit>
+<cit>3||A useless example file's title||This file doesn't contain anything that useful||0.05</cit>
 """
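
The new bibliography template is a `||`-delimited record with four fields (index, summary, excerpt, relevance score). A minimal sketch of how such records could be parsed is shown below. `Citation` and `parse_citations` are hypothetical names chosen for this illustration and are not the parser used in the repository; the sample text is adapted from the example in the prompt.

```python
import re
from typing import List, NamedTuple


class Citation(NamedTuple):
    """Hypothetical container for one <cit> record."""
    index: str
    summary: str
    excerpt: str
    relevance_score: float


# Non-greedy so that each <cit>...</cit> pair is captured separately.
CIT_PATTERN = re.compile(r"<cit>(.*?)</cit>", re.DOTALL)


def parse_citations(text: str) -> List[Citation]:
    """Extract every well-formed four-field <cit> record from text."""
    citations = []
    for raw in CIT_PATTERN.findall(text):
        fields = [field.strip() for field in raw.split("||")]
        if len(fields) != 4:
            continue  # skip records that do not match the template
        index, summary, excerpt, score = fields
        try:
            citations.append(Citation(index, summary, excerpt, float(score)))
        except ValueError:
            continue  # skip records whose score is not numeric
    return citations


sample = (
    "<cit>1||Example name||A precise snippet.||0.98</cit>\n"
    "<cit>3||A useless example file's title||Nothing useful||0.05</cit>"
)
parsed = parse_citations(sample)
print(len(parsed), parsed[0].relevance_score)
# 2 0.98
```

Skipping malformed records rather than raising is an assumption made here so that one bad citation does not discard the whole response.
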
diff --git a/packages/slackBotFunction/app/slack/slack_events.py b/packages/slackBotFunction/app/slack/slack_events.py
@@ -271,17 +271,14 @@ def convert_markdown_to_slack(body: str) -> str:
     body = body.replace("»", "")
     body = body.replace("â¢", "-")
 
-    # 2. Convert Markdown Italics (*text*) and (__text__) to Slack Italics (_text_)
-    body = re.sub(r"(?<!\*)\*([^*]+)\*(?!\*)", r"_\1_", body)
-    body = re.sub(r"_{1,2}([^_]+)_{1,2}", r"_\1_", body)
+    # 2. Convert Markdown Bold (**text**) and Italics (__text__)
+    # to Slack Bold (*text*) and Italics (_text_)
+    body = re.sub(r"([\*_]){2,10}([^*]+)([\*_]){2,10}", r"\1\2\1", body)
 
-    # 3. Convert Markdown Bold (**text**) to Slack Bold (*text*)
-    body = re.sub(r"\*\*([^*]+)\*\*", r"*\1*", body)
-
-    # 4. Handle Lists (Handle various bullet points and dashes, inc. unicode support)
+    # 3. Handle Lists (Handle various bullet points and dashes, inc. unicode support)
     body = re.sub(r"(?:^|\s{1,10})[-•–—▪‣◦⁃]\s{0,10}", r"\n- ", body)
 
-    # 5. Convert Markdown Links [text](url) to Slack <url|text>
+    # 4. Convert Markdown Links [text](url) to Slack <url|text>
     body = re.sub(r"\[([^\]]+)\]\(([^\)]+)\)", r"<\2|\1>", body)
 
     return body.strip()
diff --git a/packages/slackBotFunction/tests/test_slack_events/test_slack_events_citations.py b/packages/slackBotFunction/tests/test_slack_events/test_slack_events_citations.py
@@ -540,7 +540,7 @@ def test_create_response_body_creates_body_with_markdown_formatting(
             {
                 "source_number": "1",
                 "title": "Citation Title",
-                "excerpt": "**Bold**, __italics__, *markdown italics*, and `code`.",
+                "excerpt": "**Bold**, __italics__, and `code`.",
                 "relevance_score": "0.95",
             }
         ],
@@ -556,7 +556,7 @@ def test_create_response_body_creates_body_with_markdown_formatting(
     citation_element = response[1]["elements"][0]
     citation_value = json.loads(citation_element["value"])
 
-    assert "*Bold*, _italics_, _markdown italics_, and `code`." in citation_value.get("body")
+    assert "*Bold*, _italics_, and `code`." in citation_value.get("body")
 
 
 def test_create_response_body_creates_body_with_lists(
diff --git a/packages/slackBotFunction/tests/test_slack_events/test_slack_events_messages.py b/packages/slackBotFunction/tests/test_slack_events/test_slack_events_messages.py
@@ -432,7 +432,7 @@ def test_create_response_body_creates_body_with_markdown_formatting(
     response = _create_response_body(
         citations=[],
         feedback_data={},
-        response_text="**Bold**, __italics__, *markdown italics*, and `code`.",
+        response_text="**Bold**, __italics__, and `code`.",
     )
 
     # assertions
@@ -441,7 +441,7 @@ def test_create_response_body_creates_body_with_markdown_formatting(
 
     response_value = response[0]["text"]["text"]
 
-    assert "*Bold*, _italics_, _markdown italics_, and `code`." in response_value
+    assert "*Bold*, _italics_, and `code`." in response_value
 
 
 def test_create_response_body_creates_body_with_lists(
