
Commit 12c68b0

docs: update misinformation for steve (#443)
1 parent 623a760 commit 12c68b0

2 files changed (+31, -37 lines)

2_0_voting/voting_round_two/Misinformation.md

Lines changed: 14 additions & 18 deletions
## Misinformation

### Description

Misinformation from LLMs poses a core vulnerability for applications relying on these models. Misinformation occurs when LLMs produce false or misleading information that appears credible. This vulnerability can lead to security breaches, reputational damage, and legal liability.

One of the major causes of misinformation is hallucination—when the LLM generates content that seems accurate but is fabricated. Hallucinations occur when LLMs fill gaps in their training data using statistical patterns, without truly understanding the content. As a result, the model may produce answers that sound correct but are completely unfounded. While hallucinations are a major source of misinformation, they are not the only cause; biases introduced by the training data and incomplete information can also contribute.

A related issue is overreliance. Overreliance occurs when users place excessive trust in LLM-generated content, failing to verify its accuracy. This overreliance exacerbates the impact of misinformation, as users may integrate incorrect data into critical decisions or processes without adequate scrutiny.

### Common Examples of Risk

1. **Factual Inaccuracies:** The model produces incorrect statements, leading users to make decisions based on false information. For example, Air Canada's chatbot provided misinformation to travelers, leading to operational disruptions and legal complications. The airline was successfully sued as a result ([BBC](https://www.bbc.com/travel/article/20240222-air-canada-chatbot-misinformation-what-travellers-should-know)).
2. **Unsupported Claims:** The model generates baseless assertions, which can be especially harmful in sensitive contexts such as healthcare or legal proceedings. For example, ChatGPT fabricated fake legal cases, leading to significant issues in court ([LegalDive](https://www.legaldive.com/news/chatgpt-fake-legal-cases-generative-ai-hallucinations/651557/)).
3. **Misrepresentation of Expertise:** The model gives the illusion of understanding complex topics, misleading users regarding its level of expertise. For example, chatbots have been found to misrepresent the complexity of health-related issues, suggesting uncertainty where there is none, which misled users into believing that unsupported treatments were still under debate ([KFF](https://www.kff.org/health-misinformation-monitor/volume-05/)).
4. **Unsafe Code Generation:** The model suggests insecure or non-existent code libraries, which can introduce vulnerabilities when integrated into software systems. For example, LLMs have proposed insecure third-party libraries which, if trusted without verification, lead to security risks ([Lasso](https://www.lasso.security/blog/ai-package-hallucinations)).

### Prevention and Mitigation Strategies
1. **Retrieval-Augmented Generation (RAG):** Use Retrieval-Augmented Generation to enhance the reliability of model outputs by retrieving relevant and verified information from trusted external databases during response generation. This helps mitigate the risk of hallucinations and misinformation (see the illustrative sketch after this list).
2. **Model Fine-Tuning:** Enhance the model with fine-tuning or embeddings to improve output quality. Techniques such as parameter-efficient tuning (PET) and chain-of-thought prompting can help reduce the incidence of misinformation.
3. **Cross-Verification and Human Oversight:** Encourage users to cross-check LLM outputs with trusted external sources to ensure the accuracy of the information. Implement human oversight and fact-checking processes, especially for critical or sensitive information. Ensure that human reviewers are properly trained to avoid overreliance on AI-generated content.
4. **Automatic Validation Mechanisms:** Implement tools and processes to automatically validate key outputs, especially in high-stakes environments (a simple validation check is also sketched after this list).
5. **Risk Communication:** Identify the risks and possible harms associated with LLM-generated content, then clearly communicate these risks and limitations to users, including the potential for misinformation.
6. **Secure Coding Practices:** Establish secure coding practices to prevent the integration of vulnerabilities due to incorrect code suggestions.
7. **User Interface Design:** Design APIs and user interfaces that encourage responsible use of LLMs, such as integrating content filters, clearly labeling AI-generated content, and informing users of limitations in reliability and accuracy. Be specific about limitations on the intended field of use.
8. **Training and Education:** Provide comprehensive training for users on the limitations of LLMs, the importance of independent verification of generated content, and the need for critical thinking. In specific contexts, offer domain-specific training to ensure users can effectively evaluate LLM outputs within their field of expertise.

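To make strategy 1 concrete, the following is a minimal, self-contained sketch of the RAG pattern described above. It is illustrative only: the in-memory knowledge base, the keyword-overlap retriever, and the prompt wording are assumptions made for this example; a production system would use a vector store, an embedding-based retriever, and a real LLM call.

```python
# Minimal RAG-style grounding sketch (illustrative; not a production implementation).
# A real system would replace KNOWLEDGE_BASE and retrieve() with a vector store
# and an embedding-based retriever, and send the prompt to an actual LLM.

KNOWLEDGE_BASE = [
    {"id": "refund-policy", "text": "Refund requests must be filed within 30 days of purchase."},
    {"id": "support-hours", "text": "Customer support is available Monday to Friday, 9am to 5pm."},
]

def retrieve(query: str, k: int = 2) -> list[dict]:
    """Rank passages by naive keyword overlap with the query (placeholder retriever)."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(doc["text"].lower().split())), doc) for doc in KNOWLEDGE_BASE]
    return [doc for score, doc in sorted(scored, key=lambda s: s[0], reverse=True)[:k] if score > 0]

def build_grounded_prompt(question: str) -> str:
    """Constrain the model to answer only from retrieved passages, or to decline."""
    context = "\n".join(f"[{doc['id']}] {doc['text']}" for doc in retrieve(question))
    return (
        "Answer the question using ONLY the sources below and cite the source id "
        "for every claim. If the sources do not contain the answer, reply 'I don't know.'\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )

if __name__ == "__main__":
    # The resulting prompt would then be sent to the LLM (call not shown).
    print(build_grounded_prompt("What is the refund window for a purchase?"))
```

The design point is that the model is steered toward verified passages and given an explicit way to decline, which narrows the space for hallucinated answers.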
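Strategy 4 can likewise be sketched as a lightweight post-generation gate. The two checks below (citation ids must refer to retrieved sources, and numbers must appear in a source passage) are hypothetical examples of validating key outputs, not a prescribed mechanism; real deployments would add domain-specific rules and route anything flagged to human review.

```python
import re

def validate_answer(answer: str, sources: dict[str, str]) -> list[str]:
    """Return a list of problems found in an LLM answer; an empty list means it passed."""
    problems = []

    # 1. Every citation like "[refund-policy]" must refer to a source we actually retrieved.
    for cited in re.findall(r"\[([\w-]+)\]", answer):
        if cited not in sources:
            problems.append(f"cites unknown source '{cited}'")

    # 2. Every number in the answer should be traceable to some source passage.
    source_text = " ".join(sources.values())
    for number in re.findall(r"\b\d+(?:\.\d+)?\b", answer):
        if number not in source_text:
            problems.append(f"number '{number}' not found in any source")

    return problems

sources = {"refund-policy": "Refund requests must be filed within 30 days of purchase."}
print(validate_answer("You have 45 days to request a refund. [refund-policy]", sources))
# -> ["number '45' not found in any source"]
```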
### Example Attack Scenarios
**Scenario #1:** Attackers experiment with popular coding assistants to find commonly hallucinated package names. Once they identify these frequently suggested but nonexistent libraries, they publish malicious packages with those names to widely used repositories. Developers, relying on the coding assistant's suggestions, unknowingly integrate these poisoned packages into their software. As a result, the attackers gain unauthorized access, inject malicious code, or establish backdoors, leading to significant security breaches and compromising user data.

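One possible guardrail against this scenario is to verify every LLM-suggested dependency against the package registry before it is ever installed. The sketch below uses PyPI's public JSON API as an example ecosystem; the 10-release threshold is an arbitrary illustrative heuristic, and mere existence of a name is not enough on its own (the attacker may already have registered it), so release history and maintainers should still be reviewed.

```python
import json
import urllib.error
import urllib.request

PYPI_URL = "https://pypi.org/pypi/{name}/json"  # public PyPI metadata endpoint

def check_package(name: str, min_releases: int = 10) -> str:
    """Give a rough verdict on whether an LLM-suggested package warrants extra scrutiny."""
    try:
        with urllib.request.urlopen(PYPI_URL.format(name=name), timeout=10) as resp:
            data = json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return "does not exist on PyPI - likely hallucinated; do not install"
        raise
    releases = data.get("releases", {})
    if len(releases) < min_releases:
        return f"exists but has only {len(releases)} release(s) - review carefully before use"
    return "exists with an established release history - still review before adding"

# Example (requires network access):
for pkg in ["requests", "totally-made-up-llm-package-xyz"]:
    print(pkg, "->", check_package(pkg))
```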
**Scenario #2:** A company provides a chatbot for medical diagnosis without ensuring sufficient accuracy. The chatbot gives poor information, leading to harmful consequences for patients, and the company is successfully sued for damages. Here the safety and security breakdown did not require a malicious attacker: insufficient oversight and the unreliability of the LLM system were enough to expose the company to reputational and financial damage.

### Reference Links
1. [AI Chatbots as Health Information Sources: Misrepresentation of Expertise](https://www.kff.org/health-misinformation-monitor/volume-05/): **KFF**
9. [How to Reduce the Hallucinations from Large Language Models](https://thenewstack.io/how-to-reduce-the-hallucinations-from-large-language-models/): **The New Stack**
10. [Practical Steps to Reduce Hallucination](https://newsletter.victordibia.com/p/practical-steps-to-reduce-hallucination): **Victor Dibia**
11. [A Framework for Exploring the Consequences of AI-Mediated Enterprise Knowledge](https://www.microsoft.com/en-us/research/publication/a-framework-for-exploring-the-consequences-of-ai-mediated-enterprise-knowledge-access-and-identifying-risks-to-workers/): **Microsoft**
### Related Frameworks and Taxonomies

Refer to this section for comprehensive information, scenarios, and strategies relating to infrastructure deployment, applied environment controls, and other best practices.
