docs/Security-Focused-Guide-for-AI-Code-Assistant-Instructions.md
Lines changed: 5 additions & 5 deletions
@@ -24,7 +24,7 @@ By keeping these points in mind, you can harness AI code assistants effectively
 ### TL;DR Sample Instructions
 
 Here are sample instructions that you can copy and paste.
-In most cases you should extract *from* this sample (for details see below):
+In most cases you should **extract from** this sample (for details see below).
 If you copy and paste irrelevant parts, the AI is more likely to generate
 extraneous or even incorrect code as it attempts to compensate for
 attacks that can't happen:
@@ -49,7 +49,7 @@ When suggesting dependency versions, prefer the latest stable release and mentio
 Generate a Software Bill of Materials (SBOM) by using tools that support standard formats like SPDX or CycloneDX.
 Where applicable, use in-toto attestations or similar frameworks to create verifiable records of your build and deployment processes.
 Prefer high-level libraries for cryptography rather than rolling your own.
----
+>
 > When adding important external resources (scripts, containers, etc.), include steps to verify integrity (like checksum verification or signature validation) if applicable.
 When writing file or OS-level operations, use safe functions and check for errors (e.g., use secure file modes, avoid temp files without proper randomness, etc.). If running as a service, drop privileges when possible.
 Always include appropriate security headers (Content Security Policy, X-Frame-Options, etc.) in web responses, and use frameworks' built-in protections for cookies and sessions.
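As an illustration of the integrity-verification instruction in the hunk above, the following is a minimal Python sketch of a checksum check. It is not part of the changed file; the helper names `sha256_of_file` and `verify_checksum` are hypothetical, and it assumes the publisher of the resource provides a SHA-256 digest as a hex string.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large downloads do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the published value (hypothetical helper)."""
    actual = sha256_of_file(path)
    # Normalize whitespace and case, then compare in constant time.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A caller would typically refuse to execute or install the downloaded resource when `verify_checksum` returns `False`. A checksum only helps if the expected value comes from a trusted channel; signature validation, which the instruction also mentions, is a separate step not shown here.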
@@ -72,7 +72,7 @@ For Python, follow PEP 8 and use type hints, as this can catch misuse early.
 For JavaScript/TypeScript, when generating Node.js code, use prepared statements for database queries (just like any other language) and encode any data that goes into HTML to prevent XSS.
 For Java, when suggesting web code (e.g., using Spring), ensure to use built-in security annotations and avoid old, vulnerable libraries (e.g., use `BCryptPasswordEncoder` rather than writing a custom password hash).
 For C#, Use .NET's cryptography and identity libraries instead of custom solutions.
----
+>
 > Never suggest turning off security features like XML entity security or type checking during deserialization.
 Code suggestions should adhere to OWASP Top 10 principles (e.g., avoid injection, enforce access control) and follow the OWASP ASVS requirements where applicable.
 Our project follows SAFECode's secure development practices – the AI should prioritize those (e.g., proper validation, authentication, cryptography usage per SAFECode guidance).
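The hunk above asks for prepared statements "just like any other language". As a hedged illustration, here is a small sketch in Python (swapped in for Node.js so that the example is self-contained) using the standard-library `sqlite3` module. The `users` table and the `find_user` and `demo` helpers are hypothetical and do not appear in the changed file.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, name: str) -> list[tuple]:
    """Look up users by name with a parameterized query (hypothetical helper)."""
    # The "?" placeholder keeps the input as data; it is never spliced into the SQL text.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


def demo() -> tuple[list[tuple], list[tuple]]:
    """Build an in-memory table and run a normal and an injection-style lookup."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    normal = find_user(conn, "alice")
    # With string concatenation this input would match every row; here it matches none.
    injected = find_user(conn, "alice' OR '1'='1")
    conn.close()
    return normal, injected
```

Because the driver binds the value separately from the statement, the injection-style input is treated as an unusual literal name rather than as SQL.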
 <a id="josephspracklen2024f">[josephspracklen2024f]</a> "3 of the 4 models ... proved to be highly adept in detecting their own hallucinations with detection accuracy above 75%. Table 2 displays the recall and precision values for this test, with similarly strong performance across the 3 proficient models. This phenomenon implies that each model’s specific error patterns are detectable by the same mechanisms that generate them, suggesting an inherent self-regulatory capability. The indication that these models have an implicit understanding of their own generative patterns that could be leveraged for self-improvement is an important finding for developing mitigation strategies." (Spracklen - [We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs](https://arxiv.org/abs/2406.10279))
 
-<a id="catherinetony2024a">[catherinetony2024a]</a> "). "When using LLMs or LLM-powered tools like ChatGPT or Copilot... (1) Using RCI is preferable over the other techniques studied in this work, as RCI can largely improve the security of the generated code (up to an order of magnitude w.r.t weakness density) even when applied with just 2 iterations. This technique has stayed valuable over several versions of the LLM models, and, hence, there is an expectation that it will stay valid in the future as well. ... (4) In cases where multi-step techniques like RCI are not feasible, using simple zero-shot prompting with templates similar to comprehensive prompts, that specify well-established secure coding standards, can provide comparable results in relation to more complex techniques." (Catherine Tony, Nicolás E. Díaz Ferreyra, Markus Mutas, Salem Dhiff, Riccardo Scandariato - [Prompting Techniques for Secure Code Generation: A Systematic Investigation](https://arxiv.org/abs/2407.07064v2))
+<a id="catherinetony2024a">[catherinetony2024a]</a> "When using LLMs or LLM-powered tools like ChatGPT or Copilot... (1) Using RCI is preferable over the other techniques studied in this work, as RCI can largely improve the security of the generated code (up to an order of magnitude w.r.t weakness density) even when applied with just 2 iterations. This technique has stayed valuable over several versions of the LLM models, and, hence, there is an expectation that it will stay valid in the future as well. ... (4) In cases where multi-step techniques like RCI are not feasible, using simple zero-shot prompting with templates similar to comprehensive prompts, that specify well-established secure coding standards, can provide comparable results in relation to more complex techniques." (Catherine Tony, Nicolás E. Díaz Ferreyra, Markus Mutas, Salem Dhiff, Riccardo Scandariato - [Prompting Techniques for Secure Code Generation: A Systematic Investigation](https://arxiv.org/abs/2407.07064v2))
 
-<a id="catherinetony2024b">[catherinetony2024b]</a> "). "Across all the examined LLMs, the persona/memetic proxy approach has led to the highest average number of security weaknesses among all the evaluated prompting techniques excluding the baseline prompt that does not include any security specifications." (Catherine Tony, Nicolás E. Díaz Ferreyra, Markus Mutas, Salem Dhiff, Riccardo Scandariato - [Prompting Techniques for Secure Code Generation: A Systematic Investigation](https://arxiv.org/abs/2407.07064v2))
+<a id="catherinetony2024b">[catherinetony2024b]</a> "Across all the examined LLMs, the persona/memetic proxy approach has led to the highest average number of security weaknesses among all the evaluated prompting techniques excluding the baseline prompt that does not include any security specifications." (Catherine Tony, Nicolás E. Díaz Ferreyra, Markus Mutas, Salem Dhiff, Riccardo Scandariato - [Prompting Techniques for Secure Code Generation: A Systematic Investigation](https://arxiv.org/abs/2407.07064v2))