Perform input validation and/or filtering on all user-provided and user-controlled data before it is used #15270

@anupriya13

Description

Requirement:

The following input validations must be performed on all input data from user-provided and user-controlled sources prior to being used:

All inputs from the user or user-supplied data from external systems are validated on the server side before being consumed.
Numeric data is validated for range (upper and lower bound) as well as type (signed vs. unsigned).
String data is validated for length, format, and character set.
If the string data is being used to form a URL, please consult the Server-Side Request Forgery Guidance.
All data must, where possible, be validated against a list of permissible values rather than a list of values to be rejected.
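
The numeric, string, and allow-list requirements above can be illustrated with a minimal Python sketch. The field names, limits, and permitted values (`MIN_QUANTITY`, `MAX_QUANTITY`, `USERNAME_PATTERN`, `ALLOWED_COUNTRIES`) are assumptions chosen for illustration and are not part of this requirement.

```python
import re

# Hypothetical limits and permitted values; replace with the real business rules.
MIN_QUANTITY = 1
MAX_QUANTITY = 1000
USERNAME_PATTERN = re.compile(r"[a-z0-9_]{3,32}")  # length, format, character set
ALLOWED_COUNTRIES = frozenset({"US", "CA", "IN"})


def parse_quantity(raw: str) -> int:
    """Validate type (unsigned integer) first, then range."""
    # [0-9] is used instead of \d so that non-ASCII digits are rejected.
    if not re.fullmatch(r"[0-9]{1,4}", raw):
        raise ValueError("quantity must be an unsigned integer")
    value = int(raw)
    if not MIN_QUANTITY <= value <= MAX_QUANTITY:
        raise ValueError("quantity is out of range")
    return value


def validate_username(raw: str) -> str:
    """fullmatch anchors the pattern at both ends of the input."""
    if not USERNAME_PATTERN.fullmatch(raw):
        raise ValueError("username has an invalid length or character")
    return raw


def validate_country(raw: str) -> str:
    """Compare against a list of permissible values."""
    if raw not in ALLOWED_COUNTRIES:
        raise ValueError("country is not in the allow-list")
    return raw
```

Each function raises as soon as a check fails, which matches the "fail fast" guidance below.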
Guidance:

Developers must use defensive coding techniques and assume that all user-supplied input data received by their applications is potentially dangerous. Applications must perform input validation as soon as it is practical to do so and "fail fast" if invalid data is detected before any significant processing, storage or transmission of the data occurs.

In some circumstances it may be impractical to fail outright with an exception or other error when invalid data is encountered and an application may be designed to filter out the invalid data instead. This design pattern should only be used when necessary since it can make it difficult to detect attacks against the system.

User-supplied input data can come directly from users or from other computer systems outside the trust boundary of the application. Do not assume that user-supplied data from external systems outside your direct control has been properly validated. Note that malicious input may be injected while data is in transit or storage and not just at its initial entry point.

Failure to perform input validation may be a direct cause of code injection, data corruption, business logic/authorization vulnerabilities or denial of service, regardless of platform or technology, protocol or environment.

For web applications, consider that in addition to visible form fields, entry points also include hidden form fields, query strings, cookies, HTTP headers, XMLHttpRequests and web service parameters.

Where possible, use well-terminated regular expressions and an allow-list (known, valid, and safe input) rather than a deny-list (rejecting a known list of malicious or dangerous input). Deny-lists are often incomplete and easily circumvented.

Use Unicode categories in your regular expressions to help ensure that your input validation works for all locales and does not erroneously reject legitimate foreign language characters.
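
As a rough illustration of category-based validation, here is a minimal Python sketch. Python's standard `re` module does not support `\p{...}` category classes, so this sketch checks categories with `unicodedata` instead of a regular expression. The length limit and the extra permitted punctuation are assumptions.

```python
import unicodedata

MAX_NAME_LENGTH = 64  # hypothetical limit


def is_valid_name(raw: str) -> bool:
    """Accept letters and combining marks from any script, plus space, hyphen, apostrophe."""
    name = unicodedata.normalize("NFC", raw)
    if not 1 <= len(name) <= MAX_NAME_LENGTH:
        return False
    has_letter = False
    for ch in name:
        major = unicodedata.category(ch)[0]  # "L" = letter, "M" = mark, ...
        if major == "L":
            has_letter = True
        elif major != "M" and ch not in " -'":
            return False
    return has_letter
```

With this approach names such as "José" or "李小龙" pass, while input containing symbols or punctuation outside the small permitted set is rejected.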

Write automated tests for your application that verify it handles invalid input correctly.

Log input validation failures to aid in attack detection and forensic analysis.

Code analysis tools can assist in locating code without validation checks. Use of tools will not find all instances but provides a good indicator of commonly missed validation points.

Validating URL input
Use Standard Libraries: Leverage standard libraries for URL parsing to ensure that the URL is split correctly into its components.
Normalize URLs: Convert URLs to a standard format to avoid discrepancies and potential security issues. In C#, use UriBuilder to create URIs.
Validate URL Components: Ensure that only expected/allowed URL components are used, especially if your application interacts with external resources.
Handle Encoding: Decode all URLs tainted with customer input, for example with HttpUtility.UrlDecode, before validating them, so that encoded characters cannot cause the URL components to be misinterpreted. Decode in a loop until no further decoding is possible (i.e. the decoded string remains the same), so that a string encoded multiple times is fully decoded when the loop exits.
Restrict HTTP Verbs: When allowing customers to define their own outgoing web requests in a REST API, you must limit the HTTP verbs they can use. Determine the specific actions that customers need to perform and limit the available HTTP verbs to only those that are necessary for the intended functionality.
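
The decode-until-stable loop and the verb restriction described above could look like the following Python sketch. It uses `urllib.parse.unquote` in place of the C# `HttpUtility.UrlDecode`. The round limit and the permitted verbs are assumptions. The cap on decode rounds is an addition to the guidance, included so that hostile input cannot force unbounded work.

```python
from urllib.parse import unquote

MAX_DECODE_ROUNDS = 5  # hypothetical cap; deeper nesting is treated as invalid
ALLOWED_VERBS = frozenset({"GET", "POST"})  # hypothetical; keep only what is needed


def fully_decode(value: str) -> str:
    """Percent-decode repeatedly until the string stops changing."""
    for _ in range(MAX_DECODE_ROUNDS):
        decoded = unquote(value)
        if decoded == value:
            return value  # stable: nothing left to decode
        value = decoded
    raise ValueError("too many layers of URL encoding")


def is_allowed_verb(verb: str) -> bool:
    """HTTP methods are case-sensitive, so compare exactly."""
    return verb in ALLOWED_VERBS
```

For example, `fully_decode("%252E%252E%252F")` returns `"../"`, which later path checks can then reject.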
Preventing Abuse of URL Components
Each part of the URL can be a potential vector for abuse. Here's how to mitigate risks associated with each component:

Host Validation
Allowlist Domains: Only allow requests to known, trusted domains to prevent Server-Side Request Forgery (SSRF) attacks. If this is not possible, please refer to the SDL Guidance to implement protections against Server Side Request Forgery (SSRF) attacks to learn about alternate approaches to protect against SSRF.
DNS Resolution Checks: Resolve the host name and validate the resolved IP addresses before connecting, and connect only to an address that was validated, to prevent DNS rebinding attacks.
HTTP Redirect Checks: Ensure that each HTTP redirect is validated to confirm the legitimacy of the host prior to establishing a connection. To enhance security, disable auto-redirects if your service does not depend on them and if doing so will not disrupt your application.
Certificate Validation: For https schemes, ensure that the SSL/TLS certificate is valid, matches the host, and is not expired or revoked.
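
A minimal Python sketch of the allow-list and resolved-address checks above follows. The host names in `ALLOWED_HOSTS` are placeholders. `is_public_address` is meant to be applied to each address returned by DNS resolution, which is not shown here. Redirect and certificate handling are also out of scope for this sketch.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_HOSTS = frozenset({"api.example.com", "cdn.example.com"})  # hypothetical


def is_allowed_url(url: str) -> bool:
    """Accept only https URLs whose host is on the allow-list."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for an invalid port
    except ValueError:
        return False
    if parts.scheme != "https":
        return False
    if parts.username or parts.password:  # reject user:pass@host tricks
        return False
    host = (parts.hostname or "").rstrip(".")  # hostname is already lower-cased
    return host in ALLOWED_HOSTS and port in (None, 443)


def is_public_address(address: str) -> bool:
    """Return False for loopback, private, link-local and similar ranges."""
    ip = ipaddress.ip_address(address)
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )
```

Checking the resolved address as well as the name matters because an allow-listed name can still resolve to an internal address.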
Path Validation
Directory Traversal Prevention: Sanitize input to prevent directory traversal attacks by enforcing an allow-list of values or of alphanumeric characters. Sequences such as .. and percent-encoded characters (%) that can be used to navigate the file system should not be allowed.
Map to Safe Resources: Ensure that the path maps to a resource that the user is authorized to access. You may need to reevaluate user authorization after path canonicalization.
Additional guidance for mitigating path traversal vulnerabilities is linked for C#, Java, and JavaScript.
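
One way to apply the canonicalize-then-check idea above is sketched here in Python. The function name and the notion of a single permitted root directory are assumptions. Authorization for the resolved resource must still be checked separately.

```python
from pathlib import Path


def resolve_under_root(root: str, user_path: str) -> Path:
    """Canonicalize user_path relative to root and refuse anything outside root."""
    root_path = Path(root).resolve()
    # Strip leading separators so an absolute path cannot replace the root.
    candidate = (root_path / user_path.lstrip("/\\")).resolve()
    try:
        candidate.relative_to(root_path)  # raises ValueError if outside root
    except ValueError:
        raise ValueError("path escapes the permitted root") from None
    return candidate
```

The containment check runs on the canonical path, so `..` segments and symbolic links are resolved before the decision is made.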
Query Parameter Validation
Strict Type Checking: Enforce strict type checking on query parameters to prevent injection attacks.
Length Limits: Impose length limits on parameters to prevent buffer overflow vulnerabilities.
Allowlist Parameters: Only accept known parameters and discard any unexpected or unnecessary ones.
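
The three query parameter checks above could be combined as in this Python sketch. The parameter names, length limits, and converters in `SCHEMA` are assumptions. Unknown parameters are discarded, as the guidance describes, while known parameters with invalid values cause a failure.

```python
from urllib.parse import parse_qsl


def _page(value: str) -> int:
    # Type check: ASCII digits only, then range check.
    if not (value.isascii() and value.isdigit()):
        raise ValueError("page must be an unsigned integer")
    number = int(value)
    if not 1 <= number <= 10000:
        raise ValueError("page is out of range")
    return number


def _sort(value: str) -> str:
    if value not in ("asc", "desc"):
        raise ValueError("sort must be 'asc' or 'desc'")
    return value


# Hypothetical schema: parameter name -> (maximum length, converter)
SCHEMA = {"page": (5, _page), "sort": (4, _sort)}


def validate_query(query: str) -> dict:
    """Return the validated, typed parameters; raise ValueError on bad values."""
    result = {}
    for name, value in parse_qsl(query, keep_blank_values=True):
        if name not in SCHEMA:
            continue  # discard parameters that are not expected
        if name in result:
            raise ValueError("duplicate parameter: " + name)
        max_length, convert = SCHEMA[name]
        if len(value) > max_length:
            raise ValueError("parameter too long: " + name)
        result[name] = convert(value)
    return result
```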
Fragment Handling
Client-Side Caution: Since fragments are handled client-side, ensure that any logic processing the fragment is secure against Cross-Site Scripting (XSS) attacks.
Negative Test Cases to validate URL input
The following test cases are intentionally crafted to be malicious, and should be included in unit tests in your project to ensure they are appropriately rejected to prevent abuse.

Dangerous URL Hostnames
URL hostnames can be abused to cause Server-Side Request Forgery attacks. Please refer to the SDL Guidance to implement protections against Server Side Request Forgery (SSRF) attacks to learn more.

All these hostnames are dangerous, and must be rejected by your application:

https://localHoSt/
https://127.0.1.2/
https://0177.0.23.19/
https://2130706433/
https://0x7f.00331.0246.174/
https://[::1]/
https://[fc00::]/
https://169.254.169.254/
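
A sketch of how the hostnames above might be exercised in a unit test follows, written in Python. The checker `is_dangerous_host` is a hypothetical stand-in for your application's own validation. It assumes the service never needs IP-literal URLs and therefore rejects them outright, along with the octal, hexadecimal, and integer forms that legacy parsers read as IP addresses.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Labels such as "0177", "2130706433" or "0x7f" that legacy parsers read as numbers.
_NUMERIC_LABEL = re.compile(r"0[xX][0-9a-fA-F]*|[0-9]+")

DANGEROUS_URLS = [
    "https://localHoSt/",
    "https://127.0.1.2/",
    "https://0177.0.23.19/",
    "https://2130706433/",
    "https://0x7f.00331.0246.174/",
    "https://[::1]/",
    "https://[fc00::]/",
    "https://169.254.169.254/",
]


def is_dangerous_host(url: str) -> bool:
    """Hypothetical check: reject localhost, IP literals and numeric-looking hosts."""
    host = urlsplit(url).hostname  # lower-cased, brackets removed
    if not host:
        return True
    host = host.rstrip(".")
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        ipaddress.ip_address(host)
        return True  # any IP literal is rejected in this sketch
    except ValueError:
        pass
    # All-numeric labels cover octal, hex and single-integer address forms.
    return all(_NUMERIC_LABEL.fullmatch(label) for label in host.split("."))


def test_dangerous_hosts_are_rejected() -> None:
    for url in DANGEROUS_URLS:
        assert is_dangerous_host(url), url
```

An allow-list of trusted domains, where one is feasible, rejects all of these cases without needing this kind of pattern matching.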

Dangerous URL Paths
URL paths can be abused to cause Path Traversal attacks.

All these paths are indicative of an attempt to traverse the URL path, and must be rejected by your application:

/../../OtherPath/
/..//OtherPath/
/%2E%2E%2f/OtherPath/
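
The paths above could be covered by a test along the following lines, sketched in Python. `is_traversal_attempt` is a hypothetical checker that decodes the path until it is stable and then looks for `..` segments. The limit on decode rounds is an assumption.

```python
from urllib.parse import unquote

DANGEROUS_PATHS = [
    "/../../OtherPath/",
    "/..//OtherPath/",
    "/%2E%2E%2f/OtherPath/",
]


def is_traversal_attempt(path: str, max_rounds: int = 5) -> bool:
    """Hypothetical check: fully decode, then look for parent-directory segments."""
    decoded = path
    for _ in range(max_rounds):
        next_value = unquote(decoded)
        if next_value == decoded:
            break  # stable: nothing left to decode
        decoded = next_value
    else:
        return True  # still changing after max_rounds: treat as hostile
    segments = decoded.replace("\\", "/").split("/")
    return ".." in segments


def test_dangerous_paths_are_rejected() -> None:
    for path in DANGEROUS_PATHS:
        assert is_traversal_attempt(path), path
```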

Frequently Asked Questions
Do I need to validate Request.ServerVariables collection?

Request.ServerVariables may need validation, because some elements in this collection may contain user data or may be tampered with, including but not limited to:

HTTP_<HeaderName>
SERVER_NAME
URL
REMOTE_ADDR
REMOTE_HOST
REMOTE_USER
For example, Request.ServerVariables["HTTP_REFERER"] is frequently spoofed by attackers. Applications that consume this variable without validation may process a malicious request that appears to originate from a trusted URL.

What security attacks are possible if input is not validated before being consumed?

The following attacks are possible in this scenario:

Cross site scripting
Server-side Request Forgery
Path Traversal
SQL injection
XML injection
OS Command injection
LDAP injection
Canonicalization issues
Integer Overflow/Underflow
Response Splitting
Data Tampering
Code Injection
What is the difference between deny-listing and allow-listing approaches for validating input?

Each approach has a different strategy.

The deny-list approach, also called an exclusions list

In this approach, the developer tries to imagine all the bad input that may find its way into the application, and then rejects all these specific inputs. All other data is accepted. These are just a few of the inputs to look out for:

User Input Expected: First Name
Regular Expression: (<|&lt;|&#60;)( |\s)*(script|applet|embed)
The deny-list strategy is a weak protection mechanism because you cannot brainstorm all the bad characters that attackers will use for a particular attack. Deny-listing depends heavily on the attacker's next moves, and therefore has to be continuously updated and changed. As new attack techniques come out, the list becomes outdated and requires constant monitoring.

The allow-list approach, also called an inclusions list

The allow-list strategy compares user input against a specific set of inputs that are treated as acceptable. For example:

User Input Expected: First Name
Regular Expression: ^[a-zA-Z -]+$
This is an allow-list of all known good inputs. It permits only uppercase A to Z, lowercase a to z, spaces, and hyphens, and rejects all other input.

Allow-list filtering gives more control to the programmer as it is a restrictive kind of filtering mechanism. Allow-listing offers much better protection when the programmer has a good idea of the type of input expected for the application.

What is OWASP & CWE/SANS ranking for this vulnerability?

The industry standard Open Worldwide Application Security Project (OWASP) has identified this class of defect as one of their Top 10 security defects for 2021. See A03:2021-Injection

The industry standard Common Weakness Enumeration (CWE/SANS) v1.7 has identified this class of defect as one of their Top 25 security defects. See CWE-20: Improper Input Validation

The industry standard Common Weakness Enumeration (CWE/SANS) Dictionary has identified these related classes of defect as common security vulnerabilities:

CWE-183: Permissive List of Allowed Inputs
CWE-184: Incomplete List of Disallowed Inputs
CWE-185: Incorrect Regular Expression
All inputs from the user or user-supplied data from external systems are validated on the server side before being consumed.
Numeric data is validated for range (upper and lower bound) as well as type (signed vs. unsigned).
String data is validated for length, format, and character set.
If the string data is being used to form a URL, please consult the Server-Side Request Forgery Guidance.
All data must, where possible, be validated against a list of permissible values rather than a list of values to be rejected.
Guidance:

Developers must use defensive coding techniques and assume that all user-supplied input data received by their applications is potentially dangerous. Applications must perform input validation as soon as it is practical to do so and "fail fast" if invalid data is detected before any significant processing, storage or transmission of the data occurs.

In some circumstances it may be impractical to fail outright with an exception or other error when invalid data is encountered and an application may be designed to filter out the invalid data instead. This design pattern should only be used when necessary since it can make it difficult to detect attacks against the system.

User-supplied input data can come directly from users or from other computer systems outside the trust boundary of the application. Do not assume that user-supplied data from external systems outside your direct control has been properly validated. Note that malicious input may be injected while data is in transit or storage and not just at its initial entry point.

Failure to perform input validation may be a direct cause of code injection, data corruption, business logic/authorization vulnerabilities or denial of service, regardless of platform or technology, protocol or environment.

For web applications, consider that in addition to visible form fields, entry points also include hidden form fields, query strings, cookies, HTTP headers, XMLHttpRequests and web service parameters.

Where possible, use well-terminated regular expressions and an allow-list (known, valid, and safe input) rather than a deny-list (rejecting a known list of malicious or dangerous input). Deny-lists are often incomplete and easily circumvented.

Use Unicode categories in your regular expressions to help ensure that your input validation works for all locales and does not erroneously reject legitimate foreign language characters.

Write automated tests for your application that verify it handles invalid input correctly.

Log input validation failures to aid in attack detection and forensic analysis.

Code analysis tools can assist in locating code without validation checks. Use of tools will not find all instances but provides a good indicator of commonly missed validation points.

Validating URL input
Use Standard Libraries: Leverage standard libraries for URL parsing to ensure that the URL is split correctly into its components.
Normalize URLs: Convert URLs to a standard format to avoid discrepancies and potential security issues. Use URI Builder to create URIs in C#.
Validate URL Components: Ensure that only expected/allowed URL components are used, especially if your application interacts with external resources.
Handle Encoding: Ensure that all URLs tainted with customer input are decoded to avoid misinterpretation of the URL components using HttpUtility.UrlDecode. Use a loop to continue to decode the string until no further decoding is possible (i.e. the decoded string remains the same). This method ensures that even if the string is encoded multiple times, it will be fully decoded upon exiting the loop.
Restrict HTTP Verbs: When allowing customers to define their own outgoing web requests in a REST API, you must limit the HTTP verbs they can use. Determine the specific actions that customers need to perform and limit the available HTTP verbs to only those that are necessary for the intended functionality.
Preventing Abuse of URL Components
Each part of the URL can be a potential vector for abuse. Here's how to mitigate risks associated with each component:

Host Validation
Allowlist Domains: Only allow requests to known, trusted domains to prevent Server-Side Request Forgery (SSRF) attacks. If this is not possible, please refer to the SDL Guidance to implement protections against Server Side Request Forgery (SSRF) attacks to learn about alternate approaches to protect against SSRF.
DNS Resolution Checks: Perform DNS resolution to validate the host and prevent DNS rebinding attacks.
HTTP Redirect Checks: Ensure that each HTTP redirect is validated to confirm the legitimacy of the host prior to establishing a connection. To enhance security, disable auto-redirects if your service does not depend on them and if doing so will not disrupt your application.
Certificate Validation: For https schemes, ensure that the SSL/TLS certificate is valid, matches the host, and is not expired or revoked.
Path Validation
Directory Traversal Prevention: Sanitize input to prevent directory traversal attacks by enforcing an allow-list of values or alphanumeric characters. Characters like .. or % that can navigate the file system should not be allowed.
Map to Safe Resources: Ensure that the path maps to a resource that the user is authorized to access. You may need to reevaluate user authorization after path canonicalization.
Additional guidance for mitigating path traversal vulnerabilities is linked for C#, Java, and JavaScript.
Query Parameter Validation
Strict Type Checking: Enforce strict type checking on query parameters to prevent injection attacks.
Length Limits: Impose length limits on parameters to prevent buffer overflow vulnerabilities.
Allowlist Parameters: Only accept known parameters and discard any unexpected or unnecessary ones.
Fragment Handling
Client-Side Caution: Since fragments are handled client-side, ensure that any logic processing the fragment is secure against Cross-Site Scripting (XSS) attacks.
Negative Test Cases to validate URL input
The following test cases are intentionally crafted to be malicious, and should be included in unit tests in your project to ensure they are appropriately rejected to prevent abuse.

Dangerous URL Hostnames
URL hostnames can be abused to cause Server-Side Request Forgery attacks. Please refer to the SDL Guidance to implement protections against Server Side Request Forgery (SSRF) attacks to learn more.

All these hostnames are dangerous, and must be rejected by your application. https://localHoSt/ https://127.0.1.2/ https://0177.0.23.19/ https://2130706433/ https://0x7f.00331.0246.174/ https://[::1]/ https://[fc00::]/ https://169.254.169.254/

Dangerous URL Paths
URL paths can be abused to cause Path Traversal attacks.

All these paths are indicative of an attempt to traverse the URL path, and must be rejected by your application. /../../OtherPath/ /..//OtherPath/ /%2E%2E%2f/OtherPath/

Frequently Asked Questions
Do I need to validate Request.ServerVariables collection?

Request.ServerVariables may be applicable as some elements in this collection may have user data or may be tampered with, including but not limited to HTTP_<HeaderName>

SERVER_NAME
URL
REMOTE_ADDR
REMOTE_HOST
REMOTE_USER
For example, Request.ServerVariables, also called HTTP_REFERER, is used many times by attackers for spoofing. Applications consuming this variable without validation may fall into a trap and process a malicious request that appears to originate from a trusted URL.

What security attacks are possible if input is not validated before being consumed?

The following attacks are possible in this scenario:

Cross site scripting
Server-side Request Forgery
Path Traversal
SQL injection
XML injection
OS Command injection
LDAP injection
Canonicalization issues
Integer Overflow/Underflow
Response Splitting
Data Tampering
Code Injection
What is the difference between deny-listing and allow-listing approaches for validating input?

Each approach has a different strategy.

The deny-list approach, also called an exclusions list

In this approach, the developer tries to imagine all the bad input that may find its way into the application, and then rejects all these specific inputs. All other data is accepted. These are just a few of the inputs to look out for:

User Input Expected: First Name
Regular Expression: (<|<|<)( |\s)*(script|applet|embed|))
The deny-list strategy is a weak protection mechanism because you cannot brainstorm all the bad characters that attackers will use for a particular attack. Deny-listing depends heavily on the attacker's next moves, and therefore has to be continuously updated and changed. As new attack techniques come out, the list becomes outdated and requires constant monitoring.

The allow-list approach, also called an inclusions list

The allow-list strategy compares foreign user input to specific input that will be treated as acceptable. For example:

User Input Expected: First Name
Regular Expression: [a-z A-Z-]
This is an allow-list of all known good inputs. It permits only caps A to Z and small a-z, and discards all other input.

Allow-list filtering gives more control to the programmer as it is a restrictive kind of filtering mechanism. Allow-listing offers much better protection when the programmer has a good idea of the type of input expected for the application.

What is OWASP & CWE/SANS ranking for this vulnerability?

The industry standard Open Worldwide Application Security Project (OWASP) has identified this class of defect as one of their Top 10 security defects for 2021. See A03:2021-Injection

The industry standard Common Weakness Enumeration (CWE/SANS) v1.7 has identified this class of defect as one of their Top 25 security defects. See CWE-20: Improper Input Validation

The industry standard Common Weakness Enumeration (CWE/SANS) Dictionary has identified these related classes of defect as common security vulnerabilities:

CWE-183: Permissive List of Allowed Inputs
CWE-184: Incomplete List of Disallowed Inputs
CWE-185: Incorrect Regular ExpressionRequirement:

The following input validations must be performed on all input data from user-provided and user-controlled sources prior to being used:

All inputs from the user or user-supplied data from external systems are validated on the server side before being consumed.
Numeric data is validated for range (upper and lower bound) as well as type (signed vs. unsigned).
String data is validated for length, format, and character set.
If the string data is being used to form a URL, please consult the Server-Side Request Forgery Guidance.
All data must, where possible, be validated against a list of permissible values rather than a list of values to be rejected.
Guidance:

Developers must use defensive coding techniques and assume that all user-supplied input data received by their applications is potentially dangerous. Applications must perform input validation as soon as it is practical to do so and "fail fast" if invalid data is detected before any significant processing, storage or transmission of the data occurs.

In some circumstances it may be impractical to fail outright with an exception or other error when invalid data is encountered and an application may be designed to filter out the invalid data instead. This design pattern should only be used when necessary since it can make it difficult to detect attacks against the system.

User-supplied input data can come directly from users or from other computer systems outside the trust boundary of the application. Do not assume that user-supplied data from external systems outside your direct control has been properly validated. Note that malicious input may be injected while data is in transit or storage and not just at its initial entry point.

Failure to perform input validation may be a direct cause of code injection, data corruption, business logic/authorization vulnerabilities or denial of service, regardless of platform or technology, protocol or environment.

For web applications, consider that in addition to visible form fields, entry points also include hidden form fields, query strings, cookies, HTTP headers, XMLHttpRequests and web service parameters.

Where possible, use well-terminated regular expressions and an allow-list (known, valid, and safe input) rather than a deny-list (rejecting a known list of malicious or dangerous input). Deny-lists are often incomplete and easily circumvented.

Use Unicode categories in your regular expressions to help ensure that your input validation works for all locales and does not erroneously reject legitimate foreign language characters.

Write automated tests for your application that verify it handles invalid input correctly.

Log input validation failures to aid in attack detection and forensic analysis.

Code analysis tools can assist in locating code without validation checks. Use of tools will not find all instances but provides a good indicator of commonly missed validation points.

Validating URL input
Use Standard Libraries: Leverage standard libraries for URL parsing to ensure that the URL is split correctly into its components.
Normalize URLs: Convert URLs to a standard format to avoid discrepancies and potential security issues. Use URI Builder to create URIs in C#.
Validate URL Components: Ensure that only expected/allowed URL components are used, especially if your application interacts with external resources.
Handle Encoding: Ensure that all URLs tainted with customer input are decoded to avoid misinterpretation of the URL components using HttpUtility.UrlDecode. Use a loop to continue to decode the string until no further decoding is possible (i.e. the decoded string remains the same). This method ensures that even if the string is encoded multiple times, it will be fully decoded upon exiting the loop.
Restrict HTTP Verbs: When allowing customers to define their own outgoing web requests in a REST API, you must limit the HTTP verbs they can use. Determine the specific actions that customers need to perform and limit the available HTTP verbs to only those that are necessary for the intended functionality.
Preventing Abuse of URL Components
Each part of the URL can be a potential vector for abuse. Here's how to mitigate risks associated with each component:

Host Validation
Allowlist Domains: Only allow requests to known, trusted domains to prevent Server-Side Request Forgery (SSRF) attacks. If this is not possible, please refer to the SDL Guidance to implement protections against Server Side Request Forgery (SSRF) attacks to learn about alternate approaches to protect against SSRF.
DNS Resolution Checks: Perform DNS resolution to validate the host and prevent DNS rebinding attacks.
HTTP Redirect Checks: Ensure that each HTTP redirect is validated to confirm the legitimacy of the host prior to establishing a connection. To enhance security, disable auto-redirects if your service does not depend on them and if doing so will not disrupt your application.
Certificate Validation: For https schemes, ensure that the SSL/TLS certificate is valid, matches the host, and is not expired or revoked.
Path Validation
Directory Traversal Prevention: Sanitize input to prevent directory traversal attacks by enforcing an allow-list of values or alphanumeric characters. Characters like .. or % that can navigate the file system should not be allowed.
Map to Safe Resources: Ensure that the path maps to a resource that the user is authorized to access. You may need to reevaluate user authorization after path canonicalization.
Additional guidance for mitigating path traversal vulnerabilities is linked for C#, Java, and JavaScript.
Query Parameter Validation
Strict Type Checking: Enforce strict type checking on query parameters to prevent injection attacks.
Length Limits: Impose length limits on parameters to prevent buffer overflow vulnerabilities.
Allowlist Parameters: Only accept known parameters and discard any unexpected or unnecessary ones.
Fragment Handling
Client-Side Caution: Since fragments are handled client-side, ensure that any logic processing the fragment is secure against Cross-Site Scripting (XSS) attacks.
Negative Test Cases to validate URL input
The following test cases are intentionally crafted to be malicious, and should be included in unit tests in your project to ensure they are appropriately rejected to prevent abuse.

Dangerous URL Hostnames
URL hostnames can be abused to cause Server-Side Request Forgery attacks. Please refer to the SDL Guidance to implement protections against Server Side Request Forgery (SSRF) attacks to learn more.

All these hostnames are dangerous, and must be rejected by your application. https://localHoSt/ https://127.0.1.2/ https://0177.0.23.19/ https://2130706433/ https://0x7f.00331.0246.174/ https://[::1]/ https://[fc00::]/ https://169.254.169.254/

Dangerous URL Paths
URL paths can be abused to cause Path Traversal attacks.

All these paths are indicative of an attempt to traverse the URL path, and must be rejected by your application. /../../OtherPath/ /..//OtherPath/ /%2E%2E%2f/OtherPath/

Frequently Asked Questions
Do I need to validate Request.ServerVariables collection?

Request.ServerVariables may be applicable as some elements in this collection may have user data or may be tampered with, including but not limited to HTTP_<HeaderName>

SERVER_NAME
URL
REMOTE_ADDR
REMOTE_HOST
REMOTE_USER
For example, Request.ServerVariables, also called HTTP_REFERER, is used many times by attackers for spoofing. Applications consuming this variable without validation may fall into a trap and process a malicious request that appears to originate from a trusted URL.

What security attacks are possible if input is not validated before being consumed?

The following attacks are possible in this scenario:

Cross site scripting
Server-side Request Forgery
Path Traversal
SQL injection
XML injection
OS Command injection
LDAP injection
Canonicalization issues
Integer Overflow/Underflow
Response Splitting
Data Tampering
Code Injection
What is the difference between deny-listing and allow-listing approaches for validating input?

Each approach has a different strategy.

The deny-list approach, also called an exclusions list

In this approach, the developer tries to imagine all the bad input that may find its way into the application, and then rejects all these specific inputs. All other data is accepted. These are just a few of the inputs to look out for:

User Input Expected: First Name
Regular Expression: (<|<|<)( |\s)*(script|applet|embed|))
The deny-list strategy is a weak protection mechanism because you cannot brainstorm all the bad characters that attackers will use for a particular attack. Deny-listing depends heavily on the attacker's next moves, and therefore has to be continuously updated and changed. As new attack techniques come out, the list becomes outdated and requires constant monitoring.

The allow-list approach, also called an inclusions list

The allow-list strategy compares foreign user input to specific input that will be treated as acceptable. For example:

User Input Expected: First Name
Regular Expression: [a-z A-Z-]
This is an allow-list of all known good inputs. It permits only caps A to Z and small a-z, and discards all other input.

Allow-list filtering gives more control to the programmer as it is a restrictive kind of filtering mechanism. Allow-listing offers much better protection when the programmer has a good idea of the type of input expected for the application.

What is OWASP & CWE/SANS ranking for this vulnerability?

The industry standard Open Worldwide Application Security Project (OWASP) has identified this class of defect as one of their Top 10 security defects for 2021. See A03:2021-Injection

The industry standard Common Weakness Enumeration (CWE/SANS) v1.7 has identified this class of defect as one of their Top 25 security defects. See CWE-20: Improper Input Validation

The industry standard Common Weakness Enumeration (CWE/SANS) Dictionary has identified these related classes of defect as common security vulnerabilities:

CWE-183: Permissive List of Allowed Inputs
CWE-184: Incomplete List of Disallowed Inputs
CWE-185: Incorrect Regular ExpressionRequirement:

The following input validations must be performed on all input data from user-provided and user-controlled sources prior to being used:

All inputs from the user or user-supplied data from external systems are validated on the server side before being consumed.
Numeric data is validated for range (upper and lower bound) as well as type (signed vs. unsigned).
String data is validated for length, format, and character set.
If the string data is being used to form a URL, please consult the Server-Side Request Forgery Guidance.
All data must, where possible, be validated against a list of permissible values rather than a list of values to be rejected.
Guidance:

Developers must use defensive coding techniques and assume that all user-supplied input data received by their applications is potentially dangerous. Applications must perform input validation as soon as it is practical to do so and "fail fast" if invalid data is detected before any significant processing, storage or transmission of the data occurs.

In some circumstances it may be impractical to fail outright with an exception or other error when invalid data is encountered and an application may be designed to filter out the invalid data instead. This design pattern should only be used when necessary since it can make it difficult to detect attacks against the system.

User-supplied input data can come directly from users or from other computer systems outside the trust boundary of the application. Do not assume that user-supplied data from external systems outside your direct control has been properly validated. Note that malicious input may be injected while data is in transit or storage and not just at its initial entry point.

Failure to perform input validation may be a direct cause of code injection, data corruption, business logic/authorization vulnerabilities or denial of service, regardless of platform or technology, protocol or environment.

For web applications, consider that in addition to visible form fields, entry points also include hidden form fields, query strings, cookies, HTTP headers, XMLHttpRequests and web service parameters.

Where possible, use well-terminated regular expressions and an allow-list (known, valid, and safe input) rather than a deny-list (rejecting a known list of malicious or dangerous input). Deny-lists are often incomplete and easily circumvented.
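
The difference between a well-terminated and an unterminated expression can be sketched in Python as follows. The username rule and the set of sort fields are hypothetical. Note that in Python's `re` module `$` also matches just before a trailing newline, so this sketch uses `fullmatch` for the terminated check.

```python
import re

ALLOWED_SORT_FIELDS = frozenset({"name", "created", "size"})  # hypothetical allow-list
USERNAME = re.compile(r"[a-z0-9_]{3,20}")  # hypothetical format rule
USERNAME_DOLLAR = re.compile(r"^[a-z0-9_]{3,20}$")


def is_allowed_sort_field(value: str) -> bool:
    """Allow-list of permissible values: anything not listed is rejected."""
    return value in ALLOWED_SORT_FIELDS


def is_valid_username(value: str) -> bool:
    """Well-terminated: the whole input must match the pattern."""
    return USERNAME.fullmatch(value) is not None


def is_valid_username_unterminated(value: str) -> bool:
    """Anti-pattern: accepts any input that merely contains a valid fragment."""
    return USERNAME.search(value) is not None


def is_valid_username_dollar(value: str) -> bool:
    """Anti-pattern in Python: '$' still accepts a trailing newline."""
    return USERNAME_DOLLAR.match(value) is not None
```

With these definitions, `alice<script>` passes the unterminated check but fails `is_valid_username`, and `"alice\n"` passes the `$` variant but fails `fullmatch`.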

Use Unicode categories in your regular expressions to help ensure that your input validation works for all locales and does not erroneously reject legitimate foreign language characters.
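
Python's built-in `re` module does not support `\p{...}` category escapes, so this sketch applies the same idea with `unicodedata.category` instead of a regular expression. The function name, the extra punctuation set, and the length limit are assumptions made for illustration.

```python
import unicodedata

ALLOWED_PUNCTUATION = frozenset({" ", "-", "'"})  # hypothetical extras for personal names


def is_valid_person_name(value: str, max_length: int = 100) -> bool:
    """Accept letters and combining marks from any script plus a few separators."""
    value = unicodedata.normalize("NFC", value)
    if not 1 <= len(value) <= max_length:
        return False
    for ch in value:
        category = unicodedata.category(ch)  # e.g. "Lu", "Ll", "Lo", "Mn", "Nd", "Sm"
        if category.startswith("L") or category.startswith("M"):
            continue
        if ch in ALLOWED_PUNCTUATION:
            continue
        return False
    return True
```

Names such as `José`, `Анна`, and `李小龍` are accepted, while digits, control characters, and symbols such as `<` are not.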

Write automated tests for your application that verify it handles invalid input correctly.

Log input validation failures to aid in attack detection and forensic analysis.
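
A minimal sketch of logging a validation failure, assuming Python's standard `logging` module. The logger name, the `validate_port` function, and the `source` label are hypothetical. The rejected value is truncated and logged with `%r` so that control characters are escaped and hostile input cannot forge extra log lines.

```python
import logging

logger = logging.getLogger("input_validation")  # hypothetical logger name


def validate_port(raw: str, source: str) -> int:
    """Return raw as a TCP port number, or log the failure and raise ValueError."""
    ok = (
        raw.isascii()
        and raw.isdigit()
        and len(raw) <= 5
        and 1 <= int(raw) <= 65535
    )
    if not ok:
        # %r escapes newlines and other control characters in the logged value.
        logger.warning(
            "validation_failure field=port source=%r value=%r", source, raw[:80]
        )
        raise ValueError("invalid port")
    return int(raw)
```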

Code analysis tools can assist in locating code without validation checks. Use of tools will not find all instances but provides a good indicator of commonly missed validation points.

Validating URL input
Use Standard Libraries: Leverage standard libraries for URL parsing to ensure that the URL is split correctly into its components.
Normalize URLs: Convert URLs to a standard format to avoid discrepancies and potential security issues. Use URI Builder to create URIs in C#.
Validate URL Components: Ensure that only expected/allowed URL components are used, especially if your application interacts with external resources.
Handle Encoding: Ensure that all URLs tainted with customer input are decoded to avoid misinterpretation of the URL components using HttpUtility.UrlDecode. Use a loop to continue to decode the string until no further decoding is possible (i.e. the decoded string remains the same). This method ensures that even if the string is encoded multiple times, it will be fully decoded upon exiting the loop.
Restrict HTTP Verbs: When allowing customers to define their own outgoing web requests in a REST API, you must limit the HTTP verbs they can use. Determine the specific actions that customers need to perform and limit the available HTTP verbs to only those that are necessary for the intended functionality.
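
The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the C# approach the guidance names: `urllib.parse.urlsplit` and `unquote` stand in for a URI builder and `HttpUtility.UrlDecode`. The allowed schemes and verbs and the `MAX_DECODE_ROUNDS` cap are hypothetical; the cap is an addition in this sketch so that hostile input cannot force unbounded work.

```python
from urllib.parse import SplitResult, unquote, urlsplit

ALLOWED_SCHEMES = frozenset({"https"})
ALLOWED_METHODS = frozenset({"GET", "POST"})  # hypothetical: only the verbs the feature needs
MAX_DECODE_ROUNDS = 5  # hypothetical cap on nested encoding


def fully_decode(value: str) -> str:
    """Percent-decode repeatedly until the string stops changing."""
    for _ in range(MAX_DECODE_ROUNDS):
        decoded = unquote(value)
        if decoded == value:
            return value
        value = decoded
    raise ValueError("too many layers of percent-encoding")


def validate_outgoing_request(method: str, url: str) -> SplitResult:
    """Return the parsed URL if method and URL pass the checks, else raise ValueError."""
    if method not in ALLOWED_METHODS:
        raise ValueError("HTTP method is not allowed")
    # Use the standard-library parser instead of hand-rolled string splitting.
    parts = urlsplit(fully_decode(url))
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError("URL scheme is not allowed")
    if not parts.hostname:
        raise ValueError("URL has no host")
    if parts.username is not None or parts.password is not None:
        raise ValueError("credentials in URLs are not allowed")
    return parts
```

For example, `fully_decode("%252e%252e%252f")` yields `../`, which later path checks can then reject.
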
Preventing Abuse of URL Components
Each part of the URL can be a potential vector for abuse. Here's how to mitigate risks associated with each component:

Host Validation
Allowlist Domains: Only allow requests to known, trusted domains to prevent Server-Side Request Forgery (SSRF) attacks. If this is not possible, please refer to the SDL Guidance to implement protections against Server Side Request Forgery (SSRF) attacks to learn about alternate approaches to protect against SSRF.
DNS Resolution Checks: Perform DNS resolution to validate the host and prevent DNS rebinding attacks.
HTTP Redirect Checks: Ensure that each HTTP redirect is validated to confirm the legitimacy of the host prior to establishing a connection. To enhance security, disable auto-redirects if your service does not depend on them and if doing so will not disrupt your application.
Certificate Validation: For https schemes, ensure that the SSL/TLS certificate is valid, matches the host, and is not expired or revoked.
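
A DNS resolution check might look like the following sketch, which resolves a host and rejects it if any resulting address is loopback, private, link-local, or otherwise internal. The function names are hypothetical, and the `resolver` parameter exists only so the check can be exercised without network access. Returning the vetted addresses lets the caller connect to one of them directly instead of resolving the name a second time, which is one common way to reduce exposure to DNS rebinding.

```python
import ipaddress
import socket


def is_internal_address(ip_text: str) -> bool:
    """True for addresses that should not be reachable from user-supplied URLs."""
    ip = ipaddress.ip_address(ip_text)
    mapped = getattr(ip, "ipv4_mapped", None)  # unwrap ::ffff:a.b.c.d
    if mapped is not None:
        ip = mapped
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def resolve_public_addresses(host: str, port: int = 443, resolver=socket.getaddrinfo):
    """Resolve host and return its addresses, or raise ValueError if any is internal."""
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    addresses = sorted({info[4][0] for info in infos})
    if not addresses:
        raise ValueError("host did not resolve")
    for address in addresses:
        if is_internal_address(address):
            raise ValueError("host resolves to an internal address")
    return addresses
```
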
Path Validation
Directory Traversal Prevention: Sanitize input to prevent directory traversal attacks by enforcing an allow-list of values or alphanumeric characters. Sequences such as .. that navigate the file system, and the % character that can hide them behind percent-encoding, should not be allowed.
Map to Safe Resources: Ensure that the path maps to a resource that the user is authorized to access. You may need to reevaluate user authorization after path canonicalization.
Additional guidance for mitigating path traversal vulnerabilities is linked for C#, Java, and JavaScript.
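
The path checks above can be sketched in Python as follows. The segment allow-list `SAFE_SEGMENT` and the function name are hypothetical, and a real application would still need to re-check authorization against the returned canonical path.

```python
import re
from urllib.parse import unquote

# Hypothetical allow-list: alphanumerics, "_" and "-", with one optional extension.
SAFE_SEGMENT = re.compile(r"[A-Za-z0-9_-]+(?:\.[A-Za-z0-9]+)?")
MAX_DECODE_ROUNDS = 5  # hypothetical cap on nested percent-encoding


def validate_url_path(raw_path: str) -> str:
    """Return a canonical path, or raise ValueError for anything suspicious."""
    path = raw_path
    for _ in range(MAX_DECODE_ROUNDS):
        decoded = unquote(path)
        if decoded == path:
            break
        path = decoded
    else:
        raise ValueError("too many layers of percent-encoding")
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    body = path[1:]
    if body.endswith("/"):
        body = body[:-1]
    segments = body.split("/") if body else []
    for segment in segments:
        # Rejects "..", ".", empty segments from "//", backslashes, and "%".
        if not SAFE_SEGMENT.fullmatch(segment):
            raise ValueError("path segment is not allowed")
    return "/" + "/".join(segments)
```
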
Query Parameter Validation
Strict Type Checking: Enforce strict type checking on query parameters to prevent injection attacks.
Length Limits: Impose length limits on parameters to prevent buffer overflow vulnerabilities.
Allowlist Parameters: Only accept known parameters and discard any unexpected or unnecessary ones.
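
A minimal sketch of these three query checks, assuming Python's `urllib.parse.parse_qs`. The schema (`page`, `sort`), the limits, and the helper names are hypothetical. Unknown parameters are discarded as described above, while known parameters with invalid values raise.

```python
from urllib.parse import parse_qs

MAX_VALUE_LENGTH = 64  # hypothetical per-value length limit
MAX_FIELDS = 10  # hypothetical cap on the number of fields parsed


def _int_in_range(low: int, high: int):
    def check(raw: str) -> int:
        if not (raw.isascii() and raw.isdigit() and len(raw) <= 9):
            raise ValueError("expected an unsigned integer")
        value = int(raw)
        if not low <= value <= high:
            raise ValueError("integer out of range")
        return value
    return check


def _one_of(*allowed: str):
    def check(raw: str) -> str:
        if raw not in allowed:
            raise ValueError("value is not in the allow-list")
        return raw
    return check


# Hypothetical schema: the only parameters this endpoint accepts.
QUERY_SCHEMA = {"page": _int_in_range(1, 10000), "sort": _one_of("name", "created")}


def validate_query(query_string: str) -> dict:
    """Return typed values for known parameters; raise ValueError on invalid ones."""
    params = parse_qs(query_string, keep_blank_values=True, max_num_fields=MAX_FIELDS)
    clean = {}
    for name, values in params.items():
        if name not in QUERY_SCHEMA:
            continue  # discard unexpected parameters
        if len(values) != 1:
            raise ValueError("repeated parameter")
        raw = values[0]
        if len(raw) > MAX_VALUE_LENGTH:
            raise ValueError("parameter value is too long")
        clean[name] = QUERY_SCHEMA[name](raw)
    return clean
```
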
Fragment Handling
Client-Side Caution: Since fragments are handled client-side, ensure that any logic processing the fragment is secure against Cross-Site Scripting (XSS) attacks.
Negative Test Cases to validate URL input
The following test cases are intentionally crafted to be malicious, and should be included in unit tests in your project to ensure they are appropriately rejected to prevent abuse.

Dangerous URL Hostnames
URL hostnames can be abused to cause Server-Side Request Forgery attacks. Please refer to the SDL Guidance to implement protections against Server Side Request Forgery (SSRF) attacks to learn more.

All of these hostnames are dangerous and must be rejected by your application:

https://localHoSt/
https://127.0.1.2/
https://0177.0.23.19/
https://2130706433/
https://0x7f.00331.0246.174/
https://[::1]/
https://[fc00::]/
https://169.254.169.254/
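
Several of the hostnames above rely on alternative IPv4 spellings (octal, hexadecimal, or a single 32-bit integer) that many network stacks accept. The following Python sketch normalizes those forms before classifying the address. The function names are hypothetical, and the parser is written out by hand here because `ipaddress.ip_address` only accepts strict dotted-decimal IPv4 text. A DNS name that is not an address literal returns `False` from this check and still needs allow-list and resolution checks.

```python
import ipaddress
import re
from urllib.parse import urlsplit

_IPV4_PART = re.compile(r"0[xX][0-9a-fA-F]+|[0-9]+")


def parse_legacy_ipv4(host: str):
    """Parse decimal, octal, or hex IPv4 text with 1 to 4 parts; return None if not an IP."""
    parts = host.split(".")
    if not 1 <= len(parts) <= 4:
        return None
    numbers = []
    for part in parts:
        if not _IPV4_PART.fullmatch(part):
            return None
        if part[:2].lower() == "0x":
            base = 16
        elif len(part) > 1 and part[0] == "0":
            base = 8
        else:
            base = 10
        try:
            numbers.append(int(part, base))
        except ValueError:  # for example "089" is not valid octal
            return None
    *head, last = numbers
    # The final part fills all remaining bytes of the 32-bit address.
    if any(n > 255 for n in head) or last >= 256 ** (4 - len(head)):
        return None
    value = last
    for index, n in enumerate(head):
        value += n << (8 * (3 - index))
    return ipaddress.IPv4Address(value)


def is_internal_host_literal(url: str) -> bool:
    """True if the URL's host is localhost or an internal IP literal in any spelling."""
    try:
        host = urlsplit(url).hostname  # lower-cased by the parser
    except ValueError:
        return True
    if not host:
        return True
    host = host.rstrip(".")
    if host == "localhost" or host.endswith(".localhost"):
        return True
    if ":" in host:
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            return True
        if ip.ipv4_mapped is not None:
            ip = ip.ipv4_mapped
    else:
        ip = parse_legacy_ipv4(host)
        if ip is None:
            return False  # a DNS name, not an address literal
    return (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
    )
```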

Dangerous URL Paths
URL paths can be abused to cause Path Traversal attacks.

All of these paths indicate an attempt to traverse the URL path and must be rejected by your application:

/../../OtherPath/
/..//OtherPath/
/%2E%2E%2f/OtherPath/

Frequently Asked Questions
Do I need to validate Request.ServerVariables collection?

Validation may be needed, because some elements of the Request.ServerVariables collection may contain user data or may be tampered with, including but not limited to:

HTTP_<HeaderName>
SERVER_NAME
URL
REMOTE_ADDR
REMOTE_HOST
REMOTE_USER

For example, the HTTP_REFERER entry of Request.ServerVariables is frequently spoofed by attackers. An application that consumes this variable without validation may process a malicious request that appears to originate from a trusted URL.

What security attacks are possible if input is not validated before being consumed?

The following attacks are possible in this scenario:

Cross site scripting
Server-side Request Forgery
Path Traversal
SQL injection
XML injection
OS Command injection
LDAP injection
Canonicalization issues
Integer Overflow/Underflow
Response Splitting
Data Tampering
Code Injection
What is the difference between deny-listing and allow-listing approaches for validating input?

Each approach has a different strategy.

The deny-list approach, also called an exclusions list

In this approach, the developer tries to imagine all the bad input that may find its way into the application, and then rejects all these specific inputs. All other data is accepted. These are just a few of the inputs to look out for:

User Input Expected: First Name
Regular Expression: (<|<|<)( |\s)*(script|applet|embed|))
The deny-list strategy is a weak protection mechanism because you cannot brainstorm all the bad characters that attackers will use for a particular attack. Deny-listing depends heavily on the attacker's next moves, and therefore has to be continuously updated and changed. As new attack techniques come out, the list becomes outdated and requires constant monitoring.

The allow-list approach, also called an inclusions list

The allow-list strategy compares foreign user input to specific input that will be treated as acceptable. For example:

User Input Expected: First Name
Regular Expression: [a-z A-Z-]
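This is an allow-list of known good characters. It permits only uppercase A to Z, lowercase a to z, spaces, and hyphens, and does not accept any other input. As written, the expression matches a single character; to validate a whole value it must be quantified and anchored, for example ^[a-z A-Z-]+$.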

Allow-list filtering gives more control to the programmer as it is a restrictive kind of filtering mechanism. Allow-listing offers much better protection when the programmer has a good idea of the type of input expected for the application.

What is the OWASP and CWE/SANS ranking for this vulnerability?

The industry standard Open Worldwide Application Security Project (OWASP) has identified this class of defect as one of their Top 10 security defects for 2021. See A03:2021-Injection

The industry standard Common Weakness Enumeration (CWE/SANS) v1.7 has identified this class of defect as one of their Top 25 security defects. See CWE-20: Improper Input Validation

The industry standard Common Weakness Enumeration (CWE/SANS) Dictionary has identified these related classes of defect as common security vulnerabilities:

CWE-183: Permissive List of Allowed Inputs
CWE-184: Incomplete List of Disallowed Inputs
CWE-185: Incorrect Regular Expression
