|
23 | 23 |
|
24 | 24 |
|
25 | 25 | /** |
26 | | - * The Encoder interface contains a number of methods for decoding input and encoding output |
27 | | - * so that it will be safe for a variety of interpreters. To prevent |
28 | | - * double-encoding, callers should make sure input does not already contain encoded characters |
29 | | - * by calling canonicalize. Validator implementations should call canonicalize on user input |
30 | | - * <b>before</b> validating to prevent encoded attacks. |
| 26 | + * The {@code Encoder} interface contains a number of methods for decoding input and encoding output |
| 27 | + * so that it will be safe for a variety of interpreters. Its primary use is to |
| 28 | + * provide <i>output</i> encoding to prevent XSS. |
31 | 29 | * <p> |
32 | | - * All of the methods must use a "whitelist" or "positive" security model. |
33 | | - * For the encoding methods, this means that all characters should be encoded, except for a specific list of |
34 | | - * "immune" characters that are known to be safe. |
35 | | - * <p> |
36 | | - * The Encoder performs two key functions, encoding and decoding. These functions rely |
| 30 | + * To prevent double-encoding, callers should make sure input does not already contain encoded characters |
| 31 | + * by calling one of the {@code canonicalize()} methods. Validator implementations should call |
| 32 | + * {@code canonicalize()} on user input <b>before</b> validating to prevent encoded attacks. |
| 33 | + * </p><p> |
| 34 | + * All of the methods <b>must</b> use an "allow list" or "positive" security model rather |
| 35 | + * than a "deny list" or "negative" security model. For the encoding methods, this means that |
| 36 | + * all characters should be encoded, except for a specific list of "immune" characters that are |
| 37 | + * known to be safe. |
| 38 | + * </p><p> |
| 39 | + * The {@code Encoder} performs two key functions, encoding and decoding. These functions rely |
37 | 40 | * on a set of codecs that can be found in the org.owasp.esapi.codecs package. These include: |
38 | | - * <ul><li>CSS Escaping</li> |
| 41 | + * <ul> |
| 42 | + * <li>CSS Escaping</li> |
39 | 43 | * <li>HTMLEntity Encoding</li> |
40 | 44 | * <li>JavaScript Escaping</li> |
41 | | - * <li>MySQL Escaping</li> |
42 | | - * <li>Oracle Escaping</li> |
| 45 | + * <li>MySQL Database Escaping</li> |
| 46 | + * <li>Oracle Database Escaping</li> |
43 | 47 | * <li>Percent Encoding (aka URL Encoding)</li> |
44 | | - * <li>Unix Escaping</li> |
| 48 | + * <li>Unix Shell Escaping</li> |
45 | 49 | * <li>VBScript Escaping</li> |
46 | | - * <li>Windows Encoding</li></ul> |
47 | | - * <p> |
| 50 | + * <li>Windows Cmd Escaping</li> |
| 51 | + * <li>LDAP Escaping</li> |
| 52 | + * <li>XML and XML Attribute Encoding</li> |
| 53 | + * <li>XPath Escaping</li> |
| 54 | + * <li>Base64 Encoding</li> |
| 55 | + * </ul> |
| 56 | + * </p><p> |
| 57 | + * The primary use of ESAPI {@code Encoder} is to prevent XSS vulnerabilities by |
| 58 | + * providing output encoding using the various "encodeFor<i>XYZ</i>()" methods, |
| 59 | + * where <i>XYZ</i> is one of CSS, HTML, HTMLAttribute, JavaScript, or URL. When |
| 60 | + * using the ESAPI output encoders, it is important that you use the one for the |
| 61 | + * <b>appropriate context</b> where the output will be rendered. For example, if |
| 62 | + * the output appears in a JavaScript context, you should use {@code encodeForJavaScript} |
| 63 | + * (note this includes all of the DOM JavaScript event handler attributes such as |
| 64 | + * 'onfocus', 'onclick', 'onload', etc.). If the output would be rendered in an HTML |
| 65 | + * attribute context (with the exception of the aforementioned 'onevent' type event |
| 66 | + * handler attributes), you would use {@code encodeForHTMLAttribute}. If you are |
| 67 | + * encoding anywhere a URL is expected (e.g., a 'href' attribute for an <a> or |
| 68 | + * a 'src' attribute on an <img> tag, etc.), then you should use {@code encodeForURL}. |
| 69 | + * If encoding CSS, then use {@code encodeForCSS}. Etc. This is because there are |
| 70 | + * different escaping requirements for these different contexts. Developers who are |
| 71 | + * new to ESAPI or to defending against XSS vulnerabilities are highly encouraged to |
| 72 | + * <i>first</i> read the |
| 73 | + * <a href="https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html" target="_blank" rel="noopener noreferrer"> |
| 74 | + * OWASP Cross-Site Scripting Prevention Cheat Sheet</a>. |
| 75 | + * </p><p> |
| 76 | + * Note that in addition to these encoder methods, ESAPI also provides a JSP Tag |
| 77 | + * Library ({@code META-INF/esapi.tld}) in the ESAPI jar. This allows one to use |
| 78 | + * the more convenient JSP tags in JSPs. These tags are simply wrappers for the |
| 79 | + * various "encodeFor<i>XYZ</i>()" methods. |
| 80 | + * </p><p> |
| 81 | + * <b>Some important final words:</b> |
| 82 | + * <ul> |
| 83 | + * <li><b>Where to output encode:</b> |
| 84 | + * Knowing where to place the output encoding in your code |
| 85 | + * is just as important as knowing which context (HTML, HTML attribute, CSS, |
| 86 | + * JavaScript, or URL) to use for the output encoding and surprisingly the two |
| 87 | + * are often related. In general, output encoding should be done just prior to the |
| 88 | + * output being rendered because that is what determines what the appropriate |
| 89 | + * context is for the output encoding. In fact, doing output encoding on |
| 90 | + * untrusted data that is stored and to be used later--whether stored in an HTTP |
| 91 | + * session or in a database--is almost always considered an anti-pattern. An |
| 92 | + * example of this is when one gathers and stores some untrusted data item such as an |
| 93 | + * email address from a user. A developer thinks "let's output encode this and |
| 94 | + * store the encoded data in the database, thus making the untrusted data safe |
| 95 | + * to use, thus saving us all the encoding troubles later on". On the surface, |
| 96 | + * that sounds like a reasonable approach. The problem is how to know what |
| 97 | + * output encoding to use, not only for now, but for all possible <i>future</i> |
| 98 | + * uses? It might be that the current application code base is only using it in |
| 99 | + * an HTML context that is displayed in an HTML report or shown in an HTML |
| 100 | + * context in the user's profile. But what if it is later used in a mailto: URL? |
| 101 | + * Then instead of HTML encoding, it would need to have URL encoding. Similarly, |
| 102 | + * what if there is a later switch made to use AJAX and the untrusted email |
| 103 | + * address gets used in a JavaScript context? The complication is that even if |
| 104 | + * you know with certainty today all the ways that an untrusted data item is |
| 105 | + * used in your application, it is generally impossible to predict all the |
| 106 | + * contexts that it may be used in the future, not only in your application, but |
| 107 | + * in other applications that could access that data in the database. |
| 108 | + * </li> |
| 109 | + * <li><b>Avoiding multiple <i>nested</i> contexts:</b> |
| 110 | + * A really tricky situation to get correct is when there are multiple nested |
| 111 | + * encoding contexts. By far, the most common place this seems to come up is |
| 112 | + * untrusted URLs used in JavaScript. How should you handle that? Well, to be |
| 113 | + * honest, the best way is to rewrite your code to avoid it. An example of |
| 114 | + * this that is well worth reading may be found at |
| 115 | + * <a href="https://lists.owasp.org/pipermail/esapi-dev/2012-March/002090" |
| 116 | + * target="_blank" rel="noopener noreferrer">ESAPI-DEV mailing list archives: |
| 117 | + * URL encoding within JavaScript</a>. Be sure to read the entire thread. |
| 118 | + * The question itself is too nuanced to be answered in Javadoc, but now, |
| 119 | + * hopefully you are at least aware of the potential pitfalls. |
| 120 | + * </li> |
| 121 | + * </ul> |
48 | 122 | * |
49 | 123 | * @author Jeff Williams (jeff.williams .at. aspectsecurity.com) <a |
50 | 124 | * href="http://www.aspectsecurity.com">Aspect Security</a> |
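
The Javadoc above argues for an allow-list model ("encode everything except a known set of immune characters") and for encoding at the point of output, because the correct encoding depends on the rendering context. The following is a minimal, self-contained sketch of both ideas. It is not ESAPI's implementation or API: the class `AllowListEncoderSketch`, its method names, and the chosen immune set are hypothetical and exist only for illustration. The URL variant delegates to the JDK's `java.net.URLEncoder`, which performs form-style encoding and is used here only as a stand-in.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Illustrative sketch only; hypothetical names, not part of ESAPI. */
final class AllowListEncoderSketch {

    // Assumption for this sketch: a small set of punctuation treated as "immune".
    private static final String IMMUNE_HTML = ",.-_ ";

    /** Allow-list model: pass through alphanumerics and immune characters, encode the rest. */
    static String encodeForHtml(String input) {
        StringBuilder out = new StringBuilder();
        input.codePoints().forEach(cp -> {
            boolean alphanumeric = (cp >= 'a' && cp <= 'z')
                    || (cp >= 'A' && cp <= 'Z')
                    || (cp >= '0' && cp <= '9');
            if (alphanumeric || IMMUNE_HTML.indexOf(cp) >= 0) {
                out.appendCodePoint(cp);
            } else {
                // Everything not explicitly allowed becomes a hexadecimal character reference.
                out.append("&#x").append(Integer.toHexString(cp)).append(';');
            }
        });
        return out.toString();
    }

    /** URL context stand-in: form-style percent-encoding from the JDK. */
    static String encodeForUrl(String input) {
        try {
            return URLEncoder.encode(input, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Store the raw value; choose the encoding only when the output context is known.
        String storedEmail = "bob@example.com";
        System.out.println(encodeForHtml(storedEmail)); // bob&#x40;example.com
        System.out.println(encodeForUrl(storedEmail));  // bob%40example.com
    }
}
```

The same stored value yields two different encodings depending on where it is rendered. Had the HTML-encoded form been written to the database, a later `mailto:` use would have had to undo that encoding before applying the right one.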
|