Commit a1df1ba

Merge pull request ossf#98 from ossf/application-side
Clean up application-side text
2 parents e9deefb + ec81199 commit a1df1ba

1 file changed: +14 -8 lines changed

secure_software_development_fundamentals.md

Lines changed: 14 additions & 8 deletions
@@ -2597,20 +2597,26 @@ can be confusing, so an example may help.
 In the Node.js mysqljs/mysql library,
 imagine that an attacker manages to provide
 the JavaScript *object* `{password = 1}` as the password parameter
-and it's used in the SQL query
-`SELECT * FROM accounts WHERE username = ? AND password = ?`.
+(this is not just a string, but an actual JavaScript object).
+Now imagine that this object is used in the SQL query
+<tt>SELECT &#42; FROM accounts WHERE username = ? AND password = ?</tt>
+(note that this is parameterized).
 The library will internally expand the expression after `AND`
-into `password = ``password`` = 1`.
-The MYSQL DBMS will interpret `password = ``password``` as 1 (true),
-and then determine that `1 = 1` is true.
+into <tt>password = &#96;password&#96; = 1</tt> because the library does simple
+text replacement of the second `?`, without noticing that a JavaScript object
+doesn't make sense in the context of this query (a string or number would
+be expected here).
+The MYSQL DBMS will interpret <tt>password = &#96;password&#96;</tt>
+as 1 (true), and then determine that `1 = 1` is true.
 The result: this expression will *always* be true.
 This incorrect escaping of a complex data type
 is enough to completely bypass authentication in some situations.
 
 Unfortunately, this last issue can be a challenge to solve:
 
 1. The safe solution is to make sure that complex data types
-(types other than numbers and strings) are not expanded by the library
+(types other than numbers and strings) are not expanded by
+application-side libraries
 unless the developer specifically marks them as allowed.
 This may be impractical if the application already depends on this,
 and the library might not provide a way to fully disable the functionality.
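
To make the expansion described in the hunk above concrete, here is a minimal sketch in plain JavaScript. It is a simplified imitation of application-side placeholder expansion, not the actual mysqljs/mysql source; the function names `escapeId`, `escapeValue`, `expandPlaceholders`, and `requireScalar` are hypothetical, and the string escaping is deliberately incomplete. Note that the context line writes the object as `{password = 1}`; a valid JavaScript object literal is `{password: 1}`, which is what the sketch uses.

```javascript
// Simplified imitation of application-side placeholder expansion.
// NOT the real mysqljs/mysql code; every function name here is hypothetical.
function escapeId(name) {
  return '`' + String(name).replace(/`/g, '``') + '`';
}

function escapeValue(value) {
  if (typeof value === 'number') return String(value);
  if (typeof value === 'string') {
    // Toy string escaping for illustration only; not safe for production use.
    return "'" + value.replace(/\\/g, '\\\\').replace(/'/g, "\\'") + "'";
  }
  if (value !== null && typeof value === 'object') {
    // The risky step: an object is expanded into `key` = value pairs.
    return Object.keys(value)
      .map((key) => escapeId(key) + ' = ' + escapeValue(value[key]))
      .join(', ');
  }
  throw new TypeError('unsupported parameter type');
}

// Replace each "?" in order with the escaped form of the matching parameter.
function expandPlaceholders(sql, params) {
  let index = 0;
  return sql.replace(/\?/g, () => escapeValue(params[index++]));
}

// Guard: accept only the scalar types this query expects.
function requireScalar(value) {
  if (typeof value !== 'string' && typeof value !== 'number') {
    throw new TypeError('expected a string or number');
  }
  return value;
}

const sql = 'SELECT * FROM accounts WHERE username = ? AND password = ?';
console.log(expandPlaceholders(sql, ['alice', { password: 1 }]));
// prints: SELECT * FROM accounts WHERE username = 'alice' AND password = `password` = 1
```

Passing each user-supplied parameter through a check like `requireScalar` before it reaches the query is one way to apply item 1 in application code. The mysqljs/mysql documentation also describes a `stringifyObjects` connection option; whether it is sufficient for a given application should be checked against that documentation.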
@@ -3207,7 +3213,7 @@ In that case, where possible, use libraries *already designed* to allow only wha
 
 We have focused on escaping HTML, because that is the biggest problem in web applications. But HTML can embed other kinds of data, and of those, perhaps the most common are URLs.
 
-Embedded URLs must also be escaped, and the rules for escaping URLs are different. The URL syntax is generally **scheme&#58;[//authority]path[?query][#fragment]**. For example, in the URL **<https://www.linuxfoundation.org/about/>**, the scheme is “**https**”, authority “<b>www.linuxfoundation.org</b>”, path is “**/about/**”, and this example has no query or fragment part. Sometimes you need special characters in the path, query, or fragment. The conventional way to escape those parts of the URLs is to first ensure the data is encoded with UTF-8, and escape as “**%hh**” (where **hh** is the hexadecimal representation) all bytes except for “safe” bytes, which are typically **A-Z**, **a-z**, **0-9**, “**.**”, “**-**”, “**&#42;**”, and “**&#95;**”. The Java routine **java.net.URLEncoder.encode()** turns all spaces into “**+**” instead of “**%20**”; both the “**+**” and “**%20**” conventions are in wide use.
+Embedded URLs must also be escaped, and the rules for escaping URLs are different. The URL syntax is generally **scheme&#58;[//authority]path[?query]&#8202;[&#35;fragment]**. For example, in the URL **<https://www.linuxfoundation.org/about/>**, the scheme is “**https**”, authority “<b>www.linuxfoundation.org</b>”, path is “**/about/**”, and this example has no query or fragment part. Sometimes you need special characters in the path, query, or fragment. The conventional way to escape those parts of the URLs is to first ensure the data is encoded with UTF-8, and escape as “**%hh**” (where **hh** is the hexadecimal representation) all bytes except for “safe” bytes, which are typically **A-Z**, **a-z**, **0-9**, “**.**”, “**-**”, “**&#42;**”, and “**&#95;**”. The Java routine **java.net.URLEncoder.encode()** turns all spaces into “**+**” instead of “**%20**”; both the “**+**” and “**%20**” conventions are in wide use.
 
 #### XSS Alternatives
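
The two space conventions mentioned in the URL-escaping paragraph of the hunk above can be seen side by side in JavaScript. This is a small sketch using the standard `encodeURIComponent` function and `URLSearchParams` class; the input string and the `example.org` URL are made up for illustration. Note that `encodeURIComponent` leaves a slightly larger set unescaped than the one listed in the text (it also leaves `!`, `~`, `'`, `(`, and `)` alone).

```javascript
// Hypothetical untrusted value to embed in the query part of a URL.
const untrusted = 'a b&c=d/é';

// Percent-encodes the UTF-8 bytes of everything outside a small safe set;
// a space becomes %20.
const percentEncoded = encodeURIComponent(untrusted);

// URLSearchParams serializes in the form-encoding style, where a space becomes "+".
const formEncoded = new URLSearchParams({ q: untrusted }).toString();

console.log('https://example.org/search?q=' + percentEncoded);
// prints: https://example.org/search?q=a%20b%26c%3Dd%2F%C3%A9
console.log(formEncoded);
// prints: q=a+b%26c%3Dd%2F%C3%A9
```

Both results decode to the same value in a query string; which form to emit depends on what the receiving side expects.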

@@ -3497,7 +3503,7 @@ This is true! Yes, this is a weird and subtle point. There is reason to hope tha
 
 A Uniform Resource Locator (URL) is a way to refer to a specific web resource by location. Technically, a URL is a specific type of Uniform Resource Identifier (URI), but for our purposes we will use the terms interchangeably. As specified in [IETF RFC 3986](https://tools.ietf.org/html/rfc3986), a generic URI has this syntax:
 
-**scheme:[//authority]path[?query][#fragment]**
+**scheme:[//authority]path[?query]&#8202;[&#35;fragment]**
 
 And **authority** has this syntax:
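
As a rough illustration of the generic syntax in the hunk above, the sketch below splits a URL into the five named parts using the standard `URL` class available in Node.js and browsers. The helper name `splitUrl` and the `example.org` URL are hypothetical, and the mapping is approximate: `URL` follows the WHATWG URL rules rather than RFC 3986 exactly, and `host` omits any userinfo that the authority may contain.

```javascript
// Hypothetical helper: map the parsed URL onto the names used in the generic syntax.
function splitUrl(text) {
  const parsed = new URL(text);
  return {
    scheme: parsed.protocol.slice(0, -1), // drop the trailing ":"
    authority: parsed.host,               // host plus any non-default port; userinfo omitted
    path: parsed.pathname,
    query: parsed.search,                 // includes the leading "?" when present
    fragment: parsed.hash,                // includes the leading "#" when present
  };
}

console.log(splitUrl('https://example.org/about/?lang=en#team'));
```

For `https://www.linuxfoundation.org/about/`, the same helper reports an empty query and an empty fragment.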
