Many editorial fixes from Flavia Cioanca

david-a-wheeler · david-a-wheeler · commit 40127c1c55d8 · 2022-04-14T21:37:22.000-04:00
Thanks Flavia!

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
diff --git a/secure_software_development_fundamentals.md b/secure_software_development_fundamentals.md
@@ -1115,7 +1115,7 @@ Some consider the selection of reused software as part of design, since it clear
 
 If you are purchasing expensive software you selected on behalf of an organization, there are often many steps and processes to work through, primarily focused on controlling money. That is outside the scope of this course. Instead, we are going to focus on the specific aspects related to security.
 
-Many systems support installing extensions that are separately developed and maintained than the “core” program (often by different developers).  ***Extensions need to be separately evaluated before installing them***. The core system may be relatively secure, but that does not mean all its extensions are secure, and often the biggest risks are from the extensions. These extensions may be called many names including extensions, plug-ins, add-ons, themes, components, or packages. No matter what they’re called, evaluate them too. For example, PatchStack reported that while WordPress powered 43.2% of websites on the web in 2021, “vulnerabilities from plugins and themes remain as one of the biggest threats to websites built on WordPress.” They noted that only 0.58% of security vulnerabilities originate from WordPress core in 2021; the rest of the vulnerabilities were in components (plugins and themes). What’s worse, 29% of the WordPress plugins with critical vulnerabilities received no patch. This wouldn’t matter as much if few sites used components, but on average a WordPress website has 18 different components (plugins and themes) installed. See [*State Of WordPress Security In 2021*](https://patchstack.com/whitepaper/the-state-of-wordpress-security-in-2021/) by PatchStack for more information.
+Many systems support installing extensions that are separately developed and maintained than the “core” program (often by different developers). ***Extensions need to be separately evaluated before installing them***. The core system may be relatively secure, but that does not mean all its extensions are secure, and often the biggest risks are from the extensions. These extensions may be called many names including extensions, plug-ins, add-ons, themes, components, or packages. No matter what they’re called, evaluate them too. For example, PatchStack reported that while WordPress powered 43.2% of websites on the web in 2021, “vulnerabilities from plugins and themes remain as one of the biggest threats to websites built on WordPress.” They noted that only 0.58% of security vulnerabilities originate from WordPress core in 2021; the rest of the vulnerabilities were in components (plugins and themes). What’s worse, 29% of the WordPress plugins with critical vulnerabilities received no patch. This wouldn’t matter as much if few sites used components, but on average a WordPress website has 18 different components (plugins and themes) installed. See [*State Of WordPress Security In 2021*](https://patchstack.com/whitepaper/the-state-of-wordpress-security-in-2021/) by PatchStack for more information.
 
 We’ll use the term “reused software” here, because that is our concern. This reused software includes all the software you depend on when the software runs, aka its dependencies. In practice, the vast majority of the software you reuse will be open source software (OSS), so we will especially focus on tips when reusing OSS.
 
@@ -1810,22 +1810,22 @@ Applications often need to search for other resources, such as libraries, comman
 
 If an attacker can control the search path, the attacker can often cause the application to run malicious code, use attacker-controlled data, or reveal private data. For example, if an attacker can control the `PATH` environment variable, an attacker may be able to force the application to run unintended programs. The best solution is to ensure the attacker can’t control the search path, e.g., by not providing the opportunity or by setting the search path to a safe value before using it.
 
-A related problem is that the search path may contain a location that an attacker can control or influence in an unanticipated way. For example, if the `PATH` environment variable has an entry set to the current directory, that is, “.” or “” (blank), and the entry is listed before more trustworthy directories, then the current directory would be used first. This can become a vulnerability since the attacker may be able to eventually insert contents into a directory that the application will use as a current directory. On old Unix-like systems this insecure PATH setting was the default. One common way this mistake happens today is if the path is modified by concatenating the directory separator and then the new directory; if the path was empty to start with, doing this adds a blank entry as the first entry. For example, if PYTHONPATH was empty, naively concatenating “:” and directory /usr/share/foo would produce the PYTHONPATH “:/usr/share/foo”; notice that the empty directory is listed first, which would be interpreted as first searching the current directory. The solution in this case is to only insert the separator if there is already text there.
+A related problem is that the search path may contain a location that an attacker can control or influence in an unanticipated way. For example, if the `PATH` environment variable has an entry set to the current directory, that is, “.” or “” (blank), and the entry is listed before more trustworthy directories, then the current directory would be used first. This can become a vulnerability since the attacker may be able to eventually insert contents into a directory that the application will use as a current directory. On old Unix-like systems this insecure PATH setting was the default. One common way this mistake happens today is if the path is modified by concatenating the directory separator and then the new directory; if the path was empty to start with, doing this adds a blank entry as the first entry. For example, if `PYTHONPATH` was empty, naively concatenating “:” and directory `/usr/share/foo` would produce the `PYTHONPATH` “<tt>:/usr/share/foo</tt>”; notice that the empty directory is listed first, which would be interpreted as first searching the current directory. The solution in this case is to only insert the separator if there is already text there.
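
The fix described in this paragraph (insert the separator only when there is already text in the path) could be sketched as follows. This is a minimal illustration rather than code from the course, and the helper name `append_search_dir` is hypothetical.

```python
import os


def append_search_dir(current: str, new_dir: str, sep: str = os.pathsep) -> str:
    """Return `current` with `new_dir` appended, never creating an empty entry.

    `append_search_dir` is a hypothetical helper name used for illustration.
    """
    if not current:
        # Empty starting value: return just the directory, with no leading
        # separator that would be read as "search the current directory".
        return new_dir
    return current + sep + new_dir


# Naive concatenation on an empty value yields a blank first entry.
naive = "" + ":" + "/usr/share/foo"
# The guarded helper yields only the intended directory.
safe = append_search_dir("", "/usr/share/foo", sep=":")
```

Here `naive` holds the problematic value with a blank first entry, while `safe` holds only the intended directory. The sketch does not try to repair a value that already contains blank entries.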
 
 There are several potential countermeasures for search path problems, for example:
 
 * On startup, examine a search path you’ll use (like `PATH`) for common errors such as including a blank directory or “.” before more trusted paths (like `/usr/bin`). On Windows systems, check if safe DLL search mode is enabled. You might halt, or at least warn, if the current system settings are dangerous.
-* Use full path names when making a request (e.g., calling executable programs, importing libraries, or requesting packages).  Most systems that support a search path also support directly requesting the specific component; making a direct request will ensure that you are requesting the right ones. This is a plausible hardening mechanism, but it is easy to forget doing this in some cases, and this does sometimes make it harder to port software between systems.
+* Use full path names when making a request (e.g., calling executable programs, importing libraries, or requesting packages). Most systems that support a search path also support directly requesting the specific component; making a direct request will ensure that you are requesting the right ones. This is a plausible hardening mechanism, but it is easy to forget doing this in some cases, and this does sometimes make it harder to port software between systems.
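
The startup check in the first countermeasure above might look roughly like the sketch below. It assumes Unix-style conventions (entries separated by “:”, with a blank entry or “.” meaning the current directory), and it is stricter than the text requires: it reports such entries at any position, not only those listed before trusted directories. The function name `risky_path_entries` is hypothetical.

```python
from typing import List


def risky_path_entries(path_value: str, sep: str = ":") -> List[int]:
    """Return the positions of entries that refer to the current directory.

    Assumption: a blank entry or "." means "current directory", as on
    Unix-like systems. An entirely empty value is reported as one blank entry.
    """
    return [
        index
        for index, entry in enumerate(path_value.split(sep))
        if entry in ("", ".")
    ]
```

A program could call this on the value of `PATH` at startup and halt, or at least warn, when the returned list is not empty.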
 
 Many configuration values, including many search paths, are provided via environment variables. Some execution environments, like client-side JavaScript, don’t have environment variables. In most other environments (client and server), environment variables exist, but are typically considered trusted (that is, environment variables can only be set by someone with authorization to set them). However, there are some special cases where their trust should be limited.
 
 Some historical operating systems had insecure settings of environment variables. One of the most common cases is that old operating systems had an unsafe PATH environment variable so the current directory “.” was searched for executables before more trustworthy directories. Similarly, some naive users set their PATH variable to insecure values, though thankfully this kind of mistake is less common today. It’s also very environment-specific; in many environments an attacker won’t be able to control the contents of any of the locations.
 
 However, there’s another important special case: if you are writing something called a **setuid** or **setgid** program, then *environment variables can come from an attacker*. A little introduction is probably in order. Unix-like systems (including Linux and MacOS) allow programs to be **setuid** and/or **setgid**. When a **setuid** program runs, it has the privileges of its *owner* (not its requestor). A **setgid** program runs with the privileges of its *group* (not the groups of its requestor). These kinds of programs inherit many inputs from a potential attacker, including the current directory value and environment variables. One solution is to not write a **setuid** or **setgid** program, as in many cases that approach is not needed today.
 
-If you *do* write a **setuid**/**setgid** program, your program must protect itself from all its inputs, and that includes the current directory and environment variables. Environment variables can be especially tricky, as there are many unsafe approaches that appear safe. That’s because environment variables are typically from trusted sources, so most developers aren’t prepared to deal the unusual case where they environment variables are *not* from trusted sources. The only safe solution is to (as part of startup) extract *only* the environment variables that are needed, ensure their values are safe, erase *all* environment variables, and reset the variables needed to safe values (including safe values provided on program startup). Erasing all environment variables in most programming languages is easy, simply set the global variable **environ** to a null pointer or its equivalent ([the **environ** variable is defined in the POSIX standard](https://pubs.opengroup.org/onlinepubs/9699919799/functions/environ.html)). Do this *early*, before creating any threads. You cannot simply remove a few environment variables; an attacker may create a bizarre environment variable data structure, and there are simply too many potentially-dangerous environment variables (such as **LD_LIBRARY_PATH**) to try to erase only certain dangerous values. This is yet another example of allowlisting instead of denylisting; only allow the few environment variables you need, with their allowed values, and nothing else. Instead, ensure that the only possible variables are ones you expect and have safe values for. This includes **PATH** and all other environment variables.
+If you *do* write a **setuid**/**setgid** program, your program must protect itself from all its inputs, and that includes the current directory and environment variables. Environment variables can be especially tricky, as there are many unsafe approaches that appear safe. That’s because environment variables are typically from trusted sources, so most developers aren’t prepared to deal with the unusual case where the environment variables are *not* from trusted sources. The only safe solution is to (as part of startup) extract *only* the environment variables that are needed, ensure their values are safe, erase *all* environment variables, and reset the variables needed to safe values (including safe values provided on program startup). Erasing all environment variables in most programming languages is easy, simply set the global variable **environ** to a null pointer or its equivalent ([the **environ** variable is defined in the POSIX standard](https://pubs.opengroup.org/onlinepubs/9699919799/functions/environ.html)). Do this *early*, before creating any threads. You cannot simply remove a few environment variables; an attacker may create a bizarre environment variable data structure, and there are simply too many potentially-dangerous environment variables (such as **LD_LIBRARY_PATH**) to try to erase only certain dangerous values. This is yet another example of allowlisting instead of denylisting; only allow the few environment variables you need, with their allowed values, and nothing else. Instead, ensure that the only possible variables are ones you expect and have safe values for. This includes **PATH** and all other environment variables.
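
The allowlisting idea in this paragraph can be illustrated with a short sketch. It is written in Python for brevity and only builds a fresh environment mapping from an untrusted one; it does not show the C-level step of resetting **environ**. The policy tables `SAFE_DEFAULTS` and `ALLOWED_VALUES`, their contents, and the function name `build_clean_env` are assumptions made for illustration.

```python
# Hypothetical policy: the names and values below are illustrative assumptions.
SAFE_DEFAULTS = {"PATH": "/usr/bin:/bin"}        # always reset to these values
ALLOWED_VALUES = {"LANG": {"C", "en_US.UTF-8"}}  # kept only if the value is listed


def build_clean_env(untrusted_env):
    """Build a new environment from an allowlist; everything else is dropped."""
    # Start from known-safe values instead of from the inherited environment.
    clean = dict(SAFE_DEFAULTS)
    for name, allowed in ALLOWED_VALUES.items():
        value = untrusted_env.get(name)
        # Copy a variable only when its value is one we explicitly permit.
        if value in allowed:
            clean[name] = value
    return clean
```

Because the result starts from safe defaults and copies only permitted values, a variable such as **LD_LIBRARY_PATH** never needs to be named in order to be excluded. The resulting mapping could, for example, be passed as the `env` argument when starting a child process with Python's `subprocess.run`.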
 
-🔔 Untrusted search path is such a common cause of security vulnerabilities that it is 2019 CWE Top 25 #22. It is [CWE-426](https://cwe.mitre.org/data/definitions/426.html), *Untrusted Search Path*. 2021 CWE Top 25 #34 covers the related Uncontrolled Search Path Element ([CWE-427](https://cwe.mitre.org/data/definitions/427.html)).
+🔔 Untrusted search path is such a common cause of security vulnerabilities that it is 2019 CWE Top 25 #22. It is [CWE-426](https://cwe.mitre.org/data/definitions/426.html), *Untrusted Search Path*. 2021 CWE Top 25 #34 covers the related *Uncontrolled Search Path Element* ([CWE-427](https://cwe.mitre.org/data/definitions/427.html)).
 
 
 ### Quiz 1.9
@@ -1856,7 +1856,7 @@ Most larger systems need some mechanism to receive configuration information. Ma
 
 Some systems try to depend on *secure boot* or similar mechanisms to ensure that only specific software is run on a particular computer. Don’t take these mechanisms very seriously if the computer (such as a smartphone) may be physically controlled by a potential attacker. If an attacker has physical control over a device, then that attacker has ultimate control over the device. The reality is that secure boot systems have been repeatedly broken; trusting this to never happen in the future is ignoring the lessons of the past. You are better off designing your system so that you don’t need to trust the application on that device, but instead run software you need to trust on hardware controlled by someone you trust. Secure boot systems are far more powerful if the system is physically controlled by a trusted party, because then they are simply providing an additional protective measure for the one physically in control.
 
-🔔 Security misconfiguration is such a common mistake in web applications that it is 2017 OWASP Top 10 #6 and 2021 OWASP Top 10 #5. 2021 CWE Top 25 #19 [CWE-276](https://cwe.mitre.org/data/definitions/276.html) covers Incorrect Default Permissions.
+🔔 Security misconfiguration is such a common mistake in web applications that it is 2017 OWASP Top 10 #6 and 2021 OWASP Top 10 #5. 2021 CWE Top 25 #19 [CWE-276](https://cwe.mitre.org/data/definitions/276.html) covers *Incorrect Default Permissions*.
 
 ### Quiz 1.10
 
@@ -2010,7 +2010,7 @@ Type confusion isn’t limited to C and C++, however. Type confusion can happen
 
 For our purposes, conversions do not include determining if a value is truthy. In general, programming languages have conditional constructs (such as **if** and **while**) that will produce different results depending on whether or not a condition’s value is truthy. What is truthy is a key design decision when creating a programming language. For example, every value in JavaScript is considered truthy except for a specific list of falsy values (currently **false**, **0**, **-0**, **0n**, **“”**, **null**, **undefined**, and **NaN**). In such languages, **if p** and similar are a shorthand for checking if a value is truthy. This interpretation in conditionals might be considered a conversion from some other type into a boolean type, but such constructs are really just an abbreviated way to determine if a value is truthy, and that is not what we are concerned with here.
 
-🔔 *Incorrect Type Conversion or Cast* ([CWE-704](https://cwe.mitre.org/data/definitions/704.html)) is such a common cause of security vulnerabilities that it is 2019 CWE Top 25 #28. 2021 CWE Top 25 #36 refers to its special case Access of Resource Using Incompatible Type ('Type Confusion') ([CWE-843](https://cwe.mitre.org/data/definitions/843.html)).
+🔔 *Incorrect Type Conversion or Cast* ([CWE-704](https://cwe.mitre.org/data/definitions/704.html)) is such a common cause of security vulnerabilities that it is 2019 CWE Top 25 #28. 2021 CWE Top 25 #36 refers to its special case, *Access of Resource Using Incompatible Type ('Type Confusion')* ([CWE-843](https://cwe.mitre.org/data/definitions/843.html)).
 
 
 ### Quiz 2.3
@@ -2059,7 +2059,7 @@ One of the best-known attacker tricks is out-of-bounds reads and writes (includi
 
 One of the most common kinds of security vulnerabilities is where a read or write is *“out of bounds”* inside memory-unsafe code. Such vulnerabilities are common, and attackers find them easy to exploit. This problem has been well-known for a long time; Aleph One (Elias Levy) describes in detail in [*Smashing the Stack for Fun and Profit*](http://phrack.org/issues/49/14.html#article) (1996) how to exploit such vulnerabilities.
 
-🔔 Out-of-bounds reads and writes are so common and dangerous that in the 2021 CWE Top 25 list, the #1 weakness involves writes ([CWE-787](https://cwe.mitre.org/data/definitions/787.html) *Out-of-bounds Write*), the #3 weakness involves reads ([CWE-125](https://cwe.mitre.org/data/definitions/125.html) *Out-of-bounds Read*), and the general issue is #17 ([CWE-119](https://cwe.mitre.org/data/definitions/119.html) *Improper Restriction of Operations within the Bounds of a Memory Buffer*).  In the 2019 CWE Top 25 list the general issue is #1 ([CWE-119](https://cwe.mitre.org/data/definitions/119.html) *Improper Restriction of Operations within the Bounds of a Memory Buffer*), and specific cases of it are #5 ([CWE-125](https://cwe.mitre.org/data/definitions/125.html) *Out-of-bounds Read*) and #12 ([CWE-787](https://cwe.mitre.org/data/definitions/787.html) *Out-of-bounds Write*).
+🔔 Out-of-bounds reads and writes are so common and dangerous that in the 2021 CWE Top 25 list, the #1 weakness involves writes ([CWE-787](https://cwe.mitre.org/data/definitions/787.html) *Out-of-bounds Write*), the #3 weakness involves reads ([CWE-125](https://cwe.mitre.org/data/definitions/125.html) *Out-of-bounds Read*), and the general issue is #17 ([CWE-119](https://cwe.mitre.org/data/definitions/119.html) *Improper Restriction of Operations within the Bounds of a Memory Buffer*). In the 2019 CWE Top 25 list the general issue is #1 ([CWE-119](https://cwe.mitre.org/data/definitions/119.html) *Improper Restriction of Operations within the Bounds of a Memory Buffer*), and specific cases of it are #5 ([CWE-125](https://cwe.mitre.org/data/definitions/125.html) *Out-of-bounds Read*) and #12 ([CWE-787](https://cwe.mitre.org/data/definitions/787.html) *Out-of-bounds Write*).
 
 Here are the fundamentals. Almost all programs have to store intermediate results, and such storage areas are often called *buffers*. Reading and writing within that buffer is fine. But what happens when your program tries to read from or write to that buffer, but it tries to do that outside the range of that storage area? For example, here is a trivial fragment of a C program that allocates some array **x** of size 10 (index values 0 through 9), and later stores the value of **y** to the index value **i** of that array: