[PROPOSAL] Transient Execution Weaknesses #5
Fixed formatting issues that emerged after the docx->md translation
I copied Gananand's and Steve's comments into this GitHub PR.
        
          
working-docs/transient.md (outdated)

> New CWE Proposals
> --------------------------------
>
> ### CWE-A: Processor Event Causes Transient Execution
Gananand: Should CWE-A be moved to the bottom?
        
          
working-docs/transient.md (outdated)

> example, the attacker may be able to infer program data that was
> accessed or used by those operations.
>
> ### CWE-B: Transient Data Forwarding from an Operation that Triggers a Processor Event
Gananand: Based on the description, perhaps this should be “Incorrect forwarding of transient data”? Or “Incorrect forwarding of transient data that is observable after an architectural state commit”?
        
          
working-docs/transient.md (outdated)

> attacker to infer program data, such as the incorrect data forwarded by
> the operation that triggered the assist.
>
> ### CWE-C: Transient Execution Influenced by Shared Microarchitectural Predictor State
Gananand: This one seems very close to the issue already. Perhaps this could be “Improper Isolation of Hardware Domains through Shared Micro-architectural State and Transient Execution”. How would this differ from CWE-1189: Improper Isolation of Shared Resources on a System-On-Chip?
        
          
working-docs/transient.md (outdated)

> sharing across domain transitions, these features may be always-on, on
> by default, or may require opt-in from software.
>
> ### CWE-D: Microarchitectural Predictor Causes Transient Execution
Gananand: This sounds like “Incorrectly implemented microarchitectural predictors lead to incorrect transient execution after misprediction.” What is the weakness here? Is it that after misprediction the transient state is not cleaned up, or that the transient state is not shut down, released, etc.? How is this different from CWE-C? Is it related to the inferability/observability of the transient state? What would be the mitigation here? Steven M Christey suggested that perhaps this is “improper isolation of code/data of multiple users to separate hardware domains.”
        
          
Steven: The key issue here seems to be that the operations "affect observable microarchitectural state in a manner that could allow an attacker to infer program data" (said in the ext desc). If that's the key point, then it should be emphasized in the main desc
        
          
working-docs/transient.md (outdated)

> New CWE Proposals
> --------------------------------
>
> ### CWE-A: Processor Event Causes Transient Execution
Gananand: Based on the description, perhaps this is “Enabling of optimizations during sensitive operations that can lead to observable side effects from transient execution due to processor events”? Potentially multiple weaknesses here, still not clear what the weakness is, attack focused, focused on optimization of out-of-order processor. Seems to be observable discrepancies related. What is the perspective of the weakness? Hardware designer, software user or hardware implementor? What would be the mitigation to these weaknesses? Could it be to turn off these optimizations or maybe it is a parameter for an implementor or user to use?
        
          
Steven: I'm confused about why this is a concern. Isn't transient execution a normal, expected behavior? So, being able to cause it doesn't seem like an issue. Is the key issue about "allowing transient execution with microarchitectural side effects that can be observed by an adversary"?
        
          
working-docs/transient.md (outdated)

> “attacker” to align with the [CWE
> glossary](https://cwe.mitre.org/documents/glossary/).
>
> - “Transient,” “transient execution,” “transient operations,” etc.
Gananand: We can do this.
        
          
Steven: We can include a definition in the glossary but we might also want to explain it very briefly in a single sentence in the extended desc - something like the first sentence of the last paragraph of CWE-A.
        
          
working-docs/transient.md (outdated)

> commit to architectural state” many times. Perhaps MITRE should
> consider adding “transient” to its CWE glossary.
>
> - I make liberal use of the term “processor event,” which is
Gananand: Should we perhaps make a list of all processor events?
Steven: If "all" events is too large, it would probably be good to list a few common/popular ones.
Additional feedback from David:

Thanks Scott for adding all of our feedback and for the instructions!
…iented language. Specifically:

- CWE-B describes the condition where transient operations are allowed to access and operate on data in a shared microarchitectural structure.
- CWE-C describes the condition where a hardware exception causes incorrect/stale data to be forwarded to dependent transient operations.
- CWE-D is only a renaming of CWE-C in the previous proposal. CWE-D describes the condition of sharing microarchitectural predictor state.
- CWE-E is only a renaming of CWE-D in the previous proposal. CWE-E describes the condition of a microarchitectural predictor causing transient execution.
- CWE-A is a catch-all for transient execution, and would be a parent of CWE-[B-E]. Since CWE-B and CWE-C have been refined into specific conditions, I saw no way to avoid introducing a catch-all.
Here is a summary of the updated PR:

All: I have these suggestions for titles:

Scott:
Hi @BobH-MITRE,

Thank you for the feedback! I have made some tweaks to the prepositions in your proposed titles:

In all cases, I believe that the exposure happens during transient execution when microarchitectural state is altered in a manner that corresponds to sensitive data. The sensitive data may later be recovered (or inferred) via a covert channel analysis technique. I also added the critical word "Shared" to CWE-B. I changed "through" to "caused by" in CWE-C because the incorrect data forwarding may not directly expose sensitive information. It could also be the case that the incorrectly forwarded data is malicious data injected by the attacker, for example, to inject a pointer value that will be used to access sensitive information. I think that the "caused by" language generalizes the title to cover both of these scenarios.

I agree that CWE-200 seems like a good parent candidate! But I also admit that I am not nearly familiar enough with the CWE landscape to ascertain that CWE-200 would be the best choice.

I do not understand the critique about CWE-D and CWE-E. In CWE-D the weakness is shared microarchitectural predictor state, which, in accordance with the CWE definition, "could contribute to the introduction of vulnerabilities." In CWE-E the weakness is having a microarchitectural predictor that can cause transient execution, which can also contribute to the introduction of vulnerabilities.

You asked, "Are these two CWEs the ones where you are trying to handle the cases for cross-domain boundaries and same address space?" CWE-D and CWE-E distinguish predictor-based vulnerabilities that arise from predictor state shared across domain boundaries from those that arise from abuse of a predictor within a domain boundary.

CWE-B and CWE-C are intended to distinguish non-predictor-based vulnerabilities that expose data across a domain boundary from those that expose data within a domain boundary (though perhaps that exposed data can later be recovered by another domain). CWE-A can cover other idiosyncratic vulnerabilities such as Speculative Code Store Bypass (CVE-2021-0089) that share little in common with other transient execution vulnerabilities.
I can go along with the title tweaks for CWE-A through CWE-C. I'll leave it to the group to decide whether it is during, before, or after transient execution.

For CWE-D and CWE-E, I think we are on the right track with these after your explanation, but maybe we need to tweak the lens a bit. You wrote, "In CWE-E the weakness is having a microarchitectural predictor that can cause transient execution, which can also contribute to the introduction of vulnerabilities." Isn't that normal behavior? I thought the issue was that the attacker has the ability to influence the predictor so they can cause transient execution when convenient. If my understanding here is correct, then my follow-up question would be, "what is the mechanism that is in place that allows an attacker to influence predictor state?"
Having a microarchitectural predictor is a normal condition that leads to normal behavior. My understanding of the definition of "weakness" is that even a normal, acceptable condition can also contribute to the introduction of vulnerabilities. In the real world we have to live with these conditions while acknowledging and understanding their implications.

This comment certainly applies to CWE-D, where the weakness is that a hardware condition (shared predictor state) allows malicious software to influence the transient execution behavior of other software on the same system. With CWE-E the point is more subtle: if predictor state is not shared (or if there isn't any predictor state and a predictor is "static") then the predictor can contribute to vulnerabilities (and thus is a weakness), but software must also be a co-participant. This can happen in many different ways, but here are a couple of practical examples:
Can you try to incorporate some of the nuances here into the titles for CWE-D and CWE-E? As the titles stand now, they seem to describe normal behavior. I spent some time trying to come up with a suggestion, but I don't quite grasp the nuances of the space.
- We removed a CWE that applied exclusively to predictor-based transient execution not involving shared predictor state. We believe that CWE-A suffices to cover these cases.
- Some of that CWE's extended description has been updated and merged into CWE-A.
- There is a placeholder CWE-E that will cover "speculation oracle" weaknesses such as Pacman.


Please don't merge this PR right away! We can use the PR itself to collect feedback and address issues, without creating lots of commits and other traffic on the main repo.