Description
Remote attestation (RA) fails on both pruntime v1 and v2, and each version reports the failure differently.
On v2 you get the following error, which makes the cause hard to understand:
error 18; this may indicate that infrastructure for the epid attestation requested by gramine is missing on this machine
On v1 you get a more telling error: SGX_RA_TIMEOUT
Two conditions trigger this error:
- Networking issues, mostly DNS. For example, the Docker container uses the wrong DNS servers, such as 127.0.0.1, which will not work.
- The reply from Intel is delayed, mostly due to routing issues between ISPs and Microsoft Azure.
This issue is about the second condition. A tcpdump shows that the reply arrives more than 8 seconds after the request (between 8.2 and 11.7 seconds in my case). That is slow, but it should not be a problem in itself. It is also not Intel rate limiting, since Intel only signals that through HTTP status codes (see: https://www.intel.in/content/www/in/en/support/articles/000090552/software/intel-security-products.html).
The underlying code for this is:
`fn get_report_from_intel(quote: &[u8], ias_key: &str) -> Result<(String, String, String)> {
    let encoded_quote = base64::encode(quote);
    let encoded_json = format!("{{\"isvEnclaveQuote\":\"{encoded_quote}\"}}\r\n");
    let mut res_body_buffer = Vec::new(); //container for body of a response
    let timeout = Some(Duration::from_secs(8));
    let url: reqwest::Url = format!("https://{ias_host}{ias_report_endpoint}").parse()?;
    info!(from=%url, "Getting RA report");`
As we can see, a failed request is not caught, there is no retry, and the 8-second timeout is hardcoded. I request that the timeout and the number of retries be made configurable via environment variables, and that the request be wrapped in a retry loop with error handling.
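
To make the request concrete, here is a minimal sketch of what an environment-configurable retry wrapper could look like. It is not the project's code: the variable names `IAS_TIMEOUT_SECS` and `IAS_MAX_ATTEMPTS`, the `RetryConfig` type, the `with_retries` helper, the default of 3 attempts, and the fixed 500 ms pause between attempts are all assumptions chosen for illustration. Only the 8-second default comes from the snippet above.

```rust
use std::env;
use std::thread;
use std::time::Duration;

/// Retry settings for the IAS report request (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct RetryConfig {
    pub timeout: Duration,
    pub max_attempts: u32,
    pub backoff: Duration,
}

/// Parse an optional string as a positive integer.
/// Missing, malformed, or zero values fall back to `default`.
fn parse_or(value: Option<String>, default: u64) -> u64 {
    value
        .and_then(|v| v.trim().parse::<u64>().ok())
        .filter(|n| *n > 0)
        .unwrap_or(default)
}

impl RetryConfig {
    /// Read the settings from the environment.
    /// The variable names are assumptions, not existing pruntime options.
    pub fn from_env() -> Self {
        let timeout_secs = parse_or(env::var("IAS_TIMEOUT_SECS").ok(), 8);
        let attempts = parse_or(env::var("IAS_MAX_ATTEMPTS").ok(), 3);
        RetryConfig {
            timeout: Duration::from_secs(timeout_secs),
            max_attempts: attempts.min(u32::MAX as u64) as u32,
            backoff: Duration::from_millis(500),
        }
    }
}

/// Call `op` until it succeeds or `max_attempts` is reached.
/// `op` receives the 1-based attempt number and the per-request timeout.
/// The error of the last attempt is returned if every attempt fails.
pub fn with_retries<T, E, F>(cfg: &RetryConfig, mut op: F) -> Result<T, E>
where
    F: FnMut(u32, Duration) -> Result<T, E>,
{
    let mut attempt = 1;
    loop {
        match op(attempt, cfg.timeout) {
            Ok(value) => return Ok(value),
            Err(e) if attempt >= cfg.max_attempts => return Err(e),
            Err(_) => {
                // Pause briefly before the next attempt.
                thread::sleep(cfg.backoff);
                attempt += 1;
            }
        }
    }
}
```

In `get_report_from_intel`, the closure passed to `with_retries` would send the HTTP request with the supplied timeout instead of the hardcoded `Duration::from_secs(8)`. A fixed pause keeps the sketch short; exponential backoff would be a reasonable alternative.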