Skip to content

Investigate symbolic propagation and metavariable-comparison #11

@0xdea

Description

@0xdea

Symbolic propagation is an experimental feature that used together with metavariable-comparison might help reduce some false positives. See an example related to out-of-bound reads at https://cc-sw.com/semgrep-guide-for-a-security-engineer-part-6-of-6/. Quoting from the article:

From my testing, I was able to add this constraint through the use of metavariable-comparison, where I used a Python regex expression (not PCRE2 used in the previous sections). Here, it’s ascertaining that the value of $PTR and $OFFSET are not known to be either a base-10 number or a hex value.

rules:
- id: memcpy_src_offset_is_not_checked
  languages:
  - c
  - cpp
  message: |
    The source of memcpy has a variable offset which may result in an OOB read
  severity: WARNING
  mode: search
  options:
    symbolic_propagation: true
  patterns:
  - pattern: $MEMCPY($DST, $SRC, $LEN, ...) #Note: memmove and memccpy has fourth argument
  - pattern-not: $MEMCPY($DST, $SRC, sizeof($SRC), ...)
  - metavariable-regex:
      metavariable: $MEMCPY
      regex: '(?i)^\w*mem(c)?cpy\w*\s*$|^bcopy$|^memmove$'
  - metavariable-pattern:
      metavariable: $SRC
      patterns:
      - pattern-either:
        - pattern: <... $PTR + $OFFSET ...>
        - pattern: <... $PTR[$OFFSET] ...>
      - metavariable-comparison:
          comparison: not re.match('(-)?(0x)?\d+(u)?(LL)?', str($PTR))
      - metavariable-comparison:
          comparison: not re.match('(-)?(0x)?\d+(u)?(LL)?', str($OFFSET))

This pattern takes advantage of the waterfall behavior of metavariable-comparison where the invocation of a metavariable (like $PTR or $OFFSET) binds to a literal or a constant (if available) and otherwise the code is kept unevaluated. We acquire the either the string representation or symbolic expression of the metavariable via str($MVAR). Then, to verify that it’s not an integer, we use the Python’s re.match() option. Note that we don’t write the r'' regex string symbol before the string.

Note that without the last two metavariable-comparison clauses, our query would have flagged both usages of memcpy in:

// https://cc-sw.com/semgrep-guide-for-a-security-engineer-part-6-of-6/
int main() {
  char src[32]; 
  read(STDIN_FILENO, src, sizeof(src));
  char* dst = malloc(64);
  memcpy(dst, src + 3, 29); //Safe
  printf("%s\n", dst);
 
  int offset;
  if(scanf("%d", &offset) != 1) 
      return -1;
  memcpy(dst, src + offset, 24); //Unsafe
  printf("%s\n", dst);
}

// https://cc-sw.com/semgrep-guide-for-a-security-engineer-part-6-of-6/
undefined8 main(void)
{
  int iVar1;
  undefined8 uVar2;
  long in_FS_OFFSET;
  int local_44;
  char *local_40;
  undefined1 local_38 [40];
  long local_10;
   
  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  read(0,local_38,0x20);
  local_40 = (char *)malloc(0x40);
  memcpy(local_40,local_38 + 3,0x1d);
  puts(local_40);
  iVar1 = __isoc99_scanf(&DAT_00102004,&local_44);
  if (iVar1 == 1) {
    memcpy(local_40,local_38 + local_44,0x18);
    puts(local_40);
    uVar2 = 0;
  }
  else {
    uVar2 = 0xffffffff;
  }
  // ...
}

Other sample vuln code:

// https://cc-sw.com/semgrep-guide-for-a-security-engineer-part-6-of-6/
static int service_attr_req(sdp_req_t *req, sdp_buf_t *buf) {
  sdp_cont_state_t *cstate = NULL;
  // ...
  if (cstate) {
  sdp_buf_t *pCache = sdp_get_cached_rsp(cstate);
 
  SDPDBG("Obtained cached rsp : %p", pCache);
 
  if (pCache) {
    short sent = MIN(max_rsp_size, pCache->data_size - cstate->cStateValue.maxBytesSent);
    pResponse = pCache->data;
    memcpy(buf->data, pResponse + cstate->cStateValue.maxBytesSent, sent);  //Out-of-bound read here
    buf->data_size += sent;
    cstate->cStateValue.maxBytesSent += sent;
    // ...
  }
  // ...
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions