Skip to content

Review potential backtracking issuesΒ #22

@cscorley

Description

@cscorley

From my email

Hi Christopher,

My name is James Davis. I'm a security researcher at Virginia Tech.

The pypi module whatthepatch has 2 regular expressions vulnerable to catastrophic backtracking.
The vulnerable expressions are listed below.
If you don't know, catastrophic backtracking is when the regex engine takes more than linear time to scan a string.
There are lots of resources about it on the web. I have included some starting points below.

Catastrophic backtracking is particularly problematic if two conditions are met:

  1. The module is used by server processes, and
  2. The regex can be reached by user input.

Please evaluate whether these conditions apply to your users and your regexes.
However, copy/pasting regexes is quite common. Even if one or both conditions fail now,
these regexes may be a time bomb in your code for later -- you might still want to fix it.
For example, if your module is used on the client side, consider whether a server-side counterpart might re-use this pattern later.

Fix advice:

If you want to make fixes, here are the most common approaches with examples:

  1. Refactor the regex pattern
  2. Replace the regex with custom parsing
  3. Apply input sanitization, suitable if legitimate input strings are (much) shorter than the attack string

You can study other examples using the links from Snyk.io's vulnerability database: https://snyk.io/vuln/?packageManager=all

Validating your fix:

  1. I am happy to test a refactored regex with my tools -- send me an email with the commit ID.
  2. Existing GitHub projects for regex testing are:

If this is a new topic for you, here are some good starting points for information:

======================================

Now, here are the vulnerable patterns with attack details.

Each vulnerability has the following keys, explained in more detail below:

  • pattern
  • filesIn (as of December/January; I excluded any appearances in irrelevant-looking dirs, and in '.min' files)
  • stringLenFor10Sec
  • nPumpsFor10Sec
  • attackFormat
  • blowupCurve

The attack format describes how to generate an attack string.
On my machine, an attack string generated using nPumpsFor10Sec repetitions ("pumps") of the pump string(s)
blocks the python regex engine for 10 seconds, though this will vary based on your hardware.

Compose an attack string like this:
'prefix 1' + 'pump 1' X times + 'prefix 2' + 'pump 2' X times + ... + suffix
Example:
With pumpPairs: [{'prefix': 'a', 'pump': 'b'}], suffix: 'c', an attack string with three pumps would be:
abbbc

Catastrophic backtracking blows up at either an exponential rate or a super-linear (power law) rate.
The blowupCurve indicates how severe the blow-up is.
The 'type' is either EXP(onential) or POW(er law) in the number of pumps (x).
The 'parms' are the parameters for the two curve types. The second parameter is more important, because:
EXP: f(x) = parms[0] * parms[1]^x
POW: f(x) = parms[0] * x^parms[1]

I strongly recommend you address any EXP(onential) blow-ups.
POW(er law) blow-ups vary in their severity depending on the value of the second parameter,
and in the length of the attack string ('stringLenFor10Sec').
But they are still a legitimate attack vector.

JSON formatted:

Vuln 1:

{
   "nPumpsFor10Sec" : "70755",
   "filesIn" : [
      [
         "whatthepatch/patch.py"
      ]
   ],
   "pattern" : ".*(\\(.*\\))$",
   "stringLenFor10Sec" : 70757,
   "attackFormat" : {
      "pumpPairs" : [
         {
            "pump" : "(",
            "prefix" : "a"
         }
      ],
      "suffix" : "("
   },
   "blowupCurve" : {
      "type" : "POWER",
      "r2" : 0.999691087941083,
      "parms" : [
         3.56638848402774e-09,
         1.94916438553238
      ]
   }
}




Vuln 2:

{
   "attackFormat" : {
      "suffix" : "*0",
      "pumpPairs" : [
         {
            "prefix" : "*** 0",
            "pump" : "0"
         }
      ]
   },
   "blowupCurve" : {
      "parms" : [
         9.13185781990315e-08,
         1.82012197261697
      ],
      "r2" : 0.994885929334018,
      "type" : "POWER"
   },
   "nPumpsFor10Sec" : "27276",
   "filesIn" : [
      [
         "whatthepatch/patch.py"
      ]
   ],
   "stringLenFor10Sec" : 27283,
   "pattern" : "^\\*\\*\\* (\\d+),?(\\d*) \\*\\*\\*\\*$"
}

Best,

James

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions