-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Open
Labels
2.xRelated to ModSecurity version 2.xRelated to ModSecurity version 2.x
Description
In utf8_unicode_inplace_ex(), we have the following code:
c = *utf;
/* If first byte begins with binary 0 it is single byte encoding */
if ((c & 0x80) == 0) {
/* single byte unicode (7 bit ASCII equivilent) has no validation */
count++;
if (count <= len) {
if (c == 0) *data = x2c(&c);
else *data++ = c;
}
}
The code if (c == 0) *data = x2c(&c); is wrong.
utf the input string.
If input is "x", we copy the character => correct.
If input is "\x00\xA0", we call the x2c function which expects an hexadecimal string as input (like "20" for a space).
Furthermore, the output buffer (data) is not increased, so the character is overwritten on the next loop.
The correct code is
if (c == 0) {
sprintf(data, "%%u%04x", utf[1]);
count += 4;
data += 6;
i++;
*changed = 1;
}
Remarks:
- Using "%04x" replaces a lot of useless code used in other parts of the code
- I didn't add a check
if (count <= len)aftercount += 4;because it cannot overflow if we fix the code:len = input_len * 6 + 1;instead oflen = input_len * 4 + 1;(4 bytes -> %u1234). The check (and the variable count) could thus be removed everywhere - After filling data, we have several checks (/* invalid UTF-8 character number range (RFC 3629) /, / check for overlong */, ...) with this code
count++; *data++ = c;What's the point? This copies the character after the ones we already copied. This looks wrong to me. Any reason to keep this code?
Metadata
Metadata
Assignees
Labels
2.xRelated to ModSecurity version 2.xRelated to ModSecurity version 2.x