|
| 1 | ++++ |
| 2 | +title = "Underhanded" |
| 3 | +date = 2025-07-23 |
| 4 | +authors = ["Swaminath Shiju"] |
| 5 | ++++ |
| 6 | + |
| 7 | +# Underhanded GCTF25 |
| 8 | +### Description |
| 9 | + |
| 10 | +Proudly sharing our Python implementation of AES. By the way, we sneakily hid a backdoor. Can you see sharp and see what went wrong? |
| 11 | + |
| 12 | +```python |
| 13 | +def challenge(): |
| 14 | +    k = os.urandom(16) |
| 15 | +    aes = AES(k) |
| 16 | + |
| 17 | +    # I will encrypt FIVE messages for you, that's it. |
| 18 | +    for _ in range(5): |
| 19 | +        m = bytes.fromhex(input('📄 ')) |
| 20 | +        c = aes.encrypt(m) |
| 21 | +        print('🤫', c.hex()) |
| 22 | + |
| 23 | +    _k = bytes.fromhex(input('🔑 ')) |
| 24 | +    if k != _k: raise Exception('incorrect guess!') |
| 25 | +``` |
| 26 | + |
| 27 | +### Solution |
| 28 | + |
| 29 | +The gist of the challenge is to recover the key of the custom AES from 5 chosen-plaintext encryptions, and to do so 16 times. So we need to hunt for backdoors in the AES implementation. |
| 30 | + |
| 31 | +Looking into the `encrypt` function: |
| 32 | + |
| 33 | +```python |
| 34 | +def encrypt(self, m: bytes) -> bytes: |
| 35 | +    c = bytearray(m) |
| 36 | +    c = self.add_round_key(c, 0) |
| 37 | +    for r in range(1, self.n_rounds): |
| 38 | +        c = self.sub_bytes(c) |
| 39 | +        c = self.shift_rows(c) |
| 40 | +        c = self.mix_columns(c) |
| 41 | +        c = self.add_round_key(c, r) |
| 42 | +    c = self.sub_bytes(c) |
| 43 | +    c = self.shift_rows(c) |
| 44 | +    c = self.add_round_key(c, self.n_rounds) |
| 45 | +    return bytes(c) |
| 46 | +``` |
| 47 | + |
| 48 | +This seems pretty normal, but when we look into `shift_rows`, the first statement looks like it has a "typo". |
| 49 | + |
| 50 | +```python |
| 51 | +def shift_rows(self, m: bytearray) -> bytearray: |
| 52 | +    m[+0], m[+4], m[+8], m[12] = m[+0], m[+4], m[-8], m[12] |
| 53 | +    m[+1], m[+5], m[+9], m[13] = m[+5], m[+9], m[13], m[+1] |
| 54 | +    m[+2], m[+6], m[10], m[14] = m[10], m[14], m[+2], m[+6] |
| 55 | +    m[+3], m[+7], m[11], m[15] = m[15], m[+3], m[+7], m[11] |
| 56 | +    return m |
| 57 | +``` |
| 58 | + |
| 59 | +That `m[-8]` is supposed to be `m[+8]`. Another backdoor is simply how plaintexts longer than one block (i.e. length of PT > 16) are handled: the whole message is treated as a single state. For instance, `shift_rows` clearly uses only the first 16 bytes (other than `m[-8]`, of course). |
| 60 | + |
| 61 | +The same is true for `mix_columns` as well. |
| 62 | + |
| 63 | +```python |
| 64 | +def mix_columns(self, m: bytearray) -> bytearray: |
| 65 | +    for i in range(0, 16, 4): |
| 66 | +        t = m[i+0] ^ m[i+1] ^ m[i+2] ^ m[i+3] |
| 67 | +        u = m[i+0] |
| 68 | +        m[i+0] ^= t ^ xtime(m[i+0] ^ m[i+1]) |
| 69 | +        m[i+1] ^= t ^ xtime(m[i+1] ^ m[i+2]) |
| 70 | +        m[i+2] ^= t ^ xtime(m[i+2] ^ m[i+3]) |
| 71 | +        m[i+3] ^= t ^ xtime(m[i+3] ^ u) |
| 72 | +    return m |
| 73 | +``` |
| 74 | + |
| 75 | +So if we send multiple blocks, every block other than the first one is only affected by `add_round_key` and `sub_bytes`. |
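
Both observations are easy to check in isolation. The sketch below copies `shift_rows` from the challenge (with `self` dropped so it runs standalone) and feeds it a 16-byte and a 40-byte buffer; the buffers and their lengths are arbitrary illustration values, not anything from the challenge.

```python
def shift_rows(m: bytearray) -> bytearray:
    # Copied from the challenge (with `self` dropped), typo included.
    m[+0], m[+4], m[+8], m[12] = m[+0], m[+4], m[-8], m[12]
    m[+1], m[+5], m[+9], m[13] = m[+5], m[+9], m[13], m[+1]
    m[+2], m[+6], m[10], m[14] = m[10], m[14], m[+2], m[+6]
    m[+3], m[+7], m[11], m[15] = m[15], m[+3], m[+7], m[11]
    return m


# Byte i of each test buffer has value i, so the output shows where bytes came from.
one_block = shift_rows(bytearray(range(16)))
long_msg = shift_rows(bytearray(range(40)))

print(one_block[8])   # 16-byte input: m[-8] is m[8], so the typo is invisible
print(long_msg[8])    # 40-byte input: m[-8] is m[32], a byte from a later block
print(long_msg[16:] == bytearray(range(16, 40)))  # bytes past the first 16 never move
```

With exactly one block the function behaves like a normal ShiftRows, which is presumably why the backdoor passes a casual test.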
| 76 | + |
| 77 | +Now we can bring both of them together to create an exploit. Denote the resulting ciphertext bytes by $c_0, c_1,\cdots$, the bytes entering the last `shift_rows` by $s_0,s_1,\cdots$ and the round keys by $k_0,k_1,\cdots$. Let $n$ be the total plaintext length, so `m[-8]` is the byte at index $n-8$. For a message long enough that index $n-8$ lies outside the first block, the last `shift_rows` copies $s_{n-8}$ into position 8, and the last `add_round_key` then gives, with |
| 78 | +`r = (n-8)%16`: |
| 79 | + |
| 80 | +$$ |
| 81 | +\begin{aligned} |
| 82 | +c_8&=s_{n-8}\oplus k_{10}[8] \\ |
| 83 | +c_{n-8}&=s_{n-8}\oplus k_{10}[r] \\ \ \\ |
| 84 | +c_{n-8}\oplus c_8 &=k_{10}[8]\oplus k_{10}[r] |
| 85 | +\end{aligned} |
| 86 | +$$ |
| 87 | +Each of the 5 messages can use a different length $n$, so this gives us 5 relations between the bytes of $k_{10}$. If we guess a value for $k_{10}[8]$, that gives us 6 bytes of $k_{10}$. |
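
The relation can be sanity-checked by simulating only the last two steps of `encrypt`. This is a minimal sketch: `shift_rows` is copied from the challenge, while `add_round_key` is an assumption (the round key is XORed cyclically over the whole buffer, which is what `r = (n-8)%16` implies), and `leak` is a hypothetical helper name. The random `s` stands in for the state entering the last `shift_rows`.

```python
import os


def shift_rows(m: bytearray) -> bytearray:
    # Copied from the challenge (with `self` dropped), typo included.
    m[+0], m[+4], m[+8], m[12] = m[+0], m[+4], m[-8], m[12]
    m[+1], m[+5], m[+9], m[13] = m[+5], m[+9], m[13], m[+1]
    m[+2], m[+6], m[10], m[14] = m[10], m[14], m[+2], m[+6]
    m[+3], m[+7], m[11], m[15] = m[15], m[+3], m[+7], m[11]
    return m


def add_round_key(m: bytearray, rk: bytes) -> bytearray:
    # Assumption: the 16-byte round key repeats over the whole buffer.
    return bytearray(b ^ rk[i % 16] for i, b in enumerate(m))


def leak(c: bytes) -> tuple[int, int]:
    """Return (r, k10[8] ^ k10[r]) recovered from one ciphertext."""
    n = len(c)
    return (n - 8) % 16, c[8] ^ c[n - 8]


k10 = os.urandom(16)
leaks = {}
for n in (24, 28, 29, 33, 37):          # lengths chosen so r = 0, 4, 5, 9, 13
    s = bytearray(os.urandom(n))        # stand-in for the state before the last shift_rows
    c = bytes(add_round_key(shift_rows(s), k10))
    r, delta = leak(c)
    leaks[r] = delta

print(all(delta == k10[8] ^ k10[r] for r, delta in leaks.items()))
```

The five lengths above are one possible choice that yields the byte positions `0, 4, 5, 9, 13` used in the rest of the writeup.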
| 88 | + |
| 89 | +Now we can try to reverse the key schedule to get more bytes of the other round keys. There are multiple possible choices for these 6 bytes; I chose bytes `0, 4, 5, 8, 9, 13`. Here are the relevant parts of the key-scheduling algorithm: |
| 90 | + |
| 91 | +$$ |
| 92 | +\begin{aligned} |
| 93 | +k_{n}[0]&=k_{n-1}[0]\oplus \sigma(k_{n-1}[13])\oplus \text{RCON}[n-1][0]\\ \ \\ |
| 94 | +k_{n}[4]&=k_{n-1}[4]\oplus k_{n}[0]\\ \ \\ |
| 95 | +k_{n}[5]&=k_{n-1}[5]\oplus k_{n}[1]\\ \ \\ |
| 96 | +k_{n}[8]&=k_{n-1}[8]\oplus k_{n}[4]\\ \ \\ |
| 97 | +k_{n}[9]&=k_{n-1}[9]\oplus k_{n}[5]\\ \ \\ |
| 98 | +k_{n}[13]&=k_{n-1}[13]\oplus k_{n}[9] |
| 99 | +\end{aligned} |
| 100 | +$$ |
| 101 | + |
| 102 | +Here $\text{RCON}$ is an array of constants and $\sigma$ is the S-box. So if we have $k_{10}[0]$, $k_{10}[4]$, $k_{10}[5]$, $k_{10}[8]$, $k_{10}[9]$, $k_{10}[13]$, we can derive $k_9[0]$, $k_9[4]$, $k_9[8]$, $k_9[9]$, $k_9[13]$. We lose bytes at every step ($k_9[5]$ would need $k_{10}[1]$, for example), but as we keep going backward we get: |
| 103 | + |
| 104 | +| | 0 | 4 | 5 | 8 | 9 | 13 | |
| 105 | +| --- |:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:------------:| |
| 106 | +| 10 | $k_{10}[0]$ | $k_{10}[4]$ | $k_{10}[5]$ | $k_{10}[8]$ | $k_{10}[9]$ | $k_{10}[13]$ | |
| 107 | +| 9 | $k_{9}[0]$ | $k_{9}[4]$ | | $k_{9}[8]$ | $k_{9}[9]$ | $k_{9}[13]$ | |
| 108 | +| 8 | $k_{8}[0]$ | $k_{8}[4]$ | | $k_{8}[8]$ | | $k_{8}[13]$ | |
| 109 | +| 7 | | $k_{7}[4]$ | | $k_{7}[8]$ | | | |
| 110 | +| 6 | | | | $k_{6}[8]$ | | | |
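
The table can be reproduced mechanically. The following is a sketch under the assumption that the challenge uses the textbook AES-128 key schedule (the equations above match it); `build_sbox`, `expand_key` and `walk_back` are illustrative names, and `expand_key` is there only to check the result against a random key.

```python
import os


def xtime(a: int) -> int:
    a <<= 1
    return (a ^ 0x11B) & 0xFF if a & 0x100 else a


def build_sbox() -> list[int]:
    # Inverses in GF(2^8) via exp/log tables (generator 3), then the affine map.
    exp, log, x = [0] * 256, [0] * 256, 1
    for i in range(255):
        exp[i], log[x] = x, i
        x ^= xtime(x)                       # multiply by 3
    sbox = []
    for a in range(256):
        inv = 0 if a == 0 else exp[(255 - log[a]) % 255]
        s = inv ^ 0x63
        for sh in range(1, 5):
            s ^= ((inv << sh) | (inv >> (8 - sh))) & 0xFF
        sbox.append(s)
    return sbox


SBOX = build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def expand_key(key: bytes) -> list[list[int]]:
    # Textbook AES-128 key schedule, used here only as a reference.
    rks = [list(key)]
    for n in range(1, 11):
        prev, cur = rks[-1], [0] * 16
        rot = [SBOX[prev[13]], SBOX[prev[14]], SBOX[prev[15]], SBOX[prev[12]]]
        rot[0] ^= RCON[n - 1]
        for i in range(4):
            cur[i] = prev[i] ^ rot[i]
        for i in range(4, 16):
            cur[i] = prev[i] ^ cur[i - 4]
        rks.append(cur)
    return rks


def walk_back(k10: dict[int, int]) -> dict[int, dict[int, int]]:
    """Derive every earlier round-key byte that follows from the known k10 bytes."""
    known = {10: dict(k10)}
    for n in range(10, 0, -1):
        cur, prev = known[n], {}
        for i in (4, 5, 8, 9, 13):          # k_{n-1}[i] = k_n[i] ^ k_n[i-4]
            if i in cur and i - 4 in cur:
                prev[i] = cur[i] ^ cur[i - 4]
        if 0 in cur and 13 in prev:         # k_{n-1}[0] needs sigma(k_{n-1}[13])
            prev[0] = cur[0] ^ SBOX[prev[13]] ^ RCON[n - 1]
        known[n - 1] = prev
    return known


rks = expand_key(os.urandom(16))
known = walk_back({i: rks[10][i] for i in (0, 4, 5, 8, 9, 13)})
for n in range(10, 5, -1):
    print(n, sorted(known[n]))              # which byte positions are known per round
```

The printed positions per round should match the rows of the table, ending with only byte 8 of round 6.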
| 111 | + |
| 112 | +Now we look at a ciphertext byte in a later block that is XORed with $k_j[8]$ in every round; denote it by $C$ and the corresponding plaintext byte by $P$. |
| 113 | +Then |
| 114 | +$$ |
| 115 | +C=k_{10}[8]\oplus\sigma(k_9[8]\oplus \sigma(k_8[8]\cdots\sigma(k_0[8]\oplus P))) |
| 116 | +$$ |
| 117 | + |
| 118 | +We can move the known bytes ($k_{10}[8]$ through $k_6[8]$) to the left-hand side, since $\sigma$ and $\oplus$ are invertible: |
| 119 | + |
| 120 | +$$ |
| 121 | +C'=k_{5}[8]\oplus\sigma(k_4[8]\oplus \sigma(k_3[8]\cdots\sigma(k_0[8]\oplus P))) |
| 122 | +$$ |
| 123 | +The naive brute force would now need $2^8 \times 2^{8\times 6}=2^{56}$ guesses (1 byte for $k_{10}$ and then 6 bytes from the earlier round keys). |
| 124 | + |
| 125 | +However, we can be cleverer here: since both operations are invertible, we can rewrite it as |
| 126 | + |
| 127 | +$$ |
| 128 | +\sigma^{-1}(\sigma^{-1}(\sigma^{-1}(C'\oplus k_5[8])\oplus k_4[8])\oplus k_3[8]) = k_2[8]\oplus \sigma(k_1[8]\oplus\sigma(k_0[8]\oplus P)) |
| 129 | +$$ |
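| 130 | +Now we can brute-force each side separately with a meet-in-the-middle attack: each side has only 3 unknown bytes, so there are about $2^{24}$ candidates per side instead of $2^{48}$ combined, which is far more feasible. We can shorten the time some more by leveraging more information from the CT. |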
| 131 | + |
| 132 | +If we look at another byte XORed with $k_j[8]$ (plaintext byte $P'$, reduced ciphertext byte $C''$), we get a similar equation |
| 133 | + |
| 134 | +$$ |
| 135 | +\sigma^{-1}(\sigma^{-1}(\sigma^{-1}(C''\oplus k_5[8])\oplus k_4[8])\oplus k_3[8]) = k_2[8]\oplus \sigma(k_1[8]\oplus\sigma(k_0[8]\oplus P')) |
| 136 | +$$ |
| 137 | +XORing both, we get |
| 138 | + |
| 139 | +$$ |
| 140 | +\begin{aligned} |
| 141 | +\sigma^{-1}(\sigma^{-1}(\sigma^{-1}(C'\oplus k_5[8])\oplus k_4[8])\oplus k_3[8]) \oplus \sigma^{-1}(\sigma^{-1}(\sigma^{-1}(C''\oplus k_5[8])\oplus k_4[8])\oplus k_3[8]) = \\ \sigma(k_1[8]\oplus\sigma(k_0[8]\oplus P')) \oplus \sigma(k_1[8]\oplus\sigma(k_0[8]\oplus P)) |
| 142 | +\end{aligned} |
| 143 | +$$ |
| 144 | + |
| 145 | +This eliminates a variable ($k_2[8]$) with minimal impact on the check time, and almost halves the number of iterations required. |
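
To make the idea concrete, here is a scaled-down sketch of the same trick, not the solver used for the challenge. It assumes a chain with only 4 unknown key bytes instead of 6, and it uses a seeded random permutation `SIGMA` as a stand-in for the S-box; `chain` and `mitm` are illustrative names. The structure is the same: peel the outer keys off the ciphertext side, XOR against a reference pair to cancel the middle key, and match on a table built from the plaintext side.

```python
import random

rng = random.Random(2025)
SIGMA = list(range(256))
rng.shuffle(SIGMA)                      # stand-in permutation, NOT the AES S-box
INV = [0] * 256
for x, y in enumerate(SIGMA):
    INV[y] = x


def chain(keys, p):
    # Toy version of the byte chain: C = k3 ^ s(k2 ^ s(k1 ^ s(k0 ^ P)))
    k0, k1, k2, k3 = keys
    return k3 ^ SIGMA[k2 ^ SIGMA[k1 ^ SIGMA[k0 ^ p]]]


def mitm(pts, cts):
    # Plaintext side: differences s(k0 ^ P_0) ^ s(k0 ^ P_i), which do not depend on k1.
    table = {}
    for k0 in range(256):
        base = SIGMA[k0 ^ pts[0]]
        sig = tuple(base ^ SIGMA[k0 ^ p] for p in pts[1:])
        table.setdefault(sig, []).append(k0)

    found = []
    for k3 in range(256):
        for k2 in range(256):
            # Ciphertext side: peel k3 and k2; each value equals k1 ^ s(k0 ^ P_i).
            left = [INV[INV[c ^ k3] ^ k2] for c in cts]
            sig = tuple(left[0] ^ v for v in left[1:])
            for k0 in table.get(sig, ()):
                k1 = left[0] ^ SIGMA[k0 ^ pts[0]]   # the cancelled key falls out for free
                found.append((k0, k1, k2, k3))
    return found


secret = tuple(rng.randrange(256) for _ in range(4))
pts = [0, 1, 2, 3, 4]
cts = [chain(secret, p) for p in pts]
candidates = mitm(pts, cts)
print(secret in candidates, len(candidates))
```

Using several plaintext bytes at once acts as a filter, so few wrong candidates survive; the cancelled key byte is recovered from a single pair after a match.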
| 146 | + |
| 147 | +Now, with every $k_j[8]$ and $k_{10}[4]$ known, we can derive $k_j[4]$ and then $k_j[0]$, $k_j[13]$, $k_j[9]$ and $k_j[5]$. After that we only need to guess a single byte, 10 times over, for the remaining bytes; these guesses are cheap and give us all the remaining key bytes. |
| 148 | + |
| 149 | + |
| 150 | +> Note: You need a sufficiently beefy computer to do this 16 times in 300s |
| 151 | +
|
| 152 | + |