Commit 4c5691c

CONTRIBUTING: add more details about GET_*() macros. [skip ci]
Expand the discussion of why bounds checks are a Good Thing. List all the macros, including the "fetch an address and return a string representation of the address" macros, and indicate what they do. Note the additional advantages that they provide, namely that they 1) can fetch unaligned values correctly and without a fault and 2) fetch a value in the specified byte order and return it in host byte order.
1 parent 4acf1b5 commit 4c5691c

1 file changed

+136
-8
lines changed

CONTRIBUTING.md

Lines changed: 136 additions & 8 deletions
@@ -164,23 +164,35 @@ and ask!
164164

165165
* The printer may receive an incomplete packet in the buffer, truncated at any
166166
random position, for example by capturing with the `-s size` option.
167+
This means that an attempt to fetch packet data based on the expected
168+
format of the packet may run the risk of overrunning the buffer.
169+
170+
This is because the printer may receive an incomplete packet in the
171+
buffer, truncated at any random position, for example by capturing
172+
with the `-s size` option, so any attempt to fetch packet data based on
173+
the expected format of the packet may run the risk of overrunning the
174+
buffer.
175+
176+
Furthermore, if the packet is complete, but is not correctly formed,
177+
that can also cause a printer to overrun the buffer, as it will be
178+
fetching packet data based on the expected format of the packet.
179+
180+
Therefore, integral, IPv4 address, and octet sequence values should
181+
be fetched using the `GET_*()` macros, which are defined in
182+
`extract.h`.
183+
167184
If your code reads and decodes every byte of the protocol packet, then to
168185
ensure proper and complete bounds checks it would be sufficient to read all
169-
packet data using the `GET_*()` macros, typically:
170-
```
171-
GET_U_1(p)
172-
GET_S_1(p)
173-
GET_BE_U_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
174-
GET_BE_S_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
175-
```
186+
packet data using the `GET_*()` macros.
187+
176188
If your code uses the macros above only on some packet data, then the gaps
177189
would have to be bounds-checked using the `ND_TCHECK_*()` macros:
178190
```
179191
ND_TCHECK_n(p), n in { 1, 2, 3, 4, 5, 6, 7, 8, 16 }
180192
ND_TCHECK_SIZE(p)
181193
ND_TCHECK_LEN(p, l)
182194
```
183-
For the `ND_TCHECK_*` macros (if not already done):
195+
For the `GET_*()` and `ND_TCHECK_*()` macros (if not already done):
184196
* Assign: `ndo->ndo_protocol = "protocol";`
185197
* Define: `ND_LONGJMP_FROM_TCHECK` before including `netdissect.h`
186198
* Make sure that the intersection of `GET_*()` and `ND_TCHECK_*()` is minimal,
@@ -193,6 +205,122 @@ and ask!
193205
```
194206
You should try several values for snaplen to do various truncation.
195207

208+
* The `GET_*()` macros that fetch integral values are:
209+
```
210+
GET_U_1(p)
211+
GET_S_1(p)
212+
GET_BE_U_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
213+
GET_BE_S_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
214+
GET_LE_U_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
215+
GET_LE_S_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
216+
```
217+
218+
where *p* points to the integral value in the packet buffer. The
219+
macro returns the integral value at that location.
220+
221+
`U` indicates that an unsigned value is fetched; `S` indicates that a
222+
signed value is fetched. For multi-byte values, `BE` indicates that
223+
a big-endian value ("network byte order") is fetched, and `LE`
224+
indicates that a little-endian value is fetched.
225+
226+
In addition to the bounds checking the `GET_*()` macros perform,
227+
using those macros has other advantages:
228+
229+
* tcpdump runs on both big-endian and little-endian systems, so
230+
fetches of multi-byte integral values must be done in a fashion
231+
that works regardless of the byte order of the machine running
232+
tcpdump. The `GET_BE_*()` macros will fetch a big-endian value and
233+
return a host-byte-order value on both big-endian and little-endian
234+
machines, and the `GET_LE_*()` macros will fetch a little-endian
235+
value and return a host-byte-order value on both big-endian and
236+
little-endian machines.
237+
238+
* tcpdump runs on machines that do not support unaligned access to
239+
multi-byte values, and packet values are not guaranteed to be
240+
aligned on the proper boundary. The `GET_BE_*()` and `GET_LE_*()`
241+
macros will fetch values even if they are not aligned on the proper
242+
boundary.
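
The three properties described above (bounds checking, byte-order independence, and tolerance of unaligned data) can be illustrated with a minimal standalone sketch. The `sketch_*` helpers below are hypothetical and are not the real `extract.h` macros: they take the captured length explicitly and report truncation through a return value, which is an assumption made only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; they are NOT the
 * GET_*() macros from extract.h.
 */

/* Return 1 if len bytes starting at offset lie inside the captured data. */
static int sketch_in_bounds(size_t caplen, size_t offset, size_t len)
{
    return offset <= caplen && len <= caplen - offset;
}

/* Fetch a big-endian unsigned 16-bit value; return 0 if it is truncated. */
static int sketch_get_be_u_2(const uint8_t *buf, size_t caplen,
                             size_t offset, uint16_t *out)
{
    assert(buf != NULL && out != NULL);
    if (!sketch_in_bounds(caplen, offset, 2))
        return 0;
    /* Byte-at-a-time access: no alignment requirement, and the result
     * is in host byte order on any machine. */
    *out = (uint16_t)(((unsigned)buf[offset] << 8) | buf[offset + 1]);
    return 1;
}

/* Fetch a little-endian unsigned 32-bit value; return 0 if it is truncated. */
static int sketch_get_le_u_4(const uint8_t *buf, size_t caplen,
                             size_t offset, uint32_t *out)
{
    assert(buf != NULL && out != NULL);
    if (!sketch_in_bounds(caplen, offset, 4))
        return 0;
    *out = (uint32_t)buf[offset] |
           ((uint32_t)buf[offset + 1] << 8) |
           ((uint32_t)buf[offset + 2] << 16) |
           ((uint32_t)buf[offset + 3] << 24);
    return 1;
}

/* Fetch a big-endian signed 16-bit value; return 0 if it is truncated. */
static int sketch_get_be_s_2(const uint8_t *buf, size_t caplen,
                             size_t offset, int16_t *out)
{
    uint16_t u;

    assert(out != NULL);
    if (!sketch_get_be_u_2(buf, caplen, offset, &u))
        return 0;
    /* Sign-extend explicitly instead of relying on a narrowing cast. */
    *out = (u & 0x8000) ? (int16_t)((int)u - 65536) : (int16_t)u;
    return 1;
}
```

With a 9-byte buffer, calling `sketch_get_be_u_2()` at offset 8 fails rather than reading past the end, and the same holds when a shorter captured length is passed for a longer packet.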
243+
244+
* The `GET_*()` macros that fetch IPv4 address values are:
245+
```
246+
GET_IPV4_TO_HOST_ORDER(p)
247+
GET_IPV4_TO_NETWORK_ORDER(p)
248+
```
249+
250+
where *p* points to the address in the packet buffer.
251+
`GET_IPV4_TO_HOST_ORDER()` returns the address in the byte order of
252+
the host that is running tcpdump; `GET_IPV4_TO_NETWORK_ORDER()`
253+
returns it in network byte order.
254+
255+
Like the integral `GET_*()` macros, these macros work correctly on
256+
both big-endian and little-endian machines and will fetch values even
257+
if they are not aligned on the proper boundary.
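
The difference between the two return conventions can be sketched as follows. This is a rough illustration under stated assumptions, not the real macros: the `sketch_*` names, the explicit `avail` parameter, and the return-value error reporting are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helpers for illustration only; they are NOT the
 * GET_IPV4_*() macros from extract.h. avail is the number of captured
 * bytes available starting at p.
 */

/* Produce the address as a number in host byte order. */
static int sketch_ipv4_to_host_order(const uint8_t *p, size_t avail,
                                     uint32_t *out)
{
    assert(p != NULL && out != NULL);
    if (avail < 4)
        return 0;               /* address is truncated */
    *out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return 1;
}

/* Produce the address with its bytes in packet (network) order. */
static int sketch_ipv4_to_network_order(const uint8_t *p, size_t avail,
                                        uint32_t *out)
{
    assert(p != NULL && out != NULL);
    if (avail < 4)
        return 0;               /* address is truncated */
    /* A byte-wise copy works for unaligned p and keeps the in-memory
     * byte sequence identical to the one in the packet. */
    memcpy(out, p, 4);
    return 1;
}
```

For the address 192.0.2.1, the host-order result is the number `0xC0000201` on every machine, while the network-order result is a 32-bit object whose bytes in memory are 192, 0, 2, 1.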
258+
259+
* The `GET_*()` macro that fetches an arbitrary sequence of bytes is:
260+
```
261+
GET_CPY_BYTES(dst, p, len)
262+
```
263+
264+
where *dst* is the destination to which the sequence of bytes should
265+
be copied, *p* points to the first byte of the sequence of bytes, and
266+
*len* is the number of bytes to be copied. The bytes are copied in
267+
the order in which they appear in the packet.
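
The idea behind a bounds-checked copy can be sketched as below. `sketch_cpy_bytes()` is a hypothetical stand-in, not `GET_CPY_BYTES()` itself; the explicit `caplen` and `offset` parameters and the return-value error reporting are assumptions made for a self-contained example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only; it is NOT the
 * GET_CPY_BYTES() macro from extract.h.
 *
 * Copy len bytes starting at buf + offset into dst, but only if the
 * whole range lies inside the caplen captured bytes. Return 1 on
 * success and 0 (leaving dst untouched) if the range is truncated.
 */
static int sketch_cpy_bytes(void *dst, const uint8_t *buf, size_t caplen,
                            size_t offset, size_t len)
{
    assert(dst != NULL && buf != NULL);
    /* Written this way so that offset + len cannot overflow. */
    if (offset > caplen || len > caplen - offset)
        return 0;
    memcpy(dst, buf + offset, len);   /* bytes keep their packet order */
    return 1;
}
```

The check happens before any byte is copied, so a truncated field never produces a partially filled destination.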
268+
269+
* To fetch a network address and convert it to a printable string, use
270+
the following `GET_*()` macros, defined in `addrtoname.h`, to
271+
perform bounds checks to make sure the entire address is within the
272+
buffer and to translate the address to a string to print:
273+
```
274+
GET_IPADDR_STRING(p)
275+
GET_IP6ADDR_STRING(p)
276+
GET_MAC48_STRING(p)
277+
GET_EUI64_STRING(p)
278+
GET_EUI64LE_STRING(p)
279+
GET_LINKADDR_STRING(p, type, len)
280+
GET_ISONSAP_STRING(nsap, nsap_length)
281+
```
282+
283+
`GET_IPADDR_STRING()` fetches an IPv4 address pointed to by *p* and
284+
returns a string that is either a host name, if the `-n` flag wasn't
285+
specified and a host name could be found for the address, or the
286+
standard XXX.XXX.XXX.XXX-style representation of the address.
287+
288+
`GET_IP6ADDR_STRING()` fetches an IPv6 address pointed to by *p* and
289+
returns a string that is either a host name, if the `-n` flag wasn't
290+
specified and a host name could be found for the address, or the
291+
standard XXXX::XXXX-style representation of the address.
292+
293+
`GET_MAC48_STRING()` fetches a 48-bit MAC address (Ethernet, 802.11,
294+
etc.) pointed to by *p* and returns a string that is either a host
295+
name, if the `-n` flag wasn't specified and a host name could be
296+
found in the ethers file for the address, or the standard
297+
XX:XX:XX:XX:XX:XX-style representation of the address.
298+
299+
`GET_EUI64_STRING()` fetches a 64-bit EUI pointed to by *p* and
300+
returns a string that is the standard XX:XX:XX:XX:XX:XX:XX:XX-style
301+
representation of the address.
302+
303+
`GET_EUI64LE_STRING()` fetches a 64-bit EUI, in reverse byte order,
304+
pointed to by *p* and returns a string that is the standard
305+
XX:XX:XX:XX:XX:XX:XX:XX-style representation of the address.
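
As a rough illustration of the two EUI-64 variants, the sketch below formats eight bytes either in packet order or in reverse order. It is not the implementation in `addrtoname.h`: the `sketch_eui64_string()` name, the caller-supplied output buffer, the lowercase hex digits, and the reading of "reverse byte order" as "last byte printed first" are all assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only; it is NOT
 * GET_EUI64_STRING() or GET_EUI64LE_STRING().
 *
 * Format the 8 bytes at p as XX:XX:XX:XX:XX:XX:XX:XX into out.
 * avail is the number of captured bytes available at p; reversed
 * selects last-byte-first output. Return 1 on success, 0 if the
 * address is truncated or out is too small.
 */
static int sketch_eui64_string(const uint8_t *p, size_t avail, int reversed,
                               char *out, size_t outlen)
{
    static const char hex[] = "0123456789abcdef";
    uint8_t bytes[8];
    char *q = out;
    size_t i;

    assert(p != NULL && out != NULL);
    /* 16 hex digits + 7 colons + terminating NUL = 24 bytes. */
    if (avail < 8 || outlen < 24)
        return 0;
    memcpy(bytes, p, 8);          /* bounds already checked above */
    for (i = 0; i < 8; i++) {
        uint8_t b = reversed ? bytes[7 - i] : bytes[i];

        if (i > 0)
            *q++ = ':';
        *q++ = hex[b >> 4];
        *q++ = hex[b & 0x0f];
    }
    *q = '\0';
    return 1;
}
```

Both variants perform the same bounds check; only the order in which the bytes are printed differs.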
306+
307+
`GET_LINKADDR_STRING()` fetches an octet string, of length *len*
308+
and type *type*, pointed to by *p* and returns a string whose format
309+
depends on the value of *type*:
310+
311+
* `LINKADDR_MAC48` - if the length is 6, the string has the same
312+
value as `GET_MAC48_STRING()` would return for that address;
313+
otherwise, the string is a sequence of XX:XX:... values for the bytes
314+
of the address;
315+
316+
* `LINKADDR_FRELAY` - the string is "DLCI XXX", where XXX is the
317+
DLCI, if the address is a valid Q.922 header, and an error indication
318+
otherwise;
319+
320+
* `LINKADDR_EUI64`, `LINKADDR_ATM`, `LINKADDR_OTHER` -
321+
the string is a sequence of XX:XX:... values for the bytes
322+
of the address.
323+
196324
* Do invalid packet checks in code: Assume that your code can receive as input
197325
not only a valid packet but any arbitrary random sequence of octets (packet
198326
* built malformed originally by the sender or by a fuzz tester,
