Although cbrrr parses and serialises non-recursively (so it will never exhaust the native call stack), it still needs to allocate memory to do its work: either a result buffer when serialising, or Python objects when deserialising. If you run out of memory, you're liable to get killed by the OOM killer, or your whole system might hang - not good! (If malloc ever returns NULL, cbrrr should bail out properly, but I haven't tested this yet.)
Asking cbrrr to serialise a circular reference is the most concise way to trigger this condition:

```python
import cbrrr

recur = []
recur.append(recur)
cbrrr.encode_dag_cbor(recur)
```

So, don't do that! You must ensure that your data does not contain circular references before passing it to `encode_dag_cbor`. In many cases this is trivial (e.g. a JSON source can never contain circular references).
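If you can't rule out cycles from the data source itself, one option is a pre-check before encoding. The helper below is a sketch (it is not part of cbrrr): an iterative DFS that tracks the ids of containers on the current path, so it never touches the native call stack either. Shared-but-acyclic references (the same object appearing twice as siblings) pass the check, since those just get encoded twice.

```python
def has_cycle(obj):
    # Hypothetical pre-check, not part of cbrrr's API.
    # `stack` holds (value, is_exit) pairs; `path` holds the ids of
    # containers currently being visited on the DFS path.
    stack = [(obj, False)]
    path = set()
    while stack:
        value, is_exit = stack.pop()
        if is_exit:
            # Done with this container: remove it from the active path.
            path.discard(id(value))
            continue
        if isinstance(value, (list, dict)):
            if id(value) in path:
                return True  # container reachable from itself
            path.add(id(value))
            stack.append((value, True))  # schedule the path cleanup
            children = value.values() if isinstance(value, dict) else value
            stack.extend((child, False) for child in children)
    return False
```

This is O(n) in the number of objects, so for large inputs you may prefer to pay that cost only when the data's provenance is untrusted.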
For non-circular inputs, the easiest way to prevent OOM conditions is to limit your input buffer size: if you're converting JSON to DAG-CBOR, limit the size of the JSON input; if you're parsing a DAG-CBOR object, limit the size of the CBOR input. This is a rather coarse-grained control, and it's something I'd like to improve upon in future releases of cbrrr - hence this issue. Bear in mind that the Python objects representing a parsed CBOR buffer will likely take up more memory than the original buffer did. (TODO: figure out worst-case "expansion factor")
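That coarse-grained guard is a one-line length check in application code. A minimal sketch (the limit value and the `decode_limited` wrapper are hypothetical; `decode` stands in for whatever decoder you use, e.g. cbrrr's):

```python
# Arbitrary application-chosen cap - tune it for your workload.
MAX_CBOR_BYTES = 1 << 20  # 1 MiB

def decode_limited(buf, decode):
    """Reject oversized inputs before the decoder allocates anything."""
    if len(buf) > MAX_CBOR_BYTES:
        raise ValueError(f"input too large: {len(buf)} bytes")
    return decode(buf)
```

Note this bounds only the *input*; because parsed objects can expand, the effective memory ceiling is the limit times the (currently unknown) worst-case expansion factor.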
I may introduce some combination of:
- nesting depth limits (even though cbrrr will never exhaust the native call stack!)
- output buffer size limits (when serialising)
- maximum object creation count (when parsing)
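To illustrate the first of these, here is a sketch of the quantity a nesting depth limit would bound, measured iteratively (mirroring cbrrr's non-recursive traversal). The helper is hypothetical, not a preview of cbrrr's API:

```python
def max_depth(obj):
    # Depth of the deepest container nesting, computed without recursion.
    # A bare scalar has depth 0; a flat list or dict has depth 1.
    if not isinstance(obj, (list, dict)):
        return 0
    depth = 0
    stack = [(obj, 1)]  # (container, its nesting level)
    while stack:
        value, d = stack.pop()
        depth = max(depth, d)
        children = value.values() if isinstance(value, dict) else value
        for child in children:
            if isinstance(child, (list, dict)):
                stack.append((child, d + 1))
    return depth
```

An enforced limit would presumably just raise once the traversal's level counter exceeds a configured maximum, rather than measuring the whole structure first.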