[IMP] jinja_to_qweb: store validation data in the DB
upg-2994884
Avoid `MemoryError` and/or the process being killed because `malloc()` fails
within `lxml` / `libxml2`.
Measuring the size of the involved data structures with `sys.getsizeof()`
showed that the excessive memory consumption has 3 different sources:
1. During conversion: The global variable `templates_to_check` grows to
hundreds of MiB after the various calls to `upgrade_jinja_fields()` by
upgrade scripts.
2. During conversion: The call to `cr.dictfetchall()` to gather all
templates (fields) that are to be converted already consumes hundreds of
MiB.
3. At the start of function `verify_upgraded_jinja_fields()`, the process is
at ~1.5GiB because of (1) and (2). While iterating over all the templates in
`templates_to_check`, no significant amount of memory is allocated on top of
this *by Python data structures*. However, with each call to
`is_converted_template_valid()`, the size of the process increases until it
hits the RLIMIT. This function calls into `lxml` multiple times, suggesting
that the memory is allocated by `malloc()` calls in the `lxml` and/or
`libxml2` C libraries, evading Python's memory accounting and garbage
collection.
Online reports indicate that `lxml` has a long history of memory leaks in
its C code, plus a caching mechanism *across documents* that could be
responsible[^1]. More recent versions of the module seem to have improved,
but we are still stuck with old versions.
This patch takes care of (1): instead of keeping the converted data in memory
for the validation step, it creates a "temporary" table and inserts the data
there. After validation, the table is dropped.
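The general idea can be sketched as follows. This is not the code from this patch: it uses the standard-library `sqlite3` module instead of PostgreSQL so that it runs standalone, and the table name, columns, and function names are all hypothetical.

```python
import sqlite3

# Hypothetical table name; the actual patch may use a different one.
TABLE = "_jinja_to_qweb_check"


def store_for_validation(conn, rows):
    """Persist converted values in a table instead of a global Python list."""
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {TABLE} ("
        " model TEXT, res_id INTEGER, field TEXT, original TEXT, converted TEXT)"
    )
    conn.executemany(f"INSERT INTO {TABLE} VALUES (?, ?, ?, ?, ?)", rows)


def validate_all(conn, is_valid, batch_size=100):
    """Read rows back in batches, validate them, then drop the table."""
    failures = []
    cur = conn.execute(
        f"SELECT model, res_id, field, original, converted FROM {TABLE}"
    )
    while True:
        batch = cur.fetchmany(batch_size)  # only one batch is in memory at a time
        if not batch:
            break
        for model, res_id, field, original, converted in batch:
            if not is_valid(original, converted):
                failures.append((model, res_id, field))
    cur.close()
    conn.execute(f"DROP TABLE {TABLE}")  # the table only lives until validation ends
    return failures
```

With this shape, memory held by Python during validation is bounded by `batch_size` rows rather than by the total number of converted templates.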
[^1]: https://benbernardblog.com/tracking-down-a-freaky-python-memory-leak-part-2/
Part-of: #288
Signed-off-by: Christophe Simonis (chs) <[email protected]>