-
-
Notifications
You must be signed in to change notification settings - Fork 18.9k
BUG: fix memory leak in JSON datetime serialization #62217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks @Alvaro-Kothe, I can confirm this resolves the memory leak that I am seeing. Also, your solution makes sense. 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you might also need to add a GET_TC(tc)->freeCStr = 1;
in Object_getBigNumStringValue
Thanks for pointing that out. Just fixed it. |
@@ -107,6 +107,7 @@ typedef struct __TypeContext { | |||
JSINT64 longValue; | |||
|
|||
const char *cStr; | |||
int freeCStr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally the problem with the JSON code is that over time we have incrementally added many different state management operations, and it is not always clear how they should work in tandem. So I am a little hesitant to add more state to solve the current issue. Is it possible to solve the issue without changing this struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestions on how to do that? Revert the changes in fb6c4e3 that make some of the uses of cStr
stack allocated? It doesn't seem possible to make all uses of cStr
stack allocated, at least at first glance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure that might be a reasonable approach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the freeCStr
flag and stored all the strings assigned in *IterNext
in dynamic memory.
Revert "fix: flag to free `cStr` in `Object_getBigNumStringValue`" Revert "fix: memory leak in JSON datetime serialization" fix: fix memory leak in `PyDateTimeToIsoCallback` fix: ensure successful malloc.
@@ -1880,6 +1905,7 @@ static void Object_endTypeContext(JSOBJ Py_UNUSED(obj), JSONTypeContext *tc) { | |||
GET_TC(tc)->rowLabels = NULL; | |||
NpyArr_freeLabels(GET_TC(tc)->columnLabels, GET_TC(tc)->columnLabelsLen); | |||
GET_TC(tc)->columnLabels = NULL; | |||
PyObject_Free((void *)GET_TC(tc)->cStr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not too sure about this approach - generally the CPython API only wants you to call this function when you've allocated with the corresponding function(s)
https://docs.python.org/3/c-api/memory.html#c.PyObject_Free
Frees the memory block pointed to by p, which must have been returned by a previous call to PyObject_Malloc(), PyObject_Realloc() or PyObject_Calloc(). Otherwise, or if PyObject_Free(p) has been called before, undefined behavior occurs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry I misread your PR. I see you allocate 20 bytes to cStr
, so this is clean in that regards. However, I think you want to use the PyMem_
functions since we are dealing with raw C constructs and not Python objects, so maybe replace this with PyMem_Free
, any PyObject_Malloc
calls with PyMem_Malloc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm missing something, PyObject_Free()
is being called on memory that's been allocated with PyObject_Malloc()
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any object types where it is possible to go through the destructor without initializing the cStr
member? Looking at say the dict iteration code it seems like that may be happening?
Its a bit confusing here that the type context begin functions have an implicit requirement to allocate the cStr
member. I wonder if it wouldn't be cleaner to allocate that up front, and leave it to individual functions to free or replace memory as required
@@ -1007,16 +1008,24 @@ static const char *List_iterGetName(JSOBJ Py_UNUSED(obj), | |||
//============================================================================= | |||
static void Index_iterBegin(JSOBJ Py_UNUSED(obj), JSONTypeContext *tc) { | |||
GET_TC(tc)->index = 0; | |||
GET_TC(tc)->cStr = PyObject_Malloc(20); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its probably better to create a static const variable for a size within the module, rather than repeating the hard-coded 20 multiple times
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created the variable CSTR_SIZE
if (index == 0) { | ||
GET_TC(tc)->cStr = "name"; | ||
strcpy((char *)GET_TC(tc)->cStr, "name"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if you want to use strcpy here you will need to drop the const qualifier on the struct member; casting away const-ness like this is going to trigger undefined behavior
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the const
qualifier.
@@ -1880,6 +1905,7 @@ static void Object_endTypeContext(JSOBJ Py_UNUSED(obj), JSONTypeContext *tc) { | |||
GET_TC(tc)->rowLabels = NULL; | |||
NpyArr_freeLabels(GET_TC(tc)->columnLabels, GET_TC(tc)->columnLabelsLen); | |||
GET_TC(tc)->columnLabels = NULL; | |||
PyObject_Free((void *)GET_TC(tc)->cStr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why cast to a void pointer here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was because of the const qualifier in cStr
, but since I removed this qualifier, I am no longer casting.
@@ -49,7 +49,7 @@ char *int64ToIso(int64_t value, NPY_DATETIMEUNIT valueUnit, | |||
pandas_datetime_to_datetimestruct(value, valueUnit, &dts); | |||
|
|||
*len = (size_t)get_datetime_iso_8601_strlen(0, base); | |||
char *result = PyObject_Malloc(*len); | |||
char *result = PyMem_Malloc(*len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the focus of the PR is to fix the JSON modules, let's leave the others untouched for now. These may be viable fixes, but they can always be done in a follow up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah nevermind - I guess you are doing this because the JSON module makes use of this function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exacly, I made this change because of the assignment of cStr
in NpyDateTimeToIsoCallback
.
I also made changes in other functions to allocate with PyMem_Malloc
in int64ToIsoDuration
and PyDateTimeToIso
, but I think that the CI errors are because of this changes. Is it correct to use PyMem_Malloc
in them or should I change back to PyObject_Malloc
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange - I don't think PyObject_Malloc is correct. Upstream, it looks like they have made a similar change (see python/cpython#114569)
However, if its a blocker for now sure lets stick with the PyObject_ functions; can figure out the rest later
@Alvaro-Kothe the address and undefined behavior violations can be pretty tricky to debug. If you aren't aware, you can turn on sanitizers locally to better understand what is happening (assuming your compiler supports them):
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool nice work
@@ -1880,6 +1907,7 @@ static void Object_endTypeContext(JSOBJ Py_UNUSED(obj), JSONTypeContext *tc) { | |||
GET_TC(tc)->rowLabels = NULL; | |||
NpyArr_freeLabels(GET_TC(tc)->columnLabels, GET_TC(tc)->columnLabelsLen); | |||
GET_TC(tc)->columnLabels = NULL; | |||
PyObject_Free(GET_TC(tc)->cStr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you double check the paths that actually hit this? Right now it seems like we either implicitly have some kind of relation with the types allocate storage for the struct member and this function getting called, or we have undefined behavior when always calling free here, if the type doesn't allocate this in the first place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked these paths where cStr
is assigned to a dynamic memory allocated with PyObject_Malloc
:
NpyDateTimeToIsoCallback -> int64ToIso
NpyTimeDeltaToIsoCallback -> int64ToIsoDuration
PyDateTimeToIsoCallback -> PyDateTimeToIso
Index_iterBegin
Series_iterBegin
DataFrame_iterBegin
Object_getBigNumStringValue
There is no *_Free
call for cStr
outside of the Object_endTypeContext
function to cause an undefined behaviour.
Other cases seems like cStr
isn't assigned and end up being NULL
, making the PyObject_Free
a no-op.
Great thanks @Alvaro-Kothe ! Definitely open to any other improvements you can find here |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.