Skip to content

Conversation

rruuaanng
Copy link
Contributor

@rruuaanng rruuaanng commented Oct 5, 2024

@bedevere-app
Copy link

bedevere-app bot commented Oct 5, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@rruuaanng rruuaanng changed the title gh-92953: Correct description error and extra character in struct pool_header gh-92953: Correct comments error and extra character in struct pool_header Oct 5, 2024
@rruuaanng
Copy link
Contributor Author

@skirpichev

@picnixz
Copy link
Member

picnixz commented Oct 5, 2024

The "" just means that we repeat what is written above so it's not really an extra character. Now, I'm not sure that the issue is indeed an issue. Why? because there is a comment that says:

Pools are carved off an arena's highwater mark (an arena_object's pool_address
member) as needed.  Once carved off, a pool is in one of three states forever
after:

used == partially used, neither empty nor full
    At least one block in the pool is currently allocated, and at least one
    block in the pool is not currently allocated (note this implies a pool
    has room for at least two blocks).
    This is a pool's initial state, as a pool is created only when malloc
    needs space.
    The pool holds blocks of a fixed size, and is in the circular list headed
    at usedpools[i] (see above).  It's linked to the other used pools of the
    same size class via the pool_header's nextpool and prevpool members.
    If all but one block is currently allocated, a malloc can cause a
    transition to the full state.  If all but one block is not currently
    allocated, a free can cause a transition to the empty state.

full == all the pool's blocks are currently allocated
    On transition to full, a pool is unlinked from its usedpools[] list.
    It's not linked to from anything then anymore, and its nextpool and
    prevpool members are meaningless until it transitions back to used.
    A free of a block in a full pool puts the pool back in the used state.
    Then it's linked in at the front of the appropriate usedpools[] list, so
    that the next allocation for its size class will reuse the freed block.

empty == all the pool's blocks are currently available for allocation
    On transition to empty, a pool is unlinked from its usedpools[] list,
    and linked to the front of its arena_object's singly-linked freepools list,
    via its nextpool member.  The prevpool member has no meaning in this case.
    Empty pools have no inherent size class:  the next time a malloc finds
    an empty list in usedpools[], it takes the first pool off of freepools.
    If the size class needed happens to be the same as the size class the pool
    last had, some pool initialization can be skipped.

In the 'used' state, it says

It's linked to the other used pools of the same size class via the pool_header's nextpool and prevpool members.

When we are in 'full' state, it says that

its nextpool and prevpool members are meaningless until it transitions back to used

so the comment is still correct. As for the 'empty' state,

The prevpool member has no meaning in this case

and

Empty pools have no inherent size class

So technically, only the change on the nextpool has some meaning. However, I don't think this is worth the change since:

  • the 'used' state is the default state
  • the 'full' state is either temporary, in which case the prev/next pool members have no meaning or transitions back to 'used'.
  • the 'empty' state has no notion of a size class, so we don't really care. And it can just be seen as an "available pool waiting to be used".

I think it's more important to know that we are within the same size class in the comment though than to just have a 'next pool'.

If you want to make the change, I'd advise also checking the corresponding .c file to see if there are similar comments.

@rruuaanng
Copy link
Contributor Author

rruuaanng commented Oct 5, 2024

The "" just means that we repeat what is written above so it's not really an extra character. Now, I'm not sure that the issue is indeed an issue. Why? because there is a comment that says:

Pools are carved off an arena's highwater mark (an arena_object's pool_address
member) as needed.  Once carved off, a pool is in one of three states forever
after:

used == partially used, neither empty nor full
    At least one block in the pool is currently allocated, and at least one
    block in the pool is not currently allocated (note this implies a pool
    has room for at least two blocks).
    This is a pool's initial state, as a pool is created only when malloc
    needs space.
    The pool holds blocks of a fixed size, and is in the circular list headed
    at usedpools[i] (see above).  It's linked to the other used pools of the
    same size class via the pool_header's nextpool and prevpool members.
    If all but one block is currently allocated, a malloc can cause a
    transition to the full state.  If all but one block is not currently
    allocated, a free can cause a transition to the empty state.

full == all the pool's blocks are currently allocated
    On transition to full, a pool is unlinked from its usedpools[] list.
    It's not linked to from anything then anymore, and its nextpool and
    prevpool members are meaningless until it transitions back to used.
    A free of a block in a full pool puts the pool back in the used state.
    Then it's linked in at the front of the appropriate usedpools[] list, so
    that the next allocation for its size class will reuse the freed block.

empty == all the pool's blocks are currently available for allocation
    On transition to empty, a pool is unlinked from its usedpools[] list,
    and linked to the front of its arena_object's singly-linked freepools list,
    via its nextpool member.  The prevpool member has no meaning in this case.
    Empty pools have no inherent size class:  the next time a malloc finds
    an empty list in usedpools[], it takes the first pool off of freepools.
    If the size class needed happens to be the same as the size class the pool
    last had, some pool initialization can be skipped.

In the 'used' state, it says

It's linked to the other used pools of the same size class via the pool_header's nextpool and prevpool members.

When we are in 'full' state, it says that

its nextpool and prevpool members are meaningless until it transitions back to used

so the comment is still correct. As for the 'empty' state,

The prevpool member has no meaning in this case

and

Empty pools have no inherent size class

So technically, only the change on the nextpool has some meaning. However, I don't think this is worth the change since:

* the 'used' state is the default state

* the 'full' state is either temporary, in which case the prev/next pool members have no meaning or transitions back to 'used'.

* the 'empty' state has no notion of a size class, so we don't really care. And it can just be seen as an "available pool waiting to be used".

I think it's more important to know that we are within the same size class in the comment though than to just have a 'next pool'.

If you want to make the change, I'd advise also checking the corresponding .c file to see if there are similar comments.

The important part of this change is that the comment for nextpool doesn't seem to stay consistent with prev, which could cause confusion for anyone reading it since it includes additional information. I also think that if the "" has some special meaning, it should be explained within the program, not just in the comment where it's defined. This is because the structure definition isn’t visible to the user, so they wouldn’t know that there's a "" with a special meaning in the structure. Even if they did see the "", they wouldn’t automatically connect it to any additional description.
However, I agree with what you mentioned about point

If you want to make the change, I'd advise also checking the corresponding .c file to see if there are similar comments.

And I will check out other descriptions of this structure in obmalloc.c.

@rruuaanng
Copy link
Contributor Author

It's linked to the other used pools of the same size class via the pool_header's nextpool and prevpool members.

I'm sorry, I need to add something to this. I found the comment snippet you listed in pycore_obmalloc.h. As you mentioned, it has already explained the use of this field in detail in this snippet, so in fact, it may not be necessary to explain the function of nextpool again in the structure definition, just a general description of its use is enough.

@picnixz
Copy link
Member

picnixz commented Oct 5, 2024

I also think that if the "" has some special meaning

The meaning is more a typographic meaning in English not really with CPython or computer sciences in general. More precisely,

/*
 * First line does: 	ABC DEF
 * Second line does:	  ""
 */

is equivalent to

/*
 * First line does: 	ABC DEF
 * Second line does:	ABC DEF
 */

The important part of this change is that the comment for nextpool doesn't seem to stay consistent with prev, which could cause confusion for anyone reading it since it includes additional information

I think it's more important to read the big comment above the structure which explains the different states quite well rather than the structure itself.

The important part of this change is that the comment for nextpool doesn't seem to stay consistent with prev

Well.. in the 'full' state neither nextpool nor prevpool have a meaning so the comment would be useless in this case as well. I think Eric decided to keep the comment tight to what is an "effective and useful" state, namely the "used" state.

@rruuaanng
Copy link
Contributor Author

rruuaanng commented Oct 5, 2024

Maybe I shouldn't remove the "", but this change comes from here.

Pools in the linked list arena.freepools are not guaranteed in the same size class. So comment in the line below may be not accurate.

struct pool_header *nextpool; /* next pool of this size class */

Maybe I should try to describe it in brief terms.

@picnixz
Copy link
Member

picnixz commented Oct 5, 2024

I would either do nothing or do what you did (namely just write 'next pool' and 'previous pool'). But if we just keep 'next pool' and 'previous pool' in the comments, I feel we do not gain anything since the attribute names nextpool and prevpool are clear enough.

As such, I don't think we need to change what's already written but I'll leave that to Eric to decide.

@rruuaanng
Copy link
Contributor Author

rruuaanng commented Oct 5, 2024

I think what you said makes sense, so how do you think this modification works?

/*
* In used state, it points to the next partially used pool of the same size class
* in the circular doubly linked list (along with prevpool).
*
* In empty state, it points to the next pool in the singly linked list of free pools. 
* this pool can be reused for any size class later.
*
* In full state, nextpool has no meaningful.
*/
    struct pool_header *nextpool;
    struct pool_header *prevpool;       /* previous pool       ""        */

I think if nextpool has different meanings in different states, it should be as detailed as possible.

Like the comment it

    /* Whenever this arena_object is not associated with an allocated
     * arena, the nextarena member is used to link all unassociated
     * arena_objects in the singly-linked `unused_arena_objects` list.
     * The prevarena member is unused in this case.
     *
     * When this arena_object is associated with an allocated arena
     * with at least one available pool, both members are used in the
     * doubly-linked `usable_arenas` list, which is maintained in
     * increasing order of `nfreepools` values.
     *
     * Else this arena_object is associated with an allocated arena
     * all of whose pools are in use.  `nextarena` and `prevarena`
     * are both meaningless in this case.
     */
    struct arena_object* nextarena;
    struct arena_object* prevarena;

@picnixz
Copy link
Member

picnixz commented Oct 5, 2024

I think what you said makes sense, so how do you think this modification works?

No I really don't like duplicating the comment. There is no real gain for most of the users. It's already explained once clearly. The reason why nextarena is explained at the structure level is because it's only documented once here (I couldn't find any other reference).

If we wanted to be both concise and helpful, the best way to document it is to update the structure comment itself to Pool for small blocks (see details below).

@picnixz
Copy link
Member

picnixz commented Oct 5, 2024

Ideally, we could even just remove the entire section on arena management and pools and move them to an internal documentation. This part of the code is not exposed to external users and is really implementation details so documentation does not need to target the regular user.

@rruuaanng
Copy link
Contributor Author

I think what you said makes sense, so how do you think this modification works?

No I really don't like duplicating the comment. There is no real gain for most of the users. It's already explained once clearly. The reason why nextarena is explained at the structure level is because it's only documented once here (I couldn't find any other reference).

If we wanted to be both concise and helpful, the best way to document it is to update the structure comment itself to Pool for small blocks (see details below).

Hmmm, I agree, So see the pool table below for details.

@rruuaanng rruuaanng changed the title gh-92953: Correct comments error and extra character in struct pool_header gh-92953: Correct comments error in struct pool_header Oct 5, 2024
@rruuaanng rruuaanng changed the title gh-92953: Correct comments error in struct pool_header gh-92953: Add description source for nextpool in struct pool_header Oct 5, 2024
@rruuaanng rruuaanng changed the title gh-92953: Add description source for nextpool in struct pool_header gh-92953: Add description source to struct pool_header Oct 5, 2024
@erlend-aasland
Copy link
Contributor

The comment has been there since 2001; it's for a strictly internal API/data structure, and it has not caused confusion within the core dev team. Let's just keep it. Thanks for the PR.

@rruuaanng rruuaanng deleted the gh92953 branch October 15, 2024 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants