Skip to content

Lazy load post meta#4216

Open
spacedmonkey wants to merge 35 commits intoWordPress:trunkfrom
spacedmonkey:fix/lazy-load-post-meta
Open

Lazy load post meta#4216
spacedmonkey wants to merge 35 commits intoWordPress:trunkfrom
spacedmonkey:fix/lazy-load-post-meta

Conversation

@spacedmonkey
Copy link
Member

@spacedmonkey spacedmonkey commented Mar 10, 2023

Trac ticket: https://core.trac.wordpress.org/ticket/57496


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

@github-actions
Copy link

github-actions bot commented Jun 12, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props spacedmonkey, joemcgill, peterwilsoncc, swissspidy.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

The commit updates the expected number of database queries in several PHPUnit tests. The changes reflect improvements made in database query management and caching strategy. As a result, these tests are now expecting fewer queries than before, ensuring that our performance optimizations are correctly implemented and functioning.
@github-actions
Copy link

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

Introduce 'rest_menu_read_access' filter to allow customization of read access permissions in REST API endpoints for menus, menu items, and menu locations. This facilitates more flexible access control based on custom criteria.
Replaced wp_lazyload_post_meta with update_postmeta_cache to improve clarity and ensure the correct function is executed for updating post meta cache.
Copy link
Member

@joemcgill joemcgill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @spacedmonkey. I haven't tested this, but have left a few questions after reading through the proposed changes.


$object_ids = array_keys( $this->pending_objects[ $meta_type ] );
if ( $object_id && ! in_array( $object_id, $object_ids, true ) ) {
return $check;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole method is a line for line duplication of lazyload_meta_callback() except for this line. Why is the logic different here, and could this be simplified by adding a check for $meta_type to return the $check rather than adding the $object_id to the array of IDs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joemcgill This is the secret sause of this change. Unlike other meta types like, site, comment and term, post meta gets called ALOT. So to stop every call to post meta loading this lazy load queue, this only loads if the queue if an item is in the post meta queue.

For, if I have 10 items of post meta loaded and 5 items in the lazy load queue, if you request an post id in the load meta, you DONT want the queue items to load.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spacedmonkey I think the logic here should be slightly different. If the object ID passed to get_post_meta() is already cached, then it makes sense not to load the queued data. If the object ID is not cached, then a database call will be made and it makes sense to load the queued data.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterwilsoncc that is not how the other meta types work. IDs are put in the queue. If the id is already loaded, when wp_cache_get_multiple, is called, if it is only object is not cache is loaded.

I don't see lots of cases where an id in the queue but also primed. I expect that ids that might be used will be added to the queue, for example attachment meta or post parent meta.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spacedmonkey I realise that is not how it works for other meta data (potentially a bug), but you are proposing that post meta works differently.

However, the code in it's current form will trigger a database query for get_post_meta( $uncached_post_id ) without priming other data in the queue. As the idea of lazy loading meta data is to ride another database query, it seems we should do that here even if the uncached object is not queued.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think I am starting to get that. I am honestly not sure how we would do this.

I also think there is a trade off here between number of database queries vs memory usage. I see the desire to limit the number of database queries here. But loading post meta that might not ever be used could be worse than before. The current patch ensure that the post meta is used, as it is requested.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, memory use is a risk and possibly why the lazy loading of post meta wasn't introduced earlier but I haven't been able to find an explanation.

Using my example above, the risk exists for queued meta get_post_meta( 8 ) too, so I am not sure that it's a problem to accept the risk for the unqueued meta that requires a database call get_post_meta( 10 );

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterwilsoncc I am not sure I understand what you are trying to say here. What is your recommended course of action?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spacedmonkey I am saying that if a database query is to be made, the queue should be processed. This should happen regardless of whether the object ID is in the queue or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my suggestion about consolidating this logic into the existing method in ebf588e. Once the approach for clearing the queue has been settled, I'm happy with the shape of this update.

}

if ( ! isset( $q['lazy_load_post_meta'] ) ) {
$q['lazy_load_post_meta'] = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the DocBlock, shouldn't this be the following?

Suggested change
$q['lazy_load_post_meta'] = false;
$q['lazy_load_post_meta'] = $q['update_post_meta_cache'];

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than just removing the docs about the default in 4e3a8ec, can it be updated to be accurate?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joemcgill I am not sure what is being asked here.

if ( ! isset( $q['lazy_load_post_meta'] ) ) {
$q['lazy_load_post_meta'] = false;
} elseif ( $q['lazy_load_post_meta'] ) {
$q['update_post_meta_cache'] = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For term meta, lazy loading requires term cache priming. Can you add an inline comment about why this is different?

Also, the Docblock states...

Setting to false will disable cache priming for post meta, so that each get_post_meta() call will hit the database.

...but that doesn't seem to be implemented.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spacedmonkey it still doesn't seem that this comment was addressed:

Also, the Docblock states...

Setting to false will disable cache priming for post meta, so that each get_post_meta() call will hit the database.

...but that doesn't seem to be implemented.

Am I missing something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joemcgill I have added some inline comments.

@spacedmonkey
Copy link
Member Author

@joemcgill @mukeshpanchal27 mind taking a look at this PR?

Copy link
Member

@joemcgill joemcgill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've taken another look at this and still seems like a few questions need to be addressed:

Also, seems like there is still some discussion that needs to be resolved in this thread about when the queue should be cleared.

Even so, my original question in this thread didn't seem to be answered, which is whether lazyload_meta_callback() could be extended to handle lazy loading of post meta, since so much of the logic is duplicated. Something like this:

diff --git a/src/wp-includes/class-wp-metadata-lazyloader.php b/src/wp-includes/class-wp-metadata-lazyloader.php
index f9c02391d2..2d11e5f6ea 100644
--- a/src/wp-includes/class-wp-metadata-lazyloader.php
+++ b/src/wp-includes/class-wp-metadata-lazyloader.php
@@ -187,6 +187,11 @@ class WP_Metadata_Lazyloader {
 
 		$object_ids = array_keys( $this->pending_objects[ $meta_type ] );
 		if ( $object_id && ! in_array( $object_id, $object_ids, true ) ) {
+			// Only load in the post meta value is in the queue.
+			if ( 'post' === $meta_type ) {
+				return $check;
+			}
+
 			$object_ids[] = $object_id;
 		}
 

Replaces the `lazyload_post_meta` method with a consolidated `lazyload_meta_callback` for all meta types. Simplifies code structure and eliminates redundancy, while maintaining functionality for post metadata handling.
Clarifies the behavior of the `$lazy_load_post_meta` parameter by specifying fallback to default post meta priming. Additionally, adjusts whitespace in `tax_query` array assignments for consistency. These changes improve code readability and maintainability.
Defaulted `lazy_load_post_meta` to false and clarified its dependency on `update_post_meta_cache`. Ensured correct functionality by disabling post meta caching when lazy loading is enabled.
Aligned the array key for better readability and consistency with the surrounding code. This change does not impact functionality but improves code clarity in the test file.
Copy link
Member Author

@spacedmonkey spacedmonkey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joemcgill I believe I have actioned your feedback.

Comment on lines +194 to +197
// Only load in the post meta value is in the queue.
if ( 'post' === $meta_type ) {
return $check;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spacedmonkey Re #4216 (comment), I think the code should simply be this to see if a database query is required. If not, it bypasses processing the queue, if so it adds the object ID and continues lazy loading.

Suggested change
// Only load in the post meta value is in the queue.
if ( 'post' === $meta_type ) {
return $check;
}
if ( false !== wp_cache_get( $object_id, $meta_type . '_meta' ) ) {
// Object already in cache, no database query will be made.
return $check;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterwilsoncc This happens in here

$cache_group = $meta_type . '_meta';
$non_cached_ids = array();
$cache = array();
$cache_values = wp_cache_get_multiple( $object_ids, $cache_group );
foreach ( $cache_values as $id => $cached_object ) {
if ( false === $cached_object ) {
$non_cached_ids[] = $id;
} else {
$cache[ $id ] = $cached_object;
}
}
if ( empty( $non_cached_ids ) ) {
return $cache;
}
// Get meta info.
$id_list = implode( ',', $non_cached_ids );
$id_column = ( 'user' === $meta_type ) ? 'umeta_id' : 'meta_id';
$meta_list = $wpdb->get_results( "SELECT $column, meta_key, meta_value FROM $table WHERE $column IN ($id_list) ORDER BY $id_column ASC", ARRAY_A );

So if it first checks if the id is in cache, then if not, then it will do a database query.

Add this check, just repeats to the logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not quite the same.

get_metadata( type, **cached** object ID ...) will process the queue, even though a database query is not required. This is because "get_{$meta_type}_metadata" fires prior to checking the objects cache.

As you've been trying to avoid unnecessary database queries while getting post meta data, this acheives that by only processing the queue if a database query will be made.

Copy link
Member Author

@spacedmonkey spacedmonkey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One future improvement to the lazy_loading class, we could add a loaded_meta property to the class. This could be used to store a list of id / meta type for meta that has already been loaded. Already loaded meta, would not be allowed to be added to the queue again and would be skipped in the loading.

Comment on lines +194 to +197
// Only load in the post meta value is in the queue.
if ( 'post' === $meta_type ) {
return $check;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterwilsoncc This happens in here

$cache_group = $meta_type . '_meta';
$non_cached_ids = array();
$cache = array();
$cache_values = wp_cache_get_multiple( $object_ids, $cache_group );
foreach ( $cache_values as $id => $cached_object ) {
if ( false === $cached_object ) {
$non_cached_ids[] = $id;
} else {
$cache[ $id ] = $cached_object;
}
}
if ( empty( $non_cached_ids ) ) {
return $cache;
}
// Get meta info.
$id_list = implode( ',', $non_cached_ids );
$id_column = ( 'user' === $meta_type ) ? 'umeta_id' : 'meta_id';
$meta_list = $wpdb->get_results( "SELECT $column, meta_key, meta_value FROM $table WHERE $column IN ($id_list) ORDER BY $id_column ASC", ARRAY_A );

So if it first checks if the id is in cache, then if not, then it will do a database query.

Add this check, just repeats to the logic.

@spacedmonkey
Copy link
Member Author

@peterwilsoncc I mocked up an idea that I had to track the already loaded meta ids as part of the metadata lazy loading class. #8399. This would save lot of lookup to cache values. I see this as a general improve to this class and not linked to this change, but it can be bundled, if you see it a blocker

#8399

@peterwilsoncc
Copy link
Contributor

@spacedmonkey I don't think it's a blocker, what is a blocker is failing to process the queue for post types when a databse query will be made. The whole point of lazy loaded meta data is to process the queue alongside another query, the custom code for post meta defeats the purpose of this code.

This test, which is currently failing, demonstrates the issue:

/**
 * @ticket 57496
 */
public function test_post_meta_database_queries_process_the_queue() {
	$lazy_loaded_post_ids = self::$post_ids;
	// Remove one post from the queue.
	$unqueued_post_id = array_shift( $lazy_loaded_post_ids );

	new WP_Query(
		array(
			'post_type'           => 'wptests_pt',
			'post__in'            => $lazy_loaded_post_ids,
			'lazy_load_post_meta' => true,
		)
	);

	// Post meta cache should be empty.
	$cache = wp_cache_get_multiple( self::$post_ids, 'post_meta' );
	$cache = array_filter( $cache );
	$this->assertEmpty( $cache, 'Post meta cache is expected to be empty' );

	// Getting post meta for unqueued post should trigger a database query.
	$queries_before = get_num_queries();
	get_post_meta( $unqueued_post_id );
	$queries_after = get_num_queries();
	$this->assertSame( 1, $queries_after - $queries_before, 'Expected a database query getting unqueued post meta' );

	// Ensure getting post meta for queued posts does not trigger a database query.
	$queries_before = get_num_queries();
	array_map( 'get_post_meta', self::$post_ids );
	$queries_after = get_num_queries();
	$this->assertSame( 0, $queries_after - $queries_before, 'Getting lazy loaded post meta is not expected to trigger a database query' );

	// Ensure all the post meta is now in the cache.
	$cache = wp_cache_get_multiple( self::$post_ids, 'post_meta' );
	$cache = array_filter( $cache );
	$this->assertCount( count( self::$post_ids ), $cache, 'Expected all post meta to be in the cache' );
}

Removing the custom code for post meta allows the test to pass.

@spacedmonkey
Copy link
Member Author

I don't think it's a blocker, what is a blocker is failing to process the queue for post types when a databse query will be made. The whole point of lazy loaded meta data is to process the queue alongside another query, the custom code for post meta defeats the purpose of this code.

This test, which is currently failing, demonstrates the issue:

@peterwilsoncc
This PR is not just about reducing database queries by bundling post meta in the queue when post meta is loaded. Let me clarify.

For blog (site), term, and comment meta, WordPress core rarely, if ever, makes use of this data. These types of meta are primarily designed for developers to extend functionality. Because of this, it would be inefficient to load them on every page request, which is why they are lazily loaded.

Post meta, however, is different. It is widely used throughout WordPress core across all post types. Unlike blog, term, or comment meta functions, get_post_meta is called frequently—sometimes over 100 times per page load. As a result, the post meta queue would be constantly cleared, making lazy loading ineffective.

This PR does not aim to bundle post meta queries together. Instead, it creates a queue of post meta IDs that might be needed in a request. Here are two examples:
1. Post meta for parent posts of posts within the loop.
2. Post meta for posts/pages that are referenced within other posts.

The first time an ID in the queue is requested—such as the first menu item linking to a page—all associated meta will be loaded at once. This allows us to disable post meta priming in areas like menus, where the post meta for linked pages is unlikely to be needed. See #60199.

By design, your test would fail, as it is not designed to do what you are asking.

@peterwilsoncc
Copy link
Contributor

This PR does not aim to bundle post meta queries together. Instead, it creates a queue of post meta IDs that might be needed in a request. Here are two examples: 1. Post meta for parent posts of posts within the loop. 2. Post meta for posts/pages that are referenced within other posts.

The first time an ID in the queue is requested—such as the first menu item linking to a page—all associated meta will be loaded at once. This allows us to disable post meta priming in areas like menus, where the post meta for linked pages is unlikely to be needed. See #60199.

I think what you are describing is the the purpose of the existing update_post_meta_cache option in WP_Query.

The design of lazy loading post meta IS to clear the queue when post meta is requested. Having it behave differently for different types of meta data is simply confusing, and won't result in fewer database queries.

The concern in Core-60199 is a valid concern but the code as written doesn't solve that. Instead of querying post meta in a single query, it increases the chance that post meta will be loaded for each post as a single query.

If loading post meta by single queries is better under some circumstances, then the PR should be targettng that rather than using the existing API.

Comment on lines +800 to +801
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the following changes better reflect the current state of the logic and resolves conversations here and here.

Suggested change
* @type bool $lazy_load_post_meta Whether to lazy-load post meta. If set to false, post meta priming will
* fallback to the default behavior of priming post meta.
* @type bool $lazy_load_post_meta Whether to lazy-load post meta. When set to true, post meta priming will
* be disabled. Default false.


$object_ids = array_keys( $this->pending_objects[ $meta_type ] );
if ( $object_id && ! in_array( $object_id, $object_ids, true ) ) {
return $check;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my suggestion about consolidating this logic into the existing method in ebf588e. Once the approach for clearing the queue has been settled, I'm happy with the shape of this update.

@joemcgill
Copy link
Member

Re-reading the current discussion and if I'm understanding correctly, the only question that needs to be resolved is whether queued post meta should be loaded only when another queued post meta is called, rather than clearing the queue when any uncached post meta is loaded.

The current approach seems to only load queued meta when another queued meta ID is requested.

I think @peterwilsoncc's argument in this comment is valid. To paraphrase:

The risk of loading unused post meta is just as likely when loading another ID from the queue as it is when loading an ID that was not already in the queue. However, I don't feel strongly about this being a blocker to an initial commit. If we find that only clearing the queue when a lazy loaded ID is called results in more unnecessary DB calls, then we can adjust this before the final release.

@swissspidy
Copy link
Member

I'm probably not the right one to review this...are there perhaps other post meta experts?

NB: there are some merge conflicts now, so a refresh would be good.

* @since 5.3.0 Introduced the `$meta_type_key` parameter.
* @since 6.1.0 Introduced the `$update_menu_item_cache` parameter.
* @since 6.2.0 Introduced the `$search_columns` parameter.
* @since 6.8.0 Introduced the `$lazy_load_post_meta` parameter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs updating now.

/**
* Queue post meta for lazy-loading.
*
* @since 6.8.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs updating now.

Copy link
Member

@SirLouen SirLouen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spacedmonkey I've added you some comments to help refreshing the PR

Please ping me in trac when you have finished refreshing this (and maybe adding the proposed changes by the rest) so I can start some performance testing.

$q['category__not_in'] = array_map( 'absint', array_unique( (array) $q['category__not_in'] ) );
sort( $q['category__not_in'] );
$tax_query[] = array(
$tax_query[] = array(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one that is not applying because it was already fixed

$q['tag__not_in'] = array_map( 'absint', array_unique( (array) $q['tag__not_in'] ) );
sort( $q['tag__not_in'] );
$tax_query[] = array(
$tax_query[] = array(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the second that is not applying

$post_ids = array_filter( $post_ids );
if ( $post_ids ) {
_prime_post_caches( $post_ids, $this->query_vars['update_post_term_cache'], $this->query_vars['update_post_meta_cache'] );
if ( $this->query_vars['lazy_load_post_meta'] ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changeset [59993] brought major changes in this section.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants