Commit 63d2e62

melwitts10
authored and committed
Use subqueryload() instead of joinedload() for (system_)metadata
Currently, when we "get" a single instance from the database and we load metadata and system_metadata, we do so using a joinedload() which does JOINs with the respective tables. Because of the one-to-many relationship between an instance and (system_)metadata records, doing the database query this way can result in a large number of additional rows being returned unnecessarily and cause a large data transfer.

This is similar to the problem addressed by change I0610fb16ccce2ee95c318589c8abcc30613a3fe9 which added separate queries for (system_)metadata when we "get" multiple instances. We don't, however, reuse the same code for this change because _instances_fill_metadata converts the instance database object to a dict, and some callers of _instance_get_by_uuid need to be able to access an instance database object attached to the session (example: instance_update_and_get_original).

By using subqueryload() [1], we can perform the additional queries for (system_)metadata to solve the problem with a similar approach.

Closes-Bug: #1799298

[1] https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html#subquery-eager-loading

Change-Id: I5c071f70f669966e9807b38e99077c1cae5b4606
(cherry picked from commit e728fe6)
1 parent cb4963b commit 63d2e62
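
The row multiplication described in the commit message can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module and hand-written SQL rather than nova's models or SQLAlchemy, and the schema, column names, and row counts (30 metadata items, 40 system_metadata items) are hypothetical, chosen only to make the effect visible.

```python
import sqlite3

# Hypothetical, simplified schema; these are NOT nova's actual tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instances (id INTEGER PRIMARY KEY, uuid TEXT);
    CREATE TABLE instance_metadata (
        id INTEGER PRIMARY KEY, instance_id INTEGER,
        meta_key TEXT, meta_value TEXT);
    CREATE TABLE instance_system_metadata (
        id INTEGER PRIMARY KEY, instance_id INTEGER,
        meta_key TEXT, meta_value TEXT);
""")
conn.execute("INSERT INTO instances (id, uuid) VALUES (1, 'fake-uuid')")
conn.executemany(
    "INSERT INTO instance_metadata (instance_id, meta_key, meta_value) "
    "VALUES (1, ?, ?)",
    [("m%d" % i, "v") for i in range(30)])
conn.executemany(
    "INSERT INTO instance_system_metadata (instance_id, meta_key, meta_value) "
    "VALUES (1, ?, ?)",
    [("s%d" % i, "v") for i in range(40)])

# Joined style: a single query that joins both one-to-many tables.
# Every metadata row is paired with every system_metadata row.
joined_rows = conn.execute("""
    SELECT i.id, m.meta_key, s.meta_key
    FROM instances AS i
    LEFT OUTER JOIN instance_metadata AS m ON m.instance_id = i.id
    LEFT OUTER JOIN instance_system_metadata AS s ON s.instance_id = i.id
    WHERE i.uuid = 'fake-uuid'
""").fetchall()

# Separate-query style: one query per table, no cross product.
instance_rows = conn.execute(
    "SELECT id FROM instances WHERE uuid = 'fake-uuid'").fetchall()
meta_rows = conn.execute(
    "SELECT meta_key FROM instance_metadata WHERE instance_id = 1").fetchall()
sys_meta_rows = conn.execute(
    "SELECT meta_key FROM instance_system_metadata "
    "WHERE instance_id = 1").fetchall()
separate_total = len(instance_rows) + len(meta_rows) + len(sys_meta_rows)

print(len(joined_rows))  # 1200 (30 * 40)
print(separate_total)    # 71 (1 + 30 + 40)
```

The joined form transfers the product of the two collection sizes, while the separate queries transfer their sum, which is the saving this change is after for single-instance lookups.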

2 files changed: +24 -1 lines changed


nova/db/sqlalchemy/api.py

Lines changed: 16 additions & 1 deletion
@@ -47,6 +47,7 @@
 from sqlalchemy.orm import aliased
 from sqlalchemy.orm import joinedload
 from sqlalchemy.orm import noload
+from sqlalchemy.orm import subqueryload
 from sqlalchemy.orm import undefer
 from sqlalchemy.schema import Table
 from sqlalchemy import sql
@@ -1266,13 +1267,27 @@ def _build_instance_get(context, columns_to_join=None):
             continue
         if 'extra.' in column:
             query = query.options(undefer(column))
+        elif column in ['metadata', 'system_metadata']:
+            # NOTE(melwitt): We use subqueryload() instead of joinedload() for
+            # metadata and system_metadata because of the one-to-many
+            # relationship of the data. Directly joining these columns can
+            # result in a large number of additional rows being queried if an
+            # instance has a large number of (system_)metadata items, resulting
+            # in a large data transfer. Instead, the subqueryload() will
+            # perform additional queries to obtain metadata and system_metadata
+            # for the instance.
+            query = query.options(subqueryload(column))
         else:
             query = query.options(joinedload(column))
     # NOTE(alaski) Stop lazy loading of columns not needed.
     for col in ['metadata', 'system_metadata']:
         if col not in columns_to_join:
             query = query.options(noload(col))
-    return query
+    # NOTE(melwitt): We need to use order_by(<unique column>) so that the
+    # additional queries emitted by subqueryload() include the same ordering as
+    # used by the parent query.
+    # https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html#the-importance-of-ordering
+    return query.order_by(models.Instance.id)
 
 
 def _instances_fill_metadata(context, instances, manual_joins=None):
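
The ordering note in the hunk above concerns the follow-up queries that subqueryload() emits: each one embeds the parent query as a subquery and joins the related table to it. The sketch below imitates that shape with sqlite3 and hand-written SQL. It is an approximation under stated assumptions, not the exact SQL that SQLAlchemy generates; the schema, the `anon` alias, and the sample rows are hypothetical, and the LIMIT 1 stands in for a limiting call such as Query.first().

```python
import sqlite3

# Hypothetical, simplified schema and data; NOT nova's actual tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instances (id INTEGER PRIMARY KEY, uuid TEXT);
    CREATE TABLE instance_metadata (
        id INTEGER PRIMARY KEY, instance_id INTEGER,
        meta_key TEXT, meta_value TEXT);
    INSERT INTO instances (id, uuid) VALUES (1, 'uuid-a'), (2, 'uuid-b');
    INSERT INTO instance_metadata (instance_id, meta_key, meta_value) VALUES
        (1, 'host', 'foo'), (1, 'key2', 'bar'), (2, 'host', 'baz');
""")

# Parent query: limited to one row and ordered by a unique column,
# so the row it selects is deterministic.
PARENT_SQL = (
    "SELECT id, uuid FROM instances WHERE uuid = ? ORDER BY id LIMIT 1")
parent = conn.execute(PARENT_SQL, ("uuid-a",)).fetchone()

# Follow-up query in the subquery-eager-loading style: the parent query
# is repeated as a subquery, with the SAME ordering and limit, and the
# one-to-many table is joined to it.
CHILD_SQL = """
    SELECT m.instance_id, m.meta_key, m.meta_value
    FROM (SELECT id FROM instances
          WHERE uuid = ? ORDER BY id LIMIT 1) AS anon
    JOIN instance_metadata AS m ON m.instance_id = anon.id
    ORDER BY anon.id, m.id
"""
children = conn.execute(CHILD_SQL, ("uuid-a",)).fetchall()

# Attach the collection to the parent row loaded by the first query.
metadata = {key: value for _, key, value in children}
print(parent)    # (1, 'uuid-a')
print(metadata)  # {'host': 'foo', 'key2': 'bar'}
```

Because the subquery is evaluated separately from the parent query, a limit without an ordering on a unique column would leave the database free to pick different rows in each, which is why the parent query is ordered by the primary key.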

nova/tests/unit/db/test_db_api.py

Lines changed: 8 additions & 0 deletions
@@ -1693,6 +1693,14 @@ def test_instance_get_all_with_meta(self):
             sys_meta = utils.metadata_to_dict(inst['system_metadata'])
             self.assertEqual(sys_meta, self.sample_data['system_metadata'])
 
+    def test_instance_get_with_meta(self):
+        inst_id = self.create_instance_with_args().id
+        inst = db.instance_get(self.ctxt, inst_id)
+        meta = utils.metadata_to_dict(inst['metadata'])
+        self.assertEqual(meta, self.sample_data['metadata'])
+        sys_meta = utils.metadata_to_dict(inst['system_metadata'])
+        self.assertEqual(sys_meta, self.sample_data['system_metadata'])
+
     def test_instance_update(self):
         instance = self.create_instance_with_args()
         metadata = {'host': 'bar', 'key2': 'wuff'}
