-
Notifications
You must be signed in to change notification settings - Fork 405
Add an admin API to search for children of a room #19021
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
test failures appear to be flakes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't reviewed the code much as there may be a potential pivot (see review comments)
failures look like flakes |
synapse/rest/admin/rooms.py
Outdated
pl_event = await self._storage_controllers.state.get_current_state_event( | ||
room_id, EventTypes.PowerLevels, "" | ||
) | ||
pl_users = [] | ||
if pl_event is not None: | ||
users = pl_event.content.get("users", {}) | ||
for user, level in users.items(): | ||
if level >= 25: | ||
pl_users.append(user) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks arbitrary and specific to some use case. It's probably better to use some other endpoints to fetch this data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of the information fetched in _fetch_additional_details
can be filled by other endpoints, so I have removed it in favor of that approach.
synapse/rest/admin/rooms.py
Outdated
if creation_event | ||
else None, | ||
"creator": creation_event.sender if creation_event else None, | ||
"creation_event_id": creation_event.event_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's useful when researching rooms to know the creator to be able to then find other rooms the creator has created, as well as the age of the room. The room creation event id is probably not strictly necessary, but all of this obviated by using other endpoints to pull this information in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the admin endpoint, we may want to include some of this info that you're after. The admin endpoint is meant to be somewhat convenient. It would just be good to include this reasoning for the fields added.
It may also be useful to have the behavior align with the client /hierarchy
endpoint 🤷 So if things work without the extra fields, we can move forward and add them in later if a greater need arises.
We should just be mindful of the N+1 query problem: #19021 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should work without the extra fields and can follow up if in practice that isn't the case.
synapse/rest/admin/rooms.py
Outdated
"room_creation_ts": creation_event.origin_server_ts | ||
if creation_event | ||
else None, | ||
"creator": creation_event.sender if creation_event else None, | ||
"creation_event_id": creation_event.event_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The m.room.create
event will be part of the stripped state that comes along with children_state
already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW it looks like the stripped state will only return the child space state events: "Required: The m.space.child events of the space-room, represented as Stripped State Events with an added origin_server_ts key." which also matches what I saw in practice but this concern is eliminated by using other endpoints to fetch this data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related MSC to include the create event in the stripped state for invites: MSC4311 (not related to the /hierarchy
endpoint)
admin_skip_room_check=True, | ||
max_depth=max_depth, | ||
limit=limit, | ||
from_token=parse_string(request, "from"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from_token
being a string instead of actual token seems like a mistake to me but this is prior art ⏩
tests/rest/admin/test_room.py
Outdated
self.assertEqual(room_result["topic"], "being bad") | ||
self.assertEqual(room_result["join_rule"], "invite") | ||
self.assertEqual(len(room_result["children_state"]), 0) | ||
(self.assertEqual(room_result["world_readable"], False),) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra random syntax?
(self.assertEqual(room_result["world_readable"], False),) | |
self.assertEqual(room_result["world_readable"], False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still seems to be in the diff in various places
synapse/rest/admin/rooms.py
Outdated
room_entry_summary = await self._room_summary_handler.get_room_hierarchy( | ||
requester, | ||
room_id, | ||
omit_remote_rooms=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will we be leaving holes in the /hierarchy
by not looking for remote rooms? (I think so)
For example, if no one from the server is joined to a given sub-space but is joined to some rooms in the sub-space, we would never see the structure here. It might be strange that no one is joined to a m.space.child
but seems possible.
But the federation /hierarchy
endpoint only seems to return public rooms that would be peekable and joined without an invite. So it would only fill in public holes.
Holes from private space rooms are expected unless someone is joined.
synapse/rest/admin/rooms.py
Outdated
|
||
# we omit returning remote rooms that the server is not currently participating in, | ||
# as that information shouldn't be available to the server admin (as they are not | ||
# participating in those rooms) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't feel like the correct or whole reason why as far as I can tell. Context #19021 (comment)
I feel like this should read more like:
We omit remote rooms because we only care about managing rooms local to the homeserver. This also immensely helps with the response time of the endpoint since we don't need to reach out over federation. There is a trade-off where this will leave holes where xxx (see other conversation).
that information shouldn't be available to the server admin (as they are not participating in those rooms)
Why is that the case?
As far as I can tell, GET /_matrix/federation/v1/hierarchy/{roomId}
is at the server-level and is for peeking so there would be no issue for the admins' server to ask other servers about the remote /hierarchy
.
children the requesting server could feasibly peek/join are returned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking narrowly of restricted remote rooms (ie not public/peekable) - I've amended the comment to be clearer, hopefully it's better.
tests/rest/admin/test_room.py
Outdated
self.assertEqual(room_result["topic"], "being bad") | ||
self.assertEqual(room_result["join_rule"], "invite") | ||
self.assertEqual(len(room_result["children_state"]), 0) | ||
(self.assertEqual(room_result["world_readable"], False),) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still seems to be in the diff in various places
if not click.confirm( | ||
f"Uncommitted changes exist in {path}. Commit or stash them. Ready to continue?" | ||
): | ||
raise click.ClickException("Aborted.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was necessary to prevent the lint from failing, and stopping the rest of the tests from being run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now fixed on develop
-> #19092
# only room_id and children_state fields are returned for remote rooms | ||
self.assertEqual(len(room_result["children_state"]), 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a little misleading as it's unknown, not zero. Feels like this should be left unset in the remote room cases that we don't fetch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default entry for _RoomEntry
is ()
@attr.s(frozen=True, slots=True, auto_attribs=True)
class _RoomEntry:
room_id: str
# The room summary for this room.
room: JsonDict
# An iterable of the sorted, stripped children events for children of this room.
#
# This may not include all children.
children_state_events: Sequence[JsonDict] = ()
which gets converted to json for the response by this:
def as_json(self, for_client: bool = False) -> JsonDict:
"""
Returns a JSON dictionary suitable for the room hierarchy endpoint.
It returns the room summary including the stripped m.space.child events
as a sub-key.
Args:
for_client: If true, any server-server only fields are stripped from
the result.
"""
result = dict(self.room)
# Before returning to the client, remove the allowed_room_ids key, if it
# exists.
if for_client:
result.pop("allowed_room_ids", False)
result["children_state"] = self.children_state_events
return result
So I think to leave children_state
unset I think we'd have to change the structure of _Room_Entry
to an optional and have as_json
omit children_state
in the case where self.children_state
is None
, but that seems clunkier than having the children_state
be an empty list in the case of remote rooms we don't fetch - thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if this could be addressed by adding a comment about the face that children_state
is unknown in the case of remote rooms in the endpoint documentation?
the room may have, paginating over the space tree in a depth-first manner to locate child rooms. This is functionally similar to the [CS Hierarchy](https://spec.matrix.org/v1.16/client-server-api/#get_matrixclientv1roomsroomidhierarchy) endpoint but does not return information about any remote | ||
rooms that the server is not currently joined to and does not check for room membership when returning room summaries. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description should be updated with the nuance that we return all known rooms but we won't go and fetch the hierarchy from federation which may leave holes. These rooms only have room_ids, etc
state_key=state_key, | ||
) | ||
|
||
# and add remote room to space |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a comment about "Ideally, we'd also add some rooms in this remote space but we're trying to keep that information unknown to this server which could only be found out by reaching out to remote_test_server
"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I am a little unclear on what is being said here - specifically "we'd add some rooms in this remote space" as this is adding a room to a local space?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a local space that we want to get the hiearchy of.
The local space includes a space that is only known remotely. Ideally, that remote space would also have some rooms and we'd assert that those rooms aren't pulled in from federation. But it's hard to test the absence of federation or those rooms appearing. We don't even have a remote homeserver setup to talk to in the case of these trial tests.
This would be best be handled by some in-repo Complement tests for the Synapse admin API. But we don't have that setup in the Synapse codebase yet.
I'm saying that we should have a comment about how this test could be better but it's hard because it's only at integration test level.
False, | ||
) | ||
root_room_entry = room_hierarchy[0] | ||
if not root_room_entry or not await self._is_remote_room_accessible( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should take admin_skip_room_visibility_check
into account here.
It's possible someone may utilize admin_skip_room_visibility_check
without omit_remote_room_hierarchy
in the future.
) | ||
for room in rooms | ||
}, | ||
{"space_room", "room1", "room2", "room3", "remote_room"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would the pagination be returning the same list of all rooms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is asserting on the concatenation of the first call and the second call with pagination, but I can change it so they are asserted separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, yeah that rings a bell. I picked up on this the first time but not this time (read past it).
It's probably good enough as-is. We assert the number of rooms in each request.
It is often useful when investigating a space to get information about that space and it's children. This PR adds an Admin API to return information about a space and it's children, regardless of room membership. Will not return information about remote rooms that the server is not participating in.