Skip to content

Conversation

ChrisChV
Copy link
Contributor

@ChrisChV ChrisChV commented Jul 28, 2025

Description

  • Adds the top_level_parent_usage_key to the EntityLinkBase
  • This field is used to save the top-level parent of a component or container when it is imported into a course. Example: A unit with components imported into a course. The unit is the top-level parent of the components.
  • Updates the DownstreamListView to return the top-level parents instead of downstream child, if this parent exists.
  • Each time containers with children were synchronized, a new downstream block was created for each child instead of updating the existing one. This occurred because the upstream_key was incorrectly validated as an Opaquekey against a list of key strings. This was fixed by converting the upstream_key to a string before the verification. (see 34cd5a4 and 2964783)
  • Which edX user roles will this change impact? "Course Author", "Developer".

Supporting information

Testing instructions

Deadline

None

Other information

@openedx-webhooks
Copy link

openedx-webhooks commented Jul 28, 2025

Thanks for the pull request, @ChrisChV!

This repository is currently maintained by @openedx/wg-maintenance-edx-platform.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.
🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads
🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.


Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Jul 28, 2025
@github-project-automation github-project-automation bot moved this to Needs Triage in Contributions Jul 28, 2025
@ChrisChV ChrisChV changed the title feat: Add top_level_parent field and updates the API feat: Add top_level_parent field and updates the API [FC-0097] Jul 28, 2025
@ChrisChV ChrisChV marked this pull request as draft July 28, 2025 16:38
@mphilbrick211 mphilbrick211 added the FC Relates to an Axim Funded Contribution project label Jul 28, 2025
@mphilbrick211 mphilbrick211 moved this from Needs Triage to Waiting on Author in Contributions Jul 28, 2025
@ChrisChV ChrisChV changed the title feat: Add top_level_parent field and updates the API [FC-0097] feat: Add top-level parent logic to Upstream/Dowstream links [FC-0097] Jul 28, 2025
@ChrisChV ChrisChV marked this pull request as ready for review July 28, 2025 21:05
Copy link
Contributor

@navinkarkera navinkarkera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChrisChV Nice work! This works as expected but I have left some suggestions and a doubt in the tests.

Comment on lines 199 to 205
top_level_keys = result.exclude(id__in=non_top_level_ids).values_list(
'top_level_parent_usage_key', flat=True
).distinct()

top_level_objects = ContainerLink.filter_links(**{
"downstream_usage_key__in": top_level_keys
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, top_level_* variables should be the ones that are at the top, but here they are the objects that are under some other container.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. The top_level_parent always represents the highest-level parent at import time, not the direct parent. This code retrieves the top-level parent keys and their objects (containers) and returns them instead of their descendant.

`use_top_level_parents` is an special filter, replace any result with the top-level parent if exists.
Example: We have linkA and linkB with top-level parent as linkC and linkD without top-level parent.
After all other filters:
Case 1: `use_top_level_parents` is False, the result is [linkA, linkB, linkD]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Case 1: `use_top_level_parents` is False, the result is [linkA, linkB, linkD]
Case 1: `use_top_level_parents` is False, the result is [linkA, linkB, linkC, linkD]

It should list all components no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated: bc484ec

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChrisChV Just want to confirm whether linkC will not show up if use_top_level_parents is False?

assert response.status_code == 200
data = response.json()
self.assertEqual(data["count"], 3)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After making the above suggested change, this increases to 4, which seems correct as we are also making a change to top level subsection in line 1096 no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. According to the requirements, only the top-level parents should be displayed. Even if all containers and components within a section are updated, only the top-level parent, which is the section, should be displayed. In this test, the top-level parent of the subsection and component is the section. This test checks that only the section is displayed and is not duplicated in the result.

The test fails because the code above is not replacing descendants with their top-level parent, it is adding the subsection to the response.

I added another more simple tests for this: bc484ec#diff-cdab0b43d246b3a18252af57261eefce305c7a0ed93bce4549288127324c5bf3R1085

Copy link
Contributor

@navinkarkera navinkarkera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChrisChV Great work, it is not easy to catch all the edge cases here! 👍

  • I tested this: (Played around with multiple containers from library in courses)
  • I read through the code
  • I checked for accessibility issues
  • Includes documentation

@navinkarkera
Copy link
Contributor

navinkarkera commented Jul 30, 2025

@ChrisChV One more request, could you update the process_list function here to something like this:

if item["last_published_at"] and merged[key]["last_published_at"] and item["last_published_at"] > merged[key]["last_published_at"]:

I was testing #37008 from @rpenido and I think it would be possible to copy unpublished blocks and this line fails in that case.

OR we should not allow users to copy unpublished blocks from library,

@ChrisChV
Copy link
Contributor Author

I was testing #37008 from @rpenido and I think it would be possible to copy unpublished blocks and this line fails in that case.

OR we should not allow users to copy unpublished blocks from library,

It's a good question. I agree with not allowing users to copy unpublished blocks from libraries. @edschema @sdaitzman What do you think?

@rpenido
Copy link
Contributor

rpenido commented Jul 30, 2025

OR we should not allow users to copy unpublished blocks from library,

For more context, when we copy/paste a component from a library, we are copying the latest draft version and linking it.
When we use the "Library Content" to add a component, we only show and add the published version of the element.

ChrisChV added 4 commits July 30, 2025 12:31
Each time containers with children were synchronized, a new downstream block was created for each child
instead of updating the existing one. This occurred because the upstream_key was incorrectly validated as
an Opaquekey against a list of key strings. This was fixed by converting the upstream_key to a string before the verification.
@edschema
Copy link

OR we should not allow users to copy unpublished blocks from library,

For more context, when we copy/paste a component from a library, we are copying the latest draft version and linking it. When we use the "Library Content" to add a component, we only show and add the published version of the element.

Hi @rpenido @ChrisChV and team, thanks for catching this! Users should not be able to reuse unpublished blocks. This is a good case I don't believe we have accounted for. I made an issue to track disabling copy from the context menu for unpublished blocks here: openedx/frontend-app-authoring#2344

If copying a block from a library that has been published but has draft changes, when pasted, that block should show the latest published version in the course. By this I mean:

  • Block C is published
  • Block C is edited to Block C', but Block C' is not published
  • If a user copies Block C' and pastes into their course, the version pasted should be Block C

@rpenido
Copy link
Contributor

rpenido commented Jul 30, 2025

Thanks @edschema! I left a comment on the new issue.

@bradenmacdonald
Copy link
Contributor

bradenmacdonald commented Jul 31, 2025

Adds the top_level_parent_usage_key to the EntityLinkBase
This field is used to save the top-level parent of a component or container when it is imported into a course. Example: A unit with components imported into a course. The unit is the top-level parent of the components.

I'm curious, could you achieve the same effect with an is_top_level boolean field, or is it necessary to store the full ID of the top level entity? (edit: Oh, I see - we don't really have the hierarchy available when we're querying the entity links table, so without the full ID we can't merge changes from children into the top level parent item. Makes sense.)

Also, when we get to unlinking (openedx/frontend-app-authoring#2164) we'll need to account for recursively updating the top_level_parent_usage_key. I'll add a note to that issue.

Copy link
Contributor

@bradenmacdonald bradenmacdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work @ChrisChV! I think this is great, but I do have some concerns about import-export and re-runs, because of the way you're storing full usage IDs. I wonder if we can achieve the same thing without storing the course ID part.

@@ -260,6 +295,7 @@ def test_unit_sync(self):
editor="visual"
upstream="{self.upstream_html1['id']}"
upstream_version="2"
top_level_downstream_parent="{downstream_unit["locator"]}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will work for copy-paste and other use cases, but course exports might be a problem. When you export the course, you can import it with a different ID. and then the top_level_downstream_parent IDs will all be wrong because they include the course ID part which has now changed.

We could store just the [top level parent block type, top level parent block ID] instead, which would be stable even if courses are re-run, imported, duplicated, etc.

OR instead of making top_level_downstream_parent a usagekey field, we could make it a foreign key to the parent *Link model, and then in the OLX just store is_top_level=True/False.

Actually now I'm wondering if we even have the logic to duplicate all the ContainerLink/ComponentLink objects when you re-run a course? Might be something we have overlooked?

Copy link
Contributor

@navinkarkera navinkarkera Aug 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we are exporting the links along with course at all, we also don't duplicate them on re-run. Simplest option to fix it would be to run recreate_upstream_links for the specific course after importing a course or re-run.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplest option to fix it would be to run recreate_upstream_links for the specific course after importing a course or re-run.

Good point. I created openedx/frontend-app-authoring#2269 for this. @ChrisChV can you please review?

@@ -327,6 +327,17 @@ class UpstreamSyncMixin(XBlockMixin):
default=None, scope=Scope.settings, hidden=True, enforce_type=True,
)

top_level_downstream_parent = String(
help=(
"The usage key of the downstream block that is the top-level parent of "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comment above. I'm worried about encoding full usage keys in XBlock fields. I think there is a Reference field type that is supposed to handle this properly (store the type+ID but not the course ID part), but I'm not sure if it's actually used anywhere or if it works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated here 49e1989

I have used a simple method to save a string "block_type@block_id". Neither the Reference class nor the DefinitionKey class helped me much.

OR instead of making top_level_downstream_parent a usagekey field, we could make it a foreign key to the parent *Link model, and then in the OLX just store is_top_level=True/False.

Regarding using a boolean, a descendant wouldn't know exactly what its top-level parent is; the boolean doesn't give that information.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your simple string method sounds fine to me, but I don't think you should call it a "definition key". In split modulestore, definition refers to a very specific thing that contains a subset of the XBlock fields, and I think it's going to be confusing if you call this a "definition key".

I guess you are thinking of DefinitionLocator and its related DefinitionKey which does represent a pair of [block_type, ID] but the ID is not a block_id - it's a mongoDB object ID. So storing a [block_type, block_id] and calling it a "definition key" (which is always a [block_type, mongoDB object ID]) is going to be confusing.

There is actually a thing you can use for this though - called BlockKey:

class BlockKey(NamedTuple):
"""
A pair of strings (type, id) that uniquely identify an XBlock Usage within a LearningContext.
Put another way: LearningContextKey * BlockKey = UsageKey.
Example:
"course-v1:myOrg+myCourse+myRun" <- LearningContextKey string
("html", "myBlock") <- BlockKey
"course-v1:myOrg+myCourse+myRun+type@html+block@myBlock" <- UsageKey string
"""
type: str
id: str
@classmethod
def from_usage_key(cls, usage_key):
return cls(usage_key.block_type, usage_key.block_id)

I think you can keep all your code the same, but just rename top_level_downstream_parent_def to top_level_downstream_parent_key and maybe rewrite get_definition_from_usage_key to use BlockKey.from_usage_key() before converting it to a string .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bradenmacdonald Thanks! It works

Updated in: c664a7a

@ChrisChV
Copy link
Contributor Author

ChrisChV commented Aug 5, 2025

@bradenmacdonald This and openedx/frontend-app-authoring#2271 are ready for another review

I will do this openedx/frontend-app-authoring#2168 (comment) in a follow-up PR. I would like to merge this PR to unlock other tasks.

Copy link
Contributor

@bradenmacdonald bradenmacdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ChrisChV ! Looking good, just one concern about the name "definition" and then once you've renamed that and @navinkarkera confirms he's tested the final version, I will approve.

@ChrisChV
Copy link
Contributor Author

ChrisChV commented Aug 8, 2025

Thanks @ChrisChV ! Looking good, just one concern about the name "definition" and then once you've renamed that and @navinkarkera confirms he's tested the final version, I will approve.

@navinkarkera It's ready for another review

@ChrisChV
Copy link
Contributor Author

@bradenmacdonald @navinkarkera I need some help here.

These tests are failing because of this change 633e050#diff-5b6284478c70bbd62aaaf2b39e1d5b4b526214ecf8aea0cea9fefc84fdab79e7R277

Adding the delay solves the problem, but it is not what is needed now. It seems that some configuration is missing in the tests, but I can't find the solution.

@bradenmacdonald
Copy link
Contributor

@ChrisChV Are you talking about this?

# There is a race condition if it's asynchronous: If a container with children
# is added, everything is created in bulk. If it's asynchronous, there's no certainty
# which links will be created first. They must be executed in order so that the parent
# link is created first and the children can link to it as a top-level parent.
handle_create_or_update_xblock_upstream_link(str(xblock_info.usage_key))

What I would suggest is that in the child handler, when you go to create the link and find that the parent link doesn't exist yet, then create the parent link synchronously and idempotently (call handle_create_or_update_xblock_upstream_link for the parent from within the child handle_create_or_update_xblock_upstream_link handler). Later, the handler will get called to create the parent link and it will already exist, so it will be a no-op, but that's fine. Would something like that work?

@ChrisChV
Copy link
Contributor Author

@bradenmacdonald It works! Thanks! Those changes are ready for review if you wish to take a look: 6cc840a

Thanks @ChrisChV ! Looking good, just one concern about the name "definition" and then once you've renamed that and @navinkarkera confirms he's tested the final version, I will approve.

@navinkarkera It's ready for a final review

@bradenmacdonald
Copy link
Contributor

@ChrisChV Glad to hear! I'll let @navinkarkera do a final check and then you can go ahead and merge. If he's not able to review by tomorrow though let me know.

Copy link
Contributor

@navinkarkera navinkarkera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChrisChV Nice work! Tested this locally with subsections and units.

`use_top_level_parents` is an special filter, replace any result with the top-level parent if exists.
Example: We have linkA and linkB with top-level parent as linkC and linkD without top-level parent.
After all other filters:
Case 1: `use_top_level_parents` is False, the result is [linkA, linkB, linkD]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChrisChV Just want to confirm whether linkC will not show up if use_top_level_parents is False?

@ChrisChV
Copy link
Contributor Author

Just want to confirm whether linkC will not show up if use_top_level_parents is False?

Yes, I updated the comment. Thanks! 5d806b0

@ChrisChV ChrisChV enabled auto-merge (squash) August 14, 2025 16:59
@ChrisChV ChrisChV merged commit ec72dc7 into openedx:master Aug 14, 2025
48 checks passed
@ChrisChV ChrisChV deleted the chris/FAL-4231-top-level-parent branch August 14, 2025 17:36
@github-project-automation github-project-automation bot moved this from Waiting on Author to Done in Contributions Aug 14, 2025
@edx-pipeline-bot
Copy link
Contributor

2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production.

@edx-pipeline-bot
Copy link
Contributor

2U Release Notice: This PR has been deployed to the edX production environment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FC Relates to an Axim Funded Contribution project open-source-contribution PR author is not from Axim or 2U
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

8 participants