Fixes #38856 - chain composite CV publishes to wait on children#11621
Fixes #38856 - chain composite CV publishes to wait on children#11621ianballou merged 10 commits intoKatello:masterfrom
Conversation
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- Both
scheduled_composite_publish?andrunning_component_publish_task_idsiterate over allContentView::Publishdynflow tasks and then inspect their payloads; if the task table is large this may be expensive, so consider narrowing the query (e.g., by label, input data, or usingfind_each) to avoid loading and deserializing every matching task. scheduled_composite_publish?rescuesStandardErrorand silently returnsfalse; it would be safer to either rescue a narrower set of exceptions (e.g., only dynflow persistence errors) or at least log failures so that issues in loading delayed plans are visible rather than quietly disabling the skip behavior.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Both `scheduled_composite_publish?` and `running_component_publish_task_ids` iterate over all `ContentView::Publish` dynflow tasks and then inspect their payloads; if the task table is large this may be expensive, so consider narrowing the query (e.g., by label, input data, or using `find_each`) to avoid loading and deserializing every matching task.
- `scheduled_composite_publish?` rescues `StandardError` and silently returns `false`; it would be safer to either rescue a narrower set of exceptions (e.g., only dynflow persistence errors) or at least log failures so that issues in loading delayed plans are visible rather than quietly disabling the skip behavior.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| # requests created by concurrent component CVs, causing duplicate publishes. | ||
| # Manual publishes and promotions should still check for pending requests. | ||
| content_view = ::Katello::ContentView.find_by(id: input[:auto_publish_content_view_id]) | ||
| return if content_view&.composite? && input[:auto_published] |
There was a problem hiding this comment.
The auto publisher was meant to be totally data driven and agnostic to which kind of CV it's being told to publish ie it's up to the individual actions populating auto_publish_content_view_ids to decide the auto publish candidates which are the composites. Does your change here hang on the fact that the CV is composite?
There was a problem hiding this comment.
I thought about this a little more. An auto publish request is just that - a request. What if the request_auto_publish method were smart enough to deny the request (return nil) in the scenario described in your comment? Nothing to check here in that case it seems like. That keeps the business logic in the Manager class
There was a problem hiding this comment.
diff --git a/app/lib/actions/helpers/content_view_auto_publisher.rb b/app/lib/actions/helpers/content_view_auto_publisher.rb
index 6f3c1866d0..3b7fd28f05 100644
--- a/app/lib/actions/helpers/content_view_auto_publisher.rb
+++ b/app/lib/actions/helpers/content_view_auto_publisher.rb
@@ -21,13 +21,6 @@ module Actions
end
def auto_publish_view(_execution_plan)
- # Skip for auto-published composite CVs to prevent race condition.
- # When a chained composite publish finishes, its :stopped hook could find
- # requests created by concurrent component CVs, causing duplicate publishes.
- # Manual publishes and promotions should still check for pending requests.
- content_view = ::Katello::ContentView.find_by(id: input[:auto_publish_content_view_id])
- return if content_view&.composite? && input[:auto_published]
-
request = ::Katello::ContentViewAutoPublishRequest.find_by(content_view_id: input[:auto_publish_content_view_id])
return unless request
diff --git a/app/services/katello/content_view_manager.rb b/app/services/katello/content_view_manager.rb
index a7c3b6e474..192efd8f17 100644
--- a/app/services/katello/content_view_manager.rb
+++ b/app/services/katello/content_view_manager.rb
@@ -22,6 +22,11 @@ module Katello
end
def self.request_auto_publish(content_view:, content_view_version:)
+ if scheduled_composite_publish?(content_view)
+ auto_publish_log(nil, "composite publish already scheduled, skipping")
+ return
+ end
+
request = content_view.create_auto_publish_request!(
content_view_version: content_view_version
)
@@ -48,14 +53,6 @@ module Katello
request.with_lock do
composite_cv = request.content_view
- # Check if composite publish is already scheduled (chained to component CVs)
- # If so, skip - the scheduled publish will use latest component versions
- if scheduled_composite_publish?(composite_cv)
- auto_publish_log(request, "composite publish already scheduled, skipping")
- request.destroy!
- return
- end
-
if content_view_locks(content_view: composite_cv).any?
auto_publish_log(request, "locks found")
return
There was a problem hiding this comment.
It's been a couple weeks since I've looked at this, but your method here makes sense.
Question though - does it make sense to skip auto publish for a composite view since a composite will never hold other composites?
There was a problem hiding this comment.
Question though - does it make sense to skip auto publish for a composite view since a composite will never hold other composites?
Did you mean to say, "skip a subsequent auto publish for a composite view that was just auto-published?" I'm inferring from the check you added here.
If so, then I have a question in return: given your PR with my patch applied, when would the conditional added here evaluate to true resulting in a skipped ContentViewAutoPublishRequest? In a real world scenario, not in terms of the logic.
Full disclosure: it seems this check is responsible for the bug I highlighted in my first round of testing and is conveniently removed by my patch.
Sorry if I missed the point of your question!
There was a problem hiding this comment.
Glad to hear the hanging bug is fixed by the change.
What I was seeing without this conditional was that the composite content view auto publish would trigger its own auto publish again because the children created extra requests for content view auto publishing. So two chained child tasks publish and we get two composite content view publishes.
I'll have to test this again with your changes, if you're not seeing extra composite CV publishes, then we're all set.
There was a problem hiding this comment.
I think I see now, with the new patch, the request is only created if there isn't a composite publish already scheduled, which simplifies/fixes this logic where we had to check the request origin. "An auto publish request is just that - a request."
-> indeed, no need for that request if it's unhelpful.
There was a problem hiding this comment.
Since the requests are saved in the DB, it's a benefit too to having fewer of them around in case weirdness happens like the race condition that caused the duplicates.
There was a problem hiding this comment.
Yep! When I was testing most of the duplicate create_auto_publish_request calls terminated there leading to much less contention around triggering the request. Win-win. I just gave it one or two passes of my script - definitely vet it against any other scenarios you have in mind.
There was a problem hiding this comment.
Well it definitely works for my by-hand test
|
Decided to give this a test using the script I shared in #11611 with 3 composites and ten shared components. I've ended up in some stuck state where I can publish the components but composites seems stuck: Run https://gist.github.com/jturel/76e94c4e2cbe39130f8e49b320cff40d with COMPONENT_COUNT=10 (this is just what I've been testing with) Once the components and composites finish there are three auto publish requests still in the DB - should be zero. Now, publish just the components: Components publish OK but composites are stuck |
@jturel thanks for the testing, did you test this with the change suggestion above? Or was it tested as-is? |
That was against the PR as written - didn't see it with the patch I shared! |
c2eac60 to
829a18d
Compare
|
The PR should be free of the race condition mentioned before now. |
|
Gave this another test and it looks like everything is working as expected! My log output looks like this when testing 3 composites sharing 10 components: In the end I get the expected three composite publishes Nice work! |
|
As an aside, but to put it on the record: most if not all of the defensive mechanisms in |
829a18d to
deb1af2
Compare
I'm thinking this over myself. You don't think two threads could call I agree though, regardless, I do prefer to remove this later with some more experiments. |
I can't seem to trigger that kind of scenario but that doesn't mean it's technically impossible. It's just much, much less likely now for the reason you mentioned - two threads shouldn't be trying to trigger the same request thanks to the checking added when creating the request. Although, since the race conditions are addressed and we can't predict how this code or its usage/call patterns might change in the future, I'm leaning toward leaving it indefinitely as far as things stand now. Which means my earlier comment is basically irrelevant :D |
deb1af2 to
ac667d6
Compare
|
Adam is requesting a small refactor so that foreman-tasks provides a chaining interface rather than having to use Dynflow directly: theforeman/foreman-tasks#788 (comment) |
ac667d6 to
5277f9b
Compare
|
@vsedmik I've fixed the issue by failing the CCV publish with a 409 if there is already a publish scheduled. |
test/models/content_view_test.rb
Outdated
| composite_cv = katello_content_views(:composite_view) | ||
|
|
||
| # Create a scheduled publish task for the composite CV | ||
| task = ForemanTasks::Task.create!( |
There was a problem hiding this comment.
The implementation of this test looks more like a test of scheduled_composite_publish? than check_ready_to_publish. Would be nicer to just put an expectation on scheduled_composite_publish? at the model test level here
There was a problem hiding this comment.
This test was created to exercise the new check_scheduled_publish! method. I'll make that more explicit.
There was a problem hiding this comment.
This is finished, let me know if it works for you.
5277f9b to
d90ffdc
Compare
|
This PR now has a hard dependency on theforeman/foreman-tasks#788 since I am using the |
2daa4be to
a5f32c9
Compare
|
@adamruzicka the dependencies (foreman tasks and dynflow) are bumped. |
|
theforeman/foreman_ansible#785 for the RH cloud problem |
1045530 to
7bb730a
Compare
|
I had to bump Ruby to 3.0 in Katello to work with Dynflow 2.0. |
|
The ruby test suite seems stuck on waiting..Can you try a force push or if you see a way to restart it..Changes look good and rh_cloud tests went green here so it's an APJ from me here once the ruby tests run once.. |
Adam told me that he doesn't have anything more to add here.
7bb730a to
a630802
Compare
|
Latest push was just a rebase to get the tests going again. |
8e33592 to
74c989d
Compare
|
It looks like the ruby upgrade added some fun rubocop issues. |
|
@vsedmik found an issue with the task names on the dependencies page in foreman-tasks - they matched the parent task instead of the dependency task. I also found out we can makes the names more verbose. That will be fixed in foreman-tasks itself. This PR is still good to go (at least once I check up on that failing test) |
bb0cf66 to
8280d50
Compare
|
To fix the Foreman Tasks UI dependency naming issue: theforeman/foreman-tasks#796 This PR can merge before that merges, it's not a blocker. |
vsedmik
left a comment
There was a problem hiding this comment.
ACK, works well with the latest foreman-tasks PR mentioned above!





What are the changes introduced in this pull request?
A rebased version of #11540 without the auto publish event queue use. It's a much simpler change now.
Considerations taken when implementing this change?
I needed to edit
auto_publish_viewto ignore requests when publishing composite content views specifically. This was causing an issue where auto publish occurred after a scheduled auto publish ran. I don't see why a composite CV publish would need to trigger an auto publish of itself.What are the testing steps for this pull request?
Same as in #11540
Ensure latest Dynflow is used and that you have Dynflow/dynflow@5538dd4
To make things easier, also test with the UI PR theforeman/foreman-tasks#788
Summary by Sourcery
Coordinate composite content view auto-publishing with component publishes to avoid duplicate or out-of-order composite publishes.
New Features:
Bug Fixes:
Tests: