Skip to content

Conversation

Gerg
Copy link
Member

@Gerg Gerg commented Aug 18, 2025

  • This is a re-implementation of 133486e
  • This reverts commits:
  • This error can be raised by BBS when there is a race condition between
    the CC syncer and CC API when creating an LRP for a process
  • Since we want to desire a process anyway, it already existing (with
    the correct version) is likely not an error case. Thus, we can behave
    idempotently.
  • Improvements from previous implementation:
    • No longer ignores other types of APIErrors in
      desire_app_handler.rb
    • No longer coupled to the exact error message coming from BBS

rescue CloudController::Errors::ApiError => e # catch race condition if Diego Process Sync creates an LRP in the meantime
if e.name == 'RunnerError' && e.message['the requested resource already exists']
existing_lrp = client.get_app(process)
client.update_app(process, existing_lrp)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do lose out on this update. I think the syncer will get to it anyway (or already got to it, since this is a race condition). I'll think about it some more.

Copy link
Member Author

@Gerg Gerg Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, looking at client.update_app, it's getting the LRP based on the CC process's guid and version. If the versions match between the LRP and the CC process, then no update is needed (that's the purpose of version).

Thus, I don't think this additional update is needed.

@Gerg
Copy link
Member Author

Gerg commented Aug 18, 2025

Discovered while reviewing #4452

@Gerg
Copy link
Member Author

Gerg commented Aug 18, 2025

Reproduction steps:

❯ bosh ssh api
...
$ sudo su -
# /var/vcap/jobs/cloud_controller_ng/bin/console
...
> c = ::CloudController::DependencyLocator.instance.bbs_apps_client
...
> c.desire_app ProcessModel.first
CloudController::Errors::ApiError: Runner error: Process Guid: fba69784-a41a-4e8c-bf10-7f42e172718f: the requested resource already exists
from /var/vcap/data/packages/cloud_controller_ng/7e6a3d36ede40c3a067e44f3c9971388757df5ad/cloud_controller_ng/lib/cloud_controller/diego/bbs_apps_client.rb:123:in `handle_diego_errors'

Gerg added 2 commits August 20, 2025 15:29
…p process"

Revert "Remove unnecessary logger"
Revert "Rubocop fixes"

This reverts commits:
e321fa8.
fc96bb6.
133486e.
- This is a re-implementation of 133486e
- This error can be raised by BBS when there is a race condition between
  the CC syncer and CC API when creating an LRP for a process
- Since we want to desire a process anyway, it already existing (with
  the correct version) is likely not an error case. Thus, we can behave
  idempotently.
- Improvements from previous implementation:
    - No longer ignores other types of APIErrors in
      desire_app_handler.rb
    - No longer coupled to the exact error message coming from BBS
@Gerg Gerg force-pushed the raise_desire_app_errors branch from 3e52839 to c2d496a Compare August 20, 2025 22:29
@Gerg Gerg marked this pull request as ready for review August 20, 2025 22:30
@Gerg Gerg requested review from sethboyles and tcdowney August 20, 2025 22:30
Previous:
```
"error":{"descriptor":[{},{}]}}
```

Improved:
```
"error":"\u003cDiego::Bbs::Models::Error: type: :ResourceExists, message: \"the requested resource already exists\"\u003e"
```
@Gerg
Copy link
Member Author

Gerg commented Aug 26, 2025

Manual testing

  1. bosh ssh api
  2. sudo su -
  3. /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/diego/desire_app_handler.rb:
    module VCAP::CloudController
      module Diego
        class DesireAppHandler
          class << self
            def create_or_update_app(process, client)
              # if (existing_lrp = client.get_app(process))
              #   client.update_app(process, existing_lrp)
              # else
                client.desire_app(process)
              # end
            end
          end
        end
      end
    end
  4. monit restart cloud_controller_ng
  5. tail -f /var/vcap/sys/log/cloud_controller_ng/cloud_controller_ng.log
  6. In another terminal: cf map-route --hostname nap3 nip bertram.app-runtime-interfaces.ci.cloudfoundry.org (or anything else that will go through desire_app_handler).

Results:

  1. Command succeeds
    Mapping route nap3.bertram.app-runtime-interfaces.ci.cloudfoundry.org to app nip in org org / space space as admin...
    OK
    
  2. Error log:
    {"timestamp":"2025-08-25T23:07:23.104702127Z","message":"desire.app.response","log_level":"info","source":"cc.bbs.apps_client","data":{"request_guid":"b8eedebd-51b2-49b7-ad9b-f2ca57920873::4afab2b2-4e0f-4546-9aae-58190b7a343a","user_guid":"9adc9b33-9f27-4246-8269-30564dc0655b","b3_trace_id":"b8eedebd51b249b7ad9bf2ca57920873","b3_span_id":"31b15314866356dd","process_guid":"dabdf66a-02a9-43cf-b0cc-8de30e63e2d4-f9a550ce-8b15-4f40-bc2d-d810ee9810e2","error":"\u003cDiego::Bbs::Models::Error: type: :ResourceExists, message: \"the requested resource already exists\"\u003e"},"thread_id":51620,"fiber_id":51640,"process_id":21,"file":"/var/vcap/data/packages/cloud_controller_ng/ddd60da3712747339f983245f3d0eaf84ccba870/cloud_controller_ng/lib/cloud_controller/diego/bbs_apps_client.rb","lineno":16,"method":"block in desire_app"}
    

@Gerg Gerg requested review from philippthun and johha August 26, 2025 08:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant