Upgrade to anoncreds via api endpoint #2922
Conversation
Got a couple of errors while testing this with the alice/faber demo (errors on the alice side). I think the response from the upgrade should include some kind of status re what was upgraded; currently the response is status 200 with an empty body.

My testing: ran a local postgres instance, then ran alice and faber:

```
$ AGENT_PORT_OVERRIDE=8010 POSTGRES=1 ./run_demo run faber --wallet-type askar --revocation --multitenant
```

After the upgrade, the credential exchange failed on the alice side:

```
Traceback (most recent call last):
  ...
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  ...
15 After receiving credential offer, send credential request
Credential exchange abandoned
```
ianco
left a comment
Got some errors testing using the alice/faber demo, see other comments.
Love the idea of a tutorial for doing this using Alice/Faber. Is that possible/reasonable/helpful? You do need to do something first (populate the storage) and then do an upgrade. Even if it does nothing, I think it would be helpful to demonstrate?
Tried another test, upgrading faber (and not alice), and everything works fine (can issue and request proof after the upgrade). Then upgraded alice - same errors as above.
Hmm. I remember seeing that error a long time ago and thought I had fixed it. Possibly an issue with postgres. I'll look into it.
It's easy to demonstrate with alice/faber when using multitenancy (since you don't have to restart the agent). Not sure how to verify that the database is upgraded, though. For single tenancy we need to provide an option in the demo to restart the agent after it's been upgraded.
I didn't test with sqlite (which is what is used by default for the demo) |
... and if it's a single-tenant upgrade we can include a message like "Agent shut down, needs to restart" ...
Remind me why we need a restart in single-tenant mode, but not in multi-tenant mode? And is there any way to eliminate the requirement? Maybe we ignore the start-up option and use what the storage says?
The restart is because we need to reload the context (all the anoncreds modules, the anoncreds profile). Currently, it kind of ignores the wallet-type configuration on the restart start-up: it reloads the context and anoncreds modules. However, it seems safer to force the restart for an agent which has an active context and has already injected other classes into the profile, and so on. That being said, it might be possible to reload the context, like it does on startup now, when the upgrade completion is detected. I didn't do it because I didn't think a restart was an issue.
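The rule being described here (multitenant subwallets can have their profile reloaded in place, while a stand-alone agent or the multitenant admin agent has an active context and is safer to fully restart) could be summarized in a small decision helper. This is just an illustration of that rule, not ACA-Py code:

```python
def on_upgrade_complete(is_subwallet: bool) -> str:
    """Decide what to do once an upgrade is detected as complete.

    A multitenant subwallet can have its profile/context swapped for the
    anoncreds one in place. A stand-alone agent (or the multitenant admin
    agent) already has other classes injected into its active profile, so
    the safer option is to shut down and let the orchestrator restart it.
    """
    if is_subwallet:
        return "reload-profile"        # swap in the anoncreds profile in place
    return "shutdown-for-restart"      # full restart rebuilds the context from scratch
```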
Haven't been able to reproduce that error yet with the demo and postgres. Also, the integration test |
You can do this with the |
OK we can jump on a call if we need to ... |
@ianco I'll call you on Monday. I still can't replicate it, but I'm not sure I'm using postgres correctly. I will want the morning to figure out whether it is a problem with the actual upgrade and the blinded secret. I know I had this problem before when I was re-creating the cred def's private and key proof data during the upgrade, but thought it was fixed after I changed the upgrade to use the existing values.
My expectation of the “upgrade” process for an operating ACA-Py instance in a Kubernetes environment is:
If the upgrade has to be done by a special instance of the code that has to restart different things, it seems like it is much harder to manage. That approach also works fine for a non-Kubernetes environment.
I think it works the way you are describing. The controller doesn't have to manage anything. The aca-py agents just need to restart after they detect the upgrade has completed, as if they had crashed. This would be handled by any type of cloud deployment, like Kubernetes.
I tested again this morning, a few different scenarios. If I create the wallet (alice) and then immediately upgrade, then there is no problem and I can issue and request proofs. If I first issue and revoke some credentials (issued 4, revoked 3) and then upgrade alice, then I get the above error when I try to request a proof.
Looks like there's a small difference in how the record is stored. The existing record was saved as the JSON value `{"value":{"ms":"71064680940360014629618879654672290185096545699652080871503389385314050854890"}}`, but anoncreds expects it to be saved directly as a string. Hoping it should be a trivial fix. Thanks for finding this.
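For illustration, handling both forms during an upgrade could look something like this. `load_secret_value` is a hypothetical helper, not the actual fix in the PR; it just shows the legacy JSON-wrapped value versus the plain string anoncreds expects:

```python
import json

def load_secret_value(raw: str) -> str:
    """Return the bare secret string, whether the stored value is the
    legacy JSON wrapper {"value": {"ms": "..."}} or already a plain
    string as anoncreds expects."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return raw  # not JSON at all: already a plain string
    if isinstance(parsed, dict):
        # Legacy askar record shape: unwrap value.ms
        return parsed.get("value", {}).get("ms", raw)
    return raw  # e.g. a bare decimal string parses as a JSON number
```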
@ianco This has been updated. There were two issues when upgrading a multitenant holder.
Expanded the integration test to cover upgrading the holder subwallet and re-issuing a credential. |
Also added a small response payload to the upgrade endpoint. |
Looks good! |
Some success! I was able to upgrade both the alice and faber wallets according to my test scenario above (start alice and faber with askar wallet and in multitenant mode with revocation enabled, issue and revoke a bunch of credentials, then upgrade the alice and then faber wallets). However, all proof requests from faber to alice return false. I thought it might be related to the other reported bug (#2934), so I deleted all alice credentials from her wallet using the api, issued a new credential (not revoked) and then tried another proof request, and it still returned false. With the askar wallet the proof would still pass as long as there was at least one valid credential, regardless of how many were revoked. So I suspect the issue is in either the anoncreds-rs code or in the aca-py anoncreds-related code.
Yes, I think this is the same issue I reported here. I hadn't tried the other scenario you described. I thought it was some logic getting the most valid credential, but it looks like that might not be the case. I don't think it's related to the upgrade at all, as I can replicate it with fresh anoncreds agents. I will test some other cases as well, but when I hadn't revoked any credentials, the proof verified correctly after the upgrade. It was only when any credentials had been revoked and the wallet was anoncreds that the proof always returned false.
Signed-off-by: jamshale <[email protected]>
If it's an existing issue we should probably merge this PR and then deal with it as a separate issue. |
ianco
left a comment
LGTM. There is an issue with validating proofs in anoncreds but that's independent of these changes.
With regards to that "minor performance hit" -- I'm noticing it because I'm picking this up after testing with more debug logging (#3689), and I can see that with every single API call an extra, unnecessary askar session and connection is opened up. I think this isn't just a minor performance hit, but really not the way that this should have been implemented. Can we please revisit this, and see if there's a different way to handle the "upgrading" logic?

Edit: opted to create an issue for it: #3694
This was originally opened as #2840, but this addresses some comments and adds improvements, mostly with handling multi-instance aca-py scenarios.
The upgrading middleware will now use a cache during upgrades and after the upgrade has been completed. Agents that have the upgrade middleware but never upgrade will still have a very minor performance hit, due to needing to query the is_upgrading record.
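A rough sketch of the caching idea described above; this is an illustration, not the actual ACA-Py middleware, and `query_upgrade_state` stands in for the `is_upgrading` record query against wallet storage:

```python
class UpgradeMiddlewareSketch:
    """Gate requests on upgrade state, caching the terminal answer.

    Until an upgrade has happened, every request pays for one storage
    query (the "very minor performance hit" mentioned above). Once the
    query reports the upgrade finished, the answer is cached and the
    per-request lookup stops.
    """

    def __init__(self, query_upgrade_state):
        # query_upgrade_state() -> "not_upgrading" | "upgrading" | "upgraded"
        self._query = query_upgrade_state
        self._upgrade_done = False

    def handle(self, request_handler):
        if self._upgrade_done:
            return request_handler()      # cached: no storage lookup
        state = self._query()             # the minor per-request hit
        if state == "upgrading":
            # Requests are rejected while the wallet is mid-upgrade.
            raise RuntimeError("503: wallet upgrade in progress")
        if state == "upgraded":
            self._upgrade_done = True     # cache the terminal state
        return request_handler()
```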
There is a sequence diagram at `docs/design/UpgradeViaApi.md` describing the design and flow.

In multi-instance scenarios, the instances which did not receive the upgrade request will now detect that an upgrade has been initiated from another instance and begin a polling process to check for when the upgrade completes.
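The polling done by the other instances might look roughly like this. A sketch only: `upgrade_finished` is assumed to be some async check against shared storage (e.g. the `is_upgrading` record), not the actual implementation:

```python
import asyncio

async def wait_for_upgrade(upgrade_finished, interval=0.01, timeout=5.0) -> bool:
    """Poll until an upgrade started by another instance completes.

    `upgrade_finished` is an async callable returning True once the
    upgrade initiated elsewhere has finished. Returns False on timeout,
    in which case the caller can keep serving 503s and try again later.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while loop.time() < deadline:
        if await upgrade_finished():
            return True  # caller can now reload its context or restart
        await asyncio.sleep(interval)
    return False
```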
In stand-alone agents (single tenant) or multitenant admin agents, any instance that detects the upgrade has completed will still shut down and be required to start up again. This could possibly still be avoided, but I think it's best to reload the context from scratch after the upgrade. However, instead of failing after the upgrade restart due to the `storage-type` being `askar-anoncreds` and the `wallet-type` config being `askar`, it will instead get a warning to change the configuration, reload the context as anoncreds, and successfully launch. I think this is important to prevent accidents where the configuration isn't updated immediately after upgrading.

Both these improvements mean the controller no longer has to be concerned about scaling down the aca-py instances and coordinating the `wallet-type` configuration change. The controller operator simply scales down the instances of the controller, upgrades it to handle the anoncreds endpoints, and then hits the upgrade endpoints. As long as the aca-py agent restarts itself after shutting down, the rest will happen automatically.

I looked into doing a block on the webhook queue, but I don't think it's necessary. The incoming webhooks will be blocked by the upgrade middleware and outgoing messages shouldn't cause issues with agents. My understanding of the webhooks, and everything they might do during an upgrade, is limited though, so I might be off on that.
See also #2881 for the migration guide.