Adds non-semantic prose template metadata to method guards, so that a Familiar-style user interface (and perhaps a future version of the Familiar itself) can enable, by popups and autocomplete, a natural conversational capability interaction with normal exo objects that resembles the conversational capability interactions it supports with humans and LLMs. In all cases, all the prose text, that is, all the text outside of cards, is assumed unreliable. The only text that is visible and reliable is the text within cards. When talking to an exo, whether or not the exo's interface provides prose template metadata, a prose "To @" message sent to an exo just drops the rest of the prose and sends only the values that the text of each card evaluates to. Thus, it doesn't matter if the prose is nonsense.
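As a rough illustration of "drop the prose, send only the card values", here is a minimal sketch using a plain JavaScript tagged template. The names `proseSend` and `makeCounter` are hypothetical and exist only for this example; this is not the PR's `proseCall` implementation, and a plain object stands in for an exo.

```javascript
// Hypothetical helper, NOT the PR's proseCall: returns a template tag that
// ignores the prose and forwards only the substitution ("card") values.
const proseSend = (target, methodName) => (_proseParts, ...cardValues) =>
  // _proseParts holds the text outside the cards. It is deliberately unused.
  target[methodName](...cardValues);

// A plain object standing in for an exo (assumption for illustration).
const makeCounter = () => {
  let count = 0;
  return {
    incr: delta => {
      count += delta;
      return count;
    },
  };
};

const counter = makeCounter();

// Sensible prose and nonsense prose behave identically: only 2 and 3 are sent.
const first = proseSend(counter, 'incr')`Please bump the counter by ${2}.`;
const second = proseSend(counter, 'incr')`Purple monkey ${3} dishwasher`;
```

Both calls reach `incr` with just the card value, which is why the surrounding prose can be arbitrary without affecting what the recipient receives.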
The purpose of the metadata is so that the user interface can fill in this prose part for the user, once the recipient exo and the method name are known. This produces a dialog record that will usually be somewhat understandable by non-programmers, in the same sense that conversations with humans or LLMs are somewhat understandable. In the context of a Familiar-like system, the non-prose card text is the only thing that is reliable. In the case where you're talking to the exo, the non-prose card text is the only thing that is meaningful.
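A minimal sketch of how a user interface might use such metadata to pre-fill the prose once the recipient and method name are known. The record shape and the names `callProse`, `argGuards`, and `suggestProse` are assumptions made up for this example, not the properties this PR adds.

```javascript
// Hypothetical guard record. `callProse` and `argGuards` are illustrative
// names only, not the actual method guard properties.
const incrGuard = {
  argGuards: ['number'],
  callProse: ['Please add ', ' to the counter.'],
};

// Interleave the template's prose with placeholder cards, one per hole.
const suggestProse = (methodName, guard) => {
  const { callProse, argGuards } = guard;
  if (callProse === undefined) {
    // No prose metadata: fall back to a bare, call-like rendering.
    const holes = argGuards.map((_, i) => `[card ${i + 1}]`);
    return `${methodName}(${holes.join(', ')})`;
  }
  return callProse
    .map((text, i) =>
      i < callProse.length - 1 ? `${text}[card ${i + 1}]` : text,
    )
    .join('');
};

const withProse = suggestProse('incr', incrGuard);
const withoutProse = suggestProse('incr', { argGuards: ['number'] });
```

Here `withProse` is `'Please add [card 1] to the counter.'` and `withoutProse` is `'incr([card 1])'`. In both cases the user fills in only the card, and only the card's value would be sent.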
TODO must draw a mock diagram for how I imagine this PR's trivial test case would look as an interactive user experience.
TODO none of this works with optional or rest args yet. I had a plan going in, but partway through doing this I realized that plan was unworkable. This isn't ready for prime time until we figure that out. But we can do lots of UI experiments before then.
TODO the names proseCall, proseCallWhen, proseReturns are too long for the way we imagine these will be used. We need to bikeshed.
We need to see how well the proseCall function composes with E for expressing async sends with similar mixed prose.
I acknowledge that this experiment resembles many previous PL experiments that did not turn out well. In some ways, it resembles the spirit of COBOL. (@kriskowal also raises AppleScript.) I don't have any strong reason to believe this one will be better. We'll see.
@kriskowal and I did talk about how the card language that appears in the UI may not be Hardened JS itself, but rather @dckc 's scratch-like interactive Jessie experience + a new block for a petname where a lexical variable name may normally appear. (@dckc , please insert a link here.) It is important that the normal experience is for the cards to be reliably interpretable by experienced people who do not consider themselves programmers, so they gain reliable security benefits they understand. I think scratch-Jessie can be used well in this way for the typical card. Perhaps all we need is @dckc 's scratch Justin? These scratch expressions then translate to Hardened JS expression code for those template substitution holes at the code level.
Security Considerations
Complicated. Whenever natural language prose appears, there is always the hazard that it might be interpreted differently by different participants, leading to dangerous misunderstandings. As LLMs get more reliable, misunderstanding disasters may become rare enough that users become complacent, making the problem worse. At least the card-vs-non-card-prose distinction gives users a consistent clue about what is reliable, independent of whether their counterparty is human, LLM, or a simple exo. This at least makes it possible for users to operate in a secure manner they understand.
Scaling Considerations
Minor. The guards become larger because they carry these prose templates. Calls expressed with mixed prose have to do a bit of surgery to turn a mixed-prose-call into a regular exo call. Certainly as a per-human user interface mechanism, all these costs are trivial. I don't think there's a scaling problem lurking here.
Documentation Considerations
OMG
Testing Considerations
This PR includes rudimentary tests of the mechanism. IMO enough for its current POC stage of development. Because prose must always be considered unreliable, tests should not be written using these mechanisms, except for code testing these mechanisms or the user experience they enable.
Compatibility Considerations
Very few compat concerns. I had to add two optional properties to method guards, and adjust two t.deepEqual tests testing the contents of method guards. Because the presence of these properties is optional, method guards from earlier versions of endo will work fine here. And generally vice versa, aside from t.deepEqual-like tests. FWIW, these two tests were easy to adjust and nothing else broke.
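To make the compatibility point concrete, here is a small sketch of why adding optional properties is harmless to code that reads only the fields it knows, yet still breaks exact structural comparisons. The record shape and the property names `callProse` and `returnProse` are hypothetical stand-ins, not the names this PR uses, and `JSON.stringify` equality stands in for a `t.deepEqual`-style check.

```javascript
// Hypothetical "old" and "new" method guard payloads (illustrative shape).
const oldGuard = { argGuards: ['number'], returnGuard: 'number' };
const newGuard = {
  ...oldGuard,
  // Two optional, non-semantic properties (assumed names).
  callProse: ['Please add ', ' to the counter.'],
  returnProse: ['The counter is now ', '.'],
};

// An exact structural comparison notices the extra properties.
const sameExactly = JSON.stringify(oldGuard) === JSON.stringify(newGuard);

// A consumer that reads only the fields it knows is unaffected.
const describeGuard = ({ argGuards, returnGuard }) =>
  `(${argGuards.join(', ')}) -> ${returnGuard}`;
const sameForConsumer = describeGuard(oldGuard) === describeGuard(newGuard);
```

`sameExactly` is `false` while `sameForConsumer` is `true`, which matches the observation that only the exact-contents tests needed adjusting.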
Upgrade Considerations
I think none. But this is always trickier to reason about than we expect. If you think of something, please raise it here.
I have a theory that we can just train the agent to marshal patterns and interface guards as Justin and feed that directly. I don’t think the LLM is particularly hindered by structured representations of data. I am also proposing a convention of providing additional guidance with a help() => string method.
This PR is obsolete, to be replaced with #3075 . The comments on this PR are equally relevant there. However, #3075 is not yet ready for review, so please don't bother looking at either one yet.
Closes: #XXXX
Refs: #3075
Superseded by #3075