Plugin Architecture, Question-answering Costs, and Query Planning #167
-
Hello. Is Semantic Kernel interoperable, or will it be interoperable, with the recently announced ChatGPT plugins feature?

I would also like to ask about measuring and estimating the costs of answering specific natural-language questions and subquestions, and about the related matter of query planning, including for complex queries that use one or more plugins.

Question-answering services built on GPT and plugins can be envisioned. One architecture would allow logged-in users to create new questions and to collaboratively upvote existing questions until a question accumulates enough "points" to be enqueued for processing. It seems reasonable that the number of points a question must accumulate before being processed should meet or exceed some measure of the complexity, or cost, of answering it.

Estimates or measures of question-answering complexity, or cost, might total the electrical, mechanical, computational, storage, transmission, and administrative costs required to answer the question. Being able to estimate these costs before answering would be useful: one question might cost $0.002 to answer and another $0.02. Answering one question might only require processing a single database table or graph, while answering another might involve querying a set of federated resources. It would seemingly be simpler to algorithmically estimate the complexity, or cost, of answering natural-language questions that can be mapped to a query language, e.g., SQL or SPARQL.

While I am still exploring the recently announced plugins architecture and its documentation, it seems to me that plugin developers could consider providing functions for estimating, in advance, the cost of answering a question with their plugin (a sketch follows below).

Thank you. Hopefully these ideas, comments, questions, and discussion topics are useful for the Semantic Kernel team and community.
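To make that last point concrete, here is a minimal C# sketch of the kind of cost-estimation function a plugin developer could expose, together with the points-threshold check described above. Every name here (`IQuestionCostEstimator`, `CostEstimate`, `UsdPerPoint`, and so on) is a hypothetical illustration, not part of any announced plugin API.

```csharp
using System.Threading.Tasks;

// Hypothetical interface a plugin could expose so that a host can ask,
// before execution, roughly what answering a question would cost.
public interface IQuestionCostEstimator
{
    Task<CostEstimate> EstimateAsync(string naturalLanguageQuestion);
}

// A cost estimate broken down by the cost categories mentioned above.
public record CostEstimate(
    decimal ComputeUsd,
    decimal StorageUsd,
    decimal TransmissionUsd,
    decimal AdministrativeUsd)
{
    public decimal TotalUsd =>
        ComputeUsd + StorageUsd + TransmissionUsd + AdministrativeUsd;
}

public record QueuedQuestion(string Text, int Points);

public static class QuestionGate
{
    // Assumed conversion rate from upvote "points" to a dollar budget.
    private const decimal UsdPerPoint = 0.001m;

    // Enqueue the question only once its accumulated points cover the
    // estimated cost of answering it.
    public static async Task<bool> ShouldEnqueueAsync(
        QueuedQuestion question, IQuestionCostEstimator estimator)
    {
        CostEstimate estimate = await estimator.EstimateAsync(question.Text);
        return question.Points * UsdPerPoint >= estimate.TotalUsd;
    }
}
```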
Replies: 3 comments
-
On Plugin/Skills integration:

Short answer: Yes :).

Longer answer: we're figuring out how deep we go with integration. One option would be to import Plugins as SK Skills. Another, to export SK Skills as Plugins. These aren't mutually exclusive. Another option is aligning on the Skill manifest format.

Feedback welcome - thoughts?

On Cost Estimation

We are working on pieces of this, but mostly for tracking costs after a call is made. This might be worth spinning up as a separate discussion. I suspect the stochastic nature of LLMs would make accurate estimates challenging, but I hear your points above and think this would be worthwhile to explore.
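As one illustration of the "import Plugins as SK Skills" option, a host could fetch a plugin's `ai-plugin.json` manifest and hand its OpenAPI spec URL to a skill importer. A rough sketch, assuming the published ChatGPT plugin manifest fields; SK's actual import API, if and when it lands, may look quite different:

```csharp
using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Subset of the ChatGPT plugin manifest (ai-plugin.json) fields;
// the lowercase names mirror the JSON keys.
public record PluginApi(string type, string url);
public record PluginManifest(
    string name_for_model,
    string description_for_model,
    PluginApi api);

public static class PluginImporter
{
    private static readonly HttpClient Http = new();

    // Fetch a plugin manifest and return it in a form a skill importer
    // could consume (skill name, description, OpenAPI spec URL).
    public static async Task<PluginManifest> LoadManifestAsync(string baseUrl)
    {
        string json = await Http.GetStringAsync(
            new Uri(new Uri(baseUrl), "/.well-known/ai-plugin.json"));
        return JsonSerializer.Deserialize<PluginManifest>(json)
               ?? throw new InvalidOperationException("Invalid manifest.");
    }
}
```

From `manifest.api.url` a host could then download the OpenAPI document and surface each operation as a callable skill function.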
-
Cost estimating is easy / hard. It's easy when you know what text is being processed (there's a C# implementation of a tokenizer in the repo), but hard because you don't know in advance how much text the model will produce or how many calls a plan will make.

A better approach is probably to compute a maximum cost. We know the maximum token size for the model, we (should) know the cost per 1,000 tokens, and we can then limit a run to, say, 20 calls and therefore know the maximum processing cost. The actual cost can be tracked fairly trivially for the LLM, as it's just a matter of counting tokens used.
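For example, with a 4,096-token context window, an assumed price of $0.002 per 1,000 tokens, and a cap of 20 calls, the worst case is 4096 / 1000 × 0.002 × 20 ≈ $0.16. A small sketch of that bound plus the token-counting tally (the limit, price, and class are illustrative, not SK APIs):

```csharp
public class LlmCostTracker
{
    private readonly int _maxTokensPerCall;   // model's context limit, e.g. 4096
    private readonly decimal _usdPer1KTokens; // e.g. 0.002m (illustrative price)
    private readonly int _maxCalls;           // plan-level call cap, e.g. 20

    private int _tokensUsed;

    public LlmCostTracker(int maxTokensPerCall, decimal usdPer1KTokens, int maxCalls)
        => (_maxTokensPerCall, _usdPer1KTokens, _maxCalls)
           = (maxTokensPerCall, usdPer1KTokens, maxCalls);

    // Worst-case bound: every call consumes the full context window.
    public decimal MaxCostUsd
        => _maxTokensPerCall / 1000m * _usdPer1KTokens * _maxCalls;

    // Actual cost so far: just count tokens as calls complete.
    public void RecordCall(int promptTokens, int completionTokens)
        => _tokensUsed += promptTokens + completionTokens;

    public decimal ActualCostUsd => _tokensUsed / 1000m * _usdPer1KTokens;
}
```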
-
Circling back on these interesting topics, I found the following publication: White, Ryen W., and Ahmed Hassan Awadallah. "Task Duration Estimation." In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining (WSDM '19), pp. 636-644. 2019.