| 1 | +# Requirements |
| 2 | + |
| 3 | +This document outlines the functional requirements for the PHP AI Client, as well as the bigger picture for how it can eventually be used. The concrete technical architecture, which builds on these requirements, is defined in a separate document. |
| 4 | + |
| 5 | +## Objective |
| 6 | + |
| 7 | +Enable calling any generative AI implementation using a uniform API in various programming languages. |
| 8 | + |
| 9 | +### Context |
| 10 | + |
| 11 | +* While the initial rationale for building a provider-agnostic AI client abstraction for the WordPress AI Team was the lack of such an abstraction for WordPress, further research showed that this gap also exists in other PHP CMSs, and even in the overall PHP ecosystem. |
| 12 | +* In other words, the background for starting this project is the lack of such an SDK in the PHP ecosystem. Since UI on the web today heavily relies on JavaScript (in addition to whichever server-side language), the SDK's API needs to be centered in PHP, but accessible via JavaScript as well (e.g. through a REST API). |
| 13 | +* The PHP AI Client (this project) will only provide the foundational PHP layer, in a platform-agnostic way. Additional future packages or plugins will be implemented to cover CMS-specific aspects and the JavaScript layer. |
| 14 | +* While a few noteworthy projects with a related purpose exist in various programming languages, they are not in PHP, or their API is not provider-agnostic, or their API lacks flexibility for emerging modalities and features. |
| 15 | +* Ideally, the APIs in this project can eventually be translated from PHP to client-side JavaScript. As such, the project should follow paradigms that can be expressed in both programming languages. |
| 16 | + |
| 17 | +## Architecture requirements |
| 18 | + |
| 19 | +This section lists the key requirements that this project must meet. For explanations of specific terms, see the [glossary](./GLOSSARY.md). |
| 20 | + |
| 21 | +* MUST support any kind of AI implementation, i.e. cloud-based AI, server-side AI, client-side AI. |
| 22 | +* MUST define clear, language-agnostic data structures for AI inputs and outputs, precise enough to allow for consistent implementation across languages (e.g. expressed as JSON Schema). |
| 23 | +* MUST support arbitrary combinations of input and output modalities, regardless of which combinations or singular modalities generative AI models support today, such as (non-comprehensive list): |
| 24 | + * Text |
| 25 | + * Image |
| 26 | + * Audio |
| 27 | + * Video |
| 28 | +* MUST support additional non-generative features such as (non-comprehensive list): |
| 29 | + * Classification |
| 30 | + * Text to Speech |
| 31 | + * Embedding |
| 32 | +* MUST support response streaming for arbitrary output modalities. |
| 33 | +* MUST allow for long-running operations that may take several minutes, _if_ relevant for the selected provider. |
| 34 | +* MUST define standard ways for interacting with optional provider capabilities, such as managing chat history or specifying multimodal inputs/outputs, _if_ the selected provider supports them. |
| 35 | +* MUST support diverse common model parameters, such as temperature, top P, or image aspect ratio, with uniform names and behavior across providers, _if_ the selected provider supports them. |
| 36 | +* MUST define a modular component model that allows for the addition of new providers, models, and features without modifying core functionality. |
| 37 | +* MUST define an API for external packages to register and implement AI model providers. |
| 38 | +* MUST be decoupled from any AI provider's implementation details (e.g. not all providers require HTTP requests or API authentication). |
| 39 | +* MUST allow provider and model discovery based on specific inputs, outputs, and configuration options supported. |
| 40 | +* MUST define data types and interfaces that have direct equivalents in all supported languages (e.g. no multiple inheritance for classes). |
| 41 | +* MUST define separate APIs for SDK usage and provider registration, so that iterations or breaking changes in one don't automatically affect the other. |
| 42 | + |
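As a rough illustration of how the provider registration and model discovery requirements above could fit together, here is a minimal sketch. It is written in TypeScript rather than PHP only because the requirements ask for paradigms that can be expressed in both languages. Every name in it (`Modality`, `ModelMetadata`, `Provider`, `ModelRequirements`, `ProviderRegistry`, `register`, `findModels`) is hypothetical and not part of the actual architecture, which is defined in a separate document.

```typescript
// Hypothetical sketch: none of these names are defined by the requirements.
type Modality = "text" | "image" | "audio" | "video";

// Plain data structure describing what a model supports.
interface ModelMetadata {
  id: string;
  inputModalities: Modality[];
  outputModalities: Modality[];
  supportedOptions: string[]; // e.g. "temperature", "topP", "aspectRatio"
}

// What an external package would implement to add a provider.
interface Provider {
  id: string;
  listModels(): ModelMetadata[];
}

// What an SDK consumer asks for when discovering models.
interface ModelRequirements {
  inputModalities: Modality[];
  outputModalities: Modality[];
  options?: string[];
}

// True if every required item is present in the available list.
function includesAll<T>(available: T[], required: T[]): boolean {
  return required.every((item) => available.indexOf(item) !== -1);
}

class ProviderRegistry {
  private readonly providers: Provider[] = [];

  // Provider-facing API: adds a provider without touching core code.
  register(provider: Provider): void {
    if (this.providers.some((p) => p.id === provider.id)) {
      throw new Error(`Provider "${provider.id}" is already registered.`);
    }
    this.providers.push(provider);
  }

  // Consumer-facing API: finds models by supported inputs, outputs, and options.
  findModels(req: ModelRequirements): Array<{ providerId: string; model: ModelMetadata }> {
    const matches: Array<{ providerId: string; model: ModelMetadata }> = [];
    for (const provider of this.providers) {
      for (const model of provider.listModels()) {
        if (
          includesAll(model.inputModalities, req.inputModalities) &&
          includesAll(model.outputModalities, req.outputModalities) &&
          includesAll(model.supportedOptions, req.options ?? [])
        ) {
          matches.push({ providerId: provider.id, model });
        }
      }
    }
    return matches;
  }
}
```

In this sketch, `register` is the provider-facing entry point and `findModels` is the consumer-facing one. A real design would likely split the two into distinct interfaces so that breaking changes in one do not affect the other, and would avoid assuming that providers rely on HTTP requests or API authentication.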
| 43 | +### Best practices |
| 44 | + |
| 45 | +* SHOULD use concepts and paradigms that can also be applied in other AI infrastructure projects, either in combination with or separately from the PHP AI Client (e.g. MCP, real-time AI abstraction, prompt generation). |
| 46 | +* SHOULD provide middleware that can be used to "polyfill" certain functionality when a provider does not support it (e.g. message history, downloading files from URLs). |
| 47 | +* SHOULD allow arbitrary request and response parameters for specific providers or models to be passed through even when not formally supported, to cater for provider-specific features or to allow newly added features to be used before official support is added to the SDK. |
| 48 | + |
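To make the "polyfill" middleware idea more concrete, the following is a minimal sketch of how message history could be emulated for a provider that only accepts a single prompt, while provider-specific options are passed through untouched. The names (`Message`, `TextModel`, `supportsHistory`, `withHistoryPolyfill`, `customOptions`) are hypothetical, and the synchronous `generate` method is a simplification made for brevity, not a statement about the actual API.

```typescript
// Hypothetical sketch: none of these names are defined by the requirements.
interface Message {
  role: "user" | "model";
  text: string;
}

interface TextModel {
  // Whether the underlying provider natively accepts multiple messages.
  supportsHistory: boolean;
  // customOptions carries provider-specific parameters the SDK does not formally know.
  generate(messages: Message[], customOptions?: Record<string, unknown>): string;
}

// Wraps a model so callers can always send a full message history.
function withHistoryPolyfill(model: TextModel): TextModel {
  if (model.supportsHistory) {
    return model; // Nothing to emulate.
  }
  return {
    supportsHistory: true,
    generate(messages, customOptions) {
      // Flatten the conversation into one prompt the provider can handle.
      const flattened = messages.map((m) => `${m.role}: ${m.text}`).join("\n");
      // Pass unknown options through unchanged.
      return model.generate([{ role: "user", text: flattened }], customOptions);
    },
  };
}
```

Because the wrapper returns the same `TextModel` shape it receives, calling code does not need to know whether history is native or emulated. Flattening to a role-prefixed transcript is only one possible emulation strategy.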
| 49 | +### Out of scope |
| 50 | + |
| 51 | +* MUST NOT include any common AI features beyond the actual AI client (e.g. no MCP, no agents). |
| 52 | +* MUST NOT include a real-time / live AI abstraction as it requires different infrastructure. |
| 53 | + |
| 54 | +## Credit |
| 55 | + |
| 56 | +This project is heavily based on research into existing AI providers and existing AI client SDKs, and its specification takes significant learnings from them into account. All of these products helped inform the requirements above. |
| 57 | + |
| 58 | +Below is a list of the products that were reviewed. The list is non-comprehensive and reflects a best-effort attempt to include all key resources reviewed. It is not an endorsement of any of these products. |
| 59 | + |
| 60 | +### Cloud-based AI providers |
| 61 | + |
| 62 | +_(in alphabetical order)_ |
| 63 | + |
| 64 | +* [Anthropic API](https://docs.anthropic.com/en/api/) |
| 65 | +* [fal API](https://docs.fal.ai/model-endpoints) |
| 66 | +* [Google Generative Language API](https://ai.google.dev/api/all-methods) |
| 67 | +* [Google Vertex AI API](https://cloud.google.com/vertex-ai/docs/reference/rest) |
| 68 | +* [Nvidia LLM API](https://docs.api.nvidia.com/nim/reference/llm-apis) |
| 69 | +* [OpenAI API](https://platform.openai.com/docs/api-reference/) |
| 70 | +* [Perplexity API](https://docs.perplexity.ai/api-reference/) |
| 71 | +* [Replicate API](https://replicate.com/docs/reference/http) |
| 72 | +* [xAI API](https://docs.x.ai/docs/api-reference) |
| 73 | + |
| 74 | +### AI client SDKs |
| 75 | + |
| 76 | +_(in alphabetical order)_ |
| 77 | + |
| 78 | +* [AI Services WordPress plugin](https://github.com/felixarntz/ai-services) |
| 79 | +* [Drupal AI](https://git.drupalcode.org/project/ai) |
| 80 | +* [Firebase Genkit](https://github.com/firebase/genkit) |
| 81 | +* [Google Gen AI SDK for TypeScript and JavaScript](https://github.com/googleapis/js-genai) |
| 82 | +* [LangChain.js](https://github.com/langchain-ai/langchainjs) |
| 83 | +* [LLPhant](https://github.com/LLPhant/LLPhant) |
| 84 | +* [OpenAI PHP](https://github.com/openai-php/client) |
| 85 | +* [Prism (Laravel)](https://github.com/prism-php/prism) |
| 86 | +* [Vercel AI SDK](https://github.com/vercel/ai) |
| 87 | + |
| 88 | +### AI specifications |
| 89 | + |
| 90 | +_(in alphabetical order)_ |
| 91 | + |
| 92 | +* [A2A protocol](https://github.com/google/A2A) |
| 93 | +* [MCP](https://github.com/modelcontextprotocol/modelcontextprotocol) |
| 94 | +* [OpenAI model spec](https://cdn.openai.com/spec/model-spec-2024-05-08.html) |
| 95 | + |
| 96 | +### Client-side AI |
| 97 | + |
| 98 | +_(in alphabetical order)_ |
| 99 | + |
| 100 | +* [Chrome built-in AI Prompt API](https://github.com/webmachinelearning/prompt-api) |
| 101 | +* [transformers.js](https://github.com/huggingface/transformers.js/) |