38 changes: 38 additions & 0 deletions docs/dev/intrinsics_and_adapters.md
@@ -0,0 +1,38 @@
# Intrinsics and Adapters
Note: Mellea currently only supports GraniteCommonAdapters and Intrinsics.

## Basics
In Mellea, intrinsics are a type of Component that signals one or more of the following to a backend:
- a special adapter must be used for generation
- the input/output for generation must be transformed in a particular way
- the model options must be modified in a particular way

These changes only happen when the intrinsic is the "action" of the request. Intrinsics should usually not be used as an item in the context of generation (in fact, by default, Intrinsics have no string representation).

These changes are specified by the Adapter that corresponds to a given Intrinsic. Matching happens based on the adapter name and type.
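
For example, adapters are registered under a qualified name (`name + "_" + adapter_type`), and the `get_adapter_for_intrinsic` helper added in `mellea/backends/adapters/adapter.py` (see the diff below) performs the lookup. A minimal sketch:

```python
from mellea.backends.adapters.adapter import (
    AdapterType,
    GraniteCommonAdapter,
    get_adapter_for_intrinsic,
)

# Qualified name is "requirement_check_alora" (name + adapter type).
req_adapter = GraniteCommonAdapter("requirement_check")
available = {req_adapter.qualified_name: req_adapter}

# Tries "requirement_check_alora" first, then "requirement_check_lora".
adapter = get_adapter_for_intrinsic(
    "requirement_check", [AdapterType.ALORA, AdapterType.LORA], available
)
```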

## Parts of an Intrinsic
Intrinsics specify:
- an adapter name (e.g., `requirement_check`)
- the types of adapters suitable for use (e.g., `alora`)
- any kwargs necessary (e.g., a requirement like "make sure the last user message is...")
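
A minimal sketch of constructing one (mirroring the example in `docs/examples/intrinsics/intrinsics.py` below):

```python
from mellea.stdlib.intrinsics.intrinsic import Intrinsic

# The name must match an available adapter; intrinsic_kwargs are passed to the
# adapter's input transformation.
check = Intrinsic(
    "requirement_check",
    intrinsic_kwargs={"requirement": "The assistant is helpful."},
)
```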

## Parts of an Adapter
Adapters specify:
- compatible backends
- adapter type
- functions for getting a path to load them
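
A sketch of creating and registering the one adapter implementation currently shipped, `GraniteCommonAdapter`, on an adapter-capable backend (taken from the example below; assumes a locally running vLLM server):

```python
from mellea.backends.adapters.adapter import GraniteCommonAdapter
from mellea.backends.openai import OpenAIBackend

# GraniteCommonAdapters default to the ALORA adapter type.
req_adapter = GraniteCommonAdapter("requirement_check")

# Any backend implementing AdapterMixin works (e.g., OpenAIBackend, LocalHFBackend).
backend = OpenAIBackend(
    model_id="ibm-granite/granite-3.3-8b-instruct",
    base_url="http://0.0.0.0:8000/v1",
    api_key="EMPTY",
)
backend.add_adapter(req_adapter)
```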

## Using Intrinsics
Mellea Intrinsics currently use the [granite-common](https://github.com/ibm-granite/granite-common) package for loading adapters and formatting inputs and outputs. This means Mellea only allows intrinsics/adapters that follow this pattern.
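
Concretely, `GraniteCommonAdapter` resolves its config and weights through granite-common utilities; a sketch of the underlying calls (matching `mellea/backends/adapters/adapter.py` below):

```python
import granite_common

# Fetch the I/O config and the (a)LoRA weights for an intrinsic, keyed by the
# intrinsic name and the base model name.
io_yaml = granite_common.intrinsics.util.obtain_io_yaml(
    "requirement_check", "granite-3.3-8b-instruct", alora=True
)
lora_path = granite_common.intrinsics.util.obtain_lora(
    "requirement_check", "granite-3.3-8b-instruct", alora=True
)
```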

## Needed Future Work
### Custom Adapters / Intrinsics
Mellea should support custom intrinsic / adapter implementations. To do this:
- make backend `_generate_from_intrinsic` functions generic and utilize only common adapter functions
- adapters must specify a transformation function that encapsulates the input/output modifications necessary for their generation requests
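
One possible shape for that transformation hook (purely hypothetical; these method names are not part of the current API):

```python
from mellea.backends.adapters.adapter import Adapter


class MyCustomAdapter(Adapter):
    """Hypothetical custom adapter that supplies its own I/O transformations."""

    def transform_input(self, messages: list[dict], model_options: dict) -> tuple[list[dict], dict]:
        # Rewrite the chat messages / model options for this adapter's request.
        ...

    def transform_output(self, raw_output: str) -> str:
        # Parse this adapter's output format into something Mellea can consume.
        ...
```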

### Concurrency Checks
Some backends that allow adapters to be loaded (currently only `LocalHFBackend`) cannot use an adapter for one request without impacting other concurrent generation requests.

These backends should support a generation lock that ensures requests are only performed when the correct set of adapters (or no adapters) are active.
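
A minimal sketch of such a lock (illustrative only; the names here are hypothetical):

```python
import threading


class GenerationLockMixin:
    """Hypothetical: serializes generation so adapter state cannot change mid-request."""

    _generation_lock = threading.Lock()

    def _generate_with_adapters(self, request, required_adapters: set[str]):
        with self._generation_lock:
            # Activate exactly `required_adapters` (possibly none), then run
            # the generation while still holding the lock.
            ...
```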
9 changes: 5 additions & 4 deletions docs/dev/requirement_aLoRA_rerouting.md
@@ -14,14 +14,14 @@ The actual rule is slightly more complicated.

## The Actual Rule

If a `Requirement` is validated using a backend that could either use a `constraint` aLoRA or perform an LLMaJ prompt on the underlying model, then the aLoRA is used for validation, even if the `backend.generate_from_context` method is called instead of the `alora.generate_from_strings` method.
If a `Requirement` is validated using a backend that could either use a `requirement_check` aLoRA or perform an LLMaJ prompt on the underlying model, then the aLoRA is used for validation, even if the `backend.generate_from_context` method is called instead of the `backend._generate_from_intrinsic` method.

There are three exceptions to this rule:
1. `Backend.default_to_constraint_checking_alora` is set to `False` (this parameter defaults to `True`).
2. The `Requirement` has a more specific subtype that indicates a more specific intent (`LLMaJRequirement`).
3. The `ALoRA` requirement checker throws an exception.

There is an exception (or disambiguation) to the first exception: If the user provides an `ALoRARequirement`, then the `backend.generate_from_context` call is rerouted to the constraint checking LoRA, regardless of the value of `deault_to_constraint_checking_alora`.
There is an exception (or disambiguation) to the first exception: If the user provides an `ALoRARequirement`, then the `backend.generate_from_context` call is rerouted to the constraint checking LoRA, regardless of the value of `default_to_constraint_checking_alora`.
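
Putting the rule and its exceptions together (a sketch mirroring the `use_alora` helper this PR removes from `mellea/backends/_utils.py`; the thrown-exception fallback is handled at call time):

```python
from mellea.stdlib.requirement import ALoRARequirement, LLMaJRequirement, Requirement


def should_reroute_to_alora(action, alora_available: bool, default_to_constraint_checking_alora: bool) -> bool:
    # Sketch of the stated rule, not the literal implementation.
    if not isinstance(action, Requirement):
        return False
    reroute = alora_available                      # general rule
    if not default_to_constraint_checking_alora:   # exception 1
        reroute = False
    if isinstance(action, LLMaJRequirement):       # exception 2
        reroute = False
    if isinstance(action, ALoRARequirement):       # overrides exception 1
        reroute = True
    return reroute
```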

## Decision Rationale

Expand All @@ -33,12 +33,13 @@ Suppose that the user creates a backend and then adds a generic constraint check

```python
from mellea import start_session
from mellea.backends.aloras.granite_aloras import add_granite_aloras
from mellea.stdlib.requirement import Requirement

m = start_session(
"huggingface.LocalHFBackend:ibm-granite/granite-3.2-8b-instruct")
add_granite_aloras(m) # This will load the Constraint checint aLoRA.

# By default, the AloraRequirement uses a GraniteCommonAdapter with "requirement_check".
m.backend.add_adapter(GraniteCommonAdapter("requirement_check"))

m.instruct(
"Corporate wants you to find the difference between these two strings:\n\naaa\naba")
45 changes: 45 additions & 0 deletions docs/examples/intrinsics/intrinsics.py
@@ -0,0 +1,45 @@
from mellea.backends.openai import OpenAIBackend, _ServerType
from mellea.backends.adapters.adapter import AdapterType, GraniteCommonAdapter
from mellea.stdlib.base import ChatContext, ModelOutputThunk
from mellea.stdlib.chat import Message
import mellea.stdlib.funcs as mfuncs
from mellea.stdlib.intrinsics.intrinsic import Intrinsic

# Create the Adapter. GraniteCommonAdapters default to the ALORA adapter type.
req_adapter = GraniteCommonAdapter("requirement_check")

# Create the backend. Assumes a locally running vLLM server.
backend = OpenAIBackend(
model_id="ibm-granite/granite-3.3-8b-instruct",
base_url="http://0.0.0.0:8000/v1",
api_key="EMPTY",
)

# If using a remote vLLM server, use the `test/backends/test_openai_vllm/serve.sh`
# script with `export VLLM_DOWNLOAD_RAG_INTRINSICS=True`. This will download the granite_common
# adapters on the server.
backend._server_type = _ServerType.REMOTE_VLLM

# Add the adapter to the backend.
backend.add_adapter(req_adapter)

ctx = ChatContext()
ctx = ctx.add(Message("user", "Hi, can you help me?"))
ctx = ctx.add(Message("assistant", "Hello; yes! What can I help with?"))

# Generate from an intrinsic with the same name as the adapter. By default, it will look for
# ALORA and then LORA adapters.
out, new_ctx = mfuncs.act(
Intrinsic(
"requirement_check",
intrinsic_kwargs={"requirement": "The assistant is helpful."},
),
ctx,
backend,
)

# Print the output. The requirement_check adapter has a specific output format:
print(out) # {"requirement_likelihood": 1.0}

# The AloraRequirement uses this adapter and automatically parses this output
# when validating.
25 changes: 0 additions & 25 deletions mellea/backends/_utils.py
@@ -4,7 +4,6 @@
from collections.abc import Callable
from typing import Any, Literal

from mellea.backends.aloras import Alora
from mellea.backends.formatter import Formatter
from mellea.backends.tools import parse_tools
from mellea.helpers.fancy_logger import FancyLogger
@@ -57,30 +56,6 @@ def to_chat(
return ctx_as_conversation


def use_alora(
action: Component | CBlock,
alora: Alora | None,
default_to_constraint_checking_alora: bool,
) -> bool:
"""Returns True when the condition for using alora is met.

See `docs/dev/requirement_aLoRA_rerouting.md` for an explanation of the following code block.
"""
if issubclass(type(action), Requirement):
# The general rule is that we reroute to the alora if it exists.
reroute_to_alora = alora is not None
# However, there are some exceptions:
if not default_to_constraint_checking_alora:
reroute_to_alora = False
if issubclass(type(action), LLMaJRequirement):
reroute_to_alora = False
if issubclass(type(action), ALoraRequirement):
reroute_to_alora = True
return reroute_to_alora
else:
return False


def to_tool_calls(
tools: dict[str, Callable], decoded_result: str
) -> dict[str, ModelToolCall] | None:
224 changes: 224 additions & 0 deletions mellea/backends/adapters/adapter.py
@@ -0,0 +1,224 @@
"""Module for adapters to backends."""

import abc
import pathlib
from enum import Enum
from typing import Any, TypeVar, cast

import granite_common

from mellea.backends import Backend
from mellea.backends.types import _ServerType


class AdapterType(Enum):
"""Possible types of adapters for a backend."""

LORA = "lora"
ALORA = "alora"


class Adapter(abc.ABC):
"""An adapter that can be added to a single backend."""

def __init__(self, name: str, adapter_type: AdapterType):
"""An adapter that can be added to a backend.

Note: An adapter can only be added to a single backend.

Args:
name: name of the adapter; when referencing this adapter, use adapter.qualified_name
            adapter_type: enum describing what type of adapter it is (e.g., LORA / ALORA)
"""
self.name = name
self.adapter_type = adapter_type
self.qualified_name = name + "_" + adapter_type.value
"""the name of the adapter to use when loading / looking it up"""

self.backend: Backend | None = None
"""set when the adapter is added to a backend"""

self.path: str | None = None
"""set when the adapter is added to a backend"""


class OpenAIAdapter(Adapter):
"""Adapter for OpenAIBackends."""

@abc.abstractmethod
def get_open_ai_path(
self,
base_model_name: str,
server_type: _ServerType = _ServerType.LOCALHOST,
remote_path: str | None = None,
) -> str:
"""Returns the path needed to load the adapter.

Args:
base_model_name: the base model; typically the last part of the huggingface model id like "granite-3.3-8b-instruct"
            server_type: the server type (e.g., LOCALHOST / OPENAI); usually the backend has information on this
remote_path: optional; used only if the server_type is REMOTE_VLLM; base path at which to find the adapter
"""
...


class LocalHFAdapter(Adapter):
"""Adapter for LocalHFBackends."""

@abc.abstractmethod
def get_local_hf_path(self, base_model_name: str) -> str:
"""Returns the path needed to load the adapter.

Args:
base_model_name: the base model; typically the last part of the huggingface model id like "granite-3.3-8b-instruct"
"""
...


class GraniteCommonAdapter(OpenAIAdapter, LocalHFAdapter):
"""Adapter for intrinsics that utilize the GraniteCommon library."""

def __init__(
self,
name: str,
adapter_type: AdapterType = AdapterType.ALORA,
config_file: str | pathlib.Path | None = None,
config_dict: dict | None = None,
base_model_name: str | None = None,
):
"""An adapter that can be added to either an `OpenAIBackend` or a `LocalHFBackend`. Most rag-lib-intrinsics support lora or alora adapter types.

Args:
name: name of the adapter; when referencing this adapter, use adapter.qualified_name
            adapter_type: enum describing what type of adapter it is (e.g., LORA / ALORA)
config_file: optional; file for defining the intrinsic / transformations
config_dict: optional; dict for defining the intrinsic / transformations
            base_model_name: optional; if provided with no config_file/config_dict, it will be used to look up the granite_common config for this adapter
"""
assert adapter_type == AdapterType.ALORA or adapter_type == AdapterType.LORA, (
f"{adapter_type} not supported"
)
super().__init__(name, adapter_type)

self.base_model_name = base_model_name

# If any of the optional params are specified, attempt to set up the
# config for the intrinsic here.
config: dict | None = None
if config_file is not None or config_dict is not None:
config = granite_common.intrinsics.util.make_config_dict(
config_file=config_file, config_dict=config_dict
)
config = cast(
dict, config
) # Can remove if util function gets exported properly.

if config is None and self.base_model_name is not None:
            is_alora = self.adapter_type == AdapterType.ALORA
io_yaml_file = granite_common.intrinsics.util.obtain_io_yaml(
self.name, self.base_model_name, alora=is_alora
)
config = granite_common.intrinsics.util.make_config_dict(
config_file=io_yaml_file
)
config = cast(
dict, config
) # Can remove if util function gets exported properly.

self.config: dict | None = config

def get_open_ai_path(
self,
base_model_name: str,
server_type: _ServerType = _ServerType.LOCALHOST,
remote_path: str | None = None,
) -> str:
"""Returns the path needed to load the adapter.

Args:
base_model_name: the base model; typically the last part of the huggingface model id like "granite-3.3-8b-instruct"
            server_type: the server type (e.g., LOCALHOST / OPENAI); usually the backend has information on this
remote_path: optional; used only if the server_type is REMOTE_VLLM; base path at which to find the adapter
"""
if server_type == _ServerType.LOCALHOST:
path = self.download_and_get_path(base_model_name)
elif server_type == _ServerType.REMOTE_VLLM:
if remote_path is None:
remote_path = "rag-intrinsics-lib"
path = self.get_path_on_remote(base_model_name, remote_path)
else:
raise ValueError(
f"{self} not supported for OpenAIBackend with server_type: {server_type}"
)

return path

def get_local_hf_path(self, base_model_name: str) -> str:
"""Returns the path needed to load the adapter.

Args:
base_model_name: the base model; typically the last part of the huggingface model id like "granite-3.3-8b-instruct"
"""
return self.download_and_get_path(base_model_name)

def download_and_get_path(self, base_model_name: str) -> str:
"""Downloads the required rag intrinsics files if necessary and returns the path to the them.

Args:
base_model_name: the base model; typically the last part of the huggingface model id like "granite-3.3-8b-instruct"

Returns:
a path to the files
"""
is_alora = self.adapter_type == AdapterType.ALORA
return str(
granite_common.intrinsics.util.obtain_lora(
self.name, base_model_name, alora=is_alora
)
)

def get_path_on_remote(self, base_model_name: str, base_path: str) -> str:
"""Assumes the files have already been downloaded on the remote server."""
return f"./{base_path}/{self.name}/{self.adapter_type.value}/{base_model_name}"


T = TypeVar("T")


def get_adapter_for_intrinsic(
intrinsic_name: str,
intrinsic_adapter_types: list[AdapterType],
available_adapters: dict[str, T],
) -> T | None:
"""Finds an adapter from a dict of available adapters based on the intrinsic name and its allowed adapter types.

Args:
intrinsic_name: the name of the intrinsic, like "answerability"
intrinsic_adapter_types: the adapter types allowed for this intrinsic, like ALORA / LORA
available_adapters: the available adapters to choose from; maps adapter.qualified_name to the Adapter

Returns:
an Adapter if found; else None
"""
adapter = None
for adapter_type in intrinsic_adapter_types:
qualified_name = intrinsic_name + "_" + adapter_type.value
adapter = available_adapters.get(qualified_name, None)
if adapter is not None:
break

return adapter


class AdapterMixin(abc.ABC):
"""Mixin class for backends capable of utilizing adapters."""

    @abc.abstractmethod
    def add_adapter(self, *args, **kwargs):
        """Adds the given adapter to the backend. The adapter must not have been added to a different backend."""

    @abc.abstractmethod
    def load_adapter(self, adapter_qualified_name: str):
        """Loads the given adapter for the backend. The adapter must have previously been added."""

    @abc.abstractmethod
    def unload_adapter(self, adapter_qualified_name: str):
        """Unloads the given adapter from the backend."""