BrainAPI uses a modular adapter architecture that allows you to plug in different drivers for each component:
| Adapter | File | Purpose |
|---|---|---|
| DataORMAdapter | /adapters/data.py | Textual information storage |
| EmbeddingsAdapter | /adapters/vectors.py | Vector embeddings storage |
| GraphDBAdapter | /adapters/graph.py | Graph database operations |
| CacheAdapter | /adapters/cache.py | High-speed caching |
| LLMProviderAdapter | /adapters/llm.py | Large Language Model integration |
| PromptsAdapter | /adapters/prompts.py | Prompt injection and management |
The adapter pattern enables:
- Easy Driver Swapping: Switch between different databases, vector stores, or LLM providers
- Testing: Mock adapters for unit testing
- Flexibility: Mix and match different technologies based on your needs
- Maintainability: Clean separation of concerns
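As a minimal sketch of why this helps (the class and function names here are illustrative, not BrainAPI's actual interfaces), calling code that depends only on an abstract adapter can receive a real driver in production and an in-memory mock in tests without any changes:

```python
# Illustrative sketch of the adapter pattern -- names are hypothetical,
# not BrainAPI's actual interfaces.
from abc import ABC, abstractmethod


class AbstractCacheAdapter(ABC):
    @abstractmethod
    def get(self, key: str): ...

    @abstractmethod
    def set(self, key: str, value) -> None: ...


class InMemoryCacheAdapter(AbstractCacheAdapter):
    """Mock driver for unit testing -- no Redis required."""

    def __init__(self):
        self._store = {}

    def get(self, key: str):
        return self._store.get(key)

    def set(self, key: str, value) -> None:
        self._store[key] = value


def warm_up(cache: AbstractCacheAdapter) -> None:
    # Calling code depends only on the abstract interface,
    # so any conforming driver can be plugged in.
    cache.set("greeting", "hello")
```

In production the same `warm_up` call would receive a Redis-backed adapter instead of the in-memory mock.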
The data is processed and added to the brain with the following steps:
- Chunking - Break down input text into manageable pieces
- Ensure Memory Existence - Verify memory storage is available
- Save Chunks (can be concurrent) - Store text chunks in database
- Extract Facts & Observations (can be concurrent) - Identify key information
- Embed Facts (can be concurrent) - Generate vector embeddings for facts
- Extract Language - Detect the language of the content
- Save Vector - Store vector embeddings in vector database
- Get Relationship Extractor - Select appropriate extractor based on language
- Content Type Extraction - Determine the type of content being processed
- Retrieve Relevant Memories - Find related information from existing memories
- Resolve Coreferences - Link pronouns and references to their entities
- Extract Relationships with LLM - Use language model to identify entity relationships
- Wikification - Link entities to Wikipedia or knowledge base entries
- Save Triplets - Store subject-predicate-object relationships in graph database
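The "can be concurrent" steps above fan out under semaphore limits. A simplified sketch of that flow (all function names here are placeholders, not BrainAPI's actual pipeline):

```python
# Simplified sketch of the ingestion flow -- all names are placeholders.
import asyncio


def chunk(text: str, size: int = 20) -> list[str]:
    """Chunking: break input text into manageable pieces."""
    return [text[i:i + size] for i in range(0, len(text), size)]


async def embed(chunk_text: str, sem: asyncio.Semaphore) -> list[float]:
    """Stand-in for the 'Embed Facts' step; the semaphore caps concurrency."""
    async with sem:
        await asyncio.sleep(0)  # placeholder for a real embedding call
        return [float(len(chunk_text))]


async def ingest(text: str) -> list[list[float]]:
    sem = asyncio.Semaphore(4)  # bounded, like the limits in brainapi/config.py
    chunks = chunk(text)
    # the concurrent steps run together under the semaphore
    return await asyncio.gather(*(embed(c, sem) for c in chunks))


vectors = asyncio.run(ingest("some input text to ingest into the brain"))
```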
BrainAPI uses Celery with Redis to manage the tasks and queues during injection and writing. Make sure you have a Redis instance running and accessible before starting the project with the `make dev-custom` command.
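A minimal sketch of such a Celery-with-Redis setup (the app name and URLs are assumptions for illustration, not BrainAPI's actual wiring):

```python
# Hypothetical configuration fragment -- adapt the URLs to your Redis instance.
from celery import Celery

celery_app = Celery(
    "brainapi",
    broker="redis://localhost:6379/0",   # task queue lives in Redis
    backend="redis://localhost:6379/1",  # task results stored in Redis too
)
```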
Concurrent operations are throttled with semaphores. You can configure the limits in the `brainapi/config.py` file; the default settings are:

```python
class Config:
    class ConcurrencyConfig:
        def __init__(self):
            self.fastcoref_semaphore = 100
            self.azure_llm_large_semaphore = 1000
            self.triplet_extraction_llm_semaphore = 1000
            self.embedding_llm_semaphore = 1000
            self.coref_model_semaphore = 100
            self.ce_tokenizer_semaphore = 30
```

Below you'll find everything you need to have your BrainAPI instance running and working with your own databases, locally or hosted in the cloud.
Before starting the project, make sure you've installed the Python packages (managed with Poetry: `poetry install`) and the spaCy dependencies with the `make download-spacy` command; these are used for named entity recognition and coreference resolution.
Before starting the project you'll also need to add your own implementations of the adapters; the following instructions will help you do that.
This adapter is responsible for the storage of textual information. Inside the `brainapi/server/adapters/data.py` file, you'll need to implement an instance of a class that inherits from the `AbstractDataORMAdapter` class and implements the methods defined in the interface.
You'll need to return the driver in place of the commented line and delete the `raise NotImplementedError(...)` line:
```python
@classmethod
def get_async_driver(cls):
    """Get a fresh async driver instance"""
    # return your implementation of the async driver instance here
    raise NotImplementedError("You need to implement the get_async_driver method")
```

This adapter is responsible for the creation, storage, and retrieval of vector embeddings. BrainAPI uses two adapters to handle two different vector dimensions; it's your choice to implement a single vector store and embeddings generator or two separate ones.
You'll need to implement two instances of two different classes: one for the embeddings generator (inheriting from `EmbeddingEncoderProvider`) and one for the vector store (inheriting from `EmbeddingDBProvider`).
```python
# Example
class EmbeddingEncoderDriver(EmbeddingEncoderProvider):
    def encode(self, text: str) -> list[float]:
        # return your implementation of the encode method here
        raise NotImplementedError("You need to implement the encode method")

embeddings_encoder_driver = EmbeddingEncoderDriver()

class EmbeddingDBDriver(EmbeddingDBProvider):
    def upsert(self, vectors: list[Vector], namespace: str) -> None:
        # return your implementation of the upsert method here
        raise NotImplementedError("You need to implement the upsert method")
    ...

embeddings_db_driver = EmbeddingDBDriver()
```

And implement them in the `brainapi/server/adapters/vectors.py` file like this:
```python
embeddings_adapter = EmbeddingsAdapter(
    embeddings_encoder_driver,
    embeddings_db_driver
)

embeddings_adapter_nodes = EmbeddingsAdapter(
    embeddings_encoder_driver,  # or another node-specific instance of the EmbeddingEncoderDriver class
    embeddings_db_driver  # or another node-specific instance of the EmbeddingDBDriver class
)
```

This adapter is responsible for graph management; write, edit, and retrieve operations on the knowledge graph are done through this class. You'll need to choose a graph database and create a class that implements the operations of the abstract `GraphDB` class inside `brainapi/server/adapters/graph.py`.
```python
def get_graph_adapter():
    # from brainapi.server.lib.your_graph_db_driver_class import graphdb_client
    # return GraphAdapter(graphdb_client)
    raise NotImplementedError("You need to implement a graphdb client/driver")
```

This adapter is responsible for managing the cache operations. You'll need to implement a class that inherits from the `CacheDriver` class inside `brainapi/server/adapters/cache.py` and implements the methods defined in the interface.
```python
def _get_client(self):
    """Get cache client for current event loop"""
    # return your cache client instance here
    raise NotImplementedError("You need to implement the _get_client method")
```

This is the simplest adapter to implement: you just need to create a class that inherits from the `LLM` class inside `brainapi/server/interfaces.py` and implements the methods defined in the interface.
```python
class LLMAdapter:
    def __init__(self, llm: LLM):
        self.llm = llm  # your LLM instance here

    async def generate_text(self, prompt: str, max_new_tokens: int = None) -> str:
        return await self.llm.generate_text(prompt, max_new_tokens)

llm_adapter = LLMAdapter(_llm_large)
```

This adapter is responsible for managing the prompts used to interact with the LLM. You'll need to implement a class that inherits from the `PromptsAdapter` class inside `brainapi/server/adapters/prompts.py` and implements the methods defined in the interface.
Create a class that is responsible for retrieving and parsing the prompts, and make sure that the LLM responses return the correct type of data based on the registered result types.
```python
prompts_adapter = PromptsAdapter(your_prompts_provider_class)

prompts_adapter.register_type(
    "relationship_extractor",  # the prompt with this key
    RelationshipExtractedResult  # will return this data type
)
```

You can use the `brainapi/config.py` file to set the configuration settings for the project. The file is already populated with the default settings for the development environment; change them to fit your needs.
The convention in the project is to instantiate sub-classes inside the `Config` class, one for each group of configuration settings. You can add your own settings by creating a new sub-class and assigning the settings in its `__init__` method, raising an error if any required setting is not set.
```python
class Config:
    def __init__(self):
        self.redis = self.RedisConfig()

    class RedisConfig:
        def __init__(self):
            self.host = os.getenv("REDIS_HOST")
            self.port = os.getenv("REDIS_PORT")
            self.db = os.getenv("REDIS_DB")
            if self.host is None or self.port is None or self.db is None:
                raise ValueError(
                    "[Config:RedisConfig] REDIS_HOST, REDIS_PORT, and REDIS_DB must be set"
                )

    ...
```
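Following the same convention, a hypothetical sub-config for a new setting might look like this (the class and environment variable names are illustrative, not part of BrainAPI):

```python
# Illustrative sketch of adding a custom sub-config -- names are hypothetical.
import os


class Config:
    def __init__(self):
        self.my_service = self.MyServiceConfig()

    class MyServiceConfig:
        """Follows the same pattern as RedisConfig above."""

        def __init__(self):
            self.api_key = os.getenv("MY_SERVICE_API_KEY")
            if self.api_key is None:
                raise ValueError(
                    "[Config:MyServiceConfig] MY_SERVICE_API_KEY must be set"
                )
```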