RMR REST API

This project is a simple REST API for RMR (Recursive Memory Retrieval), built with Flask. It provides basic endpoints that demonstrate how RMR can be exposed as a service. The API lets users create a database from raw text files and run recursive queries against it.

Setup Instructions

  1. Clone the repository

    git clone https://github.com/phatware/RMR.git
    cd RMR/rmr-api
  2. Create a virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  3. Install the dependencies

    pip install -r requirements.txt
    brew install poppler
    brew install tesseract

The unstructured Python package may require additional dependencies. Refer to the unstructured installation guide for details.

  4. Install Redis server
  5. Create .env file

    Copy .env.example to .env and adjust the settings as needed, especially the DATABASE_FOLDER path.

    Example .env content:

    OPENAI_API_KEY="<your-openai-api-key-here>"
    DATABASE_FOLDER="databases"
    API_TOKEN="<your-auth-key-here>"

    Ensure the DATABASE_FOLDER exists or create it:

    mkdir databases
  6. Run the application

    python run.py

Usage

Once the application is running, the API endpoints defined in app/routes.py are available. Use the included frontend or a tool such as Postman to interact with them.

REST API Endpoints

1. Health Check (/health)

  • Method: GET
  • Purpose: Verifies the operational status of the RMR API service.
  • Input: None
  • Output: JSON with service status, name ("Recursive Memory Retrieval"), and version ("1.0.2").
    • Example: {"status": "healthy", "name": "Recursive Memory Retrieval", "version": "1.0.2"}
  • Status Code: 200 (OK)
  • RMR Relevance: Ensures the API is running, critical for maintaining availability in cognitive agents or knowledge-intensive applications.
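
As a rough illustration of a client-side check, the sketch below builds the GET request with the Python standard library and tests the decoded body for the "healthy" status shown in the example above. The base URL http://127.0.0.1:5000 is an assumption (Flask's default development address; the real host and port depend on how run.py is configured), and health_request, is_healthy, and check_health are hypothetical helper names that are not part of this project. No authentication header is set, because how API_TOKEN is presented to the server is not described here.

```python
import json
from urllib.request import Request, urlopen

# Assumption: Flask's default development address; adjust to your deployment.
BASE_URL = "http://127.0.0.1:5000"


def health_request(base_url=BASE_URL):
    """Build (but do not send) the GET /health request."""
    return Request(f"{base_url}/health", method="GET")


def is_healthy(payload):
    """Return True when the decoded /health body reports a healthy service."""
    return payload.get("status") == "healthy"


def check_health(base_url=BASE_URL, timeout=5):
    """Send the request and interpret the JSON body (needs a running server)."""
    with urlopen(health_request(base_url), timeout=timeout) as resp:
        return is_healthy(json.loads(resp.read().decode("utf-8")))
```

Keeping request construction and response interpretation in separate functions makes both easy to exercise without a live server.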

2. List Databases (/items)

  • Method: GET
  • Purpose: Retrieves a list of available database names (without .db extension) stored in the configured database folder.
  • Input: None
  • Output: JSON with a list of database names.
    • Example: {"items": ["db1", "db2"]}
  • Status Code: 200 (OK)
  • RMR Relevance: Supports RMR’s model-agnostic storage by listing available memory event databases, enabling users to select contexts for querying or management.
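
Because /add rejects a name that is already taken with 409, a client may want to inspect the /items listing first. The following is a minimal sketch of that check on an already-decoded response body. The helper names (database_names, strip_db_extension, database_exists) are hypothetical and exist only for illustration.

```python
def database_names(payload):
    """Return the list of names from a decoded /items response body."""
    items = payload.get("items", [])
    if not isinstance(items, list):
        raise ValueError("expected 'items' to be a list")
    return items


def strip_db_extension(filename):
    """Mirror the documented naming: 'db1.db' is listed as 'db1'."""
    return filename[:-3] if filename.endswith(".db") else filename


def database_exists(payload, name):
    """True if `name` (with or without '.db') appears in the listing."""
    return strip_db_extension(name) in database_names(payload)
```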

3. Upload File (/add)

  • Method: POST
  • Purpose: Uploads a file (CSV, Markdown, or text) to create a new database of memory events, processed asynchronously.
  • Input: Multipart/form-data with:
    • file: The file to process (CSV, .md, .txt, etc.).
    • name: Database name.
    • threshold (optional, default: 0.8): Similarity threshold for memory event storage.
  • Output: JSON with a task ID for tracking processing status.
    • Example: {"message": "Upload received. Processing started.", "task_id": "abc123"}
  • Status Code:
    • 202 (Accepted) for successful upload initiation.
    • 400 (Bad Request) if file or name is missing.
    • 409 (Conflict) if the database already exists.
  • RMR Relevance: Facilitates memory event ingestion, a core RMR feature. The asynchronous processing of files into memory events (via extract_memory_events_from_csv or extract_memory_events_from_markdown) supports scalable retrieval and clustering, with the threshold enabling stigmatization-like prioritization.
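
The multipart/form-data input described above can be sketched with the standard library alone. The field names file, name, and threshold come from the list above; everything else is an assumption made for illustration: the base URL (Flask's default development address), the application/octet-stream part type, and the helper names encode_multipart and upload_request. No authentication header is set, because how API_TOKEN is presented to the server is not described here.

```python
import uuid
from urllib.request import Request

# Assumption: Flask's default development address; adjust to your deployment.
BASE_URL = "http://127.0.0.1:5000"


def encode_multipart(fields, filename, file_bytes, boundary=None):
    """Encode text fields plus one 'file' part as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for key, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{key}"'.encode(),
            b"",
            str(value).encode("utf-8"),
        ]
    lines += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        f"--{boundary}--".encode(),  # closing delimiter
        b"",
    ]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


def upload_request(name, filename, file_bytes, threshold=0.8, base_url=BASE_URL):
    """Build (but do not send) the POST /add request."""
    body, content_type = encode_multipart(
        {"name": name, "threshold": threshold}, filename, file_bytes
    )
    return Request(
        f"{base_url}/add",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen would send it. A 202 response should then carry the task_id used by the status endpoint.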

4. Check Upload Status (/add/status/<task_id>)

  • Method: GET
  • Purpose: Retrieves the processing status and progress of a file upload task.
  • Input: URL parameter task_id (UUID hex string).
  • Output: JSON with task ID, status ("processing", "completed", "error", or "not found"), and progress percentage (0–100).
    • Example: {"task_id": "abc123", "status": "processing", "progress": 50}
  • Status Code: 200 (OK)
  • RMR Relevance: Provides visibility into the asynchronous memory event creation process, ensuring users can monitor the construction of context graphs.
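
Since the endpoint answers 200 even while a task is still running, a client has to poll and inspect the status field. The loop below is one possible sketch. wait_for_task is a hypothetical helper, and fetch_status is a caller-supplied function (an assumption of this example) that performs the GET and returns the decoded JSON body, which keeps the polling logic independent of any HTTP library.

```python
import time


def wait_for_task(fetch_status, task_id, interval=1.0, max_attempts=60,
                  sleep=time.sleep):
    """Poll until the task leaves the "processing" state or attempts run out.

    `fetch_status(task_id)` must return the decoded JSON body of
    GET /add/status/<task_id>.
    """
    for attempt in range(max_attempts):
        payload = fetch_status(task_id)
        if payload.get("status") != "processing":
            return payload  # "completed", "error", or "not found"
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(
        f"task {task_id} still processing after {max_attempts} polls"
    )
```

The interval and max_attempts defaults are arbitrary. Large uploads may need a longer budget.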

5. Delete Database (/items/<name>)

  • Method: DELETE
  • Purpose: Deletes a specified database file.
  • Input: URL parameter name (database name without .db extension).
  • Output: JSON with a success or error message.
    • Example: {"message": "Database db1 deleted"} or {"error": "Database not found"}
  • Status Code:
    • 200 (OK) for successful deletion.
    • 404 (Not Found) if the database does not exist.
  • RMR Relevance: Supports memory management by allowing removal of obsolete or incorrect memory event databases, aligning with RMR’s continual learning and knowledge refinement.
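
A deletion call might be prepared as in the sketch below. The helper names delete_request and describe_delete are hypothetical, the base URL is an assumed Flask development default, and percent-encoding the name is a defensive choice of this example rather than a documented requirement. Authentication is omitted because it is not described here.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: Flask's default development address; adjust to your deployment.
BASE_URL = "http://127.0.0.1:5000"


def delete_request(name, base_url=BASE_URL):
    """Build (but do not send) DELETE /items/<name>."""
    if name.endswith(".db"):
        name = name[:-3]  # the endpoint expects the name without the extension
    return Request(f"{base_url}/items/{quote(name, safe='')}", method="DELETE")


def describe_delete(status_code, payload):
    """Turn the documented 200/404 outcomes into a (deleted, text) pair."""
    if status_code == 200:
        return True, payload.get("message", "")
    if status_code == 404:
        return False, payload.get("error", "Database not found")
    return False, f"unexpected status {status_code}"
```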

6. Query Database (/query)

  • Method: POST
  • Purpose: Submits a query to a specified database, retrieving answers via RMR’s recursive querying mechanism, processed asynchronously.
  • Input: JSON with:
    • query: The query string.
    • name: Database name.
    • Optional parameters (with defaults):
      • top_k_memory (8): Number of memory events to retrieve.
      • top_k_clusters (5): Number of clusters to consider.
      • max_nodes (10): Maximum nodes in the context graph.
      • cluster_diversity (true): Enable diversity in clustering.
      • graph_depth (5): Depth of recursive querying.
      • neighbors_per_node (5): Neighbors per node in the graph.
      • summary_top_k (7): Top-k for summarization.
      • summary_threshold (0.85): Summarization threshold.
      • dedup_threshold (0.8): Deduplication threshold.
      • llm_model ("gpt-5.2"): LLM model for response generation.
      • add_memory (false): Add query to memory.
      • use_llm (true): Use LLM for summarization.
      • use_rmr_agent (true): Use the RMR Agent path when LLM synthesis is enabled.
  • Output: JSON with a task ID for tracking query status.
    • Example: {"message": "Query received. Processing started.", "task_id": "xyz789"}
  • Status Code:
    • 202 (Accepted) for successful query initiation.
    • 400 (Bad Request) if query or name is missing.
    • 404 (Not Found) if the database does not exist.
  • RMR Relevance: Core endpoint for RMR’s recursive querying, leveraging clustering, context graph construction, and dual summarization (heuristic and LLM-based). Parameters allow fine-tuning of retrieval and reasoning, supporting multi-hop reasoning and scalability.
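
To make the request shape concrete, here is a small sketch that assembles the JSON body. The default values are copied from the parameter list above. build_query_body and QUERY_DEFAULTS are hypothetical names, and the choice to send only the overridden parameters relies on the server applying the documented defaults, which is an assumption of this example.

```python
import json

# Defaults copied from the documented parameter list; used here only to
# catch misspelled parameter names before the request is sent.
QUERY_DEFAULTS = {
    "top_k_memory": 8,
    "top_k_clusters": 5,
    "max_nodes": 10,
    "cluster_diversity": True,
    "graph_depth": 5,
    "neighbors_per_node": 5,
    "summary_top_k": 7,
    "summary_threshold": 0.85,
    "dedup_threshold": 0.8,
    "llm_model": "gpt-5.2",
    "add_memory": False,
    "use_llm": True,
    "use_rmr_agent": True,
}


def build_query_body(query, name, **overrides):
    """Return the UTF-8 JSON body for POST /query."""
    if not query or not name:
        raise ValueError("both 'query' and 'name' are required")
    unknown = set(overrides) - set(QUERY_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # Only overrides are sent; omitted parameters fall back to server defaults.
    payload = {"query": query, "name": name, **overrides}
    return json.dumps(payload).encode("utf-8")
```

For example, build_query_body("What is RMR?", "db1", graph_depth=3) yields a body that changes only the recursion depth and leaves every other parameter at its default.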

7. Check Query Status (/query/status/<task_id>)

  • Method: GET
  • Purpose: Retrieves the status and result of a query task.
  • Input: URL parameter task_id (UUID hex string).
  • Output: JSON with task ID, status ("processing", "completed", or "error"), and result (if completed) or error message.
    • Example: {"task_id": "xyz789", "status": "completed", "result": {"answer": "...", "database": "db1"}}
  • Status Code:
    • 200 (OK) for valid tasks.
    • 404 (Not Found) if the task ID is invalid.
  • RMR Relevance: Enables asynchronous result retrieval, critical for handling complex queries in knowledge-intensive domains like legal or biomedical research.
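
A client could fold the HTTP status and the JSON status field into a single outcome, as in this sketch. interpret_query_status is a hypothetical helper. On error it returns the whole payload, because the name of the field carrying the error message is not specified here.

```python
def interpret_query_status(http_status, payload):
    """Map a /query/status/<task_id> response to a (state, value) pair."""
    if http_status == 404:
        return "unknown_task", None  # invalid task ID
    status = payload.get("status")
    if status == "completed":
        # The example response nests the answer under "result".
        return "completed", payload.get("result", {}).get("answer")
    if status == "error":
        return "error", payload  # error field name is undocumented
    return "processing", None
```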

8. Submit Feedback (/feedback)

  • Method: POST
  • Purpose: Accepts user feedback for a query task (placeholder implementation).
  • Input: JSON with:
    • task_id: Task ID of the query.
    • feedback: Feedback string.
  • Output: JSON confirming feedback receipt.
    • Example: {"message": "Feedback received"}
  • Status Code:
    • 200 (OK) for valid input.
    • 400 (Bad Request) if task_id or feedback is missing.
  • RMR Relevance: Placeholder for supporting continual learning by collecting user feedback, which could adjust memory event priorities (stigmatization) or refine retrieval in future iterations.
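
For completeness, a feedback submission could be built as sketched here. feedback_request is a hypothetical helper, the base URL is an assumed Flask development default, and the client-side check simply mirrors the documented 400 condition. Authentication is omitted because it is not described here.

```python
import json
from urllib.request import Request

# Assumption: Flask's default development address; adjust to your deployment.
BASE_URL = "http://127.0.0.1:5000"


def feedback_request(task_id, feedback, base_url=BASE_URL):
    """Build (but do not send) the POST /feedback request."""
    if not task_id or not feedback:
        # The server would answer 400 for this input; fail early instead.
        raise ValueError("both 'task_id' and 'feedback' are required")
    body = json.dumps({"task_id": task_id, "feedback": feedback}).encode("utf-8")
    return Request(
        f"{base_url}/feedback",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```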