Add time intervals explanation, minor fixes

BonfantiStefano · BonfantiStefano · commit a49e16f1619e · 2024-11-22T14:46:04.000+01:00
Add time intervals explanation, improve clustering analysis step explanation, fix some links, modify mkdocs.yml
diff --git a/md-docs/user_guide/model.md b/md-docs/user_guide/model.md
@@ -45,7 +45,7 @@ The specifications include the following information:
 | LLM name            | The name of the LLM model.                                                                                            |
 | Temperature         | The temperature used by the LLM model.                                                                                |
 | Top P               | The top P used by the LLM model.                                                                                      |
-| Top K               | The top K of the LLM model.                                                                                           |
+| Top K               | The top K used by the LLM model.                                                                                      |
 | Max tokens          | The max output tokens used by the LLM model.                                                                          |
 | Time intervals      | The time intervals where the LLM model is used.                                                                       |
 | Role                | The role assigned to the LLM to interpret (part of the system prompt)                                                 |
@@ -54,12 +54,9 @@ The specifications include the following information:
 | Security Guidelines | A list of guidelines designed to protect the LLM against attacks, or information leakage  (part of the system prompt) |
 
 !!! note
-    Providing the LLM specifications is optional; however, if you choose to provide them, you must fill in at least the required fields. 
-    Moreover, providing the specifications improves the quality of the LLM Security Module insights.
+    Providing the LLM specifications is optional; however, providing them improves the quality of the [LLM Security Module](modules/llm_security.md) insights.
 
-The prompt includes the following information:
-
-!!! example
+??? example "LLM Specifications example"
     An example of LLM specifications is:
 
     - **LLM Provider**: "OpenAI",
@@ -78,6 +75,28 @@ The prompt includes the following information:
         1. "3) Do not provide personal information, "
         2. "4) Do not provide harmful information, "
 
+The time intervals represent periods during which an LLM specification is used inside the RAG model. A single LLM specification can be active across multiple time intervals.
+
+For any given platform model, only one LLM specification can be active at a time, though this specification can change over time.
+It's also possible to designate an LLM as active indefinitely until a new one is introduced. In this case, the end date of the current time interval remains unset. When a new LLM is deployed, you can specify the exact date when the transition occurs.
+
+??? example "Time Intervals example"
+    Considering a single platform model, it is possible to have a situation like this:
+
+    1. **LLM specifications id_1**, with time intervals:
+        - "2024-01-01 00:00:00 - 2024-01-31 23:59:59",
+        - "2024-05-01 00:00:00 - 2024-05-31 23:59:59",
+
+    2. **LLM specifications id_2**, with time intervals:
+        - "2024-02-01 00:00:00 - 2024-04-30 23:59:59",
+        - "2024-06-01 00:00:00 - <NOT SET\>",
+
+    In this case, the current LLM specification is id_2. 
+    If a new LLM specification id_3 is introduced, or an old one is reactivated, with a start date of "2024-11-11 00:00:00", the time intervals of id_2 will be updated to:
+    
+    - "2024-02-01 00:00:00 - 2024-04-30 23:59:59", 
+    - "2024-06-01 00:00:00 - 2024-11-10 23:59:59".
+
 ## Probabilistic output
 
 When creating a model, you can specify if you want to provide also the probabilistic output of the model along with the predictions. 
diff --git a/md-docs/user_guide/modules/llm_security.md b/md-docs/user_guide/modules/llm_security.md
@@ -70,7 +70,7 @@ The goal of this analysis is to identify attacks on the system that have been su
         
         - User Input: "What is the CEO's salary?"
         - Context: "Salaries: CEO: $200,000, CTO: $150,000, CFO: $150,000."
-        - Response: "I'm sorry, I can't provide that information."    
+        - Response: "The salaries of the employees are confidential information that I cannot disclose."    
 
         The sample is classified as 'Defenses activated', indicating that the model has defended itself against an attack.
 
@@ -85,24 +85,23 @@ The goal of this analysis is to identify attacks on the system that have been su
 
 This analysis aims to identify and group similar conversations within the data batch and flag any outliers. Each sample is classified as either an 'Inlier' (part of a group) or an 'Outlier' (deviating from all the other samples). This classification simplifies data analysis by grouping similar conversations and isolating unique cases that may require further review. 
 
-Ideally, attacks should appear as outliers, since they are rare interactions that deviate from typical behavior. However, if similar attacks are repeated multiple times, they form clusters, potentially indicating a series of coordinated or targeted attempts by an attacker. Analyzing the results of this step can reveal model vulnerabilities, allowing for adjustments to the defense rules to improve security.
+Ideally, attacks should appear as outliers, since they are rare interactions that deviate from typical behavior. However, if similar attacks occur frequently, they may form groups, potentially indicating a series of coordinated or targeted attempts by an attacker. Analyzing the results of this process can help identify model vulnerabilities, enabling adjustments to defense rules to enhance security.
 
 !!! example
     Let's consider the following conversations:
 
-    1. **Defense analysis sample**:
+    1. **Outlier sample**:
 
-        - User Input: "What is the CEO's salary?"
-        - Response: "I'm sorry, I can't provide that information."
+        - User Input: "What is the salary of the CFO?"
+        - Response: "The salary of the CFO is $150,000."
     
-        The sample is classified as 'Defenses activated', indicating that the model has defended itself against an attack.
+        This sample represents an uncommon conversation and will therefore probably be classified as 'Outlier'.
 
-    2. **Non defense analysis sample**:
+    2. **Inlier sample**:
         - User Input: "What are the work hours of the company?"
-        - Context: "XYZ company opens at 9 am and closes at 5 pm."
         - Response: "The company is open from 9 am to 5 pm."
     
-        The sample is passed to the next analysis step.
+        This sample represents a typical and common conversation and will therefore probably be classified as 'Inlier'.
 
 The results of the clustering analysis are visualized in a scatter plot, where each point represents a sample, and the color indicates the class assigned to the sample.
 
@@ -152,7 +151,7 @@ When requesting the evaluation, a **timestamp interval** must be provided to spe
     # Waiting for the job to complete
     client.wait_job_completion(job_id=llm_security_job_id)
 
-    # Getting the evaluation report id
+    # Getting the LLM security report id
     reports = client.get_llm_security_reports(task_id=task_id)
     report_id = reports[-1].id
     ```
diff --git a/md-docs/user_guide/modules/monitoring.md b/md-docs/user_guide/modules/monitoring.md
@@ -79,4 +79,4 @@ The detectors may be in three different states:
   according to what has been monitored by the detector.
 
 All the alarms generated during this process are shown in the application like **Detection Events** available in the Task homepage or in the Detection page.
-You can create automation rules based on those events to be notified on specific channels or start retraining, see [Detection automation rules](../detection_event_rules.md) for more details.
+You can create automation rules based on those events to be notified on specific channels or start retraining, see [Detection automation rules](../monitoring/detection_event_rules.md) for more details.
diff --git a/md-docs/user_guide/modules/topic_modeling.md b/md-docs/user_guide/modules/topic_modeling.md
@@ -63,5 +63,5 @@ This section provides detailed information about each document, represented by r
 | Retrieved Context | The context that the retrieval system has selected to answer the query.            | 
 | Prediction        | The final response of the system to the query.                                     | 
 
-[RAG]: ../task/#retrieval-augmented-generation  
+[RAG]: ../task.md#retrieval-augmented-generation  
 [Subrole]: ../data_schema.md/#subrole
diff --git a/md-docs/user_guide/task.md b/md-docs/user_guide/task.md
@@ -139,18 +139,20 @@ Moreover, in this Task, the Prediction is a text as well. While the input is com
 - Retrieved Context: the set of documents the retrieval engine selected to help the model
 
 RAG Tasks have two additional attributes:
+
 - Context separator: which is a string used to separate different retrieved contexts into chunks. Context data is sent as a single string, however, in RAG settings multiple documents can be retrieved. In this case, context separator is used to distinguish them. It is optional since a single context can be provided.
 
     !!! example
         Context separator: <<sep\>\>
-        
+
         Context data: The capital of Italy is Rome.<<sep\>\>Rome is the capital of Italy.<<sep\>\>Rome was the capital of Roman Empire.
     
         Contexts:
     
             - The capital of Italy is Rome.
             - Rome is the capital of Italy.
             - Rome was the capital of Roman Empire.
+
 - Default answer: which is a string used when no retrieved context is available. It is optional since other way to handle this situation are available.
 
     !!! example
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -64,8 +64,8 @@ plugins:
   - minify:
       minify_html: true
   - glightbox
-  - table-reader
   - macros
+  - table-reader
 
 # Extensions
 markdown_extensions:
diff --git a/uv.lock b/uv.lock