
Commit 70c98b6

LocalLab Server Package v0.1.1 Released and Updated Docs

1 parent f000896 · commit 70c98b6

File tree: 6 files changed (+41, -30 lines)


CHANGELOG.md

Lines changed: 9 additions & 1 deletion
```diff
@@ -1,6 +1,14 @@
 # Changelog
 
-All notable changes for the version update from 0.0 to 0.1.0.
+All notable changes for version updates.
+
+## [0.1.1] - 2024-02-25
+
+### Fixed
+
+- Fixed RuntimeError related to SemLock sharing in multiprocessing by clearing logger handlers in run_server_proc.
+- Updated Mermaid diagrams in README.md and docs/colab/README.md to wrap node labels in double quotes, improving compatibility with GitHub rendering.
+- Improved build status badge aesthetics in the README.
 
 ## [0.1.0] - 2024-02-23
 
```

README.md

Lines changed: 22 additions & 22 deletions
````diff
@@ -51,39 +51,39 @@ Below is an illustration of LocalLab's architecture:
 
 ```mermaid
 graph TD;
-    A[User] --> B[LocalLab Client (Python and Node.js)];
-    B --> C[LocalLab Server];
-    C --> D[Model Manager];
-    D --> E[Hugging Face Models];
+    A[User] --> B["LocalLab Client (Python and Node.js)"];
+    B --> C["LocalLab Server"];
+    C --> D["Model Manager"];
+    D --> E["Hugging Face Models"];
     C --> F[Optimizations];
-    C --> G[Resource Monitoring];
+    C --> G["Resource Monitoring"];
 ```
 
 ### Model Loading & Optimization Flow
 
 ```mermaid
 graph TD;
-    A[Load Model Request] --> B{Check Resources};
-    B -->|Sufficient| C[Load Model];
-    B -->|Insufficient| D[Apply Optimizations];
+    A["Load Model Request"] --> B[{"Check Resources"}];
+    B -->|Sufficient| C["Load Model"];
+    B -->|Insufficient| D["Apply Optimizations"];
     D --> E[Quantization];
-    D --> F[Attention Slicing];
-    D --> G[CPU Offloading];
-    E & F & G --> H[Load Optimized Model];
-    C & H --> I[Ready for Inference];
+    D --> F["Attention Slicing"];
+    D --> G["CPU Offloading"];
+    E & F & G --> H["Load Optimized Model"];
+    C & H --> I["Ready for Inference"];
 ```
 
 ### Resource Management Flow
 
 ```mermaid
 graph TD;
-    A[Client Request] --> B[Resource Monitor];
-    B --> C{Check Resources};
-    C -->|OK| D[Process Request];
-    C -->|Low Memory| E[Optimize/Unload];
-    C -->|GPU Full| F[CPU Fallback];
-    E & F --> G[Continue Processing];
-    D & G --> H[Return Response];
+    A["Client Request"] --> B["Resource Monitor"];
+    B --> C[{"Check Resources"}];
+    C -->|OK| D["Process Request"];
+    C -->|Low Memory| E["Optimize/Unload"];
+    C -->|GPU Full| F["CPU Fallback"];
+    E & F --> G["Continue Processing"];
+    D & G --> H["Return Response"];
 ```
 
 ## Google Colab Workflow
@@ -92,9 +92,9 @@ When deploying on Google Colab, LocalLab uses ngrok to create a public tunnel. T
 
 ```mermaid
 sequenceDiagram
-    participant U as User (Colab)
-    participant S as LocalLab Server
-    participant N as Ngrok Tunnel
+    participant U as "User (Colab)"
+    participant S as "LocalLab Server"
+    participant N as "Ngrok Tunnel"
     U->>S: Run start_server(ngrok=True)
     S->>N: Establish public tunnel
     N->>U: Return public URL
````
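The Colab sequence above centers on `start_server(ngrok=True)`. For orientation, a minimal usage sketch; note the assumptions: this diff only shows the diagram, not the function's signature, so the import path and the `ngrok` keyword being the only required argument are inferred, not confirmed.

```python
# Minimal Colab usage sketch for the workflow in the sequence diagram.
# Assumption: start_server is importable from the top-level locallab
# package; only the ngrok=True flag appears in the diagram itself.
from locallab import start_server

# Start the LocalLab server and ask it to open an ngrok tunnel; per the
# diagram, the public URL is then returned to the user for client access.
start_server(ngrok=True)
```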

docs/colab/README.md

Lines changed: 5 additions & 5 deletions
````diff
@@ -40,10 +40,10 @@ The fastest way to get started is to use our [Interactive Colab Guide](./localla
 
 ```mermaid
 graph TD;
-    A[User] --> B[LocalLab Client (Python and Node.js)];
-    B --> C[LocalLab Server];
-    C --> D[Model Manager];
-    D --> E[Hugging Face Models];
+    A[User] --> B["LocalLab Client (Python and Node.js)"];
+    B --> C["LocalLab Server"];
+    C --> D["Model Manager"];
+    D --> E["Hugging Face Models"];
     C --> F[Optimizations];
-    C --> G[Resource Monitoring];
+    C --> G["Resource Monitoring"];
 ```
````

locallab/__init__.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -2,7 +2,7 @@
 LocalLab - A lightweight AI inference server
 """
 
-__version__ = "0.1.0"
+__version__ = "0.1.1"
 
 from typing import Dict, Any, Optional
 
```
locallab/main.py

Lines changed: 3 additions & 0 deletions
```diff
@@ -688,6 +688,9 @@ def run_server_proc(log_queue):
     sys.stdout = log_writer
     sys.stderr = log_writer
 
+    # Clear any existing logger handlers to avoid sharing SemLocks from a fork context
+    logger.handlers.clear()
+
     # Attach a logging handler to send log messages to the queue
     handler = logging.StreamHandler(log_writer)
     handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
```
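This hunk is the core of the 0.1.1 RuntimeError fix. Below is a self-contained sketch of the pattern for context; only `run_server_proc`, `log_queue`, and the handler wiring come from the diff, while the `LogQueueWriter` shim, the logger name, and the `spawn` start method are illustrative assumptions. The idea: handlers configured in the parent process can hold SemLock-backed multiprocessing primitives, and clearing them before attaching a fresh handler keeps the child from sharing locks across fork/spawn contexts.

```python
import logging
import multiprocessing as mp
import sys


class LogQueueWriter:
    """Illustrative file-like shim that forwards writes to a multiprocessing queue."""

    def __init__(self, queue):
        self.queue = queue

    def write(self, text):
        if text.strip():
            self.queue.put(text)

    def flush(self):
        pass


logger = logging.getLogger("locallab")


def run_server_proc(log_queue):
    """Child-process entry point, mirroring the pattern in locallab/main.py."""
    log_writer = LogQueueWriter(log_queue)
    sys.stdout = log_writer
    sys.stderr = log_writer

    # Drop handlers inherited from the parent: they can wrap SemLock-backed
    # primitives created in a fork context, which a child running in another
    # context cannot share -- the RuntimeError this release fixes.
    logger.handlers.clear()

    # Attach a fresh handler owned entirely by this process.
    handler = logging.StreamHandler(log_writer)
    handler.setFormatter(logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)

    logger.info("server process started")


if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    log_queue = ctx.Queue()
    proc = ctx.Process(target=run_server_proc, args=(log_queue,))
    proc.start()
    proc.join()
    # Drain whatever the child logged through the queue.
    while not log_queue.empty():
        print(log_queue.get(), end="")
```

On notebook hosts like Colab, where pre-configured root handlers and start methods differ from a plain CPython run, this clearing step is what prevents fork-context SemLocks from leaking into the server process.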

setup.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -5,7 +5,7 @@
 
 setup(
     name="locallab",
-    version="0.1.0",
+    version="0.1.1",
     packages=find_packages(include=["locallab", "locallab.*"]),
     install_requires=[
         "fastapi>=0.68.0,<1.0.0",
```
