
Add pytorch LLM#4

Merged
kmgowda merged 16 commits into main from
kmg-release-2026-2
Jan 31, 2026

Conversation

@kmgowda
Owner

@kmgowda kmgowda commented Jan 15, 2026

Summary by CodeRabbit

  • New Features

    • Local PyTorch LLM for on-device inference, optional training, and performance analyses (throughput, latency, memory, percentiles)
    • AI backend lifecycle support (open/close) for cleaner startup/shutdown
  • Documentation

    • Added venv and conda environment setup instructions and a comprehensive PyTorch LLM README
  • Chores

    • Updated Python requirements and bumped version to 3.26.2.0

✏️ Tip: You can customize this high-level summary in your review settings.

Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
The training fails with a memory error:
Error during training: MPS backend out of memory (MPS allocated: 181.43 GiB, other allocations: 107.86 MiB, max allowed: 182.78 GiB). Tried to allocate 1.53 GiB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).

The training fails with the following error on macOS with 128 GB memory:
RuntimeError: MPS backend out of memory (MPS allocated: 178.54 GiB, other allocations: 3.84 GiB, max allowed: 182.78 GiB).
Tried to allocate 506.25 MiB on private pool.
Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
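The environment-variable workaround named in the error message can be applied by exporting it before launching the run; a minimal shell sketch (the commented launch command is illustrative, not the project's actual CLI):

```shell
# Lift the MPS allocator's upper memory limit, as the error message suggests.
# PyTorch itself warns that disabling the watermark may cause system failure.
export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0
echo "PYTORCH_MPS_HIGH_WATERMARK_RATIO=${PYTORCH_MPS_HIGH_WATERMARK_RATIO}"
# python -m sbk_charts ...   # hypothetical launch command, run in the same shell
```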

@coderabbitai

coderabbitai bot commented Jan 15, 2026

📝 Walkthrough

Walkthrough

Adds a local PyTorch LLM backend, lifecycle hooks (open/close) on AI interfaces, Excel color constants and a default timeout, dependency updates and README setup additions, and bumps the package version.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation & Setup<br>`README.md`, `src/custom_ai/pytorch_llm/README.md` | Added venv and conda setup blocks to README.md; new comprehensive README for the PyTorch LLM backend. |
| Dependencies<br>`requirements.txt` | Updated requirements: downgraded huggingface_hub (1.2.3 → 0.34.0), added torch==2.9.1, transformers==4.57.3, accelerate>=0.20.0; minor formatting change. |
| New PyTorch LLM Backend<br>`src/custom_ai/pytorch_llm/pytorch_llm.py` | New PyTorchLLM class implementing local HF model loading, device/dtype handling, generation, optional per-output training & persistence, analysis methods, CLI arg wiring, open/close lifecycle, and destructor. |
| AI Interface & Formatting<br>`src/ai/sbk_ai.py` | Added DEFAULT_TIMEOUT_SECONDS and ExcelColors constants; replaced inline hex colors with constants; added open(self, args) and close(self, args) delegating to the active backend; CLI timeout parsing adjusted. |
| AI Interface Placeholder<br>`src/genai/genai.py` | Added placeholder open(self, args) and close(self, args) on SbkGenAI and minor prompt text tweak. |
| Integration / Invocation<br>`src/main/sbk_charts.py` | Calls ch.open(args) before AI performance augmentation and ch.close(args) afterwards. |
| Versioning<br>`src/version/sbk_version.py` | Bumped __sbk_version__ from "3.26.1.0" to "3.26.2.0". |

Sequence Diagram

sequenceDiagram
    participant CLI as Application / CLI
    participant SbkAI as SbkAI Instance
    participant PyLLM as PyTorchLLM Backend
    participant Model as Local Model (HF / CUDA / MPS / CPU)

    CLI->>SbkAI: open(args)
    SbkAI->>PyLLM: open(args)
    PyLLM->>Model: _initialize_model() / load weights to device

    CLI->>SbkAI: generate(prompt)
    SbkAI->>PyLLM: _generate_text(prompt)
    PyLLM->>Model: tokenize & run generation (sampling params)
    Model-->>PyLLM: tokens / logits
    PyLLM->>PyLLM: _clean_generated_text()
    PyLLM-->>SbkAI: (success, text)
    SbkAI-->>CLI: return generated text

    CLI->>SbkAI: close(args)
    SbkAI->>PyLLM: close(args)
    PyLLM->>Model: optional _train_on_output() / _save_model()
    PyLLM->>Model: free resources

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐇 I hop where tensors softly glow,
I open gates and let prompts flow.
I whisper to weights, they hum and grow,
I close the loop — save seeds I sow.
Tiny rabbit bounds: infer, train, go!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.76%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add pytorch LLM' accurately reflects the main addition in this changeset: a new PyTorchLLM backend implementation with supporting documentation, dependencies, and integration points. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch kmg-release-2026-2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

🤖 Fix all issues with AI agents
In `@README.md`:
- Around line 131-132: The README currently pins a non-existent Python version
in the conda create command ("conda create -n conda-sbk-charts python=3.14 -y");
update that command to use a released interpreter (e.g., python=3.12 or
python=3.13) so environment creation succeeds, and optionally note a range like
"python>=3.12,<3.14" or "python=3.12" for clarity.

In `@requirements.txt`:
- Around line 13-14: The requirements.txt entry for transformers is pinned to a
non-existent version "transformers~=4.57.3"; update that token to a valid
release (e.g., change "transformers~=4.57.3" to "transformers~=4.57.4" or a
later valid version) while leaving "torch~=2.9.1" unchanged so pip can resolve
packages correctly.

In `@src/custom_ai/pytorch_llm/pytorch_llm.py`:
- Around line 100-105: The call to AutoModelForCausalLM.from_pretrained is using
the wrong keyword argument (dtype) so the dtype is ignored; change the parameter
name to torch_dtype when calling
AutoModelForCausalLM.from_pretrained(saved_model_dir, ...) and pass the same
dtype variable as torch_dtype; apply the identical change to the second
occurrence of the same call (the block around the lines referenced 116-121) so
both places use torch_dtype instead of dtype.
- Around line 100-105: Add a new boolean CLI flag (e.g., --pt-trust-remote-code)
to the existing argument parser with default False and store it in a variable
(e.g., pt_trust_remote_code); then replace the hardcoded trust_remote_code=True
in every model load call (the AutoModelForCausalLM.from_pretrained and any other
.from_pretrained usage in pytorch_llm.py) to use that variable
(trust_remote_code=pt_trust_remote_code) so users must explicitly opt in to
remote code execution.
- Around line 476-478: The autocast device selection in _train_on_example
currently only checks for 'cuda' and falls back to 'cpu'; update it to mirror
_train_on_output by including 'mps' detection: when forming the torch.autocast
call inside _train_on_example, set device_type to 'mps' if 'mps' is in
str(self.device), otherwise 'cuda' if 'cuda' is in str(self.device), else 'cpu',
and keep passing model_dtype as dtype so mixed-precision works correctly on
Apple Silicon.
- Around line 191-195: In open(), _is_initialized is being set to True
unconditionally after calling _initialize_model(); change the logic so you
capture the return value of _initialize_model() and only set
self._is_initialized = True when that call indicates success (e.g., returned
True), otherwise leave it False and handle the failure (log/error/raise or
return a failure status) so retries are possible; update the open() method to
use the boolean result from _initialize_model() and ensure callers can
observe/initiate retry behavior.
- Around line 279-281: The attention_mask is incorrectly cast to the model's
floating dtype; instead, keep it as an integer tensor and only move it to the
model device (or explicitly cast to torch.long). Update the code that currently
does inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)
to something like inputs['attention_mask'] =
inputs['attention_mask'].to(device=model.device, dtype=torch.long) (or preserve
original integer dtype while moving device), and make the same change in the
_train_on_example function so 'attention_mask' remains an integer (long) tensor
rather than being converted to model_dtype.
- Around line 440-493: The _train_on_example method returns inconsistent types
(float/None in docstring but (False, str) on exception); change the except block
to return None (or re-raise the exception) and log the error instead of
returning a tuple, and update the method docstring to clearly state it returns
float or None on failure; specifically modify the except block in
_train_on_example to call your logger (e.g., self.logger.error or similar) with
the exception message and then return None, removing the (False, str) tuple.
- Line 123: The call self.model = self.model.to(self.device) is redundant and
harmful when using device_map="auto" because the model is already placed across
devices; remove that unconditional .to(...) or guard it so it only runs when
device_map is not "auto" (e.g., check the loaded config/argument for device_map
before applying .to). Update the model-loading path where self.model is assigned
to avoid forcing .to(self.device) when device_map=="auto", leaving placement to
the loader to preserve model parallelism.
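The retry-friendly open() described in the prompt above can be sketched as follows. This is a hedged sketch, not the project's actual implementation: _initialize_model here is a stand-in stub for the real model loader, and the class name is illustrative.

```python
# Minimal sketch: open() only marks the backend initialized when
# _initialize_model() reports success, so a failed load can be retried.
class PyTorchLLMSketch:
    def __init__(self):
        self._is_initialized = False

    def _initialize_model(self) -> bool:
        # Stand-in for the real model-loading logic; returns success/failure.
        return True

    def open(self, args=None) -> bool:
        if self._is_initialized:
            return True
        ok = self._initialize_model()
        self._is_initialized = bool(ok)  # stays False on failure, enabling retry
        return self._is_initialized

backend = PyTorchLLMSketch()
print(backend.open())  # → True
```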
🧹 Nitpick comments (2)
README.md (1)

130-130: Add language specifiers to fenced code blocks.

Per markdownlint MD040, fenced code blocks should have a language specified for proper syntax highlighting.

Proposed fix
-```
+```bash
 # Create a new conda environment with Python 3.14 or higher
-```
+```bash
 conda deactivate

Also applies to: 149-149

src/custom_ai/pytorch_llm/pytorch_llm.py (1)

334-334: Remove unused variable was_training.

The variable was_training is assigned but never used, as flagged by Ruff F841.

Proposed fix
-        was_training = self.model.training
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ffa1190 and e3d801a.

📒 Files selected for processing (7)
  • README.md
  • requirements.txt
  • src/ai/sbk_ai.py
  • src/custom_ai/pytorch_llm/__init__.py
  • src/custom_ai/pytorch_llm/pytorch_llm.py
  • src/genai/genai.py
  • src/main/sbk_charts.py
🧰 Additional context used
🧬 Code graph analysis (4)
src/genai/genai.py (2)
src/custom_ai/pytorch_llm/pytorch_llm.py (2)
  • open (191-195)
  • close (198-219)
src/ai/sbk_ai.py (2)
  • open (194-196)
  • close (198-200)
src/ai/sbk_ai.py (4)
src/custom_ai/pytorch_llm/pytorch_llm.py (3)
  • parse_args (179-188)
  • open (191-195)
  • close (198-219)
src/genai/genai.py (3)
  • parse_args (55-56)
  • open (58-59)
  • close (61-62)
src/custom_ai/lm_studio/lm_studio.py (1)
  • parse_args (98-103)
src/custom_ai/ollama/ollama.py (1)
  • parse_args (113-119)
src/main/sbk_charts.py (3)
src/custom_ai/pytorch_llm/pytorch_llm.py (2)
  • open (191-195)
  • close (198-219)
src/ai/sbk_ai.py (2)
  • open (194-196)
  • close (198-200)
src/genai/genai.py (2)
  • open (58-59)
  • close (61-62)
src/custom_ai/pytorch_llm/pytorch_llm.py (2)
src/genai/genai.py (13)
  • SbkGenAI (25-356)
  • add_args (52-53)
  • parse_args (55-56)
  • open (58-59)
  • get_model_description (65-73)
  • get_throughput_analysis (327-336)
  • get_throughput_prompt (75-122)
  • get_latency_analysis (339-348)
  • get_latency_prompt (124-215)
  • get_total_mb_analysis (351-352)
  • get_total_mb_prompt (218-247)
  • get_percentile_histogram_analysis (355-356)
  • get_percentile_histogram_prompt (249-323)
src/ai/sbk_ai.py (3)
  • add_args (141-168)
  • parse_args (171-192)
  • open (194-196)
🪛 markdownlint-cli2 (0.18.1)
README.md

130-130: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


149-149: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.11)
src/genai/genai.py

58-59: SbkGenAI.open is an empty method in an abstract base class, but has no abstract decorator

(B027)


61-62: SbkGenAI.close is an empty method in an abstract base class, but has no abstract decorator

(B027)

src/custom_ai/pytorch_llm/pytorch_llm.py

133-133: Consider moving this statement to an else block

(TRY300)


135-135: Do not catch blind exception: Exception

(BLE001)


136-136: Use explicit conversion flag

Replace with conversion flag

(RUF010)


191-191: Unused method argument: args

(ARG002)


198-198: Unused method argument: args

(ARG002)


222-222: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


243-243: Consider moving this statement to an else block

(TRY300)


245-245: Do not catch blind exception: Exception

(BLE001)


246-246: Use explicit conversion flag

Replace with conversion flag

(RUF010)


314-314: Consider moving this statement to an else block

(TRY300)


316-316: Do not catch blind exception: Exception

(BLE001)


317-317: Use explicit conversion flag

Replace with conversion flag

(RUF010)


334-334: Local variable was_training is assigned to but never used

Remove assignment to unused variable was_training

(F841)


389-389: Consider moving this statement to an else block

(TRY300)


391-391: Do not catch blind exception: Exception

(BLE001)


392-392: Use explicit conversion flag

Replace with conversion flag

(RUF010)


490-490: Consider moving this statement to an else block

(TRY300)


492-492: Do not catch blind exception: Exception

(BLE001)


493-493: Use explicit conversion flag

Replace with conversion flag

(RUF010)


500-500: Do not catch blind exception: Exception

(BLE001)


501-501: Use explicit conversion flag

Replace with conversion flag

(RUF010)


508-508: Do not catch blind exception: Exception

(BLE001)


509-509: Use explicit conversion flag

Replace with conversion flag

(RUF010)


516-516: Do not catch blind exception: Exception

(BLE001)


517-517: Use explicit conversion flag

Replace with conversion flag

(RUF010)


524-524: Do not catch blind exception: Exception

(BLE001)


525-525: Use explicit conversion flag

Replace with conversion flag

(RUF010)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (5)
requirements.txt (1)

8-8: Verify huggingface_hub version compatibility.

The huggingface_hub was downgraded from 1.2.3 to 0.34.0. Ensure this version is compatible with the transformers version being used, as transformers may have minimum version requirements for huggingface_hub.

src/genai/genai.py (1)

58-62: LGTM - Optional lifecycle hooks.

The open() and close() methods are intentionally non-abstract to serve as optional lifecycle hooks. Backends like PyTorchLLM can override them for resource initialization and cleanup (e.g., model loading, training on close), while other backends can use the default no-op implementation. The Ruff B027 warning can be safely ignored as this is the intended design.

src/ai/sbk_ai.py (2)

54-54: LGTM - Clean timeout constant extraction.

Good refactor to extract DEFAULT_TIMEOUT_SECONDS as a module-level constant, making the default value explicit and reusable. The int() conversion in parse_args properly handles string input from command-line arguments.

Also applies to: 138-138, 187-187


194-200: LGTM - Lifecycle delegation pattern.

The open() and close() methods correctly delegate to the active AI backend instance when present. This enables proper resource management for backends like PyTorchLLM that need initialization and cleanup.

src/custom_ai/pytorch_llm/pytorch_llm.py (1)

37-37: No changes needed. openai/gpt-oss-20b is a legitimate Hugging Face model—OpenAI's open-weight GPT-OSS model (~21B parameters) released under Apache 2.0 license and available on Hugging Face in multiple formats.

Likely an incorrect or invalid review comment.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +131 to +132
# Create a new conda environment with Python 3.14 or higher
conda create -n conda-sbk-charts python=3.14 -y

⚠️ Potential issue | 🟡 Minor

Python 3.14 does not exist yet.

Python 3.14 is not released (current stable versions are 3.12/3.13). This will cause conda environment creation to fail. Consider using Python 3.12 or 3.13.

Proposed fix
-# Create a new conda environment with Python 3.14 or higher
-conda create -n conda-sbk-charts python=3.14 -y
+# Create a new conda environment with Python 3.12 or higher
+conda create -n conda-sbk-charts python=3.12 -y
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-# Create a new conda environment with Python 3.14 or higher
-conda create -n conda-sbk-charts python=3.14 -y
+# Create a new conda environment with Python 3.12 or higher
+conda create -n conda-sbk-charts python=3.12 -y
🤖 Prompt for AI Agents
In `@README.md` around lines 131 - 132, The README currently pins a non-existent
Python version in the conda create command ("conda create -n conda-sbk-charts
python=3.14 -y"); update that command to use a released interpreter (e.g.,
python=3.12 or python=3.13) so environment creation succeeds, and optionally
note a range like "python>=3.12,<3.14" or "python=3.12" for clarity.

Comment on lines +13 to +14
torch~=2.9.1
transformers~=4.57.3

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check latest available versions of torch and transformers on PyPI
echo "Checking torch versions:"
curl -s https://pypi.org/pypi/torch/json | jq -r '.releases | keys | .[-15:]' | sort -V

echo -e "\nChecking transformers versions:"
curl -s https://pypi.org/pypi/transformers/json | jq -r '.releases | keys | .[-15:]' | sort -V

Repository: kmgowda/sbk-charts

Length of output: 469


Fix invalid transformers version in requirements.txt.

Line 14 specifies transformers~=4.57.3, but this version does not exist on PyPI. Available versions jump from 4.9.2 to 4.57.4. Update to transformers~=4.57.4 or later.

The torch~=2.9.1 version is valid and exists on PyPI.

🤖 Prompt for AI Agents
In `@requirements.txt` around lines 13 - 14, The requirements.txt entry for
transformers is pinned to a non-existent version "transformers~=4.57.3"; update
that token to a valid release (e.g., change "transformers~=4.57.3" to
"transformers~=4.57.4" or a later valid version) while leaving "torch~=2.9.1"
unchanged so pip can resolve packages correctly.

Comment on lines +100 to +105
            self.model = AutoModelForCausalLM.from_pretrained(
                saved_model_dir,
                device_map="auto",
                dtype=dtype,
                trust_remote_code=True
            )

⚠️ Potential issue | 🟠 Major

Wrong parameter name: use torch_dtype instead of dtype.

The AutoModelForCausalLM.from_pretrained() method expects torch_dtype, not dtype. This will cause the dtype setting to be silently ignored.

Proposed fix
             self.model = AutoModelForCausalLM.from_pretrained(
                 saved_model_dir,
                 device_map="auto",
-                dtype=dtype,
+                torch_dtype=dtype,
                 trust_remote_code=True
             )

Apply the same fix at lines 116-121.

Also applies to: 116-121

🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 100 - 105, The call to
AutoModelForCausalLM.from_pretrained is using the wrong keyword argument (dtype)
so the dtype is ignored; change the parameter name to torch_dtype when calling
AutoModelForCausalLM.from_pretrained(saved_model_dir, ...) and pass the same
dtype variable as torch_dtype; apply the identical change to the second
occurrence of the same call (the block around the lines referenced 116-121) so
both places use torch_dtype instead of dtype.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's verify the file exists and get its context
fd src/custom_ai/pytorch_llm/pytorch_llm.py

Repository: kmgowda/sbk-charts

Length of output: 44


🏁 Script executed:

# Search for pytorch_llm files
fd pytorch_llm

# Also search for files containing AutoModelForCausalLM to understand context
rg AutoModelForCausalLM --type py -l

Repository: kmgowda/sbk-charts

Length of output: 170


🏁 Script executed:

# Check file size
wc -l src/custom_ai/pytorch_llm/pytorch_llm.py

# Read the file to understand context
cat -n src/custom_ai/pytorch_llm/pytorch_llm.py

Repository: kmgowda/sbk-charts

Length of output: 24290


Add configurable --pt-trust-remote-code flag to control arbitrary code execution.

The code hardcodes trust_remote_code=True when loading models (lines 100-105, 106-109, 112-115, 116-121), which enables arbitrary code execution from model repositories. Add a command-line flag --pt-trust-remote-code to the existing argument parser that defaults to False, allowing users to explicitly opt-in to this security-critical feature only when needed.

🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 100 - 105, Add a new
boolean CLI flag (e.g., --pt-trust-remote-code) to the existing argument parser
with default False and store it in a variable (e.g., pt_trust_remote_code); then
replace the hardcoded trust_remote_code=True in every model load call (the
AutoModelForCausalLM.from_pretrained and any other .from_pretrained usage in
pytorch_llm.py) to use that variable (trust_remote_code=pt_trust_remote_code) so
users must explicitly opt in to remote code execution.
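The opt-in flag described in the prompt above could be wired with argparse along these lines. The flag name --pt-trust-remote-code comes from the review suggestion; the parser wiring is a hedged sketch, not the project's actual add_args method.

```python
import argparse

# Sketch: a boolean flag defaulting to False so remote code execution is opt-in.
parser = argparse.ArgumentParser()
parser.add_argument("--pt-trust-remote-code", action="store_true", default=False,
                    help="Allow Hugging Face model repos to run custom code (off by default)")

args = parser.parse_args([])                    # flag absent: stays False
print(args.pt_trust_remote_code)                # → False
args = parser.parse_args(["--pt-trust-remote-code"])
print(args.pt_trust_remote_code)                # → True
# The parsed value would then replace the hardcoded argument in every load call:
# trust_remote_code=args.pt_trust_remote_code
```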

                trust_remote_code=True
            )

            self.model = self.model.to(self.device)

⚠️ Potential issue | 🟠 Major

Redundant .to(device) after device_map="auto".

When using device_map="auto", the model is automatically placed on appropriate devices. Calling .to(self.device) afterwards can cause issues with model parallelism and is unnecessary.

Proposed fix
-            self.model = self.model.to(self.device)
+            # Note: device_map="auto" handles device placement automatically
🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` at line 123, The call self.model =
self.model.to(self.device) is redundant and harmful when using device_map="auto"
because the model is already placed across devices; remove that unconditional
.to(...) or guard it so it only runs when device_map is not "auto" (e.g., check
the loaded config/argument for device_map before applying .to). Update the
model-loading path where self.model is assigned to avoid forcing
.to(self.device) when device_map=="auto", leaving placement to the loader to
preserve model parallelism.

Comment on lines +279 to +281
            # Ensure attention mask is the same dtype as the model
            if 'attention_mask' in inputs:
                inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)

⚠️ Potential issue | 🟠 Major

Incorrect dtype for attention_mask.

Attention masks should remain as integer tensors (long/int), not be cast to the model's floating-point dtype. This can cause errors or incorrect behavior.

Proposed fix
-            # Ensure attention mask is the same dtype as the model
-            if 'attention_mask' in inputs:
-                inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)
+            # Note: attention_mask should remain as integer type, no conversion needed

The same issue exists in _train_on_example at lines 472-474.

🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 279 - 281, The
attention_mask is incorrectly cast to the model's floating dtype; instead, keep
it as an integer tensor and only move it to the model device (or explicitly cast
to torch.long). Update the code that currently does inputs['attention_mask'] =
inputs['attention_mask'].to(dtype=model_dtype) to something like
inputs['attention_mask'] = inputs['attention_mask'].to(device=model.device,
dtype=torch.long) (or preserve original integer dtype while moving device), and
make the same change in the _train_on_example function so 'attention_mask'
remains an integer (long) tensor rather than being converted to model_dtype.

Comment on lines +440 to +493
def _train_on_example(self, prompt: str, target: str) -> None:
    """Train the model on a single example.

    Args:
        prompt: The input prompt
        target: The target response

    Returns:
        float: The loss value if successful, None otherwise
    """
    try:
        # Combine prompt and target with separator
        text = f"{prompt}{target}{self.tokenizer.eos_token}"

        # Get model's dtype for consistent typing
        model_dtype = next(self.model.parameters()).dtype

        # Tokenize the input
        inputs = self.tokenizer(
            text,
            return_tensors="pt",
            padding="max_length",
            max_length=self.max_length,
            truncation=True
        )

        # Move inputs to the correct device and type
        inputs = {k: v.to(self.device) for k, v in inputs.items()}

        # Create labels (shifted input) and ensure correct dtype
        labels = inputs["input_ids"].clone()

        # Ensure attention mask is the same dtype as the model
        if 'attention_mask' in inputs:
            inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)

        # Forward pass with autocast for mixed precision training
        with torch.autocast(device_type='cuda' if 'cuda' in str(self.device) else 'cpu',
                            dtype=model_dtype):
            outputs = self.model(
                input_ids=inputs["input_ids"],
                attention_mask=inputs.get("attention_mask", None),
                labels=labels
            )

        # Backward pass and optimize
        loss = outputs.loss
        if loss is not None:
            loss.backward()
            return loss.item()
        return None

    except Exception as e:
        return False, f"Error in text generation: {str(e)}"

⚠️ Potential issue | 🟡 Minor

Inconsistent return type in _train_on_example.

The docstring states the method returns float or None, but line 493 returns a tuple (False, str). This type inconsistency can cause issues for callers.

Proposed fix
         except Exception as e:
-            return False, f"Error in text generation: {str(e)}"
+            print(f"Error in training: {e!r}")
+            return None
🧰 Tools
🪛 Ruff (0.14.11)

490-490: Consider moving this statement to an else block

(TRY300)


492-492: Do not catch blind exception: Exception

(BLE001)


493-493: Use explicit conversion flag

Replace with conversion flag

(RUF010)

🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 440 - 493, The
_train_on_example method returns inconsistent types (float/None in docstring but
(False, str) on exception); change the except block to return None (or re-raise
the exception) and log the error instead of returning a tuple, and update the
method docstring to clearly state it returns float or None on failure;
specifically modify the except block in _train_on_example to call your logger
(e.g., self.logger.error or similar) with the exception message and then return
None, removing the (False, str) tuple.

Comment on lines +476 to +478
            # Forward pass with autocast for mixed precision training
            with torch.autocast(device_type='cuda' if 'cuda' in str(self.device) else 'cpu',
                                dtype=model_dtype):

⚠️ Potential issue | 🟡 Minor

Missing MPS device handling in _train_on_example autocast.

The _train_on_example method doesn't handle 'mps' device type in autocast, unlike _train_on_output (line 292). This inconsistency could cause issues on Apple Silicon.

Proposed fix
-            with torch.autocast(device_type='cuda' if 'cuda' in str(self.device) else 'cpu', 
+            device_type = 'mps' if 'mps' in str(self.device) else 'cuda' if 'cuda' in str(self.device) else 'cpu'
+            with torch.autocast(device_type=device_type,
                               dtype=model_dtype):
📝 Committable suggestion


Suggested change
-            # Forward pass with autocast for mixed precision training
-            with torch.autocast(device_type='cuda' if 'cuda' in str(self.device) else 'cpu',
-                                dtype=model_dtype):
+            # Forward pass with autocast for mixed precision training
+            device_type = 'mps' if 'mps' in str(self.device) else 'cuda' if 'cuda' in str(self.device) else 'cpu'
+            with torch.autocast(device_type=device_type,
+                                dtype=model_dtype):
🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 476 - 478, The
autocast device selection in _train_on_example currently only checks for 'cuda'
and falls back to 'cpu'; update it to mirror _train_on_output by including 'mps'
detection: when forming the torch.autocast call inside _train_on_example, set
device_type to 'mps' if 'mps' is in str(self.device), otherwise 'cuda' if 'cuda'
is in str(self.device), else 'cpu', and keep passing model_dtype as dtype so
mixed-precision works correctly on Apple Silicon.

Comment on lines 73 to +76
    ch.parse_args(args)
    ch.open(args)
    ch.add_performance_details()
    ch.close(args)

⚠️ Potential issue | 🟠 Major

Wrap open/close lifecycle in try/finally for exception safety.

If add_performance_details() raises an exception, close() will never be called. This is particularly important for the PyTorchLLM backend where close() handles training completion and model persistence.

🛠️ Suggested fix
     ch.parse_args(args)
-    ch.open(args)
-    ch.add_performance_details()
-    ch.close(args)
+    ch.open(args)
+    try:
+        ch.add_performance_details()
+    finally:
+        ch.close(args)
📝 Committable suggestion


Suggested change
     ch.parse_args(args)
-    ch.open(args)
-    ch.add_performance_details()
-    ch.close(args)
+    ch.open(args)
+    try:
+        ch.add_performance_details()
+    finally:
+        ch.close(args)
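The exception-safety argument can be demonstrated with a stand-in handler (a minimal sketch; `DummyCharts` is hypothetical and only records whether `close()` ran):

```python
class DummyCharts:
    """Stand-in for the charts handler; records whether close() ran."""
    def __init__(self):
        self.closed = False

    def open(self, args):
        pass

    def add_performance_details(self):
        # Simulate a failure mid-analysis.
        raise RuntimeError("analysis failed")

    def close(self, args):
        self.closed = True


ch = DummyCharts()
ch.open(None)
try:
    try:
        ch.add_performance_details()
    finally:
        ch.close(None)  # runs even though the analysis raised
except RuntimeError:
    pass
```

Without the `try/finally`, the simulated failure would skip `close()` entirely, which for the PyTorchLLM backend would skip training completion and model persistence.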

Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py`:
- Around line 485-493: The method currently calls loss.backward() but never
updates parameters; after loss.backward() call in the function containing
loss.backward() add a call to optimizer.step() and then optimizer.zero_grad()
(or zero gradients before the backward as per existing pattern) to perform the
weight update; also fix the except block that returns (False, str) so the
function adheres to its docstring by returning None on error (or re-raising) and
optionally logging the exception instead of returning a tuple—update references
to loss.backward(), optimizer.step(), optimizer.zero_grad(), and the except
block return to ensure a consistent float|None return type.
♻️ Duplicate comments (5)
src/custom_ai/pytorch_llm/pytorch_llm.py (5)

100-105: Wrong parameter name: use torch_dtype instead of dtype.

The AutoModelForCausalLM.from_pretrained() method expects torch_dtype, not dtype. This causes the dtype setting to be silently ignored, and the model will load with default precision.

Proposed fix
             self.model = AutoModelForCausalLM.from_pretrained(
                 saved_model_dir,
                 device_map="auto",
-                dtype=dtype,
+                torch_dtype=dtype,
                 trust_remote_code=True
             )

Apply the same fix at lines 116-121.


123-123: Redundant .to(device) after device_map="auto".

When using device_map="auto", the model is automatically distributed across devices. Calling .to(self.device) can break model parallelism and cause device mismatch errors for large models.

Proposed fix
-            self.model = self.model.to(self.device)
+            # Note: device_map="auto" handles device placement automatically

191-196: Bug: _is_initialized set to True regardless of initialization result.

If _initialize_model() returns False (initialization failed), _is_initialized is still set to True on line 195, masking the failure and preventing retry attempts.

Proposed fix
 def open(self, args):
     # Reinitialize model if needed
     if not self._is_initialized:
-        self._initialize_model()
-        self._is_initialized = True
+        if not self._initialize_model():
+            print("Warning: Model initialization failed in open()")
+            return

279-281: Incorrect dtype for attention_mask.

Attention masks should remain as integer tensors (torch.long), not be cast to the model's floating-point dtype. This can cause errors or incorrect behavior during forward pass.

Proposed fix
-            # Ensure attention mask is the same dtype as the model
-            if 'attention_mask' in inputs:
-                inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)
+            # Note: attention_mask should remain as integer type, no conversion needed

The same issue exists in _train_on_example at lines 472-474.


476-478: Missing MPS device handling in autocast.

Unlike _train_on_output (line 292), this method doesn't handle 'mps' device type in autocast, causing inconsistent behavior on Apple Silicon.

Proposed fix
-            with torch.autocast(device_type='cuda' if 'cuda' in str(self.device) else 'cpu', 
+            device_type = 'mps' if 'mps' in str(self.device) else 'cuda' if 'cuda' in str(self.device) else 'cpu'
+            with torch.autocast(device_type=device_type,
                               dtype=model_dtype):
🧹 Nitpick comments (4)
src/custom_ai/pytorch_llm/pytorch_llm.py (1)

334-334: Unused variable was_training.

The variable was_training is assigned but never used. This appears to be leftover code from an incomplete implementation to restore training mode after generation.

Either remove it or implement the restoration logic:

Option 1: Remove unused variable
-        was_training = self.model.training
Option 2: Implement training mode restoration
         was_training = self.model.training
         try:
+            self.model.eval()
             # ... generation code ...
+        finally:
+            if was_training:
+                self.model.train()
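Option 2 can be exercised without loading a model by using a stand-in that tracks only the training flag (a sketch; `TinyModel` and `generate_with_mode_restore` are illustrative names, not PR code):

```python
class TinyModel:
    """Stand-in for an nn.Module: tracks only the training flag."""
    def __init__(self):
        self.training = True

    def train(self):
        self.training = True

    def eval(self):
        self.training = False


def generate_with_mode_restore(model):
    was_training = model.training
    try:
        model.eval()          # inference should run in eval mode
        return "generated"
    finally:
        if was_training:
            model.train()     # restore the caller's mode
```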
src/custom_ai/pytorch_llm/README.md (3)

128-140: Python code should use Python code block, not bash.

The model download example contains Python code but is wrapped in a bash code block. This affects syntax highlighting and copy-paste usability.

Proposed fix
 1. Download a model from Hugging Face Hub:
-   ```bash
+   ```python
    from transformers import AutoModelForCausalLM, AutoTokenizer

175-184: Add language specifier to fenced code block.

The directory structure code block should specify a language (e.g., text or plaintext) for consistent formatting.

Proposed fix
-```
+```text
 src/
 └── custom_ai/
     └── pytorch_llm/
         ├── __init__.py
         ├── pytorch_llm.py      # Main implementation
         └── README.md           # This document
 ```

184-187: Add blank line before License heading.

Missing blank line before the ## License heading affects Markdown rendering consistency.

Proposed fix
         └── README.md           # This document
+
 ## License

</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>📜 Review details</summary>

**Configuration used**: defaults

**Review profile**: CHILL

**Plan**: Pro

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between e3d801a6a028eb836baa8cff6e72abbaa17048c2 and 37af43e01b81b45b042ff6d7c98fdcb12cd0fd39.

</details>

<details>
<summary>⛔ Files ignored due to path filters (2)</summary>

* `samples/charts/.DS_Store` is excluded by `!**/.DS_Store`
* `samples/charts/sbk-file-rocksdb-read-pytorch-llm.xlsx` is excluded by `!**/*.xlsx`

</details>

<details>
<summary>📒 Files selected for processing (3)</summary>

* `src/custom_ai/pytorch_llm/README.md`
* `src/custom_ai/pytorch_llm/pytorch_llm.py`
* `src/version/sbk_version.py`

</details>

<details>
<summary>🧰 Additional context used</summary>

<details>
<summary>🧬 Code graph analysis (1)</summary>

<details>
<summary>src/custom_ai/pytorch_llm/pytorch_llm.py (2)</summary><blockquote>

<details>
<summary>src/genai/genai.py (14)</summary>

* `SbkGenAI` (25-356)
* `add_args` (52-53)
* `parse_args` (55-56)
* `open` (58-59)
* `close` (61-62)
* `get_model_description` (65-73)
* `get_throughput_analysis` (327-336)
* `get_throughput_prompt` (75-122)
* `get_latency_analysis` (339-348)
* `get_latency_prompt` (124-215)
* `get_total_mb_analysis` (351-352)
* `get_total_mb_prompt` (218-247)
* `get_percentile_histogram_analysis` (355-356)
* `get_percentile_histogram_prompt` (249-323)

</details>
<details>
<summary>src/ai/sbk_ai.py (4)</summary>

* `add_args` (141-168)
* `parse_args` (171-192)
* `open` (194-196)
* `close` (198-200)

</details>

</blockquote></details>

</details><details>
<summary>🪛 markdownlint-cli2 (0.18.1)</summary>

<details>
<summary>src/custom_ai/pytorch_llm/README.md</summary>

177-177: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>
<details>
<summary>🪛 Ruff (0.14.11)</summary>

<details>
<summary>src/custom_ai/pytorch_llm/pytorch_llm.py</summary>

133-133: Consider moving this statement to an `else` block

(TRY300)

---

135-135: Do not catch blind exception: `Exception`

(BLE001)

---

136-136: Use explicit conversion flag

Replace with conversion flag

(RUF010)

---

191-191: Unused method argument: `args`

(ARG002)

---

198-198: Unused method argument: `args`

(ARG002)

---

222-222: PEP 484 prohibits implicit `Optional`

Convert to `T | None`

(RUF013)

---

243-243: Consider moving this statement to an `else` block

(TRY300)

---

245-245: Do not catch blind exception: `Exception`

(BLE001)

---

246-246: Use explicit conversion flag

Replace with conversion flag

(RUF010)

---

314-314: Consider moving this statement to an `else` block

(TRY300)

---

316-316: Do not catch blind exception: `Exception`

(BLE001)

---

317-317: Use explicit conversion flag

Replace with conversion flag

(RUF010)

---

334-334: Local variable `was_training` is assigned to but never used

Remove assignment to unused variable `was_training`

(F841)

---

389-389: Consider moving this statement to an `else` block

(TRY300)

---

391-391: Do not catch blind exception: `Exception`

(BLE001)

---

392-392: Use explicit conversion flag

Replace with conversion flag

(RUF010)

---

490-490: Consider moving this statement to an `else` block

(TRY300)

---

492-492: Do not catch blind exception: `Exception`

(BLE001)

---

493-493: Use explicit conversion flag

Replace with conversion flag

(RUF010)

---

500-500: Do not catch blind exception: `Exception`

(BLE001)

---

501-501: Use explicit conversion flag

Replace with conversion flag

(RUF010)

---

508-508: Do not catch blind exception: `Exception`

(BLE001)

---

509-509: Use explicit conversion flag

Replace with conversion flag

(RUF010)

---

516-516: Do not catch blind exception: `Exception`

(BLE001)

---

517-517: Use explicit conversion flag

Replace with conversion flag

(RUF010)

---

524-524: Do not catch blind exception: `Exception`

(BLE001)

---

525-525: Use explicit conversion flag

Replace with conversion flag

(RUF010)

</details>

</details>

</details>

<details>
<summary>⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)</summary>

* GitHub Check: build

</details>

<details>
<summary>🔇 Additional comments (6)</summary><blockquote>

<details>
<summary>src/version/sbk_version.py (1)</summary><blockquote>

`11-11`: **LGTM!**

Version bump to `3.26.1.1` is appropriate for the new PyTorch LLM backend addition.

</blockquote></details>
<details>
<summary>src/custom_ai/pytorch_llm/pytorch_llm.py (4)</summary><blockquote>

`198-220`: **LGTM on training lifecycle in `close()`.**

The training flow correctly sets the model to training mode, iterates over outputs, and conditionally saves the model. The unused `args` parameter matches the interface contract from `SbkGenAI`.

---

`495-526`: **LGTM on analysis methods.**

The analysis methods correctly delegate to prompt generation and text generation, following the interface pattern from `SbkGenAI`. Exception handling provides graceful error reporting.

---

`527-534`: **LGTM on destructor cleanup.**

The `__del__` method properly cleans up model and tokenizer resources, and clears CUDA cache when available.

---

`36-46`: This comment is incorrect. The `openai/gpt-oss-20b` model is a legitimate, real Hugging Face model—OpenAI's open-weight 21B-parameter language model released under Apache 2.0 and available on the Hugging Face Hub. Users will not encounter download failures; the model can be loaded using `AutoModelForCausalLM.from_pretrained()` as intended.

The README acknowledges that this default model consumes over 200GB of RAM, which may be impractical for many setups. If resource constraints are a concern, users can specify alternative models via the `--pt-model` flag (e.g., `mistralai/Mistral-7B-v0.1`).



> Likely an incorrect or invalid review comment.

</blockquote></details>
<details>
<summary>src/custom_ai/pytorch_llm/README.md (1)</summary><blockquote>

`19-19`: **Verify the default model claim.**

The documentation mentions `openai/gpt-oss-20b` consuming more than 200GB RAM. If this model doesn't exist on Hugging Face (as noted in the code review), this claim may be misleading. Consider updating with a verified model and accurate memory requirements.

</blockquote></details>

</blockquote></details>

<sub>✏️ Tip: You can disable this entire section by setting `review_details` to `false` in your review settings.</sub>

</details>


Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py`:
- Around line 56-61: Update the class/module docstring to reflect the real
default model: replace the mention of "gpt2" with the actual DEFAULT_MODEL value
("openai/gpt-oss-20b") so the docstring matches the constant DEFAULT_MODEL used
in this file (refer to DEFAULT_MODEL in src/custom_ai/pytorch_llm/pytorch_llm.py
and the surrounding class/docstring near the top of the file).
- Around line 357-358: The max_new_tokens calculation currently uses
max(DEFAULT_MAX_LENGTH, inputs['input_ids'].shape[1]) which can let
input+generation exceed the model context; change the calculation in the method
where max_new_tokens is set (referencing max_new_tokens, DEFAULT_MAX_LENGTH,
inputs['input_ids'] and self.max_length) to compute max_new_tokens = max(256,
self.max_length - inputs['input_ids'].shape[1]) (ensure the subtraction is
clamped so it never goes below 256 and result is an int) so that input tokens +
generated tokens never exceed the model context window.
♻️ Duplicate comments (7)
src/custom_ai/pytorch_llm/pytorch_llm.py (7)

100-105: Wrong parameter name: use torch_dtype instead of dtype.

The AutoModelForCausalLM.from_pretrained() method expects torch_dtype, not dtype. This will cause the dtype setting to be silently ignored, and the model will load with default precision.

Proposed fix
             self.model = AutoModelForCausalLM.from_pretrained(
                 saved_model_dir,
                 device_map="auto",
-                dtype=dtype,
+                torch_dtype=dtype,
                 trust_remote_code=True
             )

Apply the same fix at lines 116-121.


123-123: Redundant .to(device) after device_map="auto".

When using device_map="auto", the model is automatically distributed across available devices. Calling .to(self.device) can break model parallelism or cause errors when the model spans multiple GPUs.

Proposed fix
-            self.model = self.model.to(self.device)
+            # Note: device_map="auto" handles device placement automatically

142-177: Add configurable --pt-trust-remote-code flag.

The code hardcodes trust_remote_code=True when loading models, which enables arbitrary code execution from model repositories. Add a CLI flag that defaults to False, allowing users to explicitly opt-in.

Proposed fix
         parser.add_argument(
             "--pt-top-p",
             type=float,
             help=f"Top-p sampling parameter (default: {DEFAULT_TOP_P})",
             default=DEFAULT_TOP_P
         )
+        parser.add_argument(
+            "--pt-trust-remote-code",
+            action="store_true",
+            help="Trust remote code when loading models (default: False, security risk)",
+            default=False
+        )

Then update parse_args to store self.trust_remote_code = args.pt_trust_remote_code and use it in _initialize_model.
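The opt-in behavior of `action="store_true"` can be verified in isolation (a sketch using only the standard library; the flag name follows the suggested fix):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--pt-trust-remote-code",
    action="store_true",
    help="Trust remote code when loading models (default: False, security risk)",
)

# Omitting the flag leaves it False; the user must opt in explicitly.
safe = parser.parse_args([])
opted_in = parser.parse_args(["--pt-trust-remote-code"])
```

`_initialize_model` would then pass `trust_remote_code=self.trust_remote_code` instead of a hardcoded `True`.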


191-195: Bug: _is_initialized set to True regardless of initialization result.

If _initialize_model() returns False, _is_initialized is still set to True on line 195, masking the failure and preventing retry attempts.

Proposed fix
     def open(self, args):
         # Reinitialize model if needed
         if not self._is_initialized:
-            self._initialize_model()
-            self._is_initialized = True
+            if not self._initialize_model():
+                print("Warning: Model initialization failed in open()")
+                return False
+        return True

279-281: Incorrect dtype for attention_mask.

Attention masks should remain as integer tensors (long/int), not be cast to the model's floating-point dtype. This can cause errors or incorrect behavior during attention computation.

Proposed fix
-            # Ensure attention mask is the same dtype as the model
-            if 'attention_mask' in inputs:
-                inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)
+            # Note: attention_mask should remain as integer type, no conversion needed

484-492: Critical: Missing optimizer.step() - training is ineffective.

loss.backward() computes gradients but optimizer.step() is never called, so weights are never updated. This makes the training completely non-functional.

Also, line 492 returns (False, str) but the docstring states it returns float or None.

Proposed fix
             # Backward pass and optimize
             loss = outputs.loss
             if loss is not None:
                 loss.backward()
+                if not hasattr(self, 'optimizer'):
+                    self.optimizer = torch.optim.AdamW(self.model.parameters(), lr=5e-5)
+                self.optimizer.step()
+                self.optimizer.zero_grad()
                 return loss.item()
             return None
             
         except Exception as e:
-            return False, f"Error in text generation: {str(e)}"
+            print(f"Error in training: {e!r}")
+            return None

471-477: Attention mask dtype and missing MPS handling.

Two issues in this block:

  1. Line 472-473: Attention mask incorrectly cast to float dtype
  2. Line 476: Missing 'mps' device type handling (unlike _train_on_output at line 292)
Proposed fix
-            # Ensure attention mask is the same dtype as the model
-            if 'attention_mask' in inputs:
-                inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)
+            # Note: attention_mask should remain as integer type
             
             # Forward pass with autocast for mixed precision training
-            with torch.autocast(device_type='cuda' if 'cuda' in str(self.device) else 'cpu', 
+            device_type = 'mps' if 'mps' in str(self.device) else 'cuda' if 'cuda' in str(self.device) else 'cpu'
+            with torch.autocast(device_type=device_type,
                               dtype=model_dtype):
🧹 Nitpick comments (1)
src/custom_ai/pytorch_llm/pytorch_llm.py (1)

222-222: Update type hint to explicit Optional or union syntax.

Per PEP 484, implicit Optional is prohibited. The type hint should be explicit.

Proposed fix
-    def _save_model(self, output_dir: str = None) -> bool:
+    def _save_model(self, output_dir: str | None = None) -> bool:
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 37af43e and 02d02f3.

📒 Files selected for processing (1)
  • src/custom_ai/pytorch_llm/pytorch_llm.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/custom_ai/pytorch_llm/pytorch_llm.py (2)
src/genai/genai.py (14)
  • SbkGenAI (25-356)
  • add_args (52-53)
  • parse_args (55-56)
  • open (58-59)
  • close (61-62)
  • get_model_description (65-73)
  • get_throughput_analysis (327-336)
  • get_throughput_prompt (75-122)
  • get_latency_analysis (339-348)
  • get_latency_prompt (124-215)
  • get_total_mb_analysis (351-352)
  • get_total_mb_prompt (218-247)
  • get_percentile_histogram_analysis (355-356)
  • get_percentile_histogram_prompt (249-323)
src/ai/sbk_ai.py (4)
  • add_args (141-168)
  • parse_args (171-192)
  • open (194-196)
  • close (198-200)
🪛 Ruff (0.14.11)
src/custom_ai/pytorch_llm/pytorch_llm.py

133-133: Consider moving this statement to an else block

(TRY300)


135-135: Do not catch blind exception: Exception

(BLE001)


136-136: Use explicit conversion flag

Replace with conversion flag

(RUF010)


191-191: Unused method argument: args

(ARG002)


198-198: Unused method argument: args

(ARG002)


222-222: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


243-243: Consider moving this statement to an else block

(TRY300)


245-245: Do not catch blind exception: Exception

(BLE001)


246-246: Use explicit conversion flag

Replace with conversion flag

(RUF010)


314-314: Consider moving this statement to an else block

(TRY300)


316-316: Do not catch blind exception: Exception

(BLE001)


317-317: Use explicit conversion flag

Replace with conversion flag

(RUF010)


388-388: Consider moving this statement to an else block

(TRY300)


390-390: Do not catch blind exception: Exception

(BLE001)


391-391: Use explicit conversion flag

Replace with conversion flag

(RUF010)


489-489: Consider moving this statement to an else block

(TRY300)


491-491: Do not catch blind exception: Exception

(BLE001)


492-492: Use explicit conversion flag

Replace with conversion flag

(RUF010)


499-499: Do not catch blind exception: Exception

(BLE001)


500-500: Use explicit conversion flag

Replace with conversion flag

(RUF010)


507-507: Do not catch blind exception: Exception

(BLE001)


508-508: Use explicit conversion flag

Replace with conversion flag

(RUF010)


515-515: Do not catch blind exception: Exception

(BLE001)


516-516: Use explicit conversion flag

Replace with conversion flag

(RUF010)


523-523: Do not catch blind exception: Exception

(BLE001)


524-524: Use explicit conversion flag

Replace with conversion flag

(RUF010)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (6)
src/custom_ai/pytorch_llm/pytorch_llm.py (6)

36-46: LGTM on configuration defaults.

The device auto-detection logic correctly prioritizes CUDA, then MPS, then CPU. The default parameters are reasonable for a causal language model.


198-219: LGTM on close method logic.

The training loop and model saving flow are well-structured. The early break on None loss appropriately handles training failures.


396-410: LGTM on model description.

Clear and informative description output with appropriate initialization check.


412-437: LGTM on text cleaning utility.

The regex-based cleanup for repeated punctuation is well-implemented with proper character escaping.


494-524: LGTM on analysis entry points.

All four analysis methods follow a consistent pattern delegating to _generate_text with appropriate error handling.


526-533: LGTM on resource cleanup.

The destructor properly checks for attribute existence before deletion and clears CUDA cache when available.


Comment on lines +357 to +358
# Calculate max_new_tokens, ensuring it's at least DEFAULT_MAX_LENGTH
max_new_tokens = max(DEFAULT_MAX_LENGTH, inputs['input_ids'].shape[1])

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n src/custom_ai/pytorch_llm/pytorch_llm.py | head -50

Repository: kmgowda/sbk-charts

Length of output: 1902


🏁 Script executed:

# Find DEFAULT_MAX_LENGTH definition
rg "DEFAULT_MAX_LENGTH" src/custom_ai/pytorch_llm/pytorch_llm.py -B 2 -A 2

Repository: kmgowda/sbk-charts

Length of output: 905


🏁 Script executed:

# Look at the _generate_text method context around lines 357-358
cat -n src/custom_ai/pytorch_llm/pytorch_llm.py | sed -n '320,390p'

Repository: kmgowda/sbk-charts

Length of output: 3726


🏁 Script executed:

# Check if self.max_length is used in the class
rg "self\.max_length" src/custom_ai/pytorch_llm/pytorch_llm.py -B 1 -A 1

Repository: kmgowda/sbk-charts

Length of output: 881


Fix max_new_tokens calculation to respect model context window.

The current logic max(DEFAULT_MAX_LENGTH, inputs['input_ids'].shape[1]) causes the total sequence length to exceed the model's context window. If the input is 100 tokens, max_new_tokens becomes 2048, resulting in 2148 total tokens (exceeding the 2048 context limit). Use instead:

max_new_tokens = max(256, self.max_length - inputs['input_ids'].shape[1])

This ensures the generated tokens + input tokens fit within the model's context window while maintaining a minimum generation length.

🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 357 - 358, The
max_new_tokens calculation currently uses max(DEFAULT_MAX_LENGTH,
inputs['input_ids'].shape[1]) which can let input+generation exceed the model
context; change the calculation in the method where max_new_tokens is set
(referencing max_new_tokens, DEFAULT_MAX_LENGTH, inputs['input_ids'] and
self.max_length) to compute max_new_tokens = max(256, self.max_length -
inputs['input_ids'].shape[1]) (ensure the subtraction is clamped so it never
goes below 256 and result is an int) so that input tokens + generated tokens
never exceed the model context window.
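The suggested clamping can be written as a standalone helper (a sketch; the function name is illustrative, and the 256-token floor comes from the review suggestion, not the PR code):

```python
def compute_max_new_tokens(context_window: int, input_len: int, floor: int = 256) -> int:
    """Clamp generation length so input + new tokens fit the context window.

    Falls back to `floor` tokens when the input alone (nearly) fills the window.
    """
    return max(floor, context_window - input_len)
```

Note the floor means a pathologically long input can still overflow the window; callers that need a hard guarantee would also truncate the input to `context_window - floor` tokens.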

Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py`:
- Around line 239-252: close currently assumes self.model is initialized and
calls self.model.train(), which can raise AttributeError if _initialize_model()
in open() failed and self.model is None; update close to guard that call by
checking self.model is not None (and optionally log or skip training when model
is missing) before calling self.model.train(), and likewise skip invoking
_train_on_output or iterating self.output_list when self.model is None or
train_mode is True but model is uninitialized; reference the close method, the
self.model attribute, train_mode flag, and _train_on_output/output_list to
locate where to add the guard.
- Around line 74-75: The instance-level list output_list on PyTorchLLM is never
cleared (it is appended to in _generate_text and reused across open()/close()
cycles), so modify the open() method to reset/clear self.output_list (e.g., set
to [] or .clear()) when initializing the session (alongside setting
self._is_initialized = True) to ensure previous session outputs are not carried
into future _generate_text or close() operations.
♻️ Duplicate comments (1)
src/custom_ai/pytorch_llm/pytorch_llm.py (1)

480-533: Multiple issues already flagged in past reviews.

This method has several known issues (missing optimizer.step(), inconsistent return type, attention_mask dtype, missing MPS handling) that were identified in previous review iterations. Please address those comments.

🧹 Nitpick comments (3)
src/custom_ai/pytorch_llm/pytorch_llm.py (3)

143-146: Bare except with pass silently swallows all exceptions.

This catches every exception type (including KeyboardInterrupt, SystemExit) and discards it without logging. At minimum, catch a specific exception type or log the failure.

Proposed fix
                             try:
                                 config_file = cached_file(model_name, 'config.json')
-                            except:
-                                pass
+                            except Exception:
+                                config_file = None  # Fall through to fallback logic

263-263: Use explicit Optional type annotation.

Per PEP 484, use Optional[str] or str | None instead of implicit optional via default value.

Proposed fix
-    def _save_model(self, output_dir: str = None) -> bool:
+    def _save_model(self, output_dir: str | None = None) -> bool:

567-574: Consider removing __del__ in favor of explicit cleanup.

__del__ is unreliable for resource management:

  • Not guaranteed to be called (circular references, interpreter shutdown)
  • Can cause issues if exceptions occur during garbage collection
  • Already have close() for explicit cleanup

The close() method is the proper place for cleanup. Consider moving CUDA cache clearing there and removing __del__, or making PyTorchLLM a context manager.

Proposed approach
 def close(self, args):
     # ... existing training logic ...
+    
+    # Cleanup resources
+    if torch.cuda.is_available():
+        torch.cuda.empty_cache()

-def __del__(self):
-    """Clean up resources when the object is destroyed."""
-    if hasattr(self, 'model') and self.model is not None:
-        del self.model
-    if hasattr(self, 'tokenizer') and self.tokenizer is not None:
-        del self.tokenizer
-    if torch.cuda.is_available():
-        torch.cuda.empty_cache()
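The context-manager alternative can be sketched without torch (hypothetical `LLMBackend` class; only the `__enter__`/`__exit__` pattern is the point, with cleanup guaranteed even when the body raises):

```python
class LLMBackend:
    """Context-manager sketch replacing __del__ with explicit cleanup."""
    def __init__(self):
        self.model = object()   # stand-in for the loaded model
        self.closed = False

    def close(self):
        # Release the model reference; a real backend would also
        # clear the CUDA cache here.
        self.model = None
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
        return False  # never swallow exceptions


with LLMBackend() as backend:
    pass  # use backend.model here
# backend.close() has run deterministically at this point
```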

Comment on lines +74 to +75
self._is_initialized = False
self.output_list = []

⚠️ Potential issue | 🟡 Minor

output_list accumulates indefinitely across sessions.

The list is initialized once and appended to in _generate_text but never cleared. If the same PyTorchLLM instance is reused across multiple open()/close() cycles, outputs from previous sessions persist and are re-trained on in subsequent close() calls.

Consider clearing the list in open():

Proposed fix
 def open(self, args):
     # Reinitialize model if needed
+    self.output_list = []  # Clear previous session outputs
     if not self._is_initialized:
🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 74 - 75, The
instance-level list output_list on PyTorchLLM is never cleared (it is appended
to in _generate_text and reused across open()/close() cycles), so modify the
open() method to reset/clear self.output_list (e.g., set to [] or .clear()) when
initializing the session (alongside setting self._is_initialized = True) to
ensure previous session outputs are not carried into future _generate_text or
close() operations.

Comment on lines +239 to +252
def close(self, args):
loss = None
# If in training mode and we have a target, train on the generated output
if self.train_mode:
self.model.train()
print("\n" + "=" * 50)
print("🚀 Starting training on generated output")

for output_text in self.output_list:
print(f"📄 Generated length: {len(output_text)} chars")
loss = self._train_on_output(output_text)
if loss is None:
break
print("=" * 50)

⚠️ Potential issue | 🟠 Major

Potential AttributeError if model initialization failed.

If _initialize_model() failed in open(), self.model remains None. Then close() calls self.model.train() on Line 243, which raises AttributeError: 'NoneType' object has no attribute 'train'.

Proposed fix
 def close(self, args):
     loss = None
     # If in training mode and we have a target, train on the generated output
-    if self.train_mode:
+    if self.train_mode and self.model is not None:
         self.model.train()
🧰 Tools
🪛 Ruff (0.14.13)

239-239: Unused method argument: args

(ARG002)

🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 239 - 252, close
currently assumes self.model is initialized and calls self.model.train(), which
can raise AttributeError if _initialize_model() in open() failed and self.model
is None; update close to guard that call by checking self.model is not None (and
optionally log or skip training when model is missing) before calling
self.model.train(), and likewise skip invoking _train_on_output or iterating
self.output_list when self.model is None or train_mode is True but model is
uninitialized; reference the close method, the self.model attribute, train_mode
flag, and _train_on_output/output_list to locate where to add the guard.
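The guard can be illustrated with stand-ins (a sketch; `Trainer` and `DummyModel` are hypothetical and model only the None-check at issue):

```python
class DummyModel:
    """Stand-in for an nn.Module; records that train() was called."""
    def __init__(self):
        self.mode = None

    def train(self):
        self.mode = "train"


class Trainer:
    """Sketch of the guarded close(): skip training when the model is absent."""
    def __init__(self, model, train_mode=True):
        self.model = model
        self.train_mode = train_mode
        self.output_list = ["sample output"]

    def close(self):
        if self.train_mode and self.model is not None:
            self.model.train()
            return "trained"
        return "skipped"   # initialization failed; nothing to train
```

With the guard, `close()` degrades gracefully instead of raising `AttributeError` on a `None` model.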

Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
@kmgowda kmgowda merged commit 6abf7ce into main Jan 31, 2026
2 checks passed